Consistent with i2b2's Data Sharing Policy, we are pleased to offer the following data set that has been used in published work sponsored by our NCBC.
Extracting Physician Group Intelligence from Electronic Health Records to Support Evidence Based Medicine
Griffin M. Weber, Isaac S. Kohane
Research Article | published 29 May 2013 | PLOS ONE.
Description of Data
The data files list pairs of lab tests (an initial test and its repeat) and contain no protected health information (PHI). The fields are lab test name, year of initial test, initial test value, minutes until second test, and second test value. For example, a single record might look like:
Test = White Blood Cell (WBC)
Year1 = 1996
Value1 = 8.6
Duration = 43,200
Value2 = 7.0
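As a rough illustration of how such a record could be held in code, the sketch below models the five fields and converts the duration from minutes to days. The class name `LabPair`, its attribute names, and the helper `parse_duration` are assumptions made for this example; the actual file layout and column names are not described here and may differ.

```python
from dataclasses import dataclass

MINUTES_PER_DAY = 24 * 60


@dataclass
class LabPair:
    """One initial/repeat lab test pair (attribute names are hypothetical)."""
    test: str          # lab test name
    year1: int         # year of the initial test
    value1: float      # initial test value
    duration_min: int  # minutes until the second test
    value2: float      # second test value

    @property
    def duration_days(self) -> float:
        # Convert the minutes-until-repeat field to days.
        return self.duration_min / MINUTES_PER_DAY


def parse_duration(text: str) -> int:
    """Parse a duration written with thousands separators, e.g. '43,200'."""
    return int(text.replace(",", ""))


# The example record shown above.
example = LabPair("White Blood Cell (WBC)", 1996, 8.6, parse_duration("43,200"), 7.0)
print(example.duration_days)  # 30.0
```

In the example record, a duration of 43,200 minutes corresponds to 30 days between the initial and repeat test.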
The files contain a random selection of 100,000 records for each of 97 common lab tests, for a total of 9.7 million records. In addition, for WBC only, we include another 1 million records that also indicate whether the initial and repeat tests were performed in inpatient or outpatient settings and that give the patient age in years (up to 89 years old). The data in all files cover 1986 through 2004, come from both Brigham and Women's Hospital (BWH) and Massachusetts General Hospital (MGH), and were extracted from the Research Patient Data Registry (RPDR).
The paper shows that it is possible to derive normal ranges for laboratory test values by examining how frequently clinicians order the tests. The "worse" a test's value, the sooner the same test is repeated. We can identify subpopulations, such as pediatric patients, whose normal ranges differ from those of other patients, and we can show that certain tests might be overused.
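One simple way to explore the relationship between a test's value and the time until it is repeated is to group records by initial value and look at the median duration in each group. The sketch below does this with fixed-width bins; it is not the method used in the paper, and the function name `median_minutes_by_bin`, the bin width, and the toy values are all assumptions made for illustration only, not data from the files.

```python
from collections import defaultdict
from statistics import median


def median_minutes_by_bin(pairs, bin_width):
    """Return {bin_start: median minutes until repeat} for (value1, duration_min) pairs.

    Each initial value is assigned to the fixed-width bin that contains it.
    """
    buckets = defaultdict(list)
    for value1, duration_min in pairs:
        bin_start = (value1 // bin_width) * bin_width
        buckets[bin_start].append(duration_min)
    return {start: median(durations) for start, durations in sorted(buckets.items())}


# Made-up (value1, duration_min) pairs, purely to exercise the function.
toy_pairs = [
    (3.0, 600), (3.5, 900),       # low initial values, repeated quickly
    (7.0, 40000), (8.6, 43200),   # mid-range values, repeated after weeks
    (15.2, 1200), (16.0, 700),    # high initial values, repeated quickly
]

result = median_minutes_by_bin(toy_pairs, bin_width=5.0)
for bin_start, minutes in result.items():
    print(f"value1 in [{bin_start}, {bin_start + 5.0}): median {minutes} minutes")
```

With real records, bins whose median time to repeat is longest would be candidates for the "normal" range under the reasoning above; a careful analysis would need finer bins and a more robust estimator than this sketch provides.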