To improve on existing methods to infer race/ethnicity in health care data through an analysis of birth records from Connecticut.
A total of 162 467 Connecticut birth records from 2009 to 2013.
We developed a logistic model to predict race/ethnicity using Census data and patient‐level information. Model performance was tested and compared with that reported in previous studies using five performance measures.
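As an illustration of the general approach, the following is a minimal sketch of a multinomial logistic model fitted by gradient descent. It is not the study's specification: the three features (two neighborhood population shares of the kind Census data provide, and one binary patient-level indicator), the class codes 0, 1, and 2, the nine training rows, and the function names are all hypothetical and invented for this example.

```python
import math

def softmax(scores):
    """Convert raw class scores into probabilities that sum to 1."""
    m = max(scores)  # subtract the max for numerical stability
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def class_scores(W, xi):
    """Linear score per class; the last weight in each row is the intercept."""
    xb = list(xi) + [1.0]
    return [sum(w * v for w, v in zip(Wk, xb)) for Wk in W]

def fit_multinomial_logit(X, y, n_classes, lr=1.0, epochs=3000):
    """Full-batch gradient descent on the multinomial cross-entropy loss."""
    n_cols = len(X[0]) + 1
    W = [[0.0] * n_cols for _ in range(n_classes)]
    for _ in range(epochs):
        grad = [[0.0] * n_cols for _ in range(n_classes)]
        for xi, yi in zip(X, y):
            xb = list(xi) + [1.0]
            probs = softmax(class_scores(W, xi))
            for k in range(n_classes):
                err = probs[k] - (1.0 if k == yi else 0.0)
                for j, v in enumerate(xb):
                    grad[k][j] += err * v
        for k in range(n_classes):
            for j in range(n_cols):
                W[k][j] -= lr * grad[k][j] / len(X)
    return W

def predict_proba(W, xi):
    return softmax(class_scores(W, xi))

def predict(W, xi):
    probs = predict_proba(W, xi)
    return max(range(len(probs)), key=probs.__getitem__)

# Hypothetical toy data, NOT from the study. Columns:
# [area share of group 0, area share of group 1, patient-level indicator]
X = [
    [0.90, 0.05, 0], [0.80, 0.10, 0], [0.85, 0.05, 1],  # class 0
    [0.05, 0.90, 1], [0.10, 0.80, 1], [0.05, 0.85, 0],  # class 1
    [0.10, 0.10, 0], [0.05, 0.05, 0], [0.10, 0.05, 1],  # class 2
]
y = [0, 0, 0, 1, 1, 1, 2, 2, 2]

W = fit_multinomial_logit(X, y, n_classes=3)
predictions = [predict(W, xi) for xi in X]
print(predictions)
print(predict_proba(W, [0.60, 0.20, 0]))
```

Each record is assigned the class with the highest predicted probability. Keeping the probabilities themselves, as `predict_proba` does, would also allow downstream analyses to weight records by classification certainty instead of relying on a single hard assignment.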
Our full model correctly classified 81 percent of subjects and improved on extant methods, with substantially higher sensitivity in predicting black race.
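The two measures named above, the share of subjects correctly classified and class-specific sensitivity, can be computed as in the following sketch. The labels and predictions are invented toy values, not study data, and the function names are hypothetical.

```python
from collections import Counter

def accuracy(y_true, y_pred):
    """Share of subjects whose predicted label matches the recorded label."""
    correct = sum(t == p for t, p in zip(y_true, y_pred))
    return correct / len(y_true)

def sensitivity_by_class(y_true, y_pred):
    """For each recorded class, the share of its members predicted correctly."""
    totals = Counter(y_true)
    hits = Counter(t for t, p in zip(y_true, y_pred) if t == p)
    return {label: hits[label] / totals[label] for label in totals}

# Hypothetical toy labels for illustration only.
y_true = ["white", "white", "white", "white",
          "black", "black", "hispanic", "hispanic"]
y_pred = ["white", "white", "white", "black",
          "black", "white", "hispanic", "hispanic"]

overall = accuracy(y_true, y_pred)
per_class = sensitivity_by_class(y_true, y_pred)
print(overall)
print(per_class)
```

Reporting sensitivity per class matters because overall accuracy can be dominated by the largest group: in the toy data, three quarters of subjects are classified correctly overall while only half of the records labeled black are recovered.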
Predictive models using Census information and patients’ demographic characteristics can be used to accurately populate race/ethnicity information in health care databases, enhancing opportunities to investigate and address disparities in access to, utilization of, and outcomes of care.