Volume 54 | Number 4 | August 2019


Yishu Xue PhD, Ofer Harel PhD, Robert H. Aseltine PhD


Objective

To improve on existing methods to infer race/ethnicity in health care data through an analysis of birth records from Connecticut.

Data Source

A total of 162 467 Connecticut birth records from 2009 to 2013.

Study Design

We developed a logistic model to predict race/ethnicity from Census data and patient‐level information. Model performance was evaluated on five performance measures and compared with results reported in previous studies.
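As a rough illustration of how a logistic model can assign a category from a handful of predictors, the sketch below scores one subject with a multinomial (softmax) form. The multinomial form, the category labels, the predictors, and every coefficient are hypothetical placeholders chosen for illustration; none of them are taken from the study.

```python
import math


def predict_proba(features, coefficients):
    """Return softmax class probabilities for one subject.

    features: list of predictor values, with 1.0 first for the intercept.
    coefficients: dict mapping each class label to a weight list of the
    same length as features.
    """
    scores = {
        label: sum(w * x for w, x in zip(weights, features))
        for label, weights in coefficients.items()
    }
    top = max(scores.values())  # subtract the max for numerical stability
    exps = {label: math.exp(s - top) for label, s in scores.items()}
    total = sum(exps.values())
    return {label: e / total for label, e in exps.items()}


def predict_label(features, coefficients):
    """Return the class with the highest predicted probability."""
    probs = predict_proba(features, coefficients)
    return max(probs, key=probs.get)


# Hypothetical coefficients: [intercept, census-derived predictor,
# patient-level predictor]. These are NOT estimates from the study.
COEFFICIENTS = {
    "group_a": [0.0, 0.0, 0.0],   # reference category
    "group_b": [-1.0, 3.0, 0.5],
    "group_c": [-1.5, 0.5, 3.0],
}

subject = [1.0, 0.8, 0.1]  # made-up predictor values for one record
probs = predict_proba(subject, COEFFICIENTS)
label = predict_label(subject, COEFFICIENTS)
```

In practice the coefficients would be estimated from the birth records rather than written by hand; the sketch only shows the prediction step.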

Principal Findings

Our full model correctly classified 81 percent of subjects, an improvement over extant methods, and achieved substantially improved sensitivity in predicting black race.
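The two measures named above, the share of subjects correctly classified and the sensitivity for a single category, can be computed from paired actual and predicted labels. The following is a minimal sketch, assuming labels are stored as plain lists; the function names and the toy labels are illustrative and do not reproduce the study's data or results.

```python
def accuracy(actual, predicted):
    """Share of subjects whose predicted label matches the actual label."""
    if len(actual) != len(predicted) or not actual:
        raise ValueError("label lists must be non-empty and equal in length")
    return sum(a == p for a, p in zip(actual, predicted)) / len(actual)


def sensitivity(actual, predicted, label):
    """Share of subjects truly in `label` that the model assigns to `label`."""
    preds_for_label = [p for a, p in zip(actual, predicted) if a == label]
    if not preds_for_label:
        raise ValueError(f"no subjects with actual label {label!r}")
    return sum(p == label for p in preds_for_label) / len(preds_for_label)


# Toy labels for illustration only (not study data).
actual = ["A", "A", "B", "B", "B", "C"]
predicted = ["A", "B", "B", "B", "A", "C"]

overall = accuracy(actual, predicted)            # 4 of 6 correct
sens_b = sensitivity(actual, predicted, "B")     # 2 of 3 true "B" recovered
```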


Conclusions

Predictive models using Census information and patients’ demographic characteristics can be used to accurately populate race/ethnicity information in health care databases, enhancing opportunities to investigate and address disparities in access to, utilization of, and outcomes of care.