Imputation of race/ethnicity to enable measurement of performance by race/ethnicity

Volume 54 | Number 1 | February 2019

Ann Haas M.S., M.P.H., Marc N. Elliott, Jacob W. Dembosky M.P.M., John L. Adams Ph.D., M.S., Shondelle M. Wilson‐Frederick PhD, Joshua S. Mallett MS, Sarah Gaillot Ph.D., Samuel C. Haffer PhD, Amelia M. Haviland Ph.D.

Objective

To improve an existing method, Medicare Bayesian Improved Surname Geocoding (MBISG) 1.0, which augments the Centers for Medicare & Medicaid Services’ (CMS) administrative measure of race/ethnicity with surname and geographic data to estimate race/ethnicity.
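
As a rough illustration of the general idea behind Bayesian surname geocoding, the sketch below combines a surname-based prior with a geographic likelihood using Bayes' rule. It is a minimal sketch under stated assumptions, not the MBISG implementation: the function name `bisg_posterior`, the restriction to four groups, and every probability shown are hypothetical, and the administrative race/ethnicity measure that MBISG also uses is left out.

```python
# Hypothetical illustration of a BISG-style Bayesian update.
# All probabilities below are made-up numbers, not values from the study.

GROUPS = ["Hispanic", "White", "API", "Black"]


def bisg_posterior(prior_given_surname, geo_given_group):
    """Combine P(group | surname) with P(geography | group) via Bayes' rule."""
    unnormalized = {
        g: prior_given_surname[g] * geo_given_group[g] for g in GROUPS
    }
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("all groups have zero joint probability")
    # Normalize so the posterior probabilities sum to 1.
    return {g: v / total for g, v in unnormalized.items()}


# Made-up surname prior and made-up geographic likelihoods (assumptions).
prior = {"Hispanic": 0.70, "White": 0.20, "API": 0.05, "Black": 0.05}
geo = {"Hispanic": 0.002, "White": 0.001, "API": 0.001, "Black": 0.004}

posterior = bisg_posterior(prior, geo)
for group in GROUPS:
    print(f"{group}: {posterior[group]:.3f}")
```

The output is a probability for each group rather than a single assigned category, which is what allows the estimates to be compared with self-reported race/ethnicity by correlation.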

Data Sources/Study Setting

Data from 284 627 respondents to the 2014 Medicare survey.

Study Design

We compared performance (cross‐validated Pearson correlation of estimates and self‐reported race/ethnicity) for several alternative models predicting self‐reported race/ethnicity in cross‐sectional observational data to assess accuracy of estimates, resulting in MBISG 2.0. MBISG 2.0 adds first name, demographic, and coverage predictors of race/ethnicity to MBISG 1.0 and uses a more flexible data aggregation framework.
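
The performance metric named above can be sketched as follows: predictions for each respondent come from a model fit without that respondent's fold, and the pooled out-of-fold predictions are then correlated with a 0/1 self-report indicator. This is only an illustrative sketch. The five-fold scheme, the toy "model" (the training-fold share of positives within each feature value), the function names, and the synthetic data are all assumptions and are not taken from the study.

```python
import math
import random


def pearson(x, y):
    """Pearson correlation of two equal-length numeric sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def cross_validated_correlation(features, labels, k=5, seed=0):
    """Pool out-of-fold predictions, then correlate them with the labels."""
    idx = list(range(len(labels)))
    random.Random(seed).shuffle(idx)
    folds = [idx[i::k] for i in range(k)]
    preds = [0.0] * len(labels)
    for fold in folds:
        held_out = set(fold)
        # Toy model (assumption): per-feature-value positive rate,
        # estimated on the training folds only.
        counts = {}
        for i in idx:
            if i not in held_out:
                n, s = counts.get(features[i], (0, 0))
                counts[features[i]] = (n + 1, s + labels[i])
        train_n = sum(n for n, _ in counts.values())
        train_s = sum(s for _, s in counts.values())
        overall = train_s / train_n
        for i in fold:
            n, s = counts.get(features[i], (0, 0))
            # Fall back to the overall rate for unseen feature values.
            preds[i] = s / n if n else overall
    return pearson(preds, labels)


# Synthetic data (assumption): one categorical predictor, one 0/1 indicator.
rng = random.Random(1)
features = [rng.choice(["a", "b", "c"]) for _ in range(600)]
rate = {"a": 0.9, "b": 0.5, "c": 0.1}
labels = [1 if rng.random() < rate[f] else 0 for f in features]

r = cross_validated_correlation(features, labels)
print(f"cross-validated Pearson r = {r:.2f}")
```

Because each prediction is made without the respondent's own fold, the resulting correlation is not inflated by fitting and evaluating on the same records.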

Data Collection/Extraction Methods

We linked survey‐reported race/ethnicity to administrative and census data.

Principal Findings

MBISG 2.0 removed 25‐39 percent of the remaining MBISG 1.0 error for Hispanics, Whites, and Asian/Pacific Islanders (API), and 9 percent for Blacks, resulting in correlations of 0.88 to 0.95 with self‐reported race/ethnicity for these groups.

Conclusions

MBISG 2.0 represents a substantial improvement over MBISG 1.0 and over the use of administrative data on race/ethnicity alone. MBISG 2.0 is used in CMS’ public reporting of Medicare Advantage contract measures stratified by race/ethnicity for Hispanics, Whites, API, and Blacks.