Volume 43 | Number 5p1 | October 2008

Marc N. Elliott, Allen Fremont M.D., Peter A. Morrison, Philip Pantoja, Nicole Lurie


Objective

To efficiently estimate race/ethnicity using administrative records to facilitate health care organizations' efforts to address disparities when self‐reported race/ethnicity data are unavailable.

Data Source

Surname, geocoded residential address, and self‐reported race/ethnicity from 1,973,362 enrollees of a national health plan.

Study Design

We compare the accuracy of a Bayesian approach that combines surname and geocoded information to estimate race/ethnicity with that of two other indirect methods: (a) a non‐Bayesian method that combines surname and geocoded information and (b) geocoded information alone. We assess accuracy with respect to estimating (1) individual race/ethnicity and (2) overall racial/ethnic prevalence in a population.
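The abstract does not spell out the Bayesian calculation, so the following is only a minimal sketch of how a generic Bayes update could combine the two sources. It assumes that surname and residential location are conditionally independent given race/ethnicity, treats surname-based probabilities as the prior, and treats the share of each group's population living in the person's area as the likelihood. The function names, group labels, and all numbers are hypothetical illustrations, not values or procedures taken from the study.

```python
def bayes_update(surname_prior, geo_likelihood):
    """Combine a surname-based prior with a geocode-based likelihood.

    surname_prior:  dict mapping group -> P(group | surname)
    geo_likelihood: dict mapping group -> P(area | group), up to a constant
    Returns a dict mapping group -> posterior P(group | surname, area).
    Assumes surname and area are independent given group (an assumption
    made for this sketch).
    """
    unnormalized = {g: surname_prior[g] * geo_likelihood[g] for g in surname_prior}
    total = sum(unnormalized.values())
    if total == 0:
        raise ValueError("All groups have zero posterior weight")
    return {g: w / total for g, w in unnormalized.items()}


def estimated_prevalence(posteriors):
    """Estimate group prevalence by averaging individual posterior probabilities."""
    n = len(posteriors)
    return {g: sum(p[g] for p in posteriors) / n for g in posteriors[0]}


# Hypothetical two-group example with made-up inputs.
person_1 = bayes_update({"A": 0.5, "B": 0.5}, {"A": 0.2, "B": 0.6})
person_2 = bayes_update({"A": 0.5, "B": 0.5}, {"A": 0.6, "B": 0.2})
prevalence = estimated_prevalence([person_1, person_2])
```

Averaging posterior probabilities, rather than assigning each person to a single most likely group, corresponds to the second accuracy target named above (overall prevalence), while the individual posteriors correspond to the first.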

Principal Findings

The Bayesian approach was 74 percent more efficient than geocoding alone in estimating individual race/ethnicity and 56 percent more efficient in estimating the prevalence of racial/ethnic groups, outperforming the non‐Bayesian hybrid on both measures. The non‐Bayesian hybrid was more efficient than geocoding alone in estimating individual race/ethnicity but less efficient with respect to prevalence (p<.05 for all differences).


Conclusions

The Bayesian Surname and Geocoding (BSG) method presented here efficiently integrates administrative data, substantially improving upon what is possible with a single source or from other hybrid methods; it offers a powerful tool that can help health care organizations address disparities until self‐reported race/ethnicity data are available.