Volume 43 | Number 3 | June 2008

Daniel F. McCaffrey, Marc N. Elliott


Objective

To examine the implications for statistical power of using predicted probabilities for a dichotomous independent variable, rather than the actual variable.


Data Sources/Study Setting

An application uses 271,479 observations from the 2000 to 2002 CAHPS Medicare Fee‐for‐Service surveys.


Study Design and Data

A methodological study with simulation results and a substantive application to previously collected data.


Principal Findings

Researchers often must employ key dichotomous predictors that are unobserved but for which predictions exist. We consider three approaches to such data: the classification estimator (1); the direct substitution estimator (2); and the partial information maximum likelihood estimator (3, PIMLE). The efficiency of (1) (its power relative to testing with the true variable) scales roughly with the square of one minus the classification error. The efficiency of (2) scales roughly with the R² for predicting the unobserved dichotomous variable, and (2) is usually more powerful than (1). Approach (3) is the most powerful, but for testing differences in means of 0.2–0.5 standard deviations, (2) is typically more than 95 percent as efficient as (3).
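
The comparison between approaches (1) and (2) can be illustrated with a small simulation. The sketch below is not the authors' simulation design; every setting in it is an assumption made for illustration: predicted probabilities drawn from a Beta(0.5, 0.5) distribution and perfectly calibrated, a true group difference of 0.3 standard deviations, normal errors, approach (1) read as classifying at a 0.5 cutoff, and approach (2) read as using the predicted probability itself as the regressor. Efficiency is approximated by the squared ratio of mean t-statistics relative to the test that uses the true variable. The function names `slope_t` and `simulate` are hypothetical.

```python
import numpy as np


def slope_t(x, y):
    """t-statistic for the slope in a simple linear regression of y on x."""
    xc = x - x.mean()
    yc = y - y.mean()
    sxx = np.sum(xc ** 2)
    slope = np.sum(xc * yc) / sxx
    resid = yc - slope * xc
    sigma2 = np.sum(resid ** 2) / (len(y) - 2)
    return slope / np.sqrt(sigma2 / sxx)


def simulate(n=2000, reps=200, delta=0.3, a=0.5, b=0.5, seed=0):
    """Approximate efficiency of two proxy regressors relative to the true one.

    All defaults are illustrative assumptions, not values from the study.
    """
    rng = np.random.default_rng(seed)
    t_stats = {"true": [], "classified": [], "substituted": []}
    r2 = []
    for _ in range(reps):
        p = rng.beta(a, b, size=n)               # predicted probabilities (assumed calibrated)
        z = (rng.random(n) < p).astype(float)    # unobserved true dichotomous variable
        y = delta * z + rng.standard_normal(n)   # outcome with a delta-SD group difference
        c = (p > 0.5).astype(float)              # classification at an assumed 0.5 cutoff
        t_stats["true"].append(slope_t(z, y))
        t_stats["classified"].append(slope_t(c, y))
        t_stats["substituted"].append(slope_t(p, y))
        r2.append(np.corrcoef(p, z)[0, 1] ** 2)  # R^2 for predicting z from p
    mean_t = {k: np.mean(v) for k, v in t_stats.items()}
    # Squared ratio of mean t-statistics approximates relative efficiency.
    eff = {k: (mean_t[k] / mean_t["true"]) ** 2
           for k in ("classified", "substituted")}
    eff["r2"] = float(np.mean(r2))
    return eff


if __name__ == "__main__":
    print(simulate())
```

Under these assumptions, the efficiency of the substituted regressor should land close to the reported R², and the classified regressor should be less efficient, in line with the ordering of (1) and (2) described above. The PIMLE is not implemented in this sketch.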


Conclusions

The information loss from not observing actual values of dichotomous predictors can be quite large. Direct substitution is easy to implement and interpret and nearly as efficient as the PIMLE.