VOLUME 54 | NUMBER 2 | APRIL 2019
Sample selection in the face of design constraints: Use of clustering to define sample strata for qualitative research
Objective: To select a sample of 40 physician organizations for qualitative interviews, stratified on longitudinal cost-of-care measures, in order to describe the range of care delivery structures and processes being deployed to influence the total cost of caring for patients.
Data Sources: Three years of physician organization-level total cost of care data (n = 156 in California) from the Integrated Healthcare Association's value-based pay-for-performance program.
Study Design: We fit total cost of care data using mixture and K-means clustering algorithms to segment the population of physician organizations into sampling strata based on 3-year cost trajectories (ie, cost curves).
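As a rough illustration of how 3-year cost curves could be segmented into strata, the following is a minimal K-means sketch in plain Python. It is not the authors' implementation: the cost values, the number of clusters, the starting centroids, and the function names `sq_dist` and `kmeans` are all hypothetical and chosen only to keep the example small.

```python
def sq_dist(x, m):
    """Squared Euclidean distance between two cost curves."""
    return sum((a - b) ** 2 for a, b in zip(x, m))


def kmeans(curves, centroids, n_iter=20):
    """Plain Lloyd iterations: assign each curve to its nearest centroid,
    then move each centroid to the mean of its assigned curves."""
    centroids = [list(c) for c in centroids]
    labels = []
    for _ in range(n_iter):
        labels = [
            min(range(len(centroids)), key=lambda k: sq_dist(c, centroids[k]))
            for c in curves
        ]
        for k in range(len(centroids)):
            members = [c for c, lab in zip(curves, labels) if lab == k]
            if members:  # leave an empty cluster's centroid where it is
                centroids[k] = [sum(v) / len(members) for v in zip(*members)]
    return labels, centroids


# Hypothetical data: each tuple is one organization's cost in years 1, 2, 3.
curves = [
    (100, 102, 104), (101, 103, 105),  # lower cost, slowly rising
    (150, 150, 151), (149, 151, 150),  # higher cost, roughly flat
]

# Starting centroids are picked by hand here so the run is deterministic.
labels, centroids = kmeans(curves, centroids=[curves[0], curves[2]])
print(labels)
print(centroids)
```

Each resulting label can then serve as a sampling stratum from which organizations are drawn for interview.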
Principal Findings: A mixture of multivariate normal distributions can classify physician organization cost curves into clusters defined by total cost level, shape, and within-cluster variation. K-means clustering does not accommodate differing levels of within-cluster variation and resulted in more clusters being allocated to unstable cost curves. A mixture of regressions approach focuses overly on anomalous trajectories and is sensitive to model coding.
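The contrast between K-means and a normal mixture regarding within-cluster variation can be seen in a small hand-built example. The sketch below assumes two already-fitted clusters with hypothetical means, standard deviations, and equal weights, and uses spherical (equal-variance, uncorrelated) normal components as a simplification of a general multivariate normal mixture. All numbers and function names are illustrative, not taken from the study.

```python
import math


def sq_dist(x, m):
    """Squared Euclidean distance between two cost curves."""
    return sum((a - b) ** 2 for a, b in zip(x, m))


def kmeans_assign(curve, means):
    """K-means rule: pick the nearest centroid, ignoring cluster spread."""
    return min(range(len(means)), key=lambda k: sq_dist(curve, means[k]))


def log_density(curve, mean, sd):
    """Log density of a spherical normal with a common standard deviation."""
    d = len(curve)
    return (
        -d * math.log(sd)
        - 0.5 * d * math.log(2 * math.pi)
        - sq_dist(curve, mean) / (2 * sd ** 2)
    )


def mixture_assign(curve, means, sds, weights):
    """Mixture rule: pick the component with the highest posterior weight."""
    scores = [
        math.log(w) + log_density(curve, m, s)
        for m, s, w in zip(means, sds, weights)
    ]
    return max(range(len(scores)), key=lambda k: scores[k])


# Hypothetical clusters: cluster 0 is stable (small spread),
# cluster 1 is unstable (large spread).
means = [(100, 100, 100), (110, 110, 110)]
sds = [1.0, 5.0]
weights = [0.5, 0.5]

curve = (104, 104, 104)  # closer to cluster 0's centroid in raw distance

print(kmeans_assign(curve, means))                  # nearest-centroid choice
print(mixture_assign(curve, means, sds, weights))   # spread-aware choice
```

Here the curve lies four standard deviations from the stable cluster in every year but only about one standard deviation from the unstable cluster, so the mixture rule places it in the high-variation cluster while the nearest-centroid rule does not.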
Conclusions: Statistical clustering can be used to form sampling strata when longitudinal measures are of primary interest. Many clustering algorithms are available; the choice of the clustering algorithm can strongly impact the resulting strata because various algorithms focus on different aspects of the observed data.
Copyright © 2018, Health Research & Educational Trust. All rights reserved.