Volume 53 | Number 6 | December 2018

Melissa D. Curtis M.Sc., Sandra D. Griffith Ph.D., Melisa Tucker, Michael D. Taylor PharmD., Ph.D., William B. Capra Ph.D., Gillis Carrigan Ph.D., Ben Holzman, Aracelis Z. Torres Ph.D., M.P.H., Paul You M.P.H., Brandon Arnieri, Amy P. Abernethy


Objective

To create a high‐quality electronic health record (EHR)‐derived mortality dataset for retrospective and prospective real‐world evidence generation.

Data Sources/Study Setting

Oncology EHR data, supplemented with external commercial and Social Security Death Index data, benchmarked to the National Death Index (NDI).

Study Design

We developed a recent, linkable, high‐quality mortality variable amalgamated from multiple data sources to supplement EHR data, benchmarked against the most complete U.S. mortality data source, the NDI. We report here the data quality of version 2.0 of the mortality variable.

Principal Findings

For advanced non‐small‐cell lung cancer, sensitivity of mortality information improved from 66 percent in structured EHR data to 91 percent in the composite dataset, with high date agreement compared to the NDI. For advanced melanoma, metastatic colorectal cancer, and metastatic breast cancer, sensitivity of the final variable was 85 to 88 percent. Kaplan–Meier survival analyses showed that improving mortality data completeness minimized overestimation of survival relative to NDI‐based estimates.
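As an illustration of the two quality metrics named above, the following minimal Python sketch computes sensitivity and date agreement of a candidate mortality source against a benchmark. It is not the authors' implementation: the function names, the `window_days` parameter, and the four toy patient records are hypothetical and chosen only to show the arithmetic. Sensitivity is taken here as the share of benchmark deaths that the candidate source also records, and date agreement as the share of those matched deaths whose dates fall within a chosen window.

```python
from datetime import date


def sensitivity(benchmark_deaths, candidate_deaths):
    """Share of benchmark deaths that the candidate source also records."""
    if not benchmark_deaths:
        raise ValueError("benchmark contains no deaths")
    captured = sum(1 for pid in benchmark_deaths if pid in candidate_deaths)
    return captured / len(benchmark_deaths)


def date_agreement(benchmark_deaths, candidate_deaths, window_days=0):
    """Among deaths found in both sources, share whose dates differ by
    at most window_days (0 means exact agreement)."""
    matched = [pid for pid in benchmark_deaths if pid in candidate_deaths]
    if not matched:
        raise ValueError("no deaths are present in both sources")
    agreeing = sum(
        1
        for pid in matched
        if abs((candidate_deaths[pid] - benchmark_deaths[pid]).days) <= window_days
    )
    return agreeing / len(matched)


# Hypothetical toy records (patient id -> date of death); not study data.
benchmark = {
    "p1": date(2015, 3, 1),
    "p2": date(2015, 6, 10),
    "p3": date(2016, 1, 20),
    "p4": date(2016, 8, 5),
}
structured_only = {
    "p1": date(2015, 3, 1),
    "p2": date(2015, 6, 10),
}
composite = {
    "p1": date(2015, 3, 1),
    "p2": date(2015, 6, 10),
    "p3": date(2016, 1, 27),  # death captured, but dated 7 days late
}

if __name__ == "__main__":
    print("structured sensitivity:", sensitivity(benchmark, structured_only))
    print("composite sensitivity:", sensitivity(benchmark, composite))
    print("exact date agreement:", date_agreement(benchmark, composite))
    print("agreement within 15 days:", date_agreement(benchmark, composite, 15))
```

In this toy example the composite source captures one more benchmark death than the structured source, so its sensitivity is higher. The missing death (`p4`) is the kind of gap that would leave a patient looking alive in a survival analysis and so inflate the survival estimate.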


Conclusions

For EHR‐derived data to yield reliable real‐world evidence, it needs to be of known and sufficiently high quality. Considering the impact of mortality data completeness on survival endpoints, we highlight the importance of data quality assessment and advocate benchmarking to the NDI.