Survival Model Selection with Missing Data and Correlated Covariates

Locations

SB 220

Host

Data Science

Speaker

Sydeaka Watson
Research Associate (Assistant Professor), Department of Public Health Sciences, University of Chicago
http://health.bsd.uchicago.edu/People/Watson-Sydeaka

Description

A novel combination of existing methods was used to develop a survival prediction equation for patients with pulmonary arterial hypertension awaiting lung transplantation. The Scientific Registry of Transplant Recipients (SRTR) dataset featured censored survival times, missing covariate data, and a large number of highly correlated candidate predictor variables. Penalized weighted least squares regression was applied repeatedly to bootstrap resamples of multiply imputed data, yielding a parsimonious model that satisfied internal validation criteria of clinical interest. Simulation studies varying the degree of predictor missingness, the degree of survival time censoring, the effect size, and the proportion of variables unrelated to survival showed that this method tends to accurately recover the true set of Cox regression predictor variables.
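
For readers who want a concrete picture of the workflow named in the abstract, the following is a minimal sketch, not the speaker's implementation. It makes several assumptions that the abstract does not confirm: the penalty is a lasso (L1) penalty fitted by coordinate descent, missing values are filled by a simple random hot-deck draw standing in for a proper multiple imputation procedure, the observation weights `w` (which would account for censoring) are supplied by the caller, and variables are ranked by how often they receive a nonzero coefficient. All function names are hypothetical.

```python
import random


def soft_threshold(value, lam):
    """Shrink value toward zero by lam; return 0.0 if |value| <= lam."""
    if value > lam:
        return value - lam
    if value < -lam:
        return value + lam
    return 0.0


def weighted_lasso(X, y, w, lam, n_iter=100):
    """Coordinate descent for 0.5 * sum_i w_i (y_i - x_i.b)^2 + lam * sum_j |b_j|.

    X and y are centered by their weighted means, so no intercept is fitted.
    """
    n, p = len(X), len(X[0])
    sw = sum(w) or 1.0  # guard against an all-zero weight vector
    y_mean = sum(w[i] * y[i] for i in range(n)) / sw
    x_mean = [sum(w[i] * X[i][j] for i in range(n)) / sw for j in range(p)]
    Xc = [[X[i][j] - x_mean[j] for j in range(p)] for i in range(n)]
    resid = [y[i] - y_mean for i in range(n)]  # residuals at beta = 0
    beta = [0.0] * p
    for _ in range(n_iter):
        for j in range(p):
            # Weighted correlation of column j with the partial residual.
            rho = sum(w[i] * Xc[i][j] * (resid[i] + Xc[i][j] * beta[j])
                      for i in range(n))
            z = sum(w[i] * Xc[i][j] ** 2 for i in range(n))
            new = soft_threshold(rho, lam) / z if z > 0.0 else 0.0
            delta = new - beta[j]
            if delta != 0.0:
                for i in range(n):
                    resid[i] -= Xc[i][j] * delta
                beta[j] = new
    return beta


def impute_once(X, rng):
    """Fill None entries with a random observed value from the same column.

    A crude stand-in for multiple imputation; assumes every column has at
    least one observed value.
    """
    p = len(X[0])
    observed = [[row[j] for row in X if row[j] is not None] for j in range(p)]
    return [[row[j] if row[j] is not None else rng.choice(observed[j])
             for j in range(p)] for row in X]


def inclusion_frequencies(X, y, w, lam, n_imputations=5, n_boot=20, seed=0):
    """Fraction of (imputation, bootstrap) fits in which each variable is kept."""
    rng = random.Random(seed)
    n, p = len(X), len(X[0])
    counts = [0] * p
    for _ in range(n_imputations):
        Xi = impute_once(X, rng)
        for _ in range(n_boot):
            idx = [rng.randrange(n) for _ in range(n)]  # resample rows with replacement
            beta = weighted_lasso([Xi[i] for i in idx],
                                  [y[i] for i in idx],
                                  [w[i] for i in idx], lam)
            for j in range(p):
                if beta[j] != 0.0:
                    counts[j] += 1
    total = n_imputations * n_boot
    return [c / total for c in counts]
```

Calling `inclusion_frequencies(X, y, w, lam)` with `None` marking missing covariate entries returns one frequency per candidate variable; a parsimonious model could then keep only the variables whose frequency exceeds a chosen threshold. The threshold, the penalty strength `lam`, and the construction of `w` are all choices this sketch leaves to the reader.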

Event Topic

Data Science