Real-time imputation of missing predictor values in clinical practice

Nijman, Steven

View/Open

Manuscript research project Steven Nijman.docx (882.9Kb)

Publication date

2022

Author

Nijman, Steven

Summary

Identifying individual patients at risk of disease has become an integral part of the recent trend towards a more personalized healthcare system. A personalized healthcare system allows clinicians to administer the most applicable treatment to an individual patient given their risk profile and, in turn, makes healthcare much more efficient. To that end, clinical prediction models are prime candidates for assisting clinicians with accurate risk estimates. By harnessing the information captured in various patient- or disease-related properties, these risk prediction models can chart the likely course of a disease (i.e., prognosis) or identify whether a specific disease is likely to be present in an individual patient (i.e., diagnosis). Recent efforts to computerize clinical prediction models have produced clinical decision support systems (CDSS) that are already usable in clinical practice. These CDSS can inform clinical decision making by providing individual risk probabilities. However, because currently available risk prediction models require complete information to generate predictions, they are severely hampered whenever any patient or disease properties are missing. Fortunately, the ample guidance that exists on the handling of missing data provides useful stepping stones for developing flexible missing data handling techniques that are usable in real-time clinical practice. The development of several methods for imputing missing predictor values in real time was presented previously. In a case study with a real-world empirical data set for cardiovascular risk prediction, the accuracy of two common imputation methods, adjusted for use in real-time clinical practice, was compared.
These consisted of conditional modeling imputation (CMI, where a separate multivariable imputation model is derived for each predictor) and joint modeling imputation (JMI, where all predictors are assumed to be normally distributed and the observed patient information is used to generate imputations for each missing predictor). These methods were compared with a method that is often used in practice: mean imputation (where missing values are replaced by the sample mean). In line with expectations, simulations showed that both JMI and CMI are generally to be recommended in terms of imputation accuracy. As JMI was generally faster and less complex, it was deemed more promising. This previous study evaluated these novel imputation methods strictly on their imputation accuracy, measured by root mean squared error. In this study we continue with the more promising imputation method (i.e., JMI) and evaluate it using common evaluation measures for prediction models (i.e., discrimination and calibration of the model predictions). We specifically focus on the use of additional, or auxiliary, variables (i.e., variables that are not part of the prediction model), elaborate further on the idea of imputation model updating, and compare JMI with the often-used method of mean imputation. In summary, JMI is found to be most beneficial when it is estimated in local data and uses these auxiliary variables. Its added value is most prominent whenever the missing predictors are correlated with other observed (auxiliary) variables.
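
To make the contrast between JMI and mean imputation concrete, the following is a minimal sketch and not the implementation used in the thesis. It assumes that the predictors follow a multivariate normal distribution with a known mean vector and covariance matrix (in practice these would be estimated from development or local data), and it fills each missing value with its conditional mean given the observed values of the same patient. The function names `jmi_impute` and `mean_impute`, the deterministic conditional-mean variant, and the toy numbers are all illustrative assumptions.

```python
import numpy as np


def jmi_impute(x, mu, sigma):
    """Fill NaN entries of one patient vector x with their conditional mean,
    assuming x ~ N(mu, sigma). Hypothetical helper, not the thesis code."""
    x = np.asarray(x, dtype=float)
    mu = np.asarray(mu, dtype=float)
    sigma = np.asarray(sigma, dtype=float)
    miss = np.isnan(x)
    obs = ~miss
    out = x.copy()
    if not miss.any():
        return out
    if not obs.any():
        # Nothing observed: the conditional mean equals the marginal mean.
        out[miss] = mu[miss]
        return out
    s_oo = sigma[np.ix_(obs, obs)]    # covariance among observed predictors
    s_mo = sigma[np.ix_(miss, obs)]   # covariance between missing and observed
    # E[x_miss | x_obs] = mu_miss + S_mo S_oo^{-1} (x_obs - mu_obs)
    out[miss] = mu[miss] + s_mo @ np.linalg.solve(s_oo, x[obs] - mu[obs])
    return out


def mean_impute(x, mu):
    """Replace NaN entries by the sample mean, ignoring observed values."""
    x = np.asarray(x, dtype=float)
    return np.where(np.isnan(x), np.asarray(mu, dtype=float), x)


# Toy example with two standardized predictors (assumed values).
mu = np.array([0.0, 0.0])
sigma = np.array([[1.0, 0.5],
                  [0.5, 1.0]])
patient = np.array([2.0, np.nan])

jmi_result = jmi_impute(patient, mu, sigma)    # second entry becomes 1.0
mean_result = mean_impute(patient, mu)         # second entry becomes 0.0
```

With a correlation of 0.5, the conditional-mean imputation moves halfway towards the observed value, whereas mean imputation ignores it. If the covariance matrix were the identity (no correlation), both approaches would return the same value, which matches the observation that JMI adds most when missing predictors are correlated with observed (auxiliary) variables.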

URI

https://studenttheses.uu.nl/handle/20.500.12932/43354

Collections

Theses