Iterative Imputation in Python: A study on the performance of the package IterativeImputer

Klomp, Tinke

View/Open

ThesisTinkeKlomp.pdf (281.2Kb)

Publication date

2022

Author

Klomp, Tinke

Summary

This study evaluates whether the Python package IterativeImputer can yield valid estimates through iterative imputation of missing data. Its performance was analyzed by means of a simulation study and compared with two benchmark methods: iterative imputation with mice in R and complete case analysis. In each simulation repetition, data were generated and amputed under varying conditions (e.g. missing data mechanisms and missing data proportions), the missing data were handled by each of the three methods, and multiple regression models were estimated. The resulting estimates were pooled and evaluated on bias, coverage rate and confidence interval width. IterativeImputer generated results that were relatively low in bias. However, its coverage rates were below nominal coverage. This may be explained by the confidence interval widths, which were generally too small to contain the true value. The Python package does not perform as adequately as mice and does not outperform complete case analysis. Therefore, IterativeImputer is not suitable as an imputation tool for drawing inferences.
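The three evaluation criteria named in the summary can be illustrated with a minimal sketch. It is not the code used in the thesis: the function name `evaluate`, the result layout (one estimate with its confidence interval bounds per repetition) and the numbers are all hypothetical and chosen only for illustration.

```python
from statistics import mean


def evaluate(results, true_value):
    """Summarize simulation results for a single regression coefficient.

    results: list of (estimate, ci_lower, ci_upper), one tuple per
             simulation repetition (hypothetical layout, an assumption).
    true_value: the coefficient value used to generate the data.
    """
    # Bias: average estimate across repetitions minus the true value.
    bias = mean(est for est, _, _ in results) - true_value
    # Coverage rate: share of intervals that contain the true value.
    coverage = mean(
        1.0 if lower <= true_value <= upper else 0.0
        for _, lower, upper in results
    )
    # Confidence interval width: average distance between the bounds.
    ci_width = mean(upper - lower for _, lower, upper in results)
    return {"bias": bias, "coverage": coverage, "ci_width": ci_width}


# Made-up results from four repetitions, for illustration only.
results = [
    (0.48, 0.30, 0.66),
    (0.55, 0.40, 0.70),
    (0.51, 0.45, 0.57),
    (0.42, 0.33, 0.49),
]
summary = evaluate(results, true_value=0.5)
print(summary)
```

In a sketch like this, a method can show a bias close to zero while its coverage rate still falls below the nominal level, because intervals that are too narrow miss the true value more often than intended. That is the pattern the summary reports for IterativeImputer.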

URI

https://studenttheses.uu.nl/handle/20.500.12932/42511

Collections

Theses