Iterative Imputation in Python: A study on the performance of the package IterativeImputer
Summary
This study evaluates whether the Python package IterativeImputer can yield
valid estimates through iterative imputation of missing data. The performance
was analyzed by means of a simulation study and compared to the
benchmark methods of iterative imputation with mice in R and complete case
analysis. In each simulation repetition, data were generated, amputed under
varying conditions (e.g. missing data mechanisms and missing data proportions),
and handled by each of the three missing-data methods, after which multiple
regression models were estimated. The estimates were pooled and evaluated on
bias, coverage rate and confidence interval width. IterativeImputer generated
results that were relatively low in bias. However, the produced coverage
rates were below the nominal level. This may be explained by
the confidence intervals, which were generally too narrow to contain
the true parameter values. The Python package does not perform as well
as mice and does not outperform complete case analysis. Therefore,
IterativeImputer is not suitable as an imputation tool for drawing inferences.
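
The pooling and evaluation steps described above can be illustrated with a minimal sketch. It assumes that estimates from the imputed datasets are combined with Rubin's rules, which the summary does not state explicitly. The function names `pool_rubin` and `evaluate` are hypothetical. The interval uses a normal quantile as a simplification, whereas multiple imputation software typically uses a t reference distribution. This is not the study's own code.

```python
from statistics import NormalDist, mean, variance


def pool_rubin(estimates, variances, level=0.95):
    """Pool m estimates and their squared standard errors (assumed: Rubin's rules)."""
    m = len(estimates)
    q_bar = mean(estimates)            # pooled point estimate
    u_bar = mean(variances)            # within-imputation variance
    b = variance(estimates)            # between-imputation variance (m - 1 denominator)
    total = u_bar + (1 + 1 / m) * b    # total variance
    # Simplification: normal quantile instead of a t reference distribution.
    z = NormalDist().inv_cdf(0.5 + level / 2)
    half_width = z * total ** 0.5
    return q_bar, q_bar - half_width, q_bar + half_width


def evaluate(pooled, true_value):
    """Summarise (estimate, lower, upper) tuples collected over all repetitions."""
    bias = mean(est for est, _, _ in pooled) - true_value
    coverage = mean(1.0 if lo <= true_value <= hi else 0.0 for _, lo, hi in pooled)
    ci_width = mean(hi - lo for _, lo, hi in pooled)
    return {"bias": bias, "coverage": coverage, "ci_width": ci_width}
```

In a simulation, `pool_rubin` would be called once per repetition on the coefficient estimates from the imputed datasets, and `evaluate` once at the end on the collected results. A coverage value well below the chosen level, combined with a small average width, would correspond to the pattern reported for IterativeImputer.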