Making Use of Multiple Imputation to Analyze Heaped Data
Summary
This paper examines the use of multiple imputation to analyze heaped data. When people are asked to recall durations such as unemployment spells, they tend to round their answers to the nearest year or half year, causing abnormal concentrations of responses at these durations. To model such heaped data, a method is developed that specifies both the heaping mechanism and the underlying true model; the fitted result is referred to as the estimated underlying model. This model is used to create new data sets by multiple imputation, so that new durations are generated for the persons who rounded their duration. The present paper examines whether it is more favourable to obtain the estimates directly from the estimated underlying model or from the multiply imputed data. A simulation study is performed in which misspecification of the model and misspecification of the heaping mechanism are introduced, so that the estimates from the two methods can be compared. The results show that multiple imputation leads to more precise estimates and is more robust to model misspecification than estimates based directly on the estimated underlying model. Both methods appear to be robust to misspecification of the heaping intervals.
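
The two routes compared in the summary can be illustrated with a minimal sketch. It is not the model used in the paper: it assumes exponentially distributed durations in months, rounding to the nearest multiple of six months with a fixed probability, and heaping intervals of three months on either side of each heap point. All function names and parameter values are hypothetical. The rate is fitted by a simple EM-style iteration that treats heaped answers as interval-censored, and the imputations are combined with Rubin's rules. For brevity the imputations use the fitted rate as a fixed value, which ignores parameter uncertainty that a proper multiple imputation would propagate.

```python
import math
import random

HEAP_STEP = 6.0          # assumption: answers are rounded to multiples of 6 months
HALF = HEAP_STEP / 2     # assumption: heaping interval is [h - 3, h + 3)

def simulate_heaped(n, rate, p_heap, rng):
    """Exponential durations; with probability p_heap a respondent rounds."""
    reported = []
    for _ in range(n):
        t = rng.expovariate(rate)
        if t >= HALF and rng.random() < p_heap:
            t = HEAP_STEP * math.floor(t / HEAP_STEP + 0.5)  # nearest half year
        reported.append(t)
    return reported

def is_heaped(t):
    # With continuous durations, an exact multiple of 6 is taken to be rounded.
    return t > 0 and t % HEAP_STEP == 0

def truncated_mean(rate, a, b):
    """Mean of an exponential(rate) variable restricted to [a, b)."""
    ea, eb = math.exp(-rate * a), math.exp(-rate * b)
    return 1.0 / rate + (a * ea - b * eb) / (ea - eb)

def fit_rate(reported, n_iter=50):
    """Route 1: estimate the rate directly, heaped values interval-censored."""
    n = len(reported)
    rate = n / sum(reported)  # naive starting value
    for _ in range(n_iter):
        total = sum(
            truncated_mean(rate, t - HALF, t + HALF) if is_heaped(t) else t
            for t in reported
        )
        rate = n / total
    return rate

def draw_truncated(rate, a, b, rng):
    """Inverse-CDF draw from an exponential(rate) restricted to [a, b)."""
    ea, eb = math.exp(-rate * a), math.exp(-rate * b)
    return -math.log(ea - rng.random() * (ea - eb)) / rate

def multiply_impute(reported, rate, m, rng):
    """Route 2: impute heaped durations m times, combine with Rubin's rules."""
    n = len(reported)
    estimates, variances = [], []
    for _ in range(m):
        completed = [
            draw_truncated(rate, t - HALF, t + HALF, rng) if is_heaped(t) else t
            for t in reported
        ]
        est = n / sum(completed)           # complete-data MLE of the rate
        estimates.append(est)
        variances.append(est ** 2 / n)     # its asymptotic variance
    q_bar = sum(estimates) / m
    within = sum(variances) / m
    between = sum((q - q_bar) ** 2 for q in estimates) / (m - 1)
    return q_bar, within + (1 + 1 / m) * between

if __name__ == "__main__":
    rng = random.Random(0)
    data = simulate_heaped(2000, rate=0.1, p_heap=0.4, rng=rng)
    direct = fit_rate(data)
    pooled, total_var = multiply_impute(data, direct, m=20, rng=rng)
    print("direct estimate:", direct)
    print("pooled MI estimate:", pooled, "std. error:", math.sqrt(total_var))
```

In this toy setting both routes use a correctly specified model, so they should give similar answers. The comparison in the summary becomes interesting only once the analysis model or the heaping mechanism is deliberately misspecified, which this sketch does not attempt.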