Household Fertility Forecasting in the Netherlands: Data-Driven Predictions
Summary
Forecasting fertility trends is crucial for understanding demographic shifts and their societal implications. However, accurately predicting fertility patterns remains a challenge because economic, social, and individual factors interact in complex ways. This study, part of the PreFer data challenge, proposes a data-driven framework to predict fertility trends in the Netherlands. Using the large-scale longitudinal LISS dataset, it examines the dataset's suitability for predicting fertility by identifying key attributes related to demographic information, household characteristics, income, employment, and health metrics. Several classifiers, including neural network, random forest, and linear regression models, were trained and evaluated. The methodology first applied stratified k-fold cross-validation to select a hyperparameter combination for each model, then used bootstrap resampling to assess model robustness and the impact of training data size. The results indicate that AI-driven methods can capture the underlying patterns in fertility data: the models achieved an average F1 score of 0.7 with narrow 95% confidence intervals (CI) within 0.01, indicating consistent performance across resamples. These findings provide insight into the predictive potential of the LISS dataset. However, further research is needed to validate them across different data sources and to incorporate additional relevant attributes that may improve predictive accuracy.
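To make the evaluation procedure concrete, the following minimal Python sketch illustrates the two ingredients named in the summary: a stratified split of the sample into k folds, and a percentile bootstrap 95% CI for the F1 score. It is an illustration under stated assumptions, not the study's implementation. The function names, the toy labels, and the seed values are hypothetical, and for brevity the bootstrap resamples held-out predictions rather than retraining models on resampled training sets of varying size, which is a simplification of the procedure described above.

```python
import random
from collections import defaultdict


def f1_score(y_true, y_pred):
    """Binary F1 for the positive class (label 1)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def stratified_folds(labels, k, seed=0):
    """Split indices into k folds that each keep roughly the overall class ratio."""
    rng = random.Random(seed)
    by_class = defaultdict(list)
    for i, y in enumerate(labels):
        by_class[y].append(i)
    folds = [[] for _ in range(k)]
    for indices in by_class.values():
        rng.shuffle(indices)
        # Deal each class out round-robin so every fold gets its share.
        for pos, i in enumerate(indices):
            folds[pos % k].append(i)
    return folds


def bootstrap_f1_ci(y_true, y_pred, n_boot=1000, alpha=0.05, seed=0):
    """Percentile bootstrap: resample (truth, prediction) pairs with replacement."""
    rng = random.Random(seed)
    n = len(y_true)
    scores = []
    for _ in range(n_boot):
        idx = [rng.randrange(n) for _ in range(n)]
        scores.append(f1_score([y_true[i] for i in idx],
                               [y_pred[i] for i in idx]))
    scores.sort()
    lower = scores[int((alpha / 2) * n_boot)]
    upper = scores[int((1 - alpha / 2) * n_boot) - 1]
    return sum(scores) / n_boot, lower, upper


if __name__ == "__main__":
    # Hypothetical toy data: 20 positive and 80 negative households.
    y_true = [1] * 20 + [0] * 80
    # Hypothetical predictions: 15 true positives, 5 misses, 5 false alarms.
    y_pred = [1] * 15 + [0] * 5 + [1] * 5 + [0] * 75
    folds = stratified_folds(y_true, k=5)
    print("fold sizes:", [len(f) for f in folds])
    mean_f1, lower, upper = bootstrap_f1_ci(y_true, y_pred)
    print(f"F1 mean={mean_f1:.3f}, 95% CI=({lower:.3f}, {upper:.3f})")
```

Stratification matters here because the positive class is typically the minority in fertility outcome data, so an unstratified split could leave a fold with too few positive cases for the F1 score to be meaningful.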