Predicting Household Welfare Outcomes Using Observable Socio-Economic Characteristics
Summary
This thesis explores how well machine learning models can predict household
welfare outcomes from observable socio-economic characteristics within the
context of Index-Based Livestock Insurance (IBLI). IBLI aims to mitigate the
economic shocks of climate-induced livestock loss by offering insurance based on
satellite data. While previous studies have shown that the welfare effects of IBLI
are heterogeneous, this thesis investigates whether machine learning methods can
uncover that heterogeneity across household subgroups and identify the household
characteristics that matter most.
Four models are evaluated: Lasso Regression, TabTransformers, Generalized
Random Forest, and Bayesian Ridge Regression, using cattle and goat
datasets. Both datasets are drawn from a single source of data on herders
in southern Ethiopia who make IBLI purchase decisions for their herds. Results
show that all models performed better on the goat dataset, with TabTransformers
outperforming the others in predictive power. Although overall performance was
modest, features such as the settlement status of households and trust in
village insurance promoters consistently emerged as key predictors. These
findings highlight both the capabilities and the limitations of applying machine
learning in this context.