Modelling Food Insecurity in Ethiopia: Towards a machine learning model that predicts the transitions in food security using scalable features.
Summary
Food insecurity has become an increasingly serious problem. 510, an initiative of the Netherlands Red Cross, therefore wants to use data to make humanitarian aid faster and more (cost-)effective. The goal of this study, carried out together with 510, is to create a machine learning model (in our case an XGBoost model) that uses scalable features (such as satellite imagery) to predict transitions in food insecurity. After optimizing the XGBoost model through resampling, feature engineering and hyperparameter tuning, we validated it by comparing it against several baselines. On average, the XGBoost model (macro F1 score of 0.526) came close to, but did not match, the benchmark, the predictions of the Famine Early Warning Systems Network (2019), which had a macro F1 score of 0.637. Nevertheless, our model identified improvements in the food security of livelihood zones better than the benchmark (F1 score of 0.506 versus 0.498). In addition, the features that the XGBoost model identified as most relevant, such as climate and land drivers, correspond with those found in the study of Misselhorn (2004). Furthermore, the XGBoost model also outperforms the baselines for varying prediction intervals (4 or 12 months ahead). Lastly, the XGBoost model revealed a spatial dependency between livelihood zones, since livelihood zones with similar predictions appear to be clustered. All in all, this study shows that our XGBoost model has predictive value, which yields new insights and opens new doors. There is potential to improve it further by adding more features and taking the spatial dependency into account. In the future, this can hopefully bring us closer to optimizing decision-making in humanitarian assistance and provide more insight into the complex phenomenon of food security.
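The scores quoted in this summary are F1 scores, reported both per class (for example, for improvements in food security) and as a macro average over all transition classes. As a reminder of how the two relate, the following is a minimal sketch in plain Python. The class labels ("deteriorate", "stable", "improve") and the toy predictions are assumptions made up for illustration; they are not the study's data, and the helper names per_class_f1 and macro_f1 are hypothetical.

```python
from collections import Counter


def per_class_f1(y_true, y_pred):
    """Return {label: F1}, treating each label one-vs-rest."""
    labels = sorted(set(y_true) | set(y_pred))
    tp, fp, fn = Counter(), Counter(), Counter()
    for truth, pred in zip(y_true, y_pred):
        if truth == pred:
            tp[truth] += 1
        else:
            fp[pred] += 1   # predicted label was wrong
            fn[truth] += 1  # true label was missed
    scores = {}
    for label in labels:
        # F1 = 2TP / (2TP + FP + FN); defined as 0 when the denominator is 0
        denom = 2 * tp[label] + fp[label] + fn[label]
        scores[label] = 2 * tp[label] / denom if denom else 0.0
    return scores


def macro_f1(y_true, y_pred):
    """Unweighted mean of the per-class F1 scores."""
    scores = per_class_f1(y_true, y_pred)
    return sum(scores.values()) / len(scores)


# Hypothetical toy labels, NOT taken from the study.
y_true = ["stable", "stable", "stable", "stable",
          "improve", "improve", "deteriorate", "deteriorate"]
y_pred = ["stable", "stable", "stable", "improve",
          "improve", "stable", "deteriorate", "stable"]

scores = per_class_f1(y_true, y_pred)
overall = macro_f1(y_true, y_pred)
print(scores)
print(round(overall, 3))
```

Because the macro average weights every class equally regardless of how often it occurs, a model can trail a benchmark on the macro score while still scoring higher on a single class, which is the pattern reported above for the "improvement" transitions.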