Estimating Post-Earthquake Aid Priority Areas
In the first days following a disaster, humanitarian decision makers often face a scarcity of information on the spatial aspects of the event’s impact, and thus on the affected population’s need for humanitarian aid. By learning from data of past events, Priority Index Models (PIMs) can rapidly produce an estimate of a disaster’s impact, which can help decision makers to identify aid priority areas. This enables empirically based decision support, in contrast to the more subjective models that are currently used. The main objective of this study is to explore the usability of pre- and post-event open data to train a model to rapidly estimate post-earthquake aid neediness for any earthquake-prone area on earth. As far as is known, machine learning algorithms have not previously been applied to predict aid priority areas after seismic hazards specifically. To achieve the research objective, the Gorkha earthquake of 2015 in Nepal was used as a test case. Country- and hazard-specific open data related to this earthquake were used to predict aid neediness. Damage to residential buildings was selected as the most suitable indicator of aid neediness. Three different statistical models were fitted to the data: a multivariate linear regression model and two random forest regression models (one predicting completely damaged houses and the other predicting a combination of completely and partially damaged houses). Twenty-four variables in four different categories (hazard, exposure, physical vulnerability and socio-economic vulnerability) were identified as predictors of post-earthquake structural damage. All three models could successfully produce an output on administrative level 4 (VDC) for the 16 most affected districts. Statistically, the random forest model predicting both partially and completely damaged houses performed best, with an R-squared of 0.63 on an independent test dataset.
However, the random forest model predicting only completely damaged houses is preferable because its output is more intuitive and extendable. Its R-squared of 0.60 is also not much lower, and two-thirds of the highest priority areas were identified correctly. The linear model prediction resulted in an R-squared of 0.53. Additionally, this model’s output gave reason to suspect that the identified relationships between damage and ‘school attendance’, ‘toilet presence’ and ‘foundation type’ might not be applicable to other events or countries. The mean macroseismic intensity and total population were the most important predictors in all models and are considered to be indispensable model components. For a future event within Nepal a model output of similar accuracy can be expected, but the presence of case- and country-specific relationships in the current model makes a useful estimation for a future event in another country very unlikely. However, after training on events in different countries, the model is expected to be able to produce an output that is useful for aid prioritisation decision making. The extent to which the model can be successfully applied to different countries and cases can be improved by excluding secondary hazard susceptibility variables, finding an alternative uniform socio-economic vulnerability variable and using composite building quality variables. Model simplicity and data preparedness are key aspects in the successful further development of these models.
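The two evaluation figures quoted above, R-squared on a test set and the share of highest priority areas identified correctly, can be illustrated with a minimal Python sketch. It assumes the common definition of R-squared as 1 - SS_res / SS_tot and assumes that "identified correctly" means an area appears in both the observed and the predicted top-k ranking; the abstract does not state which definitions the study used. The function names and the damage counts are hypothetical and are not taken from the study.

```python
from typing import Sequence


def r_squared(observed: Sequence[float], predicted: Sequence[float]) -> float:
    """Coefficient of determination, assumed here to be 1 - SS_res / SS_tot."""
    if len(observed) != len(predicted) or len(observed) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    mean_obs = sum(observed) / len(observed)
    ss_res = sum((o - p) ** 2 for o, p in zip(observed, predicted))
    ss_tot = sum((o - mean_obs) ** 2 for o in observed)
    if ss_tot == 0:
        raise ValueError("observed values have zero variance")
    return 1.0 - ss_res / ss_tot


def top_k_overlap(observed: Sequence[float], predicted: Sequence[float], k: int) -> float:
    """Share of the k most damaged areas that are also among the k highest predictions.

    Assumption: this is one plausible reading of "priority areas identified
    correctly", not necessarily the measure used in the study.
    """
    if k <= 0 or k > len(observed) or len(observed) != len(predicted):
        raise ValueError("k must be in 1..n and inputs must be of equal length")

    def top(values: Sequence[float]) -> set:
        ranked = sorted(range(len(values)), key=lambda i: values[i], reverse=True)
        return set(ranked[:k])

    return len(top(observed) & top(predicted)) / k


# Hypothetical damaged-house counts for six areas (illustrative values only).
observed_damage = [120, 80, 40, 10, 5, 0]
predicted_damage = [100, 15, 60, 20, 0, 5]

print("R-squared:", r_squared(observed_damage, predicted_damage))
print("Top-3 overlap:", top_k_overlap(observed_damage, predicted_damage, 3))
```

Reporting both numbers is useful because a model can explain a moderate share of the variance overall while still ranking the worst-hit areas well, or the reverse, and ranking is what matters most for prioritisation.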