Predicting financial distress at Dutch general hospitals: a machine learning approach
Summary
According to recent benchmark outcomes, a quarter of the Dutch hospital sector is financially unhealthy, and the Dutch government hopes to save a structural 1.9 billion euros per year in curative care. A financial outlook for the coming years is therefore desirable: it would allow hospitals to act before they end up in financial distress. This thesis focuses on predicting the financial situation of Dutch general hospitals using a combination of machine learning and text mining techniques. Machine learning has been applied to predict financial distress in other sectors before, but the (Dutch) hospital sector has not yet been investigated. This thesis also examines the finding from the literature that patient ratings and textual data from annual reports improve the prediction of financial distress, a claim that has not yet been tested for hospitals.
This research shows that, in principle, a combination of machine learning and text mining techniques can be used to predict financial distress. However, the initial data analysis showed that prediction models using financial statement data performed poorly on average (an average AUC score under 0.6), probably because the dataset has too many features relative to the number of observations (a dimensionality problem). Further analysis showed that using only the six financial ratios of the stress test as predictors generally improved the performance of the prediction models to a fair AUC score (0.7 – 0.8). Adding patient ratings and/or textual data from annual reports lowered the AUC score in the initial data analysis by at least 0.08, whereas further analysis showed that adding patient ratings improved the score in some cases.
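To make the AUC scores quoted above concrete, the following minimal sketch computes an AUC score by hand. It is an illustration only, not the code used in this thesis: the function name `auc_score` and the labels and scores in the example are made up. The AUC is the probability that a randomly chosen distressed hospital receives a higher predicted score than a randomly chosen healthy one, so 0.5 corresponds to random guessing and 1.0 to perfect separation.

```python
def auc_score(labels, scores):
    """Area under the ROC curve, computed by pairwise comparison.

    labels: 1 = financially distressed, 0 = healthy (illustrative coding).
    scores: predicted distress scores, higher means more likely distressed.
    """
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one example of each class")

    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # distressed hospital ranked above healthy one
            elif p == n:
                wins += 0.5   # ties count as half
    return wins / (len(positives) * len(negatives))


# Hypothetical example: four hospitals, two of them distressed.
labels = [1, 1, 0, 0]
scores = [0.9, 0.4, 0.5, 0.1]
print(auc_score(labels, scores))  # 3 of the 4 pairs are ranked correctly
```

In this made-up example three of the four distressed/healthy pairs are ordered correctly, giving an AUC of 0.75, which falls in the range this summary calls fair.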