Machine Learning for Referral of Patients with Chest Pains using Regular Care Data
Summary
This research uses machine learning models to predict the appropriate care for patients referred to the University Medical Center Utrecht (UMCU). The predictions are based on regular care data, which includes blood measurements, previous patient appointments, and referral letters; the referral letters are text mined. The goal is to classify patients early so that expensive scans can be prevented. Only patients who have been referred to the cardiology department with chest pain symptoms are considered. Each patient can be assigned one or more classes, making this a multi-label classification problem. Time series data is constructed from each patient's history, and models that can use time series data are compared to models that cannot. As time series models, the LSTM and the time-aware LSTM (T-LSTM) are compared, and the effect of adding target replication to both models is investigated. As non-time series models, XGBoost, SVM, and neural network models are used. Labels are constructed from the available data, after which the number of labels is reduced from 30 to 9 using a novel label combination algorithm. Additionally, subgroup discovery is performed, which finds a group defined by a single rule with an accuracy of 73%. Overall, the time-aware LSTM does not perform better than the vanilla LSTM. The time series approach outperforms the non-time series models. Target replication only improves the performance of the vanilla LSTM. The best-performing model, the LSTM with target replication, has a micro F1 score of 0.62 and an AUC of 0.84. Performance might be limited by the fact that labels had to be constructed from scratch.
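As an illustration of the micro F1 score reported above, the following minimal sketch shows how such a score can be computed for a multi-label problem. It pools true positives, false positives, and false negatives over all patients and labels before computing a single F1 value. The function name `micro_f1` and the toy label matrices are hypothetical and are not taken from the study or its data.

```python
from typing import Sequence


def micro_f1(y_true: Sequence[Sequence[int]], y_pred: Sequence[Sequence[int]]) -> float:
    """Micro-averaged F1 for binary indicator matrices (rows: patients, columns: labels)."""
    tp = fp = fn = 0
    for true_row, pred_row in zip(y_true, y_pred):
        for t, p in zip(true_row, pred_row):
            if t == 1 and p == 1:
                tp += 1  # label correctly predicted
            elif t == 0 and p == 1:
                fp += 1  # label predicted but not present
            elif t == 1 and p == 0:
                fn += 1  # label present but missed
    denominator = 2 * tp + fp + fn
    # Convention assumed here: return 0.0 when there are no positives at all.
    return 2 * tp / denominator if denominator else 0.0


# Toy example (hypothetical): 3 patients, 3 labels each.
y_true = [[1, 0, 1], [0, 1, 0], [1, 1, 0]]
y_pred = [[1, 0, 0], [0, 1, 1], [1, 0, 0]]
score = micro_f1(y_true, y_pred)
print(round(score, 3))
```

Because the counts are pooled across labels, frequent labels influence the micro-averaged score more than rare ones, which is worth keeping in mind when label frequencies are imbalanced.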