Predicting diagnoses of patients in the Emergency Room: a multi-label text classification approach
Summary
Artificial Intelligence (AI) models have considerable potential in the medical domain because of the large amount of data available, and previous research shows promising implementations. This thesis aims to build an AI model that predicts patient diagnoses from the History of Present Illness (HPI) text, to support clinicians in the Emergency Room (ER). Such a model could help reduce diagnostic errors and lower the workload of ER clinicians. No previous work has built an explainable predictive model for diagnoses, based on Dutch HPI texts, to support clinicians in the ER. A multi-label dataset of more than 120,000 HPI texts with corresponding diagnosis labels was used to train and compare a simple baseline model and three deep learning models: a sequence-to-sequence model and two BERT-based models. Evaluated by F1 score, none of the models achieved high performance on the whole dataset. Interestingly, the two BERT-based models did achieve high performance when trained on smaller samples of the dataset consisting of diagnoses that are more difficult for clinicians to distinguish. It is therefore proposed to further investigate such more specific models, which would support clinicians in deciding between a narrower set of diagnoses.
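To make the evaluation criterion concrete, the following is a minimal sketch of how an F1 score can be computed for multi-label predictions. It is an illustration under stated assumptions, not the evaluation code of this thesis: the summary does not state which averaging scheme was used, so both micro- and macro-averaged F1 are shown, and the function names and diagnosis labels are hypothetical.

```python
from typing import Dict, List, Set


def f1_from_counts(tp: int, fp: int, fn: int) -> float:
    """F1 = 2*TP / (2*TP + FP + FN); defined as 0.0 when the denominator is zero."""
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def multilabel_f1(y_true: List[Set[str]], y_pred: List[Set[str]]) -> Dict[str, float]:
    """Micro- and macro-averaged F1 over per-text label sets (hypothetical helper)."""
    labels = sorted(set().union(*y_true, *y_pred))
    # Per-label counts stored as [true positives, false positives, false negatives].
    counts = {label: [0, 0, 0] for label in labels}
    for true, pred in zip(y_true, y_pred):
        for label in true & pred:
            counts[label][0] += 1
        for label in pred - true:
            counts[label][1] += 1
        for label in true - pred:
            counts[label][2] += 1

    # Micro: pool the counts over all labels, then compute a single F1.
    tp = sum(c[0] for c in counts.values())
    fp = sum(c[1] for c in counts.values())
    fn = sum(c[2] for c in counts.values())
    micro = f1_from_counts(tp, fp, fn)

    # Macro: compute F1 per label, then take the unweighted mean.
    macro = sum(f1_from_counts(*c) for c in counts.values()) / len(labels) if labels else 0.0
    return {"micro": micro, "macro": macro}


# Hypothetical gold and predicted diagnoses for three HPI texts.
y_true = [{"pneumonia"}, {"pneumonia", "heart failure"}, {"appendicitis"}]
y_pred = [{"pneumonia"}, {"pneumonia"}, {"gastroenteritis"}]
scores = multilabel_f1(y_true, y_pred)
```

In this toy example the micro-averaged F1 (4/7) is well above the macro-averaged F1 (0.25), because the one frequent label is predicted correctly while the three rare labels are all missed. The choice of averaging therefore matters when diagnosis labels are unevenly distributed.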