Machine learning in healthcare invoicing systems: Using text mining and supervised learning to verify classifications of unstructured medical texts
In healthcare invoicing, physicians often make mistakes when assigning activity codes to treatments. Health insurance companies require hospitals to check the assigned activity codes, so Electronic Health Records (EHRs) are currently examined manually. This thesis proposes a system that automatically checks whether an assigned activity code is correct, based on unstructured EHR texts. This binary prediction was made with supervised machine learning algorithms. Four algorithms were compared: logistic regression, Naive Bayes, neural networks, and support vector machines. The classification problem was then extended to a multi-class classification that predicts the reason for rejecting an incorrectly assigned activity code. Accuracies of 93.3% and 87.4% were achieved for the binary and the multi-class classification, respectively. Feature selection was found to have a greater impact on the results than the choice of algorithm. Future work could extend the system to new activity codes that have other requirements. Moreover, the current system could be used to prevent incorrect assignments rather than to check them afterwards.
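As an illustration of the binary check described above, the following is a minimal sketch of one of the compared algorithm families, Naive Bayes, applied to bag-of-words features. It is not the thesis implementation: the multinomial variant with add-one smoothing, the whitespace tokenizer, the class name `NaiveBayesTextClassifier`, the labels `correct` and `incorrect`, and the four toy record texts are all assumptions invented for this example.

```python
import math
from collections import Counter, defaultdict


def tokenize(text):
    """Lowercase and split on whitespace; a stand-in for real text mining."""
    return text.lower().split()


class NaiveBayesTextClassifier:
    """Multinomial Naive Bayes with add-one (Laplace) smoothing."""

    def fit(self, texts, labels):
        self.class_counts = Counter(labels)       # documents per class
        self.word_counts = defaultdict(Counter)   # token counts per class
        self.vocabulary = set()
        for text, label in zip(texts, labels):
            tokens = tokenize(text)
            self.word_counts[label].update(tokens)
            self.vocabulary.update(tokens)
        return self

    def predict(self, text):
        total_docs = sum(self.class_counts.values())
        vocab_size = len(self.vocabulary)
        best_label, best_score = None, -math.inf
        for label, doc_count in self.class_counts.items():
            # Log prior plus the sum of smoothed log likelihoods.
            score = math.log(doc_count / total_docs)
            total_words = sum(self.word_counts[label].values())
            for token in tokenize(text):
                if token not in self.vocabulary:
                    continue  # ignore tokens never seen during training
                count = self.word_counts[label][token]
                score += math.log((count + 1) / (total_words + vocab_size))
            if score > best_score:
                best_label, best_score = label, score
        return best_label


# Hypothetical toy records, invented for illustration only.
train_texts = [
    "patient seen at outpatient clinic consultation performed",
    "consultation performed follow-up scheduled",
    "telephone contact only no consultation",
    "patient did not show up no consultation",
]
train_labels = ["correct", "correct", "incorrect", "incorrect"]

model = NaiveBayesTextClassifier().fit(train_texts, train_labels)
prediction = model.predict("telephone contact with patient")
print(prediction)
```

The same `fit` and `predict` interface would carry over to the multi-class setting by replacing the two labels with one label per rejection reason. Because the abstract reports that feature selection mattered more than the choice of algorithm, the tokenizer and vocabulary are likely the parts of such a sketch that deserve the most attention.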