Language as a Predictor of Anxiety, Depression, and Self-Efficacy Scores and Recovery Rate in Teenagers with Chronic Fatigue Syndrome
Summary
Artificial Intelligence (AI) models are now used in many areas of healthcare. This thesis investigates the relationship between the language use of teenage patients with Chronic Fatigue Syndrome (CFS) and their anxiety, depression, self-efficacy, and CFS treatment outcome. The aim is to make it easier for healthcare professionals to obtain an indication of a patient’s level of anxiety or depression, their degree of self-efficacy, and whether a specific type of treatment is likely to work for them. Obtaining such an indication from a short text written by the patient would allow effective treatment to start earlier. The thesis uses data from 102 patients who received online, email-based Cognitive Behavioural Therapy and has two main focus areas.

The first focus area examines the correlation between a patient’s language use and their anxiety, depression, and self-efficacy. To this end, n-gram-based language models and Naive Bayes models were trained on the email text to predict the patients’ anxiety, depression, and self-efficacy scores. The language models’ results were compared with those of models trained on randomly generated scores and were shown to be statistically significant. The language models outperformed Naive Bayes, and it was concluded that language use correlates with anxiety, depression, and self-efficacy.

The second focus area examines how well the language patients used in emails to their therapists can be combined with various AI models to predict their level of anxiety and depression, their degree of self-efficacy, and their CFS treatment outcome. Three feature types were used: the number of non-agentic language features per email, Bag of Words, and BERTje embeddings. These features served as input to both logistic regression models and neural networks. Among the logistic regression models, those predicting self-efficacy from BERTje embeddings performed best. The neural networks using BERTje embeddings outperformed the logistic regression models in predicting anxiety, depression, self-efficacy, and treatment outcome. It was therefore concluded that anxiety, depression, self-efficacy, and patient recovery can be predicted from language use.
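
To illustrate the idea behind the first focus area, the following is a minimal sketch of how an n-gram language model could be used to predict a score category from text. It is not the implementation used in this thesis. It assumes that scores are binned into two hypothetical classes ("high" and "low"), that one add-one smoothed bigram model is trained per class, and that a new text is assigned to the class whose model gives it the highest log-likelihood. The example sentences, class names, and function names are invented for illustration only.

```python
import math
from collections import Counter


def bigrams(tokens):
    """Return the bigrams of a token list, padded with start/end markers."""
    padded = ["<s>"] + tokens + ["</s>"]
    return list(zip(padded, padded[1:]))


class BigramModel:
    """Add-one smoothed bigram language model for one (assumed) score class."""

    def __init__(self, texts):
        self.bigram_counts = Counter()
        self.context_counts = Counter()
        self.vocab = {"</s>"}
        for text in texts:
            tokens = text.lower().split()
            self.vocab.update(tokens)
            for prev, cur in bigrams(tokens):
                self.bigram_counts[(prev, cur)] += 1
                self.context_counts[prev] += 1

    def log_prob(self, text, vocab_size):
        """Log-likelihood of a text under this model with add-one smoothing."""
        total = 0.0
        for prev, cur in bigrams(text.lower().split()):
            numerator = self.bigram_counts[(prev, cur)] + 1
            denominator = self.context_counts[prev] + vocab_size
            total += math.log(numerator / denominator)
        return total


def train(labelled_texts):
    """Train one bigram model per label and return them with a shared vocabulary size."""
    by_label = {}
    for text, label in labelled_texts:
        by_label.setdefault(label, []).append(text)
    models = {label: BigramModel(texts) for label, texts in by_label.items()}
    vocab = set().union(*(model.vocab for model in models.values()))
    return models, len(vocab) + 1  # +1 reserves probability mass for unseen words


def predict(models, vocab_size, text):
    """Return the label whose language model assigns the text the highest log-likelihood."""
    return max(models, key=lambda label: models[label].log_prob(text, vocab_size))


# Invented toy data; real emails and questionnaire scores would replace these.
train_data = [
    ("i feel tired and worried all day", "high"),
    ("i am worried i cannot do it", "high"),
    ("i went for a walk and felt fine", "low"),
    ("i felt fine after school today", "low"),
]
models, vocab_size = train(train_data)
print(predict(models, vocab_size, "i am worried all day"))
```

In such a setup, the comparison with randomly generated scores mentioned above would correspond to retraining the same models on shuffled or random labels and checking whether the real-label models perform significantly better.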