Applying machine learning in the classification of psychosis using syntactic, semantic and phonological features of speech

Slegers, F.

View/Open

Bachelor Thesis Fleur Slegers.pdf (1.139Mb)

Publication date

2019

Author

Slegers, F.

Summary

A variety of neurological and psychiatric illnesses are characterized by verbal communication disorders. Recently, there has been growing interest in automated speech-based techniques for screening for mental disorders. Schizophrenia spectrum disorders are characterized by diminished affective expression and disturbances in thought and language, which result in disorganized speech. Diagnosing schizophrenia is often a challenging process prone to subjectivity, as deviations in speech are subtle and follow one another rapidly. Schizophrenia is the most common of the psychotic disorders, a set of related conditions. As speech contains markers for schizophrenia, we believe that automated speech-based techniques may also be used to improve and simplify the process of diagnosing this disorder. In this study, we implement multiple machine learning algorithms to examine the extent to which psychosis can be classified using syntactic, semantic and phonological features of speech. These features were extracted from speech samples using the tools T-Scan, Word2Vec and OpenSMILE, respectively, resulting in three separate data sets. Speech samples were collected by interviewing 50 psychotic patients and 50 healthy controls. We investigate the suitability of five classification algorithms for classifying psychosis on each of the separate data sets: Logistic Regression, Naïve Bayes, Random Forest, Stochastic Gradient Descent and Support Vector Machines. Our results show that distinguishing psychotic patients from healthy controls is possible using speech-derived features. Reasonably high accuracy scores can be achieved by using syntactic, semantic or phonological information about speech. This research adds to the field of clinical language analysis and has implications for future use of speech-based analytics in the clinical diagnostic process.
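As an illustration of the comparison described in the summary, the following minimal sketch evaluates the five named classifier families on a single feature table using scikit-learn. It is not the thesis code. The data are synthetic stand-ins generated with make_classification (100 samples, two roughly balanced classes), and the Gaussian variant of Naïve Bayes, the feature standardization, the 5-fold stratified cross-validation and all hyperparameters are assumptions made for the example.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.linear_model import LogisticRegression, SGDClassifier
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.naive_bayes import GaussianNB
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler
from sklearn.svm import SVC

# Synthetic stand-in for ONE of the three feature sets (assumption):
# 100 speakers, 20 numeric features, binary label (patient vs. control).
X, y = make_classification(
    n_samples=100, n_features=20, n_informative=5, random_state=0
)

# The five algorithm families named in the summary.
# Variants and hyperparameters are illustrative assumptions.
classifiers = {
    "Logistic Regression": LogisticRegression(max_iter=1000),
    "Naive Bayes": GaussianNB(),
    "Random Forest": RandomForestClassifier(n_estimators=100, random_state=0),
    "Stochastic Gradient Descent": SGDClassifier(random_state=0),
    "Support Vector Machine": SVC(kernel="rbf"),
}

# Stratified folds keep the class ratio similar in every split.
cv = StratifiedKFold(n_splits=5, shuffle=True, random_state=0)

results = {}
for name, clf in classifiers.items():
    # Scaling lives inside the pipeline so it is fit on training folds only.
    model = make_pipeline(StandardScaler(), clf)
    scores = cross_val_score(model, X, y, cv=cv, scoring="accuracy")
    results[name] = scores.mean()
    print(f"{name}: mean accuracy {results[name]:.2f}")
```

To mirror the setup with three separate data sets, the same loop would be run once per feature table (syntactic, semantic, phonological), keeping the fold definition fixed so the scores are comparable across classifiers.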

URI

https://studenttheses.uu.nl/handle/20.500.12932/35135

Collections

Theses