        Utrecht University Student Theses Repository

        Machine Learning for Sentiment Analysis of Children's Diaries

        File
        MarjoleindeVries_FinalVersion_19mei.pdf (729.4Kb)
        Publication date
        2017
        Author
        Vries, M.J. de
        Summary
        This thesis investigates the automatic detection of sentiment expressed by children, using machine learning. The sentiment detection is used within a robotic companion that helps children with the daily struggles of living with Type 1 diabetes. The problem statement that guides this research is: To what extent is it possible to correctly classify the sentiment of a Dutch child’s diary entry by means of automatic text analysis? Results show that machine learning models perform significantly better than a symbolic sentiment scoring algorithm. Machine learning models are shown to be better at capturing context and complex negations. An ordinal or time-correlated adaptation of a machine learning model is evaluated as well, but results show that such a model has no advantage over a regular machine learning model. Additionally, this thesis introduces a new algorithm for semantic normalization, applied on top of standard morphological normalization. Results show that adding this step consistently improves performance in the current study. The new algorithm is especially useful here because the dataset is sparse and contains highly infrequent words.
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/25972
        Collections
        • Theses