        Utrecht University Student Theses Repository

        Analysis of question type classification and disambiguation

        MasterThesis_Nijhuis_MB_PublicationVersion.pdf (2.534Mb)
        Publication date
        2023
        Author
        Nijhuis, Antoine
        Summary
        This thesis presents research on question context analysis using Natural Language Processing (NLP) techniques. In the past, Question & Answer interactions were handled manually. Recent advances in NLP and Machine Learning (ML) have made it possible to implement a wide range of Question Answering Systems (QAS). These systems often use a classifier to assign questions to types before answering them. Historically, the two most prominent classification methods are rule-based models and Machine Learning models. Rule-based models use grammar rules and a dictionary to classify questions into categories and map these to the correct answer. Machine Learning models, such as neural networks, use mathematical models to learn from a Question & Answer data set, with the aim of minimizing classification errors when matching question and answer pairs. This research investigates how popular classification techniques perform in a restricted-domain environment with regard to question type recognition and question enrichment. Both tasks are performed on a large structured Dutch data set and a public English data set. Classification performance on these tasks is measured with two metrics: F1 score and area under the ROC curve (AUC). Results suggest that question ambiguity can be recognized with an F1 score upwards of 90%. ML techniques featuring deep learning perform best across both question type detection and question enrichment.
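
As a rough illustration of the ideas in the summary, and not the thesis's own implementation, the sketch below pairs a toy rule-based question type classifier with a per-class F1 computation. The rule table, the labels (`PERSON`, `LOCATION`, and so on) and the function names are hypothetical and chosen only for this example.

```python
# Toy rule-based question type classifier (hypothetical rules and labels).
RULES = {
    "how many": "NUMBER",
    "who": "PERSON",
    "where": "LOCATION",
    "when": "TIME",
}


def classify_question(question: str) -> str:
    """Return the label of the first rule whose prefix matches the question."""
    text = question.lower().strip()
    for prefix, label in RULES.items():
        if text.startswith(prefix):
            return label
    return "OTHER"


def f1_score(y_true, y_pred, positive):
    """F1 for one class: harmonic mean of precision and recall."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)
    if tp == 0:
        return 0.0  # no true positives: precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


if __name__ == "__main__":
    questions = ["Who wrote this thesis?", "Where is Utrecht?", "Is this ambiguous?"]
    gold = ["PERSON", "LOCATION", "OTHER"]
    predicted = [classify_question(q) for q in questions]
    print(predicted, f1_score(gold, predicted, "PERSON"))
```

A learned classifier would replace `classify_question` with a model fitted to labelled question data, while the F1 computation stays the same.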
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/45118
        Collections
        • Theses