Analysis of question type classification and disambiguation

Nijhuis, Antoine

dc.rights.license	CC-BY-NC-ND
dc.contributor	C. van Toledo
dc.contributor.advisor	Brinkhuis, Matthieu
dc.contributor.author	Nijhuis, Antoine
dc.date.accessioned	2023-09-07T23:00:53Z
dc.date.available	2023-09-07T23:00:53Z
dc.date.issued	2023
dc.identifier.uri	https://studenttheses.uu.nl/handle/20.500.12932/45118
dc.description.abstract	This thesis presents research on question context analysis using Natural Language Processing (NLP) techniques. In the past, Question & Answer interactions were handled manually. Recent advancements in NLP and Machine Learning (ML) have made it possible to implement a wide range of Question Answering Systems (QAS). These systems often use a classifier to assign questions to types before answering them. Historically, the two most prominent classification methods are rule-based models and Machine Learning models. Rule-based models use grammar rules and a dictionary to classify questions into categories and map these to the correct answer. Machine Learning models, such as neural networks, use mathematical equations to learn from a Question & Answer data set, with the intention of minimizing classification errors when matching question and answer pairs. This research aims to discover how popular classification techniques perform in a restricted-domain environment with regard to question type recognition and question enrichment. These two tasks are performed on a large structured Dutch data set and a public English data set. To determine how the classifiers score on these two tasks, two rigorous metrics of classification power are applied: F1 score and Area Under the ROC Curve (AUC). Results suggest that question ambiguity can be recognized with an F1 score upwards of 90%. ML techniques featuring deep learning perform best across both question type detection and question enrichment.
dc.description.sponsorship	Utrecht University
dc.language.iso	EN
dc.subject	An analysis of question ambiguity using a structured formal data set and popular Natural Language Processing techniques.
dc.title	Analysis of question type classification and disambiguation
dc.type.content	Master Thesis
dc.rights.accessrights	Open Access
dc.subject.keywords	machine learning, NLP, natural language processing, question classification, question disambiguation, context analysis
dc.subject.courseuu	Business Informatics
dc.thesis.id	24078

Files in this item

Name: MasterThesis_Nijhuis_MB_Public ...
Size: 2.534Mb
Format: PDF


This item appears in the following Collection(s)

Theses
