dc.rights.license | CC-BY-NC-ND | |
dc.contributor | C. van Toledo | |
dc.contributor.advisor | Brinkhuis, Matthieu | |
dc.contributor.author | Nijhuis, Antoine | |
dc.date.accessioned | 2023-09-07T23:00:53Z | |
dc.date.available | 2023-09-07T23:00:53Z | |
dc.date.issued | 2023 | |
dc.identifier.uri | https://studenttheses.uu.nl/handle/20.500.12932/45118 | |
dc.description.abstract | This thesis presents research on question context analysis, utilizing Natural Language
Processing (NLP) techniques. In the past, Question & Answer interactions were handled
manually. Recent advancements in NLP and Machine Learning (ML) have created the opportunity to implement a plethora of Question & Answering Systems (QAS). These systems
often utilize a classifier to classify the questions into types before answering them. The
two most prominent classification methods historically are Rule-based models and Machine
Learning models. Rule-based models utilize grammar rules and a dictionary to classify questions into categories and map these to the correct answer. Machine Learning models, such
as neural networks, utilize mathematical equations to learn from a Question & Answer data
set, with the intention of minimizing classification errors when matching question and answer pairs. This research aims to discover how popular classification techniques perform in
a restricted-domain environment, with regard to question type recognition and question
enrichment. These two tasks are performed on a large structured Dutch data set and a
public English data set. To assess how the classifiers score on these two tasks, two rigorous metrics are applied to measure classification power: F1 score and area under the ROC
curve (AUC). Results suggest that question ambiguity can be recognized with an F1 score
upwards of 90%. ML techniques featuring deep learning perform best across both question
type detection and question enrichment. | |
dc.description.sponsorship | Utrecht University | |
dc.language.iso | EN | |
dc.subject | An analysis of question ambiguity using a structured formal data set, utilizing popular Natural Language Processing techniques. | |
dc.title | Analysis of question type classification and disambiguation | |
dc.type.content | Master Thesis | |
dc.rights.accessrights | Open Access | |
dc.subject.keywords | machine learning, NLP, natural language processing, question classification, question disambiguation, context analysis | |
dc.subject.courseuu | Business Informatics | |
dc.thesis.id | 24078 | |