Investigating Syntactic Enhancements in LLMs with Graph Convolutional Networks for Natural Language Inference

Lin, Luca

dc.rights.license	CC-BY-NC-ND
dc.contributor.advisor	Abzianidze, Lasha
dc.contributor.author	Lin, Luca
dc.date.accessioned	2023-08-08T00:01:45Z
dc.date.available	2023-08-08T00:01:45Z
dc.date.issued	2023
dc.identifier.uri	https://studenttheses.uu.nl/handle/20.500.12932/44529
dc.description.abstract	This Master's thesis explores incorporating syntax trees into pre-trained Large Language Models (LLMs) for the task of Natural Language Inference (NLI). NLI is an important task for evaluating a language model's ability to predict the entailment relationship between two sentences, and thus its capacity for Natural Language Understanding (NLU). This study focuses predominantly on the BERT-base-uncased model, assessing how enhancing it with an inductive bias toward linguistically derived syntactic trees, using Graph Convolutional Networks, affects performance on various NLI benchmark datasets and out-of-domain evaluation sets. While earlier research has examined the impact of enhancing LLMs with dependency structures, the effects of incorporating constituency structures, and of combining both kinds of parse, remain largely unexplored. Experimental results reveal that although enhancing BERT with syntactic structures does not notably benefit performance on generic large-scale NLI datasets, it significantly aids models in scenarios where the underlying syntactic structure is important for the inference task, such as in semi-automatically generated datasets. This is particularly evident when training data is scarce, a common challenge in many real-world applications. Results further show that, of the two investigated syntactic structures, constituency structures provide the greater benefit in learning representations for monotonicity reasoning, an important skill that requires the ability to capture interactions between lexical and syntactic structures. Furthermore, we demonstrate that constituency parsing can help the BERT model learn useful representations for the syntactic structure of passive sentences, an area identified in previous research as a shortcoming of BERT.
dc.description.sponsorship	Utrecht University
dc.language.iso	EN
dc.subject	Enhancing the BERT-base model with linguistically derived syntactic structures with Graph Convolutional Networks for the task of Natural Language Inference
dc.title	Investigating Syntactic Enhancements in LLMs with Graph Convolutional Networks for Natural Language Inference
dc.type.content	Master Thesis
dc.rights.accessrights	Open Access
dc.subject.keywords	LLM; Transformer; Graph Neural Network; BERT; Syntax
dc.subject.courseuu	Artificial Intelligence
dc.thesis.id	21254

Files in this item

Name: Master_Thesis_LucaLin_Final.pdf
Size: 844.3Kb
Format: PDF
