dc.description.abstract | This Master's thesis explores incorporating syntax trees into pre-trained Large Language Models (LLMs) for the task of Natural Language Inference (NLI). NLI is an important task for evaluating a language model's ability to predict the entailment relationship between two sentences, and thus its capacity for Natural Language Understanding (NLU). The study focuses predominantly on the BERT-base-uncased model, assessing how enhancing it with an inductive bias toward linguistically derived syntactic trees, using Graph Convolutional Networks, affects performance on various NLI benchmark datasets and out-of-domain evaluation sets. While earlier research has examined the impact of enhancing LLMs with dependency structures, the effects of incorporating constituency structures and of combining both parsing techniques remain largely unexplored. Experimental results reveal that while enhancing BERT with syntactic structures does not notably improve performance on generic large-scale NLI datasets, it significantly aids models in scenarios where the underlying syntactic structure is important for the inference task, such as in semi-automatically generated datasets. This is particularly evident when training data is scarce, a common challenge in many real-world applications. Results further show that, of the two investigated syntactic structures, constituency structures provide the greater benefit in learning representations for monotonicity reasoning, an important skill that requires the ability to capture interactions between lexical and syntactic structures. Furthermore, we demonstrate that constituency parsing can help the BERT model learn useful representations for the syntactic structure of passive sentences, an area identified in previous research as a shortcoming of BERT. | |