Learning about Non-Veridicality in Textual Entailment
Neural network-based models are the state of the art for Recognizing Textual Entailment (RTE) and have recently received much attention, yet little research has addressed the question of which linguistic phenomena these models actually learn. This thesis therefore analyzes what a neural RTE model learns about items that block entailment (non-veridical operators) and whether the model can be extended to cover this linguistic phenomenon. To this end, a neural model with Long Short-Term Memory (LSTM) is trained on general natural language inference (NLI) data and tested on data from the domain of annual reports, which are written in a particular register and contain many non-veridical operators. The general-domain Stanford Natural Language Inference (SNLI) data is used for training the model. The LSTM's attention mechanism is analyzed in order to investigate precisely what the model attends to. To see whether the model can be improved, two datasets are added to the training set. Firstly, texts similar to the test domain are used in training, to see whether the model can learn features of the relevant register. Secondly, a dataset containing many non-veridical operators is used to train the model, to test whether the model can learn to deal with items that block entailment. To produce the latter training set, this thesis suggests a method of recasting event factuality corpora, which abound in non-veridical contexts. Training the RTE model on factuality data enables it to perform the task of event factuality. This thesis proposes to address both the task of specified event extraction and the task of event factuality in one step, by testing sentences about events against informative sources for entailment. The events studied in this thesis are achievements of companies with regard to the Sustainable Development Goals (SDGs).
The main contributions of this thesis are insights into the inner workings of a neural RTE model and high performance on the task of finding information about events in text. Firstly, this study shows that a textual entailment model trained on general data does not perform well on annual report data, which contains many instances of non-veridicality, and that the model needs to be adapted. Secondly, I show that the model achieves high performance when the linguistically specialized Veridicality set and the domain-specific Annual Report dataset are combined in training. Specifically, combining these two training sets yields an F1 score of 87.05 in determining entailment between sentences in annual reports and events of accomplished SDGs. The analysis of the model's attention mechanism shows that the model is able to induce the importance of non-veridical operators for textual entailment. Thirdly, it appears that the semi-artificially constructed, recast Veridicality data cannot be successfully combined with the more general SNLI data for training a neural RTE model, because the recast data is too homogeneous.
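The idea of recasting event factuality annotations as entailment pairs can be illustrated with a minimal sketch. It is not the thesis's actual procedure: the `FactualityExample` fields, the factuality label names, and the two-way output labels are all assumptions made for illustration. The sketch pairs each annotated sentence (premise) with a plain statement of the event (hypothesis) and labels the pair as entailment only when the event is annotated as factual.

```python
from dataclasses import dataclass


@dataclass
class FactualityExample:
    # All field names and label values here are hypothetical.
    sentence: str      # source sentence containing the event mention
    event_clause: str  # plain statement of the event, used as the hypothesis
    factuality: str    # assumed labels: "factual", "counterfactual", "uncertain"


def recast(example):
    """Turn one factuality annotation into a (premise, hypothesis, label) triple."""
    # Only events annotated as factual are treated as entailed by the sentence.
    label = "entailment" if example.factuality == "factual" else "non-entailment"
    return (example.sentence, example.event_clause, label)


examples = [
    FactualityExample(
        "The company reduced its emissions.",
        "The company reduced its emissions.",
        "factual",
    ),
    # "aims to" acts as a non-veridical operator: the event is not asserted.
    FactualityExample(
        "The company aims to reduce its emissions.",
        "The company reduced its emissions.",
        "uncertain",
    ),
]

pairs = [recast(e) for e in examples]
for premise, hypothesis, label in pairs:
    print(f"{label}: {premise!r} -> {hypothesis!r}")
```

In this toy setting the second pair is labeled non-entailment because the operator "aims to" blocks the inference that the reduction actually took place.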