dc.rights.license | CC-BY-NC-ND | |
dc.contributor.advisor | Winter, Y. | |
dc.contributor.advisor | Moortgat, M.J. | |
dc.contributor.author | Strien, G.C.F. van | |
dc.date.accessioned | 2009-09-17T17:03:02Z | |
dc.date.available | 2009-09-17 | |
dc.date.available | 2009-09-17T17:03:02Z | |
dc.date.issued | 2009 | |
dc.identifier.uri | https://studenttheses.uu.nl/handle/20.500.12932/3419 | |
dc.description.abstract | Recognizing textual entailment (TE) is the task of deciding whether a sentence or
a text implies another, e.g. the sentence ‘Ostriches put their heads into the sand to
avoid the wind’ entails ‘Ostriches bury their heads in the sand’. While a trivial
task for humans in many ordinary situations, the problem of recognizing TE has
proven extremely difficult for machine learning algorithms. Participants in the
third PASCAL RTE workshop reported an accuracy of at most 80% (Giampiccolo
et al., 2007). Current approaches to the recognition of TE often use word alignment
functions, exploiting syntactic and structural properties of the text. In order to
find out whether semantic techniques are also valuable, we analyzed existing
training sets from the PASCAL Challenges. The work proceeded in three stages.
In the first stage, we analyzed in detail the semantic properties that are
essential for determining the validity of an entailment. In the second stage, we
annotated the most common relevant properties. In the third stage, we revised
the annotation scheme and reapplied it to the datasets of RTE1, RTE2 and RTE3.
In order to simplify the process of annotation,
we developed an XML annotation scheme and built a tool for carrying out the
actual annotation task.
We found that the defined annotation scheme is widely applicable, as 64.4% of
all valid entailment pairs that we reviewed could be annotated. Further work
includes the testing of machine learning algorithms on the annotated dataset. | |
dc.description.sponsorship | Utrecht University | |
dc.format.extent | 695288 bytes | |
dc.format.mimetype | application/pdf | |
dc.language.iso | en | |
dc.title | Semantic Annotations for Automatic Recognition of Textual Entailments | |
dc.type.content | Master Thesis | |
dc.rights.accessrights | Open Access | |
dc.subject.keywords | RTE, semantic annotation, entailment, textual entailment | |
dc.subject.courseuu | Taal- en spraaktechnologie | |