dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Deoskar, T.
dc.contributor.author: Kamp, J.B.
dc.date.accessioned: 2019-09-04T17:00:53Z
dc.date.available: 2019-09-04T17:00:53Z
dc.date.issued: 2019
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/34028
dc.description.abstract: Automatic verb classification has been widely explored with both supervised and unsupervised machine learning techniques, based on syntactic and semantic features and closely tied to argument structure theory and Levin’s (1993) verb classes. In the present study we go a step further than previous research in this field (e.g. Lapata and Brew, 2004; Merlo and Stevenson, 2001; Sun and Korhonen, 2009) by using automatically induced verb classes not as a goal, but as a starting point for a lexicon induction experiment for individual verbs. Inspired by Rooth, Riezler, Prescher, Carroll, and Beil (1999), a first experiment clusters verbs represented by co-occurrence vectors of argument nouns extracted from the subcategorization frames of transitive and intransitive verbs. Building on the resulting model, a second experiment shows that lexicons of argument nouns for fixed verbs can be created by re-estimating the nouns’ absolute frequencies with respect to the same verb, modified by cluster-related probabilities from the model. Beyond the relative simplicity of these statistical inference steps, the relevance of this study lies in the detailed, combined evaluation system used for model selection, which includes a pseudo-disambiguation task, in-depth cluster metrics, and a Variational Bayes Gaussian Mixture. It was found that argument selectional preference is a good indicator of verb classes, especially for the data set that included verbs of the alternation in which the object of the transitive is the subject of the intransitive. Moreover, a quantitative, WordNet-based method showed that such classes correspond only weakly to Levin’s classes. Future research could explore adjunct slots and extend the evaluation architecture to other clustering tasks within NLP.
dc.description.sponsorship: Utrecht University
dc.format.extent: 674557
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.title: Statistical Modeling at the Syntax-Semantics Interface: Exploiting Automatically Induced Lexical Classes Evaluated through Variational Bayesian Inference
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: verb classes, syntax-semantics interface, distributional semantics, unsupervised machine learning, clustering, variational bayes, NLP, computational linguistics, evaluation
dc.subject.courseuu: Linguistics

