Show simple item record

dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Deoskar, Tejaswini
dc.contributor.author: Bais, Giacomo
dc.date.accessioned: 2024-11-15T01:03:27Z
dc.date.available: 2024-11-15T01:03:27Z
dc.date.issued: 2024
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/48151
dc.description.abstract: Transformers have achieved great success in various natural language processing tasks. Traditionally, generation in such models happens in a left-to-right fashion, mimicking how humans process and produce text. However, for tasks in which tokens do not exhibit strong left-to-right sequential dependencies, such as Combinatory Categorial Grammar (CCG) tagging, alternative generation orders may be more effective. This thesis explores a novel approach to break away from the traditional left-to-right ordering in Transformer-based models. We introduce TagInsert, a sequence-to-sequence model designed to perform tagging tasks with an ordering that is learned by the model itself, potentially reducing the error propagation typically associated with auto-regressive models. We evaluate the performance of our architecture across three tasks: Part-of-Speech tagging, CCG non-constructive tagging, and CCG constructive supertagging. Our results show that an arbitrary generation order improves performance in Part-of-Speech and CCG non-constructive tagging when compared to a left-to-right model. Notably, for CCG non-constructive tagging, we observe a statistically significant advantage over the standard left-to-right approach, indicating that breaking the traditional ordering may yield better results for tasks with weak left-to-right sequential dependencies. For constructive supertagging, we present another version of TagInsert adapted to the task. While it reaches results comparable to the existing literature, better ways of introducing arbitrary ordering into the task are likely needed to exploit the benefits of the architecture more effectively. We also present extensive qualitative analyses of the behaviour of both the non-constructive and constructive models, in order to identify the properties for which breaking the left-to-right ordering is beneficial.
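The arbitrary-order decoding the abstract describes can be illustrated with a minimal, hypothetical sketch (this is not the thesis implementation; the function name, tag set, and confidence values are invented for illustration): at each step, a confidence-ordered decoder fills the yet-untagged position whose best tag it is most sure about, instead of proceeding strictly left to right.

```python
# Toy sketch of arbitrary-order tag insertion (hypothetical, not TagInsert itself).
# At each step, fill the untagged position with the most confident best tag,
# rather than always tagging the leftmost remaining token.

def insertion_order_decode(scores):
    """scores: per-position dicts mapping tag -> confidence.
    Returns (tags, order): the predicted tag per position and the fill order."""
    n = len(scores)
    tags = [None] * n
    order = []
    while len(order) < n:
        # Among unfilled positions, pick the one whose best tag has top confidence.
        best_pos = max(
            (i for i in range(n) if tags[i] is None),
            key=lambda i: max(scores[i].values()),
        )
        best_tag = max(scores[best_pos], key=scores[best_pos].get)
        tags[best_pos] = best_tag
        order.append(best_pos)
    return tags, order

# Invented confidences for the sentence "the dog barks":
scores = [
    {"DT": 0.99, "NN": 0.01},  # "the"   -> almost certainly a determiner
    {"NN": 0.70, "VB": 0.30},  # "dog"   -> the least certain position
    {"VB": 0.95, "NN": 0.05},  # "barks" -> almost certainly a verb
]
tags, order = insertion_order_decode(scores)
print(tags)   # ['DT', 'NN', 'VB']
print(order)  # [0, 2, 1] -- easiest positions first, not left to right
```

In TagInsert, by contrast, the ordering is learned by the model rather than fixed by a greedy confidence heuristic; the sketch only shows why a non-left-to-right fill order can defer the hardest decisions until more context is tagged.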
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: In the thesis, we implement a Transformer-based architecture that breaks the common left-to-right order of generation and is trained on several tagging tasks.
dc.title: Breaking left-to-right generation in Transformer models: arbitrary orderings on tagging tasks
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.courseuu: Artificial Intelligence
dc.thesis.id: 41042

