Show simple item record

dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Deoskar, Tejaswini
dc.contributor.author: Bais, Giacomo
dc.date.accessioned: 2024-11-15T01:03:27Z
dc.date.available: 2024-11-15T01:03:27Z
dc.date.issued: 2024
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/48151
dc.description.abstract: Transformers have achieved great success in various natural language processing tasks. Traditionally, generation in such models happens in a left-to-right fashion, mimicking how humans process and produce text. However, for tasks in which tokens do not exhibit strong left-to-right sequential dependencies, such as Combinatory Categorial Grammar (CCG) tagging, alternative generation orders may be more effective. This thesis explores a novel approach to break away from the traditional left-to-right ordering in Transformer-based models. We introduce TagInsert, a sequence-to-sequence model designed to perform tagging tasks with an ordering that is learned by the model itself, potentially reducing the error propagation typically associated with auto-regressive models. We evaluate the performance of our architecture across three tasks: Part-of-Speech tagging, CCG non-constructive tagging, and CCG constructive supertagging. Our results show that an arbitrary generation order improves performance in Part-of-Speech and CCG non-constructive tagging when compared to a left-to-right model. Notably, for CCG non-constructive tagging, we observe a statistically significant advantage over the standard left-to-right approach, indicating that breaking the traditional ordering may yield better results for tasks with weak left-to-right sequential dependencies. For constructive supertagging, we present another version of TagInsert adapted to the task. While it reaches results comparable to the existing literature, better ways of introducing arbitrary ordering into the task are likely needed to exploit the benefits of the architecture more effectively. We also present extensive qualitative analyses of the behaviour of both the non-constructive and constructive models, in order to identify the properties for which breaking the left-to-right ordering is beneficial.
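The arbitrary-order decoding the abstract describes can be illustrated with a minimal, hypothetical sketch (this is not the thesis implementation; the function name, tag set, and confidence values are invented for illustration): at each step, a confidence-ordered decoder fills the yet-untagged position whose best tag it is most sure about, instead of proceeding strictly left to right.

```python
# Toy sketch of arbitrary-order tag insertion (hypothetical, not TagInsert itself).
# At each step, fill the untagged position with the most confident best tag,
# rather than always tagging the leftmost remaining token.

def insertion_order_decode(scores):
    """scores: per-position dicts mapping tag -> confidence.
    Returns (tags, order): the predicted tag per position and the fill order."""
    n = len(scores)
    tags = [None] * n
    order = []
    while len(order) < n:
        # Among unfilled positions, pick the one whose best tag has top confidence.
        best_pos = max(
            (i for i in range(n) if tags[i] is None),
            key=lambda i: max(scores[i].values()),
        )
        best_tag = max(scores[best_pos], key=scores[best_pos].get)
        tags[best_pos] = best_tag
        order.append(best_pos)
    return tags, order

# Invented confidences for the sentence "the dog barks":
scores = [
    {"DT": 0.99, "NN": 0.01},  # "the"   -> almost certainly a determiner
    {"NN": 0.70, "VB": 0.30},  # "dog"   -> the least certain position
    {"VB": 0.95, "NN": 0.05},  # "barks" -> almost certainly a verb
]
tags, order = insertion_order_decode(scores)
print(tags)   # ['DT', 'NN', 'VB']
print(order)  # [0, 2, 1] -- easiest positions first, not left to right
```

In TagInsert, by contrast, the ordering is learned by the model rather than fixed by a greedy confidence heuristic; the sketch only shows why a non-left-to-right fill order can defer the hardest decisions until more context is tagged.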
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: In the thesis, we implement a Transformer-based architecture that breaks the common left-to-right order of generation and is trained on several tagging tasks.
dc.title: Breaking left-to-right generation in Transformer models: arbitrary orderings on tagging tasks
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.courseuu: Artificial Intelligence
dc.thesis.id: 41042

