        Utrecht University Student Theses Repository
        Breaking left-to-right generation in Transformer models: arbitrary orderings on tagging tasks

        Master's Thesis.pdf (1.871Mb)
        Publication date
        2024
        Author
        Bais, Giacomo
        Summary
        Transformers have achieved great success in various natural language processing tasks. Traditionally, generation happens in a left-to-right fashion in such models, mimicking how we process and produce text as humans. However, for tasks in which tokens do not exhibit strong left-to-right sequential dependencies, such as Combinatory Categorial Grammar (CCG) tagging, alternative generation orders may be more effective. This thesis explores a novel approach to break away from the traditional left-to-right ordering in Transformer-based models. We introduce TagInsert, a sequence-to-sequence model designed to perform tagging tasks with an ordering that is learned by the model itself, potentially reducing the error propagation typically associated with auto-regressive models. We evaluate the performance of our architecture across three tasks: Part-of-Speech tagging, CCG non-constructive tagging, and CCG constructive supertagging. Our results show that arbitrary generation order improves performance in Part-of-Speech and CCG non-constructive tagging when compared to a left-to-right model. Notably, for CCG non-constructive tagging, we observe a statistically significant advantage over the standard left-to-right approach, indicating that breaking the traditional ordering may yield better results for tasks with weak left-to-right sequential dependencies. In the case of constructive supertagging, another version of TagInsert adapted for the task is presented. While it reaches results comparable with the existing literature, better solutions to introduce arbitrary ordering in the task are likely needed to more effectively exploit the benefits the architecture brings. We also present extensive qualitative analyses to understand the behaviour of both the non-constructive and constructive models, in order to identify properties for which there is a benefit from breaking the left-to-right ordering.
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/48151
        Collections
        • Theses