        Easy Data Augmentation Techniques for Traditional Machine Learning Models on Text Classification Tasks

        Santing_5727456.pdf (391.3Kb)
        Publication date
        2021
        Author
        Santing, L.B.
        Summary
        The use of data augmentation techniques in NLP to create more robust models has increased in recent years. Wei & Zou (2019) proposed Easy Data Augmentation (EDA), a simple set of techniques for augmenting small text classification datasets, and reported promising results. Most research on data augmentation for NLP has focused on deep learning models rather than traditional machine learning models, even though the latter are still commonly used for text classification. This research tests the effect of EDA on the performance of three traditional machine learning models (logistic regression, naive Bayes and decision tree) on three text classification tasks. Results show that EDA marginally improves performance for these classifiers on both small and large datasets.
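To make the idea concrete, here is a minimal sketch of two of the four EDA operations described by Wei & Zou (2019): random swap and random deletion. The other two, synonym replacement and random insertion, need a synonym resource such as WordNet and are left out. This is not the thesis's implementation; the function names, the `alpha` and `n_aug` parameters, and the alternating schedule in `augment` are assumptions made for illustration.

```python
import random


def random_swap(words, n_swaps, rng):
    """Swap two randomly chosen positions, repeated n_swaps times."""
    words = list(words)
    if len(words) < 2:
        return words
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(words)), 2)
        words[i], words[j] = words[j], words[i]
    return words


def random_deletion(words, p, rng):
    """Drop each word with probability p, never returning an empty list."""
    if len(words) <= 1:
        return list(words)
    kept = [w for w in words if rng.random() >= p]
    # If every word was dropped, fall back to one random word.
    return kept if kept else [rng.choice(words)]


def augment(sentence, n_aug=4, alpha=0.1, seed=0):
    """Return n_aug augmented variants of a sentence (hypothetical helper).

    alpha controls how strongly each sentence is perturbed; alternating
    between the two operations is an illustrative choice only.
    """
    rng = random.Random(seed)
    words = sentence.split()
    n_swaps = max(1, int(alpha * len(words)))
    variants = []
    for k in range(n_aug):
        if k % 2 == 0:
            new_words = random_swap(words, n_swaps, rng)
        else:
            new_words = random_deletion(words, alpha, rng)
        variants.append(" ".join(new_words))
    return variants


if __name__ == "__main__":
    for variant in augment("the movie was surprisingly good and well acted"):
        print(variant)
```

Each variant keeps the label of the original sentence, so the augmented sentences can be appended to the training set before fitting a classifier such as logistic regression, naive Bayes or a decision tree.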
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/40651
        Collections
        • Theses