        •   Utrecht University Student Theses Repository Home


        "What Has Been Said Cannot Be Taken Back": A Toxic Speech Detection Framework for TikTok using Whisper and Perspective API

        MSc_Applied_Data_Science_Thesis_2023.pdf (1.486Mb)
        Publication date
        2023
        Author
        Prins, Jelle
        Summary
        Toxic speech is a common problem on social media platforms such as TikTok. Focusing on videos about the 2020 US presidential election, this study proposes a framework for detecting toxic speech in TikTok videos. The framework combines a speech-to-text algorithm with a toxicity detection API to transcribe and analyze the spoken content of the videos. The findings show that TikTok videos contain varying amounts of toxic speech, with the majority of texts scoring low for toxicity. Semantic feature extraction with BERTopic identifies dominant topics such as Joe Biden's actions and discussions of race and politics. Sentiment analysis shows different emotional tones across topics. An examination of the relationship between toxicity and sentiment suggests that some sentiments may correlate with higher levels of toxicity. These findings provide insight into the characteristics of toxic speech in TikTok videos and contribute to the development of content moderation strategies and the promotion of healthier online communities. Future research should address the study's limitations and further explore toxic speech on video-based social media platforms.
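        The two-stage framework described in the summary (transcribe the audio, then score the text for toxicity) can be sketched roughly as follows. This is a minimal illustration, not the thesis's implementation: the title names Whisper and Perspective API, but the function names, the `base` model size, the English-only language setting, and the choice of the single `TOXICITY` attribute are all assumptions made here. The pipeline takes the transcriber and scorer as parameters so it can be tried with stand-ins before any model or API key is available.

```python
import json
import urllib.request
from typing import Callable

# Perspective API endpoint for comment analysis.
PERSPECTIVE_URL = "https://commentanalyzer.googleapis.com/v1alpha1/comments:analyze"


def build_request(text: str) -> dict:
    """Build a Perspective API request body asking only for TOXICITY.

    Assumption: English-language text and a single attribute.
    """
    return {
        "comment": {"text": text},
        "languages": ["en"],
        "requestedAttributes": {"TOXICITY": {}},
    }


def parse_toxicity(response: dict) -> float:
    """Extract the summary TOXICITY score (0.0 to 1.0) from a response body."""
    return response["attributeScores"]["TOXICITY"]["summaryScore"]["value"]


def transcribe_with_whisper(audio_path: str, model_name: str = "base") -> str:
    """Transcribe one audio or video file with the openai-whisper package.

    The import is deferred so the rest of the sketch runs without whisper
    installed. The model is reloaded on every call for simplicity; a real
    pipeline would load it once and reuse it.
    """
    import whisper

    model = whisper.load_model(model_name)
    return model.transcribe(audio_path)["text"].strip()


def score_with_perspective(text: str, api_key: str) -> float:
    """POST the text to Perspective API and return its TOXICITY score."""
    request = urllib.request.Request(
        f"{PERSPECTIVE_URL}?key={api_key}",
        data=json.dumps(build_request(text)).encode("utf-8"),
        headers={"Content-Type": "application/json"},
    )
    with urllib.request.urlopen(request) as resp:
        return parse_toxicity(json.load(resp))


def analyze_videos(
    paths: list[str],
    transcribe: Callable[[str], str],
    score: Callable[[str], float],
) -> list[dict]:
    """Run transcription then toxicity scoring over each file path."""
    results = []
    for path in paths:
        text = transcribe(path)
        results.append({"path": path, "text": text, "toxicity": score(text)})
    return results
```

        With real credentials, the scorer could be supplied as `lambda t: score_with_perspective(t, api_key)` and the transcriber as `transcribe_with_whisper`; both names are hypothetical helpers defined in this sketch only.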
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/44949
        Collections
        • Theses

        Related items

        Showing items related by title, author, creator and subject.

        • Speech recognition at higher-than-normal speech and noise levels 

          Gelder, M.E. van (2010)
          Previous research has demonstrated reduced speech recognition of normal hearing listeners when speech is presented at higher-than-normal levels (e.g., above conversational speech levels), particularly in the presence of ...
        • Applying Image Recognition to Automatic Speech Recognition: Determining Suitability of Spectrograms for Training a Deep Neural Network for Speech Recognition 

          Lambooij, N.L.C. (2017)
          In speech recognition, Neural Networks are used to recognise the sequence of phonemes in an audio signal. These networks are trained on audio data pre-processed into some (type of) spectral vector. We present an alternative ...
        • Speech Anomalies as a Symptom of Formal Thought Disorder in Schizophrenia: The Sensitivity of The Thought and Language Dysfunction Scale on Speech Related Items 

          Nieuwenhuizen, M. (2020)
          Formal thought disorder (FTD) is a core symptom of schizophrenia and has been described as a set of language, thinking and communication deficits. The diagnosis of FTD takes place using a clinical rating scale that often ...