        •   Utrecht University Student Theses Repository Home
        • UU Theses Repository
        • Theses
        • View Item
        On the effects of using speech transcripts and subtitles to detect topic shifts in news broadcasts

        Thesis_Sahra_Mohamed_6254357.pdf (528.8Kb)
        Publication date
        2020
        Author
        Mohamed, S.
        Summary
        In this research, topic segmentation of text (also known as text segmentation) is used as a proxy for topic segmentation of videos. The main application is to automatically provide a topic transition structure for videos, which are difficult to scan quickly to find where a new subject starts. Topic models are used to locate the topic transition positions. The data for this research was provided by the Netherlands Institute for Sound and Vision and consists of 25,600 transcripts and subtitles of the same Dutch news broadcasts. The research asks whether automatic speech recognition transcripts or subtitles are the better choice when segmenting a video by topic. The subtitles and speech transcripts were compared for the same news broadcasts, and both qualitative and quantitative differences between them were found. However, no significant difference was found in the performance of the text segmentation algorithm between subtitles and speech transcripts. The research presents the challenges and benefits of the developed text segmentation algorithm and gives insight into the feasibility of applying text segmentation to help structure videos, which can serve as a starting point for future research.
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/37415
        Collections
        • Theses