        Utrecht University Student Theses Repository

        Applying Image Recognition to Automatic Speech Recognition: Determining Suitability of Spectrograms for Training a Deep Neural Network for Speech Recognition

        Bachelor Scriptie - Applying Image Recognition to Automatic Speech Recognition.pdf (872.4Kb)
        Publication date
        2017
        Author
        Lambooij, N.L.C.
        Summary
        In speech recognition, neural networks are used to recognise the sequence of phonemes in an audio signal. These networks are trained on audio data pre-processed into some type of spectral vector. We present an alternative method that pre-processes speech utterances into visual representations, called spectrograms, and trains a neural network suitable for image recognition to identify phonemes. The resulting network correctly classified 99.73% of a set of vowels containing samples of ‘iy’, ‘ah’ and ‘uw’, 91.87% of a set of vowels containing samples of ‘iy’, ‘ih’ and ‘eh’, and 75.97% of the full dataset of twelve vowels. These results show that using image recognition in automatic speech recognition is worth investigating further.
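        The pre-processing step described in the summary, turning an utterance into a spectrogram image, can be sketched roughly as follows. This is only an illustration under stated assumptions: the function names `spectrogram` and `to_image`, the 25 ms Hann window, the 10 ms hop and the 16 kHz synthetic test tone are hypothetical choices made for this example and are not taken from the thesis, which may use different settings and tooling.

```python
import numpy as np


def spectrogram(signal, sample_rate, frame_len=0.025, hop_len=0.010):
    """Log-magnitude spectrogram, shape (frequency bins, frames).

    frame_len and hop_len are in seconds; both defaults are assumptions
    for this sketch, not values from the thesis.
    """
    n_frame = int(round(frame_len * sample_rate))
    n_hop = int(round(hop_len * sample_rate))
    window = np.hanning(n_frame)

    # Slice the signal into overlapping, windowed frames.
    n_frames = 1 + (len(signal) - n_frame) // n_hop
    frames = np.stack([
        signal[i * n_hop: i * n_hop + n_frame] * window
        for i in range(n_frames)
    ])

    # Magnitude spectrum of each frame, converted to decibels.
    magnitude = np.abs(np.fft.rfft(frames, axis=1))
    log_spec = 20.0 * np.log10(magnitude + 1e-10)  # small floor avoids log(0)
    return log_spec.T


def to_image(spec):
    """Rescale a spectrogram to 8-bit greyscale pixel values."""
    lo, hi = spec.min(), spec.max()
    scaled = (spec - lo) / (hi - lo) if hi > lo else np.zeros_like(spec)
    return (255 * scaled).astype(np.uint8)


# Synthetic stand-in for a speech sample: half a second of a 440 Hz tone.
sample_rate = 16000
t = np.arange(sample_rate // 2) / sample_rate
tone = np.sin(2 * np.pi * 440.0 * t)

image = to_image(spectrogram(tone, sample_rate))
```

        The resulting 2-D array can be saved or fed directly to an image classifier; each column is one time frame and each row one frequency bin, so a steady tone shows up as a single bright horizontal band.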
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/27440
        Collections
        • Theses