dc.rights.license | CC-BY-NC-ND | |
dc.contributor.advisor | Adriaans, F.W. | |
dc.contributor.author | Lambooij, N.L.C. | |
dc.date.accessioned | 2017-09-06T17:02:21Z | |
dc.date.available | 2017-09-06T17:02:21Z | |
dc.date.issued | 2017 | |
dc.identifier.uri | https://studenttheses.uu.nl/handle/20.500.12932/27440 | |
dc.description.abstract | In speech recognition, neural networks are used to recognise the sequence of phonemes in an audio signal. These networks are trained on audio data pre-processed into some type of spectral vector. We present an alternative method that pre-processes speech utterances into visual representations, called spectrograms, and trains a neural network suitable for image recognition to identify phonemes. The resulting network correctly classified 99.73% of a set of vowels containing samples of ‘iy’, ‘ah’ and ‘uw’, 91.87% of a set of vowels containing samples of ‘iy’, ‘ih’ and ‘eh’, and 75.97% of the full dataset of twelve vowels. These results show that using image recognition in automatic speech recognition is worth investigating further. | |
dc.description.sponsorship | Utrecht University | |
dc.format.extent | 893431 | |
dc.format.mimetype | application/pdf | |
dc.language.iso | en | |
dc.title | Applying Image Recognition to Automatic Speech Recognition: Determining Suitability of Spectrograms for Training a Deep Neural Network for Speech Recognition | |
dc.type.content | Bachelor Thesis | |
dc.rights.accessrights | Open Access | |
dc.subject.keywords | Speech Recognition, Neural Network, Spectrogram, Image Recognition | |
dc.subject.courseuu | Kunstmatige Intelligentie | |