Prosodic modification of infant-directed speech for improved automatic speech recognition
Summary
Automatic speech recognition (ASR) currently performs poorly on infant-directed speech (IDS). If its performance could be raised to a reliable level, researchers could use automatic speech recognisers for IDS research, which would drastically reduce the time that such research requires. This thesis investigates whether a front-end approach can improve recognition. IDS fragments were first prosodically modified for average pitch and speaking rate and were subsequently fed to an ASR system. The resulting automatic annotations were then evaluated on precision, recall and F-score. No improvement in the ASR system’s performance was found after prosodic modification. Nevertheless, the study provides some useful insights for future research on ASR for IDS.
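As an illustration of the evaluation step, the sketch below computes precision, recall and F-score for one automatic annotation against a manual reference. It is a minimal example under stated assumptions, not the scoring procedure used in this thesis: the function name `precision_recall_f`, the order-insensitive word matching and the choice of the balanced F-score (F1) are all assumptions made here for illustration.

```python
from collections import Counter


def precision_recall_f(reference, hypothesis):
    """Score a hypothesised word list against a reference word list.

    Assumption: a hypothesised word counts as correct if it also occurs in
    the reference, with multiplicity and regardless of position.
    """
    ref_counts = Counter(reference)
    hyp_counts = Counter(hypothesis)
    # Multiset intersection: words present in both lists.
    true_positives = sum((ref_counts & hyp_counts).values())

    precision = true_positives / len(hypothesis) if hypothesis else 0.0
    recall = true_positives / len(reference) if reference else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    # Balanced F-score: harmonic mean of precision and recall.
    f_score = 2 * precision * recall / (precision + recall)
    return precision, recall, f_score


# Hypothetical example: manual annotation versus recogniser output.
manual = ["look", "at", "the", "ball"]
automatic = ["look", "the", "wall"]
precision, recall, f_score = precision_recall_f(manual, automatic)
```

In this hypothetical example two of the three recognised words occur in the manual annotation, so precision is 2/3, recall is 2/4 and the F-score is 4/7. Running the same scoring on the original and on the prosodically modified fragments would allow the two conditions to be compared directly.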