dc.description.abstract | Infant hand gestures such as pointing, showing, and giving are known predictors of early language development. However, automated recognition of infant gestures remains an underexplored task, largely due to the scarcity of annotated infant datasets and the distinctive variability of infant motor behavior. This thesis investigates the feasibility of Deep Learning (DL) models for infant hand gesture recognition using video data from the YOUth Cohort Study. Several model architectures were evaluated, including a two-stream CNN coupled with an SVM classifier, 3D CNNs, and transformer-based approaches, each either trained from scratch or pretrained on general and gesture-specific datasets. Performance was assessed for both binary (gesture vs. no gesture) and multiclass (seven classes) classification. In both cases, the best performance was achieved by end-to-end training on temporal features, with macro-average F1 scores of 73.06% and 40.98%, respectively. Furthermore, the study explored the relationship between gesture frequencies and Peabody Picture Vocabulary Test (PPVT) scores using linear regression and Random Forests (RF). Child-related metadata, such as maternal education level, were also incorporated as predictor variables. On our limited dataset, no predictor of language development was found other than the child's age at the time of PPVT testing. While the classification results appear promising, research on automated hand gesture recognition for early language development is still in its infancy. | |