dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Kaya, Heysem
dc.contributor.author: Dijkstra, Riemer
dc.date.accessioned: 2025-08-21T00:06:30Z
dc.date.available: 2025-08-21T00:06:30Z
dc.date.issued: 2025
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/49900
dc.description.abstract: Infant hand gestures such as pointing, showing, and giving are known predictors of early language development. However, automated recognition of infant gestures remains an underexplored task, particularly due to the scarcity of annotated infant datasets and the unique variability of infant motor behavior. This thesis investigates the feasibility of Deep Learning (DL) models for infant hand gesture recognition using video data from the YOUth Cohort Study. Several model architectures were evaluated, including a two-stream CNN coupled with an SVM classifier, 3D CNNs, and transformer-based approaches; these were either trained from scratch or pretrained on general and gesture-specific datasets. Performance was assessed for both binary (gesture vs. no gesture) and multiclass (7 classes) classification. In both settings, the best performance was achieved by end-to-end training on temporal features, with macro-average F1 scores of 73.06% and 40.98%, respectively. Furthermore, the study explored the relationship between gesture frequencies and Peabody Picture Vocabulary Test (PPVT) scores using linear regression and Random Forests (RF), with child-related metadata, such as maternal education level, incorporated as additional predictor variables. On our limited dataset, no predictor of language development was found other than the child's age at the time of PPVT testing. While the classification results appear promising, research on automated hand gesture recognition for early language development is still in its infancy.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: Using Deep Learning techniques, we evaluate the feasibility of automatic infant hand gesture recognition in video clips. We then use the gesture classifications and annotations to predict PPVT-III-NL scores, which capture language development.
dc.title: Towards Automated Infant Hand Gesture Recognition for Predicting Language Development
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Deep Learning; Computer Vision; Hand Gesture Recognition; Language Development
dc.subject.courseuu: Artificial Intelligence
dc.thesis.id: 51994
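
The abstract above reports macro-average F1 scores for both the binary and the 7-class gesture task. Below is a minimal sketch of how that metric is computed, assuming scikit-learn; the label vectors are hypothetical stand-ins for illustration, not thesis data or thesis code.

    # Illustrative sketch only -- hypothetical labels, not the thesis code.
    # Macro averaging computes F1 per class and takes the unweighted mean,
    # so rare gesture classes count as much as frequent ones.
    from sklearn.metrics import f1_score

    # Hypothetical predictions for the binary task (gesture vs. no gesture).
    y_true_bin = [0, 1, 1, 0, 1, 0, 1, 1]
    y_pred_bin = [0, 1, 0, 0, 1, 0, 1, 1]
    print(f1_score(y_true_bin, y_pred_bin, average="macro"))

    # Hypothetical predictions for the 7-class task (classes 0..6).
    y_true_multi = [0, 1, 2, 3, 4, 5, 6, 2, 1, 5]
    y_pred_multi = [0, 1, 2, 3, 4, 6, 6, 2, 0, 5]
    print(f1_score(y_true_multi, y_pred_multi, average="macro"))

The regression analysis described in the abstract (linear regression and Random Forests on gesture frequencies plus child metadata) could be sketched analogously with scikit-learn's LinearRegression and RandomForestRegressor.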

