dc.description.abstract | In recent years, Human Pose Estimation (HPE) algorithms have become increasingly accurate at localizing human joints in images. Besides benefiting from rapid innovation in deep learning, these models rely on large-scale, manually labeled HPE datasets. These datasets, however, consist mostly of annotations of adults and underrepresent children. Because body structure changes considerably during puberty, prepubescent children differ anatomically from adults in several distinct ways. This gives reason to expect a drop in performance when state-of-the-art HPE models are tested on children.
We experimentally demonstrate that modern pose estimators indeed struggle more with estimating child poses than adult poses. We furthermore fine-tune a benchmark HPE model on child data to determine whether this performance difference stems from data limitations or from model limitations. This is done using a newly collected child-specific dataset that we dub Kinetikids-pose. This experiment, however, did not yield a conclusive result.
Kinetikids-pose is compiled from photos and video frames of children performing sporting activities. To our knowledge, it is the first publicly available monocular child HPE dataset. We also present and share two filtered subsets of the COCO validation split: COCO Adult and COCO Child. As the names suggest, these subsets are filtered to contain either solely adults or solely children. | |