3D Human Reconstruction on the Multi-View, In-The-Wild, YOUth data
Summary
The main goal of this thesis is to recover three-dimensional human body models of the parent and
infant depicted in the private YOUth dataset from multiple uncalibrated cameras. Previous research in
this area relies primarily on ground-truth annotations of two-dimensional poses across the multi-view data
or on prior knowledge of the camera parameters. To avoid these requirements, we develop a mechanism that bridges
two-dimensional pose estimation methods with camera calibration and three-dimensional human reconstruction
models. To achieve this goal reliably, we study the mechanisms of top-down and bottom-up two-dimensional pose estimation methods, as well as one-stage and two-stage three-dimensional human reconstruction
strategies. To link the data between these different models, we develop a pipeline that identifies the same
individual across sequential frames and different points of view, while accommodating missing or redundant information. We quantify the quality of the reconstruction based on the estimated two-dimensional
pose data. The qualitative results show the implications of challenges, such as occlusions and
two-dimensional pose detection ambiguities, which cannot be accounted for in the absence of ground-truth pose
annotations or ground-truth camera parameters.