dc.rights.license | CC-BY-NC-ND | |
dc.contributor.advisor | Önal Ertugrul, I. | |
dc.contributor.author | Tang, Huaixi | |
dc.date.accessioned | 2023-02-17T01:00:50Z | |
dc.date.available | 2023-02-17T01:00:50Z | |
dc.date.issued | 2023 | |
dc.identifier.uri | https://studenttheses.uu.nl/handle/20.500.12932/43547 | |
dc.description.abstract | Kinship verification is the task of determining whether two given people have a kin
relationship based on their facial images, videos, or other biological features. As a soft
biometric modality, visual kinship verification has high availability and extremely low
cost compared to DNA-based methods. Analyzing kinship from visual information is very
challenging, mainly because kin relationships exhibit large intra-class differences
and small inter-class differences due to factors such as gender and age. This requires
extracting more discriminative features. Video data offers an additional dimension. Previous
studies have shown that people with kinship not only have similar appearances but also
have similar expression patterns, which suggests that we can extract dynamic features of
facial videos for kinship verification. Traditional methods rely on handcrafted features to capture
facial dynamics, and more recent research has begun to use neural networks. Our research
focuses on smiling expressions and aims to extract spatio-temporal features from facial videos
using a state-of-the-art video vision transformer. We created a Siamese network based on a
video vision transformer and trained it on a face video dataset. We experimentally compared
the impact of using dynamic features versus purely texture features on kinship verification.
We then compared the capabilities of CNNs and ViTs in extracting facial dynamic features.
We evaluated the model's performance under different initialization and training methods.
Drawing on recent research, we developed a pre-training method based
on matched expression sequences to address the challenge posed by the small size of the
dataset. Our model is trained on smiling videos from the UvA-NEMO dataset, and we
present results and analysis. | |
dc.description.sponsorship | Utrecht University | |
dc.language.iso | EN | |
dc.subject | Artificial Intelligence, Computer Vision, Kinship Verification | |
dc.title | KINSHIP VERIFICATION USING VISION TRANSFORMERS | |
dc.type.content | Master Thesis | |
dc.rights.accessrights | Open Access | |
dc.subject.keywords | Artificial Intelligence, Computer Vision, Kinship Verification | |
dc.subject.courseuu | Artificial Intelligence | |
dc.thesis.id | 13980 | |