KINSHIP VERIFICATION USING VISION TRANSFORMERS

Tang, Huaixi

dc.rights.license	CC-BY-NC-ND
dc.contributor.advisor	Önal Ertugrul, I.
dc.contributor.author	Tang, Huaixi
dc.date.accessioned	2023-02-17T01:00:50Z
dc.date.available	2023-02-17T01:00:50Z
dc.date.issued	2023
dc.identifier.uri	https://studenttheses.uu.nl/handle/20.500.12932/43547
dc.description.abstract	Kinship verification is the task of determining whether two given people are related by kinship, based on their facial images, videos, or other biological features. As a soft biometric modality, visual kinship verification is highly available and extremely low-cost compared with DNA-based methods. Analyzing kinship from visual information is challenging, mainly because kin relationships exhibit large intra-class differences and small inter-class differences due to factors such as gender and age. This calls for more discriminative features, and video data adds a new dimension. Previous studies have shown that kin not only look alike but also share similar expression patterns, which suggests that dynamic features extracted from facial videos can be used for kinship verification. Traditional methods rely on handcrafted features to capture these dynamics, while more recent research has begun to use neural networks. Our research focuses on smiling expressions and extracts spatio-temporal features from facial videos using a state-of-the-art video vision transformer. We built a Siamese network based on a video vision transformer and trained it on a face video dataset. We experimentally compare the impact of dynamic features versus purely texture features on kinship verification. We then compare the capabilities of CNNs and ViTs in extracting facial dynamic features. We also evaluate the model under different initialization and training methods. Drawing on recent research, we developed a pre-training method based on matched expression sequences to address the challenge posed by the small size of the dataset. Our models are trained on smiling videos from the UvA-NEMO dataset, and we present results and analysis.
dc.description.sponsorship	Utrecht University
dc.language.iso	EN
dc.subject	Artificial Intelligence, Computer Vision, Kinship Verification
dc.title	KINSHIP VERIFICATION USING VISION TRANSFORMERS
dc.type.content	Master Thesis
dc.rights.accessrights	Open Access
dc.subject.keywords	Artificial Intelligence, Computer Vision, Kinship Verification
dc.subject.courseuu	Artificial Intelligence
dc.thesis.id	13980

Files in this item

Name: KVViT 23-02-09.pdf
Size: 7.764Mb
Format: PDF


This item appears in the following Collection(s)

Theses
