Show simple item record

dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Gatt, A.
dc.contributor.author: Quantmeyer, Vincent
dc.date.accessioned: 2024-03-15T00:02:02Z
dc.date.available: 2024-03-15T00:02:02Z
dc.date.issued: 2024
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/46171
dc.description.abstract: Various benchmarks have measured linguistic capabilities of vision-and-language (VL) models, but do not provide insights into how models implement these capabilities. This thesis translates model interpretability techniques developed for large language models to the multimodal space in order to investigate the mechanisms involved in CLIP’s processing of negation. In the text encoder, specific negator-selective attention heads are found that seem crucial in controlling the movement of negation-related information through the model. Early evidence suggests that these heads are dataset-independent. In the image encoder, MLPs seem more relevant than attention, particularly in early layers, but further research is needed to elucidate these processes. As for CLIP’s imperfect ability to process negation correctly, multiple dataset features are identified that partly explain its performance, suggesting that benchmark performance isn’t a direct indicator of linguistic understanding. Future research directions are discussed that refine our understanding of the discovered mechanisms and test their generalisability on other datasets and models.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: This thesis translates model interpretability techniques developed for large language models to the multimodal space to investigate the mechanisms involved in CLIP’s processing of negation.
dc.title: How does CLIP process negation? A multimodal interpretability study
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.courseuu: Artificial Intelligence
dc.thesis.id: 29198