        Utrecht University Student Theses Repository

        Selecting meaningful images: The importance of global and local visual cues in EQA

        Master_Thesis_Bart_Verhoef_6601723.pdf (9.422Mb)
        Publication date
        2025
        Author
        Verhoef, Bart
        Summary
        Embodied Question Answering (EQA) challenges agents to explore an environment and select relevant visual information to answer questions posed in natural language. Existing EQA systems often use vision-language models (VLMs) to handle perception, language understanding, and action, making it difficult to assess their capabilities in each individual dimension. In this work, we present a novel preprocessing step that focuses specifically on the vision-language integration problem, independent of action or navigation. Our method leverages global scene embeddings and grounded object-centric features to identify relevant frames. To improve object-level grounding, we use a large language model (LLM) to extract explicitly stated and contextually implied objects. Evaluated on the OpenEQA EM-EQA benchmark, using both global and local visual cues achieves an average LLM-Match score of 70.8, improving the state of the art by 15.5 percentage points. By reducing the number of frames processed, our approach increases correctness, reduces computational cost and improves explainability.
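The idea of combining global and local cues to pick frames can be pictured with a small sketch. This is not the thesis's implementation: the function names, the weighting parameter `alpha`, the overlap-based local score, and the assumption that each frame already comes with an embedding and a list of detected object labels are all hypothetical choices made for illustration only.

```python
import numpy as np


def cosine_sim(query, matrix):
    """Cosine similarity between one vector and each row of a matrix."""
    query = query / np.linalg.norm(query)
    matrix = matrix / np.linalg.norm(matrix, axis=1, keepdims=True)
    return matrix @ query


def select_frames(question_emb, frame_embs, frame_objects, target_objects,
                  alpha=0.5, top_k=3):
    """Rank frames by a weighted mix of a global and a local score.

    Hypothetical scoring, assumed for illustration:
      global score = cosine similarity of question and frame embeddings
      local score  = fraction of target objects detected in the frame
    """
    global_score = cosine_sim(question_emb, frame_embs)

    targets = set(target_objects)
    if targets:
        local_score = np.array(
            [len(targets & set(objs)) / len(targets) for objs in frame_objects]
        )
    else:
        # No target objects extracted: fall back to the global cue only.
        local_score = np.zeros(len(frame_objects))

    combined = alpha * global_score + (1 - alpha) * local_score
    # Highest combined score first; stable sort keeps frame order on ties.
    order = np.argsort(-combined, kind="stable")[:top_k]
    return order.tolist()
```

In this sketch, `target_objects` stands in for the objects an LLM might extract from the question (both stated and implied), and only the returned frame indices would be passed on to a downstream VLM.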
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/50735
        Collections
        • Theses