
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Önal Ertugrul, I.
dc.contributor.author: Zou, Zihan
dc.date.accessioned: 2025-03-25T00:01:45Z
dc.date.available: 2025-03-25T00:01:45Z
dc.date.issued: 2025
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/48669
dc.description.abstract: This research explores the potential of predicting investment decisions in entrepreneurial pitches using multimodal signals, particularly visual and linguistic features. Entrepreneurial pitches are critical for securing funding, and signals from both verbal content and non-verbal cues, such as gestures and facial expressions, play a crucial role in shaping investors' decisions. However, existing studies have largely focused on isolated forms of signaling, leaving a gap in understanding how multimodal features interact to influence investment outcomes. This study proposes a machine-learning approach that leverages visual and linguistic cues from pitch videos to predict the likelihood of investment. Using the "Data Management Entrepreneurial Pitches" dataset, the research addresses several key questions, including the efficacy of visual and linguistic unimodal models, the benefits of combining modalities into a unified linguistic space, and the performance of multimodal fusion models. To this end, a series of neural network models were designed and tested, utilizing advanced techniques in Natural Language Processing (NLP) and Computer Vision, such as BERT, MEGA, VideoMAE, and VideoLLaVA. This thesis investigated the effectiveness of visual and linguistic multimodal models in predicting the probability of entrepreneurial investment and compared the performance of unimodal models with that of multimodal fusion models.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: Large vision and language models-based investment predictions on entrepreneurial pitch videos
dc.title: Large vision and language models-based investment predictions on entrepreneurial pitch videos
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Artificial Intelligence; VLM; Multi-modal learning; Investment decision
dc.subject.courseuu: Artificial Intelligence
dc.thesis.id: 44515

