dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Nguyen, Dennis
dc.contributor.author: Wekking, Michael
dc.date.accessioned: 2022-09-09T02:03:04Z
dc.date.available: 2022-09-09T02:03:04Z
dc.date.issued: 2022
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/42557
dc.description.abstract: The value of diversity, in terms of representation of people, has recently come to the forefront for public broadcasters, including the Dutch NPO. The NPO measures diversity through a questionnaire, which asks people to what extent they see or hear people from different population groups in an episode. This thesis aims to predict this ‘diversity score’ using TF, TF-IDF and LDA, to gain insight into the predictive capacity of words and topics for diversity in media content. Both words and topics are found that predict this measure of diversity: the diversity score can be predicted with explained variances between 8% and 49.7%, depending on the dataset.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: Automating a measure of diversity for the NPO using subtitles of NPO shows
dc.title: Predicting diversity in subtitles of NPO shows
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: machine learning; data mining; natural language processing; diversity; public broadcasting
dc.subject.courseuu: Applied Data Science
dc.thesis.id: 9851

