dc.rights.license | CC-BY-NC-ND | |
dc.contributor.advisor | Nguyen, Dennis | |
dc.contributor.author | Wekking, Michael | |
dc.date.accessioned | 2022-09-09T02:03:04Z | |
dc.date.available | 2022-09-09T02:03:04Z | |
dc.date.issued | 2022 | |
dc.identifier.uri | https://studenttheses.uu.nl/handle/20.500.12932/42557 | |
dc.description.abstract | The value of diversity, in terms of the representation of people, has recently come to the forefront for public broadcasters, including the Dutch NPO. The NPO measures diversity through a questionnaire that asks people to what extent they see or hear people from different population groups in an episode. This thesis aims to predict this ‘diversity score’ from term frequency (TF), term frequency-inverse document frequency (TF-IDF) and latent Dirichlet allocation (LDA) features, to gain insight into how well words and topics predict diversity in media content. Both words and topics are found to predict this measure of diversity: the explained variance of the predicted diversity score ranges from 8% to 49.7%, depending on the dataset. | |
dc.description.sponsorship | Utrecht University | |
dc.language.iso | EN | |
dc.subject | Automating a measure of diversity for the NPO using subtitles of NPO shows | |
dc.title | Predicting diversity in subtitles of NPO shows | |
dc.type.content | Master Thesis | |
dc.rights.accessrights | Open Access | |
dc.subject.keywords | machine learning; data mining; natural language processing; diversity; public broadcasting | |
dc.subject.courseuu | Applied Data Science | |
dc.thesis.id | 9851 | |