Predicting Online Participation in Public Broadcasting Using Machine Learning
Summary
As it remains unclear how the success of public broadcasting on social media should
be measured, this thesis argues that online participation could serve as an alternative social
media metric that aligns with public broadcasting’s traditional aims and draws support from
more general public research and audience research. Consequently, public broadcasters would
benefit from understanding how much online participation they can expect from their new media
content. For this reason, and to help characterize online participation as an alternative metric
of broadcasting success, this thesis aims to investigate to what extent a predictive model of
online participation can be built, and which predictors are important in the process. In addition,
this thesis aims to investigate whether topic modeling can be applied to the subtitles of public
broadcasting shows to generate useful features for the prediction of online participation. Using
data from the Dutch public broadcasting service, 22 potentially predictive features of online
participation were collected and created, missing values and outliers were handled, and seven
models were individually tuned and compared with the aim of achieving the best prediction
performance. The results suggest that although most values of online participation can be
predicted with reasonable accuracy, the model performs poorly on large values of online
participation. Furthermore, the results indicate that including topics as features did not
lead to significant improvements in prediction performance but did generate some useful
insights. Scholars and public broadcasting organizations may use the results of this thesis to
enhance their understanding of online participation as an alternative metric of broadcasting
success.
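
The finding that prediction quality differs between typical and large values of online participation can be checked by computing the error separately for each group. The following is a minimal sketch of such a check, not the evaluation procedure used in this thesis: the function name mae_by_magnitude, the choice of mean absolute error, and the 0.9 quantile cut-off are all illustrative assumptions.

```python
from statistics import mean


def mae_by_magnitude(actual, predicted, quantile=0.9):
    """Return (MAE on typical values, MAE on large values).

    Hypothetical helper: "large" means at or above the value found at the
    given quantile position of the sorted actual values. The 0.9 default
    is an arbitrary illustrative choice.
    """
    if not actual or len(actual) != len(predicted):
        raise ValueError("actual and predicted must be non-empty and equal in length")

    # Locate the threshold that separates typical from large values.
    ranked = sorted(actual)
    cut_index = min(int(quantile * len(ranked)), len(ranked) - 1)
    threshold = ranked[cut_index]

    # Absolute errors, split by the size of the true value.
    typical = [abs(a - p) for a, p in zip(actual, predicted) if a < threshold]
    large = [abs(a - p) for a, p in zip(actual, predicted) if a >= threshold]

    # "large" always contains at least the threshold value itself;
    # "typical" can be empty when all values are equal.
    return (mean(typical) if typical else None, mean(large))
```

A large gap between the two returned errors would point to the pattern described above, in which a model fits most observations well but underestimates the few items that attract very high participation.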