Sexual health information: credible or not?
Summary
Young people now use online media as their primary source of sexual health information. However, user-generated content is not always reliable and can cause health problems, especially since it is difficult to distinguish credible from unreliable information. Multiple studies have addressed the automatic credibility assessment of online media, but not specifically for sexual health information. These studies make use of markers that may indicate misinformation. This study therefore aims to obtain an overview of these markers and to examine whether and how they can be applied to the automatic credibility evaluation of user-generated sexual health data.
Based on the literature, this study created a comprehensive overview of all content-based markers (i.e., markers that can be derived from the text) that aided in credibility detection. A subset of these markers was modelled on the data using both supervised machine learning and more conventional methods. Subsequently, the relationships between the markers were examined to see whether they aligned with the literature.
This study illustrates the disagreement within the existing literature on how the markers aid in credibility detection. In addition, the results indicate that there were no relationships between the vast majority of markers. This may be the first indication that most markers from the literature cannot be generalised to the automatic credibility assessment of sexual health data. However, this cannot be said with certainty due to the lack of a labelled dataset. The study lists several ideas for future research should a labelled dataset become available; these emphasise that content-based markers should be used in combination with other markers.