S.O.C.C.E.R - A novel framework for measuring sports facts relevance
Summary
Sports stakeholders capture ever more detail about every facet of a sport for economic reasons. They must then select the information that is relevant to sports fans, gamblers, journalists, and other stakeholders who can use it. Sports has been a research subject in the Natural Language Generation (NLG) field because of the data it produces and its at times standardized natural language. However, NLG systems have struggled to match the performance of journalists and other sports experts because of various issues. We established that one of these issues is the lack of a proper definition of relevance for sports content. We set out to address this issue by defining a relevance measurement framework for the sports domain.
We introduced the term "sports fact" to refer to natural language about sports. We defined seven general types of content that are part of a sports fact and can influence relevance. We identified twelve properties that are used to measure relevance in other domains. We established a set of measurement guidelines that describe what should be measured, and how, for a given property and content type. Together, these form the S.O.C.C.E.R framework, the main artifact of this research project.