Community Detection in Historical Data Using Knowledge Graphs
Summary
In the field of Digital Humanities, knowledge graphs are used to store and model archival data. For instance, the ECARTICO dataset describes actors involved in the cultural industries of the Low Countries, and the STCN dataset describes published books and their authors and printers. This data opens up the possibility of performing community detection on parts of these (combined) datasets. For example, a combination of parts of the STCN and ECARTICO knowledge graphs could reveal networks of people who worked together through shared acquaintances, such as printers and publishers. The goal of this research project is to perform static and dynamic community detection on these (combined) datasets, in order to find interesting clusters and follow their evolution through time. To do this, community detection algorithms designed for Heterogeneous Information Networks are analysed, the historical data is converted into a suitable form, and the algorithms are applied to the data. Internal evaluation measures indicate that our method finds structure in the data, and evaluation by domain experts indicates that the results are valid.
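
To make the idea of people connected through shared printers concrete, the following is a minimal sketch, not the method used in this project. It assumes a toy list of (author, printer) pairs with hypothetical names that are not taken from STCN or ECARTICO, links two authors whenever they share a printer, and uses connected components as a deliberately simple stand-in for a community detection algorithm for Heterogeneous Information Networks.

```python
from collections import defaultdict
from itertools import combinations

# Hypothetical (author, printer) pairs; illustrative only, not real STCN/ECARTICO data.
WORKED_WITH = [
    ("author_a", "printer_x"),
    ("author_b", "printer_x"),
    ("author_b", "printer_y"),
    ("author_c", "printer_y"),
    ("author_d", "printer_z"),
    ("author_e", "printer_z"),
]


def project_on_authors(pairs):
    """Build an author-author graph: two authors are linked if they share a printer."""
    authors_by_printer = defaultdict(set)
    for author, printer in pairs:
        authors_by_printer[printer].add(author)

    neighbours = defaultdict(set)
    for author, _ in pairs:
        neighbours[author]  # register every author, even one with no co-authors
    for authors in authors_by_printer.values():
        for left, right in combinations(sorted(authors), 2):
            neighbours[left].add(right)
            neighbours[right].add(left)
    return neighbours


def connected_components(neighbours):
    """Group authors that are reachable from one another (a crude community proxy)."""
    seen, groups = set(), []
    for start in sorted(neighbours):
        if start in seen:
            continue
        stack, group = [start], set()
        while stack:
            node = stack.pop()
            if node in group:
                continue
            group.add(node)
            stack.extend(neighbours[node] - group)
        seen |= group
        groups.append(sorted(group))
    return groups


communities = connected_components(project_on_authors(WORKED_WITH))
print(communities)
```

In this toy input, author_a and author_c never share a printer directly but end up in the same group through author_b, which is the kind of indirect connection the combined datasets are expected to expose. A real analysis would replace the connected-components step with an algorithm that takes edge density and node types into account.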