Cities exist as nodes in a network, and today, more than ever, network embeddedness is of critical importance in understanding how they develop. There was a time when the network of cities was highly clustered and relationships were limited to geographical neighbors. As recreational travel grew and communication technology emerged, the network topology of cities changed. People could build their networks and migrate with greater ease. Today, a large part of human interaction and relationship building occurs online, often in informal and colloquial forums. This paper aims to apply the toponym co-occurrence method in a novel way to online conversations in order to determine how well we can explain the co-occurrence of city names in social media. To this end, we will focus on European cities and the social media website Reddit. Disambiguation poses some difficulty, as the name of a city can often take on multiple meanings. We will approach this task in three stages. First, we will explore the data to get a better understanding of toponym co-occurrences. Then we will use regression techniques to determine how well the information collected can predict variability in the co-occurrence of city names. Finally, we will use unsupervised machine learning techniques to assess how well we can disambiguate between different word senses when searching for toponyms.
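As a rough illustration of what counting toponym co-occurrences can look like, the following minimal sketch tallies how often pairs of city names appear in the same piece of text. The city list, the sample comments, and the function name `count_cooccurrences` are hypothetical and chosen only for this example; they are not taken from the paper's data or code.

```python
import re
from collections import Counter
from itertools import combinations

# Hypothetical city list, for illustration only.
CITIES = ["Amsterdam", "Berlin", "Paris", "Vienna"]


def count_cooccurrences(comments, cities=CITIES):
    """Count how many comments mention each unordered pair of cities."""
    # One whole-word, case-insensitive pattern per city name.
    patterns = {
        city: re.compile(r"\b" + re.escape(city) + r"\b", re.IGNORECASE)
        for city in cities
    }
    counts = Counter()
    for text in comments:
        # Cities mentioned at least once in this comment, in sorted order
        # so that each pair is always stored under the same key.
        found = sorted(c for c, p in patterns.items() if p.search(text))
        for pair in combinations(found, 2):
            counts[pair] += 1
    return counts


# Made-up comments standing in for social media posts.
comments = [
    "Took the train from Berlin to Vienna last summer.",
    "Paris is pricier than Berlin, but Vienna beats both.",
    "Amsterdam was rainy.",
]
pair_counts = count_cooccurrences(comments)
for pair, n in sorted(pair_counts.items()):
    print(pair, n)
```

Plain string matching of this kind cannot tell a city from another sense of the same word (for example, "Paris" as a personal name), which is the disambiguation problem that the third stage addresses.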