        •   Utrecht University Student Theses Repository Home
        • UU Theses Repository
        • Theses
        • View Item

        Assessing the spatial context of sentiments in geo-social media

        MScThesisReport_VStrasser_SpatialContextOfSentimentsInGeosocialMedia_v2.pdf (13.97Mb)
        Publication date
        2019
        Author
        Straßer, V.E.
        Summary
        Social media platforms now enable millions of users worldwide to publish content on the internet with little effort. As the volume of data created by these users grows daily, many researchers are investigating how to exploit it and derive useful information from it. The sheer number of available social media records requires automated data handling and text processing in order to derive the desired information and conduct research on it. This research aims in particular at assessing sentiments and their spatial context, using data from the platforms Twitter and Flickr gathered throughout 2018. Textual content from these platforms is assessed and classified according to the sentiments it contains. Furthermore, options for deriving accurate location information from these social media records are reviewed and applied to data originating in the Greater London area. Particular attention is paid to techniques for creating data samples that are representative of the respective population, with minimized bias towards certain user groups. Average sentiment scores are then aggregated for administrative area types of different granularity, and the suitability of the different output granularities for this purpose is also investigated. Finally, spatial patterns in sentiments are assessed, and it is examined whether correlations between average sentiments and socio-economic indicators are detectable. The paper investigates to what degree sentiments on social media can serve as a socio-economic indicator. The methods used comprise the handling and processing of large data volumes in SQL databases, automated text processing (with sentiment analysis tools) and various approaches to spatial analysis.
The nature of the spatial patterns is examined using a Global Moran's I assessment and an optimized hot spot analysis. Socio-economic indicators are reviewed with regard to their Pearson correlation with the derived sentiment scores. The analysis results are discussed and interpreted on the basis of these findings and a series of small-scale examples reviewed within the city area. While different approaches to improved data sampling are successfully developed and employed in this research, the results suggest that further investigation is needed to reach an adequate representation of a population within a data sample. Twitter data is clearly identified as having greater potential for the purpose of this research than Flickr data. Regarding automated sentiment analysis, a comparative study of available instruments identifies the tool SentiStrength as ideal for the purpose of this research. It is employed to classify sentiments for a vast number of social media records, which are subsequently aggregated to average values at different output granularities. While spatial patterns in social media sentiments are clearly detected, correlations between those sentiments and socio-economic indicators are only weakly traceable, with Pearson's r correlation coefficients reaching at most slightly above +0.3. Finally, with the aim of strengthening the eligibility of social media content as a socio-economic indicator, suggestions for further improvements in data sampling and analysis methods are given.
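For readers unfamiliar with the two statistics named in the summary, the following is a minimal sketch of how Global Moran's I and Pearson's r can be computed from per-area average sentiment scores. It is not the thesis's implementation: the summary does not state which software was used, and the toy values, the four-area layout, the binary contiguity weights and the function names `morans_i` and `pearson_r` are all assumptions made for illustration.

```python
from math import sqrt


def morans_i(values, weights):
    """Global Moran's I for one value per area and a square spatial weights matrix."""
    n = len(values)
    mean = sum(values) / n
    dev = [v - mean for v in values]
    s0 = sum(sum(row) for row in weights)  # sum of all spatial weights
    # Weighted cross-products of deviations between every pair of areas
    cross = sum(weights[i][j] * dev[i] * dev[j]
                for i in range(n) for j in range(n))
    var = sum(d * d for d in dev)
    return (n / s0) * (cross / var)


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(sxx * syy)


# Hypothetical toy data: four areas arranged in a row (1-2-3-4).
# avg_sentiment stands in for aggregated sentiment scores per area,
# indicator for some socio-economic indicator; neither is from the thesis.
avg_sentiment = [0.1, 0.2, 0.6, 0.7]
indicator = [10.0, 12.0, 20.0, 19.0]

# Binary contiguity weights: 1 if two areas share a border, else 0 (assumed).
weights = [
    [0, 1, 0, 0],
    [1, 0, 1, 0],
    [0, 1, 0, 1],
    [0, 0, 1, 0],
]

moran = morans_i(avg_sentiment, weights)
r = pearson_r(avg_sentiment, indicator)
print(f"Global Moran's I: {moran:.3f}")
print(f"Pearson's r: {r:.3f}")
```

A positive Moran's I indicates that neighbouring areas tend to have similar values, which is what a detected spatial pattern corresponds to. A real analysis would normally also test the statistic for significance, for example with a permutation test, which this sketch omits.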
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/35256
        Collections
        • Theses