dc.rights.license | CC-BY-NC-ND | |
dc.contributor.advisor | Monachesi, Paola | |
dc.contributor.advisor | Feelders, Ad | |
dc.contributor.author | Tunru, V.A. | |
dc.date.accessioned | 2014-09-16T17:00:59Z | |
dc.date.available | 2014-09-16T17:00:59Z | |
dc.date.issued | 2014 | |
dc.identifier.uri | https://studenttheses.uu.nl/handle/20.500.12932/18330 | |
dc.description.abstract | In this thesis I propose repurposing Latent Dirichlet Allocation (LDA), a topic modeling algorithm, for the discovery of communities of interest. To test this approach, I apply it to discover communities on the social news and entertainment website reddit. I then use it to compare the composition of communities of interest with that of topological communities, that is, communities discovered based on the topology of social graphs. I use both methods to find communities in the Enron email corpus, and compare their results using cluster evaluation methods. | |
dc.description.sponsorship | Utrecht University | |
dc.format.extent | 1375006 | |
dc.format.mimetype | application/pdf | |
dc.language.iso | en | |
dc.title | Comparing Topological Communities and Communities of Interest Using Topic Modeling | |
dc.type.content | Master Thesis | |
dc.rights.accessrights | Open Access | |
dc.subject.keywords | topic modeling;latent dirichlet allocation;LDA;machine learning;unsupervised learning;communities;community of interest;topological community;graph;social graph;reddit;Enron;mutual information;normalised mutual information;NMI;Jaccard Index;cluster validation;information theory | |
dc.subject.courseuu | Technical Artificial Intelligence | |