dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Monachesi, Paola
dc.contributor.advisor: Feelders, Ad
dc.contributor.author: Tunru, V.A.
dc.date.accessioned: 2014-09-16T17:00:59Z
dc.date.available: 2014-09-16T17:00:59Z
dc.date.issued: 2014
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/18330
dc.description.abstract: In this thesis I propose the repurposing of Latent Dirichlet Allocation (LDA), a topic modeling algorithm, for the discovery of communities of interest. To test it, I use it to discover communities on the social news and entertainment website reddit. I then use it to compare the composition of communities of interest to that of topological communities: communities discovered based on the topology of social graphs. I use both methods to find communities based on the Enron email corpus, and compare their results using cluster evaluation methods.
dc.description.sponsorship: Utrecht University
dc.format.extent: 1375006
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.title: Comparing Topological Communities and Communities of Interest Using Topic Modeling
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: topic modeling; latent dirichlet allocation; LDA; machine learning; unsupervised learning; communities; community of interest; topological community; graph; social graph; reddit; Enron; mutual information; normalised mutual information; NMI; Jaccard Index; cluster validation; information theory
dc.subject.courseuu: Technical Artificial Intelligence