dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Feelders, A.J.
dc.contributor.author: Sarantopoulos, C.
dc.date.accessioned: 2015-10-15T17:00:28Z
dc.date.available: 2015-10-15T17:00:28Z
dc.date.issued: 2015
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/28112
dc.description.abstract: The ever-growing spread of spam emails, although adequately countered by spam filters, can be addressed more effectively by understanding how spammers act. Grouping spam emails into spam campaigns provides valuable information on many aspects: how spammers obfuscate, how seemingly different spam campaigns correlate, and many descriptive statistics. In this thesis, we focus on identifying spam campaigns over a 7.5-month period by clustering the web pages referred to by the URLs inside the spam emails, based on their content. We then apply Latent Dirichlet Allocation to assign a topic to every cluster and, finally, present a mechanism that incrementally clusters incoming spam emails into spam campaigns in an automatic, on-line environment. We argue that our method for spam campaign identification is quick and efficient and represents the identified spam campaigns compactly. In addition, it can contribute to a better understanding of the domain and its applications.
dc.description.sponsorship: Utrecht University
dc.format.extent: 520196
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.title: Identification and on-line incremental clustering of spam campaigns
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Spam campaigns, spam emails, clustering, topic modelling, incremental, online, automatically
dc.subject.courseuu: Computing Science

