
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Qahtan, Hakim
dc.contributor.author: Tsiamis, Thanos
dc.date.accessioned: 2024-08-30T23:02:12Z
dc.date.available: 2024-08-30T23:02:12Z
dc.date.issued: 2024
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/47516
dc.description.abstract: FIONA (FInding Outliers iN Attributes) is a novel framework for detecting outliers in categorical data. Outliers, which often indicate errors or anomalous observations, can significantly affect data analysis and decision-making. For categorical attributes, detecting outliers requires defining a similarity metric between values, which is harder than for numerical attributes. FIONA addresses this challenge by focusing on the syntactic structure of attribute values. The framework is unsupervised and needs no training examples. It uses syntactic transformations, such as regular expressions and generalizations, to capture the structural characteristics of categorical values. By constructing a tree-like structure and applying a custom scoring function, FIONA systematically compares attribute values and evaluates their similarity. An evaluation on several datasets demonstrates its effectiveness in outlier detection. Although some false positives are reported, further analysis of them yields interesting insights and highlights the importance of considering semantic context alongside syntactic structure. Unlike conventional baseline methods, FIONA handles large datasets efficiently, making it a valuable tool for outlier detection in real-world applications.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: FIONA, short for "Finding Outliers in Attributes," is an innovative framework designed for the detection of categorical outliers within datasets. This configuration-free and user-friendly tool specializes in identifying unusual patterns in data attributes, making it invaluable for data analysis and decision-making processes.
dc.title: FIONA - A Categorical Outlier Detector Framework
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Categorical Outlier Detection; Framework; Configuration-free
dc.subject.courseuu: Computing Science
dc.thesis.id: 24776
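
The abstract describes generalizing categorical values into syntactic patterns and scoring how unusual each value is. The sketch below illustrates that general idea only and is not the thesis's implementation: it replaces the tree-like structure and custom scoring function with a flat frequency count, and the names generalize, find_outliers, and min_share are hypothetical, introduced here for illustration.

```python
import re
from collections import Counter

# Runs of uppercase letters, lowercase letters, or digits (ASCII only).
_TOKEN = re.compile(r"[A-Z]+|[a-z]+|[0-9]+")


def generalize(value: str) -> str:
    """Map a value to a coarse syntactic pattern.

    Assumption for illustration: each run of digits becomes 'D', each run
    of uppercase letters 'U', each run of lowercase letters 'L'. All other
    characters (punctuation, spaces) are kept as they are.
    """
    def replace_run(match: re.Match) -> str:
        first = match.group(0)[0]
        if first.isdigit():
            return "D"
        if first.isupper():
            return "U"
        return "L"

    return _TOKEN.sub(replace_run, value)


def find_outliers(values: list[str], min_share: float = 0.2) -> list[str]:
    """Return values whose pattern covers less than min_share of the column.

    This flat frequency threshold is a stand-in, not FIONA's scoring function.
    """
    patterns = [generalize(v) for v in values]
    counts = Counter(patterns)
    total = len(values)
    return [v for v, p in zip(values, patterns) if counts[p] / total < min_share]


if __name__ == "__main__":
    dates = [
        "2024-08-30", "2023-01-15", "2022-11-02",
        "2021-07-19", "30 Aug 2024", "2020-03-05",
    ]
    for value in find_outliers(dates):
        print(value, "->", generalize(value))
```

In this toy column, five values share the pattern D-D-D and one has the pattern D UL D, so only the latter falls below the 20% share. A purely syntactic check like this cannot tell whether a well-formed value is semantically wrong, which matches the abstract's remark about considering semantic context.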

