        •   Utrecht University Student Theses Repository Home

        Hybrid Trace Clustering

        View/Open
        Thesis_Final.pdf (6.489Mb)
        Publication date
        2021
        Author
        Abdollahi, M.
        Summary
        Process mining is a relatively young analytical discipline that bridges data mining and business process management. Experts use process mining to gain insight into how processes work in real life, how far they deviate from expectations, and how they can be improved. However, for large and complex data, the models discovered by process mining algorithms can be quite complicated. In such cases, known as spaghetti-like models, the required knowledge cannot easily be extracted from the model. One possible way to avoid such models is to cluster traces that share homogeneous behavior. While existing approaches offer promising results, each suffers from at least one drawback, such as high computational complexity, low-quality models, or no explanation for the presence of irrelevant traces in some clusters. This thesis aims to hybridize two types of trace clustering, similarity-based and model-driven, by introducing an algorithm that balances the quality of the cluster models against the run time of the algorithm. Evaluation of our technique (Hybrid) on six real-life event logs shows meaningful improvements in process model quality over applying similarity-based or model-driven techniques individually. The results of the performance and scalability evaluation also reveal that the Hybrid technique delivers clusters with lower running times on average than a state-of-the-art model-driven technique.
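
        To make the idea of clustering traces with homogeneous behavior concrete, the sketch below groups traces by the cosine similarity of their activity counts. It is not the Hybrid algorithm from the thesis, only a generic similarity-based illustration. The function names, the 0.8 threshold, the greedy "leader" assignment, and the toy event log are all assumptions made for this example.

```python
from collections import Counter
from math import sqrt


def activity_profile(trace):
    """Bag-of-activities profile: how often each activity occurs in a trace."""
    return Counter(trace)


def cosine_similarity(p, q):
    """Cosine similarity between two activity profiles (0.0 if either is empty)."""
    dot = sum(p[a] * q[a] for a in p.keys() & q.keys())
    norm_p = sqrt(sum(v * v for v in p.values()))
    norm_q = sqrt(sum(v * v for v in q.values()))
    if norm_p == 0 or norm_q == 0:
        return 0.0
    return dot / (norm_p * norm_q)


def cluster_traces(traces, threshold=0.8):
    """Greedy leader clustering (an assumed, simple scheme).

    Each trace joins the first cluster whose leader profile is at least
    `threshold` similar; otherwise it starts a new cluster and becomes
    that cluster's leader. Returns lists of trace indices.
    """
    clusters = []  # list of (leader_profile, member_indices)
    for i, trace in enumerate(traces):
        profile = activity_profile(trace)
        for leader, members in clusters:
            if cosine_similarity(profile, leader) >= threshold:
                members.append(i)
                break
        else:
            clusters.append((profile, [i]))
    return [members for _, members in clusters]


# Hypothetical toy event log: each trace is a sequence of activity names.
log = [
    ["register", "check", "approve", "pay"],
    ["register", "check", "check", "approve", "pay"],
    ["register", "reject", "notify"],
    ["register", "check", "reject", "notify"],
]
clusters = cluster_traces(log, threshold=0.8)
print(clusters)
```

        On this toy log the approval traces and the rejection traces end up in separate clusters, so a process model discovered per cluster would likely be simpler than one discovered from the whole log. A profile based only on activity counts ignores the order of activities, which is one reason similarity-based clustering alone may not guarantee high-quality models.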
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/39160
        Collections
        • Theses