        Utrecht University Student Theses Repository

        Comparing Random Forest, Logistic Regression, and Heterogeneous Graph Neural Networks: Classifying Money Laundering in High Liquidity Sectors

        Thesis Manh Tri Ngo.pdf (1.081Mb)
        Publication date
        2025
        Author
        Ngô, Tri
        Summary
        Money laundering is the attempt by organizations or individuals to legitimize the origins of assets obtained through criminal activity. Modern money laundering schemes tend to form sophisticated criminal networks in which various entities and individuals play different roles, which makes detection and prevention with traditional methods, such as rule-based approaches, more challenging. This study applies and compares two machine learning methods (Random Forest and Logistic Regression) and one deep learning method (a Heterogeneous Graph Neural Network) to classify companies suspected of money laundering in high-liquidity sectors. The results indicate that the Heterogeneous Graph Neural Network outperforms the other models, achieving higher recall and AUC-ROC. A comparison of the network metrics with the confusion matrix clarifies the common characteristics of suspicious companies: companies that connect with many other firms, play a crucial intermediary role in the network, form a distinct community, and maintain close connections with each other are potentially engaged in illegal activity. These results provide a foundation for building robust anti-money laundering systems in the future. However, further research should focus on addressing data imbalance and gray data (unconfirmed money laundering) to improve the accuracy of the algorithms.
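
As an illustration only, and not the thesis's own implementation: two of the characteristics named in the summary, connecting with many other firms and maintaining close connections with each other, are commonly quantified with degree centrality and the local clustering coefficient. The sketch below computes both on a small, entirely hypothetical company graph. The edge list, the company labels, and the function names are assumptions made for this example, and the mapping from the summary's wording to these two metrics is an interpretation rather than something the summary states.

```python
from itertools import combinations

# Hypothetical toy network: each pair is an undirected link between two companies.
edges = [("A", "B"), ("A", "C"), ("A", "D"), ("B", "C"), ("D", "E")]


def build_adjacency(edge_list):
    """Build an undirected adjacency map: company -> set of neighbouring companies."""
    adj = {}
    for u, v in edge_list:
        adj.setdefault(u, set()).add(v)
        adj.setdefault(v, set()).add(u)
    return adj


def degree_centrality(adj):
    """Fraction of the other companies that each company is directly linked to."""
    n = len(adj)
    return {node: len(nbrs) / (n - 1) for node, nbrs in adj.items()}


def local_clustering(adj):
    """Fraction of a company's neighbour pairs that are themselves linked."""
    result = {}
    for node, nbrs in adj.items():
        k = len(nbrs)
        if k < 2:
            # With fewer than two neighbours there are no pairs to compare.
            result[node] = 0.0
            continue
        links = sum(1 for u, v in combinations(nbrs, 2) if v in adj[u])
        result[node] = 2 * links / (k * (k - 1))
    return result


adj = build_adjacency(edges)
degree = degree_centrality(adj)
clustering = local_clustering(adj)

for company in sorted(adj):
    print(company, round(degree[company], 2), round(clustering[company], 2))
```

In this toy graph, company A has the highest degree centrality, while B and C have the highest clustering because all of their neighbours are linked to each other. Scores of this kind could serve as input features for a classifier, or as a way to describe companies that a model flags, under the assumptions stated above.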
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/49805
        Collections
        • Theses