        •   Utrecht University Student Theses Repository Home
        Synthetic network generation for financial data

        Thesis_Karen_Schutte_final.pdf (3.215Mb)
        Publication date
        2024
        Author
        Schutte, Karen
        Summary
        This thesis presents a novel approach to generating synthetic transaction networks. The research develops a graph-based generative model that replicates characteristics observed in real-world financial networks. The model is motivated by the need to preserve data privacy. It generates networks that exhibit power-law degree distributions, neither assortativity nor disassortativity, exponential weight distributions, and community structures similar to those found in actual financial transaction data. The methodology involves a clustering analysis of a real transaction dataset to identify node types, which are then integrated into the generative model. Parameters for node generation, edge densification, and a probability matrix governing type-based connections are established to control the network's structural properties. The model is validated against this real network dataset, which comes from Rabobank, by comparing metrics and structural properties. Experimental results show that the model can produce stable synthetic networks over 200,000 iterations, with generated networks exhibiting degree distributions, edge densities, and community structures comparable to those of the real dataset. The work has two main limitations. First, validation relies on a sampled and aggregated dataset, which restricts the model's ability to capture the full complexity of real financial networks. Second, the model's exponential weight distribution diverges from the power-law weight distribution of the real dataset. This research contributes a publicly available tool that can serve as a starting point for generating synthetic financial transaction networks, facilitating applications such as training machine learning models to detect criminal financial activity. Future research directions include improving weight distribution modeling, exploring algorithms for power-law distributions, and extending the model to include interbank networks and temporal dynamics.
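        The ingredients named in the summary (node types, node generation, edge densification, a type-based probability matrix, exponential weights) can be illustrated with a minimal Python sketch. This is not the thesis's algorithm or its published tool. The function name `generate_network`, the parameters `p_new_node` and `weight_scale`, the degree-based attachment rule, and all numeric values are assumptions made only for illustration.

```python
import random


def generate_network(n_iterations, type_probs, connect_matrix,
                     p_new_node=0.3, weight_scale=100.0, seed=0):
    """Grow a directed, weighted, typed network (illustrative sketch only)."""
    rng = random.Random(seed)
    type_ids = list(range(len(type_probs)))

    # Seed the network with two typed nodes joined by one weighted edge.
    node_types = rng.choices(type_ids, weights=type_probs, k=2)
    degree = [1, 1]
    edges = {(0, 1): rng.expovariate(1.0 / weight_scale)}

    def pick_target(source):
        # Assumed rule: degree-based preferential attachment, scaled by the
        # type-to-type connection probability of the source's type.
        row = connect_matrix[node_types[source]]
        candidates = [v for v in range(len(node_types)) if v != source]
        weights = [(degree[v] + 1) * row[node_types[v]] for v in candidates]
        if sum(weights) <= 0:
            return None
        return rng.choices(candidates, weights=weights, k=1)[0]

    for _ in range(n_iterations):
        if rng.random() < p_new_node:
            # Node generation: add a node whose type is sampled from type_probs.
            node_types.append(rng.choices(type_ids, weights=type_probs, k=1)[0])
            degree.append(0)
            source = len(node_types) - 1
        else:
            # Edge densification: add an edge starting at an existing node.
            source = rng.randrange(len(node_types))
        target = pick_target(source)
        if target is None:
            continue
        # Exponentially distributed weight; repeated edges accumulate weight.
        w = rng.expovariate(1.0 / weight_scale)
        edges[(source, target)] = edges.get((source, target), 0.0) + w
        degree[source] += 1
        degree[target] += 1

    return node_types, edges


# Hypothetical two-type example; the matrix values are made up.
types, edges = generate_network(2000, [0.7, 0.3], [[0.2, 0.8], [0.5, 0.5]])
```

        In this sketch, `connect_matrix[i][j]` scales how likely a node of type `i` is to send an edge to a node of type `j`, and `p_new_node` trades off growth against densification. Whether the resulting degree distribution actually follows a power law would have to be checked empirically, as the thesis does for its own model against the real data.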
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/46862
        Collections
        • Theses