Anomaly detection with similarity graphs and active learning: Building and storing static and dynamic similarity graphs with the help of a vector database
Summary
Fraudulent credit card transactions are a major problem for financial institutions, and the problem continues to grow alongside digital transformation. A conventional view states that fraudulent transactions are anomalies. A novel view suggests that fraudulent transactions exist within fraud rings. An anonymized, sizeable, and imbalanced dataset of principal components is investigated to juxtapose the two perspectives on fraudulent transactions. Approximate nearest neighbour search identifies items that are similar in terms of Euclidean distance and can therefore be used to build similarity graphs. The similarity graphs yield valuable metrics for the classification of fraudulent transactions. The findings of this approach are as follows. First, the assortative mixing between fraudulent transactions is high in similarity graphs. Second, no topological difference exists between fraudulent and legitimate transactions. Third, fraudulent transactions are anomalies but also exist in fraud rings. Fourth, the effect of fraud rings is stronger than the effect of anomalies. Fifth, both perspectives yield useful variables for a classification model that is competitive with the state of the art.
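
The pipeline described above (nearest neighbour search, similarity graph, graph metrics) can be illustrated with a minimal sketch. It rests on several assumptions that are not taken from the study: an exact brute-force search stands in for approximate nearest neighbour search, the coordinates are invented toy data rather than the investigated dataset, the helper names (`nearest_neighbours`, `label_assortativity`) are hypothetical, and assortative mixing is measured here with Newman's coefficient for a categorical attribute, which may differ from the measure used in the study.

```python
import math

def nearest_neighbours(points, k):
    """Exact k nearest neighbours per point as sorted (distance, index) pairs.

    Assumption: brute force replaces the approximate search of a vector database.
    """
    return [
        sorted((math.dist(p, q), j) for j, q in enumerate(points) if j != i)[:k]
        for i, p in enumerate(points)
    ]

def label_assortativity(edges, labels):
    """Newman's assortativity coefficient for a categorical node attribute."""
    classes = sorted(set(labels))
    idx = {c: n for n, c in enumerate(classes)}
    mix = [[0.0] * len(classes) for _ in classes]
    for u, v in edges:
        a, b = idx[labels[u]], idx[labels[v]]
        mix[a][b] += 1  # count every undirected edge in both directions
        mix[b][a] += 1
    total = sum(map(sum, mix))
    observed = sum(mix[c][c] for c in range(len(classes))) / total
    ends = [sum(row) / total for row in mix]   # share of edge ends per class
    expected = sum(share * share for share in ends)
    return (observed - expected) / (1 - expected)

# Invented toy data: a grid of legitimate points, a tight "fraud ring",
# and one isolated fraudulent point that behaves like a classic anomaly.
legit = [(float(x), float(y)) for x in range(6) for y in range(6)]
ring = [(20.0, 20.0), (20.1, 20.0), (20.0, 20.1), (20.1, 20.1)]
lone = [(-15.0, -15.0)]
points = legit + ring + lone
labels = ["legit"] * len(legit) + ["fraud"] * (len(ring) + len(lone))

k = 3
neighbours = nearest_neighbours(points, k)

# Similarity graph: an undirected edge links each point to its k neighbours.
edges = {(min(i, j), max(i, j)) for i, nbrs in enumerate(neighbours) for _, j in nbrs}

# Anomaly view: mean distance to the k nearest neighbours.
scores = [sum(d for d, _ in nbrs) / k for nbrs in neighbours]
most_anomalous = max(range(len(points)), key=scores.__getitem__)

# Fraud-ring view: how strongly same-label nodes attach to each other.
assortativity = label_assortativity(edges, labels)

print("most anomalous index:", most_anomalous)
print("label assortativity:", round(assortativity, 3))
```

In this toy setting the isolated fraudulent point stands out through its distance to its neighbours, while the ring members are individually inconspicuous and only become visible through their connections to each other. This is one way to see why both perspectives could contribute separate variables to a classifier.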