dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Campos, C.P. de
dc.contributor.advisor: Brinkhuis, M.J.S.
dc.contributor.author: Wit, R.C. de
dc.date.accessioned: 2019-07-19T17:00:39Z
dc.date.available: 2019-07-19T17:00:39Z
dc.date.issued: 2019
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/32878
dc.description.abstract: The opaqueness of machine learning (ML) models is a major hurdle for adoption throughout the industry. Actors do not outright trust predictions generated by models. Instead, they opt for business rule-based systems in combination with manual classification. Reliable machine learning (RML) seeks to mitigate this issue by providing a metric that establishes the extent to which the model is certain about its predictions. We investigate the applicability of RML techniques for high-stakes contexts. Specifically, we conduct a case study in fraud detection at bol.com, the Netherlands' largest online retail platform. Using hand-labelled data from 199,939 orders, we train models that classify those orders as either fraudulent or legitimate. These models are a Naive Bayes classifier (NB), a Credal Sum-Product Network (CSPN), and an XGBoost gradient booster (XGB). All three of these models provide probability as a reliability metric for their predictions, and the NB and CSPN models provide robustness as a metric as well. The analysis of robustness in a CSPN with a continuous feature is a novelty and extends its application to many new domains. While the overall accuracy of the models does not exceed that of manual classification, we demonstrate how RML can improve upon existing business processes in four ways: 1) providing accurate predictions over a subset of the observations; 2) allowing for a flexible accuracy/cover trade-off; 3) inferring the latent difficulty variable for classifying individual observations; and 4) eliciting features and feature combinations that allow for increased business knowledge. This shows how RML can yield improvements upon existing business processes even when overall predictive accuracy of models is low, validating and building upon existing research in this domain.
dc.description.sponsorship: Utrecht University
dc.language.iso: en
dc.title: To Catch a Thief: fraud detection with reliable machine learning
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.courseuu: Business Informatics
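
The accuracy/cover trade-off mentioned in the abstract can be illustrated with a minimal sketch. It assumes a binary classifier that outputs a fraud probability per order, and it abstains whenever its confidence falls below a chosen threshold. The function name `accuracy_coverage` and the toy data are hypothetical and are not taken from the thesis; the thesis may define reliability and cover differently.

```python
from typing import List, Tuple


def accuracy_coverage(
    probs: List[float], labels: List[int], threshold: float
) -> Tuple[float, float]:
    """Return (accuracy, coverage) when the model only predicts for
    observations whose confidence is at least `threshold`.

    probs  -- predicted probability of the positive (fraud) class
    labels -- true classes, 1 = fraudulent, 0 = legitimate
    """
    kept = 0
    correct = 0
    for p, y in zip(probs, labels):
        confidence = max(p, 1.0 - p)  # probability of the predicted class
        if confidence < threshold:
            continue  # abstain, e.g. defer the order to manual review
        kept += 1
        prediction = 1 if p >= 0.5 else 0
        correct += int(prediction == y)
    coverage = kept / len(probs) if probs else 0.0
    accuracy = correct / kept if kept else float("nan")
    return accuracy, coverage


# Made-up toy data, for illustration only.
probs = [0.95, 0.05, 0.60, 0.45, 0.85, 0.25]
labels = [1, 0, 0, 1, 1, 0]

for t in (0.5, 0.7, 0.9):
    acc, cov = accuracy_coverage(probs, labels, t)
    print(f"threshold={t:.1f} accuracy={acc:.2f} coverage={cov:.2f}")
```

Raising the threshold in this toy example shrinks the share of orders the model decides on while increasing accuracy on that share, which is the kind of trade-off the abstract describes.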

