dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Dirksen, S
dc.contributor.author: Ilic, Linda
dc.date.accessioned: 2023-11-09T00:01:11Z
dc.date.available: 2023-11-09T00:01:11Z
dc.date.issued: 2023
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/45509
dc.description.abstract: As machine learning becomes more popular and available to all sectors, ML models have to be maintained and their performance has to be monitored. Large-scale disruptive events such as the COVID-19 pandemic have a big influence on society and possibly also on the data that reflects it. As a result, the performance of an ML model might decrease substantially. This change in data is difficult to monitor in the absence of labels. As this project is in collaboration with the Auditdienst Rijk, and labels are not readily available in their data environment, this paper proposes the use of the STUDD method to detect drift in an unsupervised way. The main hypothesis is that drift should be detected in new, unseen accounting data around early 2020, when new COVID-19-related policies were implemented and affected the budget and spending patterns of the Dutch government. Here we show that the STUDD method successfully detected drift in the new, unseen data early on in the pandemic year. However, this change cannot be attributed to the spread of COVID-19, as policies were implemented a substantial time after the first drift was detected. This might indicate other reasons for the changes in the data, such as the passage of time or external events that occurred in the previous year and had already induced drift.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: The thesis deals with drift detection in a setting where labels are not readily available. The thesis was done in collaboration with the Auditdienst Rijk. The hypothesis was that drift should be detected in new, unseen accounting data around early 2020, when new COVID-19-related policies were implemented and affected the budget and spending patterns of the Dutch government. Here we show that the STUDD method successfully detected drift in the new, unseen data early on in the pandemic year.
dc.title: Unsupervised drift detection in accounting data
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.courseuu: Applied Data Science
dc.thesis.id: 25789