        The estimation of model performance on unseen data

        Thesis_MSc Applied Data Science - Confidence Based Performance Estimation - Alex Essaijan.pdf (516.6Kb)
        Publication date
        2023
        Author
        Essaijan, Alex
        Summary
        In machine learning, models are typically trained on one dataset and evaluated on a separate, labeled test set. Assessing their performance in real-world environments is harder, especially when labeled data are scarce. This study focuses on estimating the performance of machine learning classifiers in financial audits, specifically on unseen accounting data. Using the Confidence Based Performance Estimation methodology, performance metrics can be estimated accurately from the model's predicted labels and predicted probabilities, without ground-truth labels. These estimates rest on three assumptions: there is no concept drift, the model is well calibrated, and it performs consistently across all classes. The findings have practical implications for auditors, offering insight into the feasibility and usability of integrating machine learning models into audit procedures and supporting informed decisions about adopting them. The research also contributes to the field by emphasizing the importance of considering class discrepancies and by promoting a data-driven approach that improves sampling methods beyond traditional random sampling. Future research could address multiclass calibration, class imbalance, threshold selection methods, and real-time monitoring of model performance, which would enhance the robustness and applicability of machine learning models in production settings.
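
        The idea of estimating performance from predicted probabilities alone can be illustrated with a minimal sketch. It is not taken from the thesis: it assumes a binary classifier whose scores are well calibrated (a score of 0.8 means the positive class occurs about 80% of the time), and the function name `estimate_performance` and the 0.5 threshold are hypothetical choices made for illustration. Each prediction contributes fractional counts to an expected confusion matrix, from which the usual metrics follow.

```python
import math


def estimate_performance(probs, threshold=0.5):
    """Estimate metrics without labels from calibrated positive-class scores.

    Assumption (illustrative only): scores are calibrated, so a sample with
    score p is truly positive with probability p.
    """
    tp = fp = fn = tn = 0.0
    for p in probs:
        if p >= threshold:
            # Predicted positive: correct with probability p.
            tp += p
            fp += 1.0 - p
        else:
            # Predicted negative: correct with probability 1 - p.
            fn += p
            tn += 1.0 - p

    total = tp + fp + fn + tn
    accuracy = (tp + tn) / total if total > 0 else math.nan
    precision = tp / (tp + fp) if (tp + fp) > 0 else math.nan
    recall = tp / (tp + fn) if (tp + fn) > 0 else math.nan
    return {"accuracy": accuracy, "precision": precision, "recall": recall}


# Four unlabeled records with calibrated scores (made-up values).
estimates = estimate_performance([0.9, 0.8, 0.3, 0.1])
print(estimates)
```

        If the scores are poorly calibrated or the data distribution has shifted in a way that changes the relationship between features and labels, these estimates can be misleading, which is why the assumptions listed in the summary matter.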
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/44950
        Collections
        • Theses