A Data-Driven Decision Model for Machine Learning Model Selection

Steffens, Lex

Publication date

2024

Summary

Context: Machine learning models are readily accessible and widely used because of their practical utility in predictive modeling tasks. Although individual models perform consistently, selecting the appropriate model for a specific applied machine learning problem remains a significant challenge for research modelers. Features such as model trainability and stakeholder comprehensibility must be considered when applying these models, and they can critically influence the long-term viability of a machine learning model.

Method: To address this challenge, we present a meta-model for the decision-making process in machine learning model selection. The decision model was created through a systematic research approach that combines a systematic literature review, expert interviews, case studies, and design science to investigate machine learning model selection approaches. The systematic literature review enables us to gather and analyze relevant information from existing literature. The expert interviews let us critically examine the collected data. The case studies help us assess the practical applicability of our findings. Design science is used to finalize the decision model.

Results: Our study analyzed 43 common models across 72 common features. We provide a comprehensive taxonomy of machine learning paradigms, approaches, and domains. We also provide insights into potential model combinations, trends in model selection, evaluation measures, and frequently used datasets for training and evaluating these models. The collected data were incorporated into a decision model, which was further developed using feedback from the expert interviews. Finally, the decision model was evaluated in practice through eight case studies.

Contribution: Our study presents a data-driven decision model that could aid research modelers in machine learning model selection. We highlight the importance of further developing the decision model to improve its accuracy and scope beyond its current state.

URI

https://studenttheses.uu.nl/handle/20.500.12932/47702

Collections

Theses