Choosing between machine learning and logistic regression: criteria and considerations for model selection in diagnostic prediction studies
Summary
This article examines how researchers choose between two types of models, logistic regression (LR) and machine learning (ML), in studies that aim to diagnose diseases. It focuses on the criteria and considerations that guide model selection, and on the explanations researchers give for how their models performed.
Studies published between 2022 and 2024 were examined to gain insight into how researchers make these choices. To be included, a study had to develop a diagnostic model and compare LR with ML algorithms.
Of an initial pool of 78 studies, 39 met the inclusion criteria. These studies were categorized by the type of disease they focused on, using the International Classification of Diseases 11th revision for standardization. The analysis showed that ML models generally outperformed LR models in predictive accuracy, with a mean area under the curve (AUC) of 0.87 compared with 0.82 for LR models. Support vector machine was the top-performing ML algorithm in 13 studies (33.3%).
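For readers unfamiliar with the AUC, it can be read as the probability that a model gives a randomly chosen diseased patient a higher score than a randomly chosen healthy one. The sketch below illustrates that definition only. The function name `auc` and the toy labels and scores are hypothetical, chosen for illustration, and are not data from any of the reviewed studies.

```python
def auc(labels, scores):
    """AUC as the share of (diseased, healthy) pairs ranked correctly.

    labels: 1 for diseased, 0 for healthy. Tied scores count as half a win.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one diseased and one healthy case")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical toy example: six patients scored by two made-up models.
labels = [1, 1, 1, 0, 0, 0]
lr_scores = [0.9, 0.6, 0.4, 0.5, 0.3, 0.2]  # one diseased case ranked below a healthy one
ml_scores = [0.9, 0.7, 0.6, 0.5, 0.3, 0.2]  # every diseased case ranked above every healthy one

print(f"LR AUC: {auc(labels, lr_scores):.2f}")
print(f"ML AUC: {auc(labels, ml_scores):.2f}")
```

An AUC of 0.5 corresponds to ranking no better than chance, and 1.0 to perfect separation of diseased and healthy cases.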
Researchers often chose ML models because they can handle more complex and high-dimensional data and can identify patterns that simpler models such as LR might miss. This capability is particularly important in medical fields where data complexity can be very high, such as imaging, genomics, and the analysis of electronic health records. LR was chosen because it is simpler to interpret. In their discussion sections, researchers mainly explained why ML and LR performed the way they did, and these explanations resembled the selection reasons given in the methodology.
Still, the reporting of why researchers selected ML or LR was inadequate, as were the explanations offered in the discussion. More rigorous reporting guidelines are needed.