
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Krempl, G. M.
dc.contributor.author: Fellegi, K.
dc.date.accessioned: 2019-01-28T18:00:28Z
dc.date.available: 2019-01-28T18:00:28Z
dc.date.issued: 2018
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/31758
dc.description.abstract: This research is a comparative study of feature selection methods for biomarker discovery. Ten machine learning techniques were considered for feature selection. The main assumption behind the research was that certain biomarkers reflect the perceived strenuousness of different exercise levels. Perceived exercise intensity was measured with the Borg scale. Taking the 10 most expressive biomarkers selected by each model yielded 39 distinct biomarkers out of the total of 64. The most frequently selected biomarker was "factord", chosen by 7 models. "trp" and "CORT" were each selected by 6 models, and "ifabp", "LEUCO" and "BICARB" by 5 models. In general, the predictive power of the applied machine learning techniques does not vary much. The highest accuracy, 78%, was achieved by logistic regression. In terms of the area under the ROC curve, the best result was achieved by the full logistic regression model, with an AUC of 0.72. Applying feature selection, however, gives better performance than the models with all predictors: recursive feature elimination on the random forest model yielded 81% accuracy, and the Lasso on logistic regression yielded an even higher 84% accuracy. All in all, considering the criteria for selecting candidate models, logistic regression represents a balanced mix of model performance and interpretability.
dc.description.sponsorship: Utrecht University
dc.format.extent: 1403302
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.title: Feature selection for biomarker discovery
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Feature selection, Dimension reduction, Classification, Lasso, Biomarker discovery, Bioinformatics, Exercise Physiology
dc.subject.courseuu: Business Informatics

