Decoding Neurodegeneration: Leveraging Machine Learning Approaches to Classify Single Cells and Identify Transcriptomic Features in Alzheimer’s Disease and ALS

Ballieux, Rutger

dc.rights.license	CC-BY-NC-ND
dc.contributor.advisor	Kenna, Kevin
dc.contributor.author	Ballieux, Rutger
dc.date.accessioned	2025-02-01T00:01:57Z
dc.date.available	2025-02-01T00:01:57Z
dc.date.issued	2025
dc.identifier.uri	https://studenttheses.uu.nl/handle/20.500.12932/48441
dc.description.abstract	Background: Neurodegenerative diseases like Alzheimer's disease (AD) and Amyotrophic Lateral Sclerosis (ALS) pose significant public health challenges due to their complex aetiologies involving genetic, environmental, and biological factors. Single-cell RNA sequencing (scRNA-seq) enables detailed analysis of cellular heterogeneity in these diseases. However, the high dimensionality and sparsity of scRNA-seq data complicate the classification of diseased versus healthy cells, necessitating systematic evaluation of machine learning models and feature engineering strategies. Methods: We analysed scRNA-seq datasets from the dorsolateral prefrontal cortex of AD patients and controls, and from the primary motor cortex of C9orf72-associated ALS (C9ALS) patients, sporadic ALS (SALS) patients, and shared controls. Logistic Regression and Random Forest classifiers were trained to distinguish diseased from healthy cells using three feature extraction strategies: random feature selection, dimensionality reduction (Most Variable Features and Principal Component Analysis), and a biologically focussed approach combining Differential Expression (DE) analysis and Weighted Gene Co-expression Network Analysis (WGCNA). Models were evaluated with five-fold cross-validation. Results: Classification performance was high across all datasets. The biologically focussed method achieved the highest performance, with Logistic Regression attaining peak test AUCs of 0.980 in SALS and 0.976 in C9ALS. Dimensionality reduction was also effective, particularly in AD, where fewer significant features limited the biologically focussed method. The classifiers identified 104 genes shared among AD, C9ALS, and SALS that are implicated in neurodegeneration. Pathway enrichment analysis of these genes highlighted associations with neurodegenerative pathways, mitochondrial dysfunction, and synaptic processes. Machine learning classifiers identified additional critical genes and pathways beyond those detected by DE analysis alone. Conclusions: Accurate classification of single-cell transcriptomic data in AD and ALS is feasible, with performance depending strongly on the feature extraction method. Biologically focussed approaches and dimensionality reduction techniques enhanced classifier accuracy and identified key transcriptomic features distinguishing diseased and healthy cells. These findings deepen our understanding of molecular mechanisms in neurodegenerative diseases and may inform the development of novel diagnostic and therapeutic strategies. Further validation and functional studies are needed to translate these insights into clinical applications.
dc.description.sponsorship	Utrecht University
dc.language.iso	EN
dc.subject	This study applies machine learning to classify single cells from Alzheimer’s and ALS patients using scRNA-seq data. Logistic Regression and Random Forest models, with feature extraction methods like DE analysis and WGCNA, achieved high test AUCs (~0.98). A biologically focussed approach identified 104 shared neurodegeneration-related genes and key pathways. These results enhance understanding of AD and ALS mechanisms and may aid in developing diagnostics and therapies.
dc.title	Decoding Neurodegeneration: Leveraging Machine Learning Approaches to Classify Single Cells and Identify Transcriptomic Features in Alzheimer’s Disease and ALS
dc.type.content	Master Thesis
dc.rights.accessrights	Open Access
dc.subject.courseuu	Bioinformatics and Biocomplexity
dc.thesis.id	42596
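
Illustrative sketch (not part of the record): the Methods in the abstract describe selecting the most variable features, training a Logistic Regression classifier, and scoring it by AUC under five-fold cross-validation. The following is a minimal sketch of that evaluation loop, not the thesis's actual pipeline. It assumes a small synthetic matrix in place of real scRNA-seq data, uses only the Python standard library, and every function name and parameter value (`make_cells`, `top_variable_features`, the learning rate, the number of genes) is hypothetical. In practice a library such as scikit-learn would typically be used instead.

```python
import math
import random


def make_cells(n_cells=200, n_genes=30, n_informative=5, seed=0):
    """Synthetic stand-in for a normalised expression matrix (NOT real data)."""
    rng = random.Random(seed)
    X, y = [], []
    for i in range(n_cells):
        label = i % 2  # 1 = "diseased" cell, 0 = "control" cell (assumed labels)
        row = [rng.gauss(0.0, 1.0) for _ in range(n_genes)]
        for g in range(n_informative):
            row[g] += 1.5 * label  # shift a few genes in diseased cells
        X.append(row)
        y.append(label)
    return X, y


def top_variable_features(X, k):
    """Return the indices of the k columns with the highest variance."""
    n = len(X)
    scored = []
    for j in range(len(X[0])):
        col = [row[j] for row in X]
        mean = sum(col) / n
        scored.append((sum((v - mean) ** 2 for v in col) / n, j))
    return sorted(j for _, j in sorted(scored, reverse=True)[:k])


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def fit_logistic(X, y, lr=0.1, epochs=200):
    """Unregularised logistic regression fitted by batch gradient descent."""
    w, b, n = [0.0] * len(X[0]), 0.0, len(X)
    for _ in range(epochs):
        grad_w, grad_b = [0.0] * len(w), 0.0
        for row, target in zip(X, y):
            err = sigmoid(sum(wi * xi for wi, xi in zip(w, row)) + b) - target
            for j, xi in enumerate(row):
                grad_w[j] += err * xi
            grad_b += err
        w = [wi - lr * g / n for wi, g in zip(w, grad_w)]
        b -= lr * grad_b / n
    return w, b


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outranks a negative (ties count half)."""
    pos = [s for s, t in zip(scores, y_true) if t == 1]
    neg = [s for s, t in zip(scores, y_true) if t == 0]
    wins = sum((p > q) + 0.5 * (p == q) for p in pos for q in neg)
    return wins / (len(pos) * len(neg))


def stratified_folds(y, k=5, seed=0):
    """Split sample indices into k folds, keeping class proportions similar."""
    rng = random.Random(seed)
    folds = [[] for _ in range(k)]
    for label in sorted(set(y)):
        idx = [i for i, t in enumerate(y) if t == label]
        rng.shuffle(idx)
        for position, i in enumerate(idx):
            folds[position % k].append(i)
    return folds


def cross_validated_auc(X, y, k=5, n_features=10):
    """Five-fold CV; features are selected on the training folds only."""
    aucs = []
    for test_idx in stratified_folds(y, k):
        held_out = set(test_idx)
        train_idx = [i for i in range(len(y)) if i not in held_out]
        keep = top_variable_features([X[i] for i in train_idx], n_features)
        X_train = [[X[i][j] for j in keep] for i in train_idx]
        X_test = [[X[i][j] for j in keep] for i in test_idx]
        w, b = fit_logistic(X_train, [y[i] for i in train_idx])
        scores = [sum(wi * xi for wi, xi in zip(w, row)) + b for row in X_test]
        aucs.append(roc_auc([y[i] for i in test_idx], scores))
    return aucs


X, y = make_cells()
fold_aucs = cross_validated_auc(X, y)
mean_auc = sum(fold_aucs) / len(fold_aucs)
print("per-fold AUC:", [round(a, 3) for a in fold_aucs])
print("mean AUC:", round(mean_auc, 3))
```

Selecting features inside each training fold, rather than once on the full matrix, keeps the held-out cells out of every modelling decision. The AUC values printed here reflect only the synthetic data and say nothing about the results reported in the abstract.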

Files in this item

Name: Thesis V3.docx
Size: 11.65Mb
Format: Microsoft Word 2007

This item appears in the following Collection(s)

Theses
