dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Matthieu Brinkhuis, Verónica Burriel
dc.contributor.author: Groot, M.L. de
dc.date.accessioned: 2020-02-20T19:04:04Z
dc.date.available: 2020-02-20T19:04:04Z
dc.date.issued: 2019
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/34903
dc.description.abstract: Amyotrophic Lateral Sclerosis (ALS) is a lethal neurodegenerative disease that causes death 3-5 years after diagnosis. A cure has not yet been developed. To develop a treatment, researchers need more knowledge of the genetic architecture of the disease. So far, variants in the DNA have been identified as a cause of ALS. To make better use of the opportunities that genetics offers, DNA data of many patients and healthy controls have been gathered. An initiative that addresses this challenge is Project MinE, which aims to bring researchers, patients and other stakeholders together worldwide. It has created a database with many DNA profiles that can be used for ALS research. In the last decade, it became clear that the focus must be on the whole DNA sequence instead of only on protein-coding genes. The non-coding part has a regulatory function, which means that it has a major influence on the activity of protein-coding genes. In addition, not only variants that are common in a certain population must be studied, but also variants that are rare but have a damaging effect. Machine learning is a technique that can help make sense of these topics. This Business Informatics thesis compares the two machine learning frameworks “CADD” and “ExPecto” on their ability to predict gene expression effects from variants in regulatory DNA sequences. It is shown that the tools do not perform well on validation data from the GTEx and MPRA projects. Furthermore, the tools do not give any significant predictions for MinE data when variants of patients and controls are compared. However, it is shown that the ExPecto framework, which was introduced in 2018, outperforms the state-of-the-art technique CADD in the validation phase.
dc.description.sponsorship: Utrecht University
dc.language.iso: en
dc.title: Predicting the effects of genetic variants in ALS patients
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Machine Learning, DNA, genes, genetics, algorithm, ALS
dc.subject.courseuu: Business Informatics