dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Externe beoordelaar - External assessor
dc.contributor.author: Huijer, Cyriel
dc.date.accessioned: 2022-01-01T00:00:31Z
dc.date.available: 2022-01-01T00:00:31Z
dc.date.issued: 2022
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/339
dc.description.abstract: In a research hospital such as the Princess Máxima Centre (PMC), research and patient treatment are often based on next-generation sequencing (NGS) data. Quality control of patient data is therefore vital to preserve data integrity. However, several steps of the process from patient to genotype are vulnerable to sample swaps. NGSCheckMate was developed to address this: it is a tool that retrospectively checks whether samples are labelled correctly, based on a set of 21K SNPs. Nevertheless, running NGSCheckMate with the original 21K SNP set was found to be computationally inefficient in the PMC, with runtimes for patient samples adding up to ~68 hours. Moreover, data from the PMC biobank sequencing pipeline was found to be incompatible with NGSCheckMate, as RNA-Seq could not be integrated with W[GX]S, even though samples were obtained from the same biomaterial. By selecting SNPs based on variety in minor allele and coverage across RNA-Seq samples, smaller SNP sets were created that maintained or improved performance compared to the original 21K set. Total runtime of NGSCheckMate decreased from ~68 to ~2 hours. Furthermore, in combination with pre-processing and additional filtering of low-quality files, RNA-Seq integration was improved. In conclusion, this study presents a range of smaller SNP sets that both decrease runtime and improve performance of NGSCheckMate in sample swap detection.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: In this thesis, limitations of NGSCheckMate in the PMC are discussed and improvements are suggested.
dc.title: Genotype-matching NGS analyses in the Princess Máxima Centre: Future Proof?
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Bioinformatics; Next Generation Sequencing; Genotype Matching
dc.subject.courseuu: Drug Innovation
dc.thesis.id: 1284

