Genotype-matching NGS analyses in the Princess Máxima Centre: Future Proof?

Huijer, Cyriel

dc.rights.license	CC-BY-NC-ND
dc.contributor.advisor	External assessor
dc.contributor.author	Huijer, Cyriel
dc.date.accessioned	2022-01-01T00:00:31Z
dc.date.available	2022-01-01T00:00:31Z
dc.date.issued	2022
dc.identifier.uri	https://studenttheses.uu.nl/handle/20.500.12932/339
dc.description.abstract	In a research hospital such as the Princess Máxima Centre (PMC), research and patient treatment are often based on NGS data. Quality control of patient data is therefore vital to preserve data integrity. However, several steps in the process from patient to genotype are vulnerable to sample swaps. NGSCheckMate was presented to address this: a tool that retrospectively checks, based on a set of 21K SNPs, whether samples are labelled correctly. Nevertheless, running NGSCheckMate with the original 21K SNP set was found to be computationally inefficient in the PMC, with runtimes across patient samples adding up to ~68 hours. Moreover, data from the PMC biobank sequencing pipeline proved incompatible with NGSCheckMate: RNA-Seq could not be integrated with W[GX]S, even though the samples were obtained from the same biomaterial. By selecting SNPs based on variety in minor allele and coverage across RNA-Seq samples, smaller SNP sets were created that maintained and improved performance compared to the original 21K set. The total runtime of NGSCheckMate decreased from ~68 to ~2 hours. Furthermore, in combination with pre-processing and additional filtering of low-quality files, RNA-Seq integration was improved. In conclusion, this study presents a range of smaller SNP sets that both decrease the runtime and improve the performance of NGSCheckMate in sample swap detection.
dc.description.sponsorship	Utrecht University
dc.language.iso	EN
dc.subject	This thesis discusses the limitations of NGSCheckMate in the PMC and suggests improvements.
dc.title	Genotype-matching NGS analyses in the Princess Máxima Centre: Future Proof?
dc.type.content	Master Thesis
dc.rights.accessrights	Open Access
dc.subject.keywords	Bioinformatics; Next Generation Sequencing; Genotype Matching
dc.subject.courseuu	Drug Innovation
dc.thesis.id	1284

Files in this item

Name: Thesis_CAMHuijer_PMCKemmeren.pdf
Size: 4.449Mb
Format: PDF


This item appears in the following Collection(s)

Theses
