        •   Utrecht University Student Theses Repository Home

        A framework for de-novo mutations discovery in Next Generation Sequencing data

        MScThessisReportMirceaCretuStancu.pdf (2.474Mb)
        Publication date
        2014
        Author
        Cretu Stancu, M.
        Summary
        We address the problem of accurately and efficiently identifying de-novo mutations in the human germline. More precisely, how can we detect de-novo point mutations on the sex chromosome in a robust yet sensible manner? What challenges arise from the quality of the data available for this chromosome? How does the pattern of de-novo events on this chromosome compare with that of the rest of the genome? Devising a discovery method for such events is difficult because they are rare relative to the error rates of the underlying DNA sequencing technology. We discuss the relevance of this research in light of our increasing understanding of evolution and of the structure and function of the genetic code, as well as its practical application in finding genetic disease risk factors. We present the field’s most widely used analysis methods and technologies, and describe each step that influences the design or performance of the model we implement. We present a straightforward yet efficient general model for de-novo mutation discovery and then show how the model must be adapted to correctly capture the particularities of the sex chromosome. Furthermore, we illustrate what information our model can explain and where domain knowledge is still needed to correct the output. Finally, we show how the model is integrated into the complex, modular analysis pipeline used in the community.
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/18344
        Collections
        • Theses