
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: de Ligt, J.
dc.contributor.advisor: Cuppen, E.
dc.contributor.advisor: Wessels, L.
dc.contributor.author: Weide, R.H.W.E. van der
dc.date.accessioned: 2015-07-16T17:00:38Z
dc.date.available: 2015-07-16T17:00:38Z
dc.date.issued: 2015
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/20375
dc.description.abstract: To date, studies on non-coding regions of the genome, specifically in cancer, have been limited. This is mainly due to the complex nature of putative functional elements in these regions. In parallel with the ENCODE project, interest in these regions has increased: researchers are beginning to study causal non-coding variations in cancer [1, 2]. Due to the increasing popularity and cost-effectiveness of various omics approaches, more and more data is becoming available. The complexity of integrating and analysing information from these approaches increases with every added omics layer or dimension (e.g. time series, treatments). When studying the effects of structural variants in non-coding regions in cancer, this complexity is further increased by cancer-specific factors (e.g. heterogeneous samples, rapid evolution) and by structural variant-specific factors (the multiple types of structural variants and their different consequences). The current methods for integrating and analysing these layers and dimensions have two significant limitations in their design: scalability and generality (i.e. the possibility to add more levels or dimensions). Moreover, there is no option to get an overview of a dataset without filtering, dividing or restructuring the data. The integration of complex datasets is needed to better understand the complex biology of cancer [3], but is restricted by these limitations. Enter the Semantic Web and its Resource Description Framework (RDF), a simple and flexible framework for describing anything about anything. Since every type of data can be translated to this universal language, integration of large datasets of different levels and dimensions becomes far more feasible. Once researchers have converted their local data to RDF, they can easily connect and combine it with public repositories, which makes analyses even more powerful.
With the SPARQL Protocol and RDF Query Language (SPARQL), data in RDF can be retrieved and manipulated through queries that are easily readable by both humans and computers. The user can subsequently visualise the SPARQL results as a whole or filter them further. Here, we propose the use of semantic web technologies and visual analytics to decrease the complexity of integrating and visualising multi-level and multi-dimensional biological data. These methods will enable further elucidation of the complex biology of, for example, cancer. Firstly, we will create the framework needed to design the missing tools for converting the most-used NGS formats to RDF. Next, visualisations (based on visual analytics) of the biological RDF data will be created, which will be used to perform previously impossible integration-focussed analyses of the consequences of structural variation in the non-coding regions of cancer genomes.
dc.description.sponsorship: Utrecht University
dc.format.extent: 2188901
dc.format.mimetype: application/pdf
dc.language.iso: en
dc.title: Everything should be linked: linking and visualising data for dynamic multidimensional biological data interpretation.
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: structural variation, multi-level data integration, next-generation sequencing, cancer, visual analytics
dc.subject.courseuu: Cancer, Stem Cells and Developmental Biology

