
dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Tamarit Chulia, Daniel
dc.contributor.author: Sanchez Marin, Miguel
dc.date.accessioned: 2025-03-06T00:01:26Z
dc.date.available: 2025-03-06T00:01:26Z
dc.date.issued: 2025
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/48616
dc.description.abstract: The advent of next-generation sequencing led to a boom in the number of available protein sequences. This opened a gap between sequence data on the one hand and structural and functional data on the other. In recent years, deep learning algorithms such as AlphaFold2 have managed to predict protein structures from sequence with an accuracy similar to that of experimental structures, bridging the gap with structural data. Protein structure alignment tools have also become much faster with Foldseek, enabling large-scale comparisons. In proteins, structure is more conserved than sequence. Because of this, large-scale comparisons have opened the door to structure-based detection of distant homology. The efficiency of protein structure predictors permits the generation of structures on a large scale, which, after structure-based homology detection, can be used to infer annotations. This methodology has been applied at scales up to the whole of UniProtKB, yielding promising evolutionary insights and perspectives. The fruitful combination of different methods and the use of large datasets highlight the potential of protein structure-based tools, creating a whole new approach to the computational study of evolution.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: This review focuses on the recent advances that have made large-scale annotation from protein structures possible. It also reviews the latest applications of these methods, together with their current limitations and prospects, to show the relevance of this approach for the research community.
dc.title: Large-scale protein structure prediction methods for enhanced annotation
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: protein structure; structure prediction; structure alignment; annotation; large-scale; deep learning; AlphaFold2; Foldseek
dc.subject.courseuu: Bioinformatics and Biocomplexity
dc.thesis.id: 43996

