Show simple item record

dc.rights.license: CC-BY-NC-ND
dc.contributor.advisor: Jonge, Ronnie de
dc.contributor.author: Janssen, Laurens
dc.date.accessioned: 2024-11-07T01:03:28Z
dc.date.available: 2024-11-07T01:03:28Z
dc.date.issued: 2024
dc.identifier.uri: https://studenttheses.uu.nl/handle/20.500.12932/48097
dc.description.abstract: In and around animals, plants and the soil live trillions of microscopic organisms from thousands of different species. These bacteria, viruses and other microbes can have a substantial impact on the health of their host. The microbial population in the human gut is heavily shaped by the local environment, and a disturbance in the regular balance of species and activity in the gut microbiome is often a sign that something is going wrong in the body. The microbiome has been linked to various diseases, and studying it could offer options for diagnosing and treating those illnesses. Sequencing the DNA of the microorganisms in, for example, a stool sample makes it possible to discover which species are present and which proteins are being produced. The data gathered from microbiome DNA sequencing are highly complex and difficult to interpret. To extract useful information from these data, studies commonly use advanced computer programs that learn from patterns in the data. Such systems, which learn to recognize certain patterns from their input, are called machine learning methods. Machine learning has been used for quite some time to study microbiome data, especially to make predictions about the host. Simpler machine learning techniques such as random forest and support vector machines (SVM) have been able to predict the state of the host with a decent level of accuracy. Random forest successfully distinguishes samples from sick and healthy patients about 80 percent of the time, depending on the disease. There is still room to improve on these results by transforming the data before feeding them into the random forest model, or by using other machine learning methods instead, such as deep learning models like neural networks. The term deep learning describes a machine learning model with “hidden” layers.
Deep learning models are often neural network models, which were developed by researchers inspired by the functioning of neurons in the brain. These models have been used in science for the last fifteen years, and some have also been used to improve the analysis of microbiome data. Other deep learning models, such as those used for image recognition or text generation, have not yet been used much in this field, but they have great potential for microbiome research. Getting deep learning methods to work comes with several difficulties: they require a lot of data, and it is hard to understand how they arrive at their predictions. Although they have made very good predictions and proved useful in other fields, deep learning techniques have so far seen only limited success with microbiome data. Other fields in biology, as well as image recognition, have used more advanced deep learning models with tremendous success, and applying these newer models to microbiome research could be just as successful.
dc.description.sponsorship: Utrecht University
dc.language.iso: EN
dc.subject: Advancements in the use of deep learning within single-cell sequencing analysis and DNA region function prediction have made methods like VAE, GAN and LLMs widespread in neighboring fields. Taking inspiration from these developments and adapting existing frameworks for use with data about the microbiome could improve upon the effectiveness of currently used machine learning methods in microbiome research.
dc.title: The application of deep learning in microbiome research
dc.type.content: Master Thesis
dc.rights.accessrights: Open Access
dc.subject.keywords: Machine Learning; Deep Learning; LLM; Large-Language Model; Microbiome; Host-Phenotype Prediction
dc.subject.courseuu: Bioinformatics and Biocomplexity
dc.thesis.id: 40838


