        •   Utrecht University Student Theses Repository Home

        PlasmidEC: An ensemble of classifiers that improves plasmidome recall from short-read sequencing data in Escherichia coli

        Final_Report_Lisa_Vader.pdf (2.564Mb)
        Publication date
        2022
        Author
        Vader, Lisa
        Summary
        Over the past decades, pathogenic lineages of Escherichia coli have rapidly acquired antibiotic resistance. Currently, multidrug-resistant E. coli is the most frequent cause of lethal infections among resistant bacteria in hospital settings. Antibiotic resistance genes (ARGs) are commonly spread via plasmids, so analysing the plasmid content of E. coli is clinically and epidemiologically relevant. The rise of Illumina whole genome sequencing (WGS) has enabled fast, large-scale analysis of the genomic content of bacteria. However, it is usually not possible to reconstruct plasmids by genome assembly of short-read sequencing data. Therefore, several bioinformatic tools have been developed to uncover the total plasmid content in a sample, also referred to as the plasmidome, by classifying genomic sequences as either chromosome- or plasmid-derived. We benchmarked four of these binary classifiers (mlplasmids, PlaScope, Platon and RFPlasmid). They form the basis of plasmidEC, an ensemble classifier that combines the output of three plasmid classifiers using a majority voting system. The combination of Platon, PlaScope and RFPlasmid gave the best plasmidome predictions (F1-score = 0.904). Compared to individual classifiers, plasmidEC achieved increased recall (0.885), especially for contigs derived from ARG-plasmids (recall = 0.941). Moreover, we used plasmidEC in a plasmidome study of E. coli ST131 to identify differences between this lineage and other E. coli. Finally, we show that plasmidEC removes chromosomal contamination in plasmid reconstructions obtained by MOB-suite.
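
The majority voting described in the summary can be illustrated with a minimal Python sketch. This is not plasmidEC's actual code: the function names, the label strings, the toy contig calls and the per-contig (rather than length-weighted) scoring are all assumptions made for illustration, and the real tools produce their own output formats that would first need parsing.

```python
from typing import Dict, Tuple

PLASMID = "plasmid"
CHROMOSOME = "chromosome"


def majority_vote(calls_per_tool: Dict[str, Dict[str, str]]) -> Dict[str, str]:
    """Combine per-contig labels from an odd number of binary classifiers."""
    tools = list(calls_per_tool)
    if len(tools) % 2 == 0:
        raise ValueError("use an odd number of classifiers to avoid ties")
    # Only vote on contigs that every tool has classified.
    contigs = set.intersection(*(set(c) for c in calls_per_tool.values()))
    combined = {}
    for contig in sorted(contigs):
        plasmid_votes = sum(calls_per_tool[t][contig] == PLASMID for t in tools)
        combined[contig] = PLASMID if plasmid_votes > len(tools) / 2 else CHROMOSOME
    return combined


def recall_and_f1(predicted: Dict[str, str], truth: Dict[str, str]) -> Tuple[float, float]:
    """Per-contig recall and F1-score, treating 'plasmid' as the positive class."""
    tp = sum(1 for c, t in truth.items() if t == PLASMID and predicted.get(c) == PLASMID)
    fn = sum(1 for c, t in truth.items() if t == PLASMID and predicted.get(c) != PLASMID)
    fp = sum(1 for c, t in truth.items() if t == CHROMOSOME and predicted.get(c) == PLASMID)
    recall = tp / (tp + fn) if tp + fn else 0.0
    precision = tp / (tp + fp) if tp + fp else 0.0
    f1 = 2 * precision * recall / (precision + recall) if precision + recall else 0.0
    return recall, f1


# Toy, made-up calls for four hypothetical contigs (not real tool output).
calls = {
    "platon": {"c1": PLASMID, "c2": CHROMOSOME, "c3": PLASMID, "c4": CHROMOSOME},
    "plascope": {"c1": PLASMID, "c2": CHROMOSOME, "c3": CHROMOSOME, "c4": CHROMOSOME},
    "rfplasmid": {"c1": CHROMOSOME, "c2": PLASMID, "c3": PLASMID, "c4": CHROMOSOME},
}
truth = {"c1": PLASMID, "c2": PLASMID, "c3": PLASMID, "c4": CHROMOSOME}

combined = majority_vote(calls)
recall, f1 = recall_and_f1(combined, truth)
```

With three classifiers, a contig is labelled plasmid-derived when at least two tools agree, so a contig missed by one tool can still be recovered by the other two. That is one plausible reason an ensemble can raise recall over its individual members.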
        URI
        https://studenttheses.uu.nl/handle/20.500.12932/41637
        Collections
        • Theses

        Related items

        Showing items related by title, author, creator and subject.

        • Random Forests for Plasmid Detection - An Exercise in Model Building and Evaluation 

          Gilliquet, Ethel (2022)
          Plasmids are bacterial genetic elements that are replicated and transferred independently from the chromosome. Because of their independent mechanisms of replication and transfer, the study of plasmids is of special interest ...
        • Determining plasmids from short read sequences 

          Hein, Y.W.R. (2018)
          In this thesis we present a novel method to determine the DNA-sequence of plasmids from a De Bruijn Graph. The method uses coverage data as well as estimates, based on the program mlplasmids, how likely a certain contig ...
        • gplasCC: a robust plasmid reconstruction tool using short-read data from any bacterial species 

          Jordan, Oscar (2024)
          Plasmids play a pivotal role in the spread of antibiotic resistance genes (ARGs). Accurately reconstructing plasmids often requires long-read sequencing, which is more expensive and still more error-prone than short-read ...