dc.rights.license | CC-BY-NC-ND | |
dc.contributor.advisor | Willems, R.J.L. | |
dc.contributor.author | Vader, Lisa | |
dc.date.accessioned | 2022-06-15T00:00:43Z | |
dc.date.available | 2022-06-15T00:00:43Z | |
dc.date.issued | 2022 | |
dc.identifier.uri | https://studenttheses.uu.nl/handle/20.500.12932/41637 | |
dc.description.abstract | Over the past decades, pathogenic lineages of Escherichia coli have rapidly acquired antibiotic
resistance. Currently, multidrug-resistant E. coli is the most frequent cause of lethal infections
among resistant bacteria in a hospital setting. Antibiotic resistance genes (ARGs) are commonly
spread via plasmids. From a clinical and epidemiological standpoint, it is therefore important to analyse the
plasmid content in E. coli. The rise of Illumina whole-genome sequencing (WGS) has enabled fast
large-scale analysis of the genomic content of bacteria. However, it is usually not possible to
reconstruct plasmids by genome assembly of short-read sequencing data. Therefore, several
bioinformatic tools have been developed to uncover the total plasmid content in a sample, also
referred to as the plasmidome, by classifying genomic sequences as either chromosome- or
plasmid-derived. We benchmarked four of these binary classifiers (mlplasmids, PlaScope, Platon and
RFPlasmid). They form the basis of plasmidEC, an ensemble classifier that combines the output of
three plasmid classifiers using a majority voting system. The combination of
Platon/PlaScope/RFPlasmid gave the best plasmidome predictions (F1-score = 0.904).
Compared to individual classifiers, plasmidEC achieved increased recall (0.885), especially for contigs
derived from ARG-plasmids (recall = 0.941). Moreover, we applied plasmidEC in a plasmidome study of
E. coli ST131 to identify differences between this lineage and other E. coli. Finally, we show
that plasmidEC removes chromosomal contamination in plasmid reconstructions obtained by
MOB-suite. | |
dc.description.sponsorship | Utrecht University | |
dc.language.iso | EN | |
dc.subject | Antibiotic resistance genes (ARGs) are frequently carried on plasmids. We compared the performance of four bioinformatic tools that predict plasmid sequences in E. coli. We developed plasmidEC, an ensemble classifier that combines the predictions of three of these tools. PlasmidEC achieves improved recall, especially for plasmids that carry ARGs. We also show two applications of this tool: plasmidome analysis of E. coli ST131 and removal of contamination in reconstructed plasmids. | |
dc.title | PlasmidEC: An ensemble of classifiers that improves plasmidome recall from
short-read sequencing data in Escherichia coli | |
dc.type.content | Master Thesis | |
dc.rights.accessrights | Open Access | |
dc.subject.keywords | plasmids;antibiotic resistance;E. coli;plasmidome;bioinformatics | |
dc.subject.courseuu | Bioinformatics and Biocomplexity | |
dc.thesis.id | 4468 | |
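
The majority voting described in the abstract can be illustrated with a minimal sketch. This is not the plasmidEC implementation. The function names, the label strings, the contig identifiers and the per-tool calls below are all hypothetical, chosen only to show how three binary predictions per contig could be combined, and how recall might be computed for the plasmid class.

```python
from collections import Counter

# Hypothetical label strings; the real tools each use their own output format.
LABELS = ("plasmid", "chromosome")


def majority_vote(votes):
    """Return the label chosen by most classifiers for a single contig."""
    if len(votes) % 2 == 0:
        raise ValueError("an odd number of votes is needed to avoid ties")
    if any(v not in LABELS for v in votes):
        raise ValueError("votes must be 'plasmid' or 'chromosome'")
    counts = Counter(votes)
    return "plasmid" if counts["plasmid"] > counts["chromosome"] else "chromosome"


def classify_contigs(per_tool):
    """Combine per-tool predictions ({tool: {contig: label}}) into one call per contig."""
    tools = list(per_tool)
    contigs = per_tool[tools[0]]
    return {c: majority_vote([per_tool[t][c] for t in tools]) for c in contigs}


def plasmid_recall(truth, predicted):
    """Fraction of true plasmid contigs that were predicted as plasmid."""
    positives = [c for c, label in truth.items() if label == "plasmid"]
    if not positives:
        return 0.0
    hits = sum(1 for c in positives if predicted[c] == "plasmid")
    return hits / len(positives)


# Made-up example data: three tools, three contigs (illustrative only).
example = {
    "Platon":    {"contig_1": "plasmid", "contig_2": "chromosome", "contig_3": "chromosome"},
    "PlaScope":  {"contig_1": "plasmid", "contig_2": "chromosome", "contig_3": "plasmid"},
    "RFPlasmid": {"contig_1": "chromosome", "contig_2": "chromosome", "contig_3": "plasmid"},
}
ensemble = classify_contigs(example)
```

With an odd number of binary classifiers a tie cannot occur, which is one reason a three-tool combination is convenient for this kind of voting scheme.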