Poster

  • P-MDE-001

BakRep – A searchable web repository for bacterial genomes and standardized characterizations

Contribution in

Poster Session 1

Poster topics

Contributors

Linda Fenske (Giessen / DE), Lukas Jelonek (Giessen / DE), Alexander Goesmann (Giessen / DE), Oliver Schwengers (Giessen / DE)

Abstract

The abundance of bacterial genomic data in public genome databases is crucial for research in various fields. However, most of these data have been processed with differing methods, making accurate comparisons challenging. Blackwell et al. used a uniform approach to assemble and characterize 661,405 bacterial genomes retrieved from the European Nucleotide Archive in November 2018. First, this revealed a highly uneven taxonomic composition, with just 20 of the 2,336 species making up 90% of the genomes. Second, genomes of 311,006 isolates that had not been assembled before were added. Blackwell et al. published this data resource with the intention that it serve as a comprehensive basis for further analysis.

Based on this resource, we further analyzed the assembled genomes in a standardized way. Taxonomic classification was performed using the Genome Taxonomy Database, and all eligible genomes were further typed via multilocus sequence typing. In addition, we annotated all genomes, assigning functional categories and cross-references to public databases. Here, we present a searchable web repository for bacterial genomes that makes this resource accessible to scientists through an interactive website. The repository provides a flexible search engine for querying the data and selecting specific subsets of genomes based on various features, with customized searches that integrate taxonomic, genomic, and metadata information.
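To illustrate how a search combining taxonomic, genomic, and metadata criteria might be expressed, the following minimal sketch assembles an Elasticsearch-style bool query as a plain Python dictionary. The function name, its parameters, and all field names (taxonomy.genus, assembly.genome_size, assembly.contigs, metadata.country) are hypothetical and chosen for illustration only; they are not taken from the actual BakRep schema or API.

```python
from typing import Optional


def build_genome_query(
    genus: Optional[str] = None,
    min_genome_size: Optional[int] = None,
    max_contigs: Optional[int] = None,
    country: Optional[str] = None,
) -> dict:
    """Combine optional filters into one Elasticsearch-style bool query.

    All field names below are hypothetical placeholders, not the real
    BakRep index mapping.
    """
    filters = []
    if genus is not None:
        # Taxonomic criterion: exact match on a keyword field.
        filters.append({"term": {"taxonomy.genus": genus}})
    if min_genome_size is not None:
        # Genomic criterion: lower bound on assembly size in base pairs.
        filters.append({"range": {"assembly.genome_size": {"gte": min_genome_size}}})
    if max_contigs is not None:
        # Genomic criterion: upper bound on the number of contigs.
        filters.append({"range": {"assembly.contigs": {"lte": max_contigs}}})
    if country is not None:
        # Metadata criterion: exact match on a sample attribute.
        filters.append({"term": {"metadata.country": country}})
    # Every clause in "filter" must match for a genome to be returned.
    return {"query": {"bool": {"filter": filters}}}


example = build_genome_query(genus="Escherichia", max_contigs=100, country="DE")
```

Each criterion is optional, so the same helper covers a narrow taxonomic search as well as a combined search across all three kinds of information.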

To handle this large amount of data and to accommodate future data influx, the BakRep web repository is built on a scalable and reliable backend comprising a REST API server, an Elasticsearch cluster, and S3 cloud storage. This setup is deployed in a Kubernetes cluster hosted within the de.NBI cloud computing infrastructure.
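One common way to divide work between a search index and object storage is to keep small, searchable records in the index and large result files in the object store, linked by a key. The sketch below models that split with in-memory dictionaries. It is an assumption about how such a backend could be organized, not a description of the BakRep implementation; the class, method names, and key layout are all hypothetical.

```python
from dataclasses import dataclass, field


@dataclass
class InMemoryBackend:
    """Toy stand-in for a search index plus an object store (hypothetical)."""

    # Stand-in for the search index: genome ID -> small searchable record.
    index: dict = field(default_factory=dict)
    # Stand-in for object storage: object key -> raw result bytes.
    objects: dict = field(default_factory=dict)

    def add_genome(self, genome_id: str, species: str, annotation: bytes) -> None:
        # Store the large result once, then index only a pointer to it.
        key = f"results/{genome_id}/annotation.json"  # hypothetical key layout
        self.objects[key] = annotation
        self.index[genome_id] = {"species": species, "annotation_key": key}

    def search(self, species: str) -> list:
        # Search touches only the small index records, never the result files.
        return sorted(
            gid for gid, rec in self.index.items() if rec["species"] == species
        )

    def fetch_annotation(self, genome_id: str) -> bytes:
        # Resolve the pointer from the index, then read from the object store.
        return self.objects[self.index[genome_id]["annotation_key"]]


backend = InMemoryBackend()
backend.add_genome("G1", "Escherichia coli", b'{"genes": 4300}')
backend.add_genome("G2", "Salmonella enterica", b'{"genes": 4500}')
hits = backend.search("Escherichia coli")
```

In this arrangement an API layer would answer search requests from the index alone and read from storage only when a specific result file is requested.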

In total, 751 GB of genome assemblies were imported and processed, resulting in 6.15 TB of generated results. Of the 661,405 input assemblies, 640,090 were successfully characterized.
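The figures above imply a few derived quantities, worked out in the short calculation below. The variable names are illustrative, and the size comparison assumes decimal units (1 TB = 1000 GB), which the abstract does not specify.

```python
# Figures as stated in the abstract.
INPUT_ASSEMBLIES = 661_405
CHARACTERIZED = 640_090
INPUT_GB = 751
OUTPUT_TB = 6.15

# Assemblies that did not yield a characterization.
not_characterized = INPUT_ASSEMBLIES - CHARACTERIZED

# Share of input assemblies that were characterized, in percent.
characterized_pct = 100 * CHARACTERIZED / INPUT_ASSEMBLIES

# Assumption: decimal units (1 TB = 1000 GB); the abstract does not say.
expansion_factor = OUTPUT_TB * 1000 / INPUT_GB

print(f"{not_characterized:,} assemblies not characterized")
print(f"{characterized_pct:.2f}% characterized")
print(f"results are about {expansion_factor:.1f}x the input size")
```

Under these assumptions, 21,315 assemblies (about 3.22%) were not characterized, 96.78% were, and the generated results are roughly 8.2 times the size of the input.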

The BakRep project provides a comprehensive and standardized characterization of one of the world's largest collections of bacterial genomes. We envision it as a high-quality open resource for microbial researchers worldwide.
