
Poster presentation
P-II-0733

Proteogenomics identifies conserved and lineage-specific novel small proteins in clinical Mycobacterium tuberculosis reference strains

Schedule

Date: Tue, 22 Oct

Time: 13:00 – 13:00

Speaking time: 0 min

Discussion time: 0 min

Location / Stream:

Infectious Biology Insights

Session

Infectious Biology Insights

Topic

Infectious Biology Insights

Contributors

Benjamin Heiniger (Zurich / CH), Christian Schori (Zurich / CH), Mohammad Arefian (Belfast / GB), Amir Banaei-Esfahani (Zurich / CH), Martin Schuler (Zurich / CH), Sonia Borell (Basel / CH), Daniela Brites (Basel / CH), Ruedi Aebersold (Zurich / CH), Sébastien Gagneux (Basel / CH), Ben Collins (Belfast / GB), Christian Ahrens (Zurich / CH)

Abstract

Objectives: Accurate and comprehensive prediction of all protein-coding genes remains an unresolved problem. Genes encoding small proteins are often missed in genome annotations, even in well-studied bacterial model organisms such as Escherichia coli [1] and Pseudomonas aeruginosa [2]. Yet these small proteins can carry out many important functions [3]. Using proteogenomics, we aim to identify protein-coding genes missed by conventional genome annotation methods in six Mycobacterium tuberculosis clinical reference strains from lineages 1 and 2. Tuberculosis, caused by M. tuberculosis, is among the top bacterial infectious diseases worldwide, and the modern lineage 2 is especially virulent.

Methods: We de novo assembled complete genomes of six clinical reference strains from long-read PacBio data. By hierarchically integrating reference annotations, ab initio gene predictions and a modified six-frame translation that considers alternative start codons, we created large but minimally redundant integrated proteogenomics search databases (iPtgxDBs) [4], in which ~95% of peptides unambiguously identify one protein [5]. Total cell extracts were analyzed by Parallel Accumulation-Serial Fragmentation (PASEF) mass spectrometry (MS). After strict false discovery rate (FDR) filtering and validation, we prioritized novel small proteins based on functional predictions and conservation.
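As a rough illustration of the six-frame translation step, the following Python sketch enumerates open reading frames on both strands and emits one candidate protein per in-frame start codon. It is not the iPtgxDB implementation: the start codon set (ATG, GTG, TTG), the minimum length cutoff and the names `six_frame_orfs` and `reverse_complement` are assumptions chosen for illustration only.

```python
from itertools import product

# Standard genetic code, with codons enumerated in TCAG order.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}

# Assumption: ATG plus the common bacterial alternative starts GTG and TTG.
START_CODONS = ("ATG", "GTG", "TTG")


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def six_frame_orfs(genome, starts=START_CODONS, min_aa=10):
    """Yield (strand, frame, start, protein) for each start-to-stop ORF.

    Every in-frame start codon upstream of a stop yields its own candidate,
    so alternative start sites of the same ORF are all represented.
    `start` is a 0-based offset on the strand being read. ORFs that run off
    the end of the sequence without a stop codon are not reported.
    """
    genome = genome.upper()
    for strand, seq in (("+", genome), ("-", reverse_complement(genome))):
        for frame in range(3):
            open_starts = []  # start codons seen since the last stop
            for i in range(frame, len(seq) - 2, 3):
                codon = seq[i:i + 3]
                if CODON_TABLE.get(codon) == "*":
                    for s in open_starts:
                        # Unknown codons (e.g. containing N) become "X".
                        body = "".join(
                            CODON_TABLE.get(seq[j:j + 3], "X")
                            for j in range(s + 3, i, 3)
                        )
                        # The initiator codon is translated as methionine.
                        protein = "M" + body
                        if len(protein) >= min_aa:
                            yield strand, frame, s, protein
                    open_starts = []
                elif codon in starts:
                    open_starts.append(i)
```

For the toy sequence `ATGAAAGTGCCCTAA` with `min_aa=2`, this yields two candidates on the forward strand, `MKVP` (from the ATG) and `MP` (from the internal GTG). A real search database would additionally have to collapse such candidates against the reference annotations and ab initio predictions to stay minimally redundant.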

Results: The assembly of complete genomes allowed us to overcome drawbacks of fragmented, short-read-based assemblies, which can miss even essential genes [2]. Comparative genomics identified a large core genome (94% of genes), reflecting the lack of horizontal gene transfer in Mycobacterium. Proteogenomics allowed us to identify 35 novel proteins, which were enriched for proteins shorter than 100 amino acids and had a significantly higher average isoelectric point (pI) than annotated proteins. Interestingly, the novel candidates included strain- and lineage-specific proteins and three candidates with a predicted function in toxin-antitoxin systems.

Conclusion: We successfully applied our proteogenomics framework for prokaryotes, which is available as a public web server (https://iptgxdb.expasy.org) [4], to six clinical reference strains of a major human pathogen [7]. By leveraging state-of-the-art tandem MS, we were able to identify novel small proteins with potential roles in pathogenicity without the need for subcellular fractionation.

References

[1] Storz G et al. EcoSal Plus 2020, 9:10.

[2] Varadarajan AR et al. NPJ Biofilms & Microbiomes 2020, 6:46.

[3] Storz G et al. Annu Rev Biochem 2014, 83:753-777.

[4] Omasits U et al. Genome Res 2017, 27:2083-2095.

[5] Qeli E & Ahrens CH. Nat Biotechnol 2010, 28:647-650.

[7] Heiniger B et al. (in preparation).