  • Oral presentation
  • OP-66

A comparison of peptide- and spectrum-centric search engines beyond bar charts

Schedule

Date:
Time:
Speaking time:
Discussion time:
Location / Stream:
Plenary hall

Session

AI and Bioinformatics Approaches

Topic

  • New Technology: AI and Bioinformatics in Mass Spectrometry

Contributors

Michelle T. Berger (Garching / DE), Johanna Tüshaus (Freising / DE), Alexander Hogrebe (Garching / DE), Florian Seefried (Garching / DE), Bernhard Küster (Freising / DE; Garching / DE), Daniel Zolg (Garching / DE), Mathias Wilhelm (Garching / DE; Freising / DE), Martin Frejno (Garching / DE)

Abstract

Background: Data-independent acquisition (DIA) offers reproducibility and proteome coverage but yields complex spectra. Various algorithms have been developed for peptide identification from DIA data. Peptide-centric engines rely on fragment ion elution curves, while spectrum-centric methods analyze each spectrum individually. All of these algorithms aim to separate true from false identifications (IDs) using diverse scores, with q-values used to control the false discovery rate (FDR). We compare DIA-NN, Spectronaut, and Chimerys, examining their unique statistics and scores.
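As a rough illustration of how q-values relate to FDR, the following is a minimal sketch of the generic target-decoy approach. It is an assumption made for illustration only: the abstract does not state how DIA-NN, Spectronaut, or Chimerys compute their q-values, and the function name `q_values` and the toy inputs are hypothetical.

```python
def q_values(scores, is_decoy):
    """Generic target-decoy q-values (illustrative, not any tool's method).

    scores   -- list of match scores, higher means more confident
    is_decoy -- list of booleans, True for decoy matches
    Returns q-values in the same order as the input. Ties are not
    treated specially in this sketch.
    """
    # Rank matches from best to worst score.
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)

    # Estimated FDR at each rank: decoys so far / targets so far.
    fdr = []
    decoys = targets = 0
    for i in order:
        if is_decoy[i]:
            decoys += 1
        else:
            targets += 1
        fdr.append(decoys / max(targets, 1))

    # q-value: the smallest FDR at this rank or any more permissive rank.
    q = [0.0] * len(scores)
    running_min = float("inf")
    for rank in range(len(order) - 1, -1, -1):
        running_min = min(running_min, fdr[rank])
        q[order[rank]] = running_min
    return q


# Toy example with made-up scores.
example_q = q_values([10, 9, 8, 7, 6], [False, False, True, False, True])
```

Filtering at "1% FDR" then means keeping the matches whose q-value is at most 0.01.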

Methods: Publicly available mixed-species and single-cell Orbitrap Astral DIA raw files were downloaded from PRIDE. Datasets were searched library-free with DIA-NN 1.8.1, Spectronaut 18.5, and Chimerys 2.0 in Proteome Discoverer 3.1 against normal or entrapment databases. Chimerys deconvolutes spectra, distributing shared fragment ion intensities among co-isolated precursors. Peptides are quantified by aggregating contributions over retention time, with mokapot controlling FDR. Peptide and protein FDR are calculated by Proteome Discoverer 3.1. Empirical FDR measurements were conducted via entrapment experiments.

Results: Search engine comparisons typically focus on ID counts, yet FDR control and reporting levels differ among DIA-NN, Spectronaut, and Chimerys. Filtering is done at the peptide group level (Chimerys) or at the precursor level (DIA-NN and Spectronaut), which affects the number of reported IDs. We investigated the effect of a mismatch between the level at which FDR is estimated and the level at which IDs are counted. We observe that the number of peptide groups artificially increases by ~10% if FDR is controlled at the precursor level but IDs are counted at the peptide group level. Applying 1% precursor FDR to all search engines reveals that Chimerys detects the same number of precursors as Spectronaut, while DIA-NN seemingly identifies 17% more.
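The level mismatch described above can be made concrete with a small sketch. The rows, field names (`precursor_q`, `peptide_group_q`), and q-values below are made up for illustration and do not come from the study; the helper `count_peptide_groups` is hypothetical.

```python
def count_peptide_groups(rows, q_field, threshold=0.01):
    """Count distinct peptide groups among rows passing a q-value filter."""
    return len({r["peptide_group"] for r in rows if r[q_field] <= threshold})


# Made-up precursor-level results: one row per precursor (peptide + charge).
rows = [
    {"peptide_group": "PEPTIDEK", "charge": 2,
     "precursor_q": 0.001, "peptide_group_q": 0.002},
    {"peptide_group": "PEPTIDEK", "charge": 3,
     "precursor_q": 0.004, "peptide_group_q": 0.002},
    {"peptide_group": "SAMPLER", "charge": 2,
     "precursor_q": 0.008, "peptide_group_q": 0.015},
    {"peptide_group": "ELVISK", "charge": 2,
     "precursor_q": 0.030, "peptide_group_q": 0.040},
]

# Mismatched: filter on precursor-level q-values, count peptide groups.
mismatched = count_peptide_groups(rows, "precursor_q")

# Matched: filter and count at the peptide group level.
matched = count_peptide_groups(rows, "peptide_group_q")

print(mismatched, matched)
```

In this toy table the mismatched count is 2 and the matched count is 1, because "SAMPLER" passes the precursor-level filter but not the peptide-group-level one. A plausible explanation for such differences, stated here as an assumption, is that true peptides tend to be seen in several charge states while false matches rarely are, so collapsing precursors into peptide groups raises the proportion of false IDs.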

The accuracy of FDR estimates significantly impacts the number of IDs. We empirically assessed FDR control using entrapment experiments. Chimerys and Spectronaut retain their precursor counts, while DIA-NN loses ~16% of IDs at 1% empirical FDR. Accurate FDR control equalizes the performance of all search engines in identifying precursors on this dataset. However, identifying spurious IDs from DIA-NN necessitates entrapment analyses.
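A minimal sketch of how an empirical FDR could be estimated from entrapment hits follows. The abstract does not specify which estimator was used; the "combined" estimate below, N_e (1 + 1/r) / (N_t + N_e) with r the entrapment-to-target database size ratio, is one commonly used choice and is an assumption here. All function names and the toy data are hypothetical.

```python
def entrapment_fdp(n_target, n_entrapment, r=1.0):
    """Estimate the false discovery proportion from entrapment hits.

    n_target     -- accepted IDs from the original (target) database
    n_entrapment -- accepted IDs from the entrapment database
    r            -- entrapment-to-target database size ratio (assumed known)
    """
    total = n_target + n_entrapment
    if total == 0:
        return 0.0
    return n_entrapment * (1.0 + 1.0 / r) / total


def ids_at_empirical_fdr(hits, max_fdp=0.01, r=1.0):
    """Number of target IDs kept at the most permissive cutoff whose
    estimated FDP stays at or below max_fdp.

    hits -- list of (reported_q_value, is_entrapment) tuples
    """
    n_t = n_e = best = 0
    for _, is_entrapment in sorted(hits, key=lambda h: h[0]):
        if is_entrapment:
            n_e += 1
        else:
            n_t += 1
        if entrapment_fdp(n_t, n_e, r) <= max_fdp:
            best = n_t
    return best


# Made-up hits: 100 confident targets, one entrapment hit, 50 more targets.
toy_hits = (
    [(i * 1e-5, False) for i in range(100)]
    + [(0.002, True)]
    + [(0.003 + i * 1e-5, False) for i in range(50)]
)
kept = ids_at_empirical_fdr(toy_hits, max_fdp=0.01, r=1.0)
```

In this toy case only the first 100 target IDs survive, even though all 150 carry a reported q-value below 0.01. Comparing the number of IDs at the reported 1% FDR with the number retained at 1% empirical FDR gives the kind of loss figure quoted above.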

Subsequently, we investigated IDs unique to each specific tool. Unique IDs from each tool often rely on few data points or fragments, and measures of fragment ion interference can inflate ID numbers. In conclusion, close inspection of search engine results is vital for accurate peptide identification and quantification from DIA data.

Conclusion: We demonstrate the importance of scrutinizing search engine scores and FDR for accurate peptide identification and quantification from DIA data.
