  • Poster presentation
  • P-II-0499

Benchmarking prediction models for improved peptide identification with MSBooster and Koina

Session

Location / Stream:
New Technology: AI and Bioinformatics in Mass Spectrometry

Topic

  • New Technology: AI and Bioinformatics in Mass Spectrometry

Contributors

Kevin Li Yang (Ann Arbor, MI / US), Ludwig Lautenbacher (Freising / DE), Fengchao Yu (Ann Arbor, MI / US), Mathias Wilhelm (Freising / DE), Alexey Nesvizhskii (Ann Arbor, MI / US)

Abstract

Peptide-spectrum match (PSM) rescoring in bottom-up proteomics combines multiple characteristics of each PSM to better separate true positive target PSMs from decoy and false positive target PSMs. Multiple models (e.g., DIA-NN, Prosit, AlphaPeptDeep, MS2PIP/DeepLC) have been trained to predict MS/MS spectra and retention time (RT), and they have been leveraged to improve PSM rescoring by comparing predicted to experimental values, leading to increased peptide identifications. These prediction models have been difficult to benchmark because they are usually run within their individual analysis pipelines, so the number of peptide identifications is influenced by factors external to the prediction model. To enable fair comparisons of the models while keeping all other factors constant, we have integrated the Koina prediction web server into MSBooster as part of the FragPipe computational pipeline, giving us access to a growing list of models designed by the proteomics community. We provide an easy way to choose and mix-and-match models. Furthermore, we benchmark the models on various data types, namely phosphoproteomics, immunopeptidomics, DIA data from the Thermo Scientific Astral mass spectrometer, and TMT11 data. We find that DIA-NN and the Koina models each have certain data types they perform best on, with Koina models improving peptide identifications by as much as 6% over the FragPipe default DIA-NN. To decrease the burden of finding the optimal models for one's dataset among a growing list of available Koina models, we propose a heuristic best-model search that predicts which MS2 and RT models will perform best on the data at hand. Our approach selects RT models that on average identify 90% of the peptides found by the empirically best RT model; for MS2 models, the heuristic's performance rises to 98%.
When designing the best-model search algorithm, we realized that a prediction model achieving the highest median MS/MS similarity with experimental spectra or the lowest median RT difference does not always yield the most peptide identifications. Koina is integrated into FragPipe, allowing users to visualize spectra from Koina models in FragPipe-PDV mirror plots. Various quality control plots are also made available. MSBooster code is freely available at https://github.com/Nesvilab/MSBooster.
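
As an illustration of the predicted-versus-experimental comparison described above, the following is a minimal Python sketch of two per-PSM rescoring features and their per-model medians. It is not MSBooster's implementation. The abstract does not name the similarity metric, so cosine similarity is an assumption here, and the function names and dictionary keys (`pred_intensities`, `exp_intensities`, `pred_rt`, `exp_rt`) are hypothetical.

```python
import math
from statistics import median


def cosine_similarity(predicted, experimental):
    """Cosine similarity between two aligned fragment-intensity vectors.

    Assumption: both vectors list intensities for the same fragment ions
    in the same order. The metric itself is an illustrative choice.
    """
    dot = sum(p * e for p, e in zip(predicted, experimental))
    norm_p = math.sqrt(sum(p * p for p in predicted))
    norm_e = math.sqrt(sum(e * e for e in experimental))
    if norm_p == 0 or norm_e == 0:
        return 0.0  # no signal on one side, so treat as no similarity
    return dot / (norm_p * norm_e)


def psm_features(psm):
    """Build two rescoring features for one PSM (hypothetical dict layout)."""
    return {
        "ms2_similarity": cosine_similarity(
            psm["pred_intensities"], psm["exp_intensities"]
        ),
        "rt_delta": abs(psm["pred_rt"] - psm["exp_rt"]),
    }


def summarize_model(psms):
    """Median MS/MS similarity and median RT difference over a list of PSMs."""
    features = [psm_features(p) for p in psms]
    return {
        "median_ms2_similarity": median(f["ms2_similarity"] for f in features),
        "median_rt_delta": median(f["rt_delta"] for f in features),
    }
```

Summaries like these describe how closely a model's predictions track the data, but as noted above, the model with the best medians is not always the one that yields the most peptide identifications after rescoring.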
