  • Poster presentation
  • P-II-0460

Altimeter and Pioneer: fast and open-source software for analyzing narrow-window data-independent acquisition proteomics experiments

Schedule

Date:
Time:
Speaking time:
Discussion time:
Location / Stream:
New Technology: AI and Bioinformatics in Mass Spectrometry

Topic

  • New Technology: AI and Bioinformatics in Mass Spectrometry

Contributors

Nathan Wamsley (Saint Louis, MO / US), Ben Major (Saint Louis, MO / US), Dennis Goldfarb (Saint Louis, MO / US)

Abstract

We present Altimeter and Pioneer, cross-platform and open-source tools to predict peptide fragmentation patterns (Altimeter) and to identify and quantify proteins and peptides from data-independent acquisition (DIA) experiments, given either in silico or empirically generated spectral libraries (Pioneer). DIA experiments deliberately co-isolate and co-fragment multiple peptides during each MS2 scan. Given a spectral library of predicted fragment ion intensities and precursor retention times, Pioneer quantifies the relative contributions of likely candidate precursor ions to each MS2 spectrum. In our benchmarks, Pioneer quantified 17% more precursor ions than DIA-NN 1.8.1 on an Astral three-proteome experiment, in less than half the computation time and with more conservative FDR control. Pioneer is written in the Julia programming language and is entirely open-source.

Altimeter is a deep learning model utilizing the UniSpec architecture (Lapin et al. 2024), trained to predict a fragment's total abundance rather than its monoisotopic abundance. Pioneer searches an MSFragger-style fragment ion intensity index to identify all precursors for which the most abundant library ion and at least two of the top three ranked library ions match the spectrum. Subsequent expensive calculations are restricted to these candidates. Pioneer then computes expected isotope distributions for fragments from the spectral library, based on the given isolation window, and regresses them onto each empirical spectrum. It does so by fitting a linear model that minimizes the pseudo-Huber loss under a non-negativity constraint. Pioneer then quantifies precursors using these regression coefficients. Pioneer scores precursors using several features and spectral similarity scores, most notably spectral entropy (Li et al. 2021), Scribe score (Searle, Shannon, and Wilburn 2023), and Manhattan distance.
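The regression step can be illustrated with a minimal sketch. It is not Pioneer's implementation (Pioneer is written in Julia, and its solver is not described here); it assumes a small dense design matrix whose columns are hypothetical library fragment templates, and it swaps in plain projected gradient descent as the optimizer. The names `fit_nonneg_pseudo_huber`, `templates`, and `delta` are illustrative assumptions.

```python
import math


def pseudo_huber(residual, delta):
    """Pseudo-Huber loss: quadratic near zero, roughly linear for large residuals."""
    return delta ** 2 * (math.sqrt(1.0 + (residual / delta) ** 2) - 1.0)


def fit_nonneg_pseudo_huber(templates, observed, delta=1.0, n_iter=2000):
    """Fit observed ~ templates @ w with w >= 0 by projected gradient descent.

    templates: list of rows, one per fragment m/z bin; each row holds the
               expected intensity of every candidate precursor in that bin.
    observed:  list of empirical intensities, one per bin.
    This solver is an assumption chosen for brevity, not the authors' method.
    """
    n_rows, n_cols = len(templates), len(templates[0])
    # The loss gradient is Lipschitz with constant at most ||A||_F^2,
    # so 1 / ||A||_F^2 is a safe fixed step size.
    step = 1.0 / sum(a * a for row in templates for a in row)
    w = [0.0] * n_cols
    for _ in range(n_iter):
        residuals = [
            observed[i] - sum(templates[i][j] * w[j] for j in range(n_cols))
            for i in range(n_rows)
        ]
        # Derivative of the pseudo-Huber loss with respect to each residual.
        psi = [r / math.sqrt(1.0 + (r / delta) ** 2) for r in residuals]
        grad = [
            -sum(templates[i][j] * psi[i] for i in range(n_rows))
            for j in range(n_cols)
        ]
        # Gradient step followed by projection onto the non-negative orthant.
        w = [max(0.0, w[j] - step * grad[j]) for j in range(n_cols)]
    return w


# Two hypothetical precursors sharing one fragment bin (row 2).
templates = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0], [0.2, 0.0]]
observed = [3.0, 2.5, 2.0, 0.6]  # exactly 3 * precursor 0 + 2 * precursor 1
weights = fit_nonneg_pseudo_huber(templates, observed)
```

In this toy case the fitted coefficients recover the mixing proportions of the two co-fragmented precursors, which is the quantity the abstract describes as being used for quantification. The pseudo-Huber loss limits the influence of bins with large residuals, such as interfering peaks, compared with ordinary least squares.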

We compared Pioneer to DIA-NN (default settings) with respect to data processing time and the number of quantifiable identifications, using a publicly available three-proteome benchmark data set from Guzman et al. 2024. With Pioneer, the data were searched against an Altimeter-generated spectral library of the tryptic H. sapiens, S. cerevisiae, and E. coli proteomes; with DIA-NN, they were searched against an internally generated library. Excluding library generation and formatting, and including searching, quantitation, and scoring, data processing time for the six raw files was 19.5 min for Pioneer and 52.5 min for DIA-NN. Benchmarking was carried out on a desktop computer (Windows 10, 3.00 GHz 18-core Intel i9-10980XE CPU, 64 GB 2666 MHz DDR4) using 24 threads. For the comparison, we retained identifications at 1% FDR or better. Precursors identified in all 3 of 3 samples in both conditions with a coefficient of variation of less than 20% were considered quantifiable. By these definitions, Pioneer quantified 112,750 precursors and DIA-NN quantified 96,405 precursors.
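The quantifiability criterion above can be sketched as a small filter. This is an illustration under stated assumptions, not the authors' analysis code: it assumes the coefficient of variation is computed per condition from the sample standard deviation, that a missing identification is represented as `None`, and the function and parameter names are hypothetical.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed CV definition)."""
    m = mean(values)
    return stdev(values) / m if m > 0 else float("inf")


def is_quantifiable(abundances_by_condition, n_replicates=3, max_cv=0.20):
    """Return True if a precursor passes the criterion in every condition.

    abundances_by_condition: dict mapping a condition label to a list of
    abundances, one per sample, with None where the precursor was not
    identified at the FDR threshold.
    """
    for values in abundances_by_condition.values():
        found = [v for v in values if v is not None]
        # Must be identified in all replicates of this condition.
        if len(found) < n_replicates:
            return False
        # CV must be strictly below the threshold.
        if coefficient_of_variation(found) >= max_cv:
            return False
    return True


# Hypothetical precursors measured in two conditions, three samples each.
stable = {"A": [100.0, 105.0, 95.0], "B": [50.0, 52.0, 49.0]}
noisy = {"A": [100.0, 200.0, 50.0], "B": [50.0, 52.0, 49.0]}
missing = {"A": [100.0, None, 95.0], "B": [50.0, 52.0, 49.0]}
results = [is_quantifiable(p) for p in (stable, noisy, missing)]
```

Only the first precursor passes: the second exceeds the 20% threshold in condition A, and the third lacks an identification in one sample.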
