Pinar Altiner (Toulouse / FR), Anne Gonzalez de Peredo (Toulouse / FR), Carine Froment (Toulouse / FR), Veit Schwämmle (Odense / DK), Odile Schiltz (Toulouse / FR), David Bouyssié (Toulouse / FR)
Phosphoproteomic analysis seeks to identify, localize, and quantify phospho-peptides purified from complex protein samples. Despite advances in analytical techniques, determining the exact localization of the phosphorylation site(s) of a given phospho-peptide remains challenging because experimental fragmentation patterns in MS/MS spectra are often ambiguous. Evaluating existing localization algorithms is crucial to assess their relative sensitivity and specificity. Numerous studies have been published to that end, but they focused solely on localization accuracy at the identification level. To our knowledge, the relationship between localization and label-free quantification accuracy has not yet been precisely evaluated.
We therefore performed several experiments designed for a quantitative evaluation across various acquisition methods and bioinformatic tools. We used either a library of 180 synthetic phospho-peptides of known phosphosite localization, or phospho-peptides enriched from mouse T-cells. Both phosphorylated standards were spiked at various concentrations into an E. coli background. These samples were analyzed using different MS instruments (Thermo Exploris, Bruker timsTOF) and acquisition methods (Data Dependent Acquisition (DDA) and Data Independent Acquisition (DIA)), with or without ion mobility separation. Several bioinformatic tools were used for post-processing: Proline, MaxQuant, Proteome Discoverer (PD), DIA-NN, and Spectronaut.
As hypothesized, we show that label-free quantification errors are higher for phospho-peptides present in the sample as several positional isomers than for non-isomeric peptides, in both DDA and DIA experiments analyzed with a given software tool. We thus provide a ground truth phosphoproteomic dataset, which illustrates how positional isomers with different elution times can be erroneously matched and quantified in classical label-free strategies. This dataset could be a useful tool for the future improvement of processing methods in quantitative phosphoproteomics. This benchmarking study also enabled us to assess the relative quantitative performance of existing tools and acquisition methods. The DIA data analyzed with Spectronaut exhibited the best sensitivity/specificity ratio among all tested approaches, while the PD results gave the highest sensitivity, at the expense of a higher number of wrongly quantified phospho-peptides. The spiked T-cell dataset added higher complexity to this benchmarking study and illustrated, for example, the benefit of ion mobility (here the FAIMS module on the Thermo Exploris instrument) for the identification and quantification of phospho-peptides.
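One way to express the kind of comparison described above is to measure, for each spiked-in phospho-peptide, how far the observed fold change between two spike levels lies from the expected one, and then to summarize that error separately for peptides with and without positional isomers. The Python sketch below is only an illustration under stated assumptions: the record layout, the field names (`high`, `low`, `has_isomer`), the choice of the median absolute log2 error as metric, and all intensity values are hypothetical and are not taken from the study or from any of the tools named here.

```python
import math
from statistics import median


def log2_ratio_error(intensity_high, intensity_low, expected_ratio):
    """Absolute gap between the observed and the expected log2 fold change."""
    observed = math.log2(intensity_high / intensity_low)
    return abs(observed - math.log2(expected_ratio))


def median_error_by_group(records, expected_ratio):
    """Median absolute log2 error, split by presence of positional isomers.

    Each record is a dict with hypothetical keys:
      'high' / 'low' : intensities at the two spike levels being compared
      'has_isomer'   : True if another positional isomer is in the sample
    """
    errors = {True: [], False: []}
    for rec in records:
        err = log2_ratio_error(rec["high"], rec["low"], expected_ratio)
        errors[rec["has_isomer"]].append(err)
    # Skip groups that received no peptides.
    return {flag: median(vals) for flag, vals in errors.items() if vals}


# Made-up intensities for a hypothetical 4-fold spike-in difference.
records = [
    {"peptide": "pepA", "has_isomer": False, "high": 400.0, "low": 100.0},
    {"peptide": "pepB", "has_isomer": False, "high": 800.0, "low": 200.0},
    {"peptide": "pepC", "has_isomer": True, "high": 200.0, "low": 100.0},
    {"peptide": "pepD", "has_isomer": True, "high": 100.0, "low": 100.0},
]

result = median_error_by_group(records, expected_ratio=4.0)
print(result)
```

In this toy input the non-isomeric peptides reproduce the expected ratio, while the isomeric ones are compressed toward 1:1, which is the pattern one might expect if signal from a co-present isomer were wrongly matched across runs.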
Finally, the scripts we created to automate the whole comparison of the spiked-in dataset will be integrated into WOMBAT-P, a platform for benchmarking MS-based proteomics pipelines. This should ease further evaluations of the provided datasets by the proteomics community.