Pegah Zahedimaram (Stuttgart / DE), Hannah Heinrich (Stuttgart / DE), Natalie Widmann (Stuttgart / DE), Elke Schaeffeler (Stuttgart / DE), Ute Hofmann (Stuttgart / DE), Thomas Mürdter (Stuttgart / DE), Matthias Schwab (Stuttgart / DE)
Recent advancements in Orbitrap mass spectrometry and AI-based data processing have significantly enhanced the depth and accuracy of proteomic analyses, enabling the identification of more proteins and improved quantification precision. In label-free quantification (LFQ), it is crucial to maximize protein coverage and the reproducibility of peptide abundances while minimizing the coefficient of variation (CV). This study aimed to evaluate different LFQ methods in proteomics for robust and reproducible results when comparing biological samples.
To identify the optimal LFQ method, we compared data-dependent acquisition (DDA) and data-independent acquisition (DIA) techniques on HEK293T total cell lysate, varying the sample preparation protocol, LC gradient, and data processing workflow. Measurements were performed on a Vanquish Neo nano-LC system coupled to an Orbitrap Exploris 480 mass spectrometer, and data were processed with Proteome Discoverer (PD) 3.1. We assessed data quality based on the number of proteins and peptides identified, the CV% of measured abundances, and the accuracy of abundance measurements using spiked-in peptide standards.
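The CV% criterion above can be illustrated with a minimal sketch. It assumes per-protein abundances from replicate injections are already available as plain lists; the protein names and values are hypothetical and are not data from this study.

```python
import statistics

def cv_percent(abundances):
    """Coefficient of variation (%) of one protein's abundances across replicates."""
    mean = statistics.mean(abundances)
    if mean == 0:
        raise ValueError("mean abundance is zero; CV is undefined")
    # Sample standard deviation (n - 1 in the denominator) relative to the mean.
    return 100.0 * statistics.stdev(abundances) / mean

def fraction_below(cvs, threshold=20.0):
    """Share of proteins whose CV% lies below the threshold."""
    return sum(cv < threshold for cv in cvs) / len(cvs)

# Hypothetical replicate abundances, for illustration only.
replicates = {
    "protein_A": [100.0, 110.0, 90.0],
    "protein_B": [50.0, 80.0, 20.0],
}
cvs = {name: cv_percent(values) for name, values in replicates.items()}
share_below_20 = fraction_below(list(cvs.values()))
```

Reporting the share of proteins below a 20% CV threshold summarizes reproducibility in a single number, which makes acquisition methods straightforward to compare.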
We tested between 35 and 200 isolation windows (IW), using normal, variable, and staggered patterns. Variable window sizes were designed based on peptide counts in each m/z range from a DDA acquisition of the same sample, either manually or using the EncyclopeDIA wizard. All tested acquisition methods performed similarly in terms of peptide and protein identifications and reproducibility. Interestingly, 200 IW resulted in the highest number of identified proteins (~7,000) at a false discovery rate (FDR) of 1%, with an abundance CV of less than 20% for over 80% of quantified proteins. However, the CV% distribution tended toward higher values, indicating the importance of the number of data points for LFQ.
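One common way to derive variable windows from DDA peptide counts is to place the window boundaries so that each window contains roughly the same number of identified precursors. The sketch below illustrates that idea only; it is not necessarily the procedure used in this study or by EncyclopeDIA. The function name, the 400-1,000 m/z defaults, and the example values are assumptions.

```python
def variable_windows(precursor_mzs, n_windows, mz_min=400.0, mz_max=1000.0):
    """Split [mz_min, mz_max] into n_windows holding similar precursor counts.

    precursor_mzs: precursor m/z values identified in a DDA run (hypothetical input).
    Returns a list of (lower, upper) window boundaries.
    """
    mzs = sorted(mz for mz in precursor_mzs if mz_min <= mz <= mz_max)
    if n_windows < 1 or len(mzs) < n_windows:
        raise ValueError("need at least one precursor per window")
    edges = [mz_min]
    for i in range(1, n_windows):
        idx = i * len(mzs) // n_windows
        # Put the boundary midway between the two neighbouring precursors.
        # Many identical m/z values could produce zero-width windows.
        edges.append((mzs[idx - 1] + mzs[idx]) / 2.0)
    edges.append(mz_max)
    return list(zip(edges[:-1], edges[1:]))

# Hypothetical precursor list: dense at low m/z, sparse at high m/z.
example_mzs = [410.0, 420.0, 430.0, 440.0, 600.0, 700.0, 800.0, 900.0]
windows = variable_windows(example_mzs, n_windows=4)
```

With this scheme, dense m/z regions receive narrow windows and sparse regions receive wide ones, which is the general motivation for a variable pattern.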
To determine the most accurate method, a standard digest was spiked into the sample in varying amounts, and the methods were compared by how closely their reported abundance ratios matched the expected ratios. The DIA method whose windows were designed from peptide counts in each m/z range of a DDA acquisition of the same sample exhibited the highest accuracy across the full range of 400-1,000 m/z. This method used a variable isolation window pattern with isolation widths of 10, 5.5, 10, and 15 Da; it identified 6,000 proteins in total and quantified 5,861, with a CV of less than 20% for 85% of the quantified proteins. Finally, to validate the method, it was applied to identify differentially expressed proteins in two biologically distinct samples.
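The ratio-based accuracy assessment can be sketched as a log2 error between the observed and the expected spike-in ratio. The metric (median absolute log2 error), the function names, and all numbers below are illustrative assumptions, not the evaluation actually used in this study.

```python
import math
import statistics

def log2_ratio_error(abundance_a, abundance_b, expected_ratio):
    """log2 of (observed ratio / expected ratio); 0 means a perfect match."""
    observed_ratio = abundance_a / abundance_b
    return math.log2(observed_ratio / expected_ratio)

def median_abs_error(pairs, expected_ratio):
    """Median absolute log2 error over all spiked-in peptides."""
    errors = [abs(log2_ratio_error(a, b, expected_ratio)) for a, b in pairs]
    return statistics.median(errors)

# Hypothetical abundances of three standard peptides in two spike levels,
# where the expected ratio between the levels is 2:1.
peptide_pairs = [(200.0, 100.0), (400.0, 100.0), (100.0, 100.0)]
accuracy_error = median_abs_error(peptide_pairs, expected_ratio=2.0)
```

Working in log2 space treats over- and underestimation symmetrically, so the method with the smallest error is the one reporting the closest ratio.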
This study underscores the critical role of optimizing DIA parameters, particularly isolation windows, in enhancing the accuracy and reproducibility of LFQ in proteomics. The findings suggest that careful method design, tailored to specific sample characteristics, can significantly improve proteomic analyses, paving the way for more precise and reliable biological insights.