Abstract
Comprehensive characterization of the entirety of major histocompatibility complex (MHC)-bound peptides, collectively referred to as the immunopeptidome, is essential to advance our understanding of immunology and to provide a foundation for designing vaccines effectively and safely. The SysteMHC Atlas v1.0 was the first public repository dedicated to mass spectrometry-based immunopeptidomics. Here, we introduce a new version of the SysteMHC Atlas (https://systemhc.sjtu.edu.cn), with an extensive collection of over 7000 MS measurements and several novel computational tools.
First, we extended and optimized the computational pipeline so that it can identify MHC-bound peptides carrying unexpected post-translational modifications (PTMs). In addition, the SysteMHC Atlas v2.5 introduces several new features, including the inclusion of non-UniProt peptides and the incorporation of several novel computational tools for FDR estimation, binding affinity prediction and motif deconvolution. Moreover, we enhanced the user interface, upgraded the website framework, and provided external links to other related resources. Finally, we built and provided various spectral libraries as community resources for data mining and future immunopeptidomic and proteomic analysis. With all these features, we believe that the atlas can serve as a more useful community resource that provides key insights to the immunology and proteomics communities and will accelerate the development of vaccines and immunotherapies.
Preliminary Data
The current release of the atlas contains >180 million tandem mass spectra obtained from >7000 mass spectrometric raw files. The new computational pipeline includes the following major updates: (i) identification of MHC-bound peptides carrying unexpected PTMs; (ii) accurate FDR estimation by three different methods; and (iii) incorporation of novel computational tools for binding affinity prediction and motif deconvolution. By re-analyzing the whole atlas with this pipeline, we identified a total of >2 million unique peptides at a peptide-level FDR of 1%, which represents a >10-fold increase in comparison with the atlas v1.0. Among them, we identified >450K MHC class I binders and >350K MHC class II binders. These binders can be attributed to 153 MHC class I allotypes and 149 MHC class II allotypes. Regarding PTMs, our analysis identified >450K modified peptides, a >20-fold increase over v1.0. In total, the atlas contains 61 distinct modification types (i.e. 422 distinct amino acid-PTM combinations) found on MHC-bound peptides.
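The abstract does not name the three FDR estimation methods used in the pipeline. As a generic illustration of what a peptide-level FDR cutoff of 1% means, the sketch below applies the widely used target-decoy approach, which is an assumption here and not necessarily one of the atlas's methods. The function name `threshold_at_fdr`, the `(score, is_decoy)` input format and the toy scores are all hypothetical.

```python
from typing import List, Optional, Tuple


def threshold_at_fdr(
    peptides: List[Tuple[float, bool]], max_fdr: float = 0.01
) -> Optional[float]:
    """Return the lowest score cutoff whose estimated FDR is <= max_fdr.

    `peptides` holds (score, is_decoy) pairs, where a higher score means a
    more confident identification. The FDR at a cutoff is estimated as
    (#decoys above cutoff) / (#targets above cutoff). Score ties are not
    treated specially in this simplified sketch.
    """
    ranked = sorted(peptides, key=lambda p: p[0], reverse=True)
    targets = 0
    decoys = 0
    threshold = None
    for score, is_decoy in ranked:
        if is_decoy:
            decoys += 1
        else:
            targets += 1
        # Keep extending the accepted set while the estimate stays in bounds.
        if targets > 0 and decoys / targets <= max_fdr:
            threshold = score
    return threshold


# Toy input: three target peptides and one decoy (illustrative values only).
toy = [(10.0, False), (9.0, False), (8.0, True), (7.0, False)]
strict_cutoff = threshold_at_fdr(toy, max_fdr=0.01)
loose_cutoff = threshold_at_fdr(toy, max_fdr=0.4)
```

With the toy input, the strict 1% cutoff stops just above the first decoy, while the looser 40% cutoff admits the whole list because one decoy among three targets gives an estimate of about 33%.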