Robbe Devreese (Ghent / BE), Alireza Nameni (Ghent / BE), Arthur Declerq (Ghent / BE), Alexander Kensert (Ghent / BE), Ralf Gabriels (Ghent / BE), Sven Degroeve (Ghent / BE), Lennart Martens (Ghent / BE), Robbin Bouwmeester (Ghent / BE)
Instruments capable of measuring the ion mobility of peptides are now often part of standard proteomics workflows. This opens up opportunities to incorporate ion mobility information into the peptide identification pipeline. As a result, there has recently been substantial progress in the development of collisional cross-section (CCS) prediction models. However, most of these recently developed CCS predictors struggle to accurately predict CCS for peptidoforms with modifications. This is especially problematic for increasingly popular open searches, which must deal with extreme peptide identification ambiguity.
Additionally, existing CCS prediction models overlook peptide ions that can adopt multiple conformations in the gas phase, since these models are typically trained on only the most abundant conformer of each peptide ion within a given dataset. This discards valuable information and creates a potential bias towards the most frequently occurring conformers.
Addressing these limitations, we present IM2Deep, a deep learning-based CCS predictor capable of handling both peptides with any modification and peptide ions exhibiting multiple conformations. IM2Deep uses the same principles as DeepLC, a retention time predictor, by encoding peptidoforms at the atomic composition level. This enables accurate CCS prediction for peptides with modifications unseen during training. To handle multiconformational peptide ions, we adapted IM2Deep's architecture to support multi-output predictions and applied transfer learning to refine predictions on a dataset enriched with multiconformational peptide ions. Our results demonstrate that IM2Deep not only predicts CCS for multiple conformations with high accuracy but also improves the baseline performance for single-conformational peptide ions.
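To illustrate the idea of encoding a peptidoform at the atomic composition level, the following is a minimal sketch and not the actual IM2Deep or DeepLC feature layout. The function name `encode_peptidoform`, the element ordering, the small residue and modification tables, and the position-keyed modification dictionary are all assumptions made for illustration. Each residue is represented by its atom counts, and a modification simply shifts those counts, so a modification never seen during training still yields a meaningful feature vector.

```python
# Illustrative only: a simplified per-residue atomic composition encoding.
# The tables below cover just a few residues and modifications (assumed subset).
ELEMENTS = ("C", "H", "N", "O", "S", "P")

# Elemental composition of residues (amino acid minus one water).
RESIDUE_COMPOSITION = {
    "G": {"C": 2, "H": 3, "N": 1, "O": 1},
    "A": {"C": 3, "H": 5, "N": 1, "O": 1},
    "S": {"C": 3, "H": 5, "N": 1, "O": 2},
    "K": {"C": 6, "H": 12, "N": 2, "O": 1},
    "M": {"C": 5, "H": 9, "N": 1, "O": 1, "S": 1},
}

# Elemental composition change introduced by each modification.
MODIFICATION_DELTA = {
    "Oxidation": {"O": 1},
    "Phospho": {"H": 1, "O": 3, "P": 1},
    "Acetyl": {"C": 2, "H": 2, "O": 1},
}


def encode_peptidoform(sequence, modifications=None):
    """Return one atom-count vector per residue, ordered as in ELEMENTS.

    `modifications` maps a 0-based residue index to a modification name
    (a hypothetical input format chosen for this sketch).
    """
    modifications = modifications or {}
    encoded = []
    for index, residue in enumerate(sequence):
        counts = dict(RESIDUE_COMPOSITION[residue])
        if index in modifications:
            # A modification is just a shift in atom counts.
            for element, delta in MODIFICATION_DELTA[modifications[index]].items():
                counts[element] = counts.get(element, 0) + delta
        encoded.append([counts.get(element, 0) for element in ELEMENTS])
    return encoded


# Oxidized methionine differs from plain methionine by one oxygen atom.
matrix = encode_peptidoform("GAM", {2: "Oxidation"})
```

Because the model input consists of atom counts rather than a fixed vocabulary of modification labels, any modification with a known elemental composition can be encoded in the same way.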
Furthermore, we explore the application of graph neural networks (GNNs) in predicting CCS for peptide ions. GNNs leverage the full structural information of molecular graphs, making them particularly suited for CCS prediction, which is closely tied to the ion's structural configuration. By computing saliency maps from these trained networks, we can better understand the significance of specific molecular sub-structures in CCS prediction, enhancing interpretability of our predictions.
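As a rough illustration of how a saliency map can be derived from a graph-based model, the sketch below uses a toy stand-in rather than the networks described here. The functions `predict` and `saliency`, the fixed weights, the scalar node features, and the single round of mean-aggregation message passing are all assumptions. Gradients are approximated with central finite differences to keep the example dependency-free; a real GNN would normally obtain them through automatic differentiation.

```python
import math


def predict(features, neighbors, w_self=0.5, w_nbr=0.3):
    """Toy graph model: one mean-aggregation message-passing step, sum readout."""
    total = 0.0
    for i, x in enumerate(features):
        nbrs = neighbors[i]
        mean_nbr = sum(features[j] for j in nbrs) / len(nbrs) if nbrs else 0.0
        total += math.tanh(w_self * x + w_nbr * mean_nbr)
    return total


def saliency(features, neighbors, eps=1e-5):
    """Absolute sensitivity of the prediction to each node's input feature."""
    scores = []
    for i in range(len(features)):
        up = list(features)
        down = list(features)
        up[i] += eps
        down[i] -= eps
        # Central finite difference approximates d(prediction)/d(feature_i).
        grad = (predict(up, neighbors) - predict(down, neighbors)) / (2 * eps)
        scores.append(abs(grad))
    return scores


# Hypothetical 3-node path graph: 0 - 1 - 2
features = [0.2, -0.1, 0.4]
neighbors = {0: [1], 1: [0, 2], 2: [1]}
scores = saliency(features, neighbors)
```

In this toy graph the central node influences the prediction through its own term and through both neighbors, so it tends to receive a higher saliency score than the end nodes; the same reasoning underlies attributing predicted CCS to molecular sub-structures.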