  • Poster presentation
  • P-II-0455

Phosphorylation site prediction model using structural context

Appointment

Location / Stream:
New Technology: AI and Bioinformatics in Mass Spectrometry

Topic

  • New Technology: AI and Bioinformatics in Mass Spectrometry

Authors

Yujin Choo (Seoul / KR), Eunok Paek (Seoul / KR), Seungjin Na (Cheongju / KR)

Abstract

Post-translational modifications are covalent processing events that occur after protein synthesis and change the properties of a protein. Phosphorylation in particular is essential for cellular function and signaling, so identifying phosphorylation sites is crucial for understanding protein mechanisms. Various machine learning and deep learning models have been proposed to predict phosphorylation sites, and the advent of AlphaFold2, with its high-accuracy protein structure predictions, presents new opportunities. We have developed a prediction model that leverages AlphaFold2's structural predictions, enhancing sequence embeddings with structural information through a Transformer-based cross-attention mechanism. Our phosphorylation site prediction model was benchmarked against MusiteDeep, a sequence-based predictor, using an independent test set with an equal number of positive and negative phosphosite instances (5,074 each) from proteins not included in the training or validation sets. MusiteDeep achieved an area under the ROC curve (AUC) of 0.9118, whereas our model reached 0.9417. Our model also showed higher precision (0.8321 versus MusiteDeep's 0.8042) and higher recall (0.9253 versus MusiteDeep's 0.9034) at the same time. We also explored kinase-specific prediction through transfer learning to assess how effectively structural information is integrated into the embeddings. The results from our transfer learning approach surpassed those from fine-tuning and from DeepPhos, a sequence-based prediction tool, across all kinase-specific datasets.
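
The cross-attention step mentioned in the abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract does not state which input serves as query, the embedding dimensions, the number of heads, or whether a residual connection is used. The sketch assumes that sequence embeddings act as queries, that structure-derived features act as keys and values, that there is a single head with no learned projections, and that the attended context is added back to the sequence embedding. The names `cross_attention`, `seq_emb`, and `struct_emb` are hypothetical.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    peak = max(scores)
    exps = [math.exp(s - peak) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def cross_attention(seq_emb, struct_emb):
    """Single-head scaled dot-product cross-attention (illustrative only).

    seq_emb:    sequence embeddings used as queries, shape [L_q][d]
    struct_emb: structural features used as keys and values, shape [L_k][d]
    Returns (fused, weights), where fused has shape [L_q][d] and
    weights has shape [L_q][L_k].

    Assumption: no learned Q/K/V projections, so both inputs must share
    the same dimension d. A real model would learn those projections.
    """
    d = len(seq_emb[0])
    scale = math.sqrt(d)
    fused, weights = [], []
    for q in seq_emb:
        # Similarity of this query to every structural key.
        scores = [sum(qi * ki for qi, ki in zip(q, k)) / scale for k in struct_emb]
        w = softmax(scores)
        # Weighted sum of the structural value vectors.
        context = [sum(wj * v[i] for wj, v in zip(w, struct_emb)) for i in range(d)]
        # Residual connection (assumed): enrich the sequence embedding.
        fused.append([qi + ci for qi, ci in zip(q, context)])
        weights.append(w)
    return fused, weights


# Toy data: three residues with 2-dimensional embeddings (made-up values).
seq_emb = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
struct_emb = [[0.5, 0.5], [1.0, 0.0], [0.0, 1.0]]
fused, weights = cross_attention(seq_emb, struct_emb)
```

Each row of `weights` sums to 1 and shows how strongly one residue's sequence embedding attends to each structural feature vector. `fused` is the structure-enriched embedding that a downstream classifier could score for phosphorylation.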
