  • Poster presentation
  • P-III-0783

To be or not to be – a non-canonical ORF MS-detectable?

Appointment

Location / Stream:
Data Integration: With Bioinformatics to Biological Knowledge

Topic

  • Data Integration: With Bioinformatics to Biological Knowledge

Authors

Ivo Fierro-Monti (Basel / CH), Eric Deutsch (Seattle, WA / US)

Abstract

We present two neural network models that classify non-canonical open reading frame (ncORF) peptides and microproteins (or micropeptides) as detectable or undetectable by mass spectrometry. For predicted HLA-binding ncORF peptides, a Multilayer Perceptron (MLP) classifier was trained on approximately equal numbers of detected and undetected peptides, using 22 features selected with the Boruta algorithm. The MLP classifier achieved an accuracy of 0.699 and a ROC AUC of 0.694, and the variable of highest importance showed a correlation, suggesting potential for predicting ncORF peptide detectability. For ncORF microproteins or micropeptides, a TensorFlow-Keras model was trained on 7264 instances, 25% of them detectable, using 43 features (36 of which were Boruta-selected), yielding an accuracy of 0.60 and a ROC AUC of 0.676. This performance is only narrowly favourable for classifying ORFs as detectable or undetectable. The ratio of the number of predicted HLA-binding peptides to the corresponding ncORF length in amino acids emerged as the most important attribute for detectability, followed by two attributes of similar importance: the designated orf_biotype4 - lncRNA category and the microprotein ncORF length in amino acids. The model's most important features therefore allow prediction of ORF microprotein/micropeptide detectability to a certain extent. These models provide insights into the discriminatory features that influence ncORF peptide and microprotein detectability, facilitating a deeper understanding of HLA-I peptide and ncORF biology. Future work should focus on refining model performance with additional data, e.g. on peptide abundance.
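
The two reported metrics, accuracy and ROC AUC, and the ratio feature named above can be illustrated with a minimal sketch. It is not the authors' pipeline: the function names (`accuracy`, `roc_auc`, `binders_per_residue`), the 0.5 decision threshold, and the toy labels and scores are all assumptions made up for illustration, and no model is trained here.

```python
from typing import Sequence


def accuracy(labels: Sequence[int], scores: Sequence[float],
             threshold: float = 0.5) -> float:
    """Fraction of items whose thresholded score matches the 0/1 label.

    The 0.5 threshold is an assumption, not a value stated in the abstract.
    """
    predictions = [1 if s >= threshold else 0 for s in scores]
    correct = sum(p == y for p, y in zip(predictions, labels))
    return correct / len(labels)


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """ROC AUC as the probability that a randomly chosen detectable item
    scores higher than a randomly chosen undetectable one (ties count 0.5)."""
    pos = [s for s, y in zip(scores, labels) if y == 1]
    neg = [s for s, y in zip(scores, labels) if y == 0]
    if not pos or not neg:
        raise ValueError("ROC AUC needs at least one item of each class")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def binders_per_residue(n_predicted_binders: int, orf_length_aa: int) -> float:
    """Hypothetical helper: number of predicted HLA-binding peptides
    divided by the ncORF length in amino acids."""
    if orf_length_aa <= 0:
        raise ValueError("ORF length must be positive")
    return n_predicted_binders / orf_length_aa


# Made-up toy data: 2 of 8 items detectable (25%), 1 = detectable.
labels = [1, 0, 0, 0, 1, 0, 0, 0]
scores = [0.8, 0.3, 0.6, 0.2, 0.4, 0.1, 0.5, 0.35]

print("accuracy:", accuracy(labels, scores))
print("ROC AUC:", roc_auc(labels, scores))
print("binders per residue:", binders_per_residue(6, 60))
```

Accuracy depends on the chosen threshold, whereas ROC AUC depends only on how the scores rank detectable items against undetectable ones, which is one reason to report both.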
