Poster

  • P-III-0824

Beyond the reference code: unveiling multicoding genes and their alternative proteins and microproteins

Presented in

Data Integration: With Bioinformatics to Biological Knowledge

Authors

Xavier Roucou (Sherbrooke / CA), Frédérick Comtois (Sherbrooke / CA), Jean-François Jacques (Sherbrooke / CA), Davy Ouedraogo (Sherbrooke / CA), Aïda Ouangraoua (Sherbrooke / CA)

Abstract

In humans, protein-coding genes are usually annotated with a single functional open reading frame (ORF), or coding sequence, typically the longest one. The corresponding transcripts contain the reference ORF or an alternatively spliced variant of it, resulting in expression of the reference protein and its isoforms. However, there is increasing evidence that many genes have more than one ORF and produce transcripts with more than one translated ORF. There is currently no consensus on the terminology for these new proteins, which are variously labeled small ORF-encoded peptides, microproteins, or alternative proteins. In this study, we use the terms "alternative ORFs" and "alternative proteins" (altORFs and altProts), in contrast to reference ORFs and proteins (refORFs and refProts).

The lack of annotation of altORFs and altProts is a significant obstacle to detecting them and studying their functions. To address this issue, several studies and proteogenomic resources combining ribo-seq and/or proteomics have been developed to help annotate altORFs and altProts. We created the OpenProt proteogenomic resource, which identifies multiple ORFs per transcript and annotates all ORFs exceeding 29 codons in the transcriptomes of various species, along with their corresponding proteins. Reanalyzing large-scale ribo-seq and proteomics data with OpenProt retrieves evidence of expression at the translatome level (altORFs) and the proteome level (altProts), respectively, and identifies potential multicoding genes.
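The idea of annotating every ORF above a length threshold, rather than only the longest one, can be illustrated with a minimal sketch. This is not OpenProt's implementation: the function name find_orfs, the ATG-only start codon rule, the standard stop codon set, and the choice to count codons without the stop codon are all assumptions made for illustration. Only the threshold (more than 29 codons) comes from the text above.

```python
from typing import List, NamedTuple

STOP_CODONS = {"TAA", "TAG", "TGA"}
MIN_CODONS = 30  # "exceeding 29 codons"; excluding the stop codon is an assumption


class Orf(NamedTuple):
    frame: int   # reading frame on the transcript (0, 1 or 2)
    start: int   # 0-based index of the A of the start codon
    end: int     # index just past the stop codon
    codons: int  # number of codons from start codon up to, not including, the stop


def find_orfs(transcript: str, min_codons: int = MIN_CODONS) -> List[Orf]:
    """Hypothetical helper: list every ATG-to-stop ORF with at least min_codons codons.

    Each of the three forward frames is scanned independently, so ORFs that
    overlap in different frames of the same transcript are all reported.
    """
    seq = transcript.upper().replace("U", "T")
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None:
                if codon == "ATG":
                    start = i  # first ATG after the previous stop in this frame
            elif codon in STOP_CODONS:
                n_codons = (i - start) // 3
                if n_codons >= min_codons:
                    orfs.append(Orf(frame, start, i + 3, n_codons))
                start = None
    return sorted(orfs, key=lambda orf: orf.start)


if __name__ == "__main__":
    # Toy transcript (invented sequence) carrying two ORFs in different frames.
    toy = "ATG" + "GCT" * 30 + "TAA" + "C" + "ATG" + "GGA" * 35 + "TGA"
    for orf in find_orfs(toy):
        print(orf)
```

A conventional annotation would keep only the longest ORF of such a transcript. Here every ORF above the threshold is retained as a candidate, and each candidate would still need ribo-seq or proteomics evidence before being considered expressed.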

The P53-Induced Protein With A Death Domain gene (PIDD1) encodes a 910-amino-acid protein that engages with a variety of interactors and stimulates the DNA-damage response, centrosome surveillance, NF-κB activation, and cell death. Several mutations in PIDD1 are linked to cerebral cortex malformations and intellectual disability, but the underlying molecular pathophysiology remains unclear. We show here that PIDD1 is among the potential multicoding genes annotated by OpenProt, with strong ribo-seq and MS-based proteomics evidence of expression for a 171-amino-acid alternative protein termed altPIDD1. AltPIDD1 and PIDD1 are co-expressed from two overlapping ORFs in the same transcript. Relative quantification of the two proteins indicates that altPIDD1 is the primary product of translation, with a PIDD1:altPIDD1 ratio of 1:40. This previously unknown protein is found in actin-rich cytoskeletal structures and is cleaved by caspase-3 and/or caspase-7 during apoptosis. In contrast to other non-canonical proteins, which emerged in primates and are relatively new, altPIDD1 emerged in placental mammals.

The findings of this study underscore the importance of investigating multicoding genes further and reinforce the need for more detailed and accurate descriptions of the genome's coding potential.
