Stephane Liva (Paris / FR), Matthieu Najm (Paris / FR), Michael Richard (Paris / FR), Patrick Poullet (Paris / FR), Laurence Calzone (Paris / FR), Loredana Martignetti (Paris / FR)
Mass spectrometry-based proteomic approaches have enabled the quantification of thousands of proteins and phospho-sites in a single experiment. The volume and complexity of the data generated by this technology require dedicated computational methods to convert the data into relevant biological knowledge. An important part of proteomics data analysis is the move from a focus on individual proteins to a more integrative approach that considers the entire system. This shift emphasizes the activity of groups of related proteins rather than differences in the abundance of individual proteins.
Here, we present the rROMA software for fast and accurate computation of the activity of sets of features with coordinated abundance. A set of features can correspond to proteins with the same functional activity, proteins belonging to the same signaling pathway, or target phospho-sites of a common kinase. Based on known protein assignments to pathways and kinase-substrate interactions, rROMA assesses the extent to which a particular biological pathway is active or dysregulated in each individual sample. The samples are subsequently analyzed and stratified according to their profiles across these active signaling pathways and kinases.
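To make the notion of an activity score for a set of features concrete, here is a minimal Python sketch. It assumes the activity of a set is summarized by the first principal component of the abundances of its members, with one score per sample. It is an illustration only and not the rROMA interface: the function name `module_activity`, the sign convention, and the toy data are hypothetical.

```python
import numpy as np


def module_activity(abundance, feature_names, module):
    """Score one feature set per sample via its first principal component.

    abundance:     (n_features, n_samples) array of log-scale abundances.
    feature_names: list of row labels for `abundance`.
    module:        set of feature names belonging to the pathway or kinase.
    Returns (scores, explained), where `scores` has one value per sample and
    `explained` is the fraction of the set's variance captured by PC1.
    """
    idx = [i for i, name in enumerate(feature_names) if name in module]
    if len(idx) < 2:
        raise ValueError("a module needs at least two measured features")

    sub = np.asarray(abundance, dtype=float)[idx, :]
    # Center each feature across samples so PC1 reflects co-variation only.
    centered = sub - sub.mean(axis=1, keepdims=True)

    # Columns of u are feature loadings; s holds the singular values.
    u, s, _ = np.linalg.svd(centered, full_matrices=False)
    loadings = u[:, 0]
    # Assumed sign convention: orient PC1 so the loadings sum to a positive value.
    if loadings.sum() < 0:
        loadings = -loadings

    scores = loadings @ centered
    explained = s[0] ** 2 / np.sum(s ** 2)
    return scores, explained


# Toy data (hypothetical): three features measured in four samples.
names = ["A", "B", "C"]
data = np.array([
    [1.0, 2.0, 3.0, 4.0],
    [2.0, 4.0, 6.0, 8.0],
    [0.0, 1.0, 0.0, 1.0],
])
scores, explained = module_activity(data, names, {"A", "B"})
print(scores, explained)
```

A high fraction of explained variance indicates that the members of the set vary in a coordinated way across samples. The per-sample scores can then be compared between samples or used as input for clustering.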
The rROMA algorithm has been shown to successfully identify disease-related active signaling pathways using transcriptomic and proteomic data [1]. It is an open-source package available on GitHub at: www.github.com/sysbio-curie/rROMA.
We applied rROMA to multiple publicly available proteomic and phospho-proteomic breast cancer datasets to characterize tumor subtypes in terms of signaling pathway and kinase activity. Results indicate that rROMA performed well with the proteomic and phospho-proteomic datasets, revealing distinct clusters of breast cancer samples and higher disease heterogeneity than clinical stratification based on the PAM50 subtypes. The pathways identified as active by rROMA within these clusters are consistent with established knowledge of the signaling pathways relevant to each subtype, such as high activation of epithelial-to-mesenchymal transition and inflammatory pathways in the basal subtype, ATM kinase dysregulation in basal tumors, and high activation of estrogen response in the luminal subtype. Furthermore, rROMA highlights new pathways and kinases that appear deregulated in multiple datasets, uncovering intriguing proteins that warrant further investigation.
[1] Representation and quantification of module activity from omics data with rROMA. Najm M, Cornet M, Albergante L, Zinovyev A, Sermet-Gaudelus I, Stoven V, Calzone L, Martignetti L. NPJ Syst Biol Appl. 2024 Jan 19;10(1):8.