Cecile Le Sueur (Heidelberg / DE), Mikhail Savitski (Heidelberg / DE), Magnus Rattray (Manchester / GB)
Thermal proteome profiling (TPP) is a proteome-wide technology combining the cellular thermal shift assay with quantitative mass spectrometry to provide insights into protein interactions and states. Statistical analysis of temperature range TPP (TPP-TR) datasets involves comparing protein melting curves, which describe the amount of non-denatured protein as a function of temperature, between different conditions (e.g. presence or absence of a drug). Current state-of-the-art models are restricted to sigmoidal melting behaviors, yet unconventional melting curves represent up to 20% of TPP-TR datasets. As shown in the literature, non-sigmoidal melting curves are likely to carry important biological information. We recently introduced GPMelt, a novel statistical framework based on hierarchical Gaussian Process (GP) models, to make the analysis of TPP-TR datasets unbiased with respect to the proteins' melting profiles. The model robustly integrates information from replicates, accommodates multiple conditions and handles any melting curve shape. By further formulating the hierarchical GP model as a multi-task GP regression, we provide an interpretable statistical model which can easily be extended to more complex TPP protocols. As an illustration, the analysis of peptide-level TPP-TR datasets, which considers melting curves of tryptic peptides instead of protein averages, is implemented using deeper hierarchies. Unbiased analyses of peptide-level TPP-TR datasets, of high value for the study of protein post-translational modifications, have so far been hindered by the abundance of unconventional melting curves. A second example is the extension of the GPMelt model to 2D-TPP datasets. 2D-TPP datasets compare protein thermal stability across a larger number of conditions, using a distinct sample multiplexing strategy that previously impeded melting curve reconstruction.
Retaining the melting curve shape, and hence key biological information, strengthens the accuracy of 2D-TPP discoveries. Collectively, GPMelt extends the analysis of TPP-TR datasets to both protein-level and peptide-level melting curves, offering access to thousands of previously excluded melting curves. The versatility and interpretability of the hierarchical GP model permit its extension to nearly any TPP-based protocol, paving the way for groundbreaking biological discoveries in protein interactions, localisation and function.
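
To make the hierarchical structure concrete, the following is a minimal NumPy sketch of a three-level hierarchical GP covariance (protein > condition > replicate). It is an illustration under stated assumptions, not the GPMelt implementation: the squared-exponential kernel, the function names (`rbf`, `hierarchical_kernel`, `log_marginal_likelihood`), the hyperparameter values and the toy data are all hypothetical choices made here. Observations from the same protein share a protein-level kernel, observations from the same condition additionally share a condition-level kernel, and observations from the same replicate additionally share a replicate-level kernel.

```python
import numpy as np

def rbf(t1, t2, variance, lengthscale):
    """Squared-exponential kernel between two 1-D arrays of temperatures."""
    d = t1[:, None] - t2[None, :]
    return variance * np.exp(-0.5 * (d / lengthscale) ** 2)

def hierarchical_kernel(t, cond, rep, params):
    """Covariance of a three-level hierarchical GP for one protein.

    t, cond, rep: 1-D arrays giving temperature, condition label and
    replicate label of each observation. params maps each level to an
    assumed (variance, lengthscale) pair.
    """
    same_cond = cond[:, None] == cond[None, :]
    same_rep = same_cond & (rep[:, None] == rep[None, :])
    K = rbf(t, t, *params["protein"])                  # shared by all curves
    K += same_cond * rbf(t, t, *params["condition"])   # shared within a condition
    K += same_rep * rbf(t, t, *params["replicate"])    # specific to one replicate
    return K

def log_marginal_likelihood(y, K, noise_var):
    """Zero-mean GP log marginal likelihood, computed via Cholesky."""
    n = len(y)
    L = np.linalg.cholesky(K + noise_var * np.eye(n))
    alpha = np.linalg.solve(L.T, np.linalg.solve(L, y))
    return -0.5 * y @ alpha - np.log(np.diag(L)).sum() - 0.5 * n * np.log(2 * np.pi)

# Toy layout (hypothetical): 10 temperatures, 2 conditions, 2 replicates each.
temps = np.linspace(37.0, 67.0, 10)
t = np.tile(temps, 4)
cond = np.repeat([0, 0, 1, 1], 10)
rep = np.repeat([0, 1, 0, 1], 10)

# Assumed hyperparameters; in practice these would be fitted to the data.
params = {"protein": (1.0, 10.0), "condition": (0.5, 10.0), "replicate": (0.1, 10.0)}
K = hierarchical_kernel(t, cond, rep, params)

# Synthetic sigmoid-like observations, used only to exercise the code.
rng = np.random.default_rng(0)
y = 1.0 / (1.0 + np.exp((t - 52.0) / 3.0)) + 0.05 * rng.standard_normal(t.size)
lml = log_marginal_likelihood(y, K, noise_var=0.01)
```

Because no parametric curve shape is imposed, the same covariance applies to sigmoidal and non-sigmoidal curves alike. A deeper hierarchy, such as a peptide level, would correspond to one more indicator mask and kernel term in `hierarchical_kernel`.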