Yan Zhou (Hangzhou / CN), Yingrui Wang (Hangzhou / CN), Xiao Shi (Shanghai / CN), Jiatong Wang (Hangzhou / CN), Zelin Zang (Hangzhou / CN), Likun Zhang (Shanghai / CN), Zhiqiang Gui (Shenyang / CN), Honghan Chen (Hangzhou / CN), Jiaxi Wang (Wuhan / CN), Hanqing Liu (Hangzhou / CN), Zhihong Wang (Shenyang / CN), Haixia Guan (Guangzhou / CN), Xiaohong Wu (Hangzhou / CN), Cenkai Shen (Shanghai / CN), Yi He (Dalian / CN), Bo Huang (Shenyang / CN), Hao Zhang (Shenyang / CN), Jianbiao Wang (Hangzhou / CN), Yijun Wu (Hangzhou / CN), Chuang Chen (Wuhan / CN), Yaoting Sun (Hangzhou / CN), Zhiyan Liu (Shanghai / CN), Yu Wang (Shanghai / CN), Tiannan Guo (Hangzhou / CN)
Introduction
Thyroid carcinoma ranks 7th globally in cancer incidence, with 821,173 new cases in 2022. Medullary thyroid carcinoma (MTC), a rare neuroendocrine tumor originating from C cells, accounts for 1-2% of thyroid neoplasms and is relatively aggressive owing to metastasis and treatment resistance. Effective risk stratification and identification of high-risk patients are therefore crucial. Our study aims to identify molecular prognostic biomarkers and to build AI models from proteomic and gene panel data for personalized postoperative management.
Methods
We retrospectively collected 573 MTC FFPE samples (542 patients) from 10 Chinese hospitals covering different geographical areas, with a discovery cohort from 6 centers (n=455; 424 patients) and an independent test cohort from 4 centers (n=118; 118 patients) (Fig. 1A). All samples were reviewed by at least two experienced pathologists. Non-MTC samples, samples lacking clinical information, and samples from patients lost to follow-up were removed, leaving 389 eligible samples (358 patients) in the discovery cohort and 106 (106 patients) in the test cohort. Samples were processed to peptides using a semi-automated sample preparation method. Proteomic data were acquired by data-independent acquisition (DIA) on a timsTOF Pro mass spectrometer, and MS/MS spectra were searched against an in-house thyroid-specific library. A 28-gene panel was sequenced, and MTC tumor grade was scored using H&E and Ki-67 staining.
Preliminary Data
We quantified 10,092 proteins in the discovery cohort. To identify MTC proteomic subtypes, we applied a non-negative matrix factorization (NMF) algorithm and selected 52 proteins from among those with a high coefficient of variation (CV) and those that were differentially expressed. Based on these 52 proteins, the discovery samples were separated into three subtypes (Fig. 1B). Structural recurrence differed significantly among the three clusters (P<0.0001). Cluster #2 had a poor prognosis, with a 5-year recurrence-free survival (RFS) of 70.7% and a disease-specific death (DSD) rate of 14.6%, whereas cluster #3 had a favorable prognosis, with a 5-year RFS of 94.6% and a DSD rate of 0.9%. The 52-protein classifier was validated in the independent test cohort (P=0.019) and in 102 MTC samples from published resources (P=0.014), demonstrating its robustness in stratifying patients into distinct prognostic groups. We further explored the tumor immune microenvironment of the three clusters by analyzing immune infiltration with CIBERSORTx and by multiplex immunofluorescence imaging.
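As an illustration of the high-CV filtering step only (not the NMF clustering or the differential expression analysis), the following minimal sketch ranks proteins by coefficient of variation across samples. The protein names, intensity values, and the helper functions `coefficient_of_variation` and `top_cv_proteins` are hypothetical, and the choice of sample standard deviation on linear-scale intensities is an assumption; the abstract does not specify how CV was computed or what cutoff was used.

```python
from statistics import mean, stdev


def coefficient_of_variation(values):
    """CV = sample standard deviation / mean, for positive intensities."""
    m = mean(values)
    if m <= 0:
        raise ValueError("CV is only meaningful for a positive mean")
    return stdev(values) / m


def top_cv_proteins(intensities, k):
    """Return the k protein names with the highest CV across samples."""
    cvs = {name: coefficient_of_variation(vals)
           for name, vals in intensities.items()}
    return sorted(cvs, key=cvs.get, reverse=True)[:k]


# Hypothetical toy intensities (linear scale) for three made-up proteins,
# each measured in four samples.
toy = {
    "PROT_A": [10.0, 10.5, 9.5, 10.0],  # nearly constant across samples
    "PROT_B": [2.0, 8.0, 1.0, 9.0],     # highly variable
    "PROT_C": [5.0, 6.0, 4.0, 5.0],     # moderately variable
}

selected = top_cv_proteins(toy, k=2)
print(selected)
```

In practice such a variability filter would typically be combined with the differential expression results before the reduced protein matrix is passed to a clustering method.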
To predict recurrence risk, we trained an AI model on proteomic, gene mutation, and clinical data; it achieved an AUC of 0.99 in the discovery cohort and 0.87 in the independent test cohort.
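For readers less familiar with the AUC metric reported above, the following minimal sketch computes it as the probability that a randomly chosen patient with recurrence receives a higher predicted risk than a randomly chosen patient without recurrence, counting ties as one half. The labels and scores are hypothetical toy values, and `roc_auc` is an illustrative helper, not the study's evaluation code.

```python
def roc_auc(labels, scores):
    """AUC via pairwise comparison of positive and negative scores.

    labels: iterable of 0/1 (1 = recurrence); scores: predicted risks.
    """
    pos = [s for l, s in zip(labels, scores) if l == 1]
    neg = [s for l, s in zip(labels, scores) if l == 0]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0   # positive correctly ranked above negative
            elif p == n:
                wins += 0.5   # a tie counts as half
    return wins / (len(pos) * len(neg))


# Hypothetical toy data: 3 patients with recurrence, 4 without.
labels = [1, 1, 1, 0, 0, 0, 0]
scores = [0.9, 0.8, 0.35, 0.4, 0.3, 0.2, 0.1]

auc = roc_auc(labels, scores)
print(round(auc, 3))
```

Here 11 of the 12 positive-negative pairs are ranked correctly, so the AUC is 11/12. The pairwise form is quadratic in cohort size, which is acceptable for cohorts of a few hundred samples.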
Novel Aspect
This is the largest multi-center multi-omic study of MTC to date. The MTC subtyping classifier and recurrence risk prediction model enable individualized risk stratification and offer resources for drug target exploration.