Xue Cai (Hangzhou / CN), Yaoting Sun (Hangzhou / CN), Lu Li (Hangzhou / CN), Zhangzhi Xue (Hangzhou / CN), Yi Zhu (Hangzhou / CN), Yu Wang (Shanghai / CN), Tiannan Guo (Hangzhou / CN)
Background
Approximately 15-30% of thyroid nodules show indeterminate cytology on fine-needle aspiration (FNA) biopsy, which leaves clinicians unable to determine whether a nodule is benign or malignant. Several gene-panel tests have been developed, but they have limitations, such as heterogeneity. Protein samples are less susceptible to spontaneous degradation in clinical specimens. We hypothesized that proteomics-based molecular testing could help address this clinical challenge.
Methods
We conducted a prospective, noninterventional, multicenter study. All patients' FNA samples were classified using the Bethesda System for Reporting Thyroid Cytopathology. Histopathological review served as the reference standard. Proteins were extracted from FNA samples and digested; the resulting peptides were mixed with synthesized stable isotope-labeled peptides (AQUA peptides) and analyzed by multiple reaction monitoring (MRM) mass spectrometry (MS) on a Sciex 4500 MD instrument over an 8-min LC gradient. MS data were analyzed using OpenMS. BRAF V600E mutation status was determined with an ARMS-PCR kit. A random forest model was constructed using the scikit-learn (sklearn) package.
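The general principle behind quantification with spiked stable isotope-labeled peptides can be sketched as follows. This is a minimal illustration, not the study's OpenMS workflow: the function name, the peak areas, and the spiked amount are all hypothetical, and it assumes a single transition with a linear response.

```python
def absolute_amount_fmol(light_area: float, heavy_area: float,
                         spiked_heavy_fmol: float) -> float:
    """Estimate the endogenous (light) peptide amount from the light/heavy
    peak-area ratio and the known amount of spiked labeled (heavy) peptide."""
    if heavy_area <= 0:
        raise ValueError("heavy (labeled) peak area must be positive")
    return light_area / heavy_area * spiked_heavy_fmol


# Hypothetical peak areas for one MRM transition, with 50 fmol of heavy peptide spiked in
amount = absolute_amount_fmol(light_area=2.0e5, heavy_area=4.0e5,
                              spiked_heavy_fmol=50.0)
print(amount)  # 25.0
```

Because the labeled peptide co-elutes with its endogenous counterpart, the ratio of the two peak areas is largely insensitive to run-to-run variation in injection and ionization, which is why a known spike can serve as an internal calibrant.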
Results
A total of 1954 FNA samples from 1952 patients, collected at 15 medical centers between January 2020 and September 2022, were used for feature selection, model training, and validation. Based on previous studies, 211 proteins were selected as candidate diagnostic biomarkers. After multiple rounds of biomarker screening, including machine learning model screening, evaluation of AQUA peptide stability and linearity, and MS signal screening, three proteins were selected as the final features of the classifier model (ThyroProt). ThyroProt incorporates the absolute quantitative data of the three proteins, along with age, sex, and BRAF V600E mutation status.
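A classifier with this feature layout (three protein quantities, age, sex, and BRAF V600E status) could be set up in scikit-learn roughly as below. This is a sketch on synthetic data only: the simulated values, the hyperparameters, and the train/test split are assumptions made for illustration and do not reproduce ThyroProt or its results.

```python
import numpy as np
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(0)
n = 400

# Synthetic stand-ins for the six features; none of these values come from the study.
y = rng.integers(0, 2, size=n)                      # 1 = malignant, 0 = benign
proteins = rng.lognormal(0.0, 0.5, size=(n, 3)) * (1.0 + 0.8 * y[:, None])
age = rng.integers(18, 80, size=n)
sex = rng.integers(0, 2, size=n)
braf = (rng.random(n) < np.where(y == 1, 0.6, 0.02)).astype(int)
X = np.column_stack([proteins, age, sex, braf])

# Hold out a quarter of the synthetic samples for evaluation.
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, stratify=y, random_state=0
)

# Hyperparameters here are placeholders, not the study's settings.
model = RandomForestClassifier(n_estimators=200, random_state=0)
model.fit(X_train, y_train)

prob_malignant = model.predict_proba(X_test)[:, 1]
auc = roc_auc_score(y_test, prob_malignant)
print(f"AUC on synthetic hold-out: {auc:.2f}")
```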
The independent test set comprised 290 FNA samples from 283 patients, collected at five medical centers between October 2022 and March 2023. On this independent test set, the classifier achieved an area under the curve (AUC) of 0.94, an accuracy of 89.7% (95% confidence interval [CI]: 89.6-89.7), a sensitivity of 85.9% (95% CI: 80.7-91.0), a specificity of 95.6% (95% CI: 91.8-99.4), a positive predictive value (PPV) of 96.8% (95% CI: 94.1-99.6), and a negative predictive value (NPV) of 81.2% (95% CI: 74.6-87.8) for thyroid malignancy. In 45 FNA samples reported as Bethesda III or IV, the classifier achieved an AUC of 0.87 and an accuracy of 88.9% (95% CI: 88.5-89.3), with sensitivity, specificity, PPV, and NPV of 86.7% (95% CI: 69.5-100), 90.0% (95% CI: 79.3-100), 81.2% (95% CI: 62.1-100), and 93.1% (95% CI: 83.9-100), respectively.
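The relationship between a confusion matrix and the reported sensitivity, specificity, PPV, and NPV can be illustrated with a short calculation. The counts below are not stated in the abstract; they are an assumption, chosen because they reproduce the point estimates reported for the 45 Bethesda III/IV samples. The normal-approximation (Wald) interval, clipped to [0, 1], is likewise an assumed method rather than one the abstract names.

```python
import math


def wald_ci(successes: int, total: int, z: float = 1.96):
    """Return (proportion, lower, upper) using the normal approximation,
    with the interval clipped to the range [0, 1]."""
    p = successes / total
    half_width = z * math.sqrt(p * (1 - p) / total)
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Assumed counts (not given in the source) for 45 samples
tp, fn, fp, tn = 13, 2, 3, 27

metrics = {
    "sensitivity": wald_ci(tp, tp + fn),
    "specificity": wald_ci(tn, tn + fp),
    "PPV": wald_ci(tp, tp + fp),
    "NPV": wald_ci(tn, tn + fn),
}
accuracy = (tp + tn) / (tp + fn + fp + tn)

for name, (p, lo, hi) in metrics.items():
    print(f"{name}: {100 * p:.1f}% (95% CI: {100 * lo:.1f}-{100 * hi:.1f})")
print(f"accuracy: {100 * accuracy:.1f}%")
```

With small denominators such as these, the Wald upper bound exceeds 100% before clipping; a Wilson or exact (Clopper-Pearson) interval is a common alternative in that situation.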
Conclusions
This study demonstrates the feasibility and practicality of combining targeted mass spectrometry detection of multiple proteins with machine learning for clinical diagnostics in real-world settings.