Ola Caster (Uppsala / SE), Linn Fagerberg (Uppsala / SE), Hilda Andersson (Uppsala / SE), Markus Sällman Almén (Uppsala / SE), Ida Grundberg (Uppsala / SE)
Thanks to increased sample throughput and highly parallelized assaying capabilities, proteomics has emerged as an indispensable tool in biomarker discovery for the detection, prognosis, and treatment of disease. Its impact on healthcare in general, and on precision medicine in particular, is expected to continue to grow.
The UK Biobank (UKB) Pharma Proteomics Project (PPP) is a prime example of large-scale proteomics, with protein-level quantification of plasma samples from more than 50,000 individuals. Based on Olink® Explore technology, UKB-PPP has generated data on nearly 3,000 proteins for this vast cohort. Combined with other UKB data, such as genomic or healthcare records, these measurements open up tremendous opportunities for biomarker research in biology and medicine.
This study aimed to estimate, for every protein biomarker available in UKB, its association with the future risk of a large and diverse set of diseases, thus generating a library of protein-disease risk associations freely available to researchers worldwide.
In total, 107 diseases, including cancers, neurological diseases, and cardiovascular diseases, were selected from the PheWAS ontology and mapped to diagnosis codes from longitudinal hospital records, cancer registries, and death registries in UKB. A cohort was created for each disease, consisting of all incident cases within 10 years of blood sampling together with a set of matched, randomly selected controls with no occurrence of the disease during follow-up. All individuals with a first occurrence of the disease prior to their blood sample were excluded. For each protein, the association between measured plasma levels and the time to first occurrence of the disease was estimated using Cox regression, adjusting for sex, age, body mass index, and smoking status. In total, over 300,000 protein-disease risk associations were quantified as hazard ratios with associated p-values.
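The cohort construction described above can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Participant` record, its field names, the `build_cohort` function, and the `controls_per_case` ratio are hypothetical, and because the matching criteria for controls are not stated in the abstract, controls are simply drawn at random. Each returned row holds a duration and an event flag, the form in which such data would typically be passed to a Cox model together with protein level, sex, age, body mass index, and smoking status.

```python
import random
from dataclasses import dataclass
from typing import List, Optional, Tuple

FOLLOW_UP_YEARS = 10.0  # follow-up window stated in the text


@dataclass
class Participant:
    """Hypothetical per-disease record; field names are assumptions."""
    pid: str
    # Years from blood sampling to first diagnosis of the disease.
    # Negative: diagnosed before sampling. None: never diagnosed.
    years_to_first_dx: Optional[float]
    # Years from blood sampling to the end of observation.
    years_observed: float


def build_cohort(
    participants: List[Participant],
    controls_per_case: int = 1,  # assumed ratio, not given in the abstract
    seed: int = 0,
) -> List[Tuple[str, float, int]]:
    """Return (pid, duration_years, event) rows for one disease."""
    cases = []
    eligible_controls = []
    for p in participants:
        dx = p.years_to_first_dx
        if dx is not None and dx < 0:
            continue  # first occurrence before blood sampling: excluded
        if dx is not None and dx <= FOLLOW_UP_YEARS:
            cases.append((p.pid, dx, 1))  # incident case within the window
        else:
            # No occurrence during follow-up: censor at the window's end
            # or at the end of observation, whichever comes first.
            duration = min(p.years_observed, FOLLOW_UP_YEARS)
            eligible_controls.append((p.pid, duration, 0))

    # Random selection only; matching is omitted in this sketch.
    n_controls = min(len(eligible_controls), len(cases) * controls_per_case)
    controls = random.Random(seed).sample(eligible_controls, n_controls)
    return cases + controls
```

Fixing the seed keeps the control draw reproducible, which matters when the same cohort is reused across several thousand per-protein regressions.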
Our results reveal substantial heterogeneity in the strength and number of associations across both diseases and proteins. Some proteins, for example growth differentiation factor 15 (GDF15) and WAP four-disulfide core domain 2 (WFDC2), have statistically significant associations with a high proportion of the included diseases, suggesting they might be less suitable as targeted biomarkers. Several other strong associations identified here, including TNF receptor superfamily member 13B (TNFRSF13B) with leukemia and neurofascin (NFASC) with type 2 diabetes, have been previously reported in independent research, which provides a basic level of validation.
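One way to quantify how broadly a protein is associated across diseases is to compute, per protein, the fraction of tested diseases with a significant association. The sketch below assumes the results are available as `(protein, disease, hazard_ratio, p_value)` tuples and uses a Bonferroni-corrected threshold; both the input layout and the choice of correction are assumptions, as the abstract does not state how significance was defined.

```python
from collections import defaultdict
from typing import Dict, Iterable, Tuple

Association = Tuple[str, str, float, float]  # protein, disease, HR, p-value


def fraction_significant(
    results: Iterable[Association], alpha: float = 0.05
) -> Dict[str, float]:
    """Per protein, the share of tested diseases that pass the threshold."""
    results = list(results)
    # Bonferroni correction over all tests (an assumed choice).
    threshold = alpha / len(results)

    tested = defaultdict(set)
    significant = defaultdict(set)
    for protein, disease, _hazard_ratio, p_value in results:
        tested[protein].add(disease)
        if p_value < threshold:
            significant[protein].add(disease)

    return {
        protein: len(significant[protein]) / len(diseases)
        for protein, diseases in tested.items()
    }
```

A protein whose fraction is close to 1 behaves like a general marker of ill health rather than a marker of any one disease, which is the concern raised above for GDF15 and WFDC2.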
The complete set of results has been made freely available via Olink Insight, an online portal to support proteomic research. With its broad coverage of diseases and its foundation in a large pool of UKB data, this new resource can enhance future studies. For example, our results can help guide biomarker selection for a given disease of interest or be used to cross-reference findings in the post-study analysis phase.