Alex Henneman (Amsterdam / NL), Roza Opperman (Amsterdam / NL), Thang V. Pham (Amsterdam / NL), Sofie Bosch (Amsterdam / NL), Meike de Wit (Amsterdam / NL), Beatriz Carvalho (Amsterdam / NL), Remond Fijneman (Amsterdam / NL), Gerrit Meijer (Amsterdam / NL), Nanne de Boer (Amsterdam / NL), Connie Jimenez (Amsterdam / NL)
Introduction
The human gut microbiome is a complex ecosystem that is increasingly linked to a variety of conditions, such as obesity and depression. To date, most investigations have used genome-wide or targeted DNA sequencing, such as 16S rRNA gene sequencing. These approaches cannot determine or quantify the proteins actually produced and secreted by host and microbiome, although these proteins may provide more complete insights. We therefore used mass spectrometry to examine a large set of stool samples from healthy controls, adenoma patients and cancer patients, mapping the composition of the microbiome and its variation across these progression stages. Importantly, current small-host-number data analysis approaches are inadequate in a metaproteomics setting and call for novel processing and visualization methods. We aim to contribute such methods to foster the emerging field of gut microbiome metaproteomics.
Methods
Approximately 380 human stool samples, a subset of which also had corresponding 16S sequencing data, were profiled using LC-MS/MS in data-dependent acquisition mode. The resulting fragmentation spectra were searched using MSFragger against a search space composed of the human proteome supplemented with the proteins of approximately 350 carefully selected gut bacterial species from the Human Microbiome Project. Peptide-spectrum matches were processed using Trans-Proteomic Pipeline elements in conjunction with in-house software into FDR-controlled, count-based peptide abundance tables. A novel method, developed specifically for our metaproteomics applications, was used to aggregate these peptide tables into protein tables and into higher-level taxonomic units. Subsequently, specialized diversity estimation and visualization methods were developed and applied to this dataset.
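The aggregation and diversity steps can be illustrated with a minimal sketch. It is not the novel method used in this study, which is not specified here. It assumes a generic and commonly used scheme: counts of peptides that map to exactly one group (a protein or a taxon) are summed per sample, and a Shannon index is then computed per sample over the resulting group counts. The function names, the input layout and the rule of discarding shared peptides are all assumptions made for illustration.

```python
import math
from collections import defaultdict


def aggregate_counts(peptide_counts, peptide_to_groups):
    """Sum peptide counts per group and sample.

    Assumption: only peptides mapping to exactly one group (protein or
    taxon) are used; shared peptides are skipped. This is a generic
    rule, not the method developed in the study.
    """
    totals = defaultdict(lambda: defaultdict(int))
    for peptide, per_sample in peptide_counts.items():
        groups = peptide_to_groups.get(peptide, set())
        if len(groups) != 1:
            continue  # ambiguous or unmapped peptide
        (group,) = groups
        for sample, count in per_sample.items():
            totals[group][sample] += count
    return {group: dict(samples) for group, samples in totals.items()}


def shannon_index(counts):
    """Shannon diversity (natural log) of a list of non-negative counts."""
    total = sum(counts)
    if total == 0:
        return 0.0
    return -sum((c / total) * math.log(c / total) for c in counts if c > 0)


def diversity_per_sample(group_counts):
    """Compute the Shannon index for each sample over all groups."""
    samples = sorted({s for per_sample in group_counts.values() for s in per_sample})
    return {
        s: shannon_index([per_sample.get(s, 0) for per_sample in group_counts.values()])
        for s in samples
    }
```

Given a peptide table and a peptide-to-taxon mapping, `diversity_per_sample(aggregate_counts(table, mapping))` returns one diversity value per sample. Discarding shared peptides is the simplest choice, but it loses signal for closely related species, which is one reason a dedicated aggregation method may be preferable in a metaproteomics setting.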
Results and conclusions
This study provides an extensive and complex human-microbial proteomic landscape spanning several orders of magnitude in abundance. A clear signal, reflected in both human and microbial protein components, can be discerned across stool samples from normal controls, adenoma patients and cancer patients. Proteomics and 16S signatures are largely in agreement, although the unbiased nature of a proteomics-based approach provides a more informative, higher-resolution landscape.