Metaproteomics, the large-scale study of proteins from microbial communities, presents complex challenges in taxonomic inference due to sequence homologies between proteins within and across taxa. Taxonomic inference commonly relies on heuristics such as peptide-spectrum match counting, and only a few more advanced methods are available. We introduce the Peptonizer2000, a novel graphical model-based workflow designed to provide high-resolution taxonomic identifications of metaproteomic samples with associated confidence scores. Our tool integrates peptide scores from any proteomic search engine with peptide-taxon mappings from Unipept. We then represent the joint probability distribution of peptides and taxa as a factor graph and use belief propagation to compute the marginal probability of each taxon's presence, resulting in taxonomic identifications with associated probability scores.
We demonstrate the Peptonizer2000's accuracy and robustness through the analysis of various publicly available metaproteomic samples, including lab-assembled communities and actual microbiomes. We showcase the Peptonizer2000's ability to deliver reliable probabilistic taxonomic identifications at various taxonomic resolution levels. Our results highlight the Peptonizer2000's potential to improve the specificity and confidence of taxonomic assignments in metaproteomics, providing a valuable resource for the study of complex microbial communities.
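To illustrate the idea of turning peptide scores and peptide-taxon mappings into marginal probabilities of taxon presence, the following is a minimal sketch and not the Peptonizer2000 implementation. It makes several assumptions that the text above does not specify: a noisy-OR link between taxa and peptides, the hypothetical parameters `alpha` (chance that a present taxon emits a peptide), `beta` (chance of a peptide appearing with no present parent taxon) and `prior` (prior probability of taxon presence), and search-engine scores treated as soft evidence for peptide presence. Instead of belief propagation it uses exhaustive enumeration of all taxon states, which gives exact marginals for a handful of taxa but does not scale to real samples. All peptide and taxon names are made up.

```python
from itertools import product


def taxon_marginals(peptide_scores, peptide_taxa, alpha=0.9, beta=0.05, prior=0.5):
    """Posterior P(taxon present) by brute-force enumeration of taxon states.

    peptide_scores: peptide -> score in [0, 1], used as soft evidence
    peptide_taxa:   peptide -> list of taxa the peptide maps to
    alpha, beta, prior: assumed noisy-OR and prior parameters (hypothetical)
    """
    taxa = sorted({t for parents in peptide_taxa.values() for t in parents})
    weight_present = dict.fromkeys(taxa, 0.0)
    total = 0.0

    for states in product((0, 1), repeat=len(taxa)):
        present = {t for t, s in zip(taxa, states) if s}

        # Prior weight of this presence/absence configuration.
        weight = 1.0
        for s in states:
            weight *= prior if s else 1.0 - prior

        for peptide, score in peptide_scores.items():
            n_parents = len(present & set(peptide_taxa[peptide]))
            # Noisy-OR: the peptide is absent only if the leak and every
            # present parent taxon all fail to produce it.
            p_peptide = 1.0 - (1.0 - beta) * (1.0 - alpha) ** n_parents
            # Sum out the hidden peptide state, weighted by the score.
            weight *= score * p_peptide + (1.0 - score) * (1.0 - p_peptide)

        total += weight
        for t in present:
            weight_present[t] += weight

    return {t: weight_present[t] / total for t in taxa}


# Toy input: taxon_1 has two unique peptides, taxon_2 only a shared one.
peptide_scores = {"PEP_A": 0.99, "PEP_B": 0.95, "PEP_SHARED": 0.90}
peptide_taxa = {
    "PEP_A": ["taxon_1"],
    "PEP_B": ["taxon_1"],
    "PEP_SHARED": ["taxon_1", "taxon_2"],
}
marginals = taxon_marginals(peptide_scores, peptide_taxa)
for taxon, probability in marginals.items():
    print(f"{taxon}: {probability:.3f}")
```

In this toy input, the taxon supported by unique peptides receives a marginal close to 1, while the taxon supported only by a shared peptide stays near its prior, because the shared peptide is already explained by the other taxon. Under these assumptions, this is the kind of ambiguity that plain peptide-spectrum match counting cannot express as a probability.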