MMETHANE: interpretable AI for predicting host status from microbial composition and metabolomics data

Work led by Jen Dawkins – see our new manuscript.

Metabolite production, consumption, and exchange are intimately involved with host health and disease, as well as being key drivers of host-microbiome interactions. Despite the increasing prevalence of datasets that jointly measure microbiome composition and metabolites, computational tools for linking these data to the status of the host remain limited. To address these limitations, we developed MMETHANE, an open-source software package that implements a purpose-built deep learning model for predicting host status from paired microbial sequencing and metabolomic data. MMETHANE incorporates prior biological knowledge, including phylogenetic and chemical relationships, and is intrinsically interpretable, outputting an English-language set of rules that explains its decisions. Using a compendium of six datasets with paired microbial composition and metabolomics measurements, we showed that MMETHANE always performed at least on par with existing methods, including blackbox machine learning techniques, and outperformed other methods on >80% of the datasets evaluated. We additionally demonstrated through two case studies analyzing inflammatory bowel disease gut microbiome datasets that MMETHANE uncovers biologically meaningful links between microbes, metabolites, and disease status.

MCSPACE: inferring microbiome spatiotemporal dynamics from high-throughput co-localization data

Work led by Gary Uppal – see our new manuscript.

Recent advances in high-throughput approaches for estimating co-localization of microbes, such as SAMPL-seq, allow characterization of the biogeography of the gut microbiome longitudinally and at unprecedented scale. However, these high-dimensional data are complex and have unique noise properties. To address these challenges, we developed MCSPACE, a probabilistic AI method that infers from microbiome co-localization data spatially coherent assemblages of taxa, their dynamics over time, and their responses to perturbations. To evaluate MCSPACE’s capabilities, we generated the largest longitudinal microbiome co-localization dataset to date, profiling spatial relationships of microbes in the guts of mice subjected to serial dietary perturbations over 76 days. Analyses of these data and an existing human longitudinal dataset demonstrated superior benchmarking performance of MCSPACE over existing methods, and moreover yielded insights into spatiotemporal structuring of the gut microbiome, including identifying temporally persistent and dynamic microbial assemblages in the human gut, and shifts in assemblages in the murine gut induced by specific dietary components. Our results highlight the utility of our method, which we make available to the community as an open-source software tool, for elucidating dynamics of microbiome biogeography and gaining insights into the role of spatial relationships in host-microbial ecosystem function.

Dr. Gerber to speak at the Nestlé Nutrition Institute Workshop in Rio de Janeiro, Brazil on June 18, 2024

Dr. Gerber will speak at the 101st Nestlé Nutrition Institute workshop – Nutrition, microbiome and health: latest findings and future research.

The NNI and WNSC workshop 101 brought together leading experts for a deep dive into the latest discoveries on the gut microbiome. Among the topics discussed were the ecological patterns occurring in early life, the influence of maternal microbiota and infant feeding, and strategies to modulate the gut microbiome and promote child health. The workshop also explored directions for future research, such as the microbiome-environment intersection, the infant gut virome, and the use of artificial intelligence.

See a recording of his talk here.

Dr. Gerber to speak at the Festival of Genomics & Biodata in Boston on June 12, 2024

The Festival is designed principally for scientists and clinicians who are working in the fields of genomics, other omics, and/or using biodata, to further research, drug discovery, healthcare and – ultimately – deliver better patient outcomes.

Topics covered include genomics, single cell/spatial biology, multiomics, biodata, cancer research, drug discovery, AI, microbiome, proteomics, liquid biopsy and so much more.

Two papers accepted at ICML WCB 2023

Gerber GK, Bhattarai SK, Du M, Glickman MS, Bucci V. Discovery of Host-Microbiome Interactions Using Multi-Modal, Sparse, Time-Aware, Bayesian Network-Structured Neural Topic Models. International Conference on Machine Learning Workshop on Computational Biology, 2023.

Uppal G, Urtecho G, Richardson M, Moody T, Wang HH, Gerber GK. MC-SPACE: Microbial communities from spatially associated counts engine. International Conference on Machine Learning Workshop on Computational Biology, 2023.

“MDITRE: Scalable and Interpretable Machine Learning for Predicting Host Status from Temporal Microbiome Dynamics” is mSystems Editor’s Pick

Longitudinal microbiome data sets are being generated with increasing regularity, and there is broad recognition that these studies are critical for unlocking the mechanisms through which the microbiome impacts human health and disease. However, there is a dearth of computational tools for analyzing microbiome time-series data. To address this gap, we developed an open-source software package, Microbiome Differentiable Interpretable Temporal Rule Engine (MDITRE), which implements a new highly efficient method leveraging deep-learning technologies to derive human-interpretable rules that predict host status from longitudinal microbiome data. Using semi-synthetic data and a large compendium of publicly available 16S rRNA amplicon and metagenomics sequencing data sets, we demonstrate that in almost all cases, MDITRE performs on par with or better than popular uninterpretable machine learning methods, and runs orders of magnitude faster than the prior interpretable technique. MDITRE also provides a graphical user interface, which we show through case studies can be used to derive biologically meaningful interpretations linking patterns of microbiome changes over time with host phenotypes.

Gerber lab study showing gut metabolites predict C. diff recurrence

Clostridioides difficile infection (CDI) is the most common hospital-acquired infection in the USA, with recurrence rates > 15%. Although primary CDI has been extensively linked to gut microbial dysbiosis, less is known about the factors that promote or mitigate recurrence. Using broad metabolomics data together with statistical and machine learning models, Jen Dawkins, an HST PhD student and member of the Gerber lab, showed that metabolites in the gut can accurately predict C. difficile recurrence. These findings have implications for the development of diagnostic tests and treatments that could ultimately short-circuit the cycle of CDI recurrence, by providing candidate metabolic biomarkers for diagnostics development, as well as offering insights into the complex microbial and metabolic alterations that are protective or permissive for recurrence.

Dawkins JJ, Allegretti JR, Gibson TE, McClure E, Delaney M, Bry L, Gerber GK. Gut metabolites predict Clostridioides difficile recurrence. Microbiome. 2022 Jun 9;10(1):87. doi: 10.1186/s40168-022-01284-1. PMID: 35681218; PMCID: PMC9178838.