Dr. Gerber will speak at the American Association for the Advancement of Science Annual Meeting, “Science Shaping Tomorrow,” at the Hynes Convention Center in Boston.
MMETHANE: interpretable AI for predicting host status from microbial composition and metabolomics data
Work led by Jen Dawkins – see our new manuscript.
Metabolite production, consumption, and exchange are intimately involved with host health and disease, as well as being key drivers of host-microbiome interactions. Despite the increasing prevalence of datasets that jointly measure microbiome composition and metabolites, computational tools for linking these data to the status of the host remain limited. To address these limitations, we developed MMETHANE, an open-source software package that implements a purpose-built deep learning model for predicting host status from paired microbial sequencing and metabolomic data. MMETHANE incorporates prior biological knowledge, including phylogenetic and chemical relationships, and is intrinsically interpretable, outputting an English-language set of rules that explains its decisions. Using a compendium of six datasets with paired microbial composition and metabolomics measurements, we showed that MMETHANE always performed at least on par with existing methods, including black-box machine learning techniques, and outperformed other methods on more than 80% of the datasets evaluated. We additionally demonstrated through two case studies analyzing inflammatory bowel disease gut microbiome datasets that MMETHANE uncovers biologically meaningful links between microbes, metabolites, and disease status.
MCSPACE: inferring microbiome spatiotemporal dynamics from high-throughput co-localization data
Work led by Gary Uppal – see our new manuscript.
Recent advances in high-throughput approaches for estimating co-localization of microbes, such as SAMPL-seq, allow characterization of the biogeography of the gut microbiome longitudinally and at unprecedented scale. However, these high-dimensional data are complex and have unique noise properties. To address these challenges, we developed MCSPACE, a probabilistic AI method that infers from microbiome co-localization data spatially coherent assemblages of taxa, their dynamics over time, and their responses to perturbations. To evaluate MCSPACE’s capabilities, we generated the largest longitudinal microbiome co-localization dataset to date, profiling spatial relationships of microbes in the guts of mice subjected to serial dietary perturbations over 76 days. Analyses of these data and an existing human longitudinal dataset demonstrated superior benchmarking performance of MCSPACE over existing methods, and moreover yielded insights into spatiotemporal structuring of the gut microbiome, including identifying temporally persistent and dynamic microbial assemblages in the human gut, and shifts in assemblages in the murine gut induced by specific dietary components. Our results highlight the utility of our method, which we make available to the community as an open-source software tool, for elucidating dynamics of microbiome biogeography and gaining insights into the role of spatial relationships in host-microbial ecosystem function.
“AI in microbiome research: Where have we been, where are we going?”
Dr. Gerber lays out his perspective on how AI will impact microbiome research in this Cell Host & Microbe piece.
Dr. Gerber to speak at the Nestlé Nutrition Institute Workshop in Rio de Janeiro, Brazil on June 18, 2024
Dr. Gerber will speak at the 101st Nestlé Nutrition Institute workshop – Nutrition, microbiome and health: latest findings and future research.
The NNI and WNSC workshop 101 brought together leading experts for a deep dive into the latest discoveries on the gut microbiome. Among the topics discussed were the ecological patterns occurring in early life, the influence of maternal microbiota and infant feeding, and strategies to modulate the gut microbiome and promote child health. The workshop also explored future research areas such as the microbiome-environment intersection, the infant gut virome, and the use of artificial intelligence.
See a recording of his talk here.
Dr. Gerber to speak at the Festival of Genomics & Biodata in Boston on June 12, 2024
The Festival is designed principally for scientists and clinicians who are working in the fields of genomics, other omics, and/or using biodata, to further research, drug discovery, healthcare and, ultimately, deliver better patient outcomes.
Topics covered include genomics, single cell/spatial biology, multiomics, biodata, cancer research, drug discovery, AI, microbiome, proteomics, liquid biopsy and so much more.
Jennifer Dawkins thesis defense April 30, 2024
Congrats Jen!!
Computational prediction of health status
from the human gut microbiome and metabolome
Jennifer Dawkins
Tuesday, April 30, 2024 – 1:30 PM
MIT E25-119/121 and Zoom
(See below for full information)
A healthy gut microbiome is crucial to overall human well-being. Gut microbiome dysfunction, or dysbiosis, has been implicated in a broad range of diseases, including inflammatory bowel diseases (IBDs), cardiovascular diseases, kidney diseases, metabolic diseases, and gastrointestinal infections like Clostridioides difficile infection (CDI). Often, microbiome-linked illnesses arise after the microbiome is disrupted, such as by antibiotic treatment. However, because the microbiome is so diverse and individual-specific, very little is known about the specific microbial changes that may lead to human disease. Thus, it is extremely difficult to predict whether a given disruption to the microbiome will result in disease.
Of the diseases linked to gut microbial dysfunction, CDI is perhaps the one most prominently linked to dysbiosis. CDI, the most common healthcare-associated infection, is thought to occur when an individual has had both exposure to the C. difficile pathogen and gut dysbiosis caused by a past perturbation, such as antibiotic treatment. Infection recurrence, with an estimated rate of 15.5%, is a particularly insidious problem, and there is currently no reliable method to predict which individuals will experience a recurrence. There is a need for early prediction of CDI after a perturbation, as this can allow physicians to start or restart more effective treatments immediately and prevent further sickness and risk of death.
Current research into the microbiome and microbiome dysbiosis, including CDI, focuses heavily on identifying the microbial taxonomic composition using next-generation sequencing. However, there is growing evidence that the gut metabolome may provide crucial information that cannot be gained from microbial composition alone, as metabolites provide the means by which host cells and microbial cells communicate with each other. Predictive analysis is especially useful for uncovering links between metabolic or microbial composition features and host disease state, as it models all input covariates simultaneously. However, current predictive methods often fall short when applied to the microbiome, as simpler methods lack the capabilities to model this complex system, whereas highly non-linear “black box” methods lack interpretability. When predicting from biological or medical data with the goals of clinical utility and advancement of scientific knowledge, a model that can explain its decisions is crucial for increasing physician trust and uncovering avenues for future investigation. There is a need for interpretable computational models that can learn non-linear relationships between host outcome and paired microbial composition and metabolomic profiles.
This thesis addresses these two challenges. First, we present the analysis of a novel longitudinal study of CDI recurrence in patients, including predictive analyses, which demonstrate that a small set of metabolites can accurately predict future recurrence. Our findings have clinical utility in the development of diagnostic tests and treatments that could ultimately short-circuit the cycle of CDI recurrence. Second, we present a novel predictive model developed specifically for making interpretable predictions on paired microbial composition and untargeted metabolic profiles. We demonstrate our model’s ability to predict a variety of host disease states accurately while providing clear and biologically compelling explanations of its decisions, thereby demonstrating high clinical and scientific utility.
Thesis Supervisor
Georg K. Gerber, MD, PhD
Associate Professor of Pathology, HMS; Member of the Faculty, Harvard-MIT Program in Health Sciences and Technology
Thesis Committee Chair
Emery Brown, MD, PhD
Warren M. Zapol Professor of Anesthesia, HMS, MGH; Edward Hood Taplin Professor of Medical Engineering and of Computational Neuroscience, MIT
Thesis Reader
Eric Alm, PhD
Professor of Biological Engineering, MIT
Thesis Reader
Emily Balskus, PhD
Thomas Dudley Cabot Professor of Chemistry and Chemical Biology, HU; Howard Hughes Medical Institute Investigator
Dr. Gerber to speak at Institut Pasteur in Paris on April 2, 2024.
Dr. Gerber will talk about “Novel Machine Learning Methods for Dissecting the Microbiome to Improve Human Health” at the Institut Pasteur Statistical and Mathematical Modeling in Biological Applications (SaMMBA) seminar on April 2, 2024.
Gerber Lab awarded $3.1 million, five-year NIH-NIGMS R35 grant “Probabilistic deep learning models and integrated biological experiments for analyzing dynamic and heterogeneous microbiomes”
This work will leverage deep learning technologies to advance the microbiome field beyond finding associations in data, to accurately predicting the effects of perturbations on microbiota, elucidating mechanisms through which the microbiota affects the host, and improving bacteriotherapies to enable their success in the clinic. New deep learning models will be developed that address specific challenges for the microbiome, including noisy/small datasets, highly heterogeneous human microbiomes, the need for direct interpretability of model outputs, complex multi-modal datasets, and constraints imposed by biological principles. Computational models and biological experiments will be directly coupled through reinforcing cycles of predicting, testing predictions with new experiments, and improving models. An important objective will also be to make computational tools widely available to the research community through the release of quality open-source software.
Christine Tataru, PhD, Data Scientist
Welcome to the lab, Christine!