Research Scientist, Machine Learning and Computational Biology for Microbiome

The Microbiome AI/Deep Learning Lab in the Massachusetts Host-Microbiome Center and Division of Computational Pathology at Brigham and Women’s Hospital/Harvard Medical School is seeking a computational scientist with experience in machine learning. You will develop, deploy, and apply machine learning approaches, with a special emphasis on deep learning, to a variety of microbiology data sources. Applications will include forecasting microbial population dynamics in the gut, characterizing the impact of spatial structure of the microbiome, predicting impact of the microbiome on host phenotype, tracking infections in human populations, elucidating microbial metabolism, and discovering functions of uncharacterized microbial metabolites and proteins. An important component of the position will also include engagement with the broader research community to identify new application areas.

Applicants should have a high level of interest in:

  • Applying new deep learning technologies to biomedical problems.
  • Advancing knowledge of the microbiome and its role in human health and disease.
  • Having your work make a direct impact on healthcare outcomes.
  • Working on an interdisciplinary team and collaborating with computational, wet lab and clinical scientists.
  • Engaging with the broader research community to advance applications of AI/deep learning for the microbiome.

About the environment: The Microbiome AI/Deep Learning Lab is an initiative within the Massachusetts Host-Microbiome Center (MHMC) and the Division of Computational Pathology (DCP) at Brigham and Women’s Hospital (BWH)/Harvard Medical School (HMS). With recent funding from the Massachusetts Life Sciences Center, the Lab has built a state-of-the-art compute cluster with extensive GPU and CPU nodes, with the objective of making advanced deep learning technologies broadly available to microbiome researchers. The MHMC is a research and core facility that has worked with 100+ groups in the US and internationally to promote understanding of host-microbiome interactions in health and disease, emphasizing a focus on function to define causative effects of the microbiota and to harness this knowledge in developing new therapies, diagnostics and further commercial applications. The DCP is a research division with a broad mandate to develop and apply advanced computational methods for furthering the understanding, diagnosis and treatment of human diseases. BWH is an HMS-affiliated teaching hospital, adjacent to the HMS main quad, and the second-largest non-university recipient of NIH research funding.

PRINCIPAL DUTIES AND RESPONSIBILITIES:

  • Develop machine learning approaches, with a special emphasis on deep learning, for a variety of microbiology data sources, including next generation sequencing and metabolomic data.
  • Deploy computational pipelines on local workstations and on high performance CPU and GPU clusters.
  • Analyze datasets and produce visualizations and written reports, including contributing to scientific publications and grant applications.
  • Engage with BWH researchers and conduct broader outreach, with the goal of increasing application of machine learning technologies for the microbiome.
  • Other duties as assigned.

QUALIFICATIONS:

  • PhD in Computational Biology, Computer Science, Physics, Statistics, Quantitative Microbial Genetics, Quantitative Ecology, or related quantitative discipline.
  • Experience in machine learning applications demonstrated through authorship on scientific publications.
  • 3+ years of Python programming experience.
  • 3+ years of experience working in high-performance computing environments.
  • Experience with bioinformatics methods and pipelines for next generation sequencing data analysis.
  • Experience with organizing and managing large multi-omics datasets.
  • Strong verbal and written communication, and interpersonal skills.
  • Experience with microbiology/microbiome applications and metabolic modeling tools is highly desired.
  • Experience with deep learning and PyTorch is highly desired.

SKILLS/ABILITIES/COMPETENCIES REQUIRED:

  • Must be capable of contributing within an interdisciplinary team, exhibit a high level of initiative, and have an eagerness to learn new technologies.
  • Ability to manage entire projects in a research environment, from design through implementation to interpretation of final results.
  • Must possess advanced knowledge of machine learning, including model development, training, testing and deployment.
  • Demonstrated ability to develop and implement novel computational approaches for analyzing complex biomedical datasets including next generation sequencing data.
  • Demonstrated ability to manage large and complex biomedical datasets, using tools such as databases.
  • Experience working with microbiology or microbiome datasets is highly desired.
  • Excellent written and verbal communication skills with demonstrated ability to communicate complex results to both technical and non-technical audiences, through publications and presentations.
  • Ability to implement machine learning methods in Python; experience with deep learning and using PyTorch is highly desired.
  • Knowledge of software engineering best practices, including source code management/control (e.g., Git) and containerization approaches.
  • Experience with high-performance computing environments, including scheduling systems, e.g., SLURM.
  • Ability to multitask and prioritize work, to achieve desired goals and deliverables.
  • Excellent interpersonal skills to effectively communicate with multidisciplinary teams, including staff at all levels of the organization.
  • Ability to share expertise, coach, and give general direction to others of different skill sets, backgrounds and levels.
  • Ability to lead outreach efforts within and external to the organization, to further the goal of the project.

PLEASE SUBMIT A COVER LETTER WITH YOUR APPLICATION. 

Post Doctoral Fellow in Deep Learning for Microbiome Spatial Omics

The Gerber Lab (http://gerber.bwh.harvard.edu) is a multidisciplinary group at Brigham and Women’s Hospital/Harvard Medical School that develops novel computational models and high-throughput experimental systems to understand the role of the microbiota in human diseases, and applies these findings to develop new diagnostic tests and therapies. A long-standing and continuing focus of the lab is on incorporating principled probabilistic models into machine learning methods. The director of the lab, Dr. Georg Gerber, MD, PhD, MPH, uses his unique expertise, combining deep learning method development, medical microbiology, and human pathology, to leverage cutting-edge technologies to tackle scientifically and clinically important problems. 

We are looking for an exceptional researcher who will play a major role in new initiatives in the lab to develop novel deep learning (DL) approaches to further understanding of the spatial organization of the microbiome (the trillions of microbes living on and within us) and its interactions with mammalian cells. The successful candidate will be highly motivated and creative, taking a lead role in developing new deep learning-based methods, analyzing data, and interpreting results. Although experience analyzing data from biological systems is required, microbiome-specific knowledge is not.

Qualifications:

  • PhD in Computer Science, Computational Biology, or other highly quantitative discipline.
  • Outstanding publication track record.
  • Strong mathematical background and skills.
  • Experience developing DL methods.
  • Experience analyzing data from biological systems, including sequencing data.
  • Solid programming skills in Python, including PyTorch.
  • Superior verbal and written communication skills, and ability to work on multidisciplinary teams.

Environment: The Gerber Lab is located in the Brigham and Women’s Hospital Division of Computational Pathology (http://comp-path.bwh.harvard.edu) at Harvard Medical School (HMS). With a recent grant from the Massachusetts Life Sciences Center, the Division has built the Lab for AI/Deep Learning for the Microbiome, which has a state-of-the-art GPU cluster for model development, training and deployment. BWH is part of the greater Longwood Medical Area in Boston, a rich, stimulating environment conducive to intellectual development and research collaborations, which includes HMS, Harvard School of Public Health and Boston Children’s Hospital.

To apply: email a single PDF including cover letter, CV, brief research statement and a list of at least three references to Dr. Georg Gerber (ggerber@bwh.harvard.edu).

We are an equal opportunity employer and all qualified applicants will receive consideration for employment without regard to race, color, religion, sex, national origin, disability status, protected veteran status, gender identity, sexual orientation, pregnancy and pregnancy-related conditions or any other characteristic protected by law.

MMETHANE: interpretable AI for predicting host status from microbial composition and metabolomics data

Work led by Jen Dawkins – see our new manuscript.

Metabolite production, consumption, and exchange are intimately involved with host health and disease, as well as being key drivers of host-microbiome interactions. Despite the increasing prevalence of datasets that jointly measure microbiome composition and metabolites, computational tools for linking these data to the status of the host remain limited. To address these limitations, we developed MMETHANE, an open-source software package that implements a purpose-built deep learning model for predicting host status from paired microbial sequencing and metabolomic data. MMETHANE incorporates prior biological knowledge, including phylogenetic and chemical relationships, and is intrinsically interpretable, outputting an English-language set of rules that explains its decisions. Using a compendium of six datasets with paired microbial composition and metabolomics measurements, we showed that MMETHANE always performed at least on par with existing methods, including black-box machine learning techniques, and outperformed other methods on >80% of the datasets evaluated. We additionally demonstrated through two case studies analyzing inflammatory bowel disease gut microbiome datasets that MMETHANE uncovers biologically meaningful links between microbes, metabolites, and disease status.

MCSPACE: inferring microbiome spatiotemporal dynamics from high-throughput co-localization data

Work led by Gary Uppal – see our new manuscript.

Recent advances in high-throughput approaches for estimating co-localization of microbes, such as SAMPL-seq, allow characterization of the biogeography of the gut microbiome longitudinally and at unprecedented scale. However, these high-dimensional data are complex and have unique noise properties. To address these challenges, we developed MCSPACE, a probabilistic AI method that infers from microbiome co-localization data spatially coherent assemblages of taxa, their dynamics over time, and their responses to perturbations. To evaluate MCSPACE’s capabilities, we generated the largest longitudinal microbiome co-localization dataset to date, profiling spatial relationships of microbes in the guts of mice subjected to serial dietary perturbations over 76 days. Analyses of these data and an existing human longitudinal dataset demonstrated superior benchmarking performance of MCSPACE over existing methods, and moreover yielded insights into spatiotemporal structuring of the gut microbiome, including identifying temporally persistent and dynamic microbial assemblages in the human gut, and shifts in assemblages in the murine gut induced by specific dietary components. Our results highlight the utility of our method, which we make available to the community as an open-source software tool, for elucidating dynamics of microbiome biogeography and gaining insights into the role of spatial relationships in host-microbial ecosystem function.

Dr. Gerber to speak at the Nestlé Nutrition Institute Workshop in Rio de Janeiro, Brazil on June 18, 2024

Dr. Gerber will speak at the 101st Nestlé Nutrition Institute workshop – Nutrition, microbiome and health: latest findings and future research.

The NNI and WNSC workshop 101 brought together leading experts for a deep dive into the latest discoveries on the gut microbiome. Among the topics discussed were the ecological patterns occurring in early life, the influence of maternal microbiota and infant feeding, and strategies to modulate the gut microbiome and promote child health. The workshop also explored future research areas such as the microbiome-environment intersection, the infant gut virome, and the use of artificial intelligence.

See a recording of his talk here.

Dr. Gerber to speak at the Festival of Genomics & Biodata in Boston on June 12, 2024

The Festival is designed principally for scientists and clinicians who are working in the fields of genomics, other omics, and/or using biodata, to further research, drug discovery, healthcare and – ultimately – deliver better patient outcomes.

Topics covered include genomics, single cell/spatial biology, multiomics, biodata, cancer research, drug discovery, AI, microbiome, proteomics, liquid biopsy and so much more.

Jennifer Dawkins thesis defense April 30, 2024

Congrats Jen!!

Computational prediction of health status
from the human gut microbiome and metabolome


Jennifer Dawkins
Tuesday, April 30, 2024 – 1:30 PM
MIT E25-119/121 and Zoom
(See below for full information)


A healthy gut microbiome is crucial to overall human well-being. Gut microbiome dysfunction, or dysbiosis, has been implicated in a broad range of diseases, including inflammatory bowel diseases (IBDs), cardiovascular diseases, kidney diseases, metabolic diseases, and gastrointestinal infections like Clostridioides difficile infection (CDI). Often, microbiome-linked illnesses arise after the microbiome is disrupted, such as by antibiotic treatment. However, because the microbiome is so diverse and individual-specific, very little is known about the specific microbial changes that may lead to human disease. Thus, it is extremely difficult to predict whether a given disruption to the microbiome will result in disease.


Of the diseases linked to gut microbial dysfunction, CDI is perhaps the one most prominently associated with dysbiosis. As the most common healthcare-associated infection, CDI is thought to occur when an individual has had both exposure to the C. difficile pathogen and gut dysbiosis caused by a past perturbation, such as antibiotic treatment. Infection recurrence, with an estimated rate of 15.5%, is a particularly insidious problem, and there is currently no reliable method to predict which individuals will experience a recurrence. There is a need for early prediction of CDI after a perturbation, as this can allow physicians to start or restart more effective treatments immediately and prevent further sickness and risk of death.


Current research into the microbiome and microbiome dysbiosis, including CDI, focuses heavily on identifying the microbial taxonomic composition using next generation sequencing. However, there is growing evidence that the gut metabolome may provide crucial information that cannot be gained from microbial composition alone, as metabolites provide the means by which host cells and microbial cells communicate with each other. Predictive analysis is especially useful for uncovering links between metabolic or microbial composition features and host disease state as it models all input covariates simultaneously. However, current predictive methods often fall short when applied to the microbiome, as simpler methods lack the capabilities to model this complex system, whereas highly non-linear “black box” methods lack interpretability. When predicting from biological or medical data with the goals of clinical utility and advancement of scientific knowledge, a model that can explain its decisions is crucial for increasing physician trust and uncovering avenues for future investigation. There is a need for interpretable computational models that can learn non-linear relationships between host outcome and paired microbial composition and metabolomic profiles.

           
This thesis addresses these two challenges. First, we present the analysis of a novel longitudinal study of CDI recurrence in patients, including predictive analyses, which demonstrate that a small set of metabolites can accurately predict future recurrence. Our findings have clinical utility in the development of diagnostic tests and treatments that could ultimately short-circuit the cycle of CDI recurrence. Second, we present a novel predictive model developed specifically for making interpretable predictions on paired microbial composition and untargeted metabolic profiles. We demonstrate our model’s ability to predict a variety of host disease states accurately while providing clear and biologically compelling explanations of its decisions, thereby demonstrating high clinical and scientific utility.


Thesis Supervisor

Georg K. Gerber, MD, PhD

Associate Professor of Pathology, HMS; Member of the Faculty, Harvard-MIT Program in Health Sciences and Technology

Thesis Committee Chair

Emery Brown, MD, PhD

Warren M. Zapol Professor of Anesthesia, HMS, MGH; Edward Hood Taplin Professor of Medical Engineering and of Computational Neuroscience, MIT

Thesis Reader

Eric Alm, PhD

Professor of Biological Engineering, MIT

Thesis Reader

Emily Balskus, PhD

Thomas Dudley Cabot Professor of Chemistry and Chemical Biology, HU; Howard Hughes Medical Institute Investigator