The past decade of biotechnology development has allowed us to see this microbial universe in better detail than ever before, using methods like high-throughput DNA sequencing. These and other types of molecular profiling can skip the need to grow every bug of interest individually in the lab. If anything, this has turned a vast pool of unknown-unknowns into a surprising number of known-but-still-unknowns: over two-thirds of the genes in most microbial communities are uncharacterized, often 90% or more, and in the human microbiome alone this leaves millions of microbial proteins interacting in ways that we still don’t understand.
Microbial communities associated with the human body - the human microbiome - are thus of particular interest as a way to better diagnose and treat disease, and as a way to maintain health and wellness. The human microbiome resides primarily in the gut, but communities throughout our body - in the mouth, on the skin, in our nostrils and lungs - influence everything from wound healing to cancer risk. The gut microbiome is particularly interesting as the site of greatest interaction between resident microbes and our immune system, and it most directly influences disease both in the gastrointestinal tract and throughout the body. The gut microbiome can be used to prospectively diagnose colorectal cancer, to predict responses to cancer therapies, to change how we extract nutrients from food or respond to drug treatments, and to fight off systemic diseases like arthritis or diabetes. I personally find microbial communities to be an amazing scientific challenge, and their application to precision medicine in the human microbiome to be one of the greatest opportunities in modern life sciences.
In order to tackle these challenges, the lab works on a variety of areas related to understanding microbial community function and applying it to improve human health:
As part of the Harvard Chan Biostatistics Department, we also emphasize the development of accurate quantitative methods for statistics in microbiome epidemiology. Microbial community data are famously challenging to analyze, due to a combination of properties including measurement error, sparsity / zero inflation, compositionality, heteroscedasticity, and dynamics. While many methods from fields like transcriptome or genetic analysis are almost applicable, they often make subtle mistakes if applied inappropriately – typically inflating false positives, which can be highly misleading for human microbiome applications. This has led us to research on appropriate models for microbiome phenotypic and environmental (e.g. diet or chemical) associations; simulation of microbial communities; microbial community dynamics and ecological interactions; and forensics and identifiability.
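As a toy illustration of compositionality, one of the properties listed above, the following minimal sketch uses made-up counts for four hypothetical taxa. It is not one of the lab's methods; the function names and numbers are assumptions chosen only to show why relative abundances can mislead and why ratios between taxa are more stable.

```python
import math


def to_relative(counts):
    """Convert raw counts to relative abundances that sum to 1."""
    total = sum(counts)
    return [c / total for c in counts]


def log_ratio(abundances, i, j):
    """Log-ratio of taxon i to taxon j (both must be non-zero)."""
    return math.log(abundances[i] / abundances[j])


# Hypothetical counts for taxa A, B, C, D in two samples.
# Only taxon C changes in absolute terms (a tenfold bloom).
# Taxon D is absent from both samples, a small example of sparsity.
sample_1 = [100, 50, 100, 0]
sample_2 = [100, 50, 1000, 0]

rel_1 = to_relative(sample_1)
rel_2 = to_relative(sample_2)

# Taxon A's relative abundance drops even though its count is unchanged,
# because relative abundances are constrained to sum to 1.
print("A relative abundance:", rel_1[0], "->", rel_2[0])

# The ratio of A to B is unaffected by the bloom in C.
print("log(A/B):", log_ratio(rel_1, 0, 1), "->", log_ratio(rel_2, 0, 1))
```

A naive per-taxon test on the relative abundances here would flag taxon A as decreased, which is the kind of false positive that compositionality-aware models are designed to avoid.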
All of these methods come together in the study of the human microbiome in health and disease, with the inflammatory bowel diseases (IBD) being one of the lab’s focus areas both for its own sake, and as a generalizable model of the gut microbiome in complex, chronic disease. IBD, including Crohn’s disease and ulcerative colitis, is highly variable among patients – some have extremely severe disease, some very mild; some respond well to first-line therapies, some do not; disease can manifest anywhere between early childhood and late adulthood; and the degree of microbial and immune changes can vary greatly from person to person. We’ve carried out the first study characterizing the microbial functions underlying ecological changes during IBD, investigated new-onset patients, provided the first genome-wide genetic and transcriptional interaction screens in IBD, and identified specific microbial strains, chemical products, and host-microbial molecular responses disrupted during the disease.
Many of our studies apply techniques for understanding microbial community function to the human microbiome in populations of interest, seeking to determine whether microbiome changes can be used as a biomarker to detect disease status or as a point of intervention to maintain or improve health. This work includes launching the Harvard Chan Center for the Microbiome in Public Health to support trainees and researchers interested in the human microbiome at scale, for conditions as diverse as colorectal cancer, oral health, HIV, inflammatory bowel disease, Parkinson’s disease, arthritis, and diabetes, as well as for diet and nutrition during health.
As in other types of molecular epidemiology, such as genome-wide association studies or transcriptional biomarker discovery, often the best way to use the microbiome for human health is to deeply understand the underlying molecular mechanisms. This has been true in the genomic era for some of the most successful cancer treatments, for example, and we likewise believe that the best way to leverage the microbiome for health is not just by observation, but by validating the specific molecular and ecological interactions that explain why a particular phenotype emerges. To paraphrase Feynman, what we cannot understand, we can only use in precision medicine if we get really, really lucky.
To that end, the lab spends a lot of time creating new computational methods to understand microbial communities at the cellular and molecular level, and publishing free implementations of these in the bioBakery software suite. We put a lot of thought both into developing ways to “look at” microbial communities bioinformatically, and into robust software implementations of these tools that are open source, freely available, and tested and maintained using good software engineering practices. This not only helps us get our own research done reproducibly, but hopefully supports the broader community of microbiome scientists as well.
Major Projects and Resources
The Harvard Chan Center for the Microbiome in Public Health (HCMPH) has the mission of expanding our understanding of the microbiome to improve population health through basic research, translation, policy, education, and outreach. This includes human microbiome contributions to and interactions with chronic disease, basic and infectious disease microbiology, molecular epidemiology, nutrition, environmental health, computational and quantitative methods, and public health policy and best practices.
The HCMPH includes the BIOM-Mass platform for microbiome population studies, which provides end-to-end capabilities for microbiome sample collection, handling, data generation, and analysis. These are facilitated by the Harvard Chan Microbiome Collection Core and the Harvard Chan Microbiome Analysis Core.
The lab’s bioBakery software suite is a free, open-source platform for microbial community analysis, focusing on shotgun meta’omic and integrative multi’omic community profiling. It includes facilities for raw microbial community data handling and quality control, taxonomic and functional analysis, and downstream human and environmental population statistics.
The first phase of the Human Microbiome Project (HMP1) characterized the microbial communities from 300 healthy individuals across several different sites on the human body: nasal passages, oral cavity, skin, gastrointestinal tract, and urogenital tract, resulting in a total of over 5,600 16S rRNA gene amplicon profiles and >2,400 shotgun metagenomes from up to three time points per subject.
Our portion of the second phase of the Human Microbiome Project, the Integrative Human Microbiome Project (HMP2 or iHMP), resulted in the Inflammatory Bowel Disease Multi’omics Database (IBDMDB), which followed 132 subjects from five clinical centers over the course of one year each. Integrated longitudinal molecular profiles for over 1,700 biweekly stool samples, >650 intestinal biopsies, and >500 quarterly blood draws (metagenomes, metatranscriptomes, metaproteomes, viromes, metabolomes, host exomes, epigenomes, transcriptomes, and serological profiles) are available at the IBDMDB data portal, along with protocols and further resources from the study.
The Microbiome Quality Control Project (MBQC) performed a first evaluation of two of the several steps typically used to obtain and analyze human microbiome data. The baseline assessment included contributions from 16 sample handling laboratories and nine bioinformatics laboratories, along with several other groups participating in data analysis and manuscript preparation - all on a much-appreciated volunteer basis! The resulting baseline data include raw sequences, sequence data re-blinded prior to bioinformatics processing, raw OTU tables, and the final integrated data products.
The Human Microbiome Bioactives Resource (HMBR) provides a comprehensive platform for discovery, validation, and early-stage translation of novel therapeutics derived from the microbiome, including protocols for microbiome multi’omic sampling and profiling, meta-analysis methods and results for tens of thousands of standardized gut microbiome profiles, prioritized potentially bioactive elements of the gut microbiome in IBD (microbial species and strains, proteins, secreted peptides, biosynthetic gene clusters, and small molecules), and screens for high-priority bioactives in mammalian cell and tissue culture and in animal models.