MetaPhlAn2

MetaPhlAn 2.0

MetaPhlAn2 is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea, Eukaryotes and Viruses) from metagenomic shotgun sequencing data (i.e. not 16S) with species-level resolution. With the newly added StrainPhlAn module, it is now also possible to perform accurate strain-level microbial profiling.

MetaPhlAn2 relies on ~1M unique clade-specific marker genes (the marker information file mpa_v20_m200_marker_info.txt.bz2 can be found here) identified from ~17,000 reference genomes (~13,500 bacterial and archaeal, ~3,500 viral, and ~110 eukaryotic), allowing:

  • unambiguous taxonomic assignments;
  • accurate estimation of organismal relative abundance;
  • species-level resolution for bacteria, archaea, eukaryotes and viruses;
  • strain identification and tracking;
  • orders of magnitude speedups compared to existing methods;
  • metagenomic strain-level population genomics.

For more information on the technical aspects of MetaPhlAn2, see:
User manual || Tutorial || Forum

 

Citation:
If you use MetaPhlAn version 1, please cite:

Metagenomic microbial community profiling using unique clade-specific marker genes. Nicola Segata, Levi Waldron, Annalisa Ballarini, Vagheesh Narasimhan, Olivier Jousson, & Curtis Huttenhower. Nature Methods 9, 811-814 (2012)

If you use MetaPhlAn2, please cite:

MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Duy Tin Truong, Eric A Franzosa, Timothy L Tickle, Matthias Scholz, George Weingart, Edoardo Pasolli, Adrian Tett, Curtis Huttenhower & Nicola Segata. Nature Methods 12, 902-903 (2015)

If you use StrainPhlAn, please cite the MetaPhlAn2 paper and the following StrainPhlAn paper:

Microbial strain-level population structure and genetic diversity from metagenomes. Duy Tin Truong, Adrian Tett, Edoardo Pasolli, Curtis Huttenhower, & Nicola Segata. Genome Research 27:626-638 (2017)

Interested in trying out the latest version? We are in late-stage testing of a new version, MetaPhlAn 3.0.

FEATURES

MetaPhlAn relies on unique clade-specific marker genes identified from ~17,000 reference genomes (~13,500 bacterial and archaeal, ~3,500 viral, and ~110 eukaryotic), allowing:

      • analysis speed of up to 25,000 reads per second on one CPU (orders of magnitude faster than existing methods);
      • unambiguous taxonomic assignments as the MetaPhlAn markers are clade-specific;
      • accurate estimation of organismal relative abundance (in terms of number of cells rather than fraction of reads);
      • species-level resolution for bacteria, archaea, eukaryotes and viruses;
      • extensive validation of the profiling accuracy on several synthetic datasets and on thousands of real metagenomes.
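
To make the distinction between cell-based abundance and fraction of reads concrete, the following is a minimal sketch of a marker-based estimate. It is not MetaPhlAn's actual estimator: the function name, the input dictionaries, and the plain per-clade mean are all assumptions made for illustration. Read counts are divided by marker length to obtain a coverage value, coverage is averaged over each clade's markers, and the clade coverages are normalized to percentages.

```python
from collections import defaultdict


def relative_abundance(marker_hits, marker_info):
    """Estimate per-clade relative abundance (in percent) from marker read counts.

    marker_hits: {marker_id: number of reads mapped to that marker}  (hypothetical input)
    marker_info: {marker_id: (clade_name, marker_length_in_nt)}       (hypothetical input)
    """
    per_clade = defaultdict(list)
    for marker, (clade, length) in marker_info.items():
        reads = marker_hits.get(marker, 0)
        # Length-normalized read count: a rough proxy for marker coverage.
        per_clade[clade].append(reads / length)

    # Plain mean over each clade's markers (an illustrative simplification).
    coverage = {clade: sum(v) / len(v) for clade, v in per_clade.items()}

    total = sum(coverage.values())
    if total == 0:
        return {clade: 0.0 for clade in coverage}
    return {clade: 100.0 * c / total for clade, c in coverage.items()}


# Toy data: clade A has longer markers, so it attracts twice as many reads
# as clade B even though both are covered equally.
marker_info = {
    "m1": ("A", 1000), "m2": ("A", 1000),
    "m3": ("B", 500), "m4": ("B", 500),
}
marker_hits = {"m1": 100, "m2": 100, "m3": 50, "m4": 50}
abundance = relative_abundance(marker_hits, marker_info)
```

In this toy example, clade A receives two thirds of the reads, yet both clades come out at roughly equal abundance once marker length is taken into account.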

Obtaining MetaPhlAn 2.0

Download MetaPhlAn 2.0 (metaphlan2.zip). Review the README or visit the MetaPhlAn 2.0 GitHub repository for information on installing and running the tool.

Synthetic Metagenomes

Synthetic metagenomes can be accessed from here.