MetaPhlAn2 is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea, Eukaryotes and Viruses) from metagenomic shotgun sequencing data (i.e. not 16S) with species-level resolution. With the newly added StrainPhlAn module, it can also perform accurate strain-level microbial profiling.
MetaPhlAn2 relies on ~1M unique clade-specific marker genes (the marker information file mpa_v20_m200_marker_info.txt.bz2 can be found here) identified from ~17,000 reference genomes (~13,500 bacterial and archaeal, ~3,500 viral, and ~110 eukaryotic), allowing:
- unambiguous taxonomic assignments;
- accurate estimation of organismal relative abundance;
- species-level resolution for bacteria, archaea, eukaryotes and viruses;
- strain identification and tracking;
- orders of magnitude speedups compared to existing methods;
- metagenomic strain-level population genomics.
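As a rough illustration of what species-level resolution looks like in practice, the sketch below pulls the species rows out of a taxonomic profile. It assumes a two-column, tab-separated layout (a `|`-joined lineage with `k__` to `s__` prefixes, then a relative abundance in percent) and `#`-prefixed header lines. The function name `species_level` and the sample lines are hypothetical and made up for illustration; check the README for the exact output format of your version.

```python
def species_level(profile_lines):
    """Return {species_name: relative_abundance_percent} from profile lines.

    Assumes each data line is '<lineage>\t<abundance>' where the lineage is
    '|'-joined and its last element carries a rank prefix such as 's__'.
    Lines starting with '#' are treated as headers and skipped.
    """
    species = {}
    for line in profile_lines:
        line = line.rstrip("\n")
        if not line or line.startswith("#"):
            continue
        clade, abundance = line.split("\t")[:2]
        leaf = clade.split("|")[-1]  # deepest rank on this line
        if leaf.startswith("s__"):
            species[leaf[3:]] = float(abundance)
    return species


# Made-up example lines, not real results.
example_profile = [
    "#SampleID\tExample",
    "k__Bacteria\t100.0",
    "k__Bacteria|p__Firmicutes\t60.0",
    "k__Bacteria|p__Firmicutes|c__Bacilli|o__Lactobacillales|f__Streptococcaceae|g__Streptococcus|s__Streptococcus_mitis\t60.0",
    "k__Bacteria|p__Bacteroidetes|c__Bacteroidia|o__Bacteroidales|f__Bacteroidaceae|g__Bacteroides|s__Bacteroides_fragilis\t40.0",
]
example_species = species_level(example_profile)
```

Rows ending at higher ranks (kingdom, phylum, and so on) are ignored, so each species is counted once.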
For more information on the technical aspects of MetaPhlAn2, see:
User manual || Tutorial || Forum
Citation:
If you use MetaPhlAn version 1, please cite:
Metagenomic microbial community profiling using unique clade-specific marker genes. Nicola Segata, Levi Waldron, Annalisa Ballarini, Vagheesh Narasimhan, Olivier Jousson, & Curtis Huttenhower. Nature Methods 9, 811-814 (2012)
If you use MetaPhlAn2, please cite:
MetaPhlAn2 for enhanced metagenomic taxonomic profiling. Duy Tin Truong, Eric A Franzosa, Timothy L Tickle, Matthias Scholz, George Weingart, Edoardo Pasolli, Adrian Tett, Curtis Huttenhower & Nicola Segata. Nature Methods 12, 902-903 (2015)
If you use StrainPhlAn, please cite the MetaPhlAn2 paper and the following StrainPhlAn paper:
Microbial strain-level population structure and genetic diversity from metagenomes. Duy Tin Truong, Adrian Tett, Edoardo Pasolli, Curtis Huttenhower, & Nicola Segata. Genome Research 27:626-638 (2017)
Interested in trying out the latest version? A new version, MetaPhlAn 3.0, is in late-stage testing.
MetaPhlAn relies on unique clade-specific marker genes identified from ~17,000 reference genomes (~13,500 bacterial and archaeal, ~3,500 viral, and ~110 eukaryotic), allowing:
- analysis speed of up to 25,000 reads per second on one CPU (orders of magnitude faster than existing methods);
- unambiguous taxonomic assignments as the MetaPhlAn markers are clade-specific;
- accurate estimation of organismal relative abundance (in terms of number of cells rather than fraction of reads);
- species-level resolution for bacteria, archaea, eukaryotes and viruses;
- extensive validation of the profiling accuracy on several synthetic datasets and on thousands of real metagenomes.
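To see why abundance "in terms of number of cells" differs from a plain fraction of reads, the sketch below normalizes the reads hitting each clade's markers by the total marker length for that clade, then rescales to percentages. This is a simplified illustration under stated assumptions, not MetaPhlAn's actual estimator. The function `relative_abundance`, its input dictionaries, and the example numbers are all hypothetical.

```python
from collections import defaultdict


def relative_abundance(marker_hits, marker_info):
    """Length-normalized relative abundance per clade, in percent.

    marker_hits: {marker_id: number_of_mapped_reads}
    marker_info: {marker_id: (clade_name, marker_length_bp)}
    Both structures are hypothetical stand-ins for illustration.
    """
    reads = defaultdict(int)
    length = defaultdict(int)
    for marker, (clade, nbp) in marker_info.items():
        length[clade] += nbp
        reads[clade] += marker_hits.get(marker, 0)

    # Reads per base of marker sequence: a proxy for genome copies (cells).
    coverage = {c: reads[c] / length[c] for c in length if length[c] > 0}
    total = sum(coverage.values())
    if total == 0:
        return {c: 0.0 for c in coverage}
    return {c: 100.0 * v / total for c, v in coverage.items()}


# Made-up example: clade A has twice the reads but four times the marker length.
example_info = {"m1": ("A", 1000), "m2": ("A", 1000), "m3": ("B", 500)}
example_hits = {"m1": 10, "m2": 10, "m3": 10}
example_abundance = relative_abundance(example_hits, example_info)
```

In this toy input, clade A receives two thirds of the reads, yet after length normalization it accounts for one third of the community, because its markers span four times as many bases as those of clade B.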
Download MetaPhlAn 2.0 (metaphlan2.zip). Review the README or visit the MetaPhlAn 2.0 GitHub repository for installation and usage instructions.
Synthetic metagenomes can be accessed here.