MetaPhlAn (Metagenomic Phylogenetic Analysis) is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data. MetaPhlAn relies on unique clade-specific marker genes identified from ~17,000 reference genomes (~13,500 bacterial and archaeal, ~3,500 viral, and ~110 eukaryotic), allowing:
- analysis speed of up to 25,000 reads per second on one CPU (orders of magnitude faster than existing methods);
- unambiguous taxonomic assignments as the MetaPhlAn markers are clade-specific;
- accurate estimation of organismal relative abundance (in terms of number of cells rather than fraction of reads);
- species-level resolution for bacteria, archaea, eukaryotes and viruses;
- extensive validation of the profiling accuracy on several synthetic datasets and on thousands of real metagenomes.
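The difference between abundance "in terms of number of cells" and "fraction of reads" can be illustrated with a toy calculation. This is only a sketch of the general idea of normalizing read counts by marker length, not MetaPhlAn's actual estimator; the clade names, read counts, and marker lengths are made-up values, and both helper functions are hypothetical.

```python
def read_fraction(read_counts):
    """Fraction of mapped reads per clade (favors clades with more marker sequence)."""
    total = sum(read_counts.values())
    return {clade: n / total for clade, n in read_counts.items()}


def cell_based_abundance(read_counts, marker_lengths):
    """Divide reads by total marker length (a coverage proxy), then rescale to sum to 1."""
    coverage = {c: read_counts[c] / marker_lengths[c] for c in read_counts}
    total = sum(coverage.values())
    return {c: cov / total for c, cov in coverage.items()}


# Made-up example: both clades are equally covered, but clade_B has twice as
# much marker sequence and therefore attracts twice as many reads.
reads = {"clade_A": 1000, "clade_B": 2000}
lengths = {"clade_A": 50_000, "clade_B": 100_000}

print(read_fraction(reads))                   # clade_B looks twice as abundant
print(cell_based_abundance(reads, lengths))   # both clades come out equal
```

In this toy case the read fraction overstates clade_B, while the length-normalized estimate assigns both clades the same abundance.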
For more information on the technical aspects, see:
User manual || Tutorial || Forum
Citation:
1 Department CIBIO, University of Trento, Italy
2 Harvard T. H. Chan School of Public Health, Boston, MA, USA
3 The Broad Institute of MIT and Harvard, Cambridge, MA, USA
4 IEO, European Institute of Oncology IRCCS, Milan, Italy
- New MetaPhlAn marker genes extracted with a newer version of ChocoPhlAn based on UniRef
- Estimation of the fraction of the metagenome composed of unknown microbes with the parameter --unknown_estimation
- Automatic retrieval and installation of the latest MetaPhlAn database with the parameter --index latest
- Virus profiling with --add_viruses
- Calculation of metagenome size for improved estimation of reads mapped to a given clade
- Inclusion of NCBI taxonomy ID in the output file
- CAMI (Taxonomic) Profiling Output Format included
- Removal of reads with low MAPQ values
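The parameters named in the list above can be combined in a single call. The sketch below assembles such a command line from Python. The three flags come from the list; the positional input file, `--input_type fastq`, and `-o` reflect an assumed basic invocation, and the file names and the helper `build_metaphlan_command` are hypothetical.

```python
import shlex


def build_metaphlan_command(fastq_path, output_path, unknown_estimation=False,
                            add_viruses=False, index=None):
    """Assemble an argument list for the metaphlan executable (illustrative only)."""
    # Assumed basic invocation: input file, input type, and output file.
    cmd = ["metaphlan", fastq_path, "--input_type", "fastq", "-o", output_path]
    if unknown_estimation:
        cmd.append("--unknown_estimation")
    if add_viruses:
        cmd.append("--add_viruses")
    if index is not None:
        cmd += ["--index", index]
    return cmd


cmd = build_metaphlan_command("sample.fastq", "profile.txt",
                              unknown_estimation=True, add_viruses=True,
                              index="latest")
# Print a shell-safe rendering of the command for inspection.
print(" ".join(shlex.quote(arg) for arg in cmd))

# To actually run it (requires MetaPhlAn and Bowtie2 to be installed):
# import subprocess
# subprocess.run(cmd, check=True)
```

Building the command as a list rather than a single string avoids shell quoting problems when file names contain spaces.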
MetaPhlAn requires Python 3 or newer with the NumPy and Biopython libraries installed; pip installs these Python libraries automatically. MetaPhlAn relies on Bowtie2 (version 2.3 or higher) to map reads against marker genes. Check that bowtie2 is present in the system path with execute and read permissions.
If MetaPhlAn is installed using conda, no prerequisites are needed.
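For a manual (non-conda) setup, the checks described above can be scripted. The following is a minimal sketch using only the Python standard library; `check_prerequisites` is a hypothetical helper, not part of MetaPhlAn, and it does not verify the Bowtie2 version.

```python
import importlib.util
import os
import shutil
import sys


def check_prerequisites(tool="bowtie2", modules=("numpy", "Bio")):
    """Return a dict mapping each prerequisite to True (found) or False (missing)."""
    report = {"python3": sys.version_info >= (3,)}

    # Locate the executable on the system path, then confirm that the
    # current user can both read and execute it.
    path = shutil.which(tool)
    report[tool] = path is not None and os.access(path, os.R_OK | os.X_OK)

    # Biopython is imported as "Bio"; find_spec looks a module up without importing it.
    for name in modules:
        report[name] = importlib.util.find_spec(name) is not None
    return report


if __name__ == "__main__":
    for name, ok in check_prerequisites().items():
        print(f"{name}: {'OK' if ok else 'MISSING'}")
```

Any entry reported as MISSING needs to be installed or added to the system path before running MetaPhlAn.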
MetaPhlAn is integrated with advanced heatmap plotting with hclust2 and cladogram visualization with GraPhlAn. If you use such visualization tools please refer to their prerequisites.
The best way to install MetaPhlAn is through conda via the Bioconda channel. If you have not configured your Anaconda installation to fetch packages from Bioconda, please follow these steps to set up the channels.
You can install MetaPhlAn by running
$ conda install -c bioconda python=3.7 metaphlan
To install it from the source code, and for further installation instructions, please see the Installation section of the Wiki.