MetaPhlAn3

MetaPhlAn 3.0

MetaPhlAn (Metagenomic Phylogenetic Analysis) is a computational tool for profiling the composition of microbial communities from metagenomic shotgun sequencing data. MetaPhlAn relies on unique clade-specific marker genes identified from ~17,000 reference genomes (~13,500 bacterial and archaeal, ~3,500 viral, and ~110 eukaryotic), allowing:

  • analysis speed of up to 25,000 reads per second on one CPU (orders of magnitude faster than existing methods);
  • unambiguous taxonomic assignments, since the MetaPhlAn markers are clade-specific;
  • accurate estimation of organismal relative abundance (in terms of number of cells rather than fraction of reads);
  • species-level resolution for bacteria, archaea, eukaryotes and viruses;
  • extensive validation of the profiling accuracy on several synthetic datasets and on thousands of real metagenomes.
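To illustrate why abundance in terms of cells can differ from the fraction of reads, consider a toy calculation. This is a simplified sketch, not MetaPhlAn's actual estimator: the two organisms, their marker lengths, and their read counts are made-up numbers, and reads divided by marker length is used only as a rough stand-in for per-cell coverage.

```shell
# Toy example with made-up numbers (not real MetaPhlAn output).
# Columns: organism, total marker length (bp), reads mapped to markers.
toy_counts() {
    printf 'organism_A\t1000\t100\n'
    printf 'organism_B\t4000\t400\n'
}

# Compare each organism's share of reads with its share of
# length-normalized coverage (reads per marker base pair).
abundance_table() {
    awk -F '\t' '
        {
            name[NR] = $1
            reads[NR] = $3
            cov[NR] = $3 / $2          # coverage proxy: reads per bp
            total_reads += $3
            total_cov += cov[NR]
        }
        END {
            for (i = 1; i <= NR; i++)
                printf "%s\t%.1f\t%.1f\n", name[i],
                    100 * reads[i] / total_reads,
                    100 * cov[i] / total_cov
        }'
}

# Output columns: organism, % of reads, % after length normalization.
toy_counts | abundance_table
```

In this toy case organism_B attracts 80% of the reads only because its markers are four times longer; after length normalization the two organisms come out equally abundant (50% each).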

For more information on the technical aspects of MetaPhlAn, see:

User manual || Tutorial || Forum

 

Citation:

Francesco Beghini (1), Lauren J McIver (2), Aitor Blanco-Míguez (1), Leonard Dubois (1), Francesco Asnicar (1), Sagun Maharjan (2,3), Ana Mailyan (2,3), Andrew Maltez Thomas (1), Paolo Manghi (1), Mireia Valles-Colomer (1), George Weingart (2,3), Yancong Zhang (2,3), Moreno Zolfo (1), Curtis Huttenhower (2,3), Eric A Franzosa (2,3), Nicola Segata (1,4)

Integrating taxonomic, functional, and strain-level profiling of diverse microbial communities with bioBakery 3

bioRxiv preprint (2020)

1 Department CIBIO, University of Trento, Italy
2 Harvard T. H. Chan School of Public Health, Boston, MA, USA
3 The Broad Institute of MIT and Harvard, Cambridge, MA, USA
4 IEO, European Institute of Oncology IRCCS, Milan, Italy

Major updates in v3.0
  • New MetaPhlAn marker genes extracted with a newer version of ChocoPhlAn based on UniRef
  • Estimation of the fraction of the metagenome composed of unknown microbes with parameter --unknown_estimation
  • Automatic retrieval and installation of the latest MetaPhlAn database with parameter --index latest
  • Virus profiling with --add_viruses
  • Calculation of metagenome size for improved estimation of reads mapped to a given clade
  • Inclusion of NCBI taxonomy ID in the output file
  • CAMI (Taxonomic) Profiling Output Format included
  • Removal of reads with low MAPQ values

Pre-requisites

MetaPhlAn requires Python 3 or newer with the numpy and Biopython libraries installed; pip installs these Python libraries automatically. MetaPhlAn relies on Bowtie2 (version 2.3 or higher) to map reads against marker genes. Check that bowtie2 is present in the system path with execute and read permissions.
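The checks described above can be scripted. The following is a minimal sketch, assuming a POSIX shell and that the interpreter is available as python3; the function name check_prereqs is made up for this example and is not part of MetaPhlAn.

```shell
# Report whether the prerequisites named above are available.
# Prints one line per check and returns non-zero if any is missing.
check_prereqs() {
    status=0

    # bowtie2 must be on PATH and executable.
    if command -v bowtie2 >/dev/null 2>&1; then
        echo "bowtie2: found at $(command -v bowtie2)"
    else
        echo "bowtie2: NOT found on PATH"
        status=1
    fi

    # numpy and Biopython (imported as "Bio") must be importable.
    for module in numpy Bio; do
        if python3 -c "import $module" >/dev/null 2>&1; then
            echo "$module: importable"
        else
            echo "$module: NOT importable"
            status=1
        fi
    done

    return "$status"
}

check_prereqs || echo "Some prerequisites are missing."
```

This sketch does not compare version numbers; running bowtie2 --version shows the installed version so it can be checked against the 2.3 minimum by hand.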

If MetaPhlAn is installed using conda, no pre-requisites are needed.

MetaPhlAn integrates with hclust2 for advanced heatmap plotting and with GraPhlAn for cladogram visualization. If you use these visualization tools, please refer to their own prerequisites.

Installation

The best way to install MetaPhlAn is through conda via the Bioconda channel. If you have not configured your Anaconda installation to fetch packages from Bioconda, please follow these steps to set up the channels.

You can install MetaPhlAn by running

$ conda install -c bioconda python=3.7 metaphlan

To install it from the source code, or for further installation instructions, please see the Installation section of the Wiki.
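Once installed, a first run that exercises some of the v3.0 options listed under "Major updates" might look like the sketch below. The file names are hypothetical, and the --input_type and -o options are not described on this page; they are assumptions to confirm with metaphlan --help. The script runs MetaPhlAn only when both the executable and the input file are present, and otherwise just prints the command.

```shell
# Hypothetical file names; replace with your own data.
input="metagenome.fastq"
output="profiled_metagenome.txt"

# Build the command from the v3.0 options listed under "Major updates".
# --input_type and -o are assumptions not covered on this page;
# confirm them with `metaphlan --help`.
set -- metaphlan "$input" \
    --input_type fastq \
    --index latest \
    --unknown_estimation \
    --add_viruses \
    -o "$output"

# Run only when MetaPhlAn and the input file are both present;
# otherwise just show the command.
if command -v metaphlan >/dev/null 2>&1 && [ -f "$input" ]; then
    "$@"
else
    echo "Would run: $*"
fi
```

Storing the command in the positional parameters keeps each option as a separate word, so file names containing spaces are passed to MetaPhlAn intact.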