
MetaPhlAn is a computational tool for profiling the composition of microbial communities (Bacteria, Archaea and Eukaryotes) from metagenomic shotgun sequencing data (i.e. not 16S) at species-level resolution. With the newly added StrainPhlAn module, it can also perform accurate strain-level microbial profiling.
MetaPhlAn relies on ~1.1M unique clade-specific marker genes (the latest marker information file is mpa_v31_CHOCOPhlAn_201901_marker_info.txt.bz2) identified from ~100,000 reference genomes (~99,500 bacterial and archaeal and ~500 eukaryotic), allowing:
* unambiguous taxonomic assignments;
* accurate estimation of organismal relative abundance;
* species-level resolution for bacteria, archaea, eukaryotes and viruses;
* strain identification and tracking;
* orders of magnitude speedups compared to existing methods;
* metagenomic strain-level population genomics.
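
To give an intuition for how clade-specific markers can yield relative abundances, here is a minimal Python sketch. It is a simplified illustration, not MetaPhlAn's actual implementation: the function name `estimate_relative_abundance`, the input dictionaries, and the normalization (reads per marker base pair, rescaled to sum to 100%) are all assumptions made for this example.

```python
from collections import defaultdict


def estimate_relative_abundance(marker_hits, marker_info):
    """Toy relative-abundance estimate from marker read counts.

    marker_hits: hypothetical mapping of marker id -> number of mapped reads.
    marker_info: hypothetical mapping of marker id -> (clade, marker length in bp).
    Returns a mapping of clade -> relative abundance in percent.
    """
    reads_per_clade = defaultdict(int)
    length_per_clade = defaultdict(int)

    # Each marker belongs to exactly one clade, so every hit is assigned
    # unambiguously to that clade.
    for marker, (clade, length_bp) in marker_info.items():
        reads_per_clade[clade] += marker_hits.get(marker, 0)
        length_per_clade[clade] += length_bp

    # Normalize by total marker length so clades with more or longer
    # markers are not over-counted (assumed normalization).
    coverage = {
        clade: reads_per_clade[clade] / length_per_clade[clade]
        for clade in length_per_clade
        if length_per_clade[clade] > 0
    }

    total = sum(coverage.values())
    if total == 0:
        return {clade: 0.0 for clade in coverage}
    return {clade: 100.0 * value / total for clade, value in coverage.items()}


# Made-up example data: two species with different marker sets.
marker_info = {
    "m1": ("s__A", 1000),
    "m2": ("s__A", 1000),
    "m3": ("s__B", 500),
}
marker_hits = {"m1": 30, "m2": 10, "m3": 10}

profile = estimate_relative_abundance(marker_hits, marker_info)
for clade, percent in sorted(profile.items()):
    print(f"{clade}\t{percent:.2f}")
```

In this toy data set, species A receives four times as many reads as species B but also has four times as much marker sequence, so both end up with the same length-normalized coverage.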