Building MetaPhyler classifier To build MetaPhyler for a particular phylogenetic marker gene G and for length 60bp, we first simulate metagenomic reads from all reference marker genes, and as a negative set, from genomic sequences that do not contain marker genes. We then map these simulated reads against reference gene G using BLASTX. To build a classifier for gene G at a specific taxonomic level, say order, in vector B
we store BLASTX bit scores between gene G and the simulated reads that are from the same order; in vector B
we store bit scores for aligning all other reads against G. We then find the bit score cutoff b
that minimizes Equation 1. Finally, we repeat the previous steps to find bit score cutoffs for simulated reads of other lengths and for other genes.