DNA barcoding is an identification approach that uses short DNA sequences from a standardized region of the genome as a molecular diagnostic for species identification. Although the approach remains highly controversial (e.g., [1–5]), an increasing number of projects are attempting to barcode diverse eukaryotic species, especially following the launch of the Consortium for the Barcode of Life (CBOL) in 2004. An ideal DNA barcode should allow fast, reliable, automatable, and cost-effective species identification by users with little or no taxonomic experience [7–9]. Identifications are usually made by comparing unknown sequences against the DNA barcodes of known species via distance-based tree construction [7, 10, 11], alignment searching (e.g., BLAST; [12, 13]), or more recently proposed methods such as the characteristic attribute organization system (CAOS), decision theory, and the back-propagation neural network (BP-based species identification).
One of the issues central to the efficacy of DNA barcoding is the selection of a suitable barcode. Interspecific variability in the chosen region should be clearly greater than intraspecific variability, a difference known as the "barcoding gap"; a threshold at which interspecific variation is 10 times the intraspecific variation has been proposed as diagnostic of species-level differences [7, 11, 17]. Additionally, given that DNA barcoding aims to identify species efficiently, the use of a single barcode marker is preferable (cf. the multi-barcode approach applied in plants [18, 19]).
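As a rough illustration of the barcoding gap described above, the following sketch checks two readings of it on lists of pairwise distances. It assumes that the 10-times criterion compares mean interspecific with mean intraspecific distance, and it adds a stricter check (smallest interspecific distance exceeds largest intraspecific distance); the function name `barcoding_gap_summary`, the `factor` parameter, and the distance values are hypothetical and not taken from any cited study.

```python
from statistics import mean


def barcoding_gap_summary(intra, inter, factor=10.0):
    """Summarise two readings of the barcoding gap.

    intra: pairwise distances between individuals of the same species
    inter: pairwise distances between individuals of different species
    factor: assumed multiplier for the threshold rule (10 in the text)
    """
    if not intra or not inter:
        raise ValueError("both distance lists must be non-empty")
    mean_intra = mean(intra)
    mean_inter = mean(inter)
    return {
        "mean_intra": mean_intra,
        "mean_inter": mean_inter,
        # Assumed reading of the threshold: compare the two means.
        "meets_factor": mean_inter >= factor * mean_intra,
        # Stricter reading: the two distributions do not overlap at all.
        "strict_gap": min(inter) > max(intra),
    }


# Made-up distances, for illustration only.
intra = [0.002, 0.004, 0.003]
inter = [0.05, 0.07, 0.06]
summary = barcoding_gap_summary(intra, inter)
print(summary)
```

Reporting both checks is a design choice: a marker can satisfy the mean-based threshold while individual intra- and interspecific distances still overlap, which is the situation that undermines identification in practice.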
A barcode from the mitochondrial (mt) genome should represent the most effective single-locus marker because of its smaller effective population size relative to the nuclear genome, which increases the overall concordance between the gene tree and the underlying species tree [20, 21]. Accordingly, considerable attention has been paid to the mt genome as the source of a barcode locus in animals. The mt genomes of almost all bilaterian animals contain 13 protein-coding genes (PCGs), which encode proteins involved in the oxidative phosphorylation machinery: cytochrome oxidase subunits 1, 2, and 3 (CO1 to CO3); cytochrome b (CytB); NADH dehydrogenase subunits 1, 2, 3, 4, 4L, 5, and 6 (ND1 to ND6, ND4L); and ATPase subunits 6 and 8 (ATP6 and ATP8). The mt genome also contains 2 ribosomal RNA genes (16S and 12S) and 22 transfer RNA genes. One confounding issue with the use of mt genes in any form of molecular systematics or diagnostics is the widespread nuclear integration of mtDNA, which results in nuclear mitochondrial pseudogenes (NUMTs) that could introduce serious ambiguity into DNA barcoding [22, 23]. However, mtDNA still offers several advantages over nuclear DNA: rapid evolution, limited exposure to recombination, lack of introns, and high copy number. These characteristics are important for routine amplification by polymerase chain reaction (PCR) and for use as a molecular marker for lower-level questions [7, 17, 24].
To date, the most widely used DNA barcode locus for animal taxa has been a fragment of approximately 650 bp from the 5' end of CO1, comprising about 40% of the total gene. Although CO1 has long been used in animal molecular systematics, initially there was no compelling a priori reason to focus on this specific gene among the 13 mt PCGs for DNA barcoding. Indeed, Hebert et al. gave no comparison of the utility of CO1 with that of other mt genes. In practice, CO1 has often been used to study relationships among closely related species, or even phylogeographic groupings within species, because of its high level of diversity (e.g., [25, 26]). However, the CO1 fragment initially chosen for barcoding does have the advantage of being flanked by two highly conserved "universal" primer sites for PCR [7, 27, 28], which has been helpful for automating the collection of DNA barcodes from a diverse range of organisms.
There have been cases in which the universal CO1 DNA barcode has been highly successful in species identification. For example, an identification rate of 100% was achieved in a study of 260 species of North American birds. In contrast, a relatively low success rate (< 70%) was achieved in identifying 449 species of flies (Diptera), owing to an extensive overlap between intra- and interspecific variability. Variability between benthic cnidarian species was found to be very low, with 94.1% of species pairs showing a < 2% difference in their DNA sequences. CO1 exhibits significant rate variation within plethodontid salamanders, indicating that genetic distance does not provide a good indication of the time since speciation in this group. Finally, Roe and Sperling found that no single 600 bp region across CO1 and CO2 was optimally informative, and that the universal DNA barcoding region was no better than other regions across these two genes.
Therefore, it is still necessary to search for alternative DNA barcodes to avoid an exclusive reliance on CO1. Given the increasing availability of complete mt genomes from a range of taxa, marker choice is no longer constrained by the accessibility of universal primers. Among the mt genes, the 13 PCGs are potentially better targets for DNA barcoding than the ribosomal RNA genes, which have also been proposed as species-level markers [32, 33], because alignments of PCGs contain fewer insertions and deletions (indels), which can complicate the process of sequence alignment. Several recent studies have each evaluated no more than 4 previously proposed regions as DNA barcodes for amphibians, primates, birds, and other groups [33–37], but the majority of the mt PCGs have never been evaluated for their barcoding utility. Further, the evaluation of alternative barcode regions has focused on groups where CO1 has already been shown to underperform (e.g., [33, 34]) rather than testing whether any other gene may be superior. This stands in contrast to the systematic investigations into the phylogenetic performance (e.g., [31, 38]) or adaptive evolution of most mt PCGs, and to the approach of the fungal barcoding protocol.
Here we present a bioinformatics approach to evaluate the efficacy of each of 12 mt PCGs (ND6 was excluded because it is encoded on the opposing light strand and contains many indels), along with the universal CO1 barcoding region, as potential DNA barcodes for eutherian mammals. For this major animal group, a large number of mt genomes are publicly available, including multiple samples from many species, and the taxonomic system is well defined. Our evaluation of each gene profile includes the following: (1) the number of barcode species recovered in the neighbour-joining (NJ) tree, (2) sequence variability within and between species, (3) resolution to higher taxonomic levels, and (4) best-fit evolutionary model and DNA saturation.
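To make criterion (2) concrete, the sketch below shows one simple way in which within- and between-species variability could be tabulated from aligned sequences of a single gene. It is only an illustration under stated assumptions: it uses the uncorrected p-distance (the proportion of differing sites), which is not necessarily the distance measure used in this study, and the function names, species labels, and toy sequences are hypothetical.

```python
from itertools import combinations


def p_distance(a, b):
    """Uncorrected p-distance between two aligned sequences.

    Sites where either sequence has a gap or an ambiguous base are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    diffs = 0
    for x, y in zip(a.upper(), b.upper()):
        if x in "ACGT" and y in "ACGT":
            compared += 1
            if x != y:
                diffs += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return diffs / compared


def split_distances(records):
    """Partition all pairwise distances into intra- and interspecific lists.

    records: list of (species_label, aligned_sequence) tuples.
    """
    intra, inter = [], []
    for (sp1, s1), (sp2, s2) in combinations(records, 2):
        d = p_distance(s1, s2)
        if sp1 == sp2:
            intra.append(d)
        else:
            inter.append(d)
    return intra, inter


# Toy alignment with hypothetical labels, for illustration only.
records = [
    ("species_A", "ACGTACGTAC"),
    ("species_A", "ACGTACGTAT"),
    ("species_B", "ACGAACGAAT"),
]
intra, inter = split_distances(records)
print("intraspecific:", intra)
print("interspecific:", inter)
```

Running the same tabulation for each gene in turn would give comparable summaries across markers; a model-corrected distance could be substituted for `p_distance` without changing the rest of the sketch.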