The mitochondrial genomes of Ancylostoma caninum and Bunostomum phlebotomum – two hookworms of animal health and zoonotic importance

Background: Hookworms are blood-feeding nematodes that parasitize the small intestines of many mammals, including humans and cattle. These nematodes are of major socioeconomic importance and cause disease, mainly as a consequence of anaemia (particularly in children or young animals), resulting in impaired development and sometimes death. Studying genetic variability within and among hookworm populations is central to addressing epidemiological and ecological questions, thus assisting in the control of hookworm disease. Mitochondrial (mt) genes are known to provide useful population markers for hookworms, but mt genome sequence data are scant.
Results: The present study characterizes the complete mt genomes of two species of hookworm, Ancylostoma caninum (from dogs) and Bunostomum phlebotomum (from cattle), each sequenced (by 454 technology or primer-walking) following long-PCR amplification from genomic DNA (~20–40 ng) isolated from individual adult worms. These mt genomes were 13717 bp and 13790 bp in size, respectively, and each contained 12 protein coding, 22 transfer RNA and 2 ribosomal RNA genes, typical for other secernentean nematodes. In addition, phylogenetic analysis (by Bayesian inference and maximum likelihood) of concatenated mt protein sequence data sets for 12 nematodes (including Ancylostoma caninum and Bunostomum phlebotomum), representing the Ascaridida, Spirurida and Strongylida, was conducted. The analysis yielded maximum statistical support for the formation of monophyletic clades for each recognized nematode order assessed, except for the Rhabditida.
Conclusion: The mt genomes characterized herein represent a rich source of population genetic markers for epidemiological and ecological studies. The strong statistical support for the construction of phylogenetic clades and consistency between the two different tree-building methods employed indicate the value of using whole mt genome data sets for systematic studies of nematodes.
The grouping of the Spirurida and Ascaridida to the exclusion of the Strongylida was not supported in the present analysis, a finding which conflicts with the current evolutionary hypothesis for the Nematoda based on nuclear ribosomal gene data.


Background
Hookworms (Nematoda: Strongylida: Ancylostomatoidea) are blood-feeding nematodes that inhabit the small intestines of their mammalian hosts. Species of Ancylostoma, Necator, Bunostomum and Globocephalus, for instance, are of major human or animal health significance in various countries [1][2][3][4][5][6]. The infective, third-stage larvae (L3) can be ingested or penetrate the skin of the host and migrate via the circulatory system and the lungs to finally reside, as dioecious adults, usually in the duodenum. The adults attach via their buccal capsule to the intestinal mucosa, rupture capillaries and feed on blood. The pathogenesis of hookworm disease in humans and other animals is mainly a consequence of the blood loss that occurs during parasite attachment and feeding in the intestine. Cutaneous infection can occur and is often associated with inflammatory/immune responses and painful, eruptive lesions during the migration of larvae through the skin [7,8].
Current estimates indicate that more than 740 million people are infected with the hookworms Ancylostoma duodenale and Necator americanus [9], and ~80 million are severely clinically affected by hookworm disease [10]. In a large number of developing countries, hookworms are a leading cause of iron deficiency anaemia, which, in heavy infections, can cause physical and mental retardation and deaths in children as well as adverse maternal-foetal outcomes [10,11]. Although there is considerably less information on the prevalence and geographical distribution of hookworms of animals [7][12][13][14][15], these parasites are also clinically important in dogs (Ancylostoma braziliense, Ancylostoma caninum, Ancylostoma ceylanicum and Uncinaria stenocephala), cats (Ancylostoma tubaeforme), ruminants (Bunostomum phlebotomum, Bunostomum trigonocephalum and Gaigeria pachyscelis), pigs (e.g., Globocephalus urosubulatus) and other hosts [16]. Hookworms were originally thought to be host-specific [17,18]; however, the canine hookworm, Ancylostoma caninum, for example, can infect humans and cause dermatitis and eosinophilic enteritis [19], and some hookworm species, such as the bovine hookworm, Bunostomum phlebotomum, have been linked to cutaneous lesions in humans [20]. Significant genetic variation has been described among individuals of Ancylostoma caninum from dogs in Australia [21]. Such variation might reflect differences in host specificity, infectivity and/or pathogenicity among individual nematodes within a population or, in some cases, might be indicative of speciation events, as has been hypothesized previously for human hookworms [21,22]. Presently, there are no published studies of genetic variation within and among populations of Bunostomum phlebotomum and no molecular data are publicly available for this species.
The ability to accurately identify hookworms to species and to assess genetic variability in hookworm populations is central to studying their epidemiology as well as to diagnosis and control. Sequences of the first and second internal transcribed spacers (ITS-1 and ITS-2) of nuclear ribosomal DNA (rDNA) [23][24][25] and of cAMP-dependent protein kinase [26] have been utilized to identify and differentiate hookworm species. However, the ITS-1 and ITS-2 regions do not usually display sufficient within-species sequence variability to enable the study of the genetic structuring within and among hookworm populations [24]. In contrast, mitochondrial (mt) genomes have been shown to contain useful genetic markers for studying the population structures of hookworm species [27][28][29][30][31], because of their rapid mutation rates and apparent maternal inheritance [32][33][34]. Although the protein-coding mt gene cytochrome c oxidase subunit 1 (cox1) is applicable to population studies of a range of invertebrates, including parasitic platyhelminths [35,36] and some nematodes [37,38], there are still limited sequence data for cox1 and other mt genes of hookworms, and limited published information is available on sequence heterogeneity therein. Building on advances in long polymerase chain reaction (PCR)-based mt genome sequencing [39][40][41], the present study determined the sequences and structures of the two mt genomes from an individual of Ancylostoma caninum (from a dog) from Australia and a specimen of Bunostomum phlebotomum (from a calf) from South Africa. The sequences derived for the mt genomes of these two hookworms were compared in detail with mt genomic data available for the predominant hookworms of humans, Ancylostoma duodenale and Necator americanus [42], as well as those available for other selected species belonging to the orders Strongylida [41], Ascaridida [43][44][45] and Spirurida [46][47][48].

Mitochondrial genome features, characteristics and gene organization
The circular mt genomes of Ancylostoma caninum and Bunostomum phlebotomum, sequenced from single adult worms, were 13717 and 13790 bp in size, respectively ( Figure 1 [49]. This arrangement is characteristic for the mt genomes of all members of the Strongylida and Ascaridida, as well as the free-living nematode Caenorhabditis elegans (Rhabditida), but not for Strongyloides stercoralis (Rhabditida) [41][42][43][44][45][46][47][48][49]. In accordance with other species of Strongylida for which complete mt genome sequences are available [41,42], the AT-rich regions for both Ancylostoma caninum and Bunostomum phlebotomum were located between the genes nad5 and nad6, flanked at the 5'-end by the tRNA gene for alanine, and at the 3'-end by the tRNA genes for proline and alanine.
Each protein-coding gene for each of the two species had an open reading frame (ORF), and all genes were located on the same strand and transcribed in the same direction (5' to 3'), consistent with the known mt genomes of secernentean nematodes [37]. The nucleotide usages (coding strand) of A, C, G and T in each mt genome were 29.0%, 6.5%, 16.1% and 48.5%, respectively, for Ancylostoma caninum (Table 1) and 26.9%, 6.2%, 16.7% and 50.1%, respectively, for Bunostomum phlebotomum (Table 1), with overall A+T contents of 77.5% and 77.0%, respectively. The A+T content of protein coding genes ranged from 70.9% (cox1) to 81.3% (nad6) for Ancylostoma caninum, and from 70.4% (cox1) to 82.6% (nad3) for Bunostomum phlebotomum. The A+T content for rrnS, rrnL (= ribosomal RNA genes) and the AT-rich region were 78.1%, 80.9% and 90.1%, respectively, for Ancylostoma caninum, and 75.2%, 82.4% and 88.0%, respectively, for Bunostomum phlebotomum. For the mt genome of Ancylostoma caninum, codon usage in individual protein coding genes (n = 12) ranged from 0% for CGC (arginine) and CCC (proline) to 15.7% for TTT (phenylalanine). For the mt genome of Bunostomum phlebotomum, codon usage ranged from 0% for CGC (arginine), CAC (histidine), CTC (leucine), CCC (proline), TCC (serine) and GTC (valine) to 15.0% for TTT (phenylalanine). For both species, individual tRNA structures were consistent with those predicted previously for hookworms and other secernentean nematodes [37,42,45,50,51]. All tRNA genes, except trnS(AGN) and trnS(UCN), had a predicted secondary structure containing a TV-replacement loop instead of the TψC arm and loop (not shown). The predicted secondary structure of each of the two serine tRNAs contained the TψC arm and loop but lacked the DHU loop. 
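The nucleotide usage and A+T figures reported above are simple proportions of base counts on the coding strand. The following is a minimal sketch of that arithmetic in Python; the function names and the 20-nt toy fragment are hypothetical and are not taken from the study, and ambiguous bases are assumed to be excluded from the denominator.

```python
from collections import Counter


def base_composition(seq: str) -> dict[str, float]:
    """Return the percentage of A, C, G and T among unambiguous bases."""
    counts = Counter(seq.upper())
    total = sum(counts[base] for base in "ACGT")
    if total == 0:
        raise ValueError("sequence contains no A, C, G or T")
    return {base: 100.0 * counts[base] / total for base in "ACGT"}


def at_content(seq: str) -> float:
    """Return the combined A+T percentage of a sequence."""
    composition = base_composition(seq)
    return composition["A"] + composition["T"]


# Hypothetical toy fragment (not real hookworm sequence): 6 A, 1 C, 2 G, 11 T.
toy_fragment = "ATTTAGTTTACGTTATTTAA"
toy_composition = base_composition(toy_fragment)
toy_at = at_content(toy_fragment)
```

Applied to the reported values, the same sum gives 29.0% + 48.5% = 77.5% for Ancylostoma caninum and 26.9% + 50.1% = 77.0% for Bunostomum phlebotomum, in agreement with the overall A+T contents stated in the text.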
The genes rrnS and rrnL were 694 bp and 935 bp in length, respectively; the predicted secondary structures for the ribosomal RNA gene subunits for Ancylostoma caninum and Bunostomum phlebotomum (not shown) were similar to those of Necator americanus and Ancylostoma duodenale [42], which is also supported by the high nucleotide sequence similarity in the mt genes among these four hookworms (see Tables 2 and 3).
The AT-rich regions for Ancylostoma caninum and Bunostomum phlebotomum were 272 bp and 234 bp, respectively, and both exhibited complex secondary structures (not shown), as predicted previously for the AT-rich regions of nematodes [41,42,45,47,49]. Four AT-repeat regions of variable length were identified in the AT-rich region of the mt genome of Ancylostoma caninum: two were 6 nucleotides (nt) (3 AT-repeats), one was 14 nt (7 AT-repeats) and the longest was 16 nt (8 AT-repeats). Similar dinucleotide repeats have been described in the AT-rich region of the mt genomes of other nematode species (e.g., [41,42,44]). Other repetitive elements have been identified within this region in the free-living nematode Caenorhabditis elegans, the largest and most conspicuous of which are the repetitive sequence motifs CR1-CR6 [45]. However, no such elements were identified in the AT-rich region of the mt genome of Ancylostoma caninum, Bunostomum phlebotomum or any other species of animal-parasitic nematode sequenced to date [41,44,47,49].
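Runs of AT dinucleotides such as those described above can be located with a regular expression. The sketch below is illustrative only: the study does not state how the repeats were identified, the minimum of three repeat units is an assumption chosen to match the shortest repeat reported, and the toy sequence is hypothetical.

```python
import re


def find_at_repeats(seq: str, min_units: int = 3) -> list[tuple[int, int, int]]:
    """Locate runs of consecutive 'AT' dinucleotides.

    Returns a list of (start_index, length_nt, repeat_units), scanning
    left to right and taking the longest run at each start position.
    """
    pattern = re.compile(r"(?:AT){%d,}" % min_units)
    hits = []
    for match in pattern.finditer(seq.upper()):
        length = match.end() - match.start()
        hits.append((match.start(), length, length // 2))
    return hits


# Hypothetical toy region containing runs of 3, 7, 8 and 3 AT units.
toy_region = "GG" + "AT" * 3 + "CC" + "AT" * 7 + "GTG" + "AT" * 8 + "C" + "AT" * 3
toy_hits = find_at_repeats(toy_region)
toy_lengths = [length for _, length, _ in toy_hits]
toy_units = [units for _, _, units in toy_hits]
```

In this toy example the run lengths (6, 14, 16 and 6 nt) mirror the pattern reported for Ancylostoma caninum, where the length in nucleotides is twice the number of AT units.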

Comparative analyses with other nematodes
The identities (%) in inferred amino acid sequences of each protein-coding mt gene were calculated based upon pairwise comparisons between Ancylostoma caninum and Bunostomum phlebotomum (Tables 2 and 3). Based on these comparisons, the sequence identities (in decreasing
Figure 1. A representation of the circular mt genomes of Ancylostoma caninum (13717 bp) and Bunostomum phlebotomum (13790 bp) (GenBank accession numbers FJ483518 and FJ483517, respectively).
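A pairwise percentage identity of the kind referred to in this section can be computed from two aligned amino acid sequences. The following Python sketch is a generic illustration, not the procedure used in the study: the function name is hypothetical, and the decision to skip alignment columns containing a gap is an assumption, since the text does not say how gaps were handled.

```python
def percent_identity(aligned_a: str, aligned_b: str, gap: str = "-") -> float:
    """Percentage of identical residues over gap-free alignment columns."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for res_a, res_b in zip(aligned_a.upper(), aligned_b.upper()):
        if res_a == gap or res_b == gap:
            continue  # assumption: columns with a gap are not scored
        compared += 1
        if res_a == res_b:
            matches += 1
    if compared == 0:
        raise ValueError("no gap-free columns to compare")
    return 100.0 * matches / compared


# Hypothetical aligned peptides: 6 gap-free columns, 5 of them identical.
toy_identity = percent_identity("MKLV-FS", "MKIVAFS")
```

Other conventions (for example, dividing by the full alignment length or by the shorter sequence) give slightly different values, so the denominator should be stated whenever identities are compared between studies.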

Phylogenetic analyses of selected species of Ascaridida, Spirurida and Strongylida using concatenated amino acid sequence data inferred from mt genes
Because of the high degree of intraspecific variation in nucleotide sequence in the mt genes of nematodes [37,38,52] and the limited availability or lack of multiple mt genome sequences for each species, previous work has suggested that phylogenetic analyses for nematodes be conducted using concatenated amino acid sequence datasets, utilizing sequences inferred from individual mt protein coding genes [47]. To further assess systematic relationships within and among members of the Ascaridida, Spirurida and Strongylida, a phylogenetic analysis was carried out using Bayesian inference (BI) and maximum likelihood (ML) (Figure 2). Almost all clades in the consensus tree were supported by maximum BI posterior probability (pp) values (pp = 1.00; expressed as a percentage in Figure 2) and/or ML bootstrap support (100). The phylogenetic analysis conducted herein clearly supports the distinct classification of the orders Ascaridida, Spirurida and Strongylida, each as a monophyletic clade with maximum statistical support. The order Rhabditida appears to be paraphyletic, with Caenorhabditis elegans grouping closely with the Strongylida, and Steinernema carpocapsae and Strongyloides stercoralis placed externally to a clade comprising the Ascaridida, Strongylida and C. elegans. This relationship is consistent with the proposed molecular phylogeny for the Nematoda based on small subunit (18S) nuclear ribosomal DNA data [53]. In addition, the hookworms were represented as a monophyletic clade within the Strongylida.
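Building a concatenated amino acid dataset amounts to joining the per-gene alignments end to end in a fixed gene order for every taxon. The sketch below shows one way to do this in Python; the data layout, the function name, the taxon labels and the choice to pad a missing gene with gap characters are all assumptions made for illustration and are not described in the study.

```python
def concatenate_alignments(
    gene_alignments: dict[str, dict[str, str]],
    gene_order: list[str],
    missing: str = "-",
) -> dict[str, str]:
    """Join per-gene alignments into one concatenated sequence per taxon.

    gene_alignments maps gene name -> {taxon: aligned sequence}.
    A taxon lacking a gene is padded with the 'missing' character.
    """
    taxa = sorted({taxon for gene in gene_order for taxon in gene_alignments[gene]})
    parts: dict[str, list[str]] = {taxon: [] for taxon in taxa}
    for gene in gene_order:
        alignment = gene_alignments[gene]
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: sequences are not aligned to equal length")
        width = lengths.pop()
        for taxon in taxa:
            parts[taxon].append(alignment.get(taxon, missing * width))
    return {taxon: "".join(chunks) for taxon, chunks in parts.items()}


# Hypothetical toy alignments for two genes and three taxon labels.
toy_genes = {
    "cox1": {"Ac": "MKL", "Bp": "MRL"},
    "nad1": {"Ac": "FV", "Bp": "FI", "Na": "FV"},
}
toy_matrix = concatenate_alignments(toy_genes, ["cox1", "nad1"])
```

Keeping the gene order explicit ensures that every taxon contributes homologous columns at the same positions, which is what allows the concatenated matrix to be analysed as a single alignment.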
For hookworms, the phylogenetic analysis using BI indicated a closer relationship between Ancylostoma spp. and Necator americanus than between either of them and Bunostomum phlebotomum. This finding conflicts with the current classification of the Strongylida [16], wherein both Necator and Bunostomum are placed within the subfamily Bunostominae, whereas Ancylostoma is placed within the subfamily Ancylostominae (poorly supported by the ML analysis; bootstrap support = 47). A larger analysis, including mt data for more hookworm species, is needed to test this hypothesis further.
The present phylogenetic analysis did not support the grouping of the Ascaridida and Spirurida to the exclusion of the Strongylida, which contrasts markedly with the results of a previous study based on nuclear ribosomal gene data (e.g., clade III versus clade V; ref. [53]). The "common heritage" hypothesized herein for the Ascaridida and Strongylida to the exclusion of the Spirurida has been supported by previous studies using mt gene order data [49] and using concatenated amino acid sequence data inferred from protein-coding mt genes [38]. These findings stimulate further study of the evolutionary relationships among taxa within this phylum using mt datasets. The high-throughput sequencing potential of 454 technology [54] and the recent validation of this technique for the sequencing of mt genomes [41] should provide a platform for an in-depth analysis of the phylogeny of the Nematoda.
Figure 2. Phylogenetic analysis (using Bayesian inference) of concatenated mt amino acid sequence data inferred from all protein coding mitochondrial genes (n = 12) for 16 secernentean nematodes, including Ancylostoma caninum and Bunostomum phlebotomum (GenBank accession numbers FJ483518 and FJ483517, respectively). The concatenated mitochondrial amino acid sequences of three mermithids were employed as outgroups. Bayesian posterior probability values (as a percentage) and maximum likelihood bootstrap support (n = 100) are indicated above and below the lines, respectively. The scale indicates an estimate of substitutions per site, using the optimized model setting.

Utility of mt gene markers for population genetic, ecological and epidemiological studies of hookworms
Although some nuclear genetic regions (e.g., ITS-1 and ITS-2 of nuclear rDNA [22][23][24][25] or the cAMP-dependent protein kinase gene [26]) have been shown to be suitable for the specific identification and differentiation of hookworms, the nuclear loci examined to date do not usually display sufficient levels of intraspecific sequence variability for the investigation of the genetic structures of hookworm populations (or the identification of population variants or "strains"). The ability to estimate genetic variability within and among hookworm populations is central to studying their epidemiology and population genetics, and can have important practical implications in relation to control.
Sequence-based analyses (including mutation scanning) of protein-coding mt genes, such as cox1 and nad1, have been particularly useful for population genetic studies [21,27][29][30][31][55][56][57][58][59]. For example, Hu et al. [21] employed a single-strand conformation polymorphism (SSCP)-coupled sequencing approach to explore haplotypic variability within a limited number of Ancylostoma caninum specimens from Australia and each of the human hookworms (Ancylostoma duodenale and Necator americanus). Significant population sub-structuring was recorded within each of these three species, and two genetically distinct subpopulations were detected within Ancylostoma caninum from dogs from Townsville, Australia. Previous morphological and clinical studies had shown that Ancylostoma caninum in Townsville (Australia) is not specific to dogs and can also infect humans (but does not complete its life cycle), causing eosinophilic enteritis [19]. It has been speculated [21] that particular, genetically distinct subpopulations within Ancylostoma caninum can selectively infect the non-canine host. The pattern of haplotypic variability within Ancylostoma caninum might be due to secondary contact between populations or subpopulations, which could have arisen due to host movement from other geographical areas where this hookworm has been recorded and where ecological conditions are distinct; for example, Ancylostoma caninum is endemic in tropical north-east Queensland, Australia [60], but also occurs in the north-west area of Western Australia [61]. It is also possible that feral dogs or dingoes (in different geographical or climatic regions) might harbour one or more genetic variants which might "spill over" into domestic dogs and/or humans [60].
Future study of the genetic variation among Ancylostoma caninum specimens from domestic and feral dogs, cats and humans as well as between populations from other geographical and climatic regions in Australia and South-East Asia would allow such questions to be addressed. A comparison of the genetic make-up of Ancylostoma caninum from humans affected by eosinophilic enteritis with those from domestic dogs in the Townsville area would be particularly interesting.
In contrast to Ancylostoma caninum, no studies have yet explored the genetics or molecular epidemiology of Bunostomum phlebotomum. Mitochondrial markers might be used to examine sub-structuring in Bunostomum phlebotomum populations in endemic regions of South Africa. In addition, although there has been anecdotal evidence suggesting that Bunostomum phlebotomum may cause cutaneous larva migrans in humans ([20] and unpublished observations [JVW]), the zoonotic potential of this species of hookworm has not yet been tested molecularly. In view of the lack of distinguishing morphological characters allowing the identification of individual larvae, the provision of molecular markers for Bunostomum phlebotomum might allow the extent of the zoonotic potential of this species to be assessed for the first time.
The two mt genomes characterized herein provide a solid foundation for studies of the epidemiology, ecology and population genetics of both Ancylostoma caninum and Bunostomum phlebotomum, which could have important implications for the control of infections by these parasites. Given the lack of morphological characters for specific identification and differentiation of hookworm larvae, there is a clear need for species and population genetic markers for in-depth exploration of the epidemiology of hookworms [59]. Combined with the use of specific markers in the internal transcribed spacers (ITS-1 and ITS-2) of nuclear rDNA [23][24][25], investigating the mt haplotypic variability in populations of Ancylostoma caninum and Bunostomum phlebotomum (irrespective of developmental stage) could provide important insights into host affiliations, gene flow and transmission patterns (cf. [62,63]) and thus assist in the control of these hookworms. Furthermore, the direct sequencing of the mt genome of Ancylostoma caninum by 454 technology is the second example of the use of this approach for the sequencing of mt genomes of nematodes [41] and reinforces the exciting potential of emerging technologies for the high-throughput sequencing of relatively small organellar genomes.

Parasites and DNA extraction
An adult male of Ancylostoma caninum (designated Ac1) was collected (by IB) at necropsy from the duodenum of a dog from Townsville, Australia [23]. An adult male of Bunostomum phlebotomum (Bp1) was collected (by JvW) at autopsy from the same site from a calf monospecifically infected with an isolate of Bunostomum phlebotomum, originally derived from a Jersey cow in Pretoria North suburb, South Africa. Nematodes were washed in physiological saline, identified morphologically to species [16], fixed in 50% (v/v) ethanol and stored at -20°C until use. Total genomic DNA was isolated from individual worms using sodium dodecyl-sulphate/proteinase K treatment [64], followed by spin-column purification (Wizard Clean-Up, Promega). The specific identity of each nematode was verified using the sequence of the second internal transcribed spacer (ITS-2) of nuclear ribosomal DNA, which provides species-specific genetic markers for hookworms [25]. The ITS-2 sequence derived from sample Ac1 was identical to that reported previously for Ancylostoma caninum (accession number AJ001591) [25], and that obtained from Bp1 (accession number FJ616999) was 82.3% identical to that of the closely related species Bunostomum trigonocephalum (accession number AJ001595) [25].
For BI of amino acid data, tree construction and posterior probabilities (pp) were calculated via 2000000 generations (ngen = 2000000) using the Metropolis-coupled Monte Carlo Markov Chain (MCMCMC) method and four simultaneous tree-building chains (nchains = 4), with every 10th tree being saved (samplefreq = 10). A suitable burnin (burnin = 1000) was chosen using 'Trace' in the program Tracer v1.4 (http://beast.bio.ed.ac.uk/). Evolutionary distance was estimated using the most appropriate amino acid model and calculated employing the MrBayes program (aamodelpr = mixed), allowing for a gamma-shaped variation in mutation rates with a proportion of invariable sites (rates = invgamma). Upon completion of the analysis, a 50% majority-rule consensus tree was constructed in TreeviewX v.0.5.0 (http://darwin.zoology.gla.ac.uk/~rpage/treeviewx/). For the ML analysis using GARLI, tree construction was estimated with the model GTR+I+g using the mtRev amino acid substitution matrix, for two replicate runs, and termination criteria with setting genthresholdfortopoterm = 20000 (no new significantly better scoring topology found in > 20000 generations). Nodal support in the ML analysis was estimated by bootstrap re-sampling (n = 100) using GARLI and the same model settings.
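The BI settings listed above can be collected into a MrBayes command block. The Python sketch below merely assembles such a block from the stated parameter values and works out how many sampled trees remain after burnin; it is an illustrative reconstruction and not the authors' actual input file. It assumes that the burnin value counts sampled trees rather than generations, and the helper names are hypothetical.

```python
def mrbayes_block(
    ngen: int = 2_000_000,
    nchains: int = 4,
    samplefreq: int = 10,
    burnin: int = 1000,
) -> str:
    """Render a MrBayes block from the parameter values given in the text."""
    lines = [
        "begin mrbayes;",
        "  prset aamodelpr=mixed;",  # mixed amino acid model prior
        "  lset rates=invgamma;",    # gamma rate variation plus invariable sites
        f"  mcmc ngen={ngen} nchains={nchains} samplefreq={samplefreq};",
        f"  sumt burnin={burnin};",  # assumption: burnin counted in samples
        "end;",
    ]
    return "\n".join(lines)


def retained_samples(ngen: int, samplefreq: int, burnin: int) -> int:
    """Sampled trees left after burnin, ignoring the initial state."""
    return ngen // samplefreq - burnin


block = mrbayes_block()
kept = retained_samples(2_000_000, 10, 1000)
```

Under these assumptions, 2000000 generations sampled every 10th generation yield 200000 saved trees per run, of which 199000 remain after discarding the first 1000.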