The complete mitochondrial genome of the sexual oribatid mite Steganacarus magnus: genome rearrangements and loss of tRNAs
© Domes et al. 2008
Received: 14 May 2008
Accepted: 07 November 2008
Published: 07 November 2008
Skip to main content
© Domes et al. 2008
Received: 14 May 2008
Accepted: 07 November 2008
Published: 07 November 2008
Complete mitochondrial (mt) genomes and the gene rearrangements therein are increasingly used as molecular markers for investigating phylogenetic relationships, especially for elucidating deep splits. Contributing to the complete mt genomes of arthropods, especially Arachnida, available so far, we provide the first complete mt genome of a sarcoptiform mite species, the sexually reproducing oribatid mite Steganacarus magnus (Acari, Oribatida) which was determined by sequencing of long PCR products.
The mt genome of S. magnus lacks 16 tRNAs, only those for leucine, histidine, proline, tryptophan, glutamine and serine are present. Within those tRNAs only tRNA-His and tRNA-Pro have kept their original position, the others are translocated. Furthermore, the mt genome of S. magnus consists of 13,818 bp and it is composed of 13 protein-coding genes and two genes for the ribosomal RNA subunits that are typically found in metazoan mt genomes. The gene order in S. magnus differs from the hypothetical ancestral chelicerate arrangement as conserved in Limulus polyphemus: instead of nad1-rrnL-rrnS-LNR-nad2 (tRNAs excluded) S. magnus is nad2-rrnL-nad1-rrnS-LNR. Phylogenetic analyses of a concatenated amino acid dataset of all mt protein-coding genes of 28 arthropod species suggest a sister-group relationship of sarcoptiform and prostigmatid mites (S. magnus and Leptotrombidium).
The mt gene arrangement of S. magnus differs from the hypothetical ground plan of arthropods and from that of other mites further contributing to the variety of mt gene arrangements found in Arachnida. The unexpected lack of tRNAs is enigmatic, probably showing that the loss of mt genes is an ongoing evolutionary process. For solving phylogenetic relationships of oribatid mite lineages and their position within Acari further complete mt genomes are needed.
Mitochondria are maternally inherited cell organelles that contain a circular genome of about 14–19 kb in bilaterian animals; the mitochondrial (mt) DNA in metazoans usually codes for 13 proteins, 22 transfer RNAs (tRNA), two ribosomal RNAs (rRNA; large (rrnL) and small (rrnS) ribosomal subunit) and contains a non-coding control region (LNR) of variable length [1, 2]. The loss of genes in mitochondria is a commonly recognized and ongoing process in eukaryotes . Eukaryotic mt genomes generally contain fewer genes than their free-living bacterial ancestors since the majority of the original mt proteins are now encoded in the nucleus; this is either caused by the transfer of the original mt gene to the nucleus or by the replacement of its function by a preexisting nuclear gene . The protein-coding genes which have been retained in mt genomes are mainly those involved in electron transport and phosphorylation, e.g., cytochrome b (cob) and the cytochrome oxidases (cox1, cox2, cox3) , but their number is variable ranging from three in the malaria parasite Plasmodium falsciparum (Apicomplexa)  to 67 in Reclinomonas americana, the earliest branching aerobic protist (Chlorophyta) .
In addition to gene loss, the positions of genes relative to each other exhibit frequent rearrangement. While the arrangement of mt genes is conserved in some lineages of arthropods [7, 8], it is highly variable in others [9–11]. In particular, the positions of the relatively small genes for tRNAs frequently vary within and among taxa. The arrangement of the hypothetical ancestor of arthropods is conserved in the horseshoe crab Limulus polyphemus , whereas most insect genomes differ from the ancestral state by the location of one tRNA .
Oribatid mites (Acari, Oribatida) are soil-dwelling animals that occur in high numbers in almost all terrestrial ecosystems . A characteristic feature of this group is the unusually high percentage of parthenogenetic taxa (~10% of all species) and the co-occurrence of sexuality and parthenogenesis in the same habitat . Oribatid mite fossils date back at least 360 million years [15, 16] and therefore a number of parthenogenetic lineages of oribatid mites join bdelloid rotifers as "ancient asexual scandals" [17–20]. Since oribatid mites provide insights into the evolution and maintenance of sex, recently much attention has been paid to their phylogeny and radiation [21–24]. However, studies based on single genes such as the ribosomal 18S region (18S), the heat shock protein 82 (hsp82), the elongation factor 1α (ef1α) or cox1 could neither satisfactorily resolve phylogenetic relationships  nor clarify the number of parthenogenetic radiations  nor delineate the age of the group . Since oribatid mites apparently are among the first terrestrial animals (I. Schaefer, R.A. Norton, S. Scheu & M. Maraun, unpubl. data) and species exhibit different evolutionary mutation rates, lineages are probably vulnerable to long-branch attraction in phylogenetic reconstruction. Therefore, further markers, such as gene rearrangements in mt genomes, are needed to resolve phylogenetic relationships of oribatid mite lineages, among mite taxa (Acari) and among chelicerates in general.
Until now no complete mt genome of an oribatid mite species was available although mitochondrial genomes have become invaluable phylogenetic markers during the last few years. Complete mt genome sequences are now known for about 150 arthropods, including 26 chelicerates with 15 species of Acari. These 15 acarine genomes represent ten species of ticks (Ixodida, Parasitiformes), two mesostigmate mites (Mesostigmata, Parasitiformes) and three species of the genus Leptotrombidium (Prostigmata, Acariformes). In contrast to other arthropods and Metazoa in general , the arrangement of the mt genes differs markedly within and among taxa of Acari. While the ancestral state of arthropods is retained in soft- and prostriate-hard ticks [8, 26], there is a major rearrangement shared by all metastriate hard ticks [26, 27], and there are numerous, lineage specific rearrangements in the mesostigmate species, such as Varroa destructor  and Metaseiulus occidentalis , and also in the prostigmate mite genus Leptotrombidium . Since the deep phylogeny of Acari, especially within the large subgroup of Acariformes, is still controversial, synapomorphic rearrangements of mt genes, if present, would likely allow new insights into phylogenetic relationships.
We report the first mt genome for an oribatid mite, Steganacarus magnus (Nicolet, 1855). The mt genome of S. magnus is compared with that of other mites and with basal chelicerates to clarify phylogenetic relationships within Acari and Chelicerata in total. We show that the mt genome of S. magnus is slightly rearranged and lacks 16 tRNA genes.
Rearrangements of genes in mt genomes are useful markers for deep splits within phylogenies  although the relative frequency of rearrangements varies among lineages . For example, Drosophila (Insecta) and Daphnia (Crustacea) share the same mt rearrangement although they diverged 400–500 million years ago  and the mt genome of soft ticks remained unchanged for a similar time period . In contrast, gene arrangements in mt genomes of lice are highly variable [9, 11], and in the mite genus Leptotrombidium they even differ between species . In general, in mt genomes of Acari a number of rearrangements have occurred. While the ground plan is retained in soft- and prostriate-hard ticks , the arrangement of mt genes in Mesostigmata (Parasitiformes), especially Metaseiulus occidentalis is strongly derived. Notably, the Metaseiulus mt genome is the largest within chelicerates, even though nad3 and nad6 were lost, due to the duplication of many of the remaining genes .
Fossils of oribatid mites are known from Devonian sediments and molecular studies suggest that their origin may predate this record by ~180 mya (I. Schaefer, R.A. Norton, S. Scheu & M. Maraun, unpubl. data). Further, oribatids probably diverged from other acariform mites, such as Leptotrombidium, about 570 mya ago (I. Schaefer, R.A. Norton, S. Scheu & M. Maraun, unpubl. data). The mt genomes of Leptotrombidium and S. magnus differ markedly; they only share the gene boundaries cox1-cox2, nad6-cob and atp8-atp6-cox3 (Fig. 1). However, since these boundaries are also present in the ground plan of chelicerate mt genomes there are no derived arrangements supporting the common ancestry of Acariformes.
All 13 protein-coding genes typically present in arthropods could be identified in the mt genome of S. magnus. They start with the mt start codons ATT, ATG, ATA and ATC (Additional File 1). Six genes (cox1, atp6, nad3, nad5, cob, nad2) terminate with incomplete stop codons (T or TA; Additional file 1) while all others terminate with either TAA, TAG or TTT (atp8, cox3, nad4, nad4L, nad6, nad1; Additional file 1). In cox2, which is flanked by two other protein-coding genes (cox1 and atp8), no stop codon is present; as shown for sets of protein-coding genes in Anabrus simplex (Orthoptera, Insecta) a stem-loop formation in the secondary structure of the transcribed polycistronic mRNA probably functions as terminator .
The percentage nucleotide composition of the mt (+)-strand is A = 36.5, C = 13.2, G = 12.2 and T = 38.1. Therefore, there are approximately equal numbers of each complementary nucleotide pairs (A:T, G:C) but a strong AT-bias is present. The pattern for all protein-coding genes is also strongly AT-biased but with a much higher T than A content (Additional File 1). Skews calculated for neutral fourfold degenerate sites do not indicate consistent asymmetric strand bias (Additional File 1). Genes encoded on the (+)-strand show either neutral (atp6, nad2), positive (cox2, nad6) or negative CG-skew (cox1, cox3, nad3, cob). The majority of genes encoded on the (-)-strand are positive CG-skewed (nad5, nad4L, nad1) (Additional File 1). The AT skew at fourfold degenerate sites is only positive for nad2 and nad5, but negative for all other protein-coding genes (Additional File 1).
A reversal of the strand bias is usually explained by an inversion of the control region (LNR) which contains the origin of replication and translation [2, 32]. During replication the two different strands ((-)- and (+)-strand) are exposed to different mutational pressures, typically causing distinct skews since one strand remains longer in the single-stranded state than the other . Therefore, the LNR likely functions as a key region for determining strand bias and an inversion results in a complete reversal of the strand nucleotide composition over time . In S. magnus most genes encoded on the (+)-strand show a negative CG-skew at fourfold degenerate third codon positions which is inverted to the common pattern and probably indicates a reversal of the LNR. On the other hand, the presence of neutral or positive skewed genes may indicate that this reversal is of recent origin and consequently the process of inverting nucleotide skew is not completed so far.
The absence of a distinct strand bias can also be explained by the recent inversion of single genes which homogenize general patterns of asymmetry [33, 34]. However, since no inversions of protein-coding genes were found in S. magnus, the absence of a distinct asymmetrical skew in the genome awaits explanation.
The major non-coding region (LNR), which presumably functions as the mitochondrial control region, is 1019 bp in length and located between rrnS and cox1 (Fig. 1). There are additional non-coding intergenic regions ranging in size from 2–127 bp. These regions were blasted and checked for tRNA genes but could not be assigned to any functional gene.
In the mesostigmate mite M. occidentalis the stem-loop structure which probably represents the OR comprises only AT nucleotides but does not have any similarity to sequences from other chelicerates . The LNR of the mite V. destructor includes several repetitions of a 157-bp motif and eleven sites of potential stem-loop structures have been identified close to it . In the genus Leptotrombidium, the closest relative to S. magnus for which an mt genome is sequenced, the LNR is duplicated (and one is inverted) in L. akamushi and L. deliense and four copies are present in L. pallidum [10, 27]. Although mt LNR possess several distinct structural features (e.g., high AT-content, concerted evolution of tandem repeats, stem-loop structures), their use for evolutionary studies is limited by the high variability of the sequence and the possibility of length heteroplasmy within individuals .
The secondary structures of the rRNA genes in Steganacarus differ from those published for Leptotrombidium  but are more similar to those published for insects [e.g., [38, 39]]. The Leptotrombium rrnS gene lacks helices 1, 2, 4, 5, 7, 8 and 22, and the compound helices 19–20–21 and 39–40–42 as depicted have a very different secondary structure from that found in other arthropods . In contrast, most of the helices found in arthropod rrnS genes are present in Steganacarus, but helices 7 and 8 are absent. However, helix 5 has a large loop such that there is limited difference in sequence length in this region between Steganacarus and arthropods which possess these helices; this is similar to the structure found in Hymenoptera . Helix 16 is greatly shortened relative to other species, consisting of just 2 paired bases compared to up to 8 in Leptotrombidium . The loop regions between helices 38 and 39–40–42 are also greatly reduced consisting of just 4 bp on the 5' side and 2 bp on the 3'; this is in contrast to insects where these loops consist of a dozen or more bases on each side.
Similarly, the rrnL gene of Leptotrombidium entirely lacks domain I (helices B12, B20), helices C1, cd1, D1 and H3 and the structure of the compound helices D17-D18-D19 is unique to Leptotrombidium. The Steganacarus rrnL secondary structure is again more similar to that of other arthropods, domain I is absent but helices C1, cd1 and D1 are present and D17-D18-D19 has a more canonical structure. Helix G3 is greatly reduced, consisting of just 2 paired stem bases and 3 loop bases, relative to both Leptotrombidium, 6 stem and 3 loop bases, and insects, up to 20 stem and 22 loop bases in Manduca . The 3' end of the rrnL molecule is ambiguous, bases 10346–10290 either form the tRNA gene for Serine or they form helices H3 and the 3' side of H2. Without an analysis of the mature transcribed genes it is not possible to determine which form is more likely in the mature molecule. Helix H3 is not present in all arthropod rrnL genes found in Hymenoptera and Lepidoptera but is absent from Coleoptera and Leptotrombidium and its function in the mature rRNA is unclear. Accordingly, we present both possibilities, helices H2 and H3 are included in Figure 3 with the region which potentially forms the tRNA shown in a box, while tRNA-Ser is included in Figure 1 and Additional file 1.
Out of the 22 tRNA genes typically present in arthopods only six are present in S. magnus. Out of these six tRNAs, only two have kept their original position (tRNA-His between nad5 and nad4, tRNA-Pro between nad4L and nad6 with tRNA-Thr missing) while all others are translocated relative to the ground plan. Further, tRNA-Leu overlaps the nad5 gene by four nucleotides and tRNA-Trp overlaps with the 3'-end of cob by 16 nucleotides (Additional File 1); as described above tRNA-Ser presumably forms part of the rrnL gene at the 3'-end (Fig. 3A). Remarkably, although 16 tRNA genes have been lost, the mt genome size of S. magnus (13,818) is comparable to those of Leptotrombidium deliense (13,731) and L. akamushi (13,698) [10, 27] which is due to a larger LNR and more intergenic spacer regions.
The tRNAs for histidine, proline, tryptophan and glutamine lack the TψC-stem and -loop but posses a complete DHU-stem with a smaller loop in tRNA-Trp and a larger in tRNA-Gln (Fig. 4); this lack of the TψC-arm is also present in L. pallidum  and M. occidentalis .
All tRNAs present in S. magnus are shorter than the average length of arthropod tRNAs (about 66 bp) and are highly modified. The loss of the TψC-arm and the replacement by a size-variable loop was first recognized in nematode tRNAs  and can also be found in M. occidentalis , L. pallidum  and other chelicerates including scorpions  and spiders . Further, a study on truncated tRNAs in Arachnida revealed that the tRNAs for proline, histidine and glutamine have experienced the greatest number of independent TψC-arm losses in arachnids whereas TψC-arm loss in genes for arginine, lysine and methione have occurred only once and is synapomorphic for opisthothele spiders . However, arachnids seem to have a compensatory mechanism that allows truncated tRNAs to function during translation and the interaction with the ribosome .
While the two rRNA genes are present in all eukaryotic genomes , the number of tRNA genes varies markedly among taxa. No tRNAs are present in the protists Plasmodium falciparum (Apicomplexa) and Trypanosoma brucei (Kinetoplastida)  but up to 27 are present in Reclinomonas americana (Chlorophyta) . The loss of mt tRNA genes may be facilitated by the fact that the proteins with which they interact during translation in the mitochondria are encoded in the nucleus . Furthermore, the mutation rate of mt encoded tRNA genes is about five-fold higher than that of nuclear genes which experience strong purifying selection [45, 46]. Since tRNAs are encoded in the nucleus anyway and selective pressure favors the reduction of mt genome size , it is likely that S. magnus simply lost its tRNA genes instead of transferring them to the nucleus. However, the differences in the genetic code between nucleus (universal code) and mitochondria (invertebrate mt code) would probably argue for two sets of tRNAs or alternatively some modifications of the tRNA-amino acyl transferases are needed to treat each tRNA isotype differently in the two different compartments.
The newly sequenced S. magnus formed the closest relative of the prostigmate mite genus Leptotrombidium which was expected following previous studies . As in previous studies  the ricinuleid species Pseudocellus pearsei clustered as sister-group of Acariformes but was only supported by Bayesian posterior probabilities (Fig. 5). In contrast to the study of Fahrein et al.  the Acariformes/Ricinulei clade did not form the sister-group of Araneae but of Parasitiformes; however, the support for both possibilities is similar weak [see ].
For a broader investigation of chelicerate phylogeny mt DNA data of key taxa such as Opiliones, Pseudoscorpiones, Palpigradi, Uropygi and Amblypygi are missing. Even for a complete study of Acari phylogeny many taxa remain to be sampled; no complete mt genome is available for Astigmata, Endeostigmata or Opilioacarida.
While using mt DNA for phylogenetic studies, a reversal of nucleotide strand bias or a reversal of nucleotide bias of single genes (caused by gene inversion or the inversion of the control region) can be misleading for phylogenetic relationships by causing long-branch attraction artifacts . Further, since mt genomes evolve at higher rates than the nuclear genome , saturation of the phylogenetic signal can also be problematic in deep split phylogenies. Species exhibiting unusual genomic features such as complicated gene arrangements or multiple control regions as well as species with small body sizes or parasitic lifestyle are also vulnerable to long branches . Collectively, Acari exhibit all these features and so a heavy sampling effort will be necessary to reliably use mt genomics in mite phylogenetic studies.
The first complete mt genome for an oribatid mite (S. magnus, Oribatida, Acariformes) is a typical circular molecule and comprises all protein-coding genes typically present in Metazoa and two genes for the ribosomal subunits. Compared to the putative ground pattern of arthropods the genome is rearranged, affecting the genes nad1, rrnL, rrnS, and nad3. Further, the genome of S. magnus lacks all but six tRNAs, but is comparable in size with genomes of the closely related genus Leptotrobidium; the close relationship of both of these acariform mites was confirmed by phylogenetic analyses using all mt protein-coding genes. Since mt gene arrangements vary strongly among mite species and no full mt genomes are available for Astigmata, Endeostigmata and Opilioacarida, the use of mtDNA rearrangements for phylogenetic studies is limited at present, especially since gene rearrangements in the studied Acari species do not show a distinct phylogenetic pattern. However, the growing number of published genomes and the better understanding of rearranging mechanisms make mt genomes promising markers for resolving phylogenetic relationships of acarine lineages and Chelicerata in general.
Specimens of S. magnus were collected from the Kranichstein forest located about 8 km northeast of Darmstadt, Germany. Animals were extracted from leaf litter by heat using a modified Kempson extractor , preserved in 75% ethanol and stored at -20°C until usage. Total DNA was extracted from single specimens using the DNeasy Tissue kit (Qiagen) following the manufactures' protocol (but final elution of DNA was in 40 μl instead of 200 μl). Polymerase chain reactions (PCR) were performed for the small ribosomal subunit (rrnS), the cytochrome b (cob) and cytochrome oxidase I (cox1) genes using the primers 12SA and 12SB, CB3 and CB4 and COIarch1 and COIarch2, respectively (Additional File 2). PCR reaction mixtures contained 12.5 μl HotStarTaq MasterMix (Qiagen), 0.7 μl of each primer (100 pmol/μl), and 4 μl DNA (unquantified) in a total volume of 25 μl. Amplification conditions included an initial activation step at 95°C for 15 min followed by 34 cycles of 95°C for 45 s; 44°C (rrnS), 50°C (cob) or 51°C (cox1) for 1 min; 72°C for 55 s and a final elongation at 72°C for 10 min. PCR products were visualized on a 1% agarose gel, purified using the QIAquickPCR Purification kit (Qiagen) and directly sequenced by Macrogen (South Korea).
Long PCR amplifications were performed using 1 μl Elongase (Invitrogen), 1 μl buffer A, 4 μl buffer B, 2.5 μl dNTPs (25 mM), 1 μl of each primer (10 mM) and 1–2 μl of DNA (unquantified) with the following conditions: 92°C for 2 min; 40 cycles of 92°C for 30 s, 50°C for 30 s, 68°C for 12 min and a final extension step of 68°C for 20 min. Initial primers for long PCR were designed from the previously obtained sequences of rrnS, cob and cox1 (Additional File 2). PCR products were visualized on 1% agarose gels and purified using Millipore Montage vacuum purification plates. Sequencing was performed using ABI BigDye 3.1 dye terminator technology with an ABI capillary sequencer at the John Curtin Medical School sequencing centre (Australian National University). Cycle sequencing reactions contained 1 μl BigDye, 0.5 μl of the primer (25 mM) and 0.5–1.0 μl template (in a total volume of 3 μl) and amplification conditions were 28 cycles of 94°C for 10 s, 50°C for 5 s and 60°C for 4 min. Within each long PCR product the complete double-stranded sequence was determined by primer walking (a list of all primers is available from the corresponding author upon request). Since PCR amplifications derived from DNA extracted from different specimens the final genome sequence is a consensus from a pool of individuals.
Data were assembled into contigs using Sequencher™ version 4.7 (Gene Codes Corperation 2006). Protein-coding genes (PCG) were identified by the comparison of their amino-acid sequences using the blastx search BLAST algorithm implemented at the NCBI website http://www.ncbi.nlm.nih.gov/ and by eye-comparison with other chelicerate sequences. Annotation of the N- and C-terminal ends of each PCG was checked by comparison with the translated amino acid sequences of homologous mt genes for other chelicerates in MEGA ver. 3.1 ; MEGA ver. 3.1 was also used for nucleotide composition analyses.
tRNA genes were initially identified by tRNAScan-SE  using both the generalized mitochondrial and the specific nematode mitochondrial tRNA settings and by the program ARWEN . Genes found were adjusted by eye to identify structures more similar to those found in other chelicerates [cf. ]. Non-coding regions were also searched by eye for stem-loop motifs which could form part of plausible tRNA-like structures but none were found. While it is possible that additional tRNAs are present but unannotated in the Steganacarus mt genome they cannot be identified at this time.
Candidate rRNA genes were identified by blastn searches and aligned with homologous rRNA genes from other chelicerates and insects. A secondary structure model for each gene was elucidated by comparison to the published rRNA secondary structures for Apis , Manduca  and Leptotrombidium .
For phylogenetic analyses we used the concatenated amino acid dataset of all protein-coding genes previously used by Fahrein et al. ; amino acid sequences were choosen since a more conservative and more unambiguous alignment was possible. Sequences were aligned using the default settings in ClustalX 1.81 ; the gene order was adapted from Fahrein et al. . Since most parts of the aligned amino acid sequences were unambiguous all parts were included in tree construction. Bayesian phylogenetic analysis was performed with MrBayes v3.1.2  using the mitochondrial genetic code for Metazoa and a haploid ploidy level with two independent runs of 1,000,000 generations and four chains each; trees were sampled every 500 generations. A majority consensus tree was generated using a burn-in of 200. Phylogenetic trees were also constructed using neighbour-joining (NJ) based on uncorrected p-distances as implemented in PAUP* 4b10 . Reliability of the branches was ascertained by bootstrap analyses with 10,000 replicates.
We thank M. Rosenberger and I. Schaefer for collecting the specimens and for supporting the DNA extractions. We are further grateful to L. Podsiadlowski and colleagues for providing their protein dataset. This study was funded by the German Research Foundation (KD, MM, SS) and the National Science Foundation (SLC grant # DEB0444972).
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.