The complete mitochondrial genome of Pseudocellus pearsei (Chelicerata: Ricinulei) and a comparison of mitochondrial gene rearrangements in Arachnida

Background Mitochondrial genomes are widely utilized for phylogenetic and population genetic analyses among animals. In addition to sequence data the mitochondrial gene order and RNA secondary structure data are used in phylogenetic analyses. Arachnid phylogeny is still highly debated and there is a lack of sufficient sequence data for many taxa. Ricinulei (hooded tickspiders) are a morphologically distinct clade of arachnids with uncertain phylogenetic affinities. Results The first complete mitochondrial DNA genome of a member of the Ricinulei, Pseudocellus pearsei (Arachnida: Ricinulei) was sequenced using a PCR-based approach. The mitochondrial genome is a typical circular duplex DNA molecule with a size of 15,099 bp, showing the complete set of genes usually present in bilaterian mitochondrial genomes. Five tRNA genes (trnW, trnY, trnN, trnL(CUN), trnV) show different relative positions compared to other Chelicerata (e.g. Limulus polyphemus, Ixodes spp.). We propose that two events led to this derived gene order: (1) a tandem duplication followed by random deletion and (2) an independent translocation of trnN. Most of the inferred tRNA secondary structures show the common cloverleaf pattern except tRNA-Glu where the TψC-arm is missing. In phylogenetic analyses (maximum likelihood, maximum parsimony, Bayesian inference) using concatenated amino acid and nucleotide sequences of protein-coding genes the basal relationships of arachnid orders remain unresolved. Conclusion Phylogenetic analyses (ML, MP, BI) of arachnid mitochondrial genomes fail to resolve interordinal relationships of Arachnida and remain in a preliminary stage because there is still a lack of mitogenomic data from important taxa such as Opiliones and Pseudoscorpiones. Gene order varies considerably within Arachnida – only eight out of 23 species have retained the putative arthropod ground pattern. Some gene order changes are valuable characters in phylogenetic analysis of intraordinal relationships, e.g. in Acari.


Background
Due to their bacterial origin [1,2] mitochondria have retained a circular DNA double-helix, which in animals is sized between 12-30 kb. This is only a small part of the original bacterial chromosome, the majority was eliminated or transferred to the nucleus [3]. The mitochondrial DNA of Bilateria typically contains 37 genes and one ATrich non-coding part, which putatively bears regulatory elements for transcription and translation and is therefore referred to as the mitochondrial control region [4]. In general the genes encode 13 protein subunits necessary for oxidative phosphorylation (atp6+8, cob, cox1-3, nad1-6 and nad4L), 22 transfer RNAs and two rRNAs (rrnS and rrnL) [5]. Except for the control region, mtDNA possesses only few non-coding sections between genes, even gene overlaps are common. E.g., in many species the last seven nucleotides of atp8 are also the first seven nucleotides of atp6. A similar overlap is often seen on the boundary between nad4 and nad4L. As a consequence, rearrangements in mitochondrial genomes most often disrupt genes and thus are deleterious -a possible reason for the stability of mitochondrial gene order [4].
Mitochondrial genomes have been proven useful for phylogenetic analyses [4]. Nucleotide or amino acid sequences as well as rearrangements in mitochondrial gene order are used as phylogenetic markers [6,7]. Gene rearrangements are considered to be valuable characters, because it is very unlikely that closely related taxa exhibit homoplastic translocations [8]. In addition the secondary structure of encoded tRNAs [9] and changes in the mitochondrial genetic code [10] have also been used as characters in phylogenetic analysis.
Ricinulei (hooded tickspiders) are a small order of arachnids, comprising 3 genera with 55 described species [11][12][13][14]. They are predatory animals that live in humid caves or leaf litter of tropical regions [15][16][17]. Species of Ricinoides occur in West Africa whereas species of Cryptocellus and Pseudocellus live in Central and South America [13]. Ricinuleids have body lengths of 3 to 10 mm [18] and their cuticle is strongly sclerotized and extraordinarily thick [19]. Several peculiarities characterize ricinuleids -a moveable hood (cucullus) in front of the prosoma covering the mouthparts, two jointed chelicerae, chelate pedipalps, elongated second legs, a tarsal copulatory organ on the third pair of legs of adult males, a locking mechanism between pro-and opisthosoma, which can be unlocked during mating and egg-laying, a 6-legged larvae, the lack of distinct eyes, and tracheal respiration [20][21][22][23][24][25]. According to morphological studies and some combined morphological and molecular analyses, ricinuleids are often considered to be the sister group of Acari [26][27][28][29], together forming the Acaromorpha. This clade is characterized by a unique post-embryonic development: a hexapodal larva followed by three octapod nymphal instars [28]. A gnathosoma with medially fused palpal coxae is another unique character of Acaromporpha. However the definition of a "gnathosoma" varies and its presence in Ricinulei is questioned by some authors [e.g. [30]]. The analysis of Shultz [27] obtained seven additional "homoplasious synapomorphies" supporting Acaromorpha (= these characters are not exclusively found in Acari and Ricinulei). Van der Hammen [31,32] however has questioned the monophyly of Acari and placed Ricinulei within the Acari as the sister group to Anactinotrichida (Parasitiformes s. str. + Opilioacariformes) [33]. Dunlop [34] suggested Ricinulei as sister group to the extinct Trigonotarbida which are collectively the sister of the extant Tetrapulmonata (Araneae, Uropygi, Amblypygi).
Because available sequence data from nuclear or mitochondrial genes is very limited for Ricinulei, recent molecular studies of arthropod phylogeny seldom included ricinuleids [e.g. [35]]. With a combined data set (93 morphological characters and 18S + 28S rRNA sequences) Wheeler & Hayashi [36] placed Ricinulei as sister group to Acari. In another combined analysis by Giribet et al. (253 morphological characters, 18S and 28S rRNA, [37]) Ricinulei appear as the sister group of Tetrapulmonata (Araneae + Amblypygi + Uropygi). When fossil taxa were included, a close relationship between the fossil Trigonotarbida and Ricinulei was recovered and both together were the sister group to Tetrapulmonata. This phylogenetic hypothesis corresponds with that of Dunlop [34], but Giribet et al. mentioned the instability of this relationship: In their analysis of only the molecular data set Ricinulei is early branching within the arachnid tree as the sister group to all remaining arachnids [37].
In this study, we present the first complete mitochondrial genome of a member of Ricinulei, the hooded tickspider Pseudocellus pearsei (Chamberlin & Ivie, 1938). The sequence data is used to unveil phylogenetic relationships between Arachnida. Furthermore, the gene order of mitochondrial genomes from all available arachnid species is compared in order to reconstruct the events leading to derived genome arrangements and to evaluate the phylogenetic significance of gene translocations within the Arachnida.

Genome organization, gene order and non-coding parts
The generation of overlapping PCR fragments and subsequent sequencing demonstrated that the mitochondrial genome of P. pearsei is a typical circular DNA molecule with a length of 15099 bp [GenBank:EU024483]. All 37 genes usually present in bilaterian mitochondrial genomes have been identified (Fig. 1, Table 1). With the exception of five translocated tRNAs, gene order is similar to that of the horseshoe crab, Limulus polyphemus, which is considered to represent the putative ground pattern of the Arthropoda [38,39]. The tRNAs (trnW, trnY, trnN, trnL(CUN), trnV) changed their position to a new location between trnM and nad2. In this part of the genome there are five non-coding regions, which are ranging in size from 87 to 250 bp. The three largest non-coding regions are located between rrnS and trnI (250 bp), trnY and trnN (183 bp) and trnV and nad2 (169 bp). The shorter ones are situated between trnI and trnQ (100 bp) and trnM and trnW (87 bp). The longest non-coding region (250 bp) is flanked by rrnSI and trn, and based on its similarity to other arthropods [5,40,41] can be identified as the putative mitochondrial control region. Part of this region is capable of folding into a hairpin-like formation (Fig. 2) with a loop consisting of 10 nucleotides and a stem composed of 21 paired nucleotides (five mismatches). Fur-thermore conserved motifs occur in the flanking sequences around the stem-loop structure: a TATA motif appears in the 5'-flanking sequence whereas the motif GA(A)T is found in the 3'-flanking sequence (Fig. 2). Both motifs are also present in flanking sequences of other arthropods, e.g. in metastriate ticks [42], crustaceans [43], and insects and are presumed to have functional significance in transcription and/or replication [44]. The other larger non-coding regions do not bear similar hairpin-like structures.
Besides the above mentioned regions only smaller noncoding regions of 2-23 bp are found in the mitochondrial genome. Gene overlaps occur between cox1 and cox2 (1 bp), nad4 and nad4L (4 bp), atp6 and atp8 (7 bp), and in four cases tRNA-genes are overlapping each other (Fig. 1,  Table 1).

Protein-coding genes and nucleotide composition
All of the 13 identified protein-coding genes begin with one of the common start codons for mtDNA ATG, ATA or ATT ( Table 1). Out of these 13 protein-coding genes, six show incomplete stop codons (cox2, cob and nad1, 3,4,5).
In cox2 the stop codon is truncated and the gene terminates with TA, in the other five protein-coding genes only a single thymine serves as partial stop codon. Similar structural events have also been described for the mtDNA of other species where post-transcriptional polyadenylation completes a T or TA partial stop codon into a functional one [45].
The CG-skew (= (%C-%G)/(%C+%G)) of mitochondrial genes is a good indicator of the strand specific nucleotide frequency bias [46,47]. In P. pearsei the CG-skew is positive in all (+)strand encoded genes and negative in all (-)strand encoded genes (Table 1). We defined the (+)strand as the strand bearing the majority of coding sequence. The strand specific bias found in P. pearsei occurs in most other arthropods, while a reversal of that bias has been reported for only a few species [43,46,[48][49][50]. Table 2 shows the CG-skews of third codon positions of cox1 for all chelicerates and outgroup taxa included in our phylogenetic analyses. We chose cox1 because this gene is found on (+)strand in all species examined. A reversal of CG-skew is seen in the two scorpions, in three Araneae (but not in Heptathela) and in the parasitiform mite Varroa.

Secondary structure of transfer RNAs
The mitochondrial genome of P. pearsei bears all of the 22 tRNAs commonly found in metazoan mtDNA ( Fig. 1, Table 1). Except for tRNA-Glu, all tRNAs possess the typical cloverleaf secondary structure, though the TψC stem is shortened in several tRNAs (Fig. 3). The TψC-arm of tRNA-Glu is entirely absent. It is shortened to a single pair of nucleotides in tRNA-Met and tRNA-Phe and it is composed of just two paired bases in tRNA-Gly, tRNA-His, tRNA-Ile, tRNA-Leu (CUN), tRNA-Ser (AGN) and tRNA-Thr.

Phylogenetic analysis
We performed phylogenetic analyses with two different data sets: concatenated amino acid and concatenated nucleotide sequences (without third codon positions) from all protein-coding genes. Topologies with best likelihood scores from maximum likelihood (ML) analysis are shown (Fig. 4). Topologies of the 50% majority rule consensus trees from ML bootstrapping and Bayesian Infer-Stem-loop structure in the mitochondrial control region of P. pearsei Figure 2 Stem-loop structure in the mitochondrial control region of P. pearsei. Underlined areas point out conserved motifs in 3'and 5'-flanking sequences.
itochondrial genome map of P. pearsei ence (BI) differ only slightly from the best topologies of ML analysis: BI with the amino acid dataset resulted in a basal split between Oltacola and the remainder of chelicerates including Limulus, but that node is not supported by bootstrapping or BI; in ML bootstrapping with the nucleotide dataset there was no resolution between Ricinulei, Scorpiones, Araneae and Acari (ML bootstrap <50%). In all performed analyses, good support was found for monophyly of Scorpiones, Opisthothelae (= all Araneae except Mesothelae, here represented by Heptathela), Ixodes, Leptotrombidium, Ornithodorinae (Carios + Ornithodoros), Ixodidae (Ixodes + Amblyomma + Haemaphysalis + Rhipicephalus), Metastriata (Amblyomma + Haemaphysalis + Rhipicephalus), and Dermanyssina (Metaseiulus + Varroa). Monophyly of Acari-Parasitiformes was also well supported by most analyses, except for the maximum parsimony (MP) bootstrapping (56%). Monophyly of Acari was recovered in the best topology of the nucleotide ML analysis and moderately supported by ML bootstrapping (80%), but not by BI or MP bootstrapping. Acari and Ricinulei form sister groups in the best ML topology in the analysis of the nucleotide alignment, but this clade found support only by BI (0.98), not by ML or MP bootstrap-ping. In contrast analyses of the amino acid dataset resulted in paraphyletic Acari, as Acariformes (Leptotrombidium) form a clade with Ricinulei, and both together form the sister group to Araneae. Both clades are well supported only by BI (Acariformes + Ricinulei: 1.0; Acariformes + Ricinulei + Araneae: 0.99).
Arachnid monophyly is only supported by ML analysis with the amino acid dataset (bootstrap: 74%). In all other analyses the three basal chelicerate branches -Limulus (Xiphosura), Oltacola (Solifugae), and the remainder of arachnids -are not well resolved. Monophyly of Arachnida excl. Solifugae (in this case Scorpiones + Araneae + Acari + Ricinulei) is well supported in ML and BI analyses of the amino acid alignment and weakly supported by ML analysis of the nucleotide alignment.
The classical, morphology based view of the phylogenetic position of Ricinulei is as the sister group to Acari [26][27][28][29]. Because our analysis does not resolve the basal relationships of arachnid orders well, it is not in conflict with actual morphology based analyses of arachnid interrelationships [27,36]. A sister group relation between Acari Bold numbers indicate higher values in comparison of G and C proportions. Only those taxa (24 chelicerate and 3 outgroup taxa) included in our phylogenetic analysis (Fig. 4) are listed.
Putative secondary structures of mitochondrial tRNAs found in P. pearsei Figure 3 Putative secondary structures of mitochondrial tRNAs found in P. pearsei. All tRNAs can be folded into the usual cloverleaf secondary structure.
Phylogenetic trees of chelicerate relationships, inferred from nucleotide (upper) and amino acid (lower) datasets   The phylogenetic analysis of mitochondrial genome data in chelicerate phylogenetics suffers from several problems. First, the currently incomplete taxon sampling (complete lack of data for Opiliones, Pseudoscorpiones, Palpigradi, Uropygi, and Amblypygi) highlights the preliminary nature of this analysis. While the problem of taxon sampling will be overcome in the near future, other problems lie in the nature of the sequence data itself. Several taxa show a reversed strand bias of nucleotide composition (Table 2), which complicates phylogenetic analyses [46,49]. This may strongly affect the phylogenetic position of Scorpiones in analyses with mitochondrial datsets [51]. Another problem is the heterogeneous substitution rate among arachnids. Our analysis demonstrates great variability in branch lengths, with very long branches for some Acari species (Leptotrombidium, Varroa, Metaseiulus), and very short branches, e.g. from Limulus and Oltacola.

Mitochondrial gene order variation in Arachnida
Apart from five translocated tRNA-genes, the mitochondrial gene order of P. pearsei does not differ from the putative arthropod ground pattern (Figs. 5, 6). A minimum of two events is required to lead to the derived gene order of P. pearsei: a tandem duplication/random deletion event and a single tRNA gene transposition. Tandem duplication and random loss of genes is widely accepted as a mode of genome shuffling in mitochondria [52]. In the present case a segment ranging from trnL to trnY (ground pattern) can be inferred to be involved in such a duplication event. Subsequently the first copy may have lost trnL, trnV, nad1, and trnC, the second copy may have lost rrnL, rrnS, trnI, trnQ, trnM, trnW, and trnY, as well as the control region. Besides the control region (between rrnS and trnI) there are four larger non-coding regions (87-183 bp) present between trnI and nad2 (Fig. 1, Table 1). This is a further hint towards a tandem duplication random deletion event, although there are no apparent sequence homologies to any of the lost genes. In addition, trnN is located in a novel relative position which is best explained by the transposition of this single gene, the second inferred event.
The comparison of mitochondrial gene order from Arachnida reveals a great variation (Fig. 5). Only eight out of 23 species have retained the arthropod ground pattern as represented by Limulus [38]. We present a parsimonious scenario of gene order changes and mapped these events on a phylogenetic tree (Fig. 6). Three different modes leading to a change in gene order were assumed: (1) tandem duplication random deletion events [52], (2) inversions, and (3) transposition of single tRNAs. Although the latter mode is currently not well explained by a molecular model, we have assumed that single tRNA transpositions have occurred in all cases in which tandem duplication and random deletion seems to be an implausible explanation, e.g. when only one tRNA gene found a new position distant from its original position, or on the other strand. The tandem duplication random deletion model was proposed when more than one gene was involved and/or non-coding sequence was found between the genes involved. The tree we have used for mapping the events is the best scoring topology from the ML analysis of the amino acid alignment (Fig. 4, lower tree), because this analysis provided bootstrap support for most of the branches. However, regarding gene order changes there is also no character conflict (= homoplastic characters/ events) with the other topology obtained from the nucleotide alignment.
Several independently derived mitochondrial gene orders are found in Acari. Metastriata (Parasitiformes: Ixodidae) show a derived gene order probably caused by a tandem duplication random deletion event and an additional tRNA translocation (Rhipicephalus [42], Haemaphysalis [53], Amblyomma, [GenBank:NC_005963]). The honeybee mite Varroa shows translocation of six tRNA genes [50]. In addition the reversed arrangement of rrnS and the control region in this species is hypothesized to be the result of a tandem duplication and random deletion event.
Metaseiulus (Parasitiformes: Mesostimata) possesses the most derived mitochondrial gene order among arachnids, probably due to multiple rearrangements [57]. This genome is the largest (25 kb) within the Chelicerata due to the presence of duplicate and triplicate regions. The duplicate region contains 18 genes plus a copy of the control region and partial trnL(UUR) sequence while the triplicate region comprises only the control region sequence and partial trnL(UUR) sequence (Fig. 5). The mitochon-Changes in gene order in mitochondrial genomes of Arachnida compared to the putative ancestral arthropod gene order Figure 5 Changes in gene order in mitochondrial genomes of Arachnida compared to the putative ancestral arthropod gene order. Transfer RNA genes are labelled according to the one letter amino acid code. Genes marked white show the same relative position as in the arthropod ground pattern; genes marked orange have relative positions differing from the arthropod ground pattern; the gene marked black indicates a duplicated rrnL gene in Leptotrombidium pallidum. Horizontal lines above genes illustrate adjacent genes which were probably translocated together; dotted lines indicate regions where tandem duplication and random deletion events may have occurred; connected arrows show adjacent genes which have switched their position, making it difficult to assess which gene was translocated. Braces accentuate the duplicated regions in the mitochondrial genome of Metaseiulus occidentalis. lnr: large non-coding region, putative mitochondrial control region; other non-coding regions (> 50 bp) are illustrated by gaps between genes. Numbers refer to rearrangement events, compare Fig. 6. For GenBank accession numbers see Table 2. Arthropod ground pattern, retained in: Limulus polyphemus (Xiphosura), Heptathela hangzhouensis (Araneae), Ixodes (4 sp., Acari), Ornithodoros (2 sp., Acari), Carios capensis (Acari)

Varroa destructor (Acari)
Habronattus oregonensis, Nephila clavata (Araneae) drial genome of Metaseiulus is also remarkable for the absence of nad3 and nad6, because no other chelicerate is known to have lost any of the protein coding genes. Due to the magnitude of these changes, it is difficult to reliably reconstruct the events leading to the gene order found in Metaseiulus which is unique amongst arachnids.
The same is true for the genus Leptotrombidium (Acariformes) [58,59]. The three species share a common, derived arrangement with secondary rearrangements in L.
pallidum. The secondarily derived gene order of L. pallidum (translocation of trnQ, duplication of rrnL and the presence of four large noncoding regions) is considered to have evolved by a combination of tandem duplication with random deletion plus intergenomic recombination between several genes with subsequent gene conversion [59].
It is apparent that the two taxa showing the most complicated genome rearrangement history (Leptotrombidium Hypothesized events leading to derived gene orders (black squares, compare Fig. 5) and strand bias reversal (red circles, com-pare Table 2) mapped on a topology based on the results of our phylogenetic analysis (compare Fig. 4, amino acid data set) Figure 6 Hypothesized events leading to derived gene orders (black squares, compare Fig. 5) and strand bias reversal (red circles, compare Table 2) mapped on a topology based on the results of our phylogenetic analysis (compare Fig. 4, amino acid data set).  and Metaseiulus) exhibit also the longest branches (= highest substitution rates) in phylogenetic analysis (Fig. 4, 6).

Reversals of strand bias in nucleotide frequency
A reversal of nucleotide strand bias was detected in six species (Table 2), most probably due to three independent events: (1) in the Varroa mite [50], (2) in Scorpiones and (3) in Araneae-Opisthothelae (Ornithoctonus [41], Nephila, [GenBank:NC_008063], Habronattus [54]). These changes are also mapped on the tree in Fig. 6 (characters  2, 4, and 20). The comparatively low substitution rate in the mitochondrial genome of Heptathela (Araneae: Mesothelae; Fig. 6) argues against a re-reversal of nucleotide bias within Araneae and suggests that the reversal is in fact a synapomorphy of the Opisthothelae (Fig. 6).

Conclusion
The first complete mitochondrial sequence of a hooded tickspider (P. pearsei, Ricinulei, Arachnida) reveals a typical circular duplex DNA molecule with a compact gene organisation as found in other bilaterians. In comparison to the putative arthropod ground pattern we observed a derived gene order with five tRNA genes found in different relative positions compared to the gene order of Limulus. Probably two events led to the derived gene order of Pseudocellus: (1) a duplication and random deletion event may be responsible for the translocation of four of these tRNA genes and (2) an additional translocation of the trnN. The putative mitochondrial control region is situated in the same position as in other arthropod mitochondrial genomes. Part of the putative control region can be folded into a characteristic stem-loop structure with conserved flanking sequences as found in other arthropods. All tRNAs, except tRNA-Glu, can be folded into a typical cloverleaf secondary structure.
Aligments of nucleotide and amino acid sequences from mitochondrial protein-coding genes were used in a phylogenetic analysis of arachnid relationships. In the best scoring topologies from ML analyses, Ricinulei appear as sister taxon to Acari (nucleotide alignment) or Acariformes (amino acid alignment), but in both cases without sufficient support from bootstrapping or Bayesian inference. Within the Acari, monophyly of Parasitiformes is well supported, while Acariformes are represented by only a single genus (Leptotrombidium). Monophyly of Acari was only recovered in the analysis of the nucleotide alignment and found strong support by BI and only weak support by ML/MP bootstrapping.
Because of highly divergent substitution rates amongst arachnid species, phylogenetic analyses may be generally biased due to long-branch attraction. Another complicating factor is the reversal of nucleotide bias, which has occurred independently in the arachnid clades Scorpi-ones, Acari (Varroa) and Araneae (Opisthothelae). Due to the lack of support for interordinal relations within Arachnida, the phylogenetic analysis of mitochondrial genome data reveals no strong conflict with recent morphological analyses.
Investigation of mitochondrial gene rearrangements across the range of taxa for which complete mitochondrial genomes are available on GenBank, reveals high variability in gene order within arachnids. Among arachnids, 15 of the 23 species investigated have derived features in gene order. When mapped onto the tree, the hypotheses of events leading to gene order changes are not in conflict with our phylogenetic analyses of sequence data. None of the hypothesized events is useful as a phylogenetic character to resolve interordinal relationships within arachnids, but some of these characters are promising in analyses of intraordinal relationships. Especially in Acari the comparison of mitochondrial genomes from a larger taxon sampling would be promising for the detection of gene order changes, which will be valuable in a phylogenetic analysis. Altogether, further mitogenomic data of a broader taxon sampling are necessary, especially from Opiliones, Pseudoscorpiones, Palpigradi, Uropygi and Amblypygi.

Methods
Samples and DNA extraction P. pearsei specimens were collected by Gerd Alberti in the Gruta Sabac-Ha, near Merida, province of Yucatán, México (20°10'18"N, 89°16'03"W). DNA extraction was done with two legs obtained from a freshly killed animal. We used "DNeasy blood and tissue kit" (Qiagen, Hilden, Germany) according to the manufacturers protocol, except for reduction of the elution volume to 50 µl.

Sequence assemblage and annotation
Primary sequence analysis was performed with the CEQ software (quality check). Sequence assemblage was done with Bioedit 7.0.1 [61]. Protein-coding and ribosomal RNA genes and gene boundaries were identified by BLAST search and in comparison with alignments from other chelicerate species. Genomic position and secondary structure from 20 out of 22 transfer RNAs were identified by tRNA-scan SE [62], the remaining two by eye inspection of the regions under suspect. Sequence data was deposited at NCBI database [GenBank:EU024483].

Phylogenetic analysis
Alignments from all protein-coding genes were used in phylogenetic analysis. Alignments of amino acid sequences and nucleotide sequences were performed with ClustalW [63], as implemented in Bioedit 7.0.1 [61]. Nucleotide sequences were translated to amino acid sequence prior to the multiple alignment and translated back afterwards (making use of the "toggle" function of Bioedit). Sequence data was obtained from 24 chelicerate species and three outgroup taxa (for a list of taxa with GenBank accession numbers see Table 2). Ambiguously aligned regions were omitted using Gblocks ver. 0.91b [64], using default settings, except for changing "allowed gap positions" to "with half". The final amino acid alignment consisted of 2786 amino acids. For the nucleotide dataset the "codons" option was used, so that only complete codons (not single nucleotides) were omitted from the alignment by Gblocks. A saturation analysis [65] was performed with DAMBE 4.2.13 [66] using subsets of the nucleotide dataset representing first, second and third codon positions. Due to significant saturation, third codon positions were omitted from the final nucleotide alignment (6490 nucleotides).
Maximum parsimony (MP) analysis was performed with PAUP* ver. 4.0b10 [67], 1,000 bootstrap replicates were performed, each with 10 replicates with random addition of taxa. Maximum likelihood (ML) analysis was done with Treefinder [68]. Model selection was done with Prot-Test ver. 1.3 [69]. According to the Akaike information criterion, the mtArt+G+I model was optimal for ML analysis. The tree with the best likelihood value was chosen out of 10 independent analyses with random taxon order. Bootstrapping was performed with 100 pseudoreplicates. Bayesian inference (BI) with MrBayes ver. 3.1.2 [70] was done with the model mtRev+G+I, as the mtArt model is not implemented in the current version of MrBayes. Eight chains ran for 1,000,000 generations, while tree sampling was done every 1,000 generations; we examined likelihood values of the sampled trees and decided to omit the first 100 trees as burn-in, while the remaining 900 were used to calculate Bayesian posterior probabilities (BPP).