Distinctive mitochondrial genome of Calanoid copepod Calanus sinicus with multiple large non-coding regions and reshuffled gene order: Useful molecular markers for phylogenetic and population studies
© Minxiao et al; licensee BioMed Central Ltd. 2011
Received: 2 July 2010
Accepted: 27 January 2011
Published: 27 January 2011
Copepods are highly diverse and abundant, resulting in extensive ecological radiation in marine ecosystems. Calanus sinicus dominates continental shelf waters in the northwest Pacific Ocean and plays an important role in the local ecosystem by linking primary production to higher trophic levels. A lack of effective molecular markers has hindered phylogenetic and population genetic studies concerning copepods. As they are genome-level informative, mitochondrial DNA sequences can be used as markers for population genetic studies and phylogenetic studies.
The mitochondrial genome of C. sinicus is distinct from other arthropods owing to the concurrence of multiple non-coding regions and a reshuffled gene arrangement. Further particularities in the mitogenome of C. sinicus include low A + T-content, symmetrical nucleotide composition between strands, abbreviated stop codons for several PCGs and extended lengths of the genes atp6 and atp8 relative to other copepods. The monophyletic Copepoda should be placed within the Vericrustacea. The close affinity between Cyclopoida and Poecilostomatoida suggests reassigning the latter as subordinate to the former. Monophyly of Maxillopoda is rejected. Within the alignment of 11 C. sinicus mitogenomes, there are 397 variable sites harbouring three 'hotspot' variable sites and three microsatellite loci.
The occurrence of the circular subgenomic fragment during laboratory assays suggests that special caution should be taken when sequencing mitogenomes using long PCR. Such a phenomenon may provide additional evidence of mitochondrial DNA recombination, which appears to have been a prerequisite for shaping the present mitochondrial profile of C. sinicus during its evolution. The lack of synapomorphic gene arrangements among copepods has cast doubt on the utility of gene order as a useful molecular marker for deep phylogenetic analysis. However, mitochondrial genomic sequences have been valuable markers for resolving phylogenetic issues concerning copepods. The variable site maps of C. sinicus mitogenomes provide a solid foundation for population genetic studies.
Copepods play an important role in the aquatic ecosystem and are highly diverse. They comprise a multitude of taxa including 200 families, 1,650 genera and 11,500 species, although this estimate may represent only 15% of the actual number. Copepods have successfully colonized almost all aquatic regimes and have developed diverse lifestyles. Therefore, phylogenetic studies are required to develop a complete biodiversity inventory of the group, which will make it possible to investigate how copepods have acquired such diversity over time.
Several incompatible classification schemes have been proposed for copepods on the basis of morphological characteristics. Since the incorporation of copepods as a monophyletic group in 1859, phylogenetic studies have focused on the natural relationships between the incorporated orders: Calanoida, Cyclopoida, Gellyelloida, Harpacticoida, Misophrioida, Monstrilloida, Mormonilloida, Platycopioida, Poecilostomatoida and Siphonostomatoida. Dussart (1984) classified Calanoida and Poecilostomatoida together in the lineage Cyclopinidae-Oithonidae-(Poecilostomatoida-Calanoida), while other researchers have classified the Calanoida outside Podoplea, at a relatively basal position [3, 6]. Kabata, Marcotte and Boxshall hypothesised that Poecilostomatoida is the sister group to Cyclopoida. However, other studies have placed Poecilostomatoida and Siphonostomatoida within close phylogenetic affinity [3, 6]. Recently, Boxshall reassigned Poecilostomatoida as a suborder of Cyclopoida. The relationships among copepods and other subgroups of Pancrustacea have yet to be elucidated, with 11 alternative sister group hypotheses being proposed for the taxon. The current ambiguous status of copepod phylogenetic research is due at least in part to the limited diagnostic morphological characteristics, difficulty in assessing morphological homology and a poor fossil record.
In metazoans, the mitochondrial genome is usually a circular, double-stranded DNA molecule (mtDNA) of about 16 kb, although its length can vary from 14 to 48 kb. The gene content is conserved: 37 genes (13 protein-encoding genes, two ribosomal RNA genes and 22 transfer RNA (tRNA) genes) plus one or more non-coding region(s) containing signals for transcription and replication of the mtDNA. Several advantages including accelerated substitution rates, (almost) unambiguous orthology and being genome-level informative [10, 11] have allowed the mitochondrial genome to be widely used for population studies [12, 13], phylogeography [12, 14] and phylogenetic relationships at various taxonomic levels across animal taxa, particularly in arthropods [15–17]. Furthermore, extensive intraspecific polymorphism in the non-coding regions facilitates studies at the population level. However, there is little information concerning the structure and genetic polymorphism of the non-coding regions in crustaceans.
Despite the vast diversity of copepods, few mitochondrial genomes have been charted. Taxon sampling has been biased to certain orders including Harpacticoida: Tigriopus japonicus [18, 19], Tigriopus californicus; Siphonostomatoida: Lepeophtheirus salmonis; and Cyclopoida: Paracyclopina nana. More mitochondrial genomes with increased taxon coverage are required to resolve several issues concerning copepod phylogeny including its phylogenetic position within Pancrustacea and the relationship of its component orders. Calanus sinicus (Copepoda: Calanoida) dominates continental shelf waters in the northwest Pacific Ocean, linking primary production and the larvae and juveniles of fishes. Given its ecological importance, C. sinicus is one of the target species in the China-GLOBEC program. Despite this status, there is little information concerning the population genetics of this species owing to the lack of suitable genetic markers. This study presents a nearly complete mitochondrial genome of C. sinicus, the first from a member of the Calanoida. The gene order of C. sinicus was compared with other copepods to trace the evolution of the mitochondrial genomes in this group. Combining the new mitogenome and previously published mitogenomes from arthropods, a preliminary phylogenetic analysis was carried out to investigate the relationships between several orders in Copepoda and their positions within Pancrustacea. In addition, intraspecific polymorphisms of major loci in 11 C. sinicus mitogenomes from four populations were compared to screen potential markers for population studies.
Results and Discussion
A long-PCR-based genome sequencing protocol was adopted for animal mtDNAs. However, this technique failed to amplify a fragment containing partial non-coding regions and two tRNA genes. Several unknown factors including gene rearrangement, notable base composition bias, an extended length of GC-rich tract, highly repeated regions and stable secondary structures could terminate the movement of the polymerase and therefore complicate the recovery of mitogenomes from copepods using this technique [19, 21, 23].
Mitochondrial genome profile and nucleotide composition of C. sinicus.
2244 - 3382
3383 - 3447
3457 - 3518
3531 - 3593
3594 - 3650
3660 - 5207
5336 - 5671
5751 - 6887
6898 - 7377
7429 - 8082
8083 - 8146
8147 - 9063
9064 - 9126
9131 - 9193
9231 - 10954
10955 - 11017
12788 - 12851
12852 - 12912
12908 - 12971
13002 - 13067
13080 - 13143
13178 - 13240
13246 - 13309
13312 - 13374
13376 - 13439
13439 - 13498
13498 - 13565
13571 - 14275
14338 - 14691
14794 - 15585
16357 - 17658
17663 - 17728
17729 - 18697
18870 - 19031
19034 - 19744
Base Composition and Codon Usage
The H-strand in the C. sinicus mitogenome comprises 32.1% A, 19.1% C, 19.3% G and 29.6% T. As presented in Table 1, the overall A + T content of C. sinicus is relatively low (61.7%) in comparison with other crustaceans, but within the range for copepods, a minimum of 60.4% in T. japonicus to a maximum of 70.8% in P. nana (Additional file 1). The same trend was observed in the protein coding genes (PCGs, 60.3%) and non-coding sequences (58.2%), which were lower than those in the majority of crustaceans. The A + T content of structural RNA genes was much richer, being 72.3% and 72.2% for tRNA and rRNA genes, respectively, which is comparable with other crustaceans.
Metazoan mitogenomes normally bear a clear strand asymmetry in terms of nucleotide composition owing to asymmetric deamination of A and C nucleotides on each strand during replication and/or transcription processes. However, there are approximately equal numbers of each complementary nucleotide pair in C. sinicus. When measured as AT- and GC-skews, (A% - T%)/(A% + T%) and (G% - C%)/(G% + C%), the H-strand composition given above yields an only moderately positive value for the former (0.0405) and a value close to zero for the latter (0.00521). Similar results have been reported in other copepods (Additional file 1; P. nana: AT-skew = -0.0457, GC-skew = 0.0598; S. polycolpus: AT-skew = -0.0389, GC-skew = 0.0102). In contrast to the whole genome, an anti-A skewness was apparent for all PCGs (-0.177; ranging from -0.318 for nad3 to -0.109 for nad5) regardless of the strand on which they were encoded, while adenines were slightly preferred in all rRNA genes (0.0125 for rrnL and 0.0369 for rrnS). As demonstrated in Table 1 and Additional file 1, guanines are over-represented in rRNA and tRNA genes. PCGs show either neutral (cytb and nad1), negative (nad4, nad2, atp8 and atp6) or positive GC-skewness. Interestingly, all negative GC-skewed genes clustered in a gene block carrying the same transcriptional polarization, possibly because of an inversion or transcriptional polarization of the gene block.
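The skew statistics defined above can be recomputed directly from the H-strand composition reported in the previous paragraph. The following is a minimal sketch; the function and variable names are illustrative and are not taken from the study. With these percentages the A/T pair gives about 0.0405 and the G/C pair about 0.0052.

```python
def skew(x: float, y: float) -> float:
    """Return (x - y) / (x + y), the strand-skew statistic."""
    return (x - y) / (x + y)


# H-strand base composition (%) as reported in the text.
composition = {"A": 32.1, "C": 19.1, "G": 19.3, "T": 29.6}

at_content = composition["A"] + composition["T"]
at_skew = skew(composition["A"], composition["T"])
gc_skew = skew(composition["G"], composition["C"])

print(f"A+T content: {at_content:.1f}%")
print(f"AT-skew: {at_skew:.4f}")
print(f"GC-skew: {gc_skew:.5f}")
```

A skew near zero means the two complementary bases are about equally frequent on the strand, which is the symmetry the paragraph describes.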
The pattern of codon usage in the C. sinicus mitogenome was studied (Additional file 2). A preference for AT-rich codons was identified in C. sinicus, as is the case in mitochondrial PCGs of other arthropods. For example, the most frequently used codon is UUU(F) (63 codons per 1000 codons), followed by AUU(I) (54 codons per 1000 codons) and then AUA(M) (49 codons per 1000 codons). Among copepods, the A + T content of the overall mitogenome is highly correlated with the corresponding values in degenerate synonymous sites of protein coding genes (R² = 0.9918). The A + T-content of the 3rd codon positions (62.6%) in C. sinicus, which is only slightly higher than that in T. japonicus, is lower than that in most other crustaceans.
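Codon frequencies "per 1000 codons" and the A + T content of third codon positions are simple tallies over in-frame coding sequence. The sketch below illustrates both on a made-up 15 nt sequence; it is not the software used in the study, and the function names and the toy sequence are assumptions for illustration only.

```python
from collections import Counter


def codons_per_thousand(cds: str) -> dict[str, float]:
    """Count codons of an in-frame coding sequence, scaled per 1000 codons."""
    rna = cds.upper().replace("T", "U")
    usable = len(rna) - len(rna) % 3  # ignore a trailing partial codon
    codons = [rna[i:i + 3] for i in range(0, usable, 3)]
    counts = Counter(codons)
    return {codon: 1000 * n / len(codons) for codon, n in counts.items()}


def third_position_at_percent(cds: str) -> float:
    """Percentage of A or T at third codon positions."""
    dna = cds.upper()
    usable = len(dna) - len(dna) % 3
    thirds = dna[2:usable:3]
    return 100 * sum(base in "AT" for base in thirds) / len(thirds)


# Made-up sequence: codons ATT TTT ATA TTT GGC.
toy_cds = "ATTTTTATATTTGGC"
usage = codons_per_thousand(toy_cds)
print(usage["UUU"])                        # UUU occurs in 2 of 5 codons
print(third_position_at_percent(toy_cds))  # third positions are T, T, A, T, C
```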
Protein coding genes
More than one reasonable start codon can be predicted for several genes. Therefore, start codons were selected from the candidates following criteria to avoid large overlaps between genes and to keep a conserved length with other crustaceans. There are a total of 11,137 nucleotides encoding 13 protein coding genes in C. sinicus, which are at least 147 nt longer than in other copepod mitogenomes. The genes atp6 and atp8 are heavily truncated in other copepods but maintain the regular size in C. sinicus, predominantly contributing to the elongation of mitochondrial PCGs. Each of the 13 protein-coding genes in C. sinicus starts with a typical initiation codon ATD: ATA for cox1, nad1, nad2, nad4 and nad4L, ATT for atp8, cox2, nad3 and nad5-6, and ATG for the remainder. Previous studies have reported several atypical initiation codons for cox1 in arthropods. However, the copepods studied to date possess a regular start codon (ATA) for cox1.
The majority of the 13 protein coding genes terminate with the conventional stop codons TAG or TAA, but nad1 and nad5 have truncated stop codons (TA) adjacent to a downstream tRNA gene. The presence of incomplete stop codons is common in metazoan mitogenomes, and the shortened stop codons are likely to be completed via post-transcriptional polyadenylation.
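The start and stop conventions described above can be checked mechanically. In the sketch below, "ATD" is read with the IUPAC ambiguity code D (A, G or T), and a gene whose length leaves two extra nucleotides "TA" is treated as carrying a truncated stop that polyadenylation would complete to TAA. The helper names and the toy sequences are hypothetical and serve only as an illustration.

```python
import re

# IUPAC code D stands for A, G or T, so ATD covers ATA, ATG and ATT.
START_ATD = re.compile(r"^AT[AGT]")


def has_atd_start(gene: str) -> bool:
    """True if the sense-strand DNA sequence begins with ATA, ATG or ATT."""
    return START_ATD.match(gene.upper()) is not None


def describe_stop(gene: str) -> str:
    """Classify the 3' end of a coding sequence (sense-strand DNA)."""
    gene = gene.upper()
    remainder = len(gene) % 3
    if remainder == 0 and gene[-3:] in ("TAA", "TAG"):
        return "complete " + gene[-3:]
    if remainder == 2 and gene.endswith("TA"):
        return "truncated TA"
    return "no recognisable stop"


# Made-up toy sequences, not real C. sinicus genes.
print(has_atd_start("ATTGGCTAA"), describe_stop("ATTGGCTAA"))
print(has_atd_start("ATAGGCTA"), describe_stop("ATAGGCTA"))
```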
Genetic divergence of the mitochondrial genes among five copepods and 11 individuals of C. sinicus.
To examine the evolutionary forces acting on the mitochondrial PCGs of copepods, rates of non-synonymous substitution (dN) versus synonymous substitution (dS) were determined. The observed dN/dS ratios (Table 2) were consistently lower than one, ranging from 0.0026 for cytb to 0.0926 for nad1. This indicates strong purifying selection within this lineage. Values of dN/dS for genes of sparse polymorphism (cox1-3, cytb) were generally lower, in agreement with the idea that highly divergent genes are normally subjected to less selective pressure.
Ribosomal RNA genes
Compared with the insect D. yakuba, several compound helixes degenerate into a single one in the crustacean rrnS secondary structures. Both crustaceans lack helixes 8, 12, 39 and 41 whereas the counterparts are present in D. yakuba. All helixes present in D. pulex are shared by C. sinicus with the exception of helix 1. However, most loops and linking sequences between helixes are somewhat reduced, leading to a much shorter rrnS in C. sinicus. The alignment of rrnS genes for copepods indicates that 5' sequences upstream of helix 32 are more variable.
In terms of rrnL, upstream sequences of helix C1 were too ambiguous to align. High diversity in this region has been reported in several species [33–35], where some or all helixes are truncated [33, 35]. Helix G13, present in D. yakuba, is absent in C. sinicus and D. pulex. In addition, the compound Helix D13/D14 is replaced by one stem-loop, and Helix H3 is absent in C. sinicus. The greatest sequence conservation was at the 3' end from Helix G2 to H2, consistent with the idea that this region is the main component of the transferase centre.
Transfer RNA genes
Complete cloverleaf structures containing the TΨC stem (mostly 3-5 bp) and loop (3-7 nt), variable loop, anti-codon stem (5 bp) and loop (7 nt), DHU (mostly 3-4 bp) stem and loop (highly variable from 3 to 10 nt), and the acceptor stem (7 bp), can be predicted for 19 tRNAs, whereas the DHU arm is absent in trnS2 (UCN). In addition, the DHU arm for another trnS1 (AGN) is highly reduced, leaving a short stem (2 nt) and a small loop (3 nt). Degenerative or unpaired DHU arms in trnS are considered to be a common condition in arthropods , and particularly in copepods [18, 20]. As for other arthropods, the anti-codon is preceded by a uracil and followed by a purine in C. sinicus. Deviating from the canonical mitochondrial tRNAs with four nucleotides in the variable loops, 5 nt variable loops were identified in trnI and trnS1 (AGN), and 3 nt variable loops were identified in trnS2 (UCN).
Within the fragment determined, there were 6,270 bp of non-coding sequence in total (approximately 30% of the complete sequence) distributed among 23 intergenic regions. Six long non-coding regions larger than 100 bp were identified between atp6 and rrnL (LNR1; >2,959 bp; not sequenced completely), cox1 and nad4L (LNR2; 128 bp), trnH and trnA (LNR3; 1,770 bp), nad3 and cox3 (LNR4; 102 bp), cox3 and nad4 (LNR5; 762 bp), and nad2 and atp8 (LNR6; 172 bp). Six additional non-coding regions larger than 30 bp were discovered. The mitogenome of C. sinicus is one of the largest among arthropods owing to the prevalence and enlargement of non-coding regions. The concurrence of numerous large non-coding regions is unusual. Because of deletional bias, large inactive regions tend to be eliminated from mitogenomes so that they become economical. Intergenic spacers are normally limited in number and size. As far as crustaceans are concerned, most mitochondrial genomes reported so far possess a single long non-coding region. Exceptions to this include Speleonectes tulumensis, Hutchinsoniella macracantha and Geothelphusa dehaani. The largest non-coding sequences other than control regions (CRs) are usually smaller than 40 bp. To elucidate the origin of multiple non-coding regions, BLAST analysis was conducted on LNRs. With the exception of LNR2, in which a 26 bp stretch similar to other crustacean rrnS was detected, the BLAST analysis revealed that the LNRs of C. sinicus shared no significant similarities to any known sequences. Therefore, independent origins and evolutionary processes are likely to have given rise to the various non-coding regions.
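Intergenic spacer lengths follow from the coordinates of neighbouring features: with 1-based inclusive positions, the gap between two adjacent features is the next start minus the previous end minus one, and a negative value indicates an overlap. The sketch below applies this to a few consecutive position ranges copied from the genome profile listing; which genes these ranges belong to is not stated here, so they are indexed only by order, and the function names are illustrative. The 1,770 bp and 128 bp gaps agree with the lengths reported for LNR3 and LNR2.

```python
def gaps_between(features: list[tuple[int, int]]) -> list[int]:
    """Gap lengths between consecutive (start, end) features.

    Coordinates are 1-based and inclusive; a negative gap means overlap.
    """
    return [
        nxt_start - prev_end - 1
        for (_, prev_end), (nxt_start, _) in zip(features, features[1:])
    ]


def long_noncoding(features: list[tuple[int, int]], minimum: int = 100) -> list[int]:
    """Gap lengths larger than `minimum` bp."""
    return [gap for gap in gaps_between(features) if gap > minimum]


# Consecutive rows copied from the position listing of the genome profile.
block_a = [(9231, 10954), (10955, 11017), (12788, 12851),
           (12852, 12912), (12908, 12971), (13002, 13067)]
block_b = [(3660, 5207), (5336, 5671)]

print(gaps_between(block_a))    # includes one 5 bp overlap (-5)
print(long_noncoding(block_a))  # the 1,770 bp spacer
print(long_noncoding(block_b))  # the 128 bp spacer
```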
Generally, large non-coding sequences act as control regions to initiate and/or regulate mitochondrial transcription and replication. However, functions of multiple heterologous non-coding regions are difficult to predict. AT-richness is broadly accepted as a characteristic for the identification of CRs. However, various exceptions have been reported, and they appear to be common in copepods. Of five copepods, three possess equal (L. salmonis) or lower (T. japonicus and T. californicus) A + T-contents in their control regions. Relatively lower AT-contents are present in C. sinicus LNRs with the exception of LNR2 (68.0%). Although conserved sequence blocks (CSBs) are common in control regions of metazoans [26, 42], such conserved blocks were not detected among copepods. Therefore, the control regions were screened on the basis of the secondary structure motifs.
Several secondary structure motifs commonly found in control regions of arthropods [42–45] were identified in LNR3 including: (1) a poly-T stretch 360 bp to the 3' end of LNR3; (2) a hairpin structure (Additional file 3) on the L-strand 140 bp downstream of the poly-T stretch; (3) conserved sequences at the lateral ends of the hairpin structure; and (4) a microsatellite locus following the hairpin structure with "AT" as the core repeat (14-28 repetitions). These motifs make LNR3 the most likely candidate for the mitochondrial control region. However, hairpin structures were identified in other LNRs, which could be related to the mode of regulation of replication and transcription. Considering the extreme complexity of the non-coding sequences in C. sinicus, more comparative and functional analyses are required to elucidate their exact roles during mitochondrial metabolism.
Mitochondrial gene order
In accordance with the high rate of sequence divergence between lineages, scrambled gene orders were observed among copepods. Large-scale gene rearrangements were identified within the family Calanidae . Owing to the small size and special secondary structures, tRNA genes are more mobile, and account for most translocations in crustaceans [25, 34]. To avoid confusion initiated by reversal translocations of tRNA genes, the discussion of gene order is restricted to the protein-coding and rRNA genes (Figure 6). Complete reshuffling can be found in all copepod mitogenomes, leading to a divergent pattern of gene order in this group.
Translocations involving protein coding or rRNA genes are rare in metazoans . Such dramatic rearrangement of genes in copepods challenges the view of conservation of mitochondrial gene order, as suggested by unusual gene translocations in molluscs [48, 49]. When pairwise gene orders of copepods are compared, there are few common intervals (0 to 40) indicated by results from CREx ; exceptions are two siphonostomatoid species belonging to the family Caligidae (Additional file 4). No conserved synteny is shared by the copepod samples studied here, questioning their homologous status. The similarities of gene order within the family Calanidae and between the orders were compared, with no significant differences being identified. Therefore, the phylogenetic signal may be diluted by frequent gene rearrangements within the lineage. The lack of unambiguous synapomorphic gene arrangements in copepods precludes their use in phylogenetic analysis concerning Copepoda.
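The notion of a common interval can be made concrete with a small example: a set of genes forms a common interval of two gene orders if the genes are contiguous in both. The sketch below counts such intervals of size two or more for two linear, unsigned gene orders. It is a simplified illustration under stated assumptions, not a reimplementation of CREx: strand and circularity are ignored, CREx's own counting conventions may differ, and the gene orders shown are made up.

```python
def count_common_intervals(order_a: list[str], order_b: list[str]) -> int:
    """Count gene sets (size >= 2) that are contiguous in both orders."""
    pos_in_b = {gene: i for i, gene in enumerate(order_b)}
    n = len(order_a)
    count = 0
    for i in range(n):
        lo = hi = pos_in_b[order_a[i]]
        for j in range(i + 1, n):
            p = pos_in_b[order_a[j]]
            lo, hi = min(lo, p), max(hi, p)
            # order_a[i..j] is contiguous in order_b exactly when its
            # positions there span j - i steps.
            if hi - lo == j - i:
                count += 1
    return count


# Made-up gene orders for illustration.
reference = ["cox1", "cox2", "atp8", "atp6"]
shuffled = ["cox2", "atp6", "cox1", "atp8"]

print(count_common_intervals(reference, reference))  # every interval is shared
print(count_common_intervals(reference, shuffled))   # only the full gene set
```

Identical orders share every interval, whereas a thoroughly shuffled order shares only the trivial interval containing all genes, which is the pattern of low similarity described for copepods.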
With regard to the rearrangement of mitochondrial genes, two major categories of mechanisms have been advanced: (1) tandem duplication followed by random or non-random deletion of excess genes [51, 52]; and (2) non-homologous recombination [53, 54]. The first scenario is improbable in view of the presence of inversion or the absence of a conserved synteny. Consequently, involvement of non-homologous recombination, which can invoke translocation and inversion, may be required. To date, there is no direct evidence of mitochondrial DNA recombination in copepods. However, new evidence supporting recombination is emerging in invertebrates including molluscs , nematodes  and arthropods . Furthermore, the problematic circular subgenomic fragment identified in C. sinicus may provide additional insights concerning mitochondrial DNA recombination in copepods.
Technical problems during laboratory work
During the experiments an 18.6 kb DNA sequence was amplified, with reverse complementarity in its 505 bp long lateral ends. Such a covalent sequence, which represents a circular DNA molecule, is normally taken as evidence that a mitochondrial genome has been sequenced completely. However, the 4.5 kb fragment at the 3' end of the sequence was incomplete, with several mitochondrial elements being absent (see Methods section for details). Extended sequences were successfully determined with step-out PCRs, and verified by long PCR. Therefore, the circular molecule was confirmed to be a sub-genomic fragment nested within the complete mtDNA (circular subgenomic fragment). The occurrence of the problematic circular subgenomic fragment could be explained by the following scenarios. First, the 4.5 kb fragment could be nuclear copies of mitochondrial fragments (NUMTs). NUMTs are normally composed of fragmented copies shorter than 4 kb [59, 60]. Pseudogenes are another well-known characteristic of NUMTs. However, sequences of the coding genes in this problematic fragment compare favourably with the counterparts obtained from the cDNA templates. Indel and nonsense mutations were not present. Therefore, because the sequences of coding genes in the 4.5 kb fragment are almost identical to those obtained from reverse transcriptase PCR (unpublished data), with only three substitutions in the 1372 nucleotides being compared, this first scenario is unlikely.
Second, the 4.5 kb fragment could be an artefact of PCR jumping when site-specific lesions exist or initial copies in the template are few . Fresh materials were used for the amplification to reduce the possibility of PCR jumping as breaks in the template would give rise to the bouncing of primers. Unfortunately, abnormal nucleotides or stable stem-and-loop structures in the unidentified regions may have acted in a similar manner to breaks, causing the extending primer to jump to another template during PCR.
Finally, the circular subgenomic fragment could be the product of mitochondrial DNA recombination. Recombination is normally absent in mitochondrial genomes of metazoans, but convincing evidence for this process has emerged [56, 58, 62, 63]. A defined feature of recombination is the breakage and rejoining of participating DNA strands . According to Lunt's recombination model, subgenomic circular molecules with the same gene organizations but smaller in size could be generated during recombination. The erroneous fragment mentioned above is consistent with Lunt's model , suggesting that recombination occurred. The results highlight the possibility of mitochondrial DNA recombination in copepods.
Such puzzles may be common to all copepod studies and caution should be applied when using long PCR technology to define complete mitochondrial genomes. Additional long PCRs are required to confirm whether the mitochondrial genome sequence is complete.
The position of Copepoda within Pancrustacea is still unclear; the present analyses produced conflicting results using different methods. The uncertainty regarding the position of Copepoda within Pancrustacea is in part due to heterogeneity in evolutionary rates and nucleotide compositions within the lineage, and may be exacerbated by the derived nature of copepod mitochondrial sequences. Considering that copepods possess notably biased nucleotide compositional profiles, the recovery of the phylogenetic signal may be impeded by lineage-specific compositional heterogeneity. However, exclusion of strand-biased amino acids does not change the relative position of Copepoda (Figure 8), indicating that heterogeneous nucleotide composition may not play a key role in misleading phylogenetic analysis. Nevertheless, accelerated substitution rates of copepods do mask and erode phylogenetic signals by attracting long-branched taxa together. One example concerns the previously well-accepted monophyletic group Branchiopoda [27, 65], which was resolved as polyphyletic by clustering A. franciscana with copepods in the original and balanced dataset under the mtArt model. However, analyses performed with only characters carrying a moderate evolutionary rate or with the CAT + mtArt model, which has been confirmed as an effective model to overcome the effects of Long Branch Attraction (LBA), consistently resolved a monophyletic Branchiopoda clade. Consequently, a possible LBA artefact could be introduced by the accelerated rate of evolution of the mitochondrial genomes of the sampled copepods and branchiopod. Therefore, the clustering of A. franciscana and copepods was regarded as a phylogenetic artefact due to LBA rather than a sister grouping.
As for the position of Copepoda, four possible sister groups have been proposed in the present study (Figure 8): (1) Oligostraca (including Ostracoda, Pentastomida and Branchiura) or (2) Oligostraca plus Remipedia under the mtArt model, and (3) Branchiopoda or (4) Branchiopoda plus Malacostraca under the CAT + mtArt model. It should be noted that, judging by branch lengths, Copepoda, Oligostraca and Remipedia are rapidly evolving lineages. A decrease in support (PP = 0.37) for the close affinity of Oligostraca and Copepoda was observed in mtArt trees with the moderate-rate sites. Therefore, their grouping may be misleading owing to an LBA artefact, as the mtArt model is vulnerable to LBA artefacts. Consequently, although only moderately supported (PP from 0.66 to 0.76), the results obtained with the CAT + mtArt model, in which Copepoda was clustered in the monophyletic clade Vericrustacea, which joins Malacostraca, Branchiopoda, Thecostraca and Copepoda, are accepted in the present study. These results are compatible with a previous phylogenetic analysis based on nuclear protein-coding sequences.
The relationships among pancrustacean lineages are highly unstable, but several interesting findings resolved using various methods are noted. A monophyletic origin of Pancrustacea (Figures 7 and 8) is strongly supported by the current analyses, as recovered by a number of molecular studies on the basis of sequence data [16, 65, 70] or mitochondrial gene orders. In accordance with other mitochondrial studies, monophyly of Hexapoda and Crustacea was rejected in the present study, although relationships among the component lineages remain unstable. However, it is noteworthy that the affinity of Insecta and Collembola was resolved under the model of CAT + mtArt, while the grouping of Collembola with Diplura was recovered under the model of mtArt using only moderate-rate sites. These results prevent the rejection of the monophyly of Hexapoda on the basis of mitochondrial genomic data alone. Contradictory conclusions from mitochondrial sequences and nuclear sequences [69, 70, 73] mean that the monophyly of Hexapoda and that of Crustacea remain open questions. The traditional but controversial Maxillopoda was resolved as paraphyletic/polyphyletic in the present analyses, where it can be sub-divided into three groups: Copepoda, Cirripedia and Pentastomida plus Branchiura. This division of Maxillopoda is in agreement with recent studies based on combined data of 18S rRNA and two mitochondrial markers, and on nuclear protein-coding genes.
Intraspecific sequence variability and its utility for population genetic analysis
Non-coding regions are most variable, followed by PCGs and rRNA genes, with tRNA genes being the most conserved (Table 2). No deletions or insertions were identified in the coding regions. Of the 191 nucleotide substitutions, 50.2% were identified as parsimony informative. A distinct bias for nucleotide transitions over transversions was evident, with 87% substitutions being transitions.
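The transition/transversion bias mentioned above rests on a simple classification: a substitution between two purines (A, G) or between two pyrimidines (C, T) is a transition, and any purine-pyrimidine exchange is a transversion. The sketch below classifies differences between two aligned sequences. It is a pairwise simplification of the 11-sequence comparison described in the text, and the sequences are made up.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def count_ts_tv(seq_a: str, seq_b: str) -> tuple[int, int]:
    """Return (transitions, transversions) between two aligned sequences.

    Columns containing gaps or ambiguous bases are skipped.
    """
    transitions = transversions = 0
    for a, b in zip(seq_a.upper(), seq_b.upper()):
        if a == b or a not in "ACGT" or b not in "ACGT":
            continue
        if {a, b} <= PURINES or {a, b} <= PYRIMIDINES:
            transitions += 1
        else:
            transversions += 1
    return transitions, transversions


# Made-up aligned sequences with one gapped column.
ts, tv = count_ts_tv("ACGTACGT-A", "GCATATGATA")
print(ts, tv, f"{100 * ts / (ts + tv):.0f}% transitions")
```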
Of the 55 SNPs in PCGs, only 18 can give rise to amino acid substitutions. The majority of the SNPs (62.1%) are located in wobble codon positions. Only 14.0% were identified in second positions, all of which lead to amino acid substitutions. Among PCGs, nad6 is most divergent (0.00333). As presented in Table 2, intraspecific dN/dS ratios were either zero or at relatively high levels, ranging from 0.157 for atp6 to 0.914 for nad4. The overall intraspecific dN/dS ratio was 4.14 times larger than the counterpart between species, which is compatible with a nearly neutral model in which most amino acid substitutions are slightly deleterious .
In terms of the 11 nucleotide substitutions in structural RNA genes, seven were situated in stems and the remainder in loops. Three substitutions in stems induced deterioration of stem connections. However, they were all present at lateral margins, indicating little influence on the overall secondary structures. As mentioned above, such mis-pairing is considered to be restored during transcription.
In addition to substitution and indel variations, three microsatellite motifs were identified at regions from bases 11492 to 11504, 11645 to 11756 and 13066 to 13095. "TCC" unit is the core repeat for the first motif, while "TA" acts as a core repeat for the others. The second microsatellite is most variable with repeat numbers from 10 to 56. With additional subtle sequence variations within the motif, the 11 sequenced microsatellite loci can be sub-divided into seven alleles.
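Repeat numbers at a microsatellite locus can be read from a sequence by locating the longest uninterrupted run of the core unit. The sketch below does this with a regular expression; the function name and the toy sequences are assumptions for illustration, and the approach ignores the subtle within-motif sequence variations mentioned above.

```python
import re


def longest_repeat_run(sequence: str, unit: str) -> int:
    """Largest number of consecutive copies of `unit` found in `sequence`."""
    pattern = re.compile(f"(?:{re.escape(unit.upper())})+")
    best = 0
    for match in pattern.finditer(sequence.upper()):
        best = max(best, len(match.group(0)) // len(unit))
    return best


# Made-up sequences: a 12-copy TA run followed by a shorter 3-copy run,
# and a 4-copy TCC run.
ta_locus = "GGC" + "TA" * 12 + "CCG" + "TA" * 3
tcc_locus = "A" + "TCC" * 4 + "A"

print(longest_repeat_run(ta_locus, "TA"))
print(longest_repeat_run(tcc_locus, "TCC"))
```

Individuals can then be compared by repeat number, in the same way that alleles are distinguished at the loci described above.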
This study presents the first nearly complete mitochondrial genome of a calanoid species. The circular subgenomic fragment obtained calls for caution when analyzing the mitogenome of copepods using long PCR technology, and may offer additional evidence for mitochondrial recombination.
Although the contents and lengths of individual genes are similar to other arthropods, the mitogenome of C. sinicus, one of the largest mtDNAs in crustaceans, is enlarged by the prevalence and extended length of non-coding regions. The concurrence of multiple non-coding regions and reshuffled gene arrangement results in the mitochondrial genome of C. sinicus being remarkably distinctive from other arthropods. Mitochondrial DNA recombination may have played an important role in shaping the present mitochondrial profile of C. sinicus. The lack of synapomorphic gene arrangements among copepods raises questions concerning the use of gene order as a useful molecular marker for deep phylogenetic analysis.
Recovery of the phylogenetic signal in mitochondrial genomes may be affected by a variety of reconstruction artefacts including lineage-specific heterogeneities in nucleotide composition and evolutionary rate. In particular, the LBA artefact influenced the results during analysis. Several methods were applied to reduce the effect of the reconstruction artefacts. Although unstable, some encouraging congruent results were noticed in the analyses. Monophyly of copepods and the basal split between Calanoida and Podoplea were successfully resolved. The close affinity between Cyclopoida and Poecilostomatoida in the present study supports Boxshall in reassigning the latter as subordinate to the former. Copepoda was clustered within the monophyletic clade Vericrustacea, although relationships among the lineages remain ambiguous. Falsification of Maxillopoda was confirmed by unveiling its paraphyletic/polyphyletic nature. However, owing to the limited phylogenetic signals in the mitochondrial data sets, no consensus concerning relationships among pancrustacean lineages was reached.
Within the 16,670 bp alignment there are a total of 397 variable sites. Indel variations are present in non-coding regions and transitions dominate the nucleotide substitutions. Three "hot-spots", particularly the hyper-variable microsatellite locus in LNRs, provide rich polymorphisms for population studies.
Sample collection and DNA extraction
C. sinicus for mitogenome characterization were collected from the Yellow Sea (35.9°N, 122.9°E) with a 500 μm mesh zooplankton net during a summer cruise in 2006. The samples were preserved at -80°C pending DNA extraction. To compare the intraspecific polymorphism patterns of different loci among populations, C. sinicus from the Yangtze estuary (28.6°N, 122.1°E; 4 individuals), the North Yellow Sea (38.7°N, 120.7°E; 3 individuals) and Korea (36.9°N, 126.3°E; 3 individuals) were selected.
Fifty individuals of C. sinicus from the same population (the Yellow Sea) were pooled to prepare a DNA template for mitogenome sequencing. To avoid the potential influence of nuclear DNA sequences of mitochondrial origin, crude mitochondria were roughly separated from cell debris and nuclei by differential centrifugation with a commercial tissue mitochondria isolation kit (Beyotime, C3606). mtDNA was extracted using the QIAamp DNA Micro Kit (Qiagen, Valencia, CA) following the manufacturer's instructions. For intraspecific comparison, genomic DNA was extracted from each individual separately.
PCR amplification and sequencing
Partial sequences of the genes atp6, cytb, nad4 and rrnS were determined using the primers listed in Additional file 6. On the basis of the sequence data obtained, long PCR primers (Additional file 6) were designed to amplify the entire C. sinicus mitogenome. Two PCR fragments, 3.8 kb and 11 kb in length, were successfully amplified with the primer combinations cs_cytb F1 plus cs_16sR1 and cytb f3 plus cs_nad4 f. PCR reactions were performed on a Mastercycler Pro gradient machine (Eppendorf) in a 50 μl volume containing 30 pmol of each primer, 3 nmol of each dNTP, 1.5 units of LA Taq polymerase and approximately 20 ng of mtDNA template in 1× LA Taq buffer supplied by Takara. The cycle profile began with a denaturation step of 94°C for 3 min, followed by 33 cycles of 95°C for 20 s, 58°C for 30 s and 68°C for 1 min per kb, and ended with a final extension step of 72°C for 10 min. The product was purified with an E.Z.N.A. gel extraction kit (Omega) and sequenced directly by the primer walking approach (Additional file 6).
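The reaction set-up above implies a few simple quantities that are easy to check. The following is a minimal sketch, not part of the original protocol: it converts the stated primer and dNTP amounts into final concentrations for a 50 μl reaction and applies the 1 min per kb extension rule to the two amplicon sizes. The helper names are hypothetical.

```python
# Illustrative arithmetic for the long-PCR set-up described in the text.
# Amounts, volume and the 1 min/kb rule come from the text; the helper
# names are hypothetical and exist only for this sketch.

REACTION_VOLUME_UL = 50.0


def final_concentration_um(amount_pmol: float,
                           volume_ul: float = REACTION_VOLUME_UL) -> float:
    """Convert an amount in pmol to a concentration in uM (1 pmol/ul == 1 uM)."""
    return amount_pmol / volume_ul


def extension_time_s(amplicon_kb: float, seconds_per_kb: float = 60.0) -> float:
    """Extension time per cycle under a fixed time-per-kilobase rule."""
    return amplicon_kb * seconds_per_kb


primer_um = final_concentration_um(30.0)    # 30 pmol of each primer
dntp_um = final_concentration_um(3000.0)    # 3 nmol = 3000 pmol of each dNTP
ext_short_s = extension_time_s(3.8)         # 3.8 kb fragment
ext_long_s = extension_time_s(11.0)         # 11 kb fragment

print(f"primer: {primer_um:.2f} uM, each dNTP: {dntp_um:.0f} uM")
print(f"extension: {ext_short_s:.0f} s (3.8 kb), {ext_long_s:.0f} s (11 kb)")
```

Under these assumptions the 11 kb fragment needs roughly 11 min of extension per cycle, which is why long PCR runs of this kind take several hours.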
Additional primers facing outward were designed from both ends of the contig. A 4.5 kb fragment, with which all contigs could be circularized, was amplified. However, the absence of atp6 from the amplicon made the result unreliable. The fragment was confirmed to be an artefact by the failure of amplification with primers designed from the dubious regions.
Step-out PCR was applied in both directions to recover the remaining mitogenome, with primers targeting the lateral margins of the contig. Despite repeated attempts, amplification terminated in certain regions on both sides. PCR products smaller than 5 kb were sequenced directly. Short PCR fragments that could not be sequenced directly were cloned into the pMD18-T vector (Takara) before sequencing. The 11 kb PCR product was sheared into fragments of 1-3 kb using HydroShear (Genomic Solutions), end-repaired with T4 DNA polymerase (NEB) following the manufacturer's protocol, and cloned into the pUC19 vector (Fermentas). Forty clones were sequenced on an ABI 3730 sequencer by Biosune (Shanghai).
On the basis of the mitogenome sequences obtained, four primer combinations from relatively conserved regions were designed for screening polymorphic loci in the C. sinicus mitogenome. Fragments of 1.1 kb to 9.5 kb in length from 10 individuals were sequenced using the methods described above. Base calling was performed with phred, and the reads were assembled in phrap with default parameters [76, 77]. All assembled sequences were manually verified with the aid of CONSED to remove misassemblies. The nearly complete mitochondrial genome of C. sinicus has been deposited in GenBank with the accession number [GenBank: GU355641].
Sequence analysis and annotation
DNA sequences were analyzed using the software packages Lasergene ver. 7.1.0 (DNASTAR, Inc. Madison) and Vector NTI Advance 9 (Invitrogen, Carlsbad, CA). Locations of protein coding genes were preliminarily identified by the ORF-finding method in GeneQuest, followed by BLAST searches against GenBank datasets. The locations were refined using multiple alignments to other crustacean nucleotide sequences. tRNA genes were identified by their proposed cloverleaf secondary structure and anti-codon sequences using tRNAscan-SE 1.21 and ARWEN with relaxed settings, and confirmed manually. The two rRNA genes were identified by comparison with other annotated crustacean mitogenomes, and reconfirmed using their secondary structures. Inferred rRNA sequences were aligned with those of other crustaceans whose secondary structures have been published (rrnS and rrnL obtained from the rRNA database: http://www.psb.ugent.be/rRNA/index.html) by means of the program DCSE. The program RnaViz 2 was used to draw secondary structures of tRNAs and rRNAs. Secondary structures of the putative control region, according to the model for arthropods, were estimated using UNAfold. Gene divergence and synonymous and non-synonymous substitution rates in the protein coding genes were calculated with DnaSP 4.0 and PAML 4.3.
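To illustrate the preliminary ORF-finding step, the sketch below scans the three forward reading frames of a sequence for open reading frames under the invertebrate mitochondrial genetic code (NCBI translation table 5), in which TGA encodes tryptophan rather than a stop. It is an assumption-laden stand-in and does not reproduce the GeneQuest algorithm; the function name and the minimum-length threshold are hypothetical.

```python
# Minimal forward-strand ORF scan under the invertebrate mitochondrial
# genetic code (NCBI translation table 5). Not the GeneQuest method; the
# function name and min_codons threshold are illustrative assumptions.

START_CODONS = {"ATG", "ATA", "ATT", "ATC", "GTG", "TTG"}
STOP_CODONS = {"TAA", "TAG"}  # TGA codes for Trp in table 5


def find_orfs(seq: str, min_codons: int = 3):
    """Return (start, end, frame) tuples; 0-based, end-exclusive, stop included."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if start is None and codon in START_CODONS:
                start = i  # open an ORF at the first start codon seen
            elif start is not None and codon in STOP_CODONS:
                if (i + 3 - start) // 3 >= min_codons:
                    orfs.append((start, i + 3, frame))
                start = None  # close the ORF and look for the next start
    return orfs


print(find_orfs("ATGAAATGATAACC"))
```

A scan of this kind only finds complete stop codons, so genes ending in an abbreviated stop (T or TA completed by polyadenylation) would still need to be delimited by alignment, as described above. The reverse strand can be covered by running the same function on the reverse complement.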
Sequence alignment and phylogenetic analysis
In addition to the mitochondrial genome of C. sinicus presented here, mitogenome sequences of another 105 arthropods were retrieved from GenBank. Information concerning phylogenetic position, gene order, nucleotide and amino acid sequences of individual genes, and mitogenome size was extracted from the combined datasets using purpose-built perl scripts (Additional file 7). To avoid artefacts due to asymmetric nucleotide composition, the nucleotide contents of the concatenated PCG sequences from the initial dataset were compared. Principal components analysis (PCA) was performed with the contents of the component nucleotides as variables (Additional file 8). The results were used as a guide for sampling taxa with relatively homogeneous nucleotide compositions. The nucleotide composition and strand asymmetry of some maxillopods are not as balanced, but they were included for complete taxon coverage. A sample of 36 species, including three copepods, was selected. Amino acid sequences of individual proteins were aligned using Probalign under the default settings for protein. atp8 was not included as it was absent from some of the taxa sampled. A dataset (Original dataset) of 2646 amino acids with posterior probabilities above five was accepted for the subsequent phylogenetic analysis. To explore the signal in the dataset and clarify the placement of Copepoda, two additional datasets were introduced. For the first dataset (Balanced dataset), only proteins whose nucleotide composition was not significantly strand-biased were included. Since sites that evolve too rapidly or too slowly may affect phylogenetic analysis, another dataset (moderate-rate dataset) was constructed by removing classes of rapidly and slowly evolving sites using the slow-fast approach [64, 86], in which sites were partitioned into quartiles and only those from the two internal quartiles were accepted.
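The quartile filter used to build the moderate-rate dataset can be sketched as follows. This is a minimal illustration that assumes a per-site rate estimate is already available for each alignment column; how those rates were obtained is not restated here, and the function name and input format are hypothetical.

```python
# Sketch of the quartile-based site filter: keep only alignment columns
# whose rate falls in the two internal quartiles. The per-site rates are
# an assumed input; the function name is hypothetical.
from statistics import quantiles


def moderate_rate_sites(site_rates):
    """Return indices of sites whose rate lies between the 1st and 3rd quartile."""
    q1, _median, q3 = quantiles(site_rates, n=4)
    return [i for i, rate in enumerate(site_rates) if q1 <= rate <= q3]


# Toy example: eight sites with made-up rates.
rates = [6.0, 0.5, 0.1, 1.0, 2.5, 0.9, 0.2, 1.4]
kept = moderate_rate_sites(rates)
print(kept)
```

The returned indices can then be used to slice every sequence in the alignment, discarding both the slowest and the fastest quarter of sites.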
To understand phylogenetic relationships among copepods, two smaller datasets were built (Original and Balanced datasets). These datasets consisted of six copepods whose complete mitochondrial genomes have been (almost) entirely determined, in addition to two branchiopods and two maxillopods as out-groups. For the intraspecific sequence variability analysis, reads from another 10 individuals were assembled and manually aligned in BioEdit (North Carolina State University, NC) using the C. sinicus mitogenome as a template. Alignment was performed on individual genes with sequences from other copepods using Probalign to estimate sequence divergence of various loci.
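Once individual sequences are aligned to the reference, variable sites can be tallied column by column. The sketch below is a minimal illustration rather than the procedure used in this study: it counts variable columns, flags those containing gaps, and classifies two-state substitutions as transitions or transversions. The function name and the handling of gaps and multi-state columns are assumptions.

```python
# Column-wise summary of an alignment: variable sites, gap-containing
# columns, and transitions versus transversions at two-state sites.
# Function name and gap handling are assumptions made for illustration.

PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}


def summarize_alignment(seqs):
    """Return counts of variable, indel, transition and transversion columns."""
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("aligned sequences must have equal length")
    summary = {"variable": 0, "indel": 0, "transition": 0, "transversion": 0}
    for column in zip(*(s.upper() for s in seqs)):
        states = set(column)
        if len(states) < 2:
            continue  # invariant column
        summary["variable"] += 1
        if "-" in states:
            summary["indel"] += 1
            continue
        if len(states) == 2 and states <= {"A", "C", "G", "T"}:
            if states <= PURINES or states <= PYRIMIDINES:
                summary["transition"] += 1
            else:
                summary["transversion"] += 1
    return summary


toy = ["ACGT-A", "ATGTAA", "ACGCAC"]
result = summarize_alignment(toy)
print(result)
```

Columns with three or four nucleotide states are counted as variable but left unclassified, since they cannot be assigned to a single substitution type without a tree.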
According to preliminary analysis, the CAT + MtArt model and the MtArt model fit the data best and were selected for further analysis. Bayesian analyses were carried out using MrBayes (MtArt model) and PhyloBayes (CAT + MtArt model), with among-site rate variation modelled by a gamma distribution with four discrete categories. Two independent MCMC chains were run simultaneously to assess whether the search had stabilized, and were stopped when all chains converged (maxdiff less than 0.2, and in most cases less than 0.1, for PhyloBayes; standard deviation [SD] of split frequencies lower than 0.01 for MrBayes). If not, runs were continued until more than 5000 sample points were available per run. The ML analysis was carried out with PHYML 3.0 with 200 bootstrap replicates.
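The two convergence criteria compare how often each bipartition (split) is sampled in the independent runs. The sketch below shows the idea for two runs, assuming split frequencies have already been tabulated as dictionaries. It is not the MrBayes or PhyloBayes implementation: the function names, the use of the sample standard deviation, and the 0.1 minimum-frequency cutoff are assumptions for illustration.

```python
# Two-run convergence diagnostics computed from split frequencies.
# Inputs map a split label to its posterior frequency in one run.
# Function names, the sample SD estimator and the min_freq cutoff are
# assumptions; this does not reproduce either program's code.
from statistics import mean, stdev


def maxdiff(freqs_a, freqs_b):
    """Largest absolute difference in split frequency between two runs."""
    splits = set(freqs_a) | set(freqs_b)
    return max((abs(freqs_a.get(s, 0.0) - freqs_b.get(s, 0.0)) for s in splits),
               default=0.0)


def average_sd_split_freq(freqs_a, freqs_b, min_freq=0.1):
    """Mean across-run SD over splits reaching min_freq in at least one run."""
    splits = [s for s in set(freqs_a) | set(freqs_b)
              if max(freqs_a.get(s, 0.0), freqs_b.get(s, 0.0)) >= min_freq]
    if not splits:
        return 0.0
    return mean(stdev([freqs_a.get(s, 0.0), freqs_b.get(s, 0.0)]) for s in splits)


run1 = {"AB|CD": 0.98, "AC|BD": 0.02}
run2 = {"AB|CD": 0.95, "AC|BD": 0.04, "AD|BC": 0.01}
md = maxdiff(run1, run2)
asdsf = average_sd_split_freq(run1, run2)
print(f"maxdiff = {md:.3f}, average SD of split frequencies = {asdsf:.4f}")
```

With thresholds like those stated above, a pair of runs would be treated as converged when `maxdiff` falls below 0.2 or when the average standard deviation falls below 0.01, depending on the program.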
Abbreviations
- atp6 and 8: ATPase subunits 6 and 8
- bp: base pair(s)
- cytochrome c oxidase subunits I-III
- cytb: cytochrome b
- rrnL: 16S ribosomal RNA
- LNR: large non-coding region
- nad1-6 and 4L: NADH dehydrogenase subunits 1-6 and 4L
- ORF: open reading frame
- PCG: protein coding gene
- PCR: polymerase chain reaction
- Bayesian posterior probabilities
- SNP: single nucleotide polymorphism
- rrnS: 12S ribosomal RNA
- trnX (where X is replaced by the single-letter code of the corresponding amino acid): transfer RNA
We thank Fangping Cheng and Shiwei Wang for their assistance with sample collection and species identification. We are grateful to Rencheng Wang for his kind support during the laboratory work. We appreciate the editors, Ann Bucklin, Sha Zhongli, Luan Weisha, Yu Haiyan and all the reviewers for their valuable comments. The English revision of the manuscript was made by BioMedEs. The work was supported by the Chinese Academy of Sciences (KZCX2-YW-Q07 and GJHZ200808), National Natural Science Foundation of China (40821004 and 40631008) and State Oceanic Administration of China (200805042).
- Humes AG: How Many Copepods. Hydrobiologia. 1994, 293: 1-7. 10.1007/BF00229916.
- Mauchline J: The Biology of Calanoid Copepods. 1998, Academic Press, London, 710-
- Huys Rony, Boxshall GA: Copepod evolution. 1991, Ray Society, 159:
- Martin JW, Davis GE: An updated classification of the recent Crustacea. History Museum of Los Angeles County: Los Angeles, CA (USA) VII. 2001, 123-Science Series 39,
- Dussart BH: A propos du répertoire mondial des Calanoïdes des eaux continentales. Crustaceana. 1984, 25-31.
- Ho JS: Copepod Phylogeny - a Reconsideration of Huys-and-Boxhall Parsimony Versus Homology. Hydrobiologia. 1994, 293: 31-39. 10.1007/BF00229920.
- Boxshall G, Halsey S: An introduction to copepod diversity. 2004: Ray Soc. 2004
- Jenner RA: Higher-level crustacean phylogeny: Consensus and conflicting hypotheses. Arthropod Struct Dev. 2010, 39 (2-3): 143-153. 10.1016/j.asd.2009.11.001.
- Wolstenholme DR: Animal Mitochondrial-DNA - Structure and Evolution. International Review of Cytology-a Survey of Cell Biology. 1992, 141: 173-216.
- Boore JL: Animal mitochondrial genomes. Nucleic Acids Res. 1999, 27 (8): 1767-1780. 10.1093/nar/27.8.1767.
- Boore JL, Fuerstenberg SI: Beyond linear sequence comparisons: the use of genome-level characters for phylogenetic reconstruction. Philosophical transactions of the Royal Society of London. 2008, 363 (1496): 1445-1451. 10.1098/rstb.2007.2234.
- Park JK, Choe BL, Eom KS: Two mitochondrial lineages in Korean freshwater Corbicula (Corbiculidae: Bivalvia). Molecules and Cells. 2004, 17 (3): 410-414.
- Nuwer M, Frost B, Armbrust EV: Population structure of the planktonic copepod Calanus pacificus in the North Pacific Ocean. Marine Biology. 2008, 156 (2): 107-115. 10.1007/s00227-008-1068-y.
- Burton RS, Byrne RJ, Rawson PD: Three divergent mitochondrial genomes from California populations of the copepod Tigriopus californicus. Gene. 2007, 403 (1-2): 53-59. 10.1016/j.gene.2007.07.026.
- Simon C, Buckley TR, Frati F, Stewart JB, Beckenbach AT: Incorporating molecular evolution into phylogenetic analysis, and a new compilation of conserved polymerase chain reaction primers for animal mitochondrial DNA. Annual Review of Ecology Evolution and Systematics. 2006, 37: 545-579. 10.1146/annurev.ecolsys.37.091305.110018.
- Hassanin A: Phylogeny of Arthropoda inferred from mitochondrial sequences: Strategies for limiting the misleading effects of multiple changes in pattern and rates of substitution. Mol Phylogenet Evol. 2006, 38 (1): 100-116. 10.1016/j.ympev.2005.09.012.
- Place AR, Feng XJ, Steven CR, Fourcade HM, Boore JL: Genetic markers in blue crabs (Callinectes sapidus) II. Complete mitochondrial genome sequence and characterization of genetic variation. Journal of Experimental Marine Biology and Ecology. 2005, 319 (1-2): 15-27. 10.1016/j.jembe.2004.03.024.
- Machida RJ, Miya MU, Nishida M, Nishida S: Complete mitochondrial DNA sequence of Tigriopus japonicus (Crustacea: Copepoda). Marine Biotechnology. 2002, 4 (4): 406-417. 10.1007/s10126-002-0033-x.
- Jung SO, Lee YM, Park TJ, Park HG, Hagiwara A, Leung KMY, Dahms HU, Lee W, Lee JS: The complete mitochondrial genome of the intertidal copepod Tigriopus sp (Copepoda, Harpactidae) from Korea and phylogenetic considerations. Journal of Experimental Marine Biology and Ecology. 2006, 333 (2): 251-262. 10.1016/j.jembe.2005.12.047.
- Tjensvoll K, Hodneland K, Nilsen F, Nylund A: Genetic characterization of the mitochondrial DNA from Lepeophtheirus salmonis (Crustacea: Copepoda). A new gene organization revealed. Gene. 2005, 353 (2): 218-230. 10.1016/j.gene.2005.04.033.
- Ki JS, Park HG, Lee JS: The complete mitochondrial genome of the cyclopoid copepod Paracyclopina nana: A highly divergent genome with novel gene order and atypical gene numbers. Gene. 2009, 435 (1-2): 13-22. 10.1016/j.gene.2009.01.005.
- Uye S: Why does Calanus sinicus prosper in the shelf ecosystem of the Northwest Pacific Ocean?. Ices J Mar Sci. 2000, 57 (6): 1850-1855. 10.1006/jmsc.2000.0965.
- Machida RJ, Miya MU, Nishida M, Nishida S: Large-scale gene rearrangements in the mitochondrial genomes of two calanoid copepods Eucalanus bungii and Neocalanus cristatus (Crustacea), with notes on new versatile primers for the srRNA and COI genes. Gene. 2004, 332: 71-78. 10.1016/j.gene.2004.01.019.
- Hassanin A, Leger N, Deutsch J: Evidence for multiple reversals of asymmetric mutational constraints during the evolution of the mitochondrial genome of metazoa, and consequences for phylogenetic inferences. Systematic biology. 2005, 54 (2): 277-298. 10.1080/10635150590947843.
- Kilpert F, Podsiadlowski L: The complete mitochondrial genome of the common sea slater, Ligia oceanica (Crustacea, Isopoda) bears a novel gene order and unusual control region features. Bmc Genomics. 2006, 7: 10.1186/1471-2164-7-241.
- Clayton DA: Transcription and replication of mitochondrial DNA. Human reproduction (Oxford, England). 2000, 15 (Suppl 2): 11-17.
- Xu W, Jameson D, Tang B, Higgs PG: The relationship between the rate of molecular evolution and the rate of genome rearrangement in animal mitochondrial genomes. Journal of Molecular Evolution. 2006, 63 (3): 375-392. 10.1007/s00239-005-0246-5.
- Ruiz-Trillo I, Riutort M, Fourcade HM, Baguna J, Boore JL: Mitochondrial genome data support the basal position of Acoelomorpha and the polyphyly of the Platyhelminthes. Mol Phylogenet Evol. 2004, 33 (2): 321-332. 10.1016/j.ympev.2004.06.002.
- Gutell RR: Collection of Small-Subunit (16s- and 16s-Like) Ribosomal-Rna Structures - 1994. Nucleic Acids Res. 1994, 22 (17): 3502-3507. 10.1093/nar/22.17.3502.
- De Rijk P, Robbrecht E, de Hoog S, Caers A, Van de Peer Y, De Wachter R: Database on the structure of large subunit ribosomal RNA. Nucleic Acids Research. 1999, 27 (1): 174-178. 10.1093/nar/27.1.174.
- Crease TJ: The complete sequence of the mitochondrial genome of Daphnia pulex (Cladocera: Crustacea). Gene. 1999, 233 (1-2): 89-99. 10.1016/S0378-1119(99)00151-1.
- Van de Peer Y, De Rijk P, Wuyts J, Winkelmans T, De Wachter R: The European Small Subunit Ribosomal RNA database. Nucleic Acids Research. 2000, 28 (1): 175-176. 10.1093/nar/28.1.175.
- Domes K, Maraun M, Scheu S, Cameron SL: The complete mitochondrial genome of the sexual oribatid mite Steganacarus magnus: genome rearrangements and loss of tRNAs. Bmc Genomics. 2008, 9: 10.1186/1471-2164-9-532.
- Segawa RD, Aotsuka T: The mitochondrial genome of the Japanese freshwater crab, Geothelphusa dehaani (Crustacea: Brachyura): Evidence for its evolution via gene duplication. Gene. 2005, 355: 28-39. 10.1016/j.gene.2005.05.020.
- Shao RF, Barker SC, Mitani H, Takahashi M, Fukunaga M: Molecular mechanisms for the variation of mitochondrial gene content and gene arrangement among chigger mites of the genus Leptotrombidium (Acari: Acariformes). Journal of Molecular Evolution. 2006, 63 (2): 251-261. 10.1007/s00239-005-0196-y.
- Dermauw W, Van Leeuwen T, Vanholme B, Tirry L: The complete mitochondrial genome of the house dust mite Dermatophagoides pteronyssinus (Trouessart): a novel gene arrangement among arthropods. Bmc Genomics. 2009, 10: 10.1186/1471-2164-10-107.
- Ogoh K, Ohmiya Y: Complete mitochondrial DNA sequence of the sea-firefly, Vargula hilgendorfii (Crustacea, Ostracoda) with duplicate control regions. Gene. 2004, 327 (1): 131-139. 10.1016/j.gene.2003.11.011.
- Mwinyi A, Meyer A, Bleidorn C, Lieb B, Bartolomaeus T, Podsiadlowski L: Mitochondrial genome sequence and gene order of Sipunculus nudus give additional support for an inclusion of Sipuncula into Annelida. Bmc Genomics. 2009, 10: 10.1186/1471-2164-10-27.
- Jang KH, Hwang UW: Complete mitochondrial genome of Bugula neritina (Bryozoa, Gymnolaemata, Cheilostomata): phylogenetic position of Bryozoa and phylogeny of lophophorates within the Lophotrochozoa. Bmc Genomics. 2009, 10: 10.1186/1471-2164-10-167.
- Kurabayashi A, Ueshima R: Complete sequence of the mitochondrial DNA of the primitive opisthobranch gastropod Pupa strigosa: systematic implication of the genome organization. Mol Biol Evol. 2000, 17 (2): 266-277.
- Lavrov DV, Brown WM, Boore JL: Phylogenetic position of the Pentastomida and (pan)crustacean relationships. P Roy Soc Lond B Bio. 2004, 271 (1538): 537-544. 10.1098/rspb.2003.2631.
- Zhang DX, Hewitt GM: Insect mitochondrial control region: A review of its structure, evolution and usefulness in evolutionary studies. Biochemical Systematics and Ecology. 1997, 25 (2): 99-120. 10.1016/S0305-1978(96)00042-7.
- Taanman JW: The mitochondrial genome: structure, transcription, translation and replication. Biochimica Et Biophysica Acta-Bioenergetics. 1999, 1410 (2): 103-123. 10.1016/S0005-2728(98)00161-3.
- Saito S, Tamura K, Aotsuka T: Replication origin of mitochondrial DNA in insects. Genetics. 2005, 171 (4): 1695-1705. 10.1534/genetics.105.046243.
- Garesse R, Kaguni LS: A Drosophila model of mitochondrial DNA replication: Proteins, genes and regulation. Iubmb Life. 2005, 57 (8): 555-561. 10.1080/15216540500215572.
- Belinky F, Rot C, Ilan M, Huchon D: The complete mitochondrial genome of the demosponge Negombata magnifica (Poecilosclerida). Mol Phylogenet Evol. 2008, 47 (3): 1238-1243. 10.1016/j.ympev.2007.12.004.
- Lavrov DV, Boore JL, Brown WM: The complete mitochondrial DNA sequence of the horseshoe crab Limulus polyphemus. Molecular Biology and Evolution. 2000, 17 (5): 813-824.
- Boore JL, Medina M, Rosenberg LA: Complete sequences of the highly rearranged molluscan mitochondrial genomes of the scaphopod Graptacme eborea and the bivalve Mytilus edulis. Molecular Biology and Evolution. 2004, 21 (8): 1492-1503. 10.1093/molbev/msh090.
- Yu ZN, Wei ZP, Kong XY, Shi W: Complete mitochondrial DNA sequence of oyster Crassostrea hongkongensis-a case of "Tandem duplication-random loss" for genome rearrangement in Crassostrea?. Bmc Genomics. 2008, 9:
- Bernt M, Merkle D, Ramsch K, Fritzsch G, Perseke M, Bernhard D, Schlegel M, Stadler PF, Middendorf M: CREx: inferring genomic rearrangements based on common intervals. Bioinformatics. 2007, 23 (21): 2957-2958. 10.1093/bioinformatics/btm468.
- Lavrov DV, Boore JL, Brown WM: Complete mtDNA sequences of two millipedes suggest a new model for mitochondrial gene rearrangements: Duplication and nonrandom loss. Molecular Biology and Evolution. 2002, 19 (2): 163-169.
- Moritz C, Brown WM: Tandem duplication of D-loop and ribosomal RNA sequences in lizard mitochondrial DNA. Science. 1986, 233 (4771): 1425-1427. 10.1126/science.3018925.
- Lunt DH, Hyman BC: Animal mitochondrial DNA recombination. Nature. 1997, 387 (6630): 247-247. 10.1038/387247a0.
- Shao RF, Barker SC: The highly rearranged mitochondrial genome of the plague thrips, Thrips imaginis (Insecta: thysanoptera): Convergence of two novel gene boundaries and an extraordinary arrangement of rRNA genes. Molecular Biology and Evolution. 2003, 20 (3): 362-370. 10.1093/molbev/msg045.
- Ladoukakis ED, Zouros E: Recombination in animal mitochondrial DNA: Evidence from published sequences. Molecular Biology and Evolution. 2001, 18 (11): 2127-2131.
- Gibson T, Blok VC, Phillips MS, Hong G, Kumarasinghe D, Riley IT, Dowton M: The mitochondrial subgenomes of the nematode Globodera pallida are mosaics: Evidence of recombination in an animal mitochondrial genome. Journal of Molecular Evolution. 2007, 64 (4): 463-471. 10.1007/s00239-006-0187-7.
- Burger G, Lavrov DV, Forget L, Lang BF: Sequencing complete mitochondrial and plastid genomes. Nature Protocols. 2007, 2 (3): 603-614. 10.1038/nprot.2007.59.
- Armstrong MR, Blok VC, Phillips MS: A multipartite mitochondrial genome in the potato cyst nematode Globodera pallida. Genetics. 2000, 154 (1): 181-192.
- Pamilo P, Viljakainen L, Vihavainen A: Exceptionally high density of NUMTs in the honeybee genome. Molecular Biology and Evolution. 2007, 24 (6): 1340-1346. 10.1093/molbev/msm055.
- Richly E, Leister D: NUPTs in sequenced eukaryotes and their genomic organization in relation to NUMTs. Molecular Biology and Evolution. 2004, 21 (10): 1972-1980. 10.1093/molbev/msh210.
- Paabo S, Irwin DM, Wilson AC: DNA Damage Promotes Jumping between Templates during Enzymatic Amplification. Journal of Biological Chemistry. 1990, 265 (8): 4718-4721.
- Piganeau G, Gardner M, Eyre-Walker A: A broad survey of recombination in animal mitochondria. Molecular Biology and Evolution. 2004, 21 (12): 2319-2325. 10.1093/molbev/msh244.
- Smith JM, Smith NH: Recombination in animal mitochondrial DNA. Molecular Biology and Evolution. 2002, 19 (12): 2330-2332.
- Rota-Stabelli O, Kayal E, Gleeson D, Daub J, Boore JL, Telford MJ, Pisani D, Blaxter M, Lavrov DV: Ecdysozoan Mitogenomics: Evidence for a Common Origin of the Legged Invertebrates, the Panarthropoda. Genome Biol Evol. 2010, 2: 425-440. 10.1093/gbe/evq030.
- Koenemann S, Jenner RA, Hoenemann M, Stemme T, von Reumont BM: Arthropod phylogeny revisited, with a focus on crustacean relationships. Arthropod Struct Dev. 2010, 39 (2-3): 88-110. 10.1016/j.asd.2009.10.003.
- Huys R, Llewellyn-Hughes J, Conroy-Dalton S, Olson PD, Spinks JN, Johnston DA: Extraordinary host switching in siphonostomatoid copepods and the demise of the Monstrilloida: Integrating molecular data, ontogeny and antennulary morphology. Mol Phylogenet Evol. 2007, 43 (2): 368-378. 10.1016/j.ympev.2007.02.004.
- Huys R, Llewellyn-Hughes J, Olson PD, Nagasawa K: Small subunit rDNA and Bayesian inference reveal Pectenophilus ornatus (Copepoda incertae sedis) as highly transformed Mytilicolidae, and support assignment of Chondracanthidae and Xarifiidae to Lichomolgoidea (Cyclopoida). Biol J Linn Soc. 2006, 87 (3): 403-425. 10.1111/j.1095-8312.2005.00579.x.
- Sperling EA, Peterson KJ, Pisani D: Phylogenetic-Signal Dissection of Nuclear Housekeeping Genes Supports the Paraphyly of Sponges and the Monophyly of Eumetazoa. Molecular biology and evolution. 2009, 26 (10): 2261-2274. 10.1093/molbev/msp148.
- Regier JC, Shultz JW, Zwick A, Hussey A, Ball B, Wetzer R, Martin JW, Cunningham CW: Arthropod relationships revealed by phylogenomic analysis of nuclear protein-coding sequences. Nature. 2010, 463 (7284): 1079-U1098. 10.1038/nature08742.
- Giribet G, Edgecombe GD, Wheeler WC: Arthropod phylogeny based on eight molecular loci and morphology. Nature. 2001, 413 (6852): 157-161. 10.1038/35093097.
- Boore JL, Lavrov DV, Brown WM: Gene translocation links insects and crustaceans. Nature. 1998, 392 (6677): 667-668. 10.1038/33577.
- Carapelli A, Lio P, Nardi F, van der Wath E, Frati F: Phylogenetic analysis of mitochondrial protein coding genes confirms the reciprocal paraphyly of Hexapoda and Crustacea. BMC evolutionary biology. 2007, 7: 10.1186/1471-2148-7-S2-S8.
- Regier JC, Shultz JW, Kambic RE: Pancrustacean phylogeny: hexapods are terrestrial crustaceans and maxillopods are not monophyletic. Proc Biol Sci. 2005, 272 (1561): 395-401. 10.1098/rspb.2004.2917.
- Rozas J, Sanchez-DelBarrio JC, Messeguer X, Rozas R: DnaSP, DNA polymorphism analyses by the coalescent and other methods. Bioinformatics. 2003, 19 (18): 2496-2497. 10.1093/bioinformatics/btg359.
- Foltz DW: Invertebrate species with nonpelagic larvae have elevated levels of nonsynonymous substitutions and reduced nucleotide diversities. Journal of Molecular Evolution. 2003, 57 (6): 607-612. 10.1007/s00239-003-2495-5.
- Ewing B, Hillier L, Wendl MC, Green P: Base-calling of automated sequencer traces using phred. I. Accuracy assessment. Genome Res. 1998, 8 (3): 175-185.
- Ewing B, Green P: Base-calling of automated sequencer traces using phred. II. Error probabilities. Genome Res. 1998, 8 (3): 186-194.
- Gordon D, Abajian C, Green P: Consed: A graphical tool for sequence finishing. Genome Res. 1998, 8 (3): 195-202.
- Lowe TM, Eddy SR: tRNAscan-SE: A program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997, 25 (5): 955-964. 10.1093/nar/25.5.955.
- Laslett D, Canback B: ARWEN: a program to detect tRNA genes in metazoan mitochondrial nucleotide sequences. Bioinformatics. 2008, 24 (2): 172-175. 10.1093/bioinformatics/btm573.
- Derijk P, Dewachter R: Dcse, an Interactive Tool for Sequence Alignment and Secondary Structure Research. Computer Applications in the Biosciences. 1993, 9 (6): 735-740.
- De Rijk P, Wuyts J, De Wachter R: RnaViz 2: an improved representation of RNA secondary structure. Bioinformatics. 2003, 19 (2): 299-300. 10.1093/bioinformatics/19.2.299.
- Zuker M: Mfold web server for nucleic acid folding and hybridization prediction. Nucleic Acids Research. 2003, 31 (13): 3406-3415. 10.1093/nar/gkg595.
- Yang ZH: PAML 4: Phylogenetic analysis by maximum likelihood. Molecular Biology and Evolution. 2007, 24 (8): 1586-1591. 10.1093/molbev/msm088.
- Roshan U, Livesay DR: Probalign: multiple sequence alignment using partition function posterior probabilities. Bioinformatics. 2006, 22 (22): 2715-2721. 10.1093/bioinformatics/btl472.
- Brinkmann H, Philippe H: Archaea sister group of bacteria? Indications from tree reconstruction artefacts in ancient phylogenies. Mol Biol Evol. 1999, 16 (6): 817-825.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.