Inverted base composition skews and discontinuous mitochondrial genome architecture evolution in the Enoplea (Nematoda)
BMC Genomics volume 23, Article number: 376 (2022)
Within the class Enoplea, the earliest-branching lineages in the phylum Nematoda, the relatively highly conserved ancestral mitochondrial architecture of Trichinellida is in stark contrast to the rapidly evolving architecture of Dorylaimida and Mermithida. To better understand the evolution of mitogenomic architecture in this lineage, we sequenced the mitogenome of a fish parasite Pseudocapillaria tomentosa (Trichinellida: Capillariidae) and compared it to all available enoplean mitogenomes.
P. tomentosa exhibited highly reduced noncoding regions (the largest was 98 bp), and a unique base composition among the Enoplea. We attributed the latter to the inverted GC skew (0.08) in comparison to the ancestral skew in Trichinellidae (-0.43 to -0.37). Capillariidae, Trichuridae and Longidoridae (Dorylaimida) generally exhibited low negative or low positive skews (-0.1 to 0.1), whereas Mermithidae exhibited fully inverted low skews (0 to 0.05). This is indicative of inversions in the strand replication order or otherwise disrupted replication mechanism in the lineages with reduced/inverted skews. Among the Trichinellida, Trichinellidae and Trichuridae have almost perfectly conserved architecture, whereas Capillariidae exhibit multiple rearrangements of tRNA genes. In contrast, Mermithidae (Mermithida) and Longidoridae (Dorylaimida) exhibit almost no similarity to the ancestral architecture.
Longidoridae exhibited more rearranged mitogenomic architecture than the hypervariable Mermithidae. Similar to the Chromadorea, the evolution of mitochondrial architecture in enoplean nematodes exhibits a strong discontinuity: lineages possessing a mostly conserved architecture over tens of millions of years are interspersed with lineages exhibiting architectural hypervariability. As Longidoridae also have some of the smallest metazoan mitochondrial genomes, they contradict the prediction that compact mitogenomes should be structurally stable. Lineages exhibiting inverted skews appear to represent the intermediate phase between the Trichinellidae (ancestral) and fully derived skews in Chromadorean mitogenomes (GC skews = 0.18 to 0.64). Multiple lines of evidence (CAT-GTR analysis in our study, a majority of previous mitogenomic results, and skew disruption scenarios) support the Dorylaimia split into two sister-clades: Dorylaimida + Mermithida and Trichinellida. However, skew inversions produce strong base composition biases, which can hamper phylogenetic and other evolutionary studies, so enoplean mitogenomes have to be used with utmost care in evolutionary studies.
The class Enoplea comprises the earliest-branching lineages in the phylum Nematoda. It is divided into two subclasses: Dorylaimia (or Clade I) and Enoplia (Clade II) . Dorylaimia exhibit a diverse life history range: the vertebrate-parasitic order Trichinellida, the insect-parasitic Mermithida, the plant-parasitic Dorylaimida, and the free-living Mononchida . Previously sequenced enoplean mitochondrial genomes (mitogenomes) suggest that the evolution of mitochondrial architecture in this clade exhibits a rather stark discontinuity: Trichinellidae and Trichuridae families of the order Trichinellida appear to possess a perfectly conserved architecture [2,3,4,5], Dorylaimida (Longidoridae) exhibit a fast-evolving mitogenomic architecture [5, 6], and Mermithidae (Mermithida) exhibit a rampant gene order rearrangement rate, unmatched within the Bilateria [7, 8]. This pattern fits our previous observation that mitogenome architecture evolution is discontinuous (in nematodes) and that long evolutionary periods of stasis are interspersed with lineages that exhibit exponentially accelerated mitochondrial evolution rates . However, several key enoplean lineages remain unrepresented in terms of available mitogenomes, e.g. the entire Enoplia and Mononchida lineages, which limits our understanding of the evolution of mitochondrial architecture within this nematode lineage.
The Nematoda has been recognised as a problematic lineage in bilaterian/ecdysozoan mitochondrial phylogenomics [10, 11]. The phylogeny of Enoplea also remains unresolved, largely because genomic data (both mitogenomic and nuclear) remain unavailable for many key enoplean lineages, but also because previous mitogenomic studies of Dorylaimia produced inconsistent results [6, 12, 13]. Base composition biases may cause strong artefacts in phylogenetic analyses [14,15,16,17,18], and mitochondrial genomes often exhibit strong compositional biases as a consequence of directional mutational pressures predominantly associated with mitochondrial replication [19, 20]. During the replication, the parental H-strand is left in a mutagenic single-stranded state for almost two hours , which may cause spontaneous hydrolytic deamination of A and C (into G and T respectively). Over multiple generations, this results in the accumulation of T and G on the H-strand, and A and C on the L-strand . In some cases, mitochondrial architecture rearrangements may result in the inversion of the origin of replication, which then also causes an inversion of the direction of mutational pressures, i.e., now T and G accumulate on the L-strand, and A and C on the H-strand . Such skew inversions can affect the branch length, mutational saturation, codon usage, protein properties, and cause reverse mutations, so they can interfere with a broad range of evolutionary studies .
Pseudocapillaria tomentosa is a widespread intestinal nematode parasite found in the intestines of a broad range of fish hosts in the Northern Hemisphere [23, 24]. Until relatively recently, this parasite was very poorly studied , but this species often parasitizes zebrafish, so along with the relatively rapid increase in the popularity of zebrafish as a model animal for molecular biology, this nematode concomitantly received somewhat increased scientific attention in recent years [25,26,27]. Despite this, molecular data for P. tomentosa remain almost completely unavailable. The only currently (July 2021) available GenBank entry is a partial 18S rRNA gene sequence (KU987805) . Furthermore, at the onset of this study, there were no mitogenomes available for the entire family Capillariidae (Trichinellida). In the meantime, two sequences were submitted to the GenBank: the mitogenome of a fish parasite Eucoleus annulatus, collected in the USA , and the mitogenome of a cat parasite Capillaria sp. (Australia) (MH665363; unpublished).
For this study, we sequenced the complete mitogenome of Pseudocapillaria tomentosa (Dujardin, 1843) Moravec, 1987 (Dorylaimia: Trichinellida: Capillariidae) to generate molecular data necessary for the identification, population studies, and phylogeny of this species and family, as well as to test two working hypotheses: 1. mitogenomic architecture evolution is discontinuous in the Enoplea, and 2. base composition biases are hampering mitochondrial phylogenomic reconstruction in the Enoplea. Along with P. tomentosa, we used all mitogenomic data available for the Enoplea to conduct detailed comparative mitogenomic analyses and test the two hypotheses.
Results and Discussion
General mitogenomic features of P. tomentosa and Capillariidae
The complete mitogenome of P. tomentosa contained all 37 standard mitochondrial genes, which were encoded on both strands (Fig. 1; for additional results and discussion see Additional file 1). This is common for the Enoplea, whereas in Chromadorea genes are encoded on a single strand . The strand distribution of protein-coding genes (PCGs) was conserved among the Trichinellida, with nad2, nad5, nad4 and nad4L consistently encoded on the minus strand, but variable among the Dorylaimida and Mermithida (Fig. 1). In the mitogenome of P. tomentosa, a majority of genes kept the transcription sense: a stretch of 16 genes on the minus strand was punctuated only by trnK and trnT. trnW was the only gene encoded on the minus strand that was isolated from this main stretch. Start codons (ATG, ATT, ATA) and stop codons (TAA and TAG) were standard. Gene overlaps were few (4) and small (1 to 2 bp). The distribution of tRNA genes was conserved among most Trichinellida, apart from the Capillariidae. In this family, the architecture was conserved between P. tomentosa and E. annulatus , whereas Capillaria sp. appears to exhibit a slightly rearranged architecture, but these may be sequencing and annotation artefacts as the mitogenome is almost certainly incomplete (Additional file 1). In comparison to E. annulatus, P. tomentosa exhibited an identical gene order and strand distribution, but only partially conserved codon usage, low gene sequence similarity (average identity = 64.65%), partially conserved distribution of intergenic regions, and almost no similarity in the distribution of gene overlaps (Table 1).
We numerically assessed our observation that the evolution of mitochondrial architecture in enoplean nematodes appears to be discontinuous using the common intervals gene order similarity measure (where the value 0 indicates a complete absence of shared common intervals) . Almost all Trichinellidae and Trichuridae exhibited an identical gene order (Table 2). The only exceptions were Trichinella zimbabwensis, which exhibited a translocation of the trnK gene, and Trichinella patagoniensis, which exhibited a loss of the trnK gene. Both of these might be sequencing or annotation artefacts. Trichinellidae exhibit a great similarity to the ancestral arthropod mitochondrial architecture (gene arrangement), which is also similar to the ancestral ecdysozoan mitogenome architecture [2, 30,31,32]. Trichinellidae are therefore considered to possess the most ancestral architecture among all nematodes, whereas all other lineages exhibit highly derived architectures, so we will treat this gene order as the ancestral for all Nematoda. In comparison to the ancestral gene order, Capillariidae exhibited a large number of rearrangements (Fig. 1, Additional file 1), but within the enoplean dataset, they exhibited an intermediate gene rearrangement rate (168—178). Strongly accelerated evolutionary rates were observed in Mermithidae, which exhibited very little similarity to the ancestral gene order (2—12), and Longidoridae, which exhibited almost no similarity to the ancestral gene order (0 – 2). This partially corroborates the results of visual assessment, but surprisingly indicates that Longidoridae, and not Mermithidae, possess the most highly rearranged gene orders. The explanation is that rearrangements in Mermithidae are concentrated in specific, hypervariable regions , whereas some segments remain conserved. This is not the case in Longidoridae, which do not exhibit any conserved segments. Intriguingly, Longidoridae (Xiphinema) species have some of the smallest metazoan mitochondrial genomes sequenced so far [5, 6], which contradicts the proposition that compact mitogenomes are structurally stable, i.e. that shorter mitogenomes exhibit fewer gene order rearrangements relative to the ancestral state . Indeed, it is expected that rearrangements should produce an increase in the number of noncoding regions, regardless of whether the underlying mechanism is tandem duplication/random loss  of recombination . In agreement with this, Mermithidae, where the mechanism producing the mitogenomic architecture variability is believed to be recombination, exhibit multiple large (> 100 bp) NCR and some of the largest mitogenomes among the Bilateria [7, 8]. Therefore, this is an intriguing discrepancy, which was not discussed by previous studies (to our knowledge) [5, 6]. There are two possible mechanisms to explain it: 1) Longidoridae is undergoing a strong purifying selection directed towards the reduction of mitogenome size, 2) The mechanism of rearrangement is unique in this lineage, and it does not produce multiple NCRs. It should be noted that Longidoridae also exhibit a high level of within-family rearrangement rates (Table 2), which implies that the first scenario would require very strong purifying selection levels. A possible explanation could be found in the ‘race for replication’ hypothesis, which proposes that high metabolic demands impose stronger purifying selection constraints for small genome size , but it is unclear why Longidoridae would have higher metabolic demands than other enoplean nematodes. Further studies, and more mitogenomes, are needed to elucidate the rearrangement mechanisms in this family and the evolutionary pressures producing the small mitogenome size.
Until relatively recently the mitogenomic architectural variability was believed to be concentrated largely within the enoplean family Mermithidae, whereas Chromadorea, which contains the majority of Nematoda species, was believed to possess a relatively stable mitogenomic architecture [7, 13, 28, 37,38,39]. In our previous study, we have shown that some Chromadorean lineages also exhibit exponentially accelerated rates of mitogenomic architecture evolution, which prompted us to propose that the mitogenomic architecture evolution is discontinuous in nematodes . Our analyses conducted for this study indicate that the evolution of mitogenomic architecture is nonlinear in the class Enoplea as well.
A strongly reduced size of noncoding regions
The newly sequenced mitogenome exhibited a remarkably small largest noncoding region of 98 base pairs (bp) (Fig. 1; Table 1). Normally, metazoan mitogenomes possess a large noncoding region, commonly spanning > 500 bp to a few thousand bp, which usually comprises the control region (CR). For example, in the closely related Trichinellidae, the putative CR spans 700 to 1600 bp, and it is duplicated in most species [2, 3] (Fig. 1; see Additional file 1 for discussion of putative sequencing artefacts). Mermithidae exhibited on average about eight large (> 100 bp) NCRs, but some Romanomermis species had more than ten of them . This is attributed to the disrupted, rapidly evolving, mitogenomic architecture in this family, comprising multiple duplicated genes and some of the largest mitogenomes among the nematodes . Surprisingly, the longest NCR (putative CR) in the mitogenome of T. cosgrovei is also reduced in size (401 bp) in comparison to standard metazoan mitogenomes . Some Longidoridae (Xiphinema) species are known to have some of the smallest metazoan mitochondrial genomes [5, 6], as a result of shortened genes, short noncoding regions, and gene overlaps . However, the mitogenome (14,118 bp) and protein-coding genes (PCGs) of P. tomentosa were not reduced in size in comparison to other Trichinellida species, and the overlaps were few and small. Therefore, only the noncoding regions are strongly reduced in size in P. tomentosa. In Capillariidae and Trichuridae, only E. annulatus and Trichuris rhinopiptheroxella had an NCR larger than 200 bp (306 and 208 bp, respectively; see Additional file 1 for additional discussion). However, despite perfectly matching gene arrangements between P. tomentosa and E. annulatus, the former mitogenome exhibited a negligible intergenic region of 1 bp in the location corresponding to the location of the putative CR of E. annulatus (306 bp NCR between nad4 and trnL1) (Fig. 1; Additional file 1). This indicates that the putative CR is rapidly evolving in this family, which makes its identification very difficult. Trichuris species largely exhibit two “major” NCRs, both of which are also smaller than 200 bp: the longer one tends to be in the range of 120–170 bp, whereas the short one tends to be 90—120 bp [4, 41]. In conclusion, NCRs are normal-sized in Trichinellidae, slightly reduced in size but multiplied due to rearranged architecture in Mermithidae, and strongly reduced in the Longidoridae, Capilariidae and Trichuridae.
Base composition and skews
High AT content is common in the mitogenomes of nematodes , but P. tomentosa exhibited the highest AT bias among all available Trichinellida (79.3%). Among the Dorylaimia, only three Mermithidae species had marginally higher AT content (Agamermis and Romanomermis sp.; Table 3). As the content of A and G was average, the exceptional AT content is attributable to the very high T content (41.3%), with only three Mermithidae species exhibiting higher values, and to exceptionally low C content, which was the lowest in the entire dataset: 9.5%. This made us suspect that the unique base composition of this species may be associated with mutational pressures associated with the mitogenomic replication mechanism [16, 19, 22]. To assess this, we inspected their base composition skews (GC) . All Trichinellidae mitogenomes exhibited high negative GC skews (skews were calculated for the entire plus strand, and we refer to skew magnitude in absolute terms, i.e. distance from zero, throughout the manuscript) from -0.431 to -0.368. The remaining Trichinellida lineages, comprising Trichuridae and Capillariidae, exhibited low negative skews (-0.073 to -0.007), with Trichuris muris (0.043) and P. tomentosa (0.083) even exhibiting fully inverted (positive) skews. Dorylaimida (Longidoridae) exhibited a very similar pattern: low negative skews (-0.075 to -0.018), with Xiphinema pachtaicum exhibiting a fully inverted skew (0.124). Finally, all available Mermithida (Mermithidae) species exhibited fully inverted (positive) but low skews in the range between 0.007 and 0.054. After X. pachtaicum, P. tomentosa exhibited the second-highest positive skew in the dataset. The patterns produced by cumulative skew plots were inconsistent, even among some closely related taxa, such as P. tomentosa and Capillaria sp. (Additional file 1). Apart from Trichinellidae, most species (including P. tomentosa and E. annulatus) exhibited rather noisy patterns, with skew plots switching between positive and negative values.
This is indicative of inversions in the strand replication order, or otherwise disrupted replication mechanism (e.g. multiple origins of replication) in the lineages with reduced/inverted skews [16, 22]. As skews were not discussed in previous papers that reported clade I mitogenomes with inverted skews [6,7,8, 43,44,45], this is the first observation that some enoplean lineages underwent skew inversions. To get a better resolution we checked all other available Nematoda mitogenomes. This revealed that all chromadorean mitogenomes exhibit high positive GC skews on the plus strand: 0.18 to 0.64 (average = 0.40) (Additional file 2: panel D). As discussed above, Trichinellidae possess the most ancestral architecture among all nematodes, whereas all other lineages exhibit highly derived architectures. From this, we can infer that a high negative GC skew on the plus strand, similar to the one exhibited by most other Arthropoda [16, 46], is the ancestral skew for Nematodes. Therefore, chromadorean nematodes underwent a series of architectural rearrangements that resulted in all genes being encoded on a single strand and resulted in a full skew inversion on the plus strand (alternatively, but less likely, all genes from the plus strand may have migrated to the minus strand).
Theoretically, it may be possible to identify strand inversions of the origin of replication from conserved promoter sequence motifs . For example, in Xiphinema americanum, a sequence motif 5’-GAGACCTGAGCCCAAGATA-3’ similar to the conserved promoter element sequence in the human mitogenome was found in the putative CR . We assessed our dataset for the presence of this motif, but it matched to (highly derived) sequences in only four species: two Xiphinema and two Mermithidae species (Additional file 1: Figure S15). We also searched for other conserved motifs but, apart from several motifs that are common in Trichinellidae, other families did not exhibit any consistently conserved motifs (Additional file 3). This is in agreement with a previous study, which found no identifiable conserved motifs in Longidoridae . Several common motifs were identified in the Mermithidae, but their apparent abundance was largely attributable to the multiplication of NCRs in this family. This further corroborates the fast and unique evolution of mitochondrial control regions in the Enoplea.
Phylogenetic analyses and implications of inverted skews for evolutionary analyses
The phylogeny of Enoplea remains unresolved, as different datasets produce incongruent results. Previous evidence that skew inversions affect phylogenetics and other evolutionary studies was limited to arthropods (mostly crustaceans), which often exhibit fully inverted skew biases (i.e. almost equal magnitude in opposite direction; e.g. -0.25 and + 0.25) [14, 16, 17]. However, skew inversion in enoplean nematodes is only partial, so effects on evolutionary analyses may be less strongly pronounced. We found that only two sequences (Capillaria sp. and T. muris) passed the compositional homogeneity test. Mutational saturation analyses indicated substantial or high levels of saturation (defined as transitions and transversions plateauing) for most individual genes and for the 3rd codon position dataset. However, concatenated 12 PCGs dataset and 1st and 2nd codon datasets did not exhibit strong levels of saturation (Additional file 1: Fig. S16). Also, ORI species did not appear to exhibit elevated levels of saturation. This indicates that concatenated mitogenomic datasets in combination with an algorithm that accounts for compositional heterogeneity may be a suitable tool for studying the evolutionary history of the Enoplea. However, previous (recent) studies that studied the enoplean mitochondrial phylogenomics relied on standard phylogenetic analysis algorithms: Maximum Likelihood (ML), Bayesian Inference (BI), and Maximum Parsimony (MP) [6, 12, 13]. Herein we tested the performance of the standard ML phylogenetic approach implemented in IQ-TREE, as well as the CAT-GTR algorithm designed to account for compositional heterogeneity . The ML analysis produced paraphyletic Arthropoda, with Priapulida nested within, and a generally non-standard  topology of Ecdysozoa, but Nematoda and Nematomorpha were resolved as sister-phyla, which is accepted as the most likely evolutionary scenario . Chromadorea and Enoplea were monophyletic sister-taxa, and Longidoridae were the basal Dorylaimia lineage (i.e. sister-clade to all remaining lineages) (Fig. 2). Mermithida and Trichinellida were sister-groups, and Capillariidae + Trichuridae exhibited a close sister-clade relationship in the latter clade (Fig. 2). In all crucial aspects, the tree produced using all available genes was identical (Additional file 4). The CAT-GTR model produced a similar topology, but with monophyletic Arthropoda (sister-group relationship with Priapulida), switched places between Kinorhyncha and Tardigrada, and with the Enoplean topology resolved into two major sister-clades: Dorylaimida + Mermithida and Trichinellida (Fig. 3). Support values for key nodes were higher in the CAT-GTR analysis than in the ML analysis, where Trichinellidae + (Mermithidae + Capillariidae) clade was weakly supported: 72% (100% in the CAT-GTR).
With respect to the enoplean topology, 18S datasets appear to resolve Enoplia and Dorylaimia as sister clades, often with Trichinellidae and/or Dioctophymatida as the basal Dorylaimia clade, but results are inconsistent or topologies unresolved [1, 6, 50, 51]. The available nuclear genomic data resolve Mermithidae as the basal lineage , but many crucial lineages remain unrepresented. The available mitogenomic data mostly produce two different topologies across different studies: 1. Longidoridae as the basal clade, and 2. Enoplea comprised of two sister-clades: Dorylaimida + Mermithida and Trichinellida. Topology 1 was produced by the nucleotide dataset (NUC) in our ML analysis with high support (100%), and by two previous studies: ML AAs analysis (amino acids) , and ML and BI AAs . Topology 2 was produced by our CAT-GTR analysis (100% support; AAs dataset), and by several previous studies: BI and ML AAs , BI AAs , BI and ML NUC , and ML, BI, and MP AAs . The topology of the remaining enoplean lineages is remarkably consistent across all studies. In comparison, the available genomic data produce Mermithidae as the basal radiation, followed by the Dioctophymatida and finally Trichinellidae and Trichuridae as highly derived sister-families, but the absence of data for Longidoridae remains a major shortcoming .
This uncertainty also makes it difficult to infer the evolutionary history of replication disruption events (producing changes in skew magnitude) in the enoplean nematodes. If Trichniellidae is the earliest-branching lineage, which is unlikely as it is partially supported only by the 18S data, and consistently rejected by mitogenomic and nuclear genomic data, a single disruption of the replication mechanism is sufficient to explain the observed pattern. The topology produced by the mitogenomic data requires a more complex evolutionary scenario. Topology 1 would require three independent disruptions: in the common ancestors of Longidoridae, Mermithidae, and Capillariidae + Trichuridae. Topology 2 would require two independent disruptions: in the common ancestors of Longidoridae + Mermithidae and Capillariidae + Trichuridae. Therefore, topology 2 is supported by the more reliable CAT-GTR analysis in our study, a majority of previous mitogenomic results, and by being the more parsimonious of the latter two skew disruption scenarios.
To improve our understanding of the discontinuity in the evolution of mitogenomes in the nematode class Enoplea, we sequenced and characterised the complete mitochondrial genome of P. tomentosa (Trichinellida: Capillariidae). As mitochondrial molecular data were previously unavailable for this species, the sequence will facilitate future molecular identification and evolutionary studies of this species. Similar to the Chromadorea, the evolution of mitochondrial architecture in the enoplean nematodes exhibits a strong discontinuity: lineages with relatively conserved architecture over tens of millions of years are interspersed with lineages exhibiting architectural hypervariability. Surprisingly, Longidoridae exhibited more highly rearranged mitogenomes than Mermithidae, which possess some of the fastest-evolving mitochondrial architecture among the Bilateria. This is in contradiction with the expectation that high rearrangement rates should produce multiple noncoding regions in mitogenomes. We provide the first observation of an inverted base composition skew in the enoplean lineages. Lineages exhibiting inverted skews appear to represent the intermediate phase between the Trichinellidae (ancestral) and fully derived (high positive) skews in Chromadorean mitogenomes. Our observations have important repercussions for future studies that aim to apply mitochondrial phylogenomics to nematodes. Despite the inverted skews and overall compositional heterogeneity, we found evidence that mitochondrial and nuclear phylogenomics might produce congruent topologies. In the absence of Enoplia and Mononchida representatives, multiple lines of evidence (CAT-GTR analysis in our study, a majority of previous mitogenomic results, and skew disruption scenarios) support the Dorlylaimia split into two sister-clades: Dorylaimida + Mermithida and Trichinellida. However, it is necessary to sequence data from the missing crucial lineages before this can be assessed with confidence, and future studies should pay close attention to putative artefacts caused by compositional heterogeneity and seek agreement between different types of data before any conclusions are made.
Sample collection and identification
Parasitic nematodes were obtained post mortem from bighead carp Hypophthalmichthys nobilis specimens caught by fishermen in the Bailianhe reservoir (Huanggang city, Hubei province, China) and bought from the local market on 2/Aug/2020. Live nematodes were removed from the fish intestines, and then taxonomically identified by their morphological characteristics  via dissecting microscopy. All nematodes were washed in 0.6% saline before being stored in the absolute ethanol in the Museum of Aquatic Organisms, Institute of Hydrobiology, Chinese Academy of Sciences, Wuhan, China. Further identification was conducted using the 18S gene sequence, for which primers (Table 4) were designed on the basis of the previously sequenced conspecific 18S sequence KU987805 .
Genome sequencing and assembly
Genome sequencing and assembly were conducted as described before , so details are provided in Additional file 1. Briefly, DNA was isolated from a single specimen. Five primer pairs were designed on the basis of gene orthologues from Capillaria sp., and then used to amplify the entire mitogenome (Table 4). To avoid assembly artefacts, the amplified fragments were designed to overlap by approximately 100 bp. PCR products were sequenced using the Sanger method and an expanded set of primers, because fragments longer than 1kbp (amplified using long-range PCR) had to be sequenced in several steps (see Additional file 1 for full details). The mitogenome was assembled manually using DNAstar v7.1 , and roughly annotated using MITOS . The annotation was further refined using several different methods: DNAstar, BLAST BLASTx (PCGs), orthologous sequences (PCGs and rRNAs), and ARWEN  (tRNAs).
Comparative mitogenomic and phylogenetic analyses
We downloaded all available Enoplea mitogenomes from GenBank. We removed all duplicates, two unannotated mitogenomes, and left only one mitogenome per species (unless we found indications that conspecific mitogenomes might exhibit a different architecture). PhyloSuite  was used to conduct these steps, as well as to parse and extract the annotation recorded in a Word (Microsoft Office) document, generate the GenBank format file, update the taxonomy from the NCBI database, standardise and extract data, generate comparative tables, and generate annotation files for visualisation in iTOL . GC and AT base composition skews were also calculated by PhyloSuite, following the (G-C)/(G + C) and (A-T)/(A + T) formulas respectively . PhyloSuite and its plug-in programs were also used to conduct all phylogenetic analysis steps, for which we used nucleotide sequences of concatenated 12 mitochondrial protein-coding genes (PCGs); atp8 was removed because it was absent from many species, and only one copy per gene was kept in the Mermithidae that exhibited duplicated PCGs (further details in Additional file 1). Sequence alignment in batches was conducted using the accurate E-ins-i strategy in MAFFT . Alignments were concatenated by Phylosuite. Phylogenetic analyses were conducted using IQ-TREE (Maximum Likelihood – ML)  and the CAT-GTR site mixture model implemented in PhyloBayes-MPI 1.7a , which allows for site-specific rates of mutation, as there is evidence that in some cases it alleviates the base composition skew-driven long-branch attraction artefacts better than other standard phylogenetic algorithms, especially in combination with amino acid sequences . PhyloBayes run parameters were burnin = 500, invariable sites automatically removed from the alignment, two MCMC chains, and the analysis was stopped when the conditions considered to indicate a good run were reached: maxdiff < 0.1 and minimum effective size > 300 (PhyloBayes manual). The best-fit partitioning strategy and models for partitions for IQ-TREE were inferred using the inbuilt functions of IQ-TREE and immediately followed by phylogenetic reconstruction with 50,000 ultrafast bootstrap replicates . We also conducted an IQ-TREE analysis using the parameters described above to test whether the inclusion of all genes (PCGs + rRNAs + tRNAs) can stabilise the topology. IQ-TREE was also used to test for compositional heterogeneity. For mutational saturation analyses  and to generate cumulative skew plots for the entire plus strand, we used DAMBE 7.3.0 . CREx was used to infer the GO distances . All GenBank files were rearranged to start with the cox1 gene using the Reorder function in PhyloSuite, which was also used to generate the comparative mitogenomic architecture table for the P. tomentosa and E. annulatus mitogenomes. Conserved sequence motifs searches were conducted using MEME  and FIMO  tools.
Availability of data and materials
The datasets supporting the conclusions of this article are included within the article and its additional files, as well as in the GenBank repository. The complete mitogenome of P. tomentosa is available under the accession number MZ708958 (https://www.ncbi.nlm.nih.gov/nuccore/MZ708958.1/) and the partial 18S sequence is available under MZ724160 (https://www.ncbi.nlm.nih.gov/nuccore/MZ724160).
Blaxter ML, Ley PD, Garey JR, Liu LX, Scheldeman P, Vierstraete A, et al. A molecular evolutionary framework for the phylum Nematoda. Nature. 1998;392:71–5.
Lavrov DV, Brown WM. Trichinella spiralis mtDNA: A nematode mitochondrial genome that encodes a putative ATP8 and normally structured tRNAs and has a gene arrangement relatable to those of coelomate metazoans. Genetics. 2001;157:621–37.
Mohandas N, Pozio E, La Rosa G, Korhonen PK, Young ND, Koehler AV, et al. Mitochondrial genomes of Trichinella species and genotypes – a basis for diagnosis, and systematic and epidemiological explorations. Int J Parasitol. 2014;44:1073–80.
Liu G-H, Gasser RB, Su A, Nejsum P, Peng L, Lin R-Q, et al. Clear Genetic Distinctiveness between Human- and Pig-Derived Trichuris Based on Analyses of Mitochondrial Datasets. PLoS Negl Trop Dis. 2012;6:e1539.
He Y, Jones J, Armstrong M, Lamberti F, Moens M. The Mitochondrial Genome of Xiphinema americanum sensu stricto (Nematoda: Enoplea): Considerable Economization in the Length and Structural Features of Encoded Genes. J Mol Evol. 2005;61:819–33.
Palomares-Rius JE, Cantalapiedra-Navarrete C, Archidona-Yuste A, Blok VC, Castillo P. Mitochondrial genome diversity in dagger and needle nematodes (Nematoda: Longidoridae). Scientific Reports. 2017;7:srep41813.
Hyman BC, Lewis SC, Tang S, Wu Z. Rampant gene rearrangement and haplotype hypervariation among nematode mitochondrial genomes. Genetica. 2011;139:611–5.
Tang S, Hyman BC. Mitochondrial genome haplotype hypervariation within the isopod parasitic nematode Thaumamermis cosgrovei. Genetics. 2007;176:1139–50.
Zou H, Jakovlić I, Chen R, Zhang D, Zhang J, Li W-X, et al. The complete mitochondrial genome of parasitic nematode Camallanus cotti: extreme discontinuity in the rate of mitogenomic architecture evolution within the Chromadorea class. BMC Genomics. 2017;18:840.
Bernt M, Bleidorn C, Braband A, Dambach J, Donath A, Fritzsch G, et al. A comprehensive analysis of bilaterian mitochondrial genomes and phylogeny. Mol Phylogenet Evol. 2013;69:352–64.
Rota-Stabelli O, Kayal E, Gleeson D, Daub J, Boore JL, Telford MJ, et al. Ecdysozoan Mitogenomics: Evidence for a Common Origin of the Legged Invertebrates, the Panarthropoda. Genome Biol Evol. 2010;2:425–40.
Deng Y-P, Suleman, Zhang Y, Nie Y, Fu Y-T, Liu G-H. The complete mitochondrial genome of capillariid nematodes (Eucoleus annulatus): A novel gene arrangement and phylogenetic implications. Veterinary Parasitology. 2021;296:109476.
Kim J, Lee S-HH, Gazi M, Kim T, Jung D, Chun J-YY, et al. Mitochondrial genomes advance phylogenetic hypotheses for Tylenchina (Nematoda: Chromadorea). Zoologica Scripta. 2015;44:446–62.
Hassanin A. Phylogeny of Arthropoda inferred from mitochondrial sequences: Strategies for limiting the misleading effects of multiple changes in pattern and rates of substitution. Mol Phylogenet Evol. 2006;38:100–16.
Helfenbein KG, Brown WM, Boore JL. The Complete Mitochondrial Genome of the Articulate Brachiopod Terebratalia transversa. Mol Biol Evol. 2001;18:1734–44.
Jakovlić I, Zou H, Zhao X-M, Zhang J, Wang G-T, Zhang D. Evolutionary History of Inversions in Directional Mutational Pressures in Crustacean Mitochondrial Genomes: Implications for Evolutionary Studies. Mol Phylogenet Evol. 2021;164:107288.
Zhang D, Zou H, Hua C-J, Li W-X, Mahboob S, Al-Ghanim KA, et al. Mitochondrial Architecture Rearrangements Produce Asymmetrical Nonadaptive Mutational Pressures That Subvert the Phylogenetic Reconstruction in Isopoda. Genome Biol Evol. 2019;11:1797–812.
Zou H, Jakovlić I, Zhang D, Hua C-J, Chen R, Li W-X, et al. Architectural instability, inverted skews and mitochondrial phylogenomics of Isopoda: outgroup choice affects the long-branch attraction artefacts. Royal Society Open Science. 2020;7:191887.
Reyes A, Gissi C, Pesole G, Saccone C. Asymmetrical directional mutation pressure in the mitochondrial genome of mammals. Mol Biol Evol. 1998;15:957–66.
Xia X. DNA Replication and Strand Asymmetry in Prokaryotic and Mitochondrial Genomes. Curr Genomics. 2012;13:16–27.
Clayton DA. Transcription and replication of mitochondrial DNA. Human Reprod. 2000;15 suppl_2:11–7.
Hassanin A, Léger N, Deutsch J. Evidence for multiple reversals of asymmetric mutational constraints during the evolution of the mitochondrial genome of metazoa, and consequences for phylogenetic inferences. Syst Biol. 2005;54:277–98.
Leis E, Easy R, Cone D. Report of the Potential Fish Pathogen Pseudocapillaria (Pseudocapillaria) tomentosa (Dujardin, 1843) (Nematoda) from Red Shiner (Cyprinella lutrensis) Shipped from Missouri to Wisconsin. copa. 2016;83:275–8.
Moravec F. Parasitic Nematodes of Freshwater Fishes of Europe. Netherlands: Springer; 1994.
Kent ML, Gaulke CA, Watral V, Sharpton TJ. Pseudocapillaria tomentosa in laboratory zebrafish Danio rerio: patterns of infection and dose response. Dis Aquat Org. 2018;131:121–31.
Norris L, Lawler N, Hunkapiller A, Mulrooney DM, Kent ML, Sanders JL. Detection of the parasitic nematode, Pseudocapillaria tomentosa, in zebrafish tissues and environmental DNA in research aquaria. J Fish Dis. 2020;43:1087–95.
Martins ML, Watral V, Rodrigues-Soares JP, Kent ML. A method for collecting eggs of Pseudocapillaria tomentosa (Nematoda: Capillariidae) from zebrafish Danio rerio and efficacy of heat and chlorine for killing the nematode’s eggs. J Fish Dis. 2017;40:169–82.
Gissi C, Iannelli F, Pesole G. Evolution of the mitochondrial genome of Metazoa as exemplified by comparison of congeneric species. Heredity. 2008;101:301–20.
Bernt M, Merkle D, Ramsch K, Fritzsch G, Perseke M, Bernhard D, et al. CREx: Inferring genomic rearrangements based on common intervals. Bioinformatics. 2007;23:2957–8.
Braband A, Cameron SL, Podsiadlowski L, Daniels SR, Mayer G. The mitochondrial genome of the onychophoran Opisthopatus cinctipes (Peripatopsidae) reflects the ancestral mitochondrial gene arrangement of Panarthropoda and Ecdysozoa. Mol Phylogenet Evol. 2010;57:285–92.
Webster BL, Copley RR, Jenner RA, Mackenzie-Dodds JA, Bourlat SJ, Rota-Stabelli O, et al. Mitogenomics and phylogenomics reveal priapulid worms as extant models of the ancestral Ecdysozoan. Evol Dev. 2006;8:502–10.
Staton JL, Daehler LL, Brown WM. Mitochondrial gene arrangement of the horseshoe crab Limulus polyphemus L.: conservation of major features among arthropod classes. Molecular Biology and Evolution. 1997;14:867–74.
Lagisz M, Poulin R, Nakagawa S. You are where you live: parasitic nematode mitochondrial genome size is associated with the thermal environment generated by hosts. J Evol Biol. 2013;26:683–90.
Boore JL. The Duplication/Random Loss Model for Gene Rearrangement Exemplified by Mitochondrial Genomes of Deuterostome Animals. In: Sankoff D, Nadeau JH, editors. Comparative Genomics: Empirical and Analytical Approaches to Gene Order Dynamics, Map Alignment and the Evolution of Gene Families. Dordrecht: Springer, Netherlands; 2000. p. 133–47.
Mao M, Austin AD, Johnson NF, Dowton M. Coexistence of Minicircular and a Highly Rearranged mtDNA Molecule Suggests That Recombination Shapes Mitochondrial Genome Organization. Mol Biol Evol. 2014;31:636–44.
Rand DM. Endotherms, ectotherms, and mitochondrial genome-size variation. J Mol Evol. 1993;37:281–95.
Hu M, Gasser RB. Mitochondrial genomes of parasitic nematodes–progress and perspectives. Trends Parasitol. 2006;22:78–84.
Kang S, Sultana T, Eom KS, Park YC, Soonthornpong N, Nadler SA, et al. The mitochondrial genome sequence of Enterobius vermicularis (Nematoda: Oxyurida) - An idiosyncratic gene order and phylogenetic information for chromadorean nematodes. Gene. 2009;429:87–97.
Sultana T, Kim J, Lee S-H, Han H, Kim S, Min G-S, et al. Comparative analysis of complete mitochondrial genome sequences confirms independent origins of plant-parasitic nematodes. BMC Evol Biol. 2013;13:1–17.
Azevedo JLB, Hyman BC. Molecular characterization of lengthy mitochondrial DNA duplications from the parasitic nematode Romanomermis culicivorax. Genetics. 1993;133:933–42.
Liu G-H, Wang Y, Xu M-J, Zhou D-H, Ye Y-G, Li J-Y, et al. Characterization of the complete mitochondrial genomes of two whipworms Trichuris ovis and Trichuris discolor (Nematoda: Trichuridae). Infect Genet Evol. 2012;12:1635–41.
Perna NT, Kocher TD. Patterns of nucleotide composition at fourfold degenerate sites of animal mitochondrial genomes. J Mol Evol. 1995;41:353–8.
Hunt VL, Tsai IJ, Coghlan A, Reid AJ, Holroyd N, Foth BJ, et al. The genomic basis of parasitism in the Strongyloides clade of nematodes. Nat Genet. 2016;48:299–307.
Tang S, Hyman BC. Rolling circle amplification of complete nematode mitochondrial genomes. J Nematol. 2005;37:236–41.
Powers TO, Harris TS, Hyman BC. Mitochondrial DNA Sequence Divergence among Meloidogyne incognita, Romanomermis culicivorax, Ascaris suum, and Caenorhabditis elegans. J Nematol. 1993;25:564–72.
Lavrov DV, Boore JL, Brown WM. The complete mitochondrial DNA sequence of the horseshoe crab Limulus polyphemus. Mol Biol Evol. 2000;17:813–24.
Lartillot N, Brinkmann H, Philippe H. Suppression of long-branch attraction artefacts in the animal phylogeny using a site-heterogeneous model. BMC Evol Biol. 2007;7(Suppl 1):S4.
Laumer CE, Fernández R, Lemer S, Combosch D, Kocot KM, Riesgo A, et al. Revisiting metazoan phylogeny with genomic sampling of all phyla. Proceedings of the Royal Society B: Biological Sciences. 2019;286:20190831.
Giribet G, Edgecombe GD. Current Understanding of Ecdysozoa and its Internal Phylogenetic Relationships. Integr Comp Biol. 2017;57:455–66.
Bik HM, Lambshead PJD, Thomas WK, Lunt DH. Moving towards a complete molecular framework of the Nematoda: a focus on the Enoplida and early-branching clades. BMC Evol Biol. 2010;10:353.
Meldal BHM, Debenham NJ, De Ley P, De Ley IT, Vanfleteren JR, Vierstraete AR, et al. An improved molecular phylogeny of the Nematoda with special emphasis on marine taxa. Mol Phylogenet Evol. 2007;42:622–36.
International Helminth Genomes Consortium. Comparative genomics of the major parasitic worms. Nat Genet. 2019;51:163–74.
Burland TG. DNASTAR’s Lasergene sequence analysis software. In: Misener S, Krawetz SA, editors. Methods in Molecular Biology™. Totowa, NJ: Humana Press; 2000. p. 71–91.
Bernt M, Donath A, Jühling F, Externbrink F, Florentz C, Fritzsch G, et al. MITOS: Improved de novo metazoan mitochondrial genome annotation. Mol Phylogenet Evol. 2013;69:313–9.
Laslett D, Canbäck B. ARWEN: A program to detect tRNA genes in metazoan mitochondrial nucleotide sequences. Bioinformatics. 2008;24:172–5.
Zhang D, Gao F, Jakovlić I, Zou H, Zhang J, Li WX, et al. PhyloSuite: an integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies. Mol Ecol Resour. 2020;20:348–55.
Letunic I, Bork P. Interactive Tree Of Life (iTOL): An online tool for phylogenetic tree display and annotation. Bioinformatics. 2007;23:127–8.
Katoh K, Standley DM. MAFFT multiple sequence alignment software version 7: Improvements in performance and usability. Mol Biol Evol. 2013;30:772–80.
Trifinopoulos J, Nguyen LT, von Haeseler A, Minh BQ. W-IQ-TREE: a fast online phylogenetic tool for maximum likelihood analysis. Nucleic Acids Res. 2016;44:W232–5.
Minh BQ, Nguyen MAT, von Haeseler A. Ultrafast Approximation for Phylogenetic Bootstrap. Mol Biol Evol. 2013;30:1188–95.
Xia X, Xie Z, Salemi M, Chen L, Wang Y. An index of substitution saturation and its application. Mol Phylogenet Evol. 2003;26:1–7.
Xia X. DAMBE7: New and Improved Tools for Data Analysis in Molecular Biology and Evolution. Mol Biol Evol. 2018;35:1550–2.
Bailey TL, Elkan C. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Proc Int Conf Intell Syst Mol Biol. 1994;2:28–36.
Grant CE, Bailey TL, Noble WS. FIMO: scanning for occurrences of a given motif. Bioinformatics. 2011;27:1017–8.
We thank Rong Chen of the BT Lab for helping us with the molecular lab work, and mitogenome assembly and annotation.
This study was funded by the National Natural Science Foundation of China (31970408); the China Agriculture Research System of MOF and MARA (CARS-45); and the Start-up Funds of Introduced Talent in Lanzhou University (561120206). The funders had no role in the design of the study, collection, analysis and interpretation of data, and in writing the manuscript.
Ethics approval and consent to participate
All experimental procedures involving animals were reviewed, approved and supervised by the Animal Care Committee of the Institute of Hydrobiology, Chinese Academy of Sciences. As the study did not involve any live vertebrates, nor regulated invertebrates, no special permits were required to retrieve and process the samples.
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
About this article
Cite this article
Zou, H., Chen, FL., Li, WX. et al. Inverted base composition skews and discontinuous mitochondrial genome architecture evolution in the Enoplea (Nematoda). BMC Genomics 23, 376 (2022). https://doi.org/10.1186/s12864-022-08607-4