The Solemya mtDNA contains all genes of the standard metazoan mitochondrial genome (Figure 1). It is tempting to conclude that the loss/degeneracy of atp8 is restricted to Amarsipobranchia, given the presence of this gene in palaeoheterodonts and in S. velum (and, following the GenBank partial mitochondrial genome, also in Nucula nucleus). Whether the absence of atp8 gene is real or just an outcome of incorrect annotations (see, f.i., [18, 37]), its presence in a bivalve like Solemya further supports, if necessary, that the ancestral bivalve condition is the retention of a fully functional atp8 gene.
Moreover, it is common in metazoans to find neighboring atp6 and atp8 on the same strand  and it has been suggested that uncleaved transcripts may be co-translated [94, 95]. Tough this arrangement is not found in many phyla, like Plathyhelminthes, Nematoda, Annelida, Sipunculida, Brachiopoda, and Mollusca (and atp8 itself is also lacking in some of them; ), nevertheless atp6 and atp8 are neighboring in S. velum (Tab 3). The same association can be found in basal mollusks, like C. nitidulum and K. tunicata, and in other conchiferans, like cephalopods (with the exception of N. macromphalus), Caenogastropoda, and Heterobranchia (albeit on the opposite strand).
Mean A-T content in main molluscan classes ranges between 63.51% (bivalves) and 74.12% (scaphopod G. eborea): S. velum has a high A-T content (68.11%), being significantly more similar to aculiferans and cephalopods than to other bivalves (Additional file 6): actually, the A-T content of Autobranchia is between 55.20% (M. yessoensis; ) and 69.70% (V. philippinarum, [GenBank:NC_003354]). Irrespective of the functional constraints and gene features, we found an unbiased, when not low, G-T content in the “ + ” strand (Figure 3). Most commonly, the leading strand is G-T rich ([35, 97–103]; but see [104–106]). The G-T content is not particularly high in any region of the S. velum “ + ” strand, and it even drops to very low values where the molecule encodes PCGs on the “-” strand (Figure 3).
All genes are located on the same strand in Amarsipobranchia, and most of them in Palaeoheterodonta. Contrastingly, in Solemya velum, genes are evenly distributed among “ + ” and “-” strands, with 18 and 19 genes, respectively. Even if a H-biased distribution of genes is found in other lophotrochozoans, like annelids, brachiopods, bryozoans and platyhelminths (see, f.i., [13, 27]), an even distribution is the commonest situation among Mollusca (Additional file 6; but see ) and, notably, as for A-T content, S. velum is quite similar to Caudofoveata, Cephalopoda, Polyplacophora, and Scaphopoda. This gene distribution on both strands rises a stimulating question on strand assignment: which is the leading (heavy; antisense) strand in S. velum? Patterns evidenced in S. velum resemble those of N. macromphalus. Contrarily, a strand with a sharper G-T predominance has been signaled, f.i., in some gastropods  and in Katharina. It seems that mtDNAs with most genes on the same strand (e.g., Caenogastropoda, Amarsipobranchia) tend to have higher G-T values than mtDNAs with genes evenly distributed on either strand (e.g., Cephalopoda, Palaeoheterodonta).
Control region and origins of replication
The animal mtDNA control region (CR) should contain or neighbor the origins of replication (ORs).  and, specifically for bivalves, Breton and colleagues (; and reference therein) proposed several parameters to annotate the CR, like (i) UR length, (ii) evidence for secondary structure with T-rich loops, (iii) high A-T content and (iv) repetitive elements and palindromes. The best candidates for S. velum control region are the unassigned regions UR7 and UR8 (see Additional file 9), but, as expected, they give no BLAST hits with putative CRs of other mollusks. Among the two largest URs, UR8 is the longest unassigned region. Both URs evidenced complex putative secondary structures (Figure 5); again, both URs have a high A-T content, but while the A-T content of UR8 is 70.43% (somewhat near the overall genome score of 68.11%), it is up to 84.76% for UR7, much more than other putative CRs of mollusks . On the other side, the only 17 bp-long repeated motif found in these URs was found in UR8. So, based on the above mentioned characteristics, it is not possible to unambiguously assign the CR function to either UR.
 suggested that mutations at four-fold degenerate sites should be completely neutral, being positions under no or limited selection. Therefore, in absence of selective constraints, the heavy (antisense) strand would accumulate G and T at these sites, while the light (sense) strand would accumulate A and C. Consequently, A-T and G-C skews at the four-fold degenerate codon sites are known to be significantly correlated with the single-strand duration during duplication, and therefore with the position of each PCG with respect to the OR of that strand [22, 102]. Precise linear patterns of the percentages of each of the 4 nucleotides were found when cox3 was used as the starting point (Figure 4), being the correlation significant for 3 nucleotides out of 4. The slope was positive for A and C, and negative for G and T. This is very similar to the findings of  for mammals and  for vermetid gastropods. These data finally point to three conclusions: (i) the strand we call “ + ” (i.e., the one encoding cox1) is the heavy (antisense) strand, while the strand we call “-” is the light (sense) strand; (ii) the CR of S. velum mtDNA is located immediately before cox3 in the UR7 region; (iii) as UR7 is located at the H/L switch, we suggest this to be the OR of both strands, working in either direction.
The region of the putative OR of the H strand encompasses UR7 and a cassette of tRNAs on the “-” strand, namely trnM, trnC, trnY, trnW, trnQ, and trnE, a situation already signaled in the family Vermetidae  and in the unionid Inversidens japanensis. As shown by , tRNAs on one strand can sometimes work as OR in the opposite one by forming alternative secondary structures other than conventional cloverleaves. Further comparison with other bivalves is tricky because all genes are on the same strand within Amarsipobranchia, but, similarly, the CR of Mytilus spp. is located after a group of 7 tRNAs and the rrnL gene ; the major non-coding region (MNR) of Mimachamys nobilis is located after a cassette of 8 tRNAs ; the CR of Meretrix spp. is located after 7 consecutive tRNAs [39, 40, 110, 111].
The supranumerary ORFs within UR8
If UR7 is the putative CR on Solemya mtDNA, the function of UR8 remains unknown. Its length and secondary structure may have some kind of signaling function, but it is quite noteworthy that two ORFs were found here: ORF117 and ORF195. Are they functional or not? Remarkably, they span over the almost complete UR8, leaving only small unassigned nucleotide stretches of 37, 11, and 12 bp, similar to other intergenic spacers in S. velum mtDNA (see Additional file 9).
ORF117 has a Glimmer “raw” score comparable to other PCGs of S. velum mtDNA (i.e., nad4L and nad6). Although it is nested in the A-T rich UR8 (70.43%), it has a lower A-T content (59.83%), so that its composition is actually different from the rest of the UR8. Moreover, notwithstanding the low (for S. velum) A-T content, the most used codon is UUU (Phe), which is also among the most used ones in all PCGs of this mtDNA, and therefore its codon composition is similar to that of other PCGs. The nucleotide pattern at four-fold degenerate sites is also consistent with other PCGs (Figure 4). Finally, ORF117 is almost completely located in a region of UR8 with few secondary structures (Figure 5). The presence of an homolog of ORF117 in the putative CR of Haliotis rubra, a species that, as above mentioned, seems to retain most ancestral features of molluscan mtDNA , may be related to a common origin of this ORF.
Conversely, ORF195 is not found by Glimmer and has a higher A-T content than ORF117 (73.33%); consistently, the most used codon is AAA (Lys). However, possible homology of ORF195 with membrane proteins is confirmed to some extent by the finding of a large transmembrane domain; the presence of a signal peptide constitutes a further in silico evidence favoring the functionality of this ORF.
It is not easy to assign to a protein a functional role only relying on bioinformatics data: expectedly, given the low homology scores and the short length of both ORFs, many different kinds of proteins and ligands were suggested by tools hosted on the @TOME 2.0 server. The presence of supranumerary ORFs in mitochondrial genomes has been reported elsewhere (f.i., [11, 22, 112, 113]; and references therein) and they mostly are of obscure function, but they generally share either a DNA-binding motif or a transmembrane region.
The commonest hit of ORF117 was with DNA-binding domains of other polypeptides, and many of them were top-ranked using the alignments scores as a sorting criterion. The putative transmembrane region of ORF195 is 19 amminoacids long and it is found in the N-terminal part of the peptide; it is followed by 12 positively charged amminoacids (either K or R) out of 32 in the C-terminal half of the protein. Interestingly, this architecture is the same described by  for supranumerary sex-linked ORFs in unionid mitochondrial genomes. Breton and colleagues suggest a possible role for these ORFs, which must be involved in the complex machinery of the DUI mechanism. Present findings may confirm their claim that natural selection is working on maintaining the structure, rather than the sequence, of transmembrane supranumerary mitochondrial ORFs .
The presence of a putative transmembrane signaling peptide in ORF195 and the DNA-binding signal in ORF117 may suggest a regulatory role for both these proteins; moreover, their presence in the S. velum mtDNA might constitute an evidence of the ancestral presence of such supranumerary ORFs in all bivalves. However, it has to be noted that this remains an in silico analysis and that some features of ORF195 could be randomly due to the high A-T content (f.i., the signal peptide): therefore, detailed analyses of mRNA gene expression are required to shed light on these issues.
Phylogenetic analysis and the usefulness of mitochondrial markers
The position of Solemya is unexpected, being nested within Gastropoda in our tree. However, node support is evidently very low (44.5), so our analysis does not point to a diphyly of Bivalvia, but rather to a polytomy at the base of the tree, including Solemya, Katharina and Haliotis. We may suggest that the lineage leading to Opponobranchia arose so early in the molluscan radiation that little or no phylogenetic signal of this event may be retrievable in Solemya mtDNA. The structural similarities of Solemya mtDNA to Haliotis and Katharina, and the huge differences to other bivalvian mitochondrial genomes, may have affected the clustering, hence the incorrect weak relationship to Gastropoda, which has rather to be considered an artifact.
The diphyly of Bivalvia is not a new find however (see, f.i., [114–116]), even if most recent phylogenomic studies could strongly retrieve bivalves as monophyletic [29–31]. On this aspect,  misinterpreted our previous results [49, 51], as this is the first time we obtain bivalves as diphyletic with mitochondrial DNA; this is simply because in previous works, focusing on internal relationships of the Class, we invariably forced bivalves to be monophyletic [49, 51]. It is remarkable that consistency with bivalves’ lower-level taxonomy was always maintained by our previous mtDNA analyses, a consistency which is actually lacking in . On the other hand, mtDNA fails to retrieve strong phylogenetic signal for the most basal molluscan phylogenetic events, thus retrieving controversial results. Other molecular markers are needed on the issue.
Mitochondrial gene arrangement may better help in tracing basal phylogenetic relationships [23–26]. Remarkably, Solemya gene order connects to the Katarina one by a single gene inversion event (Figure 7), once again pointing out that Solemya belongs to a group of mollusks maintaining archaic mtDNA features. The same inversion can be traced in N. nucleus: the presence of the inversion can be extrapolated by the sequenced regions and therefore can be considered very likely. This may be taken as evidence for a sister group relationship between Nuculida and Solemyida, thus supporting again the monophyly of Opponobranchia. Given the questioned status of Nuculanoidea (f.i., [30, 49, 51, 117, 118]; and reference therein), it would be very interesting to obtain the complete mitochondrial genome of a species belonging to this superfamily and to compare it with the one of S. velum. On the other hand, gene orders of autobranchiate bivalves known so far are so highly derived and hardly connectible (if not at all) to this archaic condition that the gene order of S. velum is useless in tracing phylogenetic relationships between Opponobranchia and other bivalves. Only the invention of a slow-evolving autobranch bivalve mtDNA (if it exists) would help to trace Bivalvia deep phylogenetic relationships based on mtDNA gene arrangements.