Mitochondrial genomes and Doubly Uniparental Inheritance: new insights from Musculista senhousia sex-linked mitochondrial DNAs (Bivalvia Mytilidae)

Background: Doubly Uniparental Inheritance (DUI) is a fascinating exception to matrilinear inheritance of mitochondrial DNA (mtDNA). Species with DUI are characterized by two distinct mtDNAs that are inherited either through females (F-mtDNA) or through males (M-mtDNA). DUI sex-linked mitochondrial genomes share several unusual features, such as additional protein coding genes and unusual gene duplications/structures, which have been related to the functionality of DUI. Recently, new evidence for DUI was found in the mytilid bivalve Musculista senhousia. This paper describes the complete sex-linked mitochondrial genomes of this species.
Results: Our analysis highlights that the M and F mtDNAs share roughly the same gene content and order, but with some remarkable differences. The Musculista sex-linked mtDNAs have differently organized putative control regions (CR), which include repeats and palindromic motifs thought to provide sites for DNA-binding proteins involved in the transcriptional machinery. Moreover, two cox2 genes were found in the male mtDNA, one of which (M-cox2b) is 123 bp longer.
Conclusions: The complete mtDNA genome characterization of DUI bivalves is the first step to unravel the complex genetic signals allowing Doubly Uniparental Inheritance, and the evolutionary implications of such an unusual transmission route in mitochondrial genome evolution in Bivalvia. The observed redundancy of the palindromic motifs in Musculista M-mtDNA may have a role in the process by which sperm mtDNA becomes dominant or exclusive in the male germline of DUI species. Moreover, the duplicated M-cox2b gene may have a different, still unknown, function related to DUI, in accordance with what has already been proposed for other DUI species, in which a similar cox2 extension has been hypothesized to be a tag for male mitochondria.


Background
Metazoan mitochondrial DNA (mtDNA) is generally a small molecule (15-20 kb), and although much larger mitochondrial genomes have occasionally been found, they are often products of duplications of mtDNA portions, rather than variations in gene content [1,2]. The typical mitochondrial gene complement encodes 13 protein subunits of the oxidative phosphorylation enzymes, 2 rRNAs and 22 tRNAs. However, the coding sequences (CDS) can be up to 16, the tRNAs up to 27 (source MitoZoa: http://mi.caspur.it/mitozoa see [3]), and the rRNAs can be duplicated and/or fragmented in discontinuous genes, as in oysters [4]. Generally, there is also a single large non-coding region that is known to contain regulatory elements for replication and transcription (i.e. 'Control Region', CR), but it is unclear whether it is homologous among distantly related animals or, alternatively, it independently arose from various non-coding sequences. This difficulty in establishing homology is because CRs share sequence similarity only among closely related taxa. Finally, the mtDNA is almost always a circular molecule: only the cnidarian classes Cubozoa, Scyphozoa and Hydrozoa have been found to have linear mtDNA chromosomes [5]. All metazoan mitochondrial genes have homologs in plants, fungi and/or protists [6][7][8][9].
The Mollusca is the second largest animal Phylum and currently 99 complete mitochondrial genomes are available in Genbank; among those, only 38 are from Bivalvia, the second class in terms of species richness among mollusks. So far, bivalve mtDNA displays an extraordinary amount of variation in gene arrangement, i.e. very few shared gene boundaries are detectable, and gene translocations are common across all gene classes (protein-coding genes, tRNAs and rRNAs). For this reason, bivalve mitochondrial genome may provide an excellent experimental system to review and test models of mt gene rearrangement evolution, which were mainly developed in groups with stable genomes, such as vertebrates or arthropods. In addition, gene duplications and/or losses are present in almost every bivalve taxon in which a complete mitochondrial genome is available (see [10]). It is therefore evident that efforts should be made to improve the knowledge of bivalve mitochondrial genomes.
Another interesting feature of bivalve mtDNA is the unusual transmission route found in some species: while in Metazoa mtDNA is usually transmitted by Strict Maternal Inheritance (SMI; [11,12]), some bivalve mollusks show a deviation from this rule, named Doubly Uniparental Inheritance (DUI; [13,14]). DUI has been found in species belonging to seven different bivalve families: Donacidae, Hyriidae, Margaritiferidae, Mytilidae, Solenidae, Unionidae, and Veneridae ([15,16]). Species with DUI are characterized by the presence of two distinct gender-associated mtDNAs: one transmitted through eggs (F) and one transmitted through sperm (M). The F and M genomes show up to 52% nucleotide divergence [17]. DUI seems at first to violate the universal rule of uniparental inheritance of organelles, because males receive their mtDNA from both parents and their tissues are heteroplasmic. However, the two mtDNAs segregate independently: the F-type is transmitted to the next generation only through females, while the M-type is transmitted only from father to sons; therefore, both genomes are actually transmitted uniparentally.
Because of its unique features, DUI should be a choice model to address many aspects of a wide range of biological sub-fields such as mitochondria inheritance, mtDNA evolution and recombination, genomic conflicts, evolution of sex and developmental biology (see [18] for a review).
Recently, evidence for a new example of DUI was found in the mytilid Musculista senhousia [19]. In this work we characterized the two sex-linked mitochondrial genomes of M. senhousia, a step forward to the complete genetic characterization of DUI related sex-linked mitochondrial genomes. In fact, several unusual features are coming to light when analyzing mtDNAs in DUI systems, such as additional protein coding genes ( [20], and references therein) and gene duplications/features [21,22]. Functional explanations for these features will require much additional work, but are needed to understand the evolution and maintenance of DUI.

Mitochondrial genome features in M. senhousia
The M. senhousia mtDNAs obtained are 21,557 bp long in the female (F-type) and 20,612 bp in the male (M-type) (see Tables 1 and 2). Sequences are available in GenBank (Acc. No. GU001953-GU001954). The sizes of both F and M mitochondrial genomes fall within the range of mollusk mtDNAs sequenced to date, i.e. from 7,808 bp in Batillaria cumingi to 32,115 bp in Placopecten magellanicus (source MitoZoa: http://mi.caspur.it/mitozoa; [3]).
M. senhousia F and M gene arrangements are remarkably different from other fully sequenced metazoan mtDNAs (see [10] for a review). Genome annotations are reported in Figure 1 and 2, Table 1 and 2. When compared to other Mytilidae, only four gene boundaries are shared with Mytilus (tRNAs are not considered), i.e. rrnS-nad6, nad2-cox3, nad4L-nad5 and nad3-cox1, while the rest of the genome is different, thus highlighting that gene arrangement evolves rapidly within the family.
Comparing the two sex-linked genomes, protein-coding genes may have different lengths (Table 3). Both the F-type and the M-type include a large number of Unassigned Regions (URs; 29 in F and 27 in M: see Tables 1, 2 and Additional File 1). Among these, the largest (4,521 and 2,844 bp in female and male, respectively) are here referred to as LURs (i.e. Large Unassigned Regions).
Both F and M mt genomes show the same gene order and contain the full gene complement of the typical metazoan mtDNA, with two additional tRNAs: trnM and trnL (Figures 1 and 2; Tables 1 and 2). In males the cox2 gene is duplicated ( Figure 2 and Table 2).
The atp8 gene was reported as missing in several bivalve mollusks, however, as recently reported [23], the lack of atp8 would rather be an annotation inaccuracy due to the extreme variability of the gene. Following [23], we found an atp8 gene in M. senhousia in both M and F genomes.
The position of the two ribosomal RNA genes, obtained through BLAST comparison, does not differ between male and female. In both sexes, rrnL is located in a region flanked by the trnM(AUG) and nad3 genes. Assuming that the first base at the 5'-end comes immediately after trnM(AUG), and that the 3'-end of the gene corresponds to the first base upstream of the start codon of nad3, the male rrnL gene is 552 bp longer than the female one (1,130 bp in length). The rrnS gene is located in a region flanked by the trnG and nad6 genes and, as above, we assumed that the first base at the 5'-end comes immediately after trnG, and that the 3'-end of the gene corresponds to the first base upstream of the start codon of the nad6 gene. Here, the difference in length is reduced to 82 bp: the female rrnS gene is 819 bp long while the male one is 1,087 bp. F and M genomes of M. senhousia contain 22 tRNA genes (see Tables 1, 2 and Additional File 2). As observed in the mtDNA of some other mollusks (Katharina tunicata, Cepaea nemoralis, Mytilus species complex and Argopecten irradians), two leucine tRNA genes are present in M. senhousia. These can be differentiated by their anticodons: TAA for trnL(UUR) and TAG for trnL(CUN), which are 2-fold and 4-fold redundant, respectively. Consequently, trnL is 6-fold redundant. An additional trnM was also detected, as in V. philippinarum, Mytilus species complex, Crassostrea gigas, C. hongkongensis and C. virginica. The additional tRNA coding for methionine, trnM(AUA), has the TAT anticodon.
Annotation table fragment (F-type genome; cf. Table 1)
Type  Name        Start  End    Length  Strand  Start codon/Anticodon  Stop codon
GENE  atp6         9052   9765     714  H       ATG                    TAG
UR    UR-9         9766   9791      26
tRNA  trnT         9792   9858      67  H       TGT
GENE  cob          9835  11031    1197  H       ATA                    TAA
UR    UR-10       11032  11049      18
tRNA  trnD        11050  11114      65  H       GTC
UR    UR-11       11115  11123       9
tRNA  trnR        11124  11189      66  H       TCG
tRNA  trnS(AGN)   11191  11248      58  H       TCT
UR    UR-12       11249  11268      20
tRNA  trnG        11269  11336      68  H       TCC
rRNA  rrnS        11337  12154     818  H
GENE  nad6        12155  12778     624  H       ATG                    TAA
UR    UR-13       12779  12828      50
GENE  nad2        12829  13773     945  H       ATA                    TAA
UR    UR-14       13774  13855      82
GENE  cox3        13856  14710     855  H       ATG                    TAA
UR    UR-15       14711  14721      11
tRNA  trnK        14722  14792      71  H       TTT
UR    UR-16       14793  14797       5
tRNA  trnF        14798  14865      68  H       GAA
UR    UR-17       14866  14878      13
tRNA  trnP        14879  14945
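The anticodon assignments given for the two leucine and two methionine tRNAs can be checked with a short sketch. It only computes the exactly complementary codon (the reverse complement of the anticodon) and deliberately ignores wobble pairing, so it shows one member of each codon family rather than the full family. The function name `complementary_codon` is hypothetical and not part of any tool used in this work.

```python
# Watson-Crick complements for DNA bases; sequences are written 5'->3'.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def complementary_codon(anticodon: str) -> str:
    """Return the codon that pairs exactly with a 5'->3' anticodon."""
    return "".join(COMPLEMENT[base] for base in reversed(anticodon.upper()))

# Anticodons as reported in the text (CAT is the ancestral methionine anticodon).
for name, anticodon in [("trnL(UUR)", "TAA"), ("trnL(CUN)", "TAG"),
                        ("trnM(AUA)", "TAT"), ("trnM(AUG)", "CAT")]:
    print(name, anticodon, "->", complementary_codon(anticodon))
```

The complementary codons (TTA, CTA, ATA and ATG) fall in the UUR, CUN, AUA and AUG codon groups, consistent with the gene names used above.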
In both male and female mtDNAs, trnS(AGN) has a shortened DHU arm (see Additional File 2). This is not atypical, as this arm is unpaired in many metazoan taxa [24][25][26][27]. Moreover, mispairing between bases in stems is consistent across several taxa. For example, the second base pair in the anticodon stem of trnW has a T-T mispairing in Lampsilis ornata, Mytilus, and K. tunicata and a T-G pairing in several gastropods [25].
In the F mitochondrial genome of Musculista, 20 out of 22 tRNA genes are clustered in five groups of two to six (see Figure 1 and Table 1). Of the remaining two, trnT lies between atp6 and the 5'-end of cob genes (with 24 bp overlapping each other) while trnA lies between nad5 and nad4 genes. Thus, 4 of the 13 protein-coding genes (cob, nad1, nad4L and nad4) have a tRNA preceding their 5'-end. In contrast, 7 other genes (cox1, cox2, atp8, atp6, nad2, cox3 and nad5) have a non-coding sequence at their 5'-end that is capable of forming a stem and loop structure (see Figure 3).
In male mitochondrial DNA, 19 of the 22 tRNA genes are clustered in five groups ranging from two to six (see Figure 2 and Table 2). Of the remaining three, trnT lies between atp6 and the 5'-end of cob genes (with 25 bp overlapping each other), trnA lies between nad5 and nad4 genes and trnE lies between the large unassigned region (LUR) and the 5'-end of cox1 gene. Thus, 5 of the 14 protein-coding genes (cox1, cob, nad1, nad4L and nad4) have a tRNA preceding their 5'-end, while 7 other genes (cox2b, cox2, atp8, atp6, nad2, cox3 and nad5) have a non-coding sequence preceding their 5'end that is capable of forming a stem and loop structure (see Figure 3). In a few cases those structures contain the translation initiation codon (cox1 and cox2 in females, nad2 in males).
The nucleotide compositions of the two genomes are summarized in Table 3. Given the G content of the F and M coding strand (see Table 3), this can be considered as the heavy (H) strand of the molecule. The A+T content of the H strand is also high (66.5%, F; 67.0%, M). Variable values of A+T content are common in mollusks, and they have been reported in L. ornata (62%, [28]), Pupa strigosa (61.1%, [29]), and C. nemoralis (59.8%, [25]). In other mollusks, the A+T content is much higher (Albinaria coerulea, 70.7%, [30]; K. tunicata, 69.0%, [6]; Graptacme eborea, 74.1%, [31]). Musculista values in A+T content are among the highest observed in the Phylum, and reflect the high heterogeneity of molluscan mtDNA [2]. Moreover, there is a marked bias in favor of T against C, which is not restricted to any particular class of genes and does not differ between the two genomes. The GC and AT asymmetry between the two mitochondrial DNA strands can be expressed in terms of GC skew and AT skew, calculated according to [32]: GC skew = (G-C)/(G+C) and AT skew = (A-T)/(A+T), where G, C, A, and T are the occurrences of the four bases in the H strand. In the M. senhousia F and M mitochondrial genomes, the GC skew and the AT skew are +0.28 and -0.18 (F), and +0.23 and -0.17 (M), respectively.
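As a minimal sketch of the skew formulas given above (not the procedure actually used for Table 3), the following Python function computes the A+T content, GC skew and AT skew of one strand. The function name `strand_composition` and the toy sequence are hypothetical; in practice the input would be the full H-strand sequence of each genome.

```python
from collections import Counter

def strand_composition(seq: str) -> dict:
    """Return A+T content, GC skew and AT skew for one strand (5'->3')."""
    counts = Counter(seq.upper())
    a, c, g, t = (counts[base] for base in "ACGT")
    return {
        "at_content": (a + t) / (a + c + g + t),
        "gc_skew": (g - c) / (g + c),   # GC skew = (G-C)/(G+C)
        "at_skew": (a - t) / (a + t),   # AT skew = (A-T)/(A+T)
    }

# Toy strand (hypothetical): richer in G than C, and in T than A.
toy_strand = "GGGGTTTTTTAAAACC"
print(strand_composition(toy_strand))
```

A strand that is richer in G than in C and richer in T than in A, like the H strand described here, yields a positive GC skew and a negative AT skew.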
In the M. senhousia male mtDNA, 6 out of 14 protein genes start with the ATA codon and 8 with ATG, while in the female 7 out of 13 start with ATG and 6 with ATA (Tables 1 and 2). This pattern differs from that observed for Mytilus galloprovincialis, where 9 out of 13 protein genes start with the ATG codon, 2 with ATA and 2 with GTG [23,33]. In all known metazoan mtDNAs, the most common start codon is ATG, and it is generally thought that the methionine tRNA with the CAT anticodon represents the ancestral form. Moreover, [24] suggested that the second methionine tRNA arose by duplication. The F and M genomes of the venerid Venerupis philippinarum also have two tRNA genes for methionine, but both have the ancestral CAT anticodon. TAA is the termination codon ten times in F and nine times in M mtDNA, while TAG is the stop codon two times in F and four times in M. In both M and F genomes, the nad5 gene ends with an incomplete termination codon T- (Tables 1 and 2), which is likely completed by polyadenylation after transcript processing [34].
The least used codons in males are UCG (6), CCG (8) and CGG (8), while in females they are CCG (4), CGC (7) and UAG (7). Of these, CGC is also among the least common in the mtDNA of other mollusks. Synonymous codons, whether four-fold (4FD) or two-fold (2FD) degenerate, are recognized by the same tRNA, with the exception of the methionine codons, which are recognized by different tRNAs (Table 5).
Moreover, 2,754 F and 2,967 M Musculista codons (72.6% and 72.4% in female and male, respectively) end with an A or T, a more pronounced phenomenon than that observed for a typical invertebrate codon bias. There is a strong bias against the use of C (9.3% and 11.3% in female and male, respectively) at the third position in all codons: in detail, for residues with a fourfold degenerate third position, codon families ending with T are the most frequently used (46.7% and 46.6% in female and male, respectively). This is also the case for two-fold degenerate codons. In other words, in every case an amino acid residue can be specified by any NNY codon, both female and male M. senhousia mitochondrial genomes [...]
Finally, in eight 2FD and seven 4FD codon families in females and in seven 2FD and seven 4FD codon families in males, the most frequently used codon does not match the tRNA anticodon. This has been observed in other metazoan mtDNAs as well [46][47][48][49][50], and it suggests that strict codon-anticodon complementarity does not affect the codon composition of the genome. Deviations from equal frequency of the four nucleotides at 4FD sites are common in animal mtDNA and have been attributed to several factors, such as unequal presence of the four nucleotides in the nucleotide pool, preference of the mitochondrial gamma DNA polymerase for specific nucleotides, or asymmetrical mutation rate owing to different duration of exposure of the lagging strand during replication [40,51-54].
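The third-position statistics reported above can be illustrated with a small sketch that tabulates the base found at the third position of every codon in a coding sequence. This is an assumption-laden illustration, not the software used for the codon usage tables: the function name `third_position_composition` and the toy coding sequence are hypothetical, and a real analysis would concatenate all protein-coding genes of each genome.

```python
from collections import Counter

def third_position_composition(cds: str) -> dict:
    """Return the fraction of codons ending in each base (A, C, G, T)."""
    cds = cds.upper()
    usable = len(cds) - len(cds) % 3          # ignore a trailing partial codon
    third_bases = Counter(cds[i + 2] for i in range(0, usable, 3))
    total = sum(third_bases.values())
    return {base: third_bases[base] / total for base in "ACGT"}

# Toy in-frame sequence (hypothetical): ATG TTT GCT AAA CCC TAA
fractions = third_position_composition("ATGTTTGCTAAACCCTAA")
print(fractions)
print("codons ending in A or T:", fractions["A"] + fractions["T"])
```

Applied to the real gene sets, the summed A and T fractions would correspond to the 72.6% (F) and 72.4% (M) values given in the text.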
Comparing the two M. senhousia sex-linked genomes, the most conserved protein-coding genes are cox1 and cob, and the least conserved are nad6 and atp8 (Table 4). Synonymous (Ks) and non-synonymous (Ka) substitution values between the two genomes vary among genes (Table 4). Ka is particularly low for cox1 (0.042), whereas Ks is not (0.838), suggesting that this gene is under some selective constraint (Ka/Ks = 0.05). The conservation of cox1 is common in animal mtDNA [55,56]. In the cob gene, both K values are lower than average (Table 4), with a Ka/Ks ratio (0.10) close to that of the cox1 gene.
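The Ka/Ks value quoted for cox1 follows directly from the two substitution rates. The short sketch below reproduces that arithmetic; the function name `ka_ks_ratio` is hypothetical, and the Ka and Ks values themselves are taken from the text rather than recomputed from sequence data.

```python
def ka_ks_ratio(ka: float, ks: float) -> float:
    """Return the ratio of non-synonymous (Ka) to synonymous (Ks) substitutions."""
    if ks == 0:
        raise ValueError("Ks is zero: the ratio is undefined")
    return ka / ks

# cox1 values reported in the text (Table 4).
print(round(ka_ks_ratio(0.042, 0.838), 2))
```

A ratio well below 1, as obtained here, is conventionally read as a sign of purifying selection on the protein sequence.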

The Large Unassigned Region (LUR)
As mentioned, in the female genome the LUR (F-LUR) is 4,521 bp long and it is included between trnE and the [...] Table 2). Both start with a dissimilar sequence/spacer, 20 and 237 bp long, respectively.
The F-LUR contains two large repeats (Figure 4: Rep1 and Rep2) about 2,150 bp long (2,149 bp Rep1; 2,151 bp Rep2), both subdividable into three regions: A, B and C (named A1, A2, B1, B2, C1 and C2; see Figure 4 and Additional File 3). Between Rep1 and Rep2, the A subregion is the most conserved (pD = 0.000, see Table 6) while C is the most variable, although with a low pD (0.010 ± 0.005). Overall, Rep1 and Rep2 have a pD of 0.004 ± 0.001. The region including the last 202 bp of the F-LUR shows some similarity (pD = 0.449 ± 0.035) to the A subregions (A1 and A2); for this reason it is indicated here as subregion A'.
All the A-type subregions (A1, A2 and A') start with a 46 bp conserved motif, named here α, that contains a 10 bp hairpin (αh; see Figure 5). Both C subunits (C1 and C2) begin with a hairpin 27 bp long (Ch; Figure 5). The M-LUR contains an A-like subregion showing a pD of 0.362 ± 0.032 from A1 and A2 (Table 6), indicated as A'' (Figure 4). A'' starts with a 37 bp motif, here named α*, similar to α but 9 bp shorter and with three mutations that allow the formation of a longer hairpin, here named α*h (31 bp; Figure 5), in comparison to the female hairpin αh. The M-LUR continues with the subunit B, which is the most conserved region compared to the F-LUR, showing a pD from B1 and B2 of 0.098 ± 0.007 and 0.096 ± 0.007, respectively (Table 6). At the 3' end of B there is a motif, indicated as γ (Figure 4), that is similar to the first part of the C subunits. γ is repeated four times in tandem. The length of γ1, γ2 and γ3 ranges from 265 to 268 bp, while the last repeat, γ4, is truncated and measures 17 bp (Additional File 3; Figure 4). The pD among the γ motifs is low and ranges from 0.008 ± 0.005 in the female (between γc1 and γc2) to 0.019 ± 0.009 in the male (between γ1 and γ3) (Table 6). The pD of the γ motifs between male and female varies from 0.346 to 0.350 ± 0.027 (Table 6). At the 5' end of each γ motif a secondary structure is present (γ1h, γ2h, γ3h and γ4h, respectively; Figure 5): γ1h is 14 bp long, while the other three are 28 bp long. γ2h and γ3h are identical, γ4h has a two-base mutation at the center of the loop, and γ1h is identical to the upper portion of γ4h (see Figure 5).
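The pD values used to compare LUR subregions can be illustrated with a minimal sketch, assuming that pD denotes the uncorrected p-distance, i.e. the proportion of aligned sites at which two sequences differ. The function name `p_distance`, the gap-handling choice (sites with a gap in either sequence are skipped) and the toy alignments are assumptions made for illustration; the standard errors reported in Table 6 are not reproduced here.

```python
def p_distance(seq_a: str, seq_b: str) -> float:
    """Proportion of differing sites between two aligned, equal-length sequences.

    Sites where either sequence has a gap ('-') are ignored.
    """
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must be aligned to the same length")
    compared = 0
    differences = 0
    for base_a, base_b in zip(seq_a.upper(), seq_b.upper()):
        if base_a == "-" or base_b == "-":
            continue
        compared += 1
        if base_a != base_b:
            differences += 1
    if compared == 0:
        raise ValueError("no comparable sites")
    return differences / compared

# Toy alignments (hypothetical): one mismatch in ten sites, then a gapped pair.
print(p_distance("ACGTACGTAC", "ACGTACGTAT"))
print(p_distance("AC-T", "ACGA"))
```

Identical repeats, such as the A subregions of Rep1 and Rep2, would give a value of 0 under this definition.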
Furthermore, in line with what has been found in other DUI bivalves, including Mytilus, an ORF coding for 121 amino acids was found in the F-LUR of M. senhousia. This protein was proposed to have a functional role in DUI. Detailed analyses of this novel DUI-related putative protein have been published in a broader comparative framework (see [20]).

The cox2 duplication in the male mtDNA
The male mtDNA contains an extra copy of the cox2 gene. This is not new for DUI animals, since the female mt genome of the marine clam V. philippinarum has a cox2 duplication as well (GenBank Acc. No. AB065375: Okazaki and Ueshima, unpublished). In the female Musculista, the cox2 gene (F-cox2) is 660 bp long and is flanked by the "cox1/UR-6" and "UR-7/atp8" regions at the 5'- and 3'-ends, respectively (see Figure 1 and Table 1). In the male mitochondrial genome, the two copies of cox2 are adjacent and separated by a short non-coding region 41 bp long (UR-6). The two cox2 copies are located between the "cox1/UR-5" and "UR-7/atp8" regions; the first is 813 bp long, while the second is 690 bp long (Figure 2 and Table 2).
Bayesian phylogenetic analyses on

Gene content and order of F and M Mitochondrial genomes in M. senhousia
In M. senhousia both M and F mtDNAs share the same gene content and order, except for a duplicated cox2 gene in males, and include the typical gene content of bivalve mtDNA. It has to be noted, however, that a common feature of bivalves is the apparent lack of the atp8 gene. For instance, [2] mentioned that a lack of the atp8 gene is one of several unusual features of the Mytilus mt sequence. The atp8 gene was considered missing for almost all bivalve species studied so far, including Crassostrea hongkongensis, C. gigas, C. virginica, Placopecten magellanicus, Argopecten irradians, Mizuhopecten yessoensis and Acanthocardia tuberculata. In contrast, the atp8 gene was found in Hiatella arctica, as well as in the female mitochondrial genome of the unionid bivalve L. ornata [28]. A remarkable observation is that V. philippinarum, another species with DUI [57], was recently found to contain a putative atp8 gene [58], which was not found in the first analyses; nonetheless, this gene apparently encodes only 37 amino acids and therefore its function is questionable. Finally, [23] examined ORFs from several bivalve mitochondrial genomes and found two novel ORFs (F-orf-ur4 and M-orf-ur4) in the largest unassigned region of the F and M mytilid genomes (UR-4: see [33]). BLASTN searches against EST_others (all ESTs except human and mouse) showed that both are transcribed in Mytilus spp. BLASTX and PSI-BLAST searches using inferred amino acid sequences of F-orf-ur4 and M-orf-ur4 failed to detect any significant sequence similarity with known proteins, so the identity of those putative proteins is still unclear. Further analyses of structure and evolutionary patterns suggested that the novel ORFs "represent good candidates for the previously 'missing' atp8 in mytilid mtDNAs" [23]. Therefore, following [23], we also found putative atp8 genes in both sex-linked mitochondrial genomes of M. senhousia. Our atp8 genes share the same characteristics as the above-mentioned proteins, so we are confident in annotating them as Musculista atp8 genes.
Generally speaking, most mtDNAs are characterized by strand asymmetry in term of gene distribution. In both M. senhousia mt genomes, all genes are transcribed from the same strand, i.e. the asymmetry is at its highest among Metazoa. Most marine bivalves also share this feature (Mytilus species-complex, C. gigas, C. virginica, C. hongkongensis and V. philippinarum). In contrast, this is not true for the two freshwater species L. ornata [28] and Inversidens japanensis (Acc. No. AB055625 and AB055624) (see also [59]). In other mollusks, a relatively small number of mitochondrial genes are transcribed from the second strand. The scaphopods G. eborea and S. lobatum are an exception, with about an equal number of genes encoded by each strand [31,58]. The occurrence of all genes in the same strand is a relatively rare phenomenon in metazoans and, in addition to bivalves, it has been reported in some annelids (Lumbricus terrestris, [60]; Platynereis dumerilii, [61]) and brachiopods (Terebratulina retusa, [62]; Terebratalia transversa, [42]; Laqueus rubellus, [63]). Actually, almost 10% of the mitochondrial genomes examined to date do have all genes encoded in the same strand [10]. Moreover, most of the above mentioned groups, including Bivalvia, are also characterized by strong differences in gene content and/or gene order. This allowed [10] to suggest a possible correlation between these two features.
The trnS(AGN) could not be located with tRNAscan-SE [64] because of the absence of the DHU arm and therefore of a normal cloverleaf structure (see [27] for a detailed discussion), so we used the ARWEN software [65] to identify it. This unconventional tRNA was found also in several other animal groups ( [27] and references therein), and it evolved very early in Metazoa [66]. In vitro analyses confirmed its functionality [67].
In Table 7, the distribution of trnS(UCN) and trnS(AGN) among bivalves is reported (only complete mitochondrial genomes included; source: http://mi.caspur.it/mitozoa, see [3]). Most of the species (22) have both tRNAs, 7 have only trnS(UCN) and 3 (including M. senhousia) have only trnS(AGN). Placopecten magellanicus has two copies of trnS(UCN), while Mizuhopecten yessoensis seems to lack a serine tRNA.
[68] suggested that the secondary structure of a tRNA gene between a pair of protein genes is responsible for the precise cleavage of the polycistronic primary transcript. In the absence of a tRNA, this role can be played by a stem-loop structure, the 5'-end part of the gene itself, or a combination of the two. Potential hairpin structures at protein-protein gene junctions with no intervening tRNA have been reported in several studies (e.g., [6,33,39,69,70]). Our analysis showed that putative hairpins are present at all the gene junctions lacking a tRNA, suggesting a functional role for such intergenic sequences (Figure 3).

The Large Unassigned Region (LUR) and the sex-linked mt-DNA transmission
The structures of the palindromes found in the F and M LURs are reported in Figures 4 and 5. The presence of palindromes within a mtDNA CR is not new; in fact, the local fold symmetry created by the palindrome is thought to provide the site for DNA-binding proteins involved in the transcriptional machinery [71]. In more detail, palindromic motifs (and inverted repeats in general) have the potential to form single-stranded stem-loop cruciform structures, which have been reported to be essential for replication of circular genomes in many prokaryotic and eukaryotic systems [72]. The redundancy of palindromic elements in the Musculista male LUR, when compared to that of the female, may be related to an increased duplication ratio of the M mtDNA; we can also speculate that this feature may have some role in the process by which sperm mitochondrial DNA becomes dominant or exclusive in the male germline, although we know that this is also achieved through differential segregation during early embryo development, and likely through a second, stricter selection during the establishment of primordial germ cells (see [73]). Nevertheless, the question of how sperm mitochondrial DNA becomes the dominant or exclusive component of the male germline in DUI species remains open, and the answer may lie in various coordinated processes.
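The idea that inverted repeats can fold into stem-loop structures can be sketched with a naive scan for a short stem followed, after a loop, by its exact reverse complement. This is an illustration only and is not the method used to predict the hairpins in Figure 5: it does no thermodynamic folding, allows no mismatches in the stem, and the names `reverse_complement` and `find_hairpins`, the default stem and loop lengths, and the toy sequence are all hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}

def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    return "".join(COMPLEMENT[base] for base in reversed(seq))

def find_hairpins(seq: str, stem_len: int = 5, min_loop: int = 3, max_loop: int = 8):
    """List (start, loop_length, stem) for perfect inverted repeats in seq.

    A hit means seq[start:start+stem_len] is followed, after a loop of
    min_loop..max_loop bases, by its exact reverse complement.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * stem_len - min_loop + 1):
        stem = seq[start:start + stem_len]
        partner = reverse_complement(stem)
        for loop in range(min_loop, max_loop + 1):
            partner_start = start + stem_len + loop
            if partner_start + stem_len > len(seq):
                break
            if seq[partner_start:partner_start + stem_len] == partner:
                hits.append((start, loop, stem))
    return hits

# Toy sequence (hypothetical): stem GCGCA, loop TTTT, stem partner TGCGC.
print(find_hairpins("AAGCGCATTTTTGCGCAA"))
```

Counting such hits along each LUR would give a crude measure of the redundancy of palindromic elements discussed above, though real analyses would tolerate imperfect stems.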

The duplication of the cox2 gene
One noteworthy finding of this analysis is the cox2 gene duplication in the male mtDNA, with the duplicated gene being longer than the original one, a feature that might be related to DUI. In fact, an interesting analogy is evident with unionid bivalves, in which the male cox2 gene shows a 200-codon extension that is absent in the female mtDNA. Such a feature has been found in all unionids analyzed so far, and it has been related to DUI functioning [21,22,74-76]. Actually, [21,22] proposed several hypotheses for the role the cox2 extension may have in DUI, but all depend upon identifying a specific function for it, which is not a trivial task. Moreover, they detected in the male gonad a poly-adenylated mRNA transcript of the cox2 gene that includes the extension, and they concluded that the extension is protein-coding and functional. [21,22] also hypothesized that the COX2 protein extension might be involved in intracellular interactions determining the survival of the male mitochondrion. In other organisms, it has been shown that upon fertilization the sperm-derived mitochondria are targeted for elimination: a key process in sperm mitochondrial degradation is ubiquitination [77], in which mitochondria of paternal derivation are tagged with ubiquitin and then degraded. In Mytilus, in which a ubiquitin-like process has been proposed, this degradation would be sex-specific: the sperm-derived mitochondria survive in male embryos, whereas they are eliminated in females. All that considered, [21] proposed that the COX2 extension could be involved in blocking such elimination to ensure survival of the male mitochondrion, or, alternatively, that the extension could play a role in the segregation of male mitochondria to the gonad. In either case, it should be possible to detect the protein product of the extension outside of the inner mitochondrial membrane. An in situ hybridization seemed to demonstrate that the unionid male COX2 is present on both inner and outer membranes of the sperm mitochondria (see Figure 4 in [74]). Following this rationale, we hypothesize that the duplicated cox2b gene in male M. senhousia may represent a variant of what is found in unionoidean bivalves, with proper signals for DUI mitochondrial tagging lying in the COX2 protein extension of unionid bivalves, as well as in the duplicated COX2b protein of Musculista. Support for this view comes from the observation that an additional putative Trans Membrane Helix (TMH) is found in the 41-residue-long tail of the Musculista COX2b, although this tail is considerably shorter than the unionid one (200 amino acids). Actually, five putative TMHs were found in the unionid extended C-terminus of the male COX2, which led the Authors to hypothesize that it may have a functional significance for male unionoidean bivalve reproductive success [75,76].
By analogy, we suggest that COX2b might have some function related to mitochondrial tagging, like the unionid COX2 extension. Further studies are needed to clarify the role of such proteins in the unusual DUI system of mitochondrial inheritance. Actually, a duplication similar to the Musculista one was also found in V. philippinarum, but quite surprisingly in the female mtDNA (see unpublished GenBank annotation). This suggests that cox2 duplication may be uncoupled from maleness. Moreover, no Mytilus genomes show a similar situation for cox2 or any other gene, so neither duplicated genes nor a cox2 tail may be strictly necessary to sustain DUI.

Conclusions
The characteristics of the Musculista sex-linked mtDNAs add to the knowledge of DUI systems, and highlight some unexpected features shared among distantly related DUI species. Since it is commonly accepted that DUI is a variation of Strict Maternal Inheritance rather than a completely different mechanism, we think that DUI is a good experimental model to better understand the general rules, as well as the molecular features, of metazoan mitochondrial inheritance (see [18] for a detailed discussion). For these reasons, the complete mtDNA genome characterization of DUI bivalves is not a mere descriptive exercise, but rather a first step to unravel the complex genetic signals allowing Doubly Uniparental Inheritance of mitochondrial DNA, and the evolutionary implications of such an unusual transmission route in mitochondrial genome evolution in Bivalvia.

Sample Collection
Live M. senhousia specimens from the Venice Lagoon (Italy) were used for this analysis. Males and females were stimulated to spawn gametes in seawater supplemented with hydrogen peroxide, according to [78]. Each emission was examined under a light microscope to sex the specimens. A total of 10 sperm and 10 egg samples were then collected after gentle centrifugation (3,000 g). Seawater was removed, and ethanol was added before storing samples at -20°C.

PCR analyses
Total genomic DNA was extracted using the DNeasy Tissue Kit (Qiagen), and partial sequences of cytochrome b (cob) and mitochondrial ribosomal large subunit RNA (rrnL) were amplified and directly sequenced (primers reported in Table 8), as described in [79]. Sequencing reactions were performed on both strands with the BigDye Terminator Cycle Sequencing Kit according to the supplier's instructions (Applied Biosystem) in a 310 Genetic Analyzer (ABI) automatic sequencer. The 20 sequences obtained for both F and M genomes were aligned (not shown) and, after checking for variable sites, used to design sex-specific primers to amplify the entire mitochondrial genome in two overlapping fragments by long PCR reactions. Long PCR was performed on one Musculista specimen per sex. To obtain the F genome, the F-cob383R and F-16S142F primers were used. The M genome was amplified with M-cob386R and M-16S103F. Each pair of primers amplified a fragment of 10-11 kb. Long PCR primer sequences are reported in Table 1. Long PCR amplifications were performed on a GeneAmp® PCR System 2720 (Applied Biosystem) in a 50 μl reaction volume composed of 31.5 μl of sterilized distilled water, 10 μl of 5× Herculase II Fusion Reaction Buffer, 0.5 μl of dNTPs mix (25 mM each dNTP), 1.25 μl of each primer (10 μM), 5 μl of DNA template (25-50 ng) and 0.5 μl of Herculase II Fusion DNA Polymerase. Reaction conditions followed the supplier's recommendations: initial denaturation at 95°C for 5 min, followed by 30 cycles of 95°C for 20 s, 50°C for 20 s and 68°C for 10 min, and a final extension at 68°C for 8 min. Long PCR fragments were then purified using the Wizard® SV Gel and PCR Clean-Up System (Promega).
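The long PCR reaction mix listed above can be checked with a few lines of arithmetic: the component volumes should add up to the stated 50 μl, and final concentrations follow from stock concentration × volume added / total volume. The sketch below is only a worked check of the numbers in the text; the dictionary keys and the helper name `final_concentration` are hypothetical.

```python
import math

# Component volumes in microlitres, as listed in the text (two primers at 1.25 ul each).
volumes_ul = {
    "water": 31.5,
    "5x reaction buffer": 10.0,
    "dNTP mix": 0.5,
    "forward primer": 1.25,
    "reverse primer": 1.25,
    "DNA template": 5.0,
    "polymerase": 0.5,
}
total_ul = sum(volumes_ul.values())

def final_concentration(stock: float, volume_ul: float, total: float) -> float:
    """Dilution: final concentration = stock concentration * volume added / total volume."""
    return stock * volume_ul / total

print("total volume (ul):", total_ul)
print("buffer (x):", final_concentration(5, volumes_ul["5x reaction buffer"], total_ul))
print("each dNTP (mM):", final_concentration(25, volumes_ul["dNTP mix"], total_ul))
print("each primer (uM):", final_concentration(10, volumes_ul["forward primer"], total_ul))
```

With the listed volumes the buffer is at 1× and the mix contains 0.25 mM of each dNTP and 0.25 μM of each primer.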

Shotgun cloning
Sequencing of the two long PCR fragments was done using shotgun cloning: amplicons were randomly sheared to 1.2-1.5 kb DNA segments using a HydroShear device (GeneMachines). Sheared DNA was blunt-end repaired at room temperature for 60 min using 6 U of T4 DNA Polymerase (Roche), 30 U of DNA Polymerase I Klenow (NEB), 10 μl of dNTPs mix and 13 μl of 10× NEB buffer 2 in a 115 μl total volume, and then gel purified using the Wizard® SV Gel and PCR Clean-Up System (Promega). The resulting fragments were ligated into the SmaI site of a pUC18 cloning vector using the Fast-Link DNA ligation Kit (Epicentre) and electroporated into One Shot® TOP10 Electrocomp™ Escherichia coli cells (Invitrogen) using standard protocols. Clones were screened by PCR using M13 universal primers and recombinants were purified using Multiscreen (Millipore) according to the manufacturer's instructions. Clones were sequenced using M13 universal primers by Macrogen Inc. (Korea).
Raw sequences were manually corrected, and then assembled into contigs with Sequencher v.4.6 (Gene Codes). Hence, the final assemblies were based on a minimum sequence coverage of 3×.

Secondary structures and annotation
The tRNA genes were identified by their secondary structure using ARWEN [65], with invertebrate mitochondrial codon predictors. Analysis of Open Reading Frames (ORFs) was performed with the ORF Finder program of NCBI (http://www.ncbi.nlm.nih.gov/projects/gorf/) using the invertebrate mitochondrial genetic code. Sequences were identified using BLASTX, PSI-BLAST [80] and BLASTN [81] as implemented on the NCBI website (http://www.ncbi.nlm.nih.gov/).
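To illustrate what an ORF search under the invertebrate mitochondrial genetic code involves, the following sketch scans the three forward reading frames of a sequence. It is not a reimplementation of NCBI ORF Finder: the function name `find_orfs` and the toy sequence are hypothetical, start codons are restricted here to ATG and ATA (the two reported for M. senhousia genes), and only the forward strand is scanned because all genes in these genomes lie on one strand. In this genetic code TAA and TAG are the stop codons, while TGA encodes tryptophan, so TGA does not end an ORF.

```python
STARTS = {"ATG", "ATA"}   # simplification: only the start codons reported in the text
STOPS = {"TAA", "TAG"}    # TGA is not a stop in the invertebrate mitochondrial code

def find_orfs(seq: str, min_codons: int = 10):
    """Return (start, end) pairs, 0-based and end-exclusive, stop codon included."""
    seq = seq.upper()
    orfs = []
    for frame in range(3):
        orf_start = None
        for i in range(frame, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if orf_start is None and codon in STARTS:
                orf_start = i                      # first start codon opens the ORF
            elif orf_start is not None and codon in STOPS:
                if (i + 3 - orf_start) // 3 >= min_codons:
                    orfs.append((orf_start, i + 3))
                orf_start = None                   # look for the next ORF in this frame
    return orfs

# Toy sequence (hypothetical): ATG, four GCT codons, an in-frame TGA, then TAA.
toy = "CCATGGCTGCTGCTGCTTGATAAGG"
print(find_orfs(toy, min_codons=5))
```

The in-frame TGA in the toy sequence is read through, which is the behavior that distinguishes this code from the standard genetic code.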