"Tandem duplication-random loss" is not a real feature of oyster mitochondrial genomes

Duplications and rearrangements of coding genes are major themes in the evolution of mitochondrial genomes, with important consequences for the function of mitochondria and the fitness of organisms. Yu et al. (BMC Genomics 2008, 9:477) reported the complete mt genome sequence of the oyster Crassostrea hongkongensis (16,475 bp) and found that a DNA segment containing four tRNA genes (trnK1, trnC, trnQ1 and trnN), a duplicated rRNA gene (rrnS) and part of a split rRNA gene (rrnL5') was absent compared with that of two other Crassostrea species. It was suggested that the absence was a novel case of "tandem duplication-random loss" with evolutionary significance. We independently sequenced the complete mt genomes of three C. hongkongensis individuals, all of which were 18,622 bp and contained the segment that was missing in Yu et al.'s sequence. Further, we designed primers, verified sequences and demonstrated that the sequence loss in Yu et al.'s study was an artifact caused by placing primers in a duplicated region. The duplication and split of ribosomal RNA genes are unique to Crassostrea oysters and have not been lost in C. hongkongensis. Our study highlights the need for caution when amplifying and sequencing through duplicated regions of the genome.


Background
Owing to its maternal inheritance and fast rate of evolution, the mitochondrial (mt) genome is particularly useful in phylogenetic analysis. The analysis of complete mt genome sequences provides not only information about nucleotide changes, but also insights into gene order and rearrangements that are indicative of major evolutionary changes.
We read with great interest an article that appeared in a recent issue of BMC Genomics (9:477, 2008) entitled 'Complete mitochondrial DNA sequence of oyster Crassostrea hongkongensis - a case of "Tandem duplication-random loss" for genome rearrangement in Crassostrea?' by Yu, Z.N., Wei, Z.P., Kong, X.Y., and Shi, W. [1]. Based on our data, we believe that an important part of Yu et al.'s paper is incorrect, and we would like to share our results with the readers of this Journal.
In their paper, Yu et al. (2008) reported that the complete mt genome of C. hongkongensis is 16,475 bp in length (GenBank accession number EU266073) and pointed out that 'A striking finding of this study is that a DNA segment containing four tRNA genes (trnK1, trnC, trnQ1 and trnN) and two duplicated or split rRNA genes (rrnL5' and rrnS) are absent from the genome, when compared with that of two other extant Crassostrea species, which is very likely a consequence of loss of a single genomic region present in ancestor of C. hongkongensis. It indicates this region seem to be a "hot spot" of genomic rearrangements over the Crassostrea mt-genomes' (p. 1, Abstract, lines 14-19). We have independently sequenced the complete mt genomes of three C. hongkongensis individuals. All three of our sequences contained the DNA segment that was reported missing in Yu et al.'s study. The discrepancy is not trivial, as the loss of the duplicated region was central to Yu et al.'s hypothesis of a novel "tandem duplication and random loss" event during the evolution of C. hongkongensis. It was further suggested that this region was a "hot spot" for genomic rearrangement. Therefore, in view of the different sequences we obtained, it is critical to determine whether the loss of the duplicated region is real. We compared the gene order of our sequences with published Crassostrea mt genomes [2], and the results are presented in Table 1. Our sequence for C. hongkongensis has exactly the same gene order and arrangement as that of C. gigas, both containing the segment that is missing in Yu et al.'s sequence. The segment contains four tRNA genes, a duplicated rrnS and part of the split rrnL. The split rrnL was first discovered in C. virginica and appears to be unique to oysters [2].
The three C. hongkongensis oysters used in our study were from diverse populations (Hainan, Guangxi and Fujian) covering the entire geographic range of this species as far as we know (Guo et al., unpublished), and they were genetically identified using molecular markers prior to our study [3]. We compared one of our sequences with Yu et al.'s using BLAST (http://blast.ncbi.nlm.nih.gov/bl2seq/wblast2.cgi) [4]. In the 16,475 shared nucleotides, there are 15 SNPs (single nucleotide polymorphisms), and the similarity between the two genomes is 99.91%, suggesting that the oysters used in our study and in Yu et al.'s study are all C. hongkongensis. Sequence identity in major coding genes between our C. hongkongensis sequences and that of C. gigas is shown in Table 2. Considerable differentiation has occurred between the two sister species at some genes (e.g., gene identity of 75.1% for nad2) despite the identical gene order. Analysis of all four C. hongkongensis mt sequences revealed 41 SNPs: 28 in coding and 13 in noncoding regions (Table 3). Of the 19 SNPs in protein-coding genes, only one is non-synonymous, suggesting strong purifying selection. The non-synonymous mutation occurred in the atp6 gene in Yu et al.'s sequence only, and further studies are needed to determine whether it is a true SNP or a sequencing error.
Figure 2. PCR products amplified with different primers and separated by agarose gel electrophoresis.
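The 99.91% similarity quoted above follows directly from the two counts given in the text. The short sketch below reproduces that arithmetic; the function name `percent_identity` is our own, and it assumes that identity is simply the fraction of matching positions over the shared length, which may differ slightly from how BLAST treats gaps.

```python
def percent_identity(shared_length: int, mismatches: int) -> float:
    """Percentage of identical positions over the shared (aligned) length.

    Assumption: every difference is a single-nucleotide mismatch and
    gaps are not counted, which is a simplification of BLAST's report.
    """
    if shared_length <= 0:
        raise ValueError("shared_length must be positive")
    if not 0 <= mismatches <= shared_length:
        raise ValueError("mismatches must lie between 0 and shared_length")
    return 100.0 * (shared_length - mismatches) / shared_length


# Values taken from the text: 16,475 shared nucleotides, 15 SNPs.
identity = percent_identity(16_475, 15)
print(f"{identity:.2f}%")
```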
Yu et al. used ten pairs of primers to amplify the complete mt genome of C. hongkongensis (p. 11). We carefully studied the position of each primer and located them in our mt genome sequences of C. hongkongensis (Figure 1). It occurred to us that Yu et al. might have failed to amplify the gene block K1-C-Q1-rrnL5'-N-rrnS2 because some of their primers were placed in a duplicated region. As shown in Figure 1, primer pair 1* is located in cob and rrnS1 (or rrnS2), primer pair 2* is located entirely within the duplicated gene rrnS (rrnS1 and/or rrnS2), and primer pair 3* is located in rrnS2 (or rrnS1) and atp6 (primer pairs 1*, 2* and 3* correspond to the third, fourth and fifth primer pairs in Yu et al.'s paper; Table 4). Because these three primer pairs are either completely or partially (one of the two primers) located in the duplicated genes rrnS1 and rrnS2, they should theoretically amplify two fragments of different lengths, but in reality the smaller fragment may be preferentially amplified and sequenced. The shortest PCR products expected from the three primer pairs are 2,470 bp, 824 bp and 1,016 bp, respectively (Table 4). Because primer pair 2* is located entirely within the duplicated gene rrnS (rrnS1 or rrnS2), Yu et al. may have directly concatenated the sequences across the duplicated gene and thereby artificially lost the gene block K1-C-Q1-rrnL5'-N-rrnS2 (Figure 1). When we used these three primer pairs with our samples, we obtained only the shorter fragments (Figure 2). We increased the elongation time for PCR in an attempt to obtain the longer fragments, but failed, probably because the distance between the duplicated genes (2,147 bp) is too long. We then designed two new pairs of primers targeting the block between the duplicated rrnS genes, with one primer of each pair located in the rrnL gene that was supposed to be absent according to Yu et al. (Table 4, Figure 1).
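The reasoning above, that a primer sitting in a duplicated gene predicts two products of different length, can be illustrated with a small in-silico PCR sketch. Everything in it is hypothetical: the toy sequences merely stand in for a unique flank, the two rrnS copies and the intervening block, and the function `predicted_amplicons` is our own simplification (linear template, exact primer matches, forward primer on the given strand only), not the procedure used in either study.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_all(template: str, motif: str) -> list[int]:
    """Start positions of every (possibly overlapping) exact match."""
    sites, start = [], template.find(motif)
    while start != -1:
        sites.append(start)
        start = template.find(motif, start + 1)
    return sites


def predicted_amplicons(template: str, forward: str, reverse: str) -> list[int]:
    """Sorted product sizes for every forward/reverse binding-site pair."""
    fwd_sites = find_all(template, forward)
    # The reverse primer anneals where its reverse complement occurs.
    rev_sites = find_all(template, revcomp(reverse))
    sizes = set()
    for f in fwd_sites:
        for r in rev_sites:
            if r >= f:  # reverse site must lie downstream of forward site
                sizes.add(r + len(reverse) - f)
    return sorted(sizes)


# Hypothetical toy template: unique flank, repeat copy 1, block, repeat copy 2.
flank_5 = "ACGGTCATTGTAATGGGCCGC"       # stands in for a unique gene such as cob
rrns_like = "GATTACAGGCTTCAGTCCAA"      # stands in for one copy of rrnS
block = "TTTTCCCCGGGGAAAATTTT" * 3      # stands in for the intervening gene block
flank_3 = "CGTACGTTAGCCTAGGATCC"
template = flank_5 + rrns_like + block + rrns_like + flank_3

forward = flank_5[:8]                   # binds once, in the unique flank
reverse = revcomp(rrns_like[10:18])     # binds in both repeat copies

sizes = predicted_amplicons(template, forward, reverse)
print(sizes)  # two products; they differ by one repeat unit plus the block
```

In this toy case the two predicted sizes differ by exactly the length of one repeat copy plus the intervening block, which mirrors why only sequencing the shorter product would make that block appear to be missing.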
The two new primer pairs that we designed amplified successfully and produced fragments of the expected sizes, 2,658 and 1,905 bp (Table 4, Figure 2), proving that the gene block between the duplicated rrnS genes is actually present. To further confirm that both products contain the duplicated rrnS, each product was used as a PCR template for amplification with primer pair 2*, which amplifies rrnS only; both PCRs produced a fragment of the expected size (824 bp), the same as when genomic DNA was used as the template (Figure 2). We also sequenced some of the fragments, and the sequences are the same as expected from the mt sequences we obtained. These results clearly demonstrate that the duplicated rrnS and the split rrnL exist in the mt genome of C. hongkongensis. There is no loss of the duplicated genes or of the gene block between them. "Tandem duplication-random loss" is not a real feature of oyster mt genomes and has not occurred during the evolution of C. hongkongensis. The possibility that Yu et al. sequenced a rare mutant of C. hongkongensis is extremely low considering that: 1) we sequenced three individuals from three diverse populations; 2) Yu and colleagues screened more than one individual; and 3) we reproduced their results with our samples. This is a clear case of a PCR artifact involving duplicated genes.

Conclusion
In conclusion, the complete mt genome of C. hongkongensis is 18,622 bp in length, and its gene order and arrangement are identical to those of C. gigas. The loss of a gene segment reported by Yu et al. (2008) was an artifact due to placing PCR primers in a duplicated gene, and the phenomenon of "tandem duplication-random loss" does not exist in the mt genome of C. hongkongensis. Our study highlights the need for caution when amplifying and sequencing through regions with tandem duplication. When tandem duplication is expected, it is important to design long PCR fragments and to avoid placing primers in duplicated regions. Cross-verification with different sets of primers should be considered.
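One simple way to act on this recommendation, once a draft or reference sequence is available, is to count how many times each primer can bind and to flag any primer with more than one site. The sketch below is a minimal illustration under our own assumptions: exact matching only, both strands searched, and an optional wrap-around because mt genomes are circular. The names `count_sites` and `flag_repeated_primers`, the toy genome and the primer labels are all hypothetical.

```python
def revcomp(seq: str) -> str:
    """Reverse complement of a DNA string (A, C, G, T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def count_sites(genome: str, primer: str, circular: bool = True) -> int:
    """Number of exact binding sites for a primer on either strand."""
    genome, primer = genome.upper(), primer.upper()
    # For a circular molecule, let matches span the start/end junction.
    search = genome + genome[: len(primer) - 1] if circular else genome
    total = 0
    # A set avoids double counting when the primer is palindromic.
    for motif in {primer, revcomp(primer)}:
        start = search.find(motif)
        while start != -1:
            total += 1
            start = search.find(motif, start + 1)
    return total


def flag_repeated_primers(genome: str, primers: dict[str, str]) -> dict[str, int]:
    """Primers that bind more than once, mapped to their site counts."""
    counts = {name: count_sites(genome, seq) for name, seq in primers.items()}
    return {name: n for name, n in counts.items() if n > 1}


# Hypothetical toy genome containing one duplicated segment.
repeat = "GATTACAGGCTTCAGTCCAA"
genome = ("ACGGTCATTGTAATGGGCCGC" + repeat
          + "TTTTCCCCGGGGAAAATTTT" * 3 + repeat
          + "CGTACGTTAGCCTAGGATCC")
primers = {
    "in_unique_flank": "ACGGTCAT",
    "in_repeat": revcomp(repeat[10:18]),
}

flagged = flag_repeated_primers(genome, primers)
print(flagged)
```

A real check would also need to tolerate mismatches, since primers can anneal to imperfect copies; exact matching is used here only to keep the example short.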