Complete Sequence and Analysis of the Mitochondrial Genome of Hemiselmis andersenii CCMP644 (Cryptophyceae)
© Kim et al; licensee BioMed Central Ltd. 2008
Received: 19 February 2008
Accepted: 12 May 2008
Published: 12 May 2008
Cryptophytes are an enigmatic group of unicellular eukaryotes with plastids derived by secondary (i.e., eukaryote-eukaryote) endosymbiosis. Cryptophytes are unusual in that they possess four genomes–a host cell-derived nuclear and mitochondrial genome and an endosymbiont-derived plastid and 'nucleomorph' genome. The evolutionary origins of the host and endosymbiont components of cryptophyte algae are at present poorly understood. Thus far, a single complete mitochondrial genome sequence has been determined for the cryptophyte Rhodomonas salina. Here, the second complete mitochondrial genome of the cryptophyte alga Hemiselmis andersenii CCMP644 is presented.
The H. andersenii mtDNA is 60,553 bp in size and encodes 30 structural RNAs and 36 protein-coding genes, all located on the same strand. A prominent feature of the genome is a ~20 Kbp intergenic region composed of numerous tandem and dispersed repeat units ranging from 22 to 336 bp. Adjacent to these repeats are 27 copies of palindromic sequences predicted to form stable DNA stem-loop structures. One such stem-loop is located near a GC-rich and a GC-poor region and may have a regulatory function in replication or transcription. The H. andersenii mtDNA shares a number of features with the genome of the cryptophyte Rhodomonas salina, including general architecture, gene content, and the presence of a large repeat region. However, the H. andersenii mtDNA is devoid of inverted repeats and introns, which are present in R. salina. Comparative analyses of the suite of tRNAs encoded in the two genomes reveal that the H. andersenii mtDNA has lost or converted its original trnK(uuu) gene and possesses a trnS-derived 'trnK(uuu)', which appears unable to produce a functional tRNA. Mitochondrial protein coding gene phylogenies strongly support a variety of previously established eukaryotic groups, but fail to resolve the relationships among higher-order eukaryotic lineages.
Comparison of the H. andersenii and R. salina mitochondrial genomes reveals a number of cryptophyte-specific genomic features, most notably the presence of a large repeat-rich intergenic region. However, unlike R. salina, the H. andersenii mtDNA does not possess introns and lacks a Lys-tRNA, which is presumably imported from the cytosol.
The mitochondrion is a double-membrane enclosed organelle found in the vast majority of extant eukaryotes. Mitochondria are best known for their essential role in energy generation, but they are also the site of additional important cellular processes such as iron-sulfur (Fe-S) cluster assembly and the beta-oxidation of fatty acids. Some degenerate forms of mitochondria, such as the mitosome of the diplomonad parasite Giardia lamblia, have secondarily lost energy generating pathways and seem to retain only the Fe-S cluster maturation function. All mitochondria are believed to share a single origin from an α-proteobacterial-like prokaryote, but a wide diversity of mitochondrial genome architectures has evolved subsequent to the diversification of modern-day eukaryotes [1, 3, 4]. For example, whereas "derived" animals possess monomeric circular mitochondrial genomes, an observation which led to the initial assumption that mtDNAs are primarily circular, many other mitochondrial genomes, such as those of the ciliate Tetrahymena pyriformis, the green alga Chlamydomonas reinhardtii and the cnidarian metazoan Aurelia aurita (moon jelly), are linear. In addition, while some fungi and many plants have circular-mapping mtDNAs, their mitochondria actually contain predominantly linear mtDNA molecules with combinations of monomers and concatemers, with only a minor fraction of the molecules present in a circular form. A more extreme example is the mtDNA of kinetoplastids, which consists of one maxi- and many different mini-circles that are interconnected to form an extensive network. Mitochondrial gene content is also highly variable; the mtDNA of the jakobid flagellate Reclinomonas americana encodes 97 genes, the largest set of mitochondrial genes currently known, whereas the mtDNA of the malaria parasite Plasmodium falciparum contains just 3 protein coding genes and 2 highly fragmented small and large subunit ribosomal RNA (rRNA) genes. The most highly derived forms of mitochondria, such as the hydrogenosome of Trichomonas vaginalis and the Giardia lamblia mitosome, have lost their genomes entirely.
Mitochondria are also known as sites of unusual molecular biology and biochemistry. Marande and Burger recently showed that the mtDNA genes of the diplonemid Diplonema papillatum are fragmented into as many as nine modules, each residing on a distinct 6 or 7 Kbp chromosome. The mechanism by which these fragmented gene pieces are linked together to form contiguous transcripts is unknown. Extensive mRNA editing is another example of the bizarre molecular biology of mitochondria. Kinetoplastid mitochondrial mRNAs are subject to insertions and deletions of uridylate residues, sometimes >100 such insertions/deletions per transcript. Mitochondrial mRNA editing is also widespread in land plants and dinoflagellates. For example, ~2% of the cox1 and cob gene sequences in three dinoflagellate species investigated by Lin et al. were edited at the mRNA level.
We are studying the genomic diversity and evolution of cryptophytes, a ubiquitous and ecologically significant group of single-celled eukaryotes found in freshwater and marine environments. Most cryptophytes, except for members of the genus Goniomonas, harbor plastids of secondary endosymbiotic origin. A variety of shared morphological features, such as the presence of ejectisomes, flat mitochondrial cristae, and an anterior depression, support the monophyly of cryptophytes, as do molecular phylogenetic data. One unique feature of cryptophyte plastids that distinguishes them from other plastids of red algal origin is the retention of the remnant nucleus of the red algal endosymbiont, referred to as the nucleomorph [22, 23]. Consequently, most cryptophytes harbor four distinct genomes–nuclear, nucleomorph, mitochondrial, and plastid genomes–contained in separate compartments. Cryptophytes are thus an interesting model system with which to study endosymbiotic gene transfer, genome evolution, and protein targeting.
In this study, we report the complete mitochondrial genome sequence of the newly described cryptophyte species Hemiselmis andersenii CCMP644, and compare it to the only other cryptophyte mitochondrial genome described thus far, that of Rhodomonas salina. In addition, individual and concatenated mitochondrial protein coding gene sequences were analyzed to infer the phylogenetic relationships of cryptophytes to other eukaryotes.
DNA preparation, sequencing, and genome assembly
Hemiselmis andersenii mtDNA was isolated and sequenced to ~10× coverage as described in Lane et al. About 1,200 end sequences were screened for quality and vector contamination with Pregap4 and automatically assembled using gap4 version 4.10 in the Staden package. Complete automated assembly of a large intergenic space between trnS and cox2 was unsuccessful due to the highly repetitive nature of this region. In an attempt to manually resolve this area, short (~30 bp) unique sequences within the trnS and cox2 genes were used to probe the sequence database for reads that extended from these two loci into the repeat region. These sequences were extracted and manually aligned using MacClade version 4.08. Sequences at the ends of the new constructs were then selected and the process was repeated. However, due to the presence of multiple identical copies of a >300 bp repeat, the assembly of a single unambiguous contig was not possible. When all available sequence reads were considered, three robust contigs were produced, each ending with similar repetitive sequences consisting of a ~340 bp repeat unit. These three contigs were joined to circularize the map. The complete H. andersenii mtDNA has been submitted to GenBank under accession number EU651892.
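The read-recruitment step described above (probing the read set with a short unique sequence and collecting reads that extend into the repeat region) can be illustrated with a minimal Python sketch. This is not the procedure used with the Staden package; it is an exact-match search on both strands, and the probe and reads shown are invented toy sequences, not H. andersenii data.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T/N only)."""
    table = str.maketrans("ACGTNacgtn", "TGCANtgcan")
    return seq.translate(table)[::-1]


def recruit_reads(probe: str, reads: dict[str, str]) -> list[str]:
    """Return the names of reads that contain the probe on either strand."""
    probe = probe.upper()
    probe_rc = reverse_complement(probe)
    hits = []
    for name, seq in reads.items():
        seq = seq.upper()
        if probe in seq or probe_rc in seq:
            hits.append(name)
    return hits


# Hypothetical 30 bp probe and toy reads (assumptions for illustration only).
probe = "ATGGCTAGCTAGGATCCGATCGATTACGGA"
reads = {
    "read1": "CCCC" + probe + "TTTTGGGG",                   # forward-strand hit
    "read2": reverse_complement("AAAA" + probe + "CCGG"),   # reverse-strand hit
    "read3": "ACGTACGTACGTACGTACGTACGTACGTACGT",            # no hit
}
print(recruit_reads(probe, reads))
```

In an iterative walk of the kind described, the ends of the recruited reads would supply the next probe, and the search would be repeated until the repeat copies can no longer be distinguished.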
DNA secondary structure within the repeat region was predicted using mfold version 3.2 at a folding temperature of 37°C and ionic conditions of 1.0 M [Na+] and 0.0 M [Mg++].
Genome size/structure determination
We used pulsed-field gel electrophoresis (PFGE) to obtain an independent size estimate of the H. andersenii mitochondrial genome. Hemiselmis andersenii total DNA plugs were prepared as described in Lane et al. and digested overnight with the restriction enzymes Pst I or Bgl II (Fermentas, Hanover, MD, USA). Based on the genome sequence, these enzymes were predicted to cut the mtDNA only once or twice. Both untreated and enzyme-digested H. andersenii DNA plugs were run on a 1% agarose gel (1× TBE) in 0.5× TBE buffer at 14.0°C for 18 h at a voltage of 6.0 V/cm with a switch time between 1–25 s using a CHEF-DR III Pulsed-Field Gel Electrophoresis System (Bio-Rad Laboratories, Hercules, CA, USA). DNA on the pulsed-field gel was transferred to a nylon membrane. Southern hybridization using a ~700 bp cox1 probe as in Lane and Archibald revealed that undigested mitochondrial DNA molecules were trapped in the wells or found in the 'compression zone'. The Pst I or Bgl II endonuclease treated DNA plugs revealed mitochondrial molecules in a discrete band below the 'compression zone'. The corresponding bands could not be visualized on the ethidium-bromide stained pulsed-field gel image because of nuclear and nucleomorph DNA smears in the background. In order to visualize mtDNA on the pulsed-field gel, an initial PFGE run was used to remove the linear nuclear and nucleomorph chromosomes from the PFGE plugs. These plugs, which still contained organellar DNA, were subsequently removed from the gel and digested with the restriction enzymes Pst I and Bgl II. Digested plugs were then inserted into a fresh gel and electrophoresed under the conditions described above. The 5 Kbp and Lambda CHEF DNA Size Standards (Bio-Rad Laboratories, Hercules, CA, USA) were used to estimate the size of the enzymatically linearized H. andersenii mtDNA.
Genome annotation and GC content/skew analyses
Annotation of the H. andersenii mtDNA and the GC content and skew analyses were performed in Artemis version 8. Gene identification was carried out using BLASTX and BLASTN. Small and large subunit ribosomal RNA (rRNA) genes were identified by comparison to rRNA gene sequences in the mitochondrial genome of Rhodomonas salina. Transfer RNAs were identified using tRNAscan-SE version 1.21.
Genome rearrangements between the two cryptophyte mtDNAs
The extent to which the H. andersenii and R. salina mitochondrial genomes are rearranged relative to each other was estimated using GRIMM. Each genome was represented as a sequence of 63 units, comprising a repeat region and the 62 genes common to the two cryptophyte mtDNAs.
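To make the "sequence of units" representation concrete, the following Python sketch encodes two gene orders as signed lists and counts breakpoints between them. It does not reproduce GRIMM's reversal-distance algorithm; the breakpoint count is a simpler, related quantity (each reversal can remove at most two breakpoints, so half the count, rounded up, is a lower bound on the number of reversals). The genomes are treated as linear for simplicity, and the example orders and strands are hypothetical.

```python
import math


def signed_breakpoints(reference, query):
    """Count breakpoints of `query` relative to `reference`.

    Each genome is a list of (unit_name, strand) tuples, strand being +1 or -1.
    Both lists must contain the same units. Genomes are treated as linear.
    """
    index = {name: i + 1 for i, (name, _) in enumerate(reference)}
    ref_strand = {name: strand for name, strand in reference}
    # Express the query as a signed permutation in reference coordinates.
    perm = [index[name] * strand * ref_strand[name] for name, strand in query]
    framed = [0] + perm + [len(perm) + 1]
    return sum(1 for a, b in zip(framed, framed[1:]) if b - a != 1)


# Hypothetical four-unit example: the query differs by one inverted segment.
reference = [("cox1", 1), ("cob", 1), ("nad11", 1), ("repeat", 1)]
query = [("cox1", 1), ("nad11", -1), ("cob", -1), ("repeat", 1)]

b = signed_breakpoints(reference, query)
print(b, "breakpoints; at least", math.ceil(b / 2), "reversal(s)")
```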
RT-PCR of 'trnK(uuu)'
tRNAscan-SE version 1.21 identified a putative intron of ~20 bp in the anticodon loop of the H. andersenii trnK(uuu) gene. To determine whether this prediction was correct, we performed RT-PCR using Lysine-tRNA-specific primer pairs and H. andersenii total RNA provided by H. Khan. To eliminate DNA contamination, 1 μl of total RNA was incubated for 30 min with RQ1 RNase-Free DNase (Promega, Madison, WI, USA). RT-PCR was performed using the QIAGEN one-step RT-PCR kit (QIAGEN, Valencia, CA, USA), with control reactions in which the reverse-transcription step was omitted. The following two pairs of primers were used: 1) the forward primer 5'-GAAGGTTGCTCGAATGGAA-3' with the reverse primer 5'-GAAGGTATAGGAATTGAACCTATTC-3', and 2) the forward primer 5'-GCCCAGAAGGTTGCTC-3' with the reverse primer 5'-AAGAAGGTATAGGAATTGAACCTAT-3'. Reverse transcription was carried out for 30 min at 50°C, followed by inactivation of the reverse transcriptase and activation of HotStart Taq DNA polymerase for 15 min at 95°C, then 35 cycles of 94°C for 1 min, 47°C for 1 min, and 72°C for 1 min, and a final extension at 72°C for 10 min. The amplified PCR fragments were cloned into the pCR4-TOPO vector using the TOPO TA cloning kit for sequencing (Invitrogen, Carlsbad, CA, USA). Between 5 and 10 bacterial colonies from each reaction were selected for sequencing on a Beckman Coulter CEQ8000 (Beckman Coulter Inc., Fullerton, California, USA).
Molecular phylogenetic analysis
Of the 36 protein-coding genes found in the H. andersenii mtDNA, 25 were selected for phylogenetic analyses. Eleven genes (atp8, nad8, rps2, rps3, rps4, rps7, rps8, rps13, rpl5, rpl6, tatC) were excluded because their sequences were poorly conserved and/or were only present in a few taxonomic groups. H. andersenii protein sequences were aligned with their homologs from other mitochondrial genomes available from GenBank. Amino acid sequences were aligned using MacClade version 4.08 and ambiguously aligned sites were manually removed. In addition to individual protein analyses, a concatenated protein data set containing 25 proteins was analyzed. To include the maximum number of gene sequences, we combined 25 protein-coding gene sequences encoded in 18 mitochondrial genomes across diverse eukaryotic taxa. As most mitochondrial genomes do not possess all 25 protein-coding genes selected for analysis, as many as 12 protein gene sequences were missing per taxon. A maximum likelihood tree was produced using RAxML-VI-HPC version 2.2.3 with the PROTMIXJTT model of sequence evolution and the automatic tree rearrangement setting, starting from 100 distinct randomized maximum parsimony trees. Bootstrap analysis was based on 100 re-samplings.
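Because some taxa lack some of the 25 genes, a concatenated data set must pad absent genes with a missing-data symbol. The Python sketch below shows one way this could be done; it is an illustration rather than the procedure used here, the choice of '?' as the missing-data character is an assumption, and the taxon names and sequences are toy values.

```python
def concatenate_alignments(gene_alignments, taxa, missing="?"):
    """Concatenate per-gene alignments into one supermatrix.

    gene_alignments: dict mapping gene name -> {taxon: aligned sequence}
    taxa: ordered list of all taxon names
    Taxa lacking a gene receive a run of `missing` of that gene's width.
    """
    supermatrix = {taxon: [] for taxon in taxa}
    for gene, alignment in gene_alignments.items():
        lengths = {len(seq) for seq in alignment.values()}
        if len(lengths) != 1:
            raise ValueError(f"{gene}: aligned sequences differ in length")
        width = lengths.pop()
        for taxon in taxa:
            supermatrix[taxon].append(alignment.get(taxon, missing * width))
    return {taxon: "".join(parts) for taxon, parts in supermatrix.items()}


# Toy alignments (hypothetical residues); taxonC lacks the second gene.
genes = {
    "cox1": {"taxonA": "MFT", "taxonB": "MFS", "taxonC": "MYT"},
    "nad1": {"taxonA": "LK", "taxonB": "LR"},
}
matrix = concatenate_alignments(genes, ["taxonA", "taxonB", "taxonC"])
for taxon, seq in matrix.items():
    print(taxon, seq)
```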
Results and Discussion
General features of Hemiselmis andersenii mtDNA
Functional categories of the 36 protein genes encoded in the mitochondrial genome of Hemiselmis andersenii.
Ribosomal proteins (14)
  Small subunit: rps2 rps3 rps4 rps7 rps8 rps11 rps12 rps13 rps14 rps19
  Large subunit: rpl5 rpl6 rpl14 rpl16
Oxidative phosphorylation (21)
  NADH dehydrogenase: nad1 nad2 nad3 nad4 nad4L nad5 nad6 nad7 nad8 nad9 nad10 nad11
  Ubiquinol:cytochrome c oxidoreductase: cob
  Cytochrome c oxidase: cox1 cox2 cox3
  ATP synthase: atp1 atp6 atp8 atp9
Sec-independent protein translocase protein (1)
  tatC
Comparison of the mtDNA gene order in H. andersenii to other genomes reveals the presence of five gene clusters shared among distantly related protists: two ribosomal protein clusters (rps12-rps7-rps19-rps3-rpl16-rpl14-rpl5-rps14 and rps8-rpl6-rps13-rps11) and three NADH dehydrogenase clusters (nad4L-nad5; nad4-nad2; nad10-nad9). These gene clusters have been suggested to represent vestiges of bacterial operons [12, 24]. Interestingly, all 66 genes in the H. andersenii mitochondrial genome are encoded on the same strand. While the evolution of such an arrangement seems improbable, absolute strand polarity has been observed in the mitochondrial genomes of diverse eukaryotes such as the amoeba Acanthamoeba castellanii (59 genes), the fungus Penicillium marneffei (47 genes), and the green alga Chlamydomonas eugametos (20 genes) [35–37]. In addition, strikingly similar mtDNA architectures–gene-dense regions, a single large repetitive intergenic region, and all genes encoded on one strand–are seen in diverse protists such as the stramenopile Thraustochytrium aureum (The Organelle Genome Megasequencing Program; http://megasun.bch.umontreal.ca/ogmp/) and the green alga Pedinomonas minor. Understanding the biological significance of such convergence at the level of genome architecture will require comparative molecular and biochemical studies of mitochondria in these organisms.
Comparison of the mitochondrial genomes of Hemiselmis andersenii and Rhodomonas salina
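Comparison of two cryptophyte mitochondrial genomes
  Genome size: 60,553 bp (H. andersenii)
  Size of the repeat region: 19.7 Kbp (33%) in H. andersenii; 4.7 Kbp (10%) in R. salina
  Group II introns: absent in H. andersenii; present in R. salina
  Number of genes (with assignable functions): 66 genes (28 tRNAs) in H. andersenii; 69 genes (27 tRNAs) in R. salina
  Inverted repeats: absent in H. andersenii; a pair of ~1.5 Kbp repeat units in R. salina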
With respect to conservation of gene order, 64.5% of the genes shared between the two cryptophyte mitochondrial genomes (40 of the 62 shared genes, which comprise 36 protein-coding genes, 24 tRNA genes (see below), and 2 rRNA genes) are present in thirteen syntenic blocks, each consisting of 2–7 genes. These include: 1) cox1-cob-nad11, 2) nad4L-nad5, 3) atp1-trnP(ugg), 4) rps8-rpl6-rps13-rps11, 5) trnC(gca)-atp6, 6) trnI(gau)-trnQ(uug)-trnR(gcg)-trnE(uuc)-trnW(cca)-nad10-nad9, 7) nad4-nad2, 8) trnR(ucu)-trnG(ucc), 9) trnM(cau)f-trnS(uga), 10) trnY(gua)-trnL(uag), 11) tatC-'trnK(uuu)' [H. andersenii] /trnS(gcu) [R. salina]-nad7, 12) cox3-rps12-rps7-rps19, and 13) rps3-rpl16-rpl14-rpl5-rps14. As noted earlier, some of the conserved gene clusters, such as nad4L-nad5, are found in distantly related eukaryotes and appear to be vestiges of bacterial operons. Analysis using GRIMM suggests that the observed difference in gene order between the two cryptophyte mitochondrial genomes can be explained by at least 31 reversal events.
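The synteny figures quoted above can be checked directly from the listed blocks. The short Python sketch below tallies the genes in the thirteen blocks and expresses them as a fraction of the 62 shared genes; the block labels are taken from the text, with block 11 written using a combined placeholder name for the 'trnK(uuu)'/trnS(gcu) position.

```python
# The thirteen syntenic blocks as listed in the text.
blocks = [
    ["cox1", "cob", "nad11"],
    ["nad4L", "nad5"],
    ["atp1", "trnP(ugg)"],
    ["rps8", "rpl6", "rps13", "rps11"],
    ["trnC(gca)", "atp6"],
    ["trnI(gau)", "trnQ(uug)", "trnR(gcg)", "trnE(uuc)", "trnW(cca)",
     "nad10", "nad9"],
    ["nad4", "nad2"],
    ["trnR(ucu)", "trnG(ucc)"],
    ["trnM(cau)f", "trnS(uga)"],
    ["trnY(gua)", "trnL(uag)"],
    ["tatC", "trnK(uuu)/trnS(gcu)", "nad7"],  # placeholder label for block 11
    ["cox3", "rps12", "rps7", "rps19"],
    ["rps3", "rpl16", "rpl14", "rpl5", "rps14"],
]

shared_genes = 36 + 24 + 2  # protein-coding + tRNA + rRNA genes
genes_in_blocks = sum(len(block) for block in blocks)
percent_syntenic = 100 * genes_in_blocks / shared_genes

print(f"{genes_in_blocks} of {shared_genes} genes "
      f"({percent_syntenic:.1f}%) in {len(blocks)} blocks")
```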
Repeat structure of the H. andersenii mitochondrial genome
The R. salina mtDNA is characterized by a pair of ~1.5 Kbp inverted repeats that are joined by 112 bp of sequence. In contrast, repeats in the H. andersenii mitochondrial genome are not inverted, but are instead dispersed or arranged in tandem throughout the large non-coding region, with individual repeat units ranging from 22 to 336 bp and occurring up to 100 times (Figure 2). Given that R. salina and H. andersenii are distantly related to one another, the large repeat region presumably arose during or prior to the early diversification of cryptophytes. While there is no obvious sequence similarity between the two repeat regions, both contain multiple copies of palindromic sequences, which are predicted to form stable stem-loop DNA structures. In H. andersenii, two types of stem-loop structures, I and II, were identified using the mfold program. The Type I structure has two slight variations, I-a and I-b, which occur 21 and 5 times, respectively (Figures 2 and 3). Type I-a and I-b structures have 22 and 20 base pairings in their stems, respectively, and occur adjacent to tandem repeats (Figures 2 and 3). One copy of the type II stem-loop structure is located within a ~300 bp segment that is devoid of any discernible repeat units, but close to the high and low GC regions noted earlier (Figures 2 and 3). As was suggested for R. salina by Hauth et al., tandem repeats and multiple stem-loop structures in H. andersenii mtDNA might be involved in the regulation of transcription and replication, a hypothesis that needs to be tested further.
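As a rough illustration of what a palindromic, stem-loop-forming sequence looks like at the sequence level, the Python sketch below scans for perfect hairpins: a stem, a short loop, and the reverse complement of the stem. This is only a string search under simplifying assumptions (perfect pairing, fixed stem length); it is not a substitute for the thermodynamic folding performed by mfold, and the example sequence is invented.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_hairpins(seq, stem_len, min_loop=3, max_loop=8):
    """Return (start, loop_length) for every perfect hairpin in `seq`.

    A hairpin is a stem of `stem_len` bases, a loop of min_loop..max_loop
    bases, and then the exact reverse complement of the stem.
    """
    seq = seq.upper()
    hits = []
    for start in range(len(seq) - 2 * stem_len - min_loop + 1):
        stem_rc = reverse_complement(seq[start:start + stem_len])
        for loop in range(min_loop, max_loop + 1):
            arm = start + stem_len + loop
            if seq[arm:arm + stem_len] == stem_rc:
                hits.append((start, loop))
    return hits


# Hypothetical sequence: 8 bp stem, 4 nt loop, flanked by two bases each side.
toy = "AA" + "GGATCCGA" + "TTTT" + "TCGGATCC" + "AA"
print(find_hairpins(toy, stem_len=8))
```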
Hauth et al. demonstrated that the repeat region of the R. salina mtDNA roughly coincides with a change in the direction of 'cumulative GC skew' [the running sum of (G-C)/(G+C) along the genome] and suggested that the repeat corresponds to the origin of replication. We investigated the GC skew in the H. andersenii mitochondrial genome to see whether a similar pattern exists. Unlike R. salina, however, the H. andersenii GC skew does not change direction near the repeat region. Instead, in both the H. andersenii and R. salina mtDNA, observed GC skew patterns strongly correlate with transcriptional orientations, where the coding strand tends to be G-rich (data not shown). Therefore, the GC skew patterns of the two cryptophyte mitochondrial genomes do not seem to be the result of replication-associated mutational bias, but rather of the non-random distribution of the protein coding genes, as has been observed in some other genomes. Nevertheless, based on the presence of other features such as stem-loop structures, it seems reasonable to assume that the repeat region in both cryptophyte mitochondrial genomes corresponds to the origin of replication.
Codon usage and transfer RNAs
Hemiselmis andersenii mtDNA codon usage table. Rows are grouped by the first position of the codon, with the third position varying within each group; columns give the second position of the codon.
Second position:   U                 C                 A                  G
                   UUU [F] 608       UCU [S] 123       UAU [Y] 312        UGU [C] 123
                   UUC [F] 126•      UCC [S] 18        UAC [Y] 69•        UGC [C] 20•
                   UUA [L] 688•      UCA [S] 205•      UAA [stop] 32      UGA [stop] 2
                   UUG [L] 163       UCG [S] 57        UAG [stop] 3       UGG [W] 120•
                   CUU [L] 179       CCU [P] 139       CAU [H] 140        CGU [R] 83
                   CUC [L] 38•       CCC [P] 18        CAC [H] 30•        CGC [R] 17•
                   CUA [L] 88•       CCA [P] 115•      CAA [Q] 217•       CGA [R] 95•
                   CUG [L] 20        CCG [P] 28        CAG [Q] 42         CGG [R] 32
                   AUU [I] 579       ACU [T] 162       AAU [N] 307        AGU [S] 183
                   AUC [I] 85•       ACC [T] 24        AAC [N] 95•        AGC [S] 29•
                   AUA [I] 207•      ACA [T] 242•      AAA [K] 517• †     AGA [R] 110•
                   AUG [M] 233••     ACG [T] 54        AAG [K] 75         AGG [R] 15
                   GUU [V] 311       GCU [A] 186       GAU [D] 221        GGU [G] 299
                   GUC [V] 38        GCC [A] 35        GAC [D] 39•        GGC [G] 41•
                   GUA [V] 204•      GCA [A] 225•      GAA [E] 283•       GGA [G] 131•
                   GUG [V] 62        GCG [A] 64        GAG [E] 65         GGG [G] 83
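A codon usage table of this kind is obtained by tallying in-frame triplets across all protein-coding sequences. The Python sketch below shows the tally under simple assumptions (sequences supplied as in-frame DNA strings whose lengths are multiples of three); the two coding sequences are toy examples, not H. andersenii genes. In the table above, for instance, AAA accounts for 517 of the 592 lysine codons (517 + 75).

```python
from collections import Counter


def codon_usage(cds_list):
    """Count codons (reported as RNA triplets) over in-frame coding sequences."""
    counts = Counter()
    for cds in cds_list:
        rna = cds.upper().replace("T", "U")
        if len(rna) % 3 != 0:
            raise ValueError("coding sequence length is not a multiple of 3")
        counts.update(rna[i:i + 3] for i in range(0, len(rna), 3))
    return counts


# Toy coding sequences (hypothetical): Met-Lys-Lys-stop and Met-Phe-Lys-stop.
usage = codon_usage(["ATGAAAAAGTAA", "ATGTTTAAATAA"])
lys_total = usage["AAA"] + usage["AAG"]
print(usage["AAA"], "of", lys_total, "lysine codons are AAA")
```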
Another possible mechanism to account for the missing tRNA is that the structurally abnormal 'trnK(uuu)' gene (Figure 4A) forms a functional Lys-tRNA to decode the codons AAA and AAG. Several cases of atypically-structured tRNAs are known from animal and ciliate mitochondria [47, 48]. Interestingly, tRNAscan-SE predicted the existence of a 20 bp intron within the H. andersenii 'trnK(uuu)', and we conducted further experiments to test whether this is indeed the case. RT-PCR experiments using primer sets specific for 'trnK(uuu)' indicated that the putative intron was not removed in the mature tRNA. This result is not unexpected, given that the 20-bp putative intron is too short to be a self-splicing group I or II intron, which are the only types of introns reported in mitochondrial genomes. Sequencing of ~20 clones also did not reveal any evidence for RNA editing within the 'trnK(uuu)'. These results suggest that if 'trnK(uuu)' is indeed expressed to form a functional Lys-tRNA, it is predicted to have an unusually AU-rich anticodon stem and a long variable region, atypical for Lys-tRNA (Figure 4A). Long variable regions ranging from 11 to 23 nucleotides are generally restricted to tRNA-Leu, tRNA-Ser, and bacterial tRNA-Tyr. The D- and T-loops of the 'trnK(uuu)' sequence show sequence similarity to one of the two mitochondrion-encoded tRNA-Ser genes (Figure 4A and 4B), both of which have a long variable region. In addition, comparative analysis with the R. salina mtDNA revealed genomic position conservation between the H. andersenii trnS-like 'trnK(uuu)' gene and the trnS(gcu) gene of R. salina, flanked by the tatC and nad7 genes. The H. andersenii 'trnK(uuu)' and R. salina trnS(gcu) genes overlap tatC by 51 bp and 22 bp, respectively. This strongly suggests that the H. andersenii 'trnK(uuu)' is indeed derived from an ancestral gene that encoded tRNA-Ser, explaining the origin of its long variable region. The overlap between the H. andersenii 'trnK(uuu)' and tatC suggests that 'trnK(uuu)' may play a role in processing the 3' end of the tatC gene transcript. This hypothesis could explain why the 'trnK(uuu)' gene still remains in the genome and retains conserved secondary structure in the stem loop and D- and T-loops, even if it does not form a functional tRNA. Comprehensive molecular and biochemical experimentation will be necessary to confirm or refute the existence of mitochondrial tRNA import in H. andersenii and the functionality of the unusual 'trnK(uuu)' gene.
When the H. andersenii tRNA genes were compared to those of R. salina, 24 homologous pairs of tRNAs were identified, leaving only four H. andersenii and three R. salina tRNA genes not unambiguously matched to each other. Each of the tRNA pairs possesses identical anticodons except for the H. andersenii 'trnK(uuu)' and R. salina trnS(gcu) pair, despite their common derivation. The trnS(gcu) of H. andersenii, having sequence homology to the 'trnK(uuu)', probably originated from a recent gene duplication event. Of the three remaining H. andersenii tRNA genes that are unmatched in R. salina, two–trnL(gag) and trnG(gcc)–are redundant because trnL(uag) and trnG(ucc) can decode all of their respective four-codon families. These redundant copies might have been lost in an ancestor of R. salina after it diverged from H. andersenii. Lastly, the H. andersenii trnI(cau) is somewhat similar to the trnK(uuu) of R. salina and only marginally resembles the R. salina trnI(cau) at the 3' end. It is possible that the H. andersenii trnI(cau) originated through recombination between ancestral trnI(cau) and trnK(uuu) genes, which would explain the lack of an obvious trnK(uuu) homolog in H. andersenii comparable to the R. salina trnK(uuu). Substantial sequence divergence among the three genes, however, makes it difficult to accurately trace the origin of the trnI(cau) and the loss of the original trnK(uuu) gene in H. andersenii. On the other hand, the unusual trnI(uau) gene reported from R. salina is not found in H. andersenii. It was suggested that the R. salina trnI(uau) is derived from trnF(uuc) through a recent gene duplication event. Overall, the two cryptophyte mitochondrial genomes use similar tRNA sets to recognize codons. However, unlike H. andersenii, which may need to import at least a Lys-tRNA from the cytosol, the R. salina mtDNA does possess the minimal set required for tRNA autonomy.
Molecular phylogenetic analyses
Cryptophytes are a well-established eukaryotic lineage, supported by both molecular and morphological features. However, their relationship to other eukaryotic groups, particularly those containing plastids of secondary endosymbiotic origin, has been the subject of considerable debate. The cryptophyte plastid is the product of a secondary endosymbiosis involving a red algal cell, the same process which accounts for plastid origins in haptophytes, dinoflagellates, and stramenopiles. Cavalier-Smith suggested that plastids in these four algal lineages arose from a single secondary endosymbiosis in a common ancestor that these organisms shared, to the exclusion of other eukaryotic groups. However, this "chromalveolate" hypothesis is controversial [51, 52]. Recent molecular studies have shown that the katablepharids, an enigmatic collection of plastid-less flagellates, are a sister group to cryptophytes [53, 54], and large-scale concatenated analyses of nuclear genes suggest that cryptophytes and haptophytes are also related [55, 56].
As expected, a close relationship between the two cryptophytes H. andersenii and R. salina was well supported in the mitochondrial protein phylogenies, with twenty of twenty-five individual protein phylogenies showing this relationship. Five individual gene phylogenies–nad2, rpl14, rpl16, rps12, rps14–did not recover an H. andersenii-R. salina clade, although alternative topologies were not supported with >50% bootstrap support values. Additionally, single protein phylogenies were not, for the most part, able to resolve the relationship of cryptophytes to other eukaryotes. The position of cryptophytes was highly variable from protein to protein and the group did not regularly associate with other taxonomic clades with >50% bootstrap support values, except in the cob and nad1 gene trees, where cryptophytes branch with haptophytes (81%) and jakobids (77%), respectively.
We subsequently analyzed a set of 25 concatenated proteins to assess the phylogenetic position of cryptophytes. In this analysis, the H. andersenii-R. salina clade received 100% bootstrap support (Figure 5). Other well-established eukaryotic groups, including opisthokonts, rhodophytes, stramenopiles, and Viridiplantae, were also strongly recovered, but the relationships among major lineages were not. The jakobid Reclinomonas branched as the sister group to the Viridiplantae with moderate support (89% bootstrap support), and Malawimonas showed an affinity for these two groups in two of the three data sets, as was previously inferred from a concatenate of ten mitochondrial proteins. It is not clear whether the jakobid (and/or malawimonad)-Viridiplantae affinity is a phylogenetic artifact or reflects the true evolutionary history of mitochondrial genes. Though growing evidence supports a relationship between cryptophytes and haptophytes [55, 56, 58], our extensive mitochondrial protein analyses did not reveal this relationship with reasonable bootstrap support, other than in a single protein gene tree (cob). In summary, while mitochondrial gene sequences are able to resolve some of the eukaryotic lineages determined using other markers, they are at present incapable of resolving the deepest branches of the eukaryotic tree using current phylogenetic methods and the present level of taxon sampling.
We have sequenced the mitochondrial genome of the cryptophyte H. andersenii and compared it to that of the distantly related cryptophyte R. salina. Our analyses reveal that both genomes are characterized by a gene dense region and a single large intergenic space that includes numerous repeats and palindromic sequences predicted to form stable DNA stem-loop structures. Despite the overall similarities in content and architecture between the two genomes, their modes of regulating DNA replication and transcription seem to differ. Unlike in R. salina, all 66 genes in the H. andersenii mtDNA are located on the same strand, a relatively rare observation in mitochondrial genomes. Phylogenetic analysis of multiple mitochondrial gene sequences indicated a clear affiliation between the two cryptophytes but was not able to resolve the position of cryptophytes relative to other eukaryotic groups.
We thank D. Spencer for discussion, J. Leigh for mitochondrial protein sequence alignments, H. Khan for H. andersenii RNA, A. Roger for help with phylogenetic analyses, and D. Spencer and H. Khan for helpful comments on the manuscript. A. Bendich is acknowledged for providing insight on the probable in vivo structure of H. andersenii mtDNA. This work was supported by Genome Atlantic and a Natural Sciences and Engineering Research Council of Canada Discovery Grant (28335-04) awarded to JMA. EK receives postdoctoral fellowship support from the Tula Foundation. JMA is a Scholar of the Canadian Institute for Advanced Research, Program in Integrated Microbial Biodiversity.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.