Complete chloroplast genome of Gracilaria firma (Gracilariaceae, Rhodophyta), with discussion on the use of chloroplast phylogenomics in the subclass Rhodymeniophycidae
© The Author(s). 2017
Received: 18 October 2016
Accepted: 22 December 2016
Published: 6 January 2017
The chloroplast genome of Gracilaria firma was sequenced in view of its role as an economically important marine crop with wide industrial applications. To date, there are only 15 chloroplast genomes published for the Florideophyceae. Apart from presenting the complete chloroplast genome of G. firma, this study also assessed the utility of genome-scale data to address the phylogenetic relationships within the subclass Rhodymeniophycidae. The synteny and genome structure of the chloroplast genomes across the taxa of Eurhodophytina were also examined.
The chloroplast genome of Gracilaria firma maps as a circular molecule of 187,001 bp and contains 252 genes, which are distributed on both strands and consist of 35 RNA genes (3 rRNAs, 30 tRNAs, tmRNA and a ribonuclease P RNA component) and 217 protein-coding genes, including the unidentified open reading frames. The chloroplast genome of G. firma is by far the largest reported for Gracilariaceae, featuring a unique intergenic region of about 7000 bp with discontinuous vestiges of red algal plasmid DNA sequences interspersed between the nblA and cpeB genes. This chloroplast genome shows similar gene content and order to other Florideophycean taxa. Phylogenomic analyses based on the concatenated amino acid sequences of 146 protein-coding genes confirmed the monophyly of the classes Bangiophyceae and Florideophyceae with full nodal support. Relationships within the subclass Rhodymeniophycidae in Florideophyceae received moderate to strong nodal support, and the monotypic family of Gracilariales was resolved with maximum support.
Chloroplast genomes hold substantial information that can be tapped for resolving the phylogenetic relationships of difficult regions in the Rhodymeniophycidae, which are perceived to have experienced rapid radiation and thus received low nodal support, as exemplified in this study. The present study shows that the chloroplast genome of G. firma could serve as a key link to the full resolution of the Gracilaria sensu lato complex and the recognition of Hydropuntia as a genus distinct from Gracilaria sensu stricto.
Rhodophyta is a monophyletic phylum currently divided into seven classes: Bangiophyceae, Compsopogonophyceae, Cyanidiophyceae, Florideophyceae, Porphyridiophyceae, Rhodellophyceae and Stylonematophyceae [1, 2]. The Florideophyceae, which accommodates more than 6700 species of red algae, has chloroplast genomes published for only 15 species to date. Although only sporadic studies [4–6] have inferred the phylogenetic relationships among the Florideophyceae using chloroplast genome data, the whole chloroplast genome has been demonstrated to be a promising resource for resolving red algal relationships, attributable to the conserved nature of the slowly evolving genome. In line with that, the emergence of next-generation high-throughput sequencing technologies and the associated exponential decline in the cost of sequencing open up the opportunity for more whole-genome sequencing projects aimed at gaining novel insights into the evolutionary relationships within the red algal lineage.
Gracilaria firma Chang et Xia is an agar-producing seaweed distributed in the tropical and subtropical regions of the western Pacific [7–9]. It is an economically important marine crop that has been cultivated on a commercial scale in several countries, including Taiwan, Vietnam and the Philippines. The seaweed exhibits superior growth and agar quality compared with other investigated gracilarioid agarophytes. Apart from serving as a source of agar, which has versatile applications in the food and pharmaceutical industries, G. firma is also harvested and made into local delicacies for human consumption. Some regional abalone farms prefer G. firma as a natural feed.
This study presents the complete chloroplast genome of G. firma, adding to the number of chloroplast genomes available for the genus Gracilaria, one of the largest red algal genera, which encompasses more than a hundred species and has undergone numerous taxonomic revisions with contradictory conclusions [14–16]. Analysis of the synteny and genome structure confirmed that red algal chloroplast genomes are very compact and conserved across the subphylum Eurhodophytina, with different orders exhibiting syntenies that discriminate lineages. Phylogenomic analyses including most of the recently available taxa (as of the writing of this manuscript in June 2016) across eight orders of Florideophyceae were conducted. The relevance of chloroplast genomic data in addressing the phylogenetic relationships at different hierarchical levels in the Rhodymeniophycidae is also discussed.
Taxon sampling and sequencing
The plant material of G. firma was procured from a cultivation farm in Kouhu Township, Yunlin County, Taiwan. Genomic DNA was extracted from fresh thallus of G. firma using the DNeasy Plant Mini Kit (Qiagen, Valencia, CA, USA) according to the manufacturer’s instructions and sent to a company for library preparation and sequencing (ScienceVision Sdn Bhd, Selangor, Malaysia). A corresponding voucher specimen is deposited in the Herbarium of the National Taiwan Ocean University, Taiwan (NTOU) under the accession number NTOU-KH-5i2016-Gf. The library was prepared using a Nextera XT kit (Illumina, San Diego, CA, USA) and sequenced with 250 bp paired-end reads on the MiSeq sequencing platform (Illumina, San Diego, CA, USA).
De novo assembly and annotation of the Gracilaria firma chloroplast genome
A combination of automated pipelines and manual verification was used to annotate the chloroplast genome of G. firma. Initial gene calling was accomplished using DOGMA with a 60% cutoff for protein-coding genes, 80% for RNAs, and an e-value cutoff of 1e-5 for BLAST hits. The rRNA genes were determined using the RNAmmer 1.2 server with the ‘kingdom of input sequences’ selected as ‘Bacteria’. The tRNAs were identified using tRNAscan-SE v1.21 with default parameters and the source identified as ‘Mito/Chloroplast’. Introns and the tmRNA were searched for using ARAGORN, while the ribonuclease P gene (rnpB) was detected using RNAweasel. Open reading frames (ORFs) longer than 25 amino acids within the intergenic regions were searched for using NCBI’s ORF finder (https://www.ncbi.nlm.nih.gov/orffinder), with the genetic code option of ‘bacterial, archaeal and plant plastid’. BLASTp homology searches against all the non-redundant protein sequences from GenBank were used to determine the start and stop codon positions for each protein-coding gene, including those detected by DOGMA and the small or missing genes recovered from ORF finder. The circular genome map was generated using OGDraw. The G. firma chloroplast genome sequence and annotation were deposited in GenBank under accession number KX601051.
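The ORF search in intergenic regions can be pictured with a minimal sketch. This is not the NCBI ORF finder and not the authors' procedure: it assumes ATG as the only start codon (the bacterial, archaeal and plant plastid code also allows alternative starts), and the function names and the length threshold argument are hypothetical choices made for illustration.

```python
# Minimal six-frame ORF scan; a simplified stand-in, not the NCBI ORF finder.
STOPS = {"TAA", "TAG", "TGA"}            # stop codons of the plastid genetic code
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_orfs(seq, min_aa=26):
    """Return (strand, start, end, aa_length) for ATG..stop ORFs.

    start/end are 0-based, end-exclusive offsets on the scanned strand.
    The stop codon is inside the span but is not counted in aa_length.
    min_aa=26 corresponds to "longer than 25 amino acids".
    """
    seq = seq.upper()
    hits = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None                 # position of the first ATG in the open frame
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOPS:
                    aa_len = (i - start) // 3
                    if aa_len >= min_aa:
                        hits.append((strand, start, i + 3, aa_len))
                    start = None
    return hits


# Toy sequence (hypothetical): one start codon, 30 alanine codons, one stop.
toy = "ATG" + "GCT" * 30 + "TAA"
print(find_orfs(toy))
```

Candidate ORFs found this way would still need the homology check described above (BLASTp against GenBank) before being accepted as genes.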
Synteny analyses on the red algal chloroplast genomes
Whole-genome alignments were generated using the progressiveMauve aligner implemented in Mauve v20150226 under default settings. The synteny within the Gracilariales was assessed from the alignment of the chloroplast genomes of G. firma and related species from the same order, for which the genome sequences were reoriented so that the beginning of the psaM gene was the first position of the alignment. In view of the variable genome structure, as well as the lack of coherence in the designation of the first gene for chloroplast genomes across the Eurhodophytina, the synteny among the Florideophyceae and Bangiophyceae was assessed by visual inspection of the whole-genome alignments variously generated with the chloroplast genome of G. firma serving as the reference, to detect conserved segments of sequence free from any internal rearrangements, also known as locally collinear blocks (LCBs).
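Reorienting a circular genome to a common first gene amounts to rotating the sequence string and shifting the annotation coordinates by the same offset. The sketch below illustrates this under stated assumptions: coordinates are 0-based, the anchor gene lies on the forward strand (a reverse-strand anchor would also require reverse-complementing), and the function names are hypothetical.

```python
def rotate_circular(seq, new_start):
    """Rotate a circular sequence so that 0-based index new_start becomes position 0."""
    if not 0 <= new_start < len(seq):
        raise ValueError("new_start must lie within the sequence")
    return seq[new_start:] + seq[:new_start]


def rotate_coordinate(pos, new_start, genome_length):
    """Map a 0-based coordinate on the original sequence onto the rotated sequence."""
    return (pos - new_start) % genome_length


# Toy circular sequence (hypothetical); the anchor gene is assumed to begin at index 6.
toy = "AAACCCGGGTTT"
print(rotate_circular(toy, 6))          # GGGTTTAAACCC
print(rotate_coordinate(2, 6, len(toy)))  # 8
```

Applying the same offset to every genome in a comparison removes the arbitrary choice of origin, so that differences in the alignment reflect rearrangements rather than differences in where each submitted sequence happens to start.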
Red algal taxa analyzed in this study and their chloroplast genome composition
Species and GenBank accession number
Cyanidioschyzon merolae (AB002583)
Cyanidium caldarium (AF022186)
Porphyridium purpureum (AP012987)
Porphyra umbilicalis (JQ408795)
Pyropia haitanensis (KC464603)
Wildemania schizophylla (KR028420)
Bangia atropurpurea (KR028420)
Calliarthron tuberculosum (KC153978)
Sporolithon durum (KT266785)
Chondrus crispus (HF562234)
Gelidium elegans (KT266786)
Gelidium vagum (KT266787)
Gracilaria firma (KX601051)
Gracilaria salicornia (KF861575)
Gracilaria chilensis (KT266788)
Gracilaria tenuistipitata var. liui (AY673996)
Gracilariopsis lemaneiformis (KU179794)
Grateloupia taiwanensis (KC894740)
Coeloseira compressa (KU053957)
Laurencia snackeyi (LN833431)
Vertebrata lanosa (KP308097)
Amino acid sequences from each individual gene were aligned using MAFFT v7.222 with the ‘L-INS-i’ strategy. Each individual alignment was trimmed using the program Gblocks v0.91b, available at http://phylogeny.lirmm.fr/phylo_cgi/one_task.cgi?task_type=gblocks, to remove ambiguous regions which were poorly aligned or contained gaps, under the setting for a more stringent selection that does not allow many contiguous non-conserved positions. The alignments of each individual gene were variously concatenated using Bioedit v7.2.5 to produce two datasets comprising 20,033 and 27,205 positions. Maximum likelihood (ML) tree searches were implemented in PhyML v3.0, based on the MtZoa + G + I + F and cpREV + G + I + F models automatically selected by the program for the 79-gene and 146-gene datasets, respectively. Branch support for both datasets was evaluated using the SH-like approximate Likelihood Ratio Test (SH-aLRT) implemented in PhyML. Only the 146-gene dataset was subjected to bootstrap analysis with 1000 bootstrap replicates. Bayesian inference (BI) was conducted with MrBayes v3.2.6, using the cpREV + G + I + F model on both datasets, as the best-fitting MtZoa + G + I + F model deduced for the 79-gene dataset was not implemented in MrBayes. Two parallel independent runs were performed, each consisting of one cold chain and three heated chains of Markov chain Monte Carlo iterations for one million generations. The trees were sampled every 100th generation. Convergence of the runs to the stationary distribution was determined from the standard deviation of split frequencies (always less than 0.01) and from the convergence of the parameter values in the two independent runs. The first 25% of the total number of trees were discarded as burn-in, and the remaining trees were used to calculate a 50% majority rule tree and to determine the posterior probabilities for all datasets.
Members of the Cyanidiales, Cyanidioschyzon merolae and Cyanidium caldarium, were designated as the outgroup taxa based on the global phylogenetic searches recovered for the red algae.
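The Bayesian sampling scheme described above implies a fixed number of trees per run, which the following sketch works out. It assumes that the starting state (generation 0) is written out as a sample and that the burn-in count is rounded down; both are assumptions about bookkeeping, not statements taken from the text, and the function name is hypothetical.

```python
def mcmc_sample_counts(generations, sample_freq, burnin_fraction, n_runs):
    """Count sampled, discarded and retained trees for a set of MCMC runs."""
    per_run = generations // sample_freq + 1       # +1 assumes generation 0 is sampled
    discarded = int(per_run * burnin_fraction)     # burn-in rounded down (assumption)
    kept = per_run - discarded
    return {
        "per_run": per_run,
        "discarded_per_run": discarded,
        "kept_per_run": kept,
        "kept_total": kept * n_runs,
    }


# Settings stated in the text: 1,000,000 generations, sampling every 100th
# generation, 25% burn-in, two independent runs.
counts = mcmc_sample_counts(1_000_000, 100, 0.25, 2)
print(counts)
```

Under these assumptions each run contributes 10,001 sampled trees, of which 2,500 are discarded and 7,501 retained, so the majority rule tree and posterior probabilities would rest on 15,002 trees across the two runs.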
Results and discussion
Characteristics of the chloroplast genome of Gracilaria firma
Genetic information is densely packed in the chloroplast genome of G. firma, with the combined coding regions spanning 83.5% of the genomic sequence. The intergenic spacers were 122 bp long on average, and overlapping ORFs were observed between seven gene pairs: psbC-psbD (7 bp), rpl24-rpl14 (1 bp), rpl14-rps17 (4 bp), rps17-rpl29 (4 bp), rpl23-rpl4 (28 bp), rps18-rpl33 (4 bp) and atpF-atpD (4 bp). Such overlapping is not unprecedented in red algal chloroplast genomes. For instance, the overlapping of psbC-psbD has been reported in Gracilariopsis lemaneiformis, G. tenuistipitata var. liui and Laurencia snackeyi [5, 38, 39]. Cyanidioschyzon merolae even showed up to 40% of gene overlapping in the chloroplast genome, including the rps14-rps17 gene pair.
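Statistics of this kind (coding fraction, spacer lengths, overlap sizes) follow directly from gene coordinates. The sketch below shows one way to derive them; it is not the authors' script. It assumes 1-based inclusive coordinates, ignores strand, and does not handle features that wrap around the origin of the circular map. The gene names and coordinates in the example are hypothetical.

```python
def overlaps_and_spacers(genes):
    """Compare neighbouring genes sorted by start position.

    genes: list of (name, start, end), 1-based inclusive coordinates.
    Returns (overlaps, spacers): overlaps as (left, right, bp) tuples,
    spacers as a list of intergenic lengths in bp.
    """
    ordered = sorted(genes, key=lambda g: g[1])
    overlaps, spacers = [], []
    for (n1, _s1, e1), (n2, s2, _e2) in zip(ordered, ordered[1:]):
        gap = s2 - e1 - 1                # negative when the two genes overlap
        if gap < 0:
            overlaps.append((n1, n2, -gap))
        else:
            spacers.append(gap)
    return overlaps, spacers


def coding_fraction(genes, genome_length):
    """Fraction of positions covered by at least one gene (overlaps counted once)."""
    covered = set()
    for _name, start, end in genes:
        covered.update(range(start, end + 1))
    return len(covered) / genome_length


# Hypothetical toy annotation on a 400 bp genome.
toy_genes = [("geneA", 1, 100), ("geneB", 94, 200), ("geneC", 251, 300)]
print(overlaps_and_spacers(toy_genes))   # ([('geneA', 'geneB', 7)], [50])
print(coding_fraction(toy_genes, 400))   # 0.625
```

As a rough check on scale, a coding fraction of 83.5% on a 187,001 bp genome corresponds to about 156,000 bp of coding sequence.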
Functional classification of Gracilaria firma chloroplast genes
dnaB, rne, rnz, mat
rpoA, rpoB, rpoC1, rpoC2, rpoZ
ompR, rbcR, tctD, ntcA
infB, infC, tsf, tufA
rpl1, rpl2, rpl3, rpl4, rpl5, rpl6, rpl9, rpl11, rpl12, rpl13, rpl14, rpl16, rpl18, rpl19, rpl20, rpl21, rpl22, rpl23, rpl24, rpl27, rpl28, rpl29, rpl31, rpl32, rpl33, rpl34, rpl35, rpl36, rps1, rps2, rps3, rps4, rps5, rps6, rps7, rps8, rps9, rps10, rps11, rps12, rps13, rps14, rps16, rps17, rps18, rps19, rps20
Protein quality control
clpC, dnaK, ftsH, groEL
apcA, apcB, apcD, apcE, apcF, cpcA, cpcB, cpcG, cpeA, cpeB, nblA, ycf58
psaA, psaB, psaC, psaD, psaE, psaF, psaI, psaJ, psaK, psaL, psaM, ycf3, ycf4
psbA, psbB, psbC, psbD, psbE, psbF, psbH, psbI, psbJ, psbK, psbL, psbN, psbT, psbV, psbW, psbX, psbY, psbZ, ycf12
petA, petB, petD, petF, petG, petJ, petL, petM, petN, ccsA, ccs1
acsF, bas1, ftrB, dsbD, trxA
atpA, atpB, atpD, atpE, atpF, atpG, atpH, atpI
odpA, odpB, rbcL, rbcS, cbbX, pgmA
accA, accB, accD, acpP, fabH
argB, gltB, syh, ilvB, ilvH, trpA, trpG, syfB
chlI, moeB, preA, thiG
cemA, secA, secG, secY, tatC, sufB, sufC, ycf38, ycf63
dfr, ycf17, ycf19, ycf20, ycf21, ycf22, ycf23, ycf33, ycf34, ycf35, ycf36, ycf37, ycf39, ycf40, ycf45, ycf46, ycf52, ycf53, ycf54, ycf55, ycf60, ycf65, ycf80
Gfir_ORF1, Gfir_ORF2, Gfir_ORF3, …, Gfir_ORF23
rrf, rrs, rrl
trnA, trnC, trnD, trnE, trnF, trnG, trnG, trnH, trnI, trnK, trnL, trnL, trnL, trnM, trnM, trnM, trnN, trnP, trnQ, trnR, trnR, trnR, trnS, trnS, trnT, trnT, trnV, trnV, trnW, trnY
Red algal plasmid-derived regions in the chloroplast genome of Gracilaria firma
Gfir_ORF1 and Gfir_ORF2 are two overlapping ORFs which appear to be truncated versions of a single ORF (ORF5/ORF2) found in the Gch7220 or Gch3937 plasmids of G. chilensis, corresponding to amino acids 1 to 36 and 44 to 101, respectively (Additional file 1). Gfir_ORF20, which shares 76 and 74% amino acid identity with ORF5 from Gch7220 and ORF2 from Gch3937, is slightly longer at 199 amino acids than the comparable ORF from the G. chilensis plasmids (Additional file 1). The high sequence identity may indicate recent horizontal transfer of this ORF from a G. chilensis-specific plasmid to the G. firma chloroplast. Gfir_ORF18 and Gfir_ORF19 display considerable amino acid similarity with ORF4 (61% amino acid identity corresponding to amino acids 17 to 91) and ORF7 (64% amino acid identity corresponding to amino acids 23 to 61) of the Gro4970 plasmid, respectively. The remaining ORFs are homologous to all five ORFs in the Gle4293 plasmid of Gp. lemaneiformis. Gfir_ORF17, Gfir_ORF16, Gfir_ORF15, Gfir_ORF14 and Gfir_ORF13 showed similarity to ORF1, ORF2, ORF3, ORF4 and ORF5 of the Gle4293 plasmid (Fig. 4), corresponding to amino acids 57 to 417 (53% identity), 16 to 132 (49% identity), 17 to 262 (53% identity), 1 to 168 (59% identity) and 17 to 149 (58% identity), respectively. Apart from Gfir_ORF17, Gfir_ORF12, Gfir_ORF21, Gfir_ORF22 and Gfir_ORF23 are also truncated versions of ORF1 from the Gle4293 plasmid, corresponding to amino acids 1 to 161 (42% identity), 375 to 418 (59% identity), 275 to 359 (64% identity) and 226 to 274 (61% identity), respectively. The integration of a whole plasmid into the chloroplast of G. firma may have occurred, considering the contiguous presence of degenerated fragments homologous to all five ORFs from the Gle4293 plasmid in the intergenic region between the nblA and cpeB genes of the G. firma chloroplast genome.
Whole-genome alignment of the chloroplast genomes of all five members of Gracilariales revealed a common syntenic break at the position between 30,000 and 40,000, which corresponds to the intergenic regions within the rrs-ompR-psbD cluster (Fig. 3). This region represents the “hotspot” for plasmid integration in the chloroplast genome of Gracilariales, as ORFs homologous to red algal plasmids were found in this region in previous studies [5, 6] (Fig. 4). The intergenic region between ompR and psbD in the chloroplast genome of G. salicornia has an ORF that shares strong similarity with ORF1 of the Gch7220 plasmid [5, 6]. The corresponding spacer region in the chloroplast genome of G. chilensis has a gene pair leuC-leuD of bacterial origin and three plasmid-derived regions which are homologous to ORF1, ORF2 and ORF3 from the Gle4293 plasmid, respectively. In the chloroplast genome of G. tenuistipitata var. liui, the intergenic region between rrs and ompR was reported to have a plasmid-derived region homologous to ORF1 of the Gle4293 plasmid, and the region between ompR and psbD has a gene pair leuC-leuD and an ORF similar to ORF14 of the Gch7220 plasmid. Three ORFs homologous to ORF5 and ORF1 of the Gle4293 plasmid were found in the spacer region between rrs and ompR in the chloroplast genome of Gp. lemaneiformis. A notable feature unique to the chloroplast genome of G. firma is the presence of a relatively large syntenic break with 12 red algal plasmid homologs spanning about 7 kb at position ~153,000 between nblA and cpeB (asterisk in Fig. 3). This region was not observed in any other examined taxa, and is likely to have contributed to the expansion of chloroplast genome size in G. firma.
Despite the caveat that the sequenced library was not searched to confirm the existence of extrachromosomal plasmids in G. firma, we rule out the possibility of a mapping artefact in the assembly of the expanded chloroplast genome on two grounds: (1) single contigs with identical sequence were obtained from two independent assemblies using different de novo and reference-guided assemblers, and (2) the presence of red algal plasmid remnants in the chloroplast genome has been reported in red algal taxa with or without naturally occurring extrachromosomal plasmids [4, 5, 38, 41, 42]. Previous studies have verified the occurrence of plasmid-derived regions in the chloroplast genome of Gp. lemaneiformis, and confirmed the consistency of the plasmid-derived sequences within individuals and populations of Gelidium elegans, Porphyra pulchra and Sporolithon durum, using customized primer pairs for PCR. The copy number and position of the homologous plasmid-derived ORFs in chloroplast genomes are inconsistent with the red algal phylogenetic relationships. Incorporation of foreign genetic material into the maternally inherited organelles may have led to its fixation in a population. The plasmid may have recombined randomly during integration into the chloroplast genome, leading to gene truncation and subsequent gene loss, such that vestiges of red algal plasmids are found in some but not all red algal species. Such species-specific integration of plasmid-derived ORFs is consistent with the evolution of mobile genetic elements. The plasmids are thought to be analogous to transposable elements, with mobility that can contribute to the gain or loss of genes among closely related genomes. However, the actual role of red algal plasmids remains elusive.
While recognizing the possible role of red algal plasmids in mediating gene transfer between foreign DNA and organelles, one study considered the plasmids to be parasitic elements that spread plasmid-derived DNA regions in different organelles. It was suggested that the ubiquitous red algal plasmid remnants may be associated with their function in ancient horizontal gene transfer among the nuclear, chloroplast and mitochondrial genomes.
Evolution of chloroplast genome in Eurhodophytina
The gene content and gene order of 21 chloroplast genomes representing the classes Bangiophyceae and Florideophyceae were compared to infer the evolution of the chloroplast genome in red algae. The highly divergent unicellular thermophiles of the classes Cyanidiophyceae and Porphyridiophyceae were not included in the genome alignment. Only one representative from each genus of Bangiophyceae was included in the analyses to emphasize the resolution of the relationships within Florideophyceae using chloroplast genome data. The parasitic red alga Choreocolax was also excluded from the genome alignment as it has experienced many rearrangements and losses despite sharing regions of synteny with other Florideophycean taxa.
Gene cluster content of Eurhodophytina
Conserved genes present in LCB
orf243 [ycf27 or ompR], psbD, psbC, orf198 [upp], ycf19, rps16, ycf65, groEL, trnR(ACG), syh, trnQ(UUG), trnR(CCU)
psbW, syfB, rps1, orf71 [ycf40 or thiS], trnH(GUG), ycf29 [tctD], ycf28 [ntcA], petB, petD, rpl12, rpl1, rpl11, trnW(CCA), orf277 [moeB], trnS(GGA), rpl9
dnaB, trnF(GAA), clpC
rpl19, trnV(GAC), trnY(GUA), trnT(GGU), ilvB, ycf33, argB, trnM(CAU), trnA(GGC), trnS(GCU), trnD(GUC), ftsH, trnS(CGA), psaE, psaH, psaN, psaT, psbB, ycf38, petF
rps14, petG, psbK, psbZ, trnG(GCC), dnaK, rpl3, rpl4, rpl23, rpl2, rps19, rpl22, rps3, rpl16, rpl29, rps17, rpl14, rpl24, rpl5, rps8, rpl6, rpl18, rps5, secY, rpl36, rps13, rps11, rpoA, rpl13, rps9, rpl31, rps12, rps7, tufA, rps10
psaB, psaA, accB, orf565 [ycf45], acpP, trnS(UGA), psaD, chlB, ycf59 [acsF], petN, orf71 [secG or ycf47], ycf36, trnM(CAU), orf199 [bas1], pbsA, rpl35, rpl20, preA, odpA [pdhA], odpB [pdhB], petA, tatC, apcE, apcA, apcB, atpE, atpB, ycf3, infB, rps18, rpl33, rps20, rpoB, rpoC1, rpoC2, rps2, tsf, atpI, atpH, atpG, atpF, atpD, atpA, ycf16 [sufC], ycf24 [sufB], trnL(CAA), ycf39, psbI, orf149 [ycf58], cemA, trnM(CAU)
infC, ilvH, trnL(UAA), trnC(GCA), cbbX, rbcS, rbcL, trxA, rpl28, trnT(UGU), psaL, ycf7 [petL], ycf4, trnG(UCC)
rps4, orf450 [ycf80], trnR(CCG), apcF, ycf20
trnL(UAG), ycf35, psbA, rne, rpl27, rpl21, orf263 [ycf56], rpl32, ycf32 [psbY], rbcR, orf108 [ycf54], orf320 [ycf55], orf238 [ycf53], carA, petJ, psbV, accD, psbX, trnL(GAG), fabH, apcD, psaJ, psaF, orf174 [ycf52], trnP(UGG), ycf37, rpl34, ycf46, chlN, chlL
psaM, chlI, trnR(UCU), trnV(UAC), orf263a [ycf63], ycf26 [dfr], psbE, psbF, psbL, psbJ, psaI, ftrB, ycf12 [psb30], orf327 [ycf62 or tilS], trnK(UUU), trpA, trnE(UUC), secA, ycf21, pgmA, cpcA, cpcB, ycf61 [rpoZ], gltB, psaC, ycf23, ycf22, accA, psaK, ccsA, cpeA, cpeB, ycf18 [nblA], cpcG
trnN(GUU), orf240 [dsbD or ccdA], ccs1, trpG, thiG, orf203 [ycf60], rps6
Fifteen LCBs were identified in the chloroplast genomes, with blocks A–K comprising the protein-coding and tRNA genes, and blocks L–P representing the rRNA operon (Fig. 5). Blocks E, F, I and J encompassed a large portion of the conserved gene repertoire of red algal chloroplast genomes. Some blocks, including blocks L, M, N and P, comprised only one or two genes as a result of single gene translocation (Table 3). Most of the genomic rearrangements involved simultaneous inversion or translocation of gene clusters in the collinear blocks B, C, E, G, I, J and K. The major syntenic differences observed between Bangiophyceae and Corallinophycidae were the inversion of collinear blocks E, G, J and K, and the translocation of block I from its original position immediately downstream of block H, as exhibited by Bangiophyceae, to a novel position between block K and the rRNA operon. The gene order has been highly conserved since the split from Bangiophyceae, and a high degree of synteny was observed in Florideophyceae. Representatives of the Corallinophycidae, Ceramiales and Halymeniales exhibited similar gene order across the examined Florideophycean lineages, with some gene losses in the Rhodymeniophycidae as mentioned above. An inversion of the collinear block (I + K) occurred in the Gracilariales, Gigartinales, Gelidiales and Rhodymeniales with respect to its orientation in the Corallinophycidae, Ceramiales and Halymeniales. In addition, C. compressa of the Rhodymeniales featured a unique translocation of the three rRNA genes (blocks L, M and P) from the original position downstream of block K, as in the Gracilariales, Gigartinales and Gelidiales, to a position flanked by block J upstream and block I downstream. The Gracilariales exclusively exhibited an inversion of blocks B and C that was not seen in other orders.
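Comparisons of this kind can be expressed by writing each genome as a signed order of blocks, where the sign records orientation. The sketch below lists blocks whose orientation differs from a reference and counts breakpoints (reference adjacencies that are lost in the query). It is an illustration only, not the procedure used by Mauve or by the authors, and the block orders in the example are hypothetical rather than those shown in Fig. 5.

```python
def parse(order):
    """Turn a string such as "+A -B" into a list of (label, sign) pairs."""
    return [(tok[1:], 1 if tok[0] == "+" else -1) for tok in order.split()]


def adjacencies(blocks, circular=True):
    """Return the set of block adjacencies in a strand-independent form."""
    pairs = list(zip(blocks, blocks[1:]))
    if circular and len(blocks) > 1:
        pairs.append((blocks[-1], blocks[0]))    # close the circle
    adj = set()
    for (a, sa), (b, sb) in pairs:
        forward = ((a, sa), (b, sb))
        backward = ((b, -sb), (a, -sa))          # same junction read from the other strand
        adj.add(min(forward, backward))
    return adj


def inverted_blocks(reference, query):
    """Labels of blocks whose orientation differs between reference and query."""
    ref_sign = dict(reference)
    return sorted(label for label, sign in query if ref_sign[label] != sign)


def breakpoints(reference, query, circular=True):
    """Number of reference adjacencies that are absent from the query."""
    return len(adjacencies(reference, circular) - adjacencies(query, circular))


# Hypothetical toy orders: blocks B and C are inverted together in the query.
ref = parse("+A +B +C +D")
qry = parse("+A -C -B +D")
print(inverted_blocks(ref, qry))   # ['B', 'C']
print(breakpoints(ref, qry))       # 2
```

Reading a whole circular genome from the opposite strand flips every sign but gives zero breakpoints, which is why adjacencies are a more robust summary than signs alone when genomes have been deposited in different orientations.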
The Florideophycean taxa possessed only a single copy of the rRNA operon, typically comprising the rrf, rrl, trnA, trnI and rrs genes, except for the rRNA operon of Coeloseira compressa, which comprised only three rRNA genes. The representatives of Bangiophyceae showed more variation in the number of rRNA genes. Porphyra umbilicalis and Pyropia haitanensis possessed two copies of the rRNA operon, each made up of the rrf-rrl-trnA-trnI-rrs gene cluster as direct repeats, with one copy of the operon positioned between blocks I and J, and another downstream of block K. Hughey et al. reported that Pyropia perforata lacked the second copy of the rRNA operon found in other species of Porphyra and Pyropia. Instead of having two copies of the rRNA operon like most species of Porphyra and Pyropia, both Bangia atropurpurea and Wildemania schizophylla have an additional rrf gene in addition to their single copy of the rRNA operon.
Inclusion of more red algal taxa of the Bangiophyceae and Florideophyceae in the comparison of genomic synteny has resulted in the observation of more rearrangement patterns across the subphylum Eurhodophytina, with an increase in the number of conserved gene clusters from 11 in previous studies [4, 38] to 15 in the present study. The chloroplast genomes of the Bangiales, for which four genera were examined in the present study, exhibited identical genome structure when the rRNA operons were not taken into consideration. The red algal chloroplast genomes are highly conserved in gene content and order, considering the relatively small number of genes lost and the limited extent of genomic rearrangement across Eurhodophytina over a substantial evolutionary distance since the divergence of Bangiophyceae and Florideophyceae, which occurred at least 940 million years ago [5, 47]. However, the gene order pattern was not reflective of the ordinal relationships inferred from the phylogenomic analyses. Parallel evolution in gene order is observed in Florideophyceae, such that identical gene order patterns arose independently several times in different lineages.
Phylogenetic relationships inferred using the chloroplast genome
The relationships among most of the orders in the subclass Rhodymeniophycidae have been identified as one of the evolutionary lineages in the red algal phylogeny by Verbruggen et al. These relationships require further analyses that include more markers and wider taxon sampling to increase the number of informative characters for better phylogenetic resolution. The poorly supported interordinal relationships within the subclass are largely attributed to the rapid radiation of lineages (i.e. the Gigartinales sensu lato) at the base of the subclass. We refrain from drawing conclusions on the phylogenetic relationships of the Rhodymeniophycidae, on the understanding that the mere addition of one taxon from the previously studied Gracilariales to an underrepresented taxon sampling will not result in significant changes to the interordinal relationships within the subclass. The relative phylogenetic affinities of the taxa (or the representative orders) examined here, based on analyses of chloroplast genome data, were compared with those obtained in previous work based on different sets of markers over wide sampling [47–50]. Interordinal relationships within the subclass Rhodymeniophycidae recovered in this study are congruent with the results inferred using the mitochondrial genome and also multiple markers of different genomic origins in some studies (rbcL, psaA, psbA, EF2, SSU, LSU and cox1; SSU, LSU, EF2, rbcL and CO1-5P). These results, however, contrast with those inferred using DNA data of 14 markers (EF2, 23S rDNA, 28S rDNA, 16S rDNA, 18S rDNA, cox1, psaA, psaB, psbA, psbC, psbD, rbcL, rbcS and tufA) belonging to all three genomic compartments, mainly mined from GenBank, of which ten markers were poorly represented. The first divergence of Chondrus crispus in Gigartinales from other orders received strong nodal support in this study and moderate to strong support in previous studies [47–49].
The placement of the family accommodating this taxon in the most recently derived order in Florideophyceae was poorly supported in a previous study. In contrast to the poorly supported grouping of Gracilariales with Ceramiales and Gelidiales reported previously, this study recovered a moderately supported relationship among the Gracilariales, Halymeniales and Rhodymeniales (BS = 74, PP = 1.00). Consistent relationships among Gracilariales, Halymeniales and Rhodymeniales with moderate nodal support (BS = 67, PP = 1.00) were also recovered in the phylogeny inferred using the mitochondrial genome and wider taxon sampling, but the relationships among these three orders were poorly supported in phylogenies constructed using multigene data over a broad taxonomic breadth of Rhodymeniophycidae [47, 49]. While large multi-locus datasets for a broad taxonomic breadth are considered to be the preferred solution for resolving the relationships in the Rhodymeniophycidae, genome-scale data based on limited taxon sampling, as in this study, can still recover similar phylogenetic inferences for the taxa shared among studies. This suggests that the use of chloroplast genome data could help to improve the support for the interordinal relationships previously identified using multiple markers on Florideophyceae-wide sampling.
Classification in the family Gracilariaceae has been in a state of flux over the years, with more than a hundred species passing under the generic names of Gracilaria and Hydropuntia based on morphological and anatomical features [14–16]. It was not until a rather comprehensive molecular study based on the rbcL gene delineated Gracilaria sensu lato into three groups, corresponding to a new lineage and the previously defined genera Gracilaria and Hydropuntia, that the systematics within Gracilariaceae was considered stabilized. Hydropuntia was maintained as a genus distinct from Gracilaria based on its well-circumscribed reproductive features, despite the lineage being supported by high Bayesian posterior probabilities but low ML bootstrap percentages in the molecular phylogeny. Gracilaria sensu stricto features cystocarps composed of gonimoblasts and carposporangia arranged in longer chains that branch dichotomously or irregularly, with tubular filaments connecting the gonimoblasts to the pericarp. Hydropuntia differs in possessing cystocarps composed of gonimoblast cells in short chains terminating in apical carposporangia that are arranged radially, with tubular filaments connecting the gonimoblasts to the base of the cystocarp. However, a recent study on the phylogeny of Gracilariaceae using the rbcL, cox1 and UPA markers indicated the non-monophyly of Hydropuntia and suggested its reduction to Gracilaria. The taxa of Gracilariaceae examined in the present study are representative of the three previously delineated groups, with G. salicornia as Gracilaria sensu stricto, G. firma as Hydropuntia, and G. tenuistipitata and G. chilensis as the new lineage. Despite the limited taxon sampling, the fully resolved phylogeny of Gracilariaceae inferred using the concatenated dataset of 146 chloroplast protein-coding genes in this study implies the potential of the chloroplast genome in resolving the Gracilaria sensu stricto conundrum.
Inclusion of more samples of Gracilaria sensu lato for analyses may eventually lead to the recognition of Hydropuntia as a genus distinct from Gracilaria sensu stricto.
The interpretation and accuracy of branch support measured as ML bootstrap replicate proportions and Bayesian posterior probabilities are still disputed, especially for deep relationships involving rapid radiation of lineages, although likelihood-based phylogenetic reconstruction methods in the non-parametric maximum-likelihood and Bayesian frameworks have established themselves as the methods of choice. The very short internal branches separating the major lineages of Gracilariales, Halymeniales, Rhodymeniales, Gelidiales and Ceramiales, together with the long terminal branches (Fig. 6), suggest that these lineages of the subclass Rhodymeniophycidae underwent a rapid radiation. The branching order is difficult to resolve when many lineages accumulate rapidly in a short period of time and when sequences become saturated; such phylogenies typically exhibit very short internal branches with low ML bootstrap support but high Bayesian posterior probabilities. It has been noted that ML bootstrapping is computationally intensive and often underestimates branch support, whereas Bayesian estimators inflate the confidence of the corresponding branches. The values estimated by SH-aLRT, an alternative non-parametric approach to conventional bootstrapping, corroborated most of the relationships supported by high Bayesian posterior probabilities in this study and are likely to provide another accurate measure of support. The branch supports derived from SH-aLRT are considered to be more consistent and conservative than conventional bootstrapping.
In addition, Bayesian posterior probabilities may be more indicative of the evolutionary relationships than ML bootstrap replicate proportions, which have typically been considered conservative when the full alignment is analyzed: the deep relationships received increased ML bootstrap support, in parallel with the high Bayesian posterior probabilities, when analyses were conducted on progressively more conservative alignments generated by site-stripping.
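The two resampling ideas mentioned above, non-parametric bootstrapping of alignment columns and site-stripping to obtain progressively more conservative alignments, can be illustrated with a minimal Python sketch. It is not the procedure used in this study: the variability score (number of distinct residues per column), the function names and the toy alignment are all assumptions made for illustration, and a real analysis would pass each resulting alignment to a tree-inference program.

```python
import random


def column_variability(alignment, gap_chars="-?X"):
    """Count distinct non-gap residues in each column.

    Assumption: this count is only a crude stand-in for a per-site
    evolutionary rate; it is not the criterion used in the study.
    """
    n_cols = len(alignment[0])
    scores = []
    for i in range(n_cols):
        residues = {seq[i] for seq in alignment if seq[i] not in gap_chars}
        scores.append(len(residues))
    return scores


def strip_sites(alignment, max_states):
    """Site-stripping: keep only columns with at most `max_states` residues.

    Lowering `max_states` yields a progressively more conservative alignment.
    """
    scores = column_variability(alignment)
    keep = [i for i, score in enumerate(scores) if score <= max_states]
    return ["".join(seq[i] for i in keep) for seq in alignment]


def bootstrap_replicate(alignment, rng):
    """Build one bootstrap pseudo-alignment by resampling columns with replacement."""
    n_cols = len(alignment[0])
    cols = [rng.randrange(n_cols) for _ in range(n_cols)]
    return ["".join(seq[i] for i in cols) for seq in alignment]


if __name__ == "__main__":
    # Hypothetical toy amino acid alignment (four taxa, six sites).
    toy = ["MKLAVT", "MKIAVS", "MRLGVT", "MKVAVA"]
    rng = random.Random(1)  # fixed seed so the replicate is reproducible
    print("variability:", column_variability(toy))
    for threshold in (3, 2, 1):
        print("max_states", threshold, strip_sites(toy, threshold))
    print("bootstrap replicate:", bootstrap_replicate(toy, rng))
```

Each stripped alignment would be analyzed separately, and the support values for a given deep node compared across thresholds; the bootstrap proportion of a branch is the fraction of replicate trees in which that branch is recovered.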
This study presents the chloroplast genome of Gracilaria firma and identifies unique red algal plasmid DNA remnants in the genome. Despite some lineage- and species-specific gene losses in the Florideophyceae, the chloroplast genomes across the subphylum Eurhodophytina are highly conserved in synteny and genome architecture. Chloroplast genomes hold substantial information that can be tapped to improve the resolution of phylogenetic relationships at all taxonomic levels. However, further study with improved taxon sampling, additional sequence data and further exploration of analysis options such as data partitioning and evolutionary model selection is warranted to resolve the relationships within the Rhodymeniophycidae [47–50].
We are grateful to Miss Yealing-Tay from BioEasy Sdn. Bhd. for her technical advice on the analysis of the sequenced reads. We also thank two anonymous reviewers for their constructive and valuable comments that helped to improve the previous version of this manuscript.
Funding was provided by the Taiwan Ministry of Science and Technology (102-2628-B-019-002-MY3 and 104-2811-B-019-004 to SML) and the Ministry of Higher Education, Malaysia, MoHE-HIR Grant (H-50001-00-A000025 to PEL).
Availability of data and materials
The alignments used for the phylogenomic analyses are available from the corresponding authors upon request.
PKN and LCL obtained the samples and performed the experiment. PKN, CMC and TWP analyzed the data. PKN, SML and PEL conceived and wrote the manuscript. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Saunders GW, Hommersand MH. Assessing red algal supraordinal diversity and taxonomy in the context of contemporary systematic data. Am J Bot. 2004;91:1494–507.
- Yoon HS, Müller KM, Sheath RG, Ott FD, Bhattacharya D. Defining the major lineages of red algae (Rhodophyta). J Phycol. 2006;42:482–92.
- Guiry MD, Guiry GM. Algaebase. World-wide electronic publication, National University of Ireland, Galway. 2016. http://www.algaebase.org. Accessed 15 Jun 2016.
- Janouškovec J, Liu SL, Martone PT, Carré W, Leblanc C, Collén J, et al. Evolution of red algal plastid genomes: ancient architectures, introns, horizontal gene transfer, and taxonomic utility of plastid markers. PLoS One. 2013;8:e59001.
- Du QW, Bi GQ, Mao YX, Sui ZH. The complete chloroplast genome of Gracilariopsis lemaneiformis (Rhodophyta) gives new insight into the evolution of family Gracilariaceae. J Phycol. 2016;52(3):441–50.
- Lee J, Kim KM, Yang EC, Miller KA, Boo SM, Bhattacharya D, et al. Reconstructing the complex evolutionary history of mobile plasmids in red algal genomes. Sci Rep. 2016;6:23744.
- Chang CF, Xia BM. Taxonomic studies on Gracilaria from China. Stud Mar Sinica. 1976;11:91–163.
- Terada R, Baba M, Yamamoto H. New record of Gracilaria firma Chang et Xia (Rhodophyta) from Okinawa, Japan. Phycol Res. 2000;48(4):291–4.
- Lin SM. Marine benthic macroalgal flora of Taiwan: Part I Order Gracilariales (Rhodophyta). 1st ed. Keelung: National Taiwan Ocean University; 2009.
- Ajisaka T, Chiang YM. Recent status of Gracilaria cultivation in Taiwan. Hydrobiologia. 1993;260/261:335–8.
- Titlyanov EA, Titlyanova TV, Oham VH. Stocks and the use of economic marine macrophytes of Vietnam. Russ J Mar Biol. 2012;38:285–98.
- Trono GC. Diversity of the seaweed flora of the Philippines and its utilization. Hydrobiologia. 1999;398/399:1–6.
- Arano KG, Trono GC, Montano NE, Hurtado AQ, Villanueva RD. Growth, agar yield and quality of selected agarophyte species from the Philippines. Bot Mar. 2000;43(6):517–24.
- Gurgel CFD, Fredericq S. Systematics of the Gracilariaceae (Gracilariales, Rhodophyta): a critical assessment based on rbcL sequence analyses. J Phycol. 2004;40:138–59.
- Bird CJ, Rice EL, Murphy CA, Ragan MA. Phylogenetic relationships in the Gracilariales (Rhodophyta) as determined by 18S rDNA sequences. Phycologia. 1992;31(6):510–22.
- Lyra GM, Costa ES, Jesus PB, Matos JC, Caires TA, Oliveira MC, et al. Phylogeny of Gracilariaceae (Rhodophyta): evidence from plastid and mitochondrial nucleotide sequences. J Phycol. 2015;51(2):356–66.
- Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30:2114–20.
- Simpson JT, Wong K, Jackman SD, Schein JE, Jones SJ, Birol I. ABySS: a parallel assembler for short read sequence data. Genome Res. 2009;19:1117–23.
- Langmead B, Salzberg SL. Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012;9:357–9.
- Paulino D, Warren RL, Vandervalk BP, Raymond A, Jackman SD, Birol I. Sealer: a scalable gap-closing application for finishing draft genomes. BMC Bioinformatics. 2015;16:230.
- Thorvaldsdottir H, Robinson JT, Mesirov JP. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform. 2013;14:178–92.
- Peng Y, Leung HC, Yiu SM, Chin FY. IDBA-UD: a de novo assembler for single-cell and metagenomics sequencing data with highly uneven depth. Bioinformatics. 2012;28:1420–8.
- Hunter SS, Lyon RT, Sarver BAJ, Hardwick K, Forney LJ, Settles ML. Assembly by reduced complexity (ARC): a hybrid approach for targeted assembly of homologous sequences. bioRxiv. 2015. http://dx.doi.org/10.1101/014662.
- Wyman SK, Jansen RK, Boore JL. Automatic annotation of organellar genomes with DOGMA. Bioinformatics. 2004;20:3252–325.
- Lagesen K, Hallin P, Rødland EA, Stærfeldt H, Rognes T, Ussery DW. RNAmmer: consistent and rapid annotation of ribosomal RNA genes. Nucleic Acids Res. 2007;35:3100–8.
- Lowe TM, Eddy SR. tRNAscan-SE: A program for improved detection of transfer RNA genes in genomic sequence. Nucleic Acids Res. 1997;25:955–64.
- Laslett D, Canback B. ARAGORN, a program for the detection of transfer RNA and transfer-messenger RNA genes in nucleotide sequences. Nucleic Acids Res. 2004;32:11–6.
- Lang BF, Laforest M, Burger G. Mitochondrial introns: A critical view. Trends Genet. 2007;23:119–25.
- Lohse M, Drechsel O, Bock R. OrganellarGenomeDRAW (OGDRAW) – a tool for the easy generation of high-quality custom graphical maps of plastid and mitochondrial genomes. Curr Genet. 2007;52:267–74.
- Darling ACE, Mau B, Blattner FR, Perna NT. Mauve: Multiple alignment of conserved genomic sequence with rearrangements. Genome Res. 2004;14(7):1394–403.
- Cox CJ, Li B, Foster PG, Embley TM, Civáň P. Conflicting phylogenies for early land plants are caused by composition biases among synonymous substitutions. Syst Biol. 2014;63(2):272–9.
- Philippe H, Brunkmann H, Lavrov DV, Littlewood DTJ, Manuel M, Wörheide G, et al. Resolving difficult phylogenetic questions: Why more sequences are not enough. PLoS Biol. 2009;9(3):e1000602.
- Katoh K, Standley DM. MAFFT Multiple sequence alignment software version 7: Improvements in performance and usability. Mol Biol Evol. 2013;30(4):772–80.
- Dereeper A, Guignon V, Blanc G, Audic S, Buffet S, Chevenet F, et al. Phylogeny fr: robust phylogenetic analysis for the non-specialist. Nucleic Acids Res. 2008;31:W465–9.
- Hall TA. BioEdit: a user-friendly biological sequence alignment editor and analysis program for Windows 95/98/NT. Nucleic Acids Symp Ser. 1999;41:95–8.
- Guindon S, Dufayard JF, Lefort V, Anisimova M, Hordijk W, Gascuel O. New algorithms and methods to estimate maximum-likelihood phylogenies: Assessing the performance of PhyML 3.0. Syst Biol. 2010;59(3):307–21.
- Ronquist F, Teslenko M, van der Mark P, Ayres DL, Darling A, Höhna S, et al. MrBayes 3.2: Efficient Bayesian phylogenetic inference and model choice across a large model space. Syst Biol. 2012;61:1–4.
- Hagopian JC, Reis M, Kitajima JP, Bhattacharya D, Oliveira MC. Comparative analysis of the complete plastid genome sequence of the red alga Gracilaria tenuistipitata var. liui provides insights into the evolution of Rhodoplasts and their relationship to other plastids. J Mol Evol. 2004;59:464–77.
- Verbruggen H, Costa JF. The plastid genome of the red alga Laurencia. J Phycol. 2015;51:586–9.
- Ohta N, Matsuzaki M, Misumi O, Miyagishima SY, Nozaki H, Tanaka K, et al. Complete sequence and analysis of the plastid genome of the unicellular red alga Cyanidioschyzon merolae. DNA Res. 2003;10:67–77.
- Campbell MA, Presting G, Bennett MS, Sherwood AR. Highly conserved organellar genomes in the Gracilariales as inferred using new data from the Hawaiian invasive alga Gracilaria salicornia (Rhodophyta). Phycologia. 2014;53:109–16.
- DePriest MS, Bhattacharya D, López-Bautista JM. The plastid genome of the red macroalga Grateloupia taiwanensis (Halymeniaceae). PLoS One. 2013;8:e68246.
- Salomaki ED, Nickles KR, Lane CE. The ghost plastid of Choreocolax polysiphoniae. J Phycol. 2015;51:217–21.
- Wang L, Mao YX, Kong FN, Li GY, Ma F, Zhang BL, et al. Complete sequence and analysis of plastid genome of two economically important red alga–Pyropia haitanensis and Pyropia yezoensis. PLoS One. 2013;8:e65902.
- Lang BF, Nedelcu A. Plastid Genomes of Algae. In: Bock R, Knoop V, editors. Genomics of chloroplasts and mitochondria. Netherlands: Springer; 2012. p. 59–87.
- Hughey JR, Gabrielson PW, Rohmer L, Tortolani J, Silva M, Miller KA, et al. Minimally destructive sampling of type specimens of Pyropia (Bangiales, Rhodophyta) recovers complete plastid and mitochondrial genomes. Sci Rep. 2014;4:5113.
- Yang EC, Boo SM, Bhattacharya D, Saunders GW, Knoll AH, Fredericq S, et al. Divergence time estimates and the evolution of major lineages in the florideophyte red algae. Sci Rep. 2016;6:21361.
- Yang EC, Kim KM, Kim SY, Lee J, Boo GH, Lee J, et al. Highly conserved mitochondrial genomes among multicellular red algae of the Florideophyceae. Genome Biol Evol. 2015;7(8):2394–406.
- Saunders GW, Filloramo G, Dixon K, Le Gall L, Maggs CA, Kraft GT. Multigene analyses resolve early diverging lineages in the Rhodymeniophycidae (Florideophyceae, Rhodophyta). J Phycol. 2016. doi:10.1111/jpy.12426.
- Verbruggen H, Maggs CA, Saunders GW, Le Gall L, Yoon HS, De Clerck O. Data mining approach identifies research priorities and data requirements for resolving the red algal tree of life. BMC Evol Biol. 2010;10:16.
- Le NH, Lin SM. The discontinuous geographic distribution of Gracilaria firma (Gracilariaceae, Rhodophyta) from the Gulf of Tonkin to the Gulf of Thailand along the coastlines of Vietnam. Vietnamese J Biotechnol. 2005;3(3):373–80.
- Verbruggen H, Theriot EC. Building trees of algae: some advances in phylogenetic and evolutionary analysis. Eur J Phycol. 2008;43(3):229–52.
- Wrobel B. Statistical measures of uncertainty for branches in phylogenetic trees inferred from molecular sequences by using model-based methods. J Appl Genet. 2008;49:49–67.
- Simmons MP, Norton AP. Divergent maximum-likelihood-branch-support values for polytomies. Mol Phylogenet Evol. 2014;73:87–96.
- Glöckner G, Rosenthal A, Valentin K. The structure and gene repertoire of an ancient red algal plastid genome. J Mol Evol. 2000;51:382–90.
- Tajima N, Sato S, Maruyama F, Kurokawa K, Ohta H, Tabata S, et al. Analysis of the complete plastid genome of the unicellular red alga Porphyridium purpureum. J Plant Res. 2014;127:389–97.
- Smith DR, Hua J, Lee RW, Keeling PJ. Relative rates of evolution among the three genetic compartments of the red alga Porphyra differ from those of green plants and do not correlate with genome architecture. Mol Phylogenet Evol. 2012;65:339–44.
- Hughey JR. Genomic and phylogenetic analysis of the complete plastid genome of the California endemic seaweed Wildemania schizophylla. Madrono. 2016;63(1):34–8.
- Zhang Y, Guo Y-M, Li T-J, Chen C-H, Shen K-N, Hsiao C-D. The complete chloroplast genome of Gracilariopsis lemaneiformis, an important economic red alga of the family Gracilariaceae. Mitochondrial DNA B Resour. 2016;1(1):2–3.
- Kilpatrick ZM, Hughey JR. Mitochondrial and plastid genome analysis of the marine red alga Coeloseira compressa (Champiaceae, Rhodophyta). Mitochondrial DNA B Resour. 2016;1(1):456–8.