Pyropia, a genus of marine red algae, is one of the most economically important mariculture crops. It has an annual harvest of more than 120,000 t (dry weight) and a value of over US $2 billion per year [1–3]. With the expansion of artificial seeding and the development of the floating culture method, farming and processing of Pyropia has now become one of the largest seaweed industries in East Asian countries, including China, Japan and South Korea . In China, two major cultivars, Pyropia yezoensis Ueda and Pyropia haitanensis Chang et Zheng, are distributed in northern China and southern China, respectively. P. haitanensis, a typical warm temperate zone species originally found in Fujian Province, comprises 75% of the total production of cultivated Pyropia in China [4, 5].
Pyropia are not only economically important, but also have high basic research value. As sessile organisms that inhabit the intertidal zone, an environment of rapidly changing physical conditions due to the turning tides, Pyropia have high levels of tolerance to various abiotic stressors such as desiccation, osmotic shock, temperature, and light . Furthermore, Pyropia differ from most terrestrial plants in many aspects of their biology, such as their unique heteromorphic digenetic life cycle, their special pathway for carbon assimilation in photosynthesis, their genetic chimera of blades, their ability to accumulate iodine, their original composition of their cell walls, and their associated cell wall synthesis pathways . These specific characteristics present opportunities for new discoveries in Pyropia. Recently, P. yezoensis has been recognized as a useful model organism for fundamental and applied studies of marine algae [3, 6], and a great deal of studies have been conducted to understand the special characteristics of Pyropia[1, 7–12]. A whole genome sequencing project is also ongoing. However, for P. haitanensis, which has more primitive properties and special biological processes than P. yezoensis[13, 14], limited studies have been conducted, and many aspects of its biology remain poorly explored. Currently, only 6035 ESTs and 140 nucleotide sequences are deposited in NCBI database for P. haitanensis, and its mitochondrial genome was just been sequenced . The whole genome of P. haitanensis may not be sequenced for years. The limited genomic sequence resources have hampered studies to elucidate the molecular mechanisms of specific traits and understand the complex mechanisms of stress tolerance in P. haitanensis.
Transcriptome analysis is an attractive alternative to whole genome sequencing. A transcriptome is a complete set of transcripts in a cell or an organism at a specific developmental stage or under a physiological condition [15–17]. The transcriptome can provide useful information about gene expression, gene regulation, and amino acid content of proteins. Therefore, transcriptome analysis is essential to interpret the functional elements of the genome and reveal the molecular constituents of cells and tissues [16, 17]. During the last few years, various technologies have been used to study the Pyropia transcriptomes, including EST sequencing [7, 8, 13], and microarrays . However, microarrays are limited by background and cross hybridization problems and only measure the relative abundance of transcripts. Moreover, only predefined sequences are detected. Traditional sequencing methods for the generation of ESTs require costly and time-consuming approaches involving cDNA library construction, cloning, and labor intensive Sanger sequencing. These problems have limited the transcriptome analysis in Pyropia.
This situation has changed with the introduction of Next Generation Sequencing (NGS) technology. NGS technology, including the Illumina HiSeq 2000, the Roche/454 Genome Sequencer FLX Instrument and the ABI SOLiD System, is considered a powerful and cost-efficient tool for advanced research in many areas, including re-sequencing, microRNA expression profiling, DNA methylation, and especially de novo transcriptome sequencing for non-model organisms . These NGS platforms can sequence in parallel massive amounts of DNA molecules derived directly from mRNA, producing millions or even billions of high-quality short reads. Previous studies have confirmed that the relatively short reads can be effectively assembled, especially with the great advantage of paired-end sequencing [15–18]. The Illumina transcriptome or whole genome de novo sequencing and assembly have been successfully used for several non-model organisms [17, 19–21]. Furthermore, NGS has also significantly accelerated and improved the sensitivity of gene-expression profiling, and is expected to boost collaborative and comparative genomics studies . Nevertheless, despite its obvious advantages, NGS was only recently first undertaken in Pyropia to investigate the transcriptome of P. yezoensis.
Thus, the present study aimed to characterize P. haitanensis’ functional genome and identify novel genes. The high-throughput sequencing platform Illumina HiSeq 2000 was used to profile the transcriptome of P. haitanensis. We constructed a library with mixed samples, including sporophytes and gametophytes of P. haitanensis, which were cultured under different conditions. Nearly 120 million reads totaling over 9 billion bp of high-quality DNA sequence with an average read length of 90 bp were obtained. These sequences were assembled into 24,575 unigenes by trinity de novo transcriptome assembly software , and approximately 70% of these unigenes were already annotated, as identified by BLAST searches against the SwissProt (http://expasy.org/tools/blast), Kyoto Encyclopedia of Genes and Genomes (KEGG, http://www.genome.jp/kegg), Clusters of orthologous groups (COG, http://www.ncbi.nlm.nih.gov/COG/), Nr (http://blast.ncbi.nlm.nih.gov/Blast.cgi) and gene ontology (GO) databases. A total of 2,727 cSSR (cDNA simple sequence repeat) markers of P. haitanensis also have been developed based on these unigenes by MicroSAtellite (MISA, http://pgrc.ipk-gatersleben.de/misa/) software. These assembled, annotated transcriptome sequences and SSR markers provide a valuable genomic resource for further studying the molecular basis of P. haitanensis’ special biological features and for marker-assisted selective breeding in P. haitanensis.