Transcriptomic analysis of ‘Suli’ pear (Pyrus pyrifolia white pear group) buds during the dormancy by RNA-Seq

Background Bud dormancy is a critical developmental process that allows perennial plants to survive unfavorable environmental conditions. Pear is one of the most important deciduous fruit trees in the world, but the mechanisms regulating bud dormancy in this species are unknown. Because genomic information for pear is currently unavailable, transcriptome and digital gene expression data for this species would be valuable resources to better understand the molecular and biological mechanisms regulating its bud dormancy. Results We performed de novo transcriptome assembly and digital gene expression (DGE) profiling analyses of ‘Suli’ pear (Pyrus pyrifolia white pear group) using the Illumina RNA-seq system. RNA-Seq generated approximately 100 M high-quality reads that were assembled into 69,393 unigenes (mean length = 853 bp), including 14,531 clusters and 34,194 singletons. A total of 51,448 (74.1%) unigenes were annotated using public protein databases with a cut-off E-value above 10-5. We mainly compared gene expression levels at four time-points during bud dormancy. Between Nov. 15 and Dec. 15, Dec. 15 and Jan. 15, and Jan. 15 and Feb. 15, 1,978, 1,024, and 3,468 genes were differentially expressed, respectively. Hierarchical clustering analysis arranged 190 significantly differentially-expressed genes into seven groups. Seven genes were randomly selected to confirm their expression levels using quantitative real-time PCR. Conclusions The new transcriptomes offer comprehensive sequence and DGE profiling data for a dynamic view of transcriptomic variation during bud dormancy in pear. These data provided a basis for future studies of metabolism during bud dormancy in non-model but economically-important perennial species.

Pears (Pyrus spp.) are among the world's most important perennial deciduous fruit trees and have a key feature of transition from growth to dormancy during their annual growth cycles. Studies of pear dormancy have focused mainly on the physiological level, including respiration [26], carbohydrate [27,28] and protein metabolism [29], and chilling requirements [30]. To date, few studies of pear dormancy at the molecular level have been conducted. Ubi et al. (2010) isolated two dormancy-associated MADS-box (DAM) genes and studied their expression patterns during the seasonal endodormancy transition phases in Japanese pear (Pyrus pyrifolia) [31]. Although these data highlighted the potential of molecular research to understand dormancy in this crop, they were insufficient to elucidate the molecular regulation mechanism. Furthermore, with global warming, many fruit trees, including pears, growing in warm areas have suffered from inadequate winter chill and showed advanced springtime and delayed autumnal phenologies, uneven foliation and blossoming, and long blooming periods, which are unfavorable for sustainable pear production [32][33][34][35][36][37]. Therefore, understanding the molecular mechanisms of pear dormancy transitions will greatly assist programs to breed cultivars with lower chilling requirements and to develop agronomic measures to cope with insufficient winter chill.
Traditionally, researchers have studied target nucleotide sequences by cloning, sequencing and comparing with known sequences, annotating their functions, then verifying their functions using tools such as RNAi, microarrays, and genetic transformation. These methods are very useful, but characterizing a large number of genes in a single experiment is difficult, especially with respect to specific genes. In recent years, RNA-sequencing (RNA-Seq) technology based on pyro-sequencing has become the most popular and powerful tool for species that lack reference genome information. RNA-Seq is less costly, more efficient, and allows faster gene discovery and more sensitive and accurate profiles of the transcriptome than microarray analysis or other techniques [38][39][40][41][42][43][44][45]. To better understand the molecular mechanisms of bud dormancy transition in pear trees, we used RNA-Seq technology to identify and characterize the expression of a large number of genes, especially those differentially expressed during dormancy progression.
The aim of the present study was to gain an understanding of molecular mechanisms during pear bud dormancy and establish a sound foundation for future molecular studies. We sequenced cDNA libraries from lateral flower buds of 'Suli' pear (Pyrus pyrifolia white pear group) from endo to ecodormancy stages using Illumina deep-sequencing technology. Approximately 100 M high-quality reads were obtained and assembled into 69,393 unigenes. Furthermore, four digital gene expression (DGE) libraries were constructed to compare gene expression patterns in different dormancy states using an upgraded digital gene expression system. The assembled and annotated transcriptome sequences and gene expression profiles will provide valuable resources for the identification of pear genes involved in bud dormancy.

Dormancy status of lateral flower buds in pears
To identify differentially expressed genes (DEGs) during dormancy, the dormancy status of lateral flower buds in pears must be defined. Dormancy status on four collection dates was estimated using excised shoots. No bud break was observed on shoots sampled on Nov. 15 or Dec. 15, but more than 50% of the buds had broken on Jan. 15 and Feb. 15 ( Figure 1). Lateral flower buds were determined to be in the endodormant phase on Nov. 15 and Dec. 15 and in the ecodormant phase on Jan. 15 and Feb. 15. Figure 1 Bud break percentage of 'Suli' pear after 21 days of forcing conditions. Dormant shoots of field-grown 'Suli' pear trees were collected on Nov. 15, Dec. 15, Jan. 15, andFeb. 15, 2010-2011, and kept in water in a phytotron at day/night 25 ± 1/18 ± 1°C, with a 12-h photoperiod of white light (320 μ photon mol m -2 s -1 ) and 75% humidity. Percent bud break in 12 shoots per sampling period was assessed after 21 d.

Illumina sequencing and de novo assembly
To obtain an overview of the pear bud transcriptome during dormancy, a cDNA library was generated from RNA isolated from buds pooled from Nov. 2010 to Feb. 2011, then paired-end sequenced using the Illumina platform. After cleaning and quality checks, approximately 100 M high-quality reads were assembled into 197,357 contigs ( Table 1). The mean contig size was 272 bp. Using paired-end joining and gap-filling, these contigs were further assembled into 69,393 unique sequences with a mean size of 853 bp, including 14,531 clusters and 34,194 singletons. The size distributions of these contigs and unigenes are shown in Additional file 1. To evaluate the quality of sequencing data, we randomly selected 8 unigenes and designed 8 pairs of primers (Additional file 2) for RT-PCR amplification. Each primer pair resulted in a band of the expected size; the identity of all PCR products was confirmed by Sanger sequencing.

Functional classification
We used GO and KEGG assignments to classify the functions of the predicted pear unigenes. Approximately 48,725 sequences could be annotated using GO, and 36,717 unigenes could be categorized into three main categories: biological process, cellular component, and molecular function. To our knowledge, the apple (Malus × domestica) genome has been completed. Among the organisms with completely-sequenced genomes, apple is most closely related to pear. Therefore, the distribution of GO annotations in the pear unigene data was compared with that of the apple genome (63,517 full length sequences) (ftp://ftp.jgi-psf.org/pub/JGI_data/phytozome/ v8.0/Mdomestica/annotation/Mdomestica_196_cds.fa.gz) using Blast2GO. The sequences could be categorized into 60 GO functional groups ( Figure 3). The percentages of annotated apple genes assigned to each group generally mirrored those of pear unigenes, reflecting the similar functionalities of their genomes and further highlighting that a large diversity of pear unigenes was represented by these sequences. Through sequence comparison, we observed that, although both species were highly similar, significant differences also existed between them.

DGE library sequencing and mapping sequences to the reference transcriptome database
Using the RNA-seq technique, we analyzed changes in gene expression at four times during pear bud dormancy. Four DGE libraries (from buds sampled on Nov. 15, Dec. 15, Jan. 15, and Feb. 15) were sequenced to generate approximately 13-15 million clean reads per library after filtering the raw reads. The total number of mapped reads in each library ranged from 11.0-12.3 million, and the percentage of these reads ranged from 78.8-80.9%. Among them, the number of unique match reads ranged from 7.0-7.9 million (Table 2). To confirm whether the number of detected genes increased proportionally to sequencing effort, sequence saturation analysis was performed. A trend of saturation where the number of detected genes almost ceases to increase when the number of reads reaches 5 million (Additional file 5). We evaluated the randomness of the Distinct sigletons 34,194 Sequences with E-value < 10 -5 51,448  DGE data by analyzing the distribution of reads matching the reference genes [38], because nonrandom biases for specific gene regions can directly affect subsequent bioinformatics analysis. The randomness of the data was good, with reads evenly distributed throughout the transcriptome (Additional file 6). For mRNA expression, heterogeneity and redundancy are two significant characteristics. While the majority of mRNA is expressed at low levels, a small proportion of mRNA is highly expressed. Therefore, the distribution of unique reads was used to evaluate the normality of our RNA-Seq data. As shown in Figure 4, the distribution of unique reads over different reads abundance categories showed similar patterns for all four RNA-Seq libraries. The similarity distribution had a comparable pattern with more than 43% of the sequences having a similarity of 80%, while approximately 50% of the hits had a similar range ( Figure 4).  Figure 6A). In the molecular function category, only two GO, terms 'heme binding' and 'tetrapyrrole binding' , were significantly enriched in all three pairwise comparisons, while   'monooxygenase activity' was significantly enriched in the Dec. 15-VS-Jan. 15 and Jan. 15-VS-Feb comparisons ( Figure 6B). In the biological process category, no GO terms were significantly enriched in all three pairwise comparisons, but 'photosynthesis' , 'photosystem II assembly' , 'oxidation-reduction process' , and 'photosyn- The second largest group (Group 4) contained 56 (29.5%) genes, from CL 12859 to CL 9922, that were down-regulated from Nov. 15 to Jan. 15 then up-regulated between Jan. 15 and Feb. 15. This group mainly included genes encoding proteins associated with photosynthesis metabolism, such as chlorophyll A/B binding protein, photosystem I reaction center subunit III, photosystem I reaction center subunit IV A, plastocyanin A, chloroplast oxygen-evolving enhancer protein 1, ribulose bisphosphate carboxylase, cytochrome P450, and magnesium-protoporphyrin IX monomethyl ester cyclase. Additionally, zinc finger protein, SPL domain class transcription factor, and basic helix-loop-helix domain-containing protein were encoded.

Changes in gene expression profiles among dormancy stages
The third largest group (Group 5) contained 30 (15.8%) genes, from CL 10 to unigene 27694, that were up-regulated from Nov. 15 to Jan. 15, then downregulated between Jan. 15 and Feb. 15. This group mainly included genes encoding proteins associated with oxidation-reduction reaction and resistance, such as blue copper protein, 2-oxoglutarate-dependent dioxygenase, cytochrome P450, heat shock protein, glycine rich protein, and dehydrin 1.
In Group 3, seven genes were down-regulated from Nov. 15 to Dec. 15, but were up-regulated between Dec. 15 and Feb. 15. Finally, two genes (Group 1) were up-regulated from Nov. 15 to Dec. 15, but were downregulated between Jan. 15 and Feb. 15. Of these nine genes, only 2-aminoethanethiol dioxygenase isoform 2 and transferase were definitely annotated.

Gene expression analysis and Q-PCR validation
The RNA sampled at four times during bud dormancy provided templates for real-time quantitative PCR (Q-PCR) validation. We randomly selected seven DEGs to demonstrate the RNA-seq results (Figure 8). The Q-PCR data for these genes were basically consistent with the RNA-seq results of the four samples. Linear regression [(Q-PCR value) = α (RNA-seq value) + b] analysis showed a highly significant correlation (R = 0.7533 ** ) which indicated good reproducibility between transcript abundance assayed by RNA-seq and the expression profile revealed by Q-PCR data ( Figure 9).

Phylogenetic analysis of dormancy-associated MADS-box (DAM) genes and their expression variations
A phylogenetic tree constructed using the nucleotide sequences of two unigenes and 18 other MADS-box genes ( Figure 10) revealed that CL 1161.contig2 was more closely related to PpMADS13-1 of Pyrus pyrifolia, whereas CL 1161.contig5 was more similar to PpMADS13-2 of P. pyrifolia. Additionally, the results revealed that the pear DAM genes were more closely related to those of Prunus species (Prunus persica and P. mume). However, the pear DAM genes formed an independent subclade.
Based on the phylogenetic analysis, we selected two unigenes (CL 1161.contig2 and CL 1161.contig5) to analyze their expression variations during dormancy. The expression levels of both genes decreased with endodormancy release in lateral flower buds ( Figure 11).

Discussion
In this study, using RNA-Seq technology, a 'Suli' pear cDNA library and four DGE libraries (from samples collected on Nov. 15 [46]. These ESTs participated in different metabolic pathways related to photoperiod, temperature, circadian clocks, water, energy, reactive oxygen species, and hormones [5,22,46]. To our knowledge, this is the first report to use RNA-Seq technique to identify large numbers of genes involved in different metabolic pathways in pear bud dormancy. Compared with traditional EST analyses, RNA-Seq was less expensive, more efficient, and allowed faster gene discovery in bud dormancy studies. Through RNA-seq analysis, we found that the numbers and expression profiles of DEGs differed at different times during dormancy. A total of 1,978, 1,024, and 3,468 genes were differentially expressed between Nov. 15 [25]. By analyzing KEGG pathways, we found DEGs that participated in several different pathways. Some pathways (such as starch and sucrose metabolism, circadian rhythm, and flavonoid biosynthesis) had been previously correlated to bud break in other species [5,17,22,46], and some like phenylpropanoid biosynthesis, stilbenoid, diarylheptanoid and gingerol biosynthesis, zeatin biosynthesis, ether lipid metabolism, endocytosis, and glycerophospholipid metabolism were associated with bud break for the first time in this study. These data may suggest new research directions for understanding bud dormancy.
Some of the genes found in this work had been previously identified in other perennial plants. The DAM genes, widely described in perennial species such as leafy spurge [5], peach [9][10][11][12], raspberry [22], Japanese apricot [15], kiwifruit [24], and Japanese pear [31], are candidates for internal factors controlling endodormancy. In this study, we also found two DAM genes, and phylogenetic analysis revealed that CL 1161.contig2 was more closely related to PpMADS13-1 of Pyrus pyrifolia, whereas CL 1161.contig5 was more similar to PpMADS13-2 of Pyrus pyrifolia. Changes in the expression of CL 1161.contig2 and CL 1161.contig5 decreased with endodormancy release in lateral flower buds were consistent with the findings of earlier work comparing PpMADS13-1 and PpMADS13-2 (See figure on previous page.) Figure 7 Hierarchical clustering analysis of differentially-expressed genes during pear dormancy. The log 2 Ratio for significantly differentially-expressed genes were used. Each column represents a comparison of samples between dates (e.g., Nov. 15-VS-Dec. 15), and each row represents a gene. Expression differences are shown in different colors; red indicates up-regulation and green indicates down-regulation. The 190 genes were classified into seven regulation patterns.
gene expression in lateral leaf buds of Japanese pear [31], PpDAM5 and PpDAM6 in lateral vegetative buds and lateral flower buds of peach [10,11], and all PmDAMs (PmDAM1-PmDAM6) in lateral vegetative buds of Japanese apricot [15]. Our study, along with previous studies, suggested that DAM genes might play significant roles in the regulation of bud dormancy in 'Suli' pear.
The accumulation of dehydrin (DHN) is known to be associated with freezing tolerance in plants [47]. Recent studies have reported that the accumulation of DHNs in woody plants correlates with seasonal transitions in dormancy and cold acclimation stages during winter [16,48], but characterizations of DHN genes expressed during dormancy are limited. Yakovlev et al. (2008) found a reduction in the transcript levels of most of the 15 DHNs that they cloned as Norway spruce neared bud-burst [48]. Garcia-Bañuelos et al. (2009) reported that transcripts of apple DHN were highly expressed in bark and bud tissues when the plant was dormant and cold hardy in midwinter, but were not expressed in early spring when cold hardiness was lost and buds were growing [49]. Recently, several studies have identified DHN genes that were activated by ABA and C-repeat binding factor (CBF) in response to abiotic stresses [14,[50][51][52]. Intriguingly, in leafy spurge, ABA levels were elevated during endodormancy but dropped following the transition to ecodormancy [5]. Horvath et al. (2008) found that CBF genes involved in cold regulated were up-regulated during the transition from para-to endo-dormancy [5]. In the present study, one gene (CL 9148) encoding dehydrin showed significantly higher expression during the transition from endo-to eco-dormancy; thereafter, the expression level of this gene rapidly decreased, as indicated by DGE analysis and Q-PCR data. Based on previous studies, we speculated that DHN genes may act as signals and offer some protection for 'Suli' pear after the end of endodormancy, when pear often encounters unfavorable environmental conditions, such as cold. Therefore, more attention should be paid to ABA and CBF, which activate DHN genes, in future studies of transcriptional regulation related to the pear dormancy process.
Generally, sugar transport is thought to occur via H + /sugar symports that depend on a pH gradient generated by a plasma membrane H + -ATPase [53]. Gevaudant et al. (2001) examined expression of the four H + -ATPase genes and reported that the levels of three H + -ATPase gene mRNAs increased, whereas the level of one H + -ATPase gene decreased in vegetative buds of peach trees after dormancy release [54]. Mazzitelli et al. (2007) demonstrated that the plasma membrane H + -ATPase gene was highly expressed during the dormancy transition [22]. Our results showed that the expression of a plasma membrane H + -ATPase gene (CL 1729) was up-regulated in pear buds during the endodormant maintenance period, down-regulated during endodormancy-release, and then up-regulated again. The expression patterns of plasma membrane H + -ATPase in 'Suli' pear were different from those of peach and raspberry buds, perhaps due to species-level or tissue-level differences.
In the present study, some genes encoding galactinol synthase (CL 9475), plastocyanin A (CL 1227), chlorophyll A/ B binding protein (CL 7129, CL 514, CL 4948, CL 9178, CL 9961, CL 3279), and S-adenosylemethionine decarboxylase (CL 2122) were differentially expressed. Of these, chlorophyll A/B binding protein and S-adenosylemethionine decarboxylase were previously reported in other perennial plants [46,55]. The expression levels of these genes changed significantly during the dormancy process. Thus, these genes may play roles in the regulation of bud dormancy in 'Suli' pear.
Although the molecular mechanisms associated with dormancy transitions in pear trees remain largely unknown, the present transcriptome analysis provided valuable information that could facilitate future studies on the detailed molecular functions of genes related to pear bud dormancy.  Coefficient analysis between gene expression ratios obtained from RNA-seq and Q-PCR data. The Q-PCR log 2 values (expression ratios; y-axis) were plotted against dormancy stages (x-axis). ** indicates a significant difference at p ≤ 0.01.

Conclusions
We obtained transcriptome data that provided comprehensive sequencing and DGE profiling data for a dynamic view of transcriptomic variation during the dormancy stage in pear. Physiological processes such as phenylpropanoid biosynthesis, stilbenoid, diarylheptanoid and gingerol biosynthesis, zeatin biosynthesis, ether lipid metabolism, endocytosis, glycerophospholipid metabolism, photosynthesis, phenylalanine metabolism, starch and sucrose metabolism were all differentially regulated during bud dormancy. Approximately 190 genes involved in many metabolic processes were significant differentially regulated during bud dormancy. Genes related to bud dormancy and their expression profiles at four timepoints during dormancy were analyzed further. This offered new insights into the molecular mechanisms underlying pear bud dormancy. This provided a relatively complete molecular platform for future research on the progression of pear bud dormancy. To our knowledge, this work is the first to study pear bud dormancy using RNA-Seq.

Plant materials
'Suli' pear cultivars grafted onto Pyrus betulaefolia Bunge rootstocks were obtained from the Dangshan Germplasm Resources Center (Dangshan County, Anhui Province, China). Pear trees were 10 years old and were considered to be in the adult phase. Trees used in the experiment were not pruned or chemically treated during the experimental period. All samples were collected from the same trees at each stage. The current season's growth shoots were collected on Nov. 15, Dec. 15, Jan. 15, Jan. 25and Feb. 15 from Nov. 2010 through Feb. 2011. Lateral flower bud samples were stored immediately in liquid N 2 and then at −80°C until RNA extraction after picking.

Dormancy status of lateral flower buds
The dormancy status of lateral flower buds at four collection dates (Nov. 15,Dec. 15,Jan. 15,and Feb. 15) was estimated by evaluating excised shoots from fieldgrown trees. To measure the percentage of bud break, 12 current season's growth shoots, with lengths of about 60 cm and bearing apical buds and 10-12 lateral flower buds, were collected and placed in water in 500 mL vials in a phytotron kept at day/night 25 ± 1/18 ± 1°C, with a 12-h photoperiod of white light (320 μ photon mol m -2 s -1 ) and 75% humidity. The water in the vials was changed and the basal ends of the shoots were cut every 2-3 d. After 21 d, the dormancy status was valued by percentage bud break; the beginning of bud break was defined as green leaf tips enclosing visible flowers. Lateral flower buds of shoots with bud break percentages < 50% were considered to remain in the endodormant stage.

RNA extraction, library preparation and RNA-seq
Thirty shoots (three biological replicates, with 10 shoots per replicate) with lengths of about 60 cm and bearing 10-12 lateral flower buds were sampled at each stage (Nov. 15, Dec. 15, Jan. 15, Jan. 25, and Feb. 15). The buds were sampled from three biological replications of each stage and produced an independent pool. Total RNA was extracted from lateral flower buds using the pine tree extraction protocol [56]. The transcriptome assembly library was pooled by mixing equal quantities of RNA of five dormancy stages. The four DGE libraries consisted of separate RNA extracts from buds of four different dormancy stages, i.e., Nov. 15, Dec. 15, Jan. 15, and Feb. 15. Each library was pooled by mixing equal quantities of RNA from three biological replications for each stage. Each pool was sequenced once technically since the RNA-seq data are highly replicable with relatively little technical variation [57]. The following protocols were performed by staff at the Beijing Genome Institute (BGI; Shenzhen, China). RNA integrity was confirmed with a 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA, USA). Oligo-(dT) magnetic beads were used to isolate poly-(A) mRNA from total RNA, and mRNA was fragmented in fragmentation buffer. Using these short fragments (≈200 bp) as templates, random hexamer-primers were used to synthesize first-strand cDNA. Second-strand cDNA was synthesized using buffer, dNTPs, RNaseH, and DNA polymerase I. Short double-stranded cDNA fragments were purified with QiaQuick PCR extraction kit (Qiagen, Venlo, The Netherlands), resolved with EB buffer for end reparation and adding poly (A), then ligated to sequencing adapters. After purification via agarose gel electrophoresis, suitable fragments were enriched by PCR amplification before library sequencing using Illumina HiSeq ™ 2000 (San Diego, CA, USA).

De novo assembly and function annotation
Raw sequence data in fastq format were filtered to remove reads containing adaptors, reads with more than 5% unknown nucleotides, and low-quality reads with more than 20% bases of quality value ≤ 10. Only clean reads were used in the following analysis. The sequences from the Illumina sequencing were deposited in the NCBI Sequence Read Archive (Accession SRX147917). First, transcriptome de novo assembly was carried out by BGI using the short-read assembly program Trinity [58] with the following parameters: min_contig_length = 100, min_glue = 2, group_pairs_distance = 250, path_reinforcement_distance = 75, bfly_opts = '-V 5 -edge-thr = 0.05 -stderr', min_kmer_cov = 2. Meanwhile, all reads of approximately 100 M (i.e. transcriptome sequencing reads and RNA-Seq reads) were mapped back to the apple genome reference (ftp://ftp.jgi-psf.org/pub/JGI_data/phytozome/ v8.0/Mdomestica/assembly/Mdomestica_196.fa.gz) to identify continuous gene regions using SOAPsplice software (Release 1.9; http://soap.genomics.org.cn/soapsplice. html). Secondly, we realigned all the transcripts onto the reference genome. When more than one transcript were placed in one gene region and they each other had an overlap less than 24 bp, we connected them into a longer transcript. A total of 31,727 transcripts assembled by Trinity were connected into 19,309 transcripts. Then the redundancy of unigenes was removed by TGICL (v.2.1) with options '-l 40 -v 25'. Finally based on gene family clustering, the unigenes were divided into two classes: clusters and singletons. The former was prefixed with 'CL' and the latter with 'unigene'. The id number of each unigene followed this prefix. In a cluster, the similarity between unigenes was more than 70%.
Blastx alignment (E-value < 10 -5 ) between unigenes and protein databases such as nr, Swiss-Prot, KEGG, and GO was performed, and the best-aligning results were used to determine the sequence direction of unigenes. When different databases conflicted, the results were prioritized in the order: nr, Swiss-Prot, KEGG, and GO. When a unigene did not align to any of the databases, ESTScan [59] was used to decide its sequence direction. GO annotation was analyzed by Blast2GO software (v.2.5.0). KEGG pathway annotation was performed using Blastall software against the KEGG database. The assembled sequences could be searched using the Gene-ID listed in Additional materials (Additional file 3).

Analysis and mapping of DGE reads
Raw sequence data in fastq format were filtered to remove reads containing adaptors, reads with more than 10% unknown nucleotides, and low-quality reads with more than 50% bases of quality value ≤ 5. The sequences from the DGE analysis were deposited in the NCBI Sequence Read Archive under accession numbers SRX148326 (Nov. 15), SRX148327 (Dec. 15), SRX148328 (Jan. 15), and SRX148329 (Feb. 15). Clean reads were mapped to our transcriptome reference database using the short oligonucleotide analysis package SOAPaligner/ soap2 [60], allowing mismatches of no more than two bases. The unique mapped reads were used in subsequent analyses. For gene expression analysis, the number of unique-match reads was calculated and then normalized to RPKM (reads per kb per million reads) [61]. The RPKM method eliminates the influence of different gene lengths and sequencing discrepancies on the calculation of gene expression, so that the calculated gene expressions can be directly compared among samples.

Evaluation of DGE libraries
A statistical analysis of the frequency of each uniquematch read in each DGE library was performed to compare gene expression at different times in pear dormancy using the method described by Audic et al. [62]. The P value was used to identify genes expressed differentially between each samples following the formula below, in which N1 and N2 represent the total numbers of unique-match reads in Samples 1 and 2, respectively, and gene A contained x and y unique-match reads mapped to Samples 1 and 2, respectively. Enriched P-values were calculated according to the hypergeometric test [ The false discovery rate (FDR) was used to determine the P-value threshold in multiple testing [63]. Briefly, assuming that R differentially expressed genes had been selected, S genes of those were really shown differential expression, whereas the other V genes actually indicated no difference which were false positive. If the error ratio (Q = V/R) was required to remain below a specified cutoff (0.01), the FDR value should not exceed 0.01. FDR-values were calculated according to the Benjamini and Hochberg algorithm [63]: We used FDR ≤ 0.001, the absolute value of log 2 Ratio ≥ 1, and the RPKM value of each gene for either sample over 10 as the thresholds to judge the significance of gene expression differences, where log 2 Ratio indicates the degree of differential expression between two samples and was the ratio of RPKM values for the treatment and control samples. This analysis found genes with significantly differential expression among samples prior to GO function and KEGG pathway analyses.

Clustering of gene expression profiles
Hierarchical cluster analysis of 190 gene expression patterns was performed with cluster [64] and Java Treeview [65] software. The log 2 Ratio for each gene was used for the hierarchical clustering analysis.

Phylogenetic analysis
A phylogenetic tree was constructed based on the nucleotide sequences of two unigenes and 18 additional DAM genes. The tree was generated with MEGA (v. 4.0.1) [66] software, using the neighbor-joining method.

Real-time quantitative RT-PCR analysis
Total RNA used for Q-PCR analysis was extracted from lateral flower buds of four different dormancy stages, i.e., Nov. 15, Dec. 15, Jan. 15, and Feb. 15, using three biological replicates of about 300 buds. Total RNA was extracted as described above, genomic DNA was removed with DNase I, and total RNA concentration was measured. First-strand cDNA was synthesized from 4 μg of DNA-free RNA using the Revert Aid ™ First Strand cDNA Synthesis Kit (Fermentas, Glen Burnie, MD, USA) according to the manufacturer's instructions. The cDNA was used as the template for Q-PCR. Primer sequences (designed using primer 3, http://frodo.wi.mit.edu/cgi-bin/ primer3/primer3_www.cgi) are listed in Additional file 16. The Q-PCR mixture (20 μl total volume) contained 10 μl of SYBR Premix ExTaq ™ (Takara, Kyoto, Japan), 0.4 μl of each primer (10 μM), 2 μl of cDNA, and 7.2 μl of RNasefree water. The reactions were performed on a LightCycler 1.5 instrument (Roche, Basel, Switzerland) according to the manufacturer's instructions. The two-step Q-PCR program began with 30 seconds at 95°C, followed by 40 cycles of 95°C for 5 seconds and 60°C for 20 seconds. Template-less controls for each primer pair were included in each run. The specificity of Q-PCR primers was confirmed by melting curve and sequencing of Q-PCR products. Expression was calculated as 2 -Δ Δ Ct and normalized to that of the actin gene (PpActin, JN684184) [67], and data were managed with the Data Processing System (DPS, v. 7.05, Zhejiang University, Hangzhou, China).