RNA-Seq reveals genotype-specific molecular responses to water deficit in eucalyptus
© Villar et al; licensee BioMed Central Ltd. 2011
Received: 19 July 2011
Accepted: 2 November 2011
Published: 2 November 2011
In the context of climate change, phenotypic plasticity provides long-lived species, such as trees, with the means to adapt to environmental variations occurring within a single generation. In eucalyptus plantations, water availability is a key factor limiting productivity. However, the molecular mechanisms underlying the adaptation of eucalyptus to water shortage remain unclear. In this study, we compared the molecular responses of two commercial eucalyptus hybrids during the dry season. The two hybrids differ in productivity when grown under water deficit.
Pyrosequencing of RNA extracted from shoot apices provided extensive transcriptome coverage - a catalog of 129,993 unigenes (49,748 contigs and 80,245 singletons) was generated from 398 million base pairs, or 1.14 million reads. The pyrosequencing data considerably enriched existing Eucalyptus EST collections, adding 36,985 unigenes not previously represented. Digital analysis of read abundance in 14,460 contigs identified 1,280 contigs that were differentially expressed between the two genotypes, 155 contigs showing differential expression between treatments (irrigated vs. non-irrigated conditions during the dry season), and 274 contigs with a significant genotype-by-treatment interaction. The more productive genotype displayed a larger set of genes responding to water stress. Moreover, stress signal transduction seemed to involve different pathways in the two genotypes, suggesting that water shortage induces distinct cellular stress cascades. Similarly, the response of functional proteins also varied widely between genotypes: the most productive genotype decreased expression of genes related to photosystem, transport and secondary metabolism, whereas genes related to primary metabolism and cell organisation were over-expressed.
For the most productive genotype, the ability to express a broader set of genes in response to water availability appears to be a key characteristic in the maintenance of biomass growth during the dry season. Its strategy may involve a decrease in photosynthetic activity during the dry season, associated with resource reallocation through major changes in the expression of genes associated with primary metabolism. Further efforts will be needed to assess the adaptive nature of the genes highlighted in this study.
Planted forests constitute only 7% of the global forested area, but contribute to a significant proportion of overall forest goods and services (e.g. up to 35% of industrial roundwood supply). In the context of climate change, the adaptation of planted forests is essential for a sustainable forestry sector. The adaptation of industrial plantations to present and future environmental conditions (including extreme weather events) depends on several factors, including the genetic diversity of the material used for reforestation and the phenotypic plasticity of individual genotypes. Genetic diversity ensures that forest trees can survive, adapt and evolve under changing environmental conditions [1, 2], whereas phenotypic plasticity constitutes a shorter term response to environmental changes at the individual level of particular importance in long-lived organisms, such as trees [2, 3].
Eucalyptus is one of the key genera among planted trees. The genus includes the most important hardwood fibre crop species planted worldwide (19 million hectares). Several Eucalyptus species grow rapidly and are highly adaptable. These properties led to their introduction worldwide, at latitudes extending from southern Europe to South Africa. In their natural range (Australia and some nearby islands), Eucalyptus species are also found in a diverse spectrum of ecological niches. The genetic diversity of Eucalyptus has been studied extensively and remarkable levels of variation have been detected using neutral markers [5–11] and in genes possibly involved in adaptive traits [12–14]. Phenotypic plasticity is also likely to ensure better adaptation of individual genotypes to changing environmental conditions and is of particular importance in clonal forestry.
Ecophysiological studies have shown that water is the principal factor limiting stem growth in Eucalyptus (e.g. [16, 17]). Moreover, some studies have reported that eucalyptus genotypes differ in terms of their capacity for phenotypic modification in response to water deficit [18–20]. Several physiological mechanisms for coping with drought have been described in these species: i) the regulation of transpiration to decrease water loss, ii) resource reallocation from the shoot to the root, to increase water uptake, and iii) adjustment of osmotic potential or protection against reactive oxygen species, to prevent damage due to stress. Drought tolerance mechanisms have been described in detail at the molecular level for both annual and perennial model plants, such as Arabidopsis [23–25] and Populus [26–28], but little is known about the molecular basis of drought tolerance in Eucalyptus, particularly in field conditions.
Next-generation sequencing (NGS) provides new opportunities for studies of molecular plasticity in response to water deficit. The high throughput of NGS is particularly useful in non-model organisms for which few genomic resources are available. Moreover, NGS is suitable for transcript profiling, combining the high throughput of serial analysis of gene expression (SAGE) with the functional annotation capacity of EST sequencing. These techniques have been widely used for transcriptome profiling, particularly for studies of biotic and abiotic stress responses, and the characterisation of developmental processes. Considerable sequencing depth can be obtained, making it possible to identify variation in transcriptome expression.
In plants, the shoot apical meristem (SAM) is a key organ in stem development. The SAM initiates phytomers and regulates shoot growth by integrating several signals, such as hormones (ABA, auxins, cytokinins) and transcription factors (e.g. homeobox). When plants are subjected to environmental stimuli, the leaf developmental network is adjusted by changes in shoot apex activation. In Eucalyptus, EST resources have been developed for various tissues, such as roots, leaves and wood-forming tissues [35–38], but only a limited number of genomic resources are available for shoot apices, despite the important role of this organ in plant organogenesis.
In this study, we compared transcript profiles in the shoot apices of two eucalyptus genotypes used in industrial plantations, under two watering regimes -- irrigated (IR) versus non-irrigated (NI). The two genotypes differ in their growth rates and ecophysiological characteristics at maturity, with one genotype being more productive and water use-efficient than the other. We used pyrosequencing (Roche 454) to sequence non-normalized cDNA libraries constructed from shoot tip mRNA. After verifying technical reproducibility, we addressed the following questions: i) Are there molecular differences between genotypes, reflected in the contrasting phenotypes, and do these differences affect specific pathways or have a random effect on the transcriptome? ii) Can we detect molecular plasticity in the response to water shortage during the dry season, and which pathways are affected? iii) Does this plasticity differ between genotypes (i.e. is there any genotype-by-environment interaction?), and which genes or pathways reflect these differences?
We compared the response of two eucalyptus genotypes, 1-41 (NCBI Taxonomy ID: 764271) and 18-50 (NCBI Taxonomy ID: 765255), to water shortage during the dry season of 2008. These two genotypes are used in industrial plantations in the Republic of Congo. Hybrid 1-41 (named G1 in the following sections) was obtained by open pollination of E. alba (the male parent is unknown) and hybrid 18-50 (named G2) was derived from a controlled pollination of E. urophylla (genotype 14-36) by E. grandis (genotype 9-10). These two hybrids differ in their growth rates and water use efficiency (WUE, estimated by isotopic carbon composition) at maturity, G2 being superior to G1.
Trees were vegetatively propagated by rooted cuttings and established in a field experiment in Yanika, Republic of Congo (4°20'S, 11°38'E, 50 m above sea level), in June 2007. Trees were planted in plots of 64 cuttings per genotype and per treatment, including a buffer zone of 40 plants. Two watering regimes were used during the dry season: no irrigation (NI) and irrigation (IR). Trees were watered with sprinklers, to replenish evapotranspiration losses, estimated at 3 mm per day. In order to evaluate the effect of water deficit on above-ground biomass growth and molecular plasticity, plant material was sampled in September 2008, 16 months after the trees were planted. The dry season began on approximately 15 May 2008; trees under NI treatment had therefore been subjected to four months without rainfall by the time samples were collected.
Volumetric water content (VWC) was measured by time domain reflectometry (TDR; Trase system, Soil moisture, Santa Barbara, CA). Four series of TDR probes per genotype and per treatment were installed horizontally, at six depths (0.15, 0.5, 1, 2, 3 and 4 m). Mean values were calculated from the four replicated measurements at each depth.
where Xijk is the above-ground biomass in genotype i (G1 or G2), treatment j (NI or IR) and replicate k. a, b and c are the regression coefficients of G, the genotypic effect, T, the treatment effect and G × T the interaction between genotype and treatment, and εijk is the residual.
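Read together, these definitions describe a two-way linear model with interaction. One plausible written form is given below as a sketch: the intercept μ is an assumption, as it is not defined in the text, and the subscripts on G, T and G × T simply follow the indices i and j used above.

```latex
X_{ijk} = \mu + a\,G_i + b\,T_j + c\,(G \times T)_{ij} + \varepsilon_{ijk}
```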
Contigs with fewer than 10 reads for the 12 sequencing sets generated in this study were eliminated from further statistical analyses. For the 14,460 remaining contigs, the numbers of reads per sequencing set and per contig were used to assess gene abundance. Two types of statistical analysis were performed. First, pairwise comparisons were carried out between genotypes (G2 vs. G1 sequencing sets, irrespective of treatment) and between treatments (IR vs. NI sequencing sets, irrespective of genotype). Four additional comparisons were carried out for each genotype and each treatment, as follows: IR vs. NI for genotypes G1 and G2, G1 vs. G2 for treatments IR and NI. Statistical tests based on the MARS method in the DEGseq package were performed to assess differential expression. Second, two-way analysis of variance (ANOVA) was performed on contigs, making use of the three replicates (run A, B and C) per treatment to estimate random variation and test the genotype (G), treatment (T) and genotype-treatment interaction (GxT) effects. Transcript abundance was normalized by dividing the number of reads by the sequence length of the contig and the total number of sequences in each sequencing set. Contigs with a q-value <0.05 in the DEGseq test (i.e. after false discovery rate correction) and with p-values <0.05 after ANOVA were considered to be differentially expressed and were extracted for further analysis. The 14,460 contigs analysed were classified into four classes: not significantly differentially expressed (NS), and showing genotype ("G" contigs), treatment ("T" contigs) or genotype × treatment ("GxT" contigs) effects. For the comparison of expression levels, we used log2-transformed fold-changes between contig abundances in the various contrasts obtained in the DEGseq analysis.
Contigs were assigned a putative function by BLASTX, using various public databases: UniProtKB/Swiss-Prot (release 57.1), RefSeq Protein (release 34), Pfam (release 23.0), with an e-value cut-off of 10e-5. Sequences were also compared to TIGR's assemblies of Arabidopsis_thaliana (release 14), Helianthus_annuus (release 6), Populus (release 4), Picea (release 3) and Vitis vinifera (release 6), with an e-value cut-off of 10e-2. Gene Ontology terms were assigned via the UniProtKB accession and clustered with Blast2GO. The differential distributions of each class of effect (T, G, GxT and NS) between Biological Processes, Molecular Functions and Cellular Components were assessed using Fisher's exact tests, with a significance threshold of 0.05. Pathway analysis was carried out with MapMan. Differentially expressed contigs were assigned to functional categories (or bins) by Mercator (http://mapman.gabipd.org/web/guest/mercator). A dedicated pathway map was created to represent most of these contigs. The Wilcoxon rank sum test was used to identify differentially regulated bins.
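The filtering, normalization and fold-change steps described above can be sketched in a few lines of Python. This is an illustration under stated assumptions, not the pipeline used in the study: the read threshold is applied to the sum over all sequencing sets, the scaling constant and the pseudocount are arbitrary choices, and the contig names, set names and counts are invented.

```python
import math

def filter_contigs(counts, min_reads=10):
    """Keep contigs whose reads, summed over all sequencing sets, reach min_reads.
    Applying the threshold to the sum is an assumption."""
    return {c: per_set for c, per_set in counts.items()
            if sum(per_set.values()) >= min_reads}

def normalize(counts, contig_length, set_totals, scale=1e9):
    """Divide read counts by contig length and by the total reads of the set.
    The scale factor (RPKM-like) is an assumption, used only for readability."""
    return {
        c: {s: scale * n / (contig_length[c] * set_totals[s])
            for s, n in per_set.items()}
        for c, per_set in counts.items()
    }

def log2_fold_change(a, b, pseudo=1.0):
    """log2 ratio of two abundances; the pseudocount avoiding log(0) is an assumption."""
    return math.log2((a + pseudo) / (b + pseudo))

# Invented toy data: two contigs, two sequencing sets.
counts = {
    "contig_a": {"G1_IR_A": 6, "G1_NI_A": 14},
    "contig_b": {"G1_IR_A": 2, "G1_NI_A": 3},
}
lengths = {"contig_a": 1000, "contig_b": 500}
totals = {"G1_IR_A": 100_000, "G1_NI_A": 200_000}

kept = filter_contigs(counts)            # contig_b has only 5 reads and is dropped
norm = normalize(kept, lengths, totals)  # abundance per contig and per set
lfc = log2_fold_change(3, 1)             # log2((3 + 1) / (1 + 1))
```

Dividing by both contig length and set size makes abundances comparable between long and short contigs and between half-runs of different depth, which matters here because run A produced fewer reads than runs B and C.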
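As an illustration of the enrichment step, a two-sided Fisher's exact test on a 2x2 table can be written with the standard library alone. The sketch below is not the code used in the study; the table layout (contigs of one effect class inside or outside a GO term, against NS contigs) and the counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]].

    Sums the hypergeometric probabilities of all tables with the same margins
    that are no more likely than the observed table."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Probability of seeing x in the top-left cell given fixed margins.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    total = sum(p for p in (prob(x) for x in range(lo, hi + 1))
                if p <= p_obs * (1 + 1e-7))  # small tolerance for float ties
    return min(1.0, total)

# Hypothetical counts: rows = GxT contigs / NS contigs,
# columns = annotated with a given GO term / not annotated with it.
p_value = fisher_exact_two_sided(3, 1, 1, 3)
is_enriched = p_value < 0.05  # significance threshold used in the text
```

With many GO terms tested, a multiple-testing correction would normally follow; the text does not say whether one was applied to these tests.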
A factorial design including two genotypes (G1 and G2) and two water regimes (irrigated IR vs. non-irrigated NI) was established in a field trial in one of the main areas of eucalyptus plantation in the Republic of Congo. The experiment was evaluated over a period of two years. Soil water content (SWC) was monitored throughout the experiment at six depths (0.15-4 m), to assess water availability in different experimental conditions. In this study, we focus on the effect of water availability in the second dry season (after four months without rainfall) on biomass production and the transcriptome.
Summary statistics for the three 454-sequencing half-runs (columns: # of reads, # of bp, average length of reads in bp; total: 1.5 run GS-FLX Titanium)
Assembly statistics from TGICL (I), and figures obtained after Pyrocleaner analysis (II), for the set of sequences reported here (III; E. spp) (rows: # reads in contigs, average length of all contigs in bp, # large contigs >500 bp, average length of large contigs)
In total, 90,579 UE (70% of E. spp UE) did not have sequence similarity to the E. grandis sequence reads from Novaes et al. Most of these sequences were singletons (71,761), although some were contigs (18,818), corresponding to 27% of the E. spp contigs. A BLAST homology search (cut-off: 10^-10) of published eucalyptus databases (ESTs from GenBank available in April 2010; 454-ESTs generated by Novaes et al.; 454-ESTs from JGI from E. globulus xylem and leaf tissues; Illumina contigs generated by Mizrachi et al.) showed that 21,401 UE (comprising 3,066 contigs and 18,335 singletons) did not match any known sequence. Thus, the resource described here greatly extends the list of genes known to be expressed in Eucalyptus, which will be critical for the annotation of the genome sequence. Due to the smaller number of reads in run A, the total number of UE including reads generated from run A was also smaller (58,763) than that generated from the other two runs (67,467 in run B and 67,756 in run C). Each supplementary half-run produced between 16 and 21% new contigs for a second half-run, and between 5 and 7% for a third half-run (Additional file 3). Vega-Arreguin et al. reported similar trends in maize, with a plateau of gene representation reached after the third successive GS-20 454-sequencing run. The number of reads generated was therefore considered sufficient to sample most expressed genes.
Within conditions, correlations between the G1 and G2 sequencing sets were robust and similar between treatments: 90% for the IR treatment and 88% for the NI treatment, suggesting that these two genotypes displayed similar patterns of gene expression when placed in the same environmental conditions. Correlations between the IR and NI sequencing sets were slightly weaker and a stronger contrast was observed within genotypes: on average, 86% for genotype G1 and 82% for genotype G2. The weaker correlations obtained for G2 suggest that this genotype had a more pronounced response to water deficit than genotype G1.
Annotation results for E. spp protein hits, nucleic acid hits, and Gene Ontologies (GO): Biological Process (BP), Cellular Component (CC) and Molecular Function (MF)
According to Gene Ontology (GO) classification, 38,190 UE (25% of the E.spp sequencing set UE) were associated with at least one biological process (BP), molecular function (MF) or cellular component (CC). The proportions of UE annotated in each category were generally similar to those obtained in Arabidopsis (Additional file 4), suggesting that the E.spp sequencing sets are appropriate for the analysis of gene expression on a broad range of functional categories.
A total of 155 contigs showing a treatment effect ("T contigs"; figure 7B) were identified with similar numbers overexpressed in the two treatments (81 in IR and 74 in NI). Thirteen "T contigs" were expressed only in NI conditions, whereas all the "T contigs" overexpressed in IR conditions were also found in NI sequencing sets, suggesting that few "specific genes" are upregulated in response to water deficit but that the set of genes expressed in favourable conditions is also expressed at a lower level in stressed plants.
Finally, 274 contigs corresponded to "GxT" contigs. The larger number of "GxT" contigs than of "T contigs" suggests that some of the observed molecular plasticity is under genetic control. Only 11 "GxT contigs" displayed significant differential expression between the IR and NI conditions in genotype G1 (figure 7C), whereas 112 "GxT contigs" displayed such behaviour in genotype G2 (figure 7D), suggesting a more pronounced response in G2. Similarly, 48 "GxT contigs" were differentially expressed between the two genotypes in IR conditions (figure 7E), whereas 228 "GxT contigs" were differentially expressed between the two genotypes for the NI treatment (figure 7F). These results suggest that, despite the rather similar expression patterns for the two genotypes in IR conditions, water deficit induced a molecular response specific to each genotype, reflecting different strategies to respond to water shortage during the dry season.
Of the 1,445 contigs displaying significant differential expression, the 1,280 "G contigs" did not enable characterisation of specific molecular processes (i.e. did not show any clear co-regulation with genes of the same biosynthesis pathway). For 95% of these contigs, many different genes from different molecular processes were activated, depending on the experimental condition (Additional file 6). In some instances (5% of "G" contigs), some bins presented specific overexpression in one genotype (Additional file 7). For example, contigs related to ethylene biosynthesis and cell organisation were overexpressed mainly in genotype G2, whereas contigs related to photosynthesis, nitrilases, calcium signalling and pathogenesis-related protein bins were overexpressed in genotype G1. Serine proteases (9 contigs) were expressed more strongly in genotype G1 (p-value = 0.038), whereas ubiquitin E3-encoding proteins (23 contigs) were expressed more strongly in genotype G2 (p-value = 0.096), suggesting that proteolysis occurred via different pathways in the two genotypes.
Analysis of the metabolic pathways for "T" contigs was limited because of the small number of contigs (155), which were distributed in several bins (Additional file 8). However, the expression of genes related to carbohydrate degradation and ethylene biosynthesis was found to be stronger for the NI treatment, whereas the expression of genes related to ribosomal protein synthesis and cell development appeared to be stronger for the IR treatment (Additional file 7).
Next-generation sequencing (NGS) technologies are becoming the method of choice for large-scale transcriptome analysis, even for non-model species (e.g. [29, 41]). Several technologies have been developed, differing essentially in the number of reads generated and read length, making it possible to catalogue the genes expressed and to monitor gene expression.
In Eucalyptus, Mizrachi et al. generated 3.93 Gbp of short reads (36-60 bp) with sequencing-by-synthesis technology from Illumina, and assembled this information de novo into 18,894 contigs (Illumina-contigs) longer than 200 bp (22.1 Mbp in total). In this study, we obtained 0.398 Gbp of sequences with longer reads (mean of 350 bp), which were assembled into 48,950 contigs (454-sequencing contigs) with more than 200 bp each (36.5 Mbp in total). We were thus able to assemble a larger fraction of the data, i.e. 9.2% of the sequencing set (36.5 of 398 Mbp), a much higher rate than the 0.56% (22.1 Mbp of 3.93 Gbp) reported by Mizrachi et al. BLAST searches for sequence similarities between the two datasets showed that 86% of the contigs were common to both studies (42,550 454-sequencing contigs matched 16,278 Illumina contigs). However, each Illumina contig matched a mean of five 454-sequencing contigs, indicating that UE detected with our approach were probably confounded in the short-read assembly. In addition, the short Illumina contigs may represent domains shared by multiple proteins, confirming the difficulty involved in assembling short reads into transcriptional units. Alternatively, we cannot rule out the possibility that genes were split into multiple contigs in the 454 assembly because coverage was lower than with Illumina short reads. Finally, we found that 36,985 454-UE did not match any previously described eucalyptus ESTs, and that 43% of these UE (18% of the 454-sequencing contigs) displayed no match with any nucleic acid or protein sequence published for any other species. Therefore, our 454-sequencing data considerably enriched the Eucalyptus EST collection.
RNA-Seq is also an interesting approach to obtain a comprehensive digital gene expression profile for specific tissues, cell types or developmental processes. In this study, the high degree of repeatability observed for the three replicates made it possible to test G, T and GxT effects with statistical support. We were able to monitor the expression of 14,460 UE and to identify 1,445 UE displaying at least one significant effect. Some technical biases, such as non-linear amplification and a lack of sequencing depth, may have resulted in a lack of precision in the prediction of gene expression by 454-sequencing (Additional file 9). Short read-based sequencing approaches provide ample read coverage and generally give better predictions of gene expression [56, 57]. Thus, a combination of long and short reads may be seen as a reasonable strategy for the analysis of gene expression [55, 58, 59]. With this combined strategy, long-read sequencing can be used to establish a comprehensive catalogue of transcriptional units, while short reads mapped onto this assembly provide greater sequencing depth, improving predictions of abundance.
We observed that phenotypic and molecular variation are accounted for principally by genotypic differences. Indeed, above-ground biomass and contig abundance were influenced principally by genotype. Above-ground biomass was, on average, 49% higher for G2 than for G1, and most of the differentially expressed contigs (1,280 of 1,445) in this study presented a genotypic effect. The number of contigs overexpressed in one or the other of the genotypes was similar - 624 and 656 contigs were overexpressed in G1 and G2, respectively. No particular genotypic signature in terms of functional categories or pathways was detected. The two genotypes differed strongly in phenotype (not only in terms of growth potential, but also in terms of leaf morphology, stomatal distribution and water use efficiency), but it remains unclear whether these differences in transcript abundance were responsible for trait variation, neutral, or simply involved in reproductive isolation between parental species. Indeed, differences in gene expression between species have been reported in the field of ecological genomics [60, 61] and interpreted as a mechanism of speciation. In our study, both genotypes were hybrid combinations between different species. This may have increased the number of differences between their transcriptomes. The role of gene expression in ecological speciation is an important question for further investigation, particularly for eucalyptus, in which species complexes are common.
The variance accounted for by genotype-by-environment interaction (GEI) at the phenotypic and molecular level was also significant in this study. The NI treatment resulted in a significantly lower above-ground biomass, and this difference was greater for the least productive of the two genotypes, G1 (24%), than for G2 (12%). The G and T effects accounted for 56% and 13% of the above-ground biomass variation, respectively, whereas GxT effects accounted for only 0.2% of the variance. At the molecular level, we also found a higher proportion of genes displaying G effects (1,280 contigs, 8.8% of the contigs screened) but, surprisingly, we found fewer genes displaying T effects (155 contigs, 1.1% of the contigs screened) than GxT effects (274 contigs, 1.9% of the contigs screened).
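Variance shares of this kind can be read as sums of squares from the two-way model, each divided by the total sum of squares. The sketch below shows that partition for a balanced design. It assumes this is how the shares were derived, which the text does not state, and the biomass values are invented for illustration (they are not data from the study).

```python
def variance_shares(data):
    """Partition total variation of a balanced two-way design into G, T, GxT and residual.

    data maps (genotype, treatment) to a list of replicate values, with the
    same number of replicates in every cell (balanced design assumed)."""
    genotypes = sorted({g for g, _ in data})
    treatments = sorted({t for _, t in data})
    r = len(next(iter(data.values())))  # replicates per cell

    values = [v for cell in data.values() for v in cell]
    grand = sum(values) / len(values)
    cell_mean = {k: sum(v) / r for k, v in data.items()}
    g_mean = {g: sum(cell_mean[(g, t)] for t in treatments) / len(treatments)
              for g in genotypes}
    t_mean = {t: sum(cell_mean[(g, t)] for g in genotypes) / len(genotypes)
              for t in treatments}

    ss_g = r * len(treatments) * sum((g_mean[g] - grand) ** 2 for g in genotypes)
    ss_t = r * len(genotypes) * sum((t_mean[t] - grand) ** 2 for t in treatments)
    ss_gt = r * sum((cell_mean[(g, t)] - g_mean[g] - t_mean[t] + grand) ** 2
                    for g in genotypes for t in treatments)
    ss_res = sum((v - cell_mean[k]) ** 2 for k, cell in data.items() for v in cell)
    ss_total = ss_g + ss_t + ss_gt + ss_res

    return {"G": ss_g / ss_total, "T": ss_t / ss_total,
            "GxT": ss_gt / ss_total, "residual": ss_res / ss_total}

# Invented biomass values, three replicates per genotype x treatment cell.
toy = {
    ("G1", "IR"): [9.0, 10.0, 11.0],
    ("G1", "NI"): [7.0, 8.0, 9.0],
    ("G2", "IR"): [14.0, 15.0, 16.0],
    ("G2", "NI"): [12.0, 13.0, 14.0],
}
shares = variance_shares(toy)
```

In this toy example both genotypes lose the same amount of biomass under NI, so the GxT share is zero; a non-zero GxT share appears only when the IR to NI change differs between genotypes.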
Summary of phenotypic and molecular plasticity observed for the two genotypes studied (G1 and G2), between irrigated (IR) and non-irrigated (NI) treatments
IR → NI
IR → NI
+ (light signalling)
-- (light signalling) ++ (receptor kinases)
Other stress related genes
-- (heat-shock proteins) ++ (PR proteins)
+ (carbohydrates secondary metabolism),
++ (cell organisation, lipids, proteins) -- (secondary metabolism)
Cellular activity maintenance
-- (photosynthesis, transport)
Contigs displaying GEI effects could be classified into two groups according to the function of the proteins encoded: i) regulatory proteins responsible for drought signal transduction and response triggering, and ii) functional proteins involved in cell protection, damage repair and the maintenance of cell activity.
Regarding regulatory functions, the genes involved in the biosynthesis of hormones, such as ethylene and auxins in particular (aldo/keto reductase, proteins of the ethylene-responsive family, 2-oxoglutarate-dependent dioxygenase) were mostly overexpressed in genotype G1 and underexpressed in G2, in the NI condition. Genes acting as second messengers in the transduction of hormonal signals to stomatal guard cells [64, 65] also displayed GEI effects: in the NI condition, genes involved in calcium signalling were predominantly overexpressed in G1, whereas the response of G2 to drought preferentially involved receptor kinases. We also identified other signal transducers, such as light-induced proteins and heat-shock proteins, which may be related to other types of stress induced by water deficit, including osmotic stress due to pH variations and oxidative stress due to the accumulation of reactive oxygen species (ROS) [66, 67]. These pathways were mostly overexpressed in G1 and underexpressed in G2 during the dry season. Pathogenesis-related (PR) protein genes were overexpressed in G2, but displayed a less clear-cut pattern of expression in G1. PR proteins were initially reported to be induced by hormones or ROS in response to biotic stress [68, 69], but they have also been shown to be involved in abiotic stresses. Lee et al. also suggested that PR proteins may be used as storage proteins when growth is limited by environmental factors. Some transcription factors responded strongly to water shortage in genotype G2. Two, in particular, encoded factors homologous to AtMYB12 and AtMYB85, which have been shown to regulate secondary metabolism (flavonoid and lignin biosynthesis, respectively) in Arabidopsis. These results suggest that water shortage induces different cellular stress cascades, perceived differently by the two genotypes.
Stress signal transducers interact to trigger the regulation of gene expression for the maintenance of three main functions: cell protection, damage repair and the maintenance of cell activity. Our results suggest that more genes related to cell protection were involved in the response to water shortage in genotype G1 than in genotype G2. Protection against drought stress seems to involve mostly carbohydrates, with 11 contigs displaying GEI effects, and, to a lesser extent, polyamines, which may modulate some ion channels.
By contrast to the trends observed for genes related to cell protection, more genes related to damage repair seem to be expressed during the dry season in genotype G2 than in genotype G1, particularly those related to cell organisation. The overexpression, during NI treatment, of genes related to primary metabolism, including carbohydrate, lipid and protein synthesis and degradation, suggests that resources are reallocated for the repair of cell structures or the formation of new structures in drought-stressed plants. The patterns of expression of secondary metabolism genes differed between the two genotypes as well. As an example, G1 displayed a higher number of genes related to terpenoid and flavonoid synthesis (which may protect against oxidative stress) overexpressed in the NI treatment than genotype G2. However, the contrasts between NI and IR treatments were much greater for G2. Conversely, genes related to lignin biosynthesis (e.g. CCoAOMT) were overexpressed in the IR condition only for G2.
Genes related to photosynthesis were found to be under-expressed in G2 subjected to NI treatment, whereas no variation was found in G1. Other metabolic processes, such as cell development and transport, controlled by genes encoding water or sugar channels, decreased in NI treatment, particularly in G2. These results confirm the trends observed by Alexandersson et al. in Arabidopsis. These authors studied the expression of 18 genes encoding aquaporins and showed that most of these genes were downregulated in leaves subjected to a gradual water deficit. Interestingly, cell activity seemed to be more reduced at the transcriptional level for G2, although this genotype grew more strongly. It is possible that, during water deficit, genotype G2 reduces its rate of photosynthesis and reallocates resources (as suggested by changes in primary metabolism) to preserve its cell structures and ability to resume growth when conditions become more favourable.
We found that 31 of the 274 contigs displaying a GxT effect were absent from the G1 sequencing set (11.3%), whereas only two such contigs were absent in G2 (0.7%). Unfortunately, of the 31 contigs absent in G1, 16 could not be assigned to a homologous gene in Arabidopsis. The others corresponded to genes related to cell organisation (ankyrins), ethylene synthesis, protein metabolism, PR proteins and receptor kinases. These genes may be considered non-essential for tree development, and are therefore unlikely to be subject to selection constraints. Landry et al. found an overrepresentation of non-essential genes (the deletion of which is not lethal) among genes displaying GEI in Saccharomyces cerevisiae. They proposed two hypotheses to account for the activity of these genes being compensated in cells: i) metabolic buffering: non-coded metabolites may be rerouted through the metabolic network, and ii) genetic buffering: paralogous genes may supply the missing function. We showed in the results section (Additional file 5) that differences between genotypes may be accounted for by the preferential expression of different members or splicing forms of genes from the same family. This observation is consistent with the hypothesis of genetic buffering.
Scale plasticity was observed for 146 of the 274 differentially expressed contigs: genotype ranks were conserved between treatments but one genotype reacted more or less strongly to the environmental variation. Conversely, 90 contigs showed a change in ranking between genotypes (rank plasticity). Landry et al. hypothesised that these two types of GEI would have different effects on the evolution of plastic traits. In the case of scale plasticity, whatever the environment, selection would result in the same favoured genotype, whereas in the case of rank plasticity, different genotypes would be selected in different environments. In the present study, we found differentially expressed genes presenting both scale plasticity (62% of the 236 contigs assigned to one of the two types) and rank plasticity (38%), indicating different types of reaction norms on which natural selection could act.
We showed that next-generation sequencing is a powerful tool for transcriptome screening: with 398 Mb of sequence, we were able to assemble ESTs into 69,584 contigs, with 80,245 remaining singletons, and to determine the relative abundance of 14,460 contigs each comprising more than 10 reads. Large differences between genotypes, in terms of phenotypic behaviour and transcriptome regulation, were observed. Differences in gene expression between the two genotypes appear to affect the whole transcriptome, rather than specific pathways. The genotype-specific response to water shortage (i.e. GxT effect) was more pronounced than the response common to both genotypes (i.e. T effect). The genes displaying genetically controlled plasticity were found to belong to a number of different pathways essentially related to signal transduction and primary metabolism. The more productive genotype, G2, expresses a larger set of genes, leading to the triggering of specific molecular responses. Moreover, GxT interaction results principally from a lack of molecular response in genotype G1, together with a strong response of genotype G2 (table 4). The ability to regulate its transcriptome more actively might be a key component in the maintenance of biomass in water deficit conditions.
Finally, although this study provides clues to the way in which different genotypes activate their transcriptomes when subjected to water deficit, more research is required to understand the molecular mechanisms involved during the dry season. First, there is a need to characterize reaction norms in a broader genetic background. Second, epigenetic and post-transcriptional regulation mechanisms, which are well known to interfere with abiotic stress responses [76, 77], deserve specific investigation.
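The distinction between the two types of plasticity can be expressed as a simple rule on mean abundances: if the sign of the G1 minus G2 difference flips between IR and NI, the genotype ranking changes (rank plasticity); otherwise only the size of the difference changes (scale plasticity). The sketch below is one possible reading of that rule, not the procedure used in the study; treating ties as unclassified and the example abundances are assumptions.

```python
def classify_plasticity(g1_ir, g1_ni, g2_ir, g2_ni):
    """Classify a GxT contig from mean abundances in the four conditions.

    'rank'  : the genotype ranking is reversed between IR and NI.
    'scale' : the ranking is conserved, only the magnitude differs.
    Ties in either treatment are left unclassified (an assumption)."""
    diff_ir = g1_ir - g2_ir
    diff_ni = g1_ni - g2_ni
    if diff_ir == 0 or diff_ni == 0:
        return "unclassified"
    return "rank" if diff_ir * diff_ni < 0 else "scale"

# Invented abundances for three hypothetical contigs.
examples = [
    classify_plasticity(10.0, 5.0, 20.0, 8.0),  # G2 higher in both treatments
    classify_plasticity(10.0, 9.0, 8.0, 12.0),  # G1 higher in IR, G2 higher in NI
    classify_plasticity(5.0, 5.0, 5.0, 7.0),    # tie in IR
]

# Shares among the contigs assigned to one of the two types (counts from the text).
n_scale, n_rank = 146, 90
scale_pct = round(100 * n_scale / (n_scale + n_rank))
rank_pct = round(100 * n_rank / (n_scale + n_rank))
```

The percentages computed at the end are taken over the 236 contigs assigned to one of the two types, not over all 274 GxT contigs.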
We thank Andréas Ndeko, André Mabiala, Joël Polidori and Dr. Aubain R. Saya (CRDPI, Rep. of Congo) for setting up the experimental trial. This article is part of the PhD thesis of Emilie Villar, supervised by Jean-Marc Gion and Christophe Plomion. EV was supported by CIRAD. This work was supported by grants from CIRAD, ATP project "Plasticité phénotypique des pérennes sous contrainte hydrique au champ", and the Conseil Régional d'Aquitaine ("ABIOGEN" project no. Presage 32973). Some of the experiments (RT-qPCR) presented here were performed at the Genome Transcriptome facility of Bordeaux (grants from the Conseil Régional d'Aquitaine nos. 20030304002FA and 20040305003FA and from the European Union, FEDER no. 2003227).
This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.