Seedling growth
The size of the seedlings was measured three and six months after the induction of germination. The three-month-old seedlings had an average of 7.2 nodes (range 3 to 9) and an internode length of 5.6 cm, and 9 of 10 seedlings had 6 or more nodes. The six-month-old seedlings had an average of 10 nodes (range 7 to 14) and an internode length of 9.2 cm.
mRNA profile during seedling growth of identified genes coding for transcription factors involved in plant development
Samples were taken from the whole seed or the activated germinating embryo in the first two weeks of plant growth. After the first month, the shoots and the roots were separated and their RNA was sampled separately. At each time point of the study, the RNA from 10 seeds, germinating embryos, or seedlings was pooled prior to analysis by Q-RT-PCR. The mRNA level was determined in seeds at time 0, before inducing germination; at 1 day, just after water imbibition; in germinating embryos at one and two weeks during vernalization; and in seedlings every month up to the end of the follow-up at 6 months. Ten genes previously shown to code for olive transcription factors [2] and a gene previously identified as likely being involved in olive plant development (contig 30,294) [3] were analyzed for gene expression during early development (Fig. 2).
An apparent downregulation was detected during water imbibition, but this is due to the change from seed to embryo between time 0 and time 1. However, in contigs 33,184 (SOC1-like) and 32,154 (APL2-like) the downregulation occurred during the vernalization incubation. In general, the genes were upregulated later during development, except for contig 27,994 (SBP-like), which remained low for the 6 months of follow-up. The upregulation was detected in the shoots but not in the roots, except for contig 33,184 (SOC1-like), for which both roots and shoots registered higher mRNA levels three months after germination. Two more genes, contigs 31,742 (CONSTANS-LIKE 5-like) and 61,861 (SVP-like), were upregulated after the first two months, just after the seedlings were potted. However, most genes were upregulated after the third month from germination, just when the plants reached a size of around 7 nodes or more.
Of special interest, an AGAMOUS-like gene, contig 11,388, was strongly repressed during germination and its mRNA remained undetectable during the rest of the follow-up (Fig. 3a). The analysis of mRNA by Q-RT-PCR in different tissues showed that this gene was expressed only in seeds (Fig. 3b). The microarray data of meristems from plants 6 to 39 months old (available from [3]) confirmed that no expression of this gene was detectable during plant development except in dormant seeds (data not shown). Therefore, this gene could be involved in the regulation of seed dormancy, and its expression is probably repressed to allow the seed to be released from dormancy and to germinate. In addition, contig 34,438 (AGL8, FRUITFULL, APL1-like) was surprisingly upregulated during seed germination and remained elevated for the first month, and was then downregulated for the rest of the follow-up period (Fig. 3c). According to its annotation and the flowering-expression pattern observed in the microarray transcriptomic study (Fig. 3d), this gene is likely one of the A genes, exhibiting a type-A function in the ABC model of flower-pattern formation. However, it must have another role during the germination process. The contig numbers, the DNA sequences, and the microarray data were obtained from material available in [3].
RNAseq and transcriptome assembly
After trimming and cleaning, we obtained 342,049,597 paired-end reads of 101 bp length and 36.6 Phred quality (34.5 Gb). Trinity assembled all reads into 337,404 transcripts, and 26,393 unigenes were multi-isoform. After the removal of short contigs, contaminants, and non-coding transcripts, and after the selection of the longest isoform for clustered transcripts, the resulting transcriptome was composed of 109,125 unigenes (N50 = 1490 bp, average length = 839 bp). A total of 83,502 transcripts (76.52%) had an ortholog in the UniProt database of plant proteins. Sma3s software assigned functional annotations for 77,442 transcripts (67.3% of the transcriptome). The reads were mapped to this transcriptome for the time-series-expression analysis.
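For readers unfamiliar with the N50 statistic quoted above, the following is a minimal Python sketch of how N50 and average length are commonly computed from a list of contig lengths. The function name `n50` and the toy lengths are hypothetical illustrations and are not taken from the study's pipeline or data.

```python
def n50(lengths):
    """Return the N50 of a collection of sequence lengths (in bp).

    N50 is the length L such that sequences of length >= L together
    account for at least half of the total assembled bases.
    """
    if not lengths:
        raise ValueError("lengths must not be empty")
    ordered = sorted(lengths, reverse=True)
    half_total = sum(ordered) / 2
    running = 0
    for length in ordered:
        running += length
        if running >= half_total:
            return length


# Toy contig lengths (hypothetical values, not from the study).
toy_lengths = [200, 300, 500, 800, 1200, 2000]
toy_n50 = n50(toy_lengths)
toy_mean = sum(toy_lengths) / len(toy_lengths)
print(f"N50 = {toy_n50} bp, average length = {toy_mean:.0f} bp")
```

Because N50 weights long sequences more heavily than the arithmetic mean does, it is normally larger than the average length, as in the values reported for this transcriptome.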
The RNAseq results were confirmed by a different technique: the mRNA accumulation profiles determined by Q-RT-PCR in the previous section were compared with the RNAseq data. A great similarity of mRNA profiles was found in 8 of 10 unigenes, and only partial differences were found in the other two unigenes (Additional file 1: Figure S1).
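The text does not state how the similarity between profiles was assessed. As one possible illustration only, the sketch below computes a Pearson correlation between two expression profiles measured at the same time points. The choice of Pearson correlation, the function name `pearson`, and the example values are all assumptions made for this sketch, not the method or data of the study.

```python
import math


def pearson(x, y):
    """Pearson correlation between two equally long expression profiles."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("profiles must have the same length (>= 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sd_x = math.sqrt(sum((a - mean_x) ** 2 for a in x))
    sd_y = math.sqrt(sum((b - mean_y) ** 2 for b in y))
    if sd_x == 0 or sd_y == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sd_x * sd_y)


# Hypothetical profiles for one unigene at months 1 to 6 (made-up values).
qpcr_profile = [1.0, 1.2, 3.5, 4.0, 4.2, 4.1]
rnaseq_profile = [10.0, 14.0, 40.0, 44.0, 47.0, 45.0]
print(f"r = {pearson(qpcr_profile, rnaseq_profile):.3f}")
```

A coefficient close to 1 would indicate that the two techniques report the same trend over time, even when the absolute units differ.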
Time-series-expression analysis of RNAseq data
The gene-expression data from the first time point (month one) were compared one by one with the remaining time points, from month two to month six, and the genes that had at least an 8-fold change in any of the five comparisons with 99% significance were selected. This analysis selected 4633 unigenes, which were grouped according to their expression pattern into 42 k-means groups (Additional file 2: Figure S2). For each group, the enriched Gene Ontology (GO) terms were established and then subjected to a semantic association by a REVIGO analysis [20]. Of the k-means groups, 26 contained many GO terms related to stress responses but no GO terms related to plant development, and thus they were removed from the analysis. Seven of the 16 groups with relevant information about plant development corresponded to unigenes upregulated during the 6-month period studied (Fig. 4), and 9 corresponded to downregulated unigenes (Fig. 5). Additionally, k-means groups with similar patterns were grouped into clusters, and the results of the REVIGO analysis are shown in Additional file 3: Table S1.
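The fold-change part of this selection can be sketched as follows: each unigene is kept if any of months two to six differs from month one by at least 8-fold, in either direction. This is a simplified illustration under stated assumptions: the significance test (99%) is not reproduced, the pseudocount of 1 is an assumption added to avoid division by zero, and the function names and expression values are hypothetical.

```python
import math

FOLD_THRESHOLD = 8.0  # from the text: at least an 8-fold change


def max_abs_log2_fc(profile, pseudocount=1.0):
    """Largest |log2 fold change| of months 2..6 relative to month 1.

    `profile` lists expression values with month 1 first. The
    pseudocount is an assumption of this sketch, not of the study.
    """
    baseline = profile[0] + pseudocount
    return max(
        abs(math.log2((value + pseudocount) / baseline))
        for value in profile[1:]
    )


def select_changing(expression, fold=FOLD_THRESHOLD, pseudocount=1.0):
    """Return unigene names whose change reaches `fold` in any comparison."""
    cutoff = math.log2(fold)
    return [
        name
        for name, profile in expression.items()
        if max_abs_log2_fc(profile, pseudocount) >= cutoff
    ]


# Hypothetical expression values for months 1 to 6 (made-up data).
toy_expression = {
    "unigene_up": [10, 12, 100, 150, 160, 170],
    "unigene_flat": [50, 55, 48, 60, 52, 49],
    "unigene_down": [400, 300, 40, 30, 35, 20],
}
selected = select_changing(toy_expression)
print(selected)
```

Working on the log2 scale treats an 8-fold increase and an 8-fold decrease symmetrically (both correspond to an absolute value of 3), which is why a single cutoff serves for upregulated and downregulated unigenes.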
Cluster A was composed of groups 8 (133 unigenes) and 12 (136 unigenes) (Fig. 4a). In this cluster, gene expression increased over the six months of follow-up and, according to the REVIGO analysis, there were over-represented GO terms related to plant development such as: plant-type secondary cell wall biogenesis and cell wall organization and biogenesis; lignin catabolic process (biosynthetic); regulation of vernalization; formation of plant organ boundary; plant organ development; developmental maturation; multicellular organism growth; leaf pavement cell development; epidermal cell differentiation; negative regulation of multicellular organismal process (basically, this category contains GO terms related to the breaking of seed dormancy); regulation of gene expression by genetic imprinting; genetic imprinting; DNA methylation on cytosine within a CG sequence; maintenance of DNA methylation; and response to cytokinin (including response to gibberellin and other hormones).
Cluster B was composed of groups 9 (63 unigenes), 10 (98 unigenes), and 11 (173 unigenes) (Fig. 4b). In this cluster, gene expression was upregulated two months after the onset of germination, just after the seedlings were potted. The over-represented REVIGO GO terms related to plant development were: shoot system development; plant-type secondary cell wall biogenesis and cell wall organization and biogenesis; secondary metabolite biosynthetic process (including regulation of lignin biosynthetic process); positive regulation of developmental process; positive regulation of multicellular organismal process; plant septum development; developmental maturation; tissue development; leaf pavement cell development; positive regulation of stomatal complex development; regionalization (stomatal complex patterning); formation of plant organ boundary; phloem or xylem histogenesis; vernalization response; fruit ripening (a large category containing GO terms related to seed dormancy and leaf senescence); negative regulation of multicellular organismal process (a category basically containing GO terms related to the breaking of seed dormancy); acropetal auxin transport (a category related mainly to GO terms of IAA biosynthesis and metabolism); ethylene biosynthetic process; response to gibberellins; regulation of hormone biosynthetic process (a large category including biosynthetic processes of brassinosteroids and auxins); regulation of steroid metabolic process (especially including terms of abscisic acid biosynthetic process); cellular response to hormone stimuli and lipids (two large categories including many terms related to cellular response to hormone stimuli); cell-cell signaling involved in cell fate commitment (a large category with many terms of epidermis development); and negative regulation of cell differentiation (a large category of extremely diverse GO terms).
Cluster C was made up of groups 1 (146 unigenes) and 5 (126 unigenes) (Fig. 4c), and corresponded to genes upregulated three months after germination. In this cluster, the over-represented REVIGO GO terms related to plant development were: regulation of growth rate; organ growth; regulation of secondary growth; regulation of vernalization response; programmed cell death; cell death; negative regulation of seed dormancy (basically release of seed from dormancy); positive regulation of leaf senescence (a large category including other plant organ senescence and development terms); brassinosteroid metabolic process (a category including GO terms related to flavone biosynthetic process and phytosteroid metabolic process); and isoprenoid metabolic process. This last REVIGO category contained a very large set of GO terms related to the metabolism of isoprenoids, which are required in primary and secondary metabolic processes with functions in photosynthesis (carotenoids, chlorophylls, and plastoquinone), respiration (ubiquinone), membrane fluidity (sterols), and regulation of growth and development by hormones such as cytokinins, brassinosteroids, gibberellins, abscisic acid, and strigolactones.
Cluster D was composed of groups 34 (67 unigenes) and 39 (142 unigenes) (Fig. 5d) and corresponded to genes downregulated after the first month from germination. According to the REVIGO analysis, there were over-represented GO terms related to plant development such as: leaf shaping; plant septum development; internode patterning (including xylem and phloem pattern formation, embryonic root morphogenesis, and others); positive regulation of developmental growth; negative regulation of leaf senescence; tissue development; positive regulation of abscisic acid-activated signaling pathway (a large category representing GO terms of a set of genes of the abscisic acid signaling pathway different from those expressed from the second month in cluster B); response to ethylene (a category also including response to other hormones such as gibberellin and cytokinin, again corresponding to a different set of genes than in cluster B); and internode patterning (a category including GO terms related to xylem and phloem pattern formation and leaf vascular tissue pattern formation).
Cluster E consisted of groups 21 (54 unigenes) and 42 (171 unigenes) (Fig. 5e), corresponding to genes that were downregulated just after the first two months, when the seedlings were potted. According to the REVIGO analysis, there were over-represented GO terms related to plant development such as: dormancy process; positive regulation of seed germination; negative regulation of programmed cell death; cell fate specification; specification of plant organ identity; developmental process and multicellular organismal process; negative regulation of ethylene-activated signaling pathway (a very large category of GO terms including many factors and positive response to gibberellin hormone, representing GO terms of a set of gibberellin-response genes different from those expressed from the second month in cluster B); response to gibberellin (a small category reinforcing the previous one); auxin metabolic process (again suggesting a change in the set of genes involved in hormone regulation and/or transport); negative regulation of cytokinin-activated signaling pathway (same comment as for the previous points); isoprenoid biosynthetic process (again the same comment as for the previous points); cellular response to endogenous stimuli (most GO terms related to responses to different hormones; same comment as for the previous points); M specification of plant organ identity (a category of GO terms indicating a switch-off of genes that had been working in the first two months of development, after the seedlings had been potted); and release of seed from dormancy (a category including GO terms of repressed genes related to seed exit from dormancy, in agreement with the negative regulation of these kinds of genes in clusters A and B).
Cluster F contained groups 23 (73 unigenes), 31 (138 unigenes), 37 (165 unigenes), 38 (158 unigenes), and 40 (171 unigenes), corresponding to genes that were downregulated after the first three months following germination induction. The three-month time point appeared to be important because there were many changes in the expression of genes related to development. According to the REVIGO analysis, there were over-represented GO terms related to plant development such as: negative regulation of stomatal complex development (the name of this category being somewhat confusing because it includes GO terms such as stomatal complex morphogenesis, guard cell differentiation, and regulation of stomatal complex development); positive regulation of stomatal complex development (a large category including GO terms related to development); regulation of secondary growth; lateral growth; cell wall modification involved in multidimensional cell growth; growth; cell death; leaf vascular tissue pattern formation; embryonic morphogenesis (including post-embryonic plant morphogenesis); formation of plant organ boundary (a category including shoot system morphogenesis, which was activated one month earlier [Cluster B] and showed a pattern of increasing expression through the 6 months of follow-up [Cluster A]); vegetative to reproductive phase transition of meristem and regulation of timing of transition from vegetative to reproductive phase (two categories with a large number of GO terms mostly related to reproduction or development); embryonic meristem development; cell fate specification; apical cell fate commitment; cell differentiation; centrolateral axis specification; seed germination (including seedling development); response to gibberellin and response to auxin (categories that also affect other hormones, thus indicating a strong response to hormones in the third month that ended in the fourth); cellular response to hormone stimuli; negative regulation of response to stimuli (in agreement with the previous comment); regulation of developmental process (with genes involved in development being repressed after the third month of growth); plant organ morphogenesis and anatomical structure morphogenesis (a very large category that is consistent with the previous comments); and developmental process.
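As a quick cross-check of the figures reported for clusters A to F, the group sizes given in the text can be tallied in a few lines of Python. The dictionary layout and variable names are illustrative choices for this sketch; the group numbers and unigene counts are copied from the cluster descriptions above.

```python
# Cluster -> {k-means group number: unigene count}, as reported in the text.
clusters = {
    "A": {8: 133, 12: 136},
    "B": {9: 63, 10: 98, 11: 173},
    "C": {1: 146, 5: 126},
    "D": {34: 67, 39: 142},
    "E": {21: 54, 42: 171},
    "F": {23: 73, 31: 138, 37: 165, 38: 158, 40: 171},
}
upregulated = ("A", "B", "C")
downregulated = ("D", "E", "F")

# Unigenes per cluster and k-means groups per direction of change.
cluster_totals = {name: sum(groups.values()) for name, groups in clusters.items()}
groups_up = sum(len(clusters[name]) for name in upregulated)
groups_down = sum(len(clusters[name]) for name in downregulated)
total_unigenes = sum(cluster_totals.values())

for name, total in cluster_totals.items():
    print(f"Cluster {name}: {len(clusters[name])} groups, {total} unigenes")
print(f"Groups: {groups_up} upregulated + {groups_down} downregulated")
print(f"Unigenes in retained groups: {total_unigenes}")
```

The tally gives 7 upregulated and 9 downregulated groups, matching the 16 groups retained after removing the 26 stress-related ones, and 2014 unigenes across the six clusters out of the 4633 selected.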