Isolation of nuclei from early embryogenic cells
To gain new insights into the molecular processes that cause some cells in the callus to undergo somatic embryogenesis (SE), we produced INTACT-suitable Arabidopsis transgenic plants carrying the NTF chimeric protein under the control of the LEAFY COTYLEDON 2 (LEC2) promoter region. The NTF protein consists of three domains: the WPP domain of Arabidopsis RAN GTPASE ACTIVATING PROTEIN 1, which is necessary and sufficient for nuclear envelope association; the green fluorescent protein (GFP) for visualization; and the biotin ligase recognition peptide (BLRP), which acts as a substrate for the Escherichia coli biotin ligase BirA (constitutively expressed in the background of the INTACT transgenic lines) [16]. LEC2 is a B3 domain transcription factor essential for proper development of the zygotic embryo and for triggering somatic embryogenesis in vegetative cells in the absence of exogenous auxin or stress treatments [18]. LEC2 expression in embryogenic cultures has been documented in many plant species [19,20,21], making it a first-choice marker for SE. Two parallel in vitro embryogenic callus cultures were initiated from ProLEC2:NTF transgenic lines (in a Pro35S:BirA background). As expected, the NTF protein was visible in the immature zygotic embryos used to establish the culture (Additional file 1: Figure S1A). As early as 3 days after callus induction on medium supplemented with 2,4-D, GFP fluorescence could no longer be detected, and it remained absent throughout callus formation and proliferation (Additional file 1: Figure S1B-E). After 3 weeks, one of the cultures was kept proliferating on 2,4-D, whereas the other was moved onto hormone-free medium to induce SE. After 10 days of SE induction, we detected GFP expression although no embryo structure was apparent on the callus surface (Additional file 1: Figure S1F and G).
At this point, nuclei from embryogenic cultures induced to undergo SE and expressing the NTF marker were purified with the INTACT method whereas un-induced embryogenic callus cultures were subjected to nuclei purification (Fig. 1a). Following the INTACT pull-down, we exclusively retrieved nuclei bound to streptavidin-coated beads (Additional file 1: Figure S2A-C), most of which were associated in large clumps (Additional file 1: Figure S2A, red arrowheads). By contrast, in control experiments carried out with ProLEC2:NTF plants in a wild-type background, free nuclei were observed before INTACT pull-down (Additional file 1: Figure S2D, green arrowheads), and only beads were retrieved after pull-down (Additional file 1: Figure S2E, yellow arrowhead). Nuclear RNAs from two independent biological replicates of each sample were extracted and deep sequenced. We calculated expression values for each gene in each replicate and assessed the similarity of the samples using the log likelihood ratio statistic under a simple Poisson model [22]. As expected, replicates from each cell type clustered together and separated well from the replicates of the other cell type, revealing different expression profiles between the two cell types (Additional file 1: Figure S3). In total, we detected 17,576 genes in the two cell types. Remarkably, 98.8% of expressed genes were detected in both embryogenic and proliferating callus cells, suggesting that we sampled early enough to look at the first differentiation events. Among the 1.2% of genes expressed in only one cell type, 79 were uniquely detected in callus cells and 134 were uniquely detected in embryogenic cells (Additional file 2: Table S1). To study differences in gene expression programs between the two cell types, we identified differentially expressed genes (DEGs). 
A union set of 6699 DEGs (Additional file 3: Table S2), equal to 38% of detected genes, was observed, with a balanced distribution between genes up- and down-regulated in embryogenic cells (3327 genes were upregulated in callus cells and 3372 in embryogenic cells), suggesting that, at this stage, distinct cell fates are dictated by differential expression levels rather than by cell-specific gene expression. In order to validate our transcriptome data, we checked the transcription level of endogenous LEC2 and other genes known to play a role in SE and differentiation (Fig. 1b). LEC2 features among the most dramatically upregulated genes, as expected from our experimental set-up. Nevertheless, LEC2 transcript abundance is relatively low compared to other DEGs (Fig. 1b and Additional file 3: Table S2). This is in accordance with previous transcriptome analyses performed on Arabidopsis embryogenic cultures [23], indicating that low levels of the LEC2 transcription factor are sufficient to trigger the developmental cascade that leads to SE. However, transcripts of LEC1, another marker for SE [24], were not detected. LEC1 expression is positively regulated by LEC2 [25], suggesting that we might have sampled our cultures before LEC1 transcriptional activation by LEC2. Consistent with this, another transcription factor controlled by LEC2, the MADS domain AGAMOUS-LIKE15 (AGL15) [26], was also not upregulated in embryogenic cells.
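The proportions quoted above follow directly from the reported gene counts. As a quick consistency check, the short Python sketch below recomputes them; every number is taken from the text, and the variable names are ours, introduced only for illustration.

```python
# Gene counts as reported in the text.
detected = 17576            # genes detected across the two cell types
unique_callus = 79          # detected only in callus cells
unique_embryogenic = 134    # detected only in embryogenic cells
degs_up_callus = 3327       # DEGs upregulated in callus cells
degs_up_embryogenic = 3372  # DEGs upregulated in embryogenic cells

# Genes detected in only one of the two cell types.
unique_total = unique_callus + unique_embryogenic
unique_pct = 100 * unique_total / detected

# Genes detected in both cell types.
shared_pct = 100 * (detected - unique_total) / detected

# Union set of DEGs and its share of all detected genes.
deg_total = degs_up_callus + degs_up_embryogenic
deg_pct = 100 * deg_total / detected

print(f"unique to one cell type: {unique_total} genes ({unique_pct:.1f}%)")
print(f"shared by both cell types: {shared_pct:.1f}%")
print(f"DEGs: {deg_total} genes ({deg_pct:.0f}% of detected genes)")
```

The rounded values agree with the 1.2%, 98.8% and 38% figures given in the text.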
Beyond LEC transcription factors, a handful of genes have been implicated in the transition from somatic to embryonic fate in plants. Similarly to what has been observed with LEC1 and LEC2, over-expression of the AINTEGUMENTA-LIKE (AIL) transcription factors BABYBOOM (BBM) and AIL5 promotes embryogenesis and organogenesis in the absence of exogenously applied growth regulators [27, 28], whereas the SOMATIC EMBRYOGENESIS RECEPTOR KINASE1 (SERK1) has been shown to enhance somatic embryo development [9]. We found upregulation of BBM, AIL5 and SERK1 in embryogenic cells, though to a much lesser extent than LEC2 (Fig. 1b). Interestingly, their absolute expression level is high in both callus cells and embryogenic cells (Additional file 3: Table S2), suggesting that they might play a different role than LEC2 and might be involved in the acquisition of competence to undergo SE (induction phase) rather than in triggering embryo differentiation (developmental phase). Consistent with this interpretation, BBM was recently found to directly and positively regulate LEC2 and LEC1 expression [29]. The AP2/ERF transcription factor WOUND INDUCED DEDIFFERENTIATION 1 (WIND1) has been implicated in establishing and maintaining the de-differentiated status of somatic cells upon wounding, and seedlings over-expressing WIND1 exhibit callus-like, unorganized cell proliferation around the shoot meristem [30]. In our experimental system, WIND1 exhibits the same pattern observed for BBM and SERK1 (Fig. 1b), suggesting that it might play a similar role in conferring embryogenic competence. Finally, qRT-PCR analyses on embryogenic callus and un-induced callus confirmed expression trends for all genes tested (Fig. 1c). Detection of relatively early markers of SE and the strong enrichment in LEC2 transcripts indicate that we have correctly purified SE-induced cells, whereas the absence of LEC2-induced markers suggests that we have sampled cultures at an early stage of SE.
Last, to verify that our culture conditions were suitable for producing embryogenic callus, we left part of our callus material on hormone-free medium for up to 3 weeks in order to observe somatic embryo emergence. Additional file 1: Figure S4 shows somatic embryos emerging from embryogenic callus (Additional file 1: Figure S4A) and an optical section of a developing somatic embryo (Additional file 1: Figure S4B) obtained through confocal microscopy.
Early embryogenic cells are transcriptionally rather than metabolically active
To study the differences between embryogenic and non-induced callus cell types from a functional and molecular point of view, we performed gene ontology studies for DEGs. Our analyses show that DEGs up-regulated (DEGsUP) and DEGs down-regulated (DEGsDOWN) in embryogenic cells fall into different gene ontology categories (Additional file 3: Table S2 and Fig. 2), implying that different transcriptional programs are active in callus and embryogenic cells. In the ‘Biological process’ category, over-represented GO terms for DEGsUP include ‘actin filament-based movement’, ‘movement of cell or subcellular component’ and ‘microtubule-based movement’ (Fig. 2), suggesting a re-organization of cell contents in embryogenic cells, possibly due to changes in cell fate and activation of polarized cell growth. Other GO terms over-represented in DEGsUP include ‘chromosome organization’ and ‘chromatin organization’ (Fig. 2), in line with mounting evidence that epigenetic marks act as gatekeepers to cell fate transitions [31, 32]. Notably, the ‘regulation of gene expression’ category is over-represented in DEGsUP and under-represented in DEGsDOWN. We quantified the number of differentially expressed transcription factors (TFs) and found that they account for 14.5% of DEGsUP and for only 4.2% of DEGsDOWN (64 out of 1539; the percentages are significantly different, p < 0.01, Fisher's exact test). This suggests that a boost in the expression of transcription factors is likely to cause the activation of SE developmental pathways, or the repression of callus fate. On the other hand, among the DEGsDOWN, we observed an enrichment of gene categories linked to a variety of metabolic activities such as ‘neutral lipid metabolic process’, ‘cellulose biosynthetic process’ and ‘plant-type cell wall biogenesis’. These results were further confirmed by gene enrichment studies in the ‘molecular function’ category (Fig. 2): over-represented GO terms for DEGsUP include ‘chromatin binding’, ‘nucleic acid binding transcription factor activity’ and ‘motor activity’, whereas over-represented GO terms for DEGsDOWN feature terms related to biochemical activities such as ‘polygalacturonate 4-alpha-galacturonosyltransferase activity’, ‘carbohydrate binding’, ‘substrate-specific transmembrane transporter activity’ and ‘neutral lipid metabolic processes’ (Fig. 2). Overall, these analyses suggest that embryogenic cell fate is associated with enhanced transcriptional activity and repression of metabolic pathways.
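The comparison of transcription factor proportions between DEGsUP and DEGsDOWN rests on a Fisher's exact test on a 2x2 table. A minimal, standard-library Python sketch of such a two-sided test is given below. It is not the code used for the analysis. The DEGsDOWN counts (64 of 1539) come from the text, whereas the DEGsUP counts are hypothetical placeholders chosen only to match the reported 14.5%, because the raw numbers are not given here.

```python
from math import comb


def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test p-value for the 2x2 table [[a, b], [c, d]]."""
    row1, row2, col1 = a + b, c + d, a + c
    denom = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of x counts in the top-left cell,
        # with all row and column totals held fixed.
        return comb(row1, x) * comb(row2, col1 - x) / denom

    p_obs = prob(a)
    lo, hi = max(0, col1 - row2), min(row1, col1)
    # Sum the probabilities of all tables at least as unlikely as the observed one.
    return sum(p for p in map(prob, range(lo, hi + 1)) if p <= p_obs * (1 + 1e-7))


# DEGsDOWN: counts reported in the text.
down_tf, down_total = 64, 1539
# DEGsUP: HYPOTHETICAL counts (145 of 1000 = 14.5%); only the percentage is reported.
up_tf, up_total = 145, 1000

p_value = fisher_exact_two_sided(
    up_tf, up_total - up_tf, down_tf, down_total - down_tf
)
print(f"two-sided p-value: {p_value:.3g}")
```

With real data, the placeholder row would be replaced by the actual number of transcription factors among DEGsUP and the size of that gene set.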
Early embryogenic cells share similarities with meristematic and embryo cells at the transcriptional level
We next studied the transcriptional set-up of embryogenic cells and their resemblance to other cell types. For this, we performed principal component analysis (PCA) including both collected data sets (embryogenic cells and proliferating callus cells) together with publicly available expression data (microarrays or RNA deep sequencing collected from different Arabidopsis tissues). In all PCAs, embryogenic cells and callus cells clustered close to each other (Figs. 3 and 4). Although, to a great extent, this is expected when comparing data from different experiments, the tight clustering of our samples might be the result of early sampling during somatic embryogenesis and suggests that we are looking at the first differentiation steps between these two cell types. This interpretation is in line with the finding of a large gene expression overlap between the two samples. Somatic embryos have been widely reported to resemble zygotic embryos both morphologically and physiologically [8, 33, 34]. In order to assess similarities between our cultures and developing zygotic embryos, we performed a PCA using our datasets together with publicly available sequencing data from 1 to 2 cell embryos, 8 cell embryos (octant) and 32 cell embryos (globular stage) [35]. Whereas the first principal component separates samples by experiment, the second component accounts for more than 8% of the variance in the dataset and correlates well with embryo development. According to PC2, embryogenic cells are closely related to 8-cell embryos (Fig. 3a, green dots) and well differentiated from 2 or 32-cell embryos (Fig. 3a, yellow and blue dots respectively). In accordance with this result, we detected low or no transcripts for WUSCHEL (WUS), SHOOTMERISTEMLESS (STM) and CLAVATA3 (CLV3), meristematic genes whose expression has been documented in embryogenic culture systems (Fig. 1b and Additional file 4: Table S3) [12, 36,37,38]. 
Indeed, during zygotic embryo development, markers of an organized shoot apical meristem (SAM) are visible only from the 16-cell stage onwards, with WUS expression appearing in four sub-epidermal apical cells [39], later followed by STM activation in the apical domain of the early globular embryo [40] and finally by CLV3 expression, detected at the heart stage between the emerging cotyledons [41]. On the other hand, we did not detect WUSCHEL-related-HOMEOBOX 2 (WOX2) and WOX9, known to establish the apical and the basal domains of the early zygotic embryo [42]. Lack of WOX2 and WOX9 expression suggests that early patterning during somatic embryo establishment might be directed by alternative developmental routes. Other members of the family belonging to the WOX2 module (WOX1,2,3,5), recently reported to initiate the stem cell program during zygotic embryogenesis [43], were not detected, with the exception of WOX5, which is nonetheless not differentially expressed in embryogenic cells. The HD-ZIP III genes PHABULOSA (PHB), PHAVOLUTA (PHV), and REVOLUTA (REV) are well-known factors playing a fundamental role in establishing the SAM in zygotic embryos. They are expressed throughout the 16-cell stage embryo, and in later stages their expression is restricted to the central region of the embryo (SAM included), the provasculature and the adaxial side of the cotyledons [44, 45]. We found these genes highly expressed in both callus and embryogenic cells, together with other HD-ZIP III genes: CORONA (CNA) and Arabidopsis thaliana HOMEOBOX GENE8 (ATHB8). REV, CNA and ATHB8 are listed among the DEGsUP, arguing that SAM specification during SE takes alternative developmental routes to those known to function during zygotic embryogenesis.
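The logic of the PCA described above, in which the first component absorbs the difference between experiments and the second tracks developmental stage, can be illustrated on a toy matrix. The pure-Python sketch below is an illustration only, under stated assumptions: the expression values are invented, the function names are ours, and power iteration on the sample-by-sample Gram matrix stands in for the optimized decomposition that a real analysis would take from a numerical library.

```python
import math


def center_columns(rows):
    """Subtract each gene's (column's) mean across samples."""
    n = len(rows)
    means = [sum(col) / n for col in zip(*rows)]
    return [[v - m for v, m in zip(row, means)] for row in rows]


def top_components(rows, k=2, iters=200):
    """Return (scores, explained_ratios) for the first k principal components."""
    x = center_columns(rows)
    n = len(x)
    # Sample-by-sample Gram matrix; its eigenvalues carry the PCA variances.
    gram = [[sum(a * b for a, b in zip(x[i], x[j])) for j in range(n)]
            for i in range(n)]
    total = sum(gram[i][i] for i in range(n))
    scores, ratios = [], []
    for _component in range(k):
        v = [float(i + 1) for i in range(n)]  # fixed, deterministic start vector
        for _ in range(iters):  # power iteration towards the leading eigenvector
            w = [sum(gram[i][j] * v[j] for j in range(n)) for i in range(n)]
            norm = math.sqrt(sum(c * c for c in w))
            v = [c / norm for c in w]
        eigval = sum(v[i] * gram[i][j] * v[j] for i in range(n) for j in range(n))
        scores.append([c * math.sqrt(eigval) for c in v])
        ratios.append(eigval / total)
        # Deflation: remove the component just found before extracting the next one.
        gram = [[gram[i][j] - eigval * v[i] * v[j] for j in range(n)]
                for i in range(n)]
    return scores, ratios


# INVENTED toy data: rows are samples, columns are genes.
# Genes 1-2 differ between experiments A and B; genes 3-4 change with stage.
samples = [
    [0, 0, 0, 2],    # experiment A, stage 0
    [0, 0, 1, 1],    # experiment A, stage 1
    [0, 0, 2, 0],    # experiment A, stage 2
    [10, 10, 0, 2],  # experiment B, stage 0
    [10, 10, 1, 1],  # experiment B, stage 1
    [10, 10, 2, 0],  # experiment B, stage 2
]
scores, ratios = top_components(samples)
print("explained variance ratios:", [round(r, 3) for r in ratios])
print("PC1 scores:", [round(s, 2) for s in scores[0]])
print("PC2 scores:", [round(s, 2) for s in scores[1]])
```

In this toy case the first component separates the two experiments, while the second orders samples by stage regardless of their experiment of origin. This is the situation in which a low-variance second component remains biologically informative.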
Given the documented expression of SAM markers in a variety of embryogenic cultures [12, 36, 38], it is generally believed that SAM organization is one of the early events in SE. Thus, we performed a PCA to compare the expression patterns of callus cells and embryogenic cells to those from SAM functional subdomains [46]. Namely, we used expression profiles from the stem cell niche marked by CLV3 expression, the organizing center marked by WUS expression and the SAM peripheral zone marked by FILAMENTOUS FLOWER (FIL) expression. As in the previous analysis, the first principal component, which explained 75.6% of the variance, clearly separated our dataset from the SAM dataset. However, when principal component 2 was considered (explaining 7.4% of the variance), we observed a closer similarity of the embryogenic cells to the pFIL-expressing domain of the SAM than to the stem cell niche or the WUS-expressing organizing center (Fig. 3b). This is in line with the lack of transcripts for the well-known regulators of stem cell activity WUS, CLV3, CLV1 and CLV2 [39, 47] (Fig. 1b). Although we did not detect expression of FIL or of other members of the YABBY family in embryogenic cells (Additional file 3: Table S2 and Additional file 4: Table S3), the YABBY positive regulators KANADI 1 (KAN1) and KAN2 [48, 49] are among the DEGsUP (Fig. 1b and Additional file 3: Table S2). Furthermore, we observed low or no transcript levels of the class I KNOTTED-like homeobox (KNOX1) genes STM, KNOTTED1-LIKE HOMEOBOX GENE 1 (KNAT1), KNAT2 and KNAT6 [50]. By contrast, three out of four members of the class II KNOX (KNOX2) genes, KNAT3, KNAT4, and KNAT5, are among our DEGsUP (Fig. 1b-c and Additional file 3: Table S2; Additional file 4: Table S3). KNOX1 and KNOX2 have been shown to have antagonistic functions, with KNOX1 involved in maintenance of meristematic potential in the SAM and KNOX2 implicated in leaf primordia formation [51]. 
The Myb transcription factor ASYMMETRIC LEAVES 1 (AS1), whose expression is detected in young leaf primordia, acts in antagonism with KNOX1 transcription factors [52], and is up-regulated in embryogenic cells in our study (Fig. 1b and Additional file 3: Table S2; Additional file 4: Table S3).
Members of the AIL transcription factors are known to play a central role in embryogenesis, meristem maintenance, organ positioning and growth [53]. Together with upregulation of the two AIL members (AIL5 and BBM) known to play a role in SE (discussed above), we observed upregulation of ANT and AIL7 (Fig. 1b-c and Additional file 4: Table S3). Similarly to what is observed with the KNOX gene family, among the AIL genes, we observed upregulation of members involved in the development of the meristem peripheral zone and young leaf primordia [53, 54], whereas other members implicated in maintenance of the stem cell niche (such as AIL3, AIL4, and AIL6) [53, 54], were not found differentially expressed (Fig. 1b and Additional file 4: Table S3), suggesting that SAM peripheral zone markers might switch on before stem cell niche ones during SE.
In 2010, Sugimoto et al. showed that callus induced by the application of auxin and cytokinin to in vitro cultured Arabidopsis tissues is characterized by gene expression patterns reminiscent of root meristems, even if it is derived from aerial organs [55]. We performed PCA using our samples and publicly available microarray expression profiles of a high-resolution set of developmental time points within a single Arabidopsis root [56]. The first principal component explained 25.57% of the variance and separated root tissues from our datasets, whereas the second (explaining 11.16% of the variance) suggested a closer similarity of our datasets to meristematic tissues of the root tip, such as the proximal lateral root cap (LRC) (Fig. 4, blue triangles indicated by arrows), than to other root tissues or more distal parts of the LRC (Fig. 4). This result is in line with the observations previously made by Sugimoto et al. [55], linking callus cell fate to meristematic root tissue fate. Moreover, this PCA is supported by the expression of well-known markers for endodermis and LRC specification in our samples (Additional file 4: Table S3). The NAC domain transcription factors FEZ, SOMBRERO (SMB) and BEARSKIN1 (BRN1) are important factors in patterning the root tip by controlling cell division planes and root cap maturation [57, 58], and their transcripts are more abundant in callus cells than in embryogenic cells, as they feature in our DEGsDOWN list (Additional file 3: Table S2 and Additional file 4: Table S3). On the other hand, the ground tissue-specific markers SCARECROW (SCR), SHORTROOT (SHR), JACKDAW (JKD) and CYCLIN D6 (CYCD6) are all upregulated in embryogenic cells (Additional file 3: Table S2 and Additional file 4: Table S3). SCR, together with SHR, is known to play a role in both root and shoot endodermis specification in Arabidopsis [59]; thus, their expression in embryogenic cells is a sign of ongoing tissue specification and patterning.