Identification of fasciclin-like arabinogalactan proteins in textile hemp (Cannabis sativa L.): in silico analyses and gene expression patterns in different tissues
© The Author(s). 2017
Received: 7 April 2017
Accepted: 31 July 2017
Published: 20 September 2017
The fasciclin-like arabinogalactan proteins (FLAs) belong to the arabinogalactan protein (AGP) superfamily and are known to play different physiological roles in plants. This class of proteins was shown to participate in plant growth, development, defense against abiotic stresses and, notably, cell wall biosynthesis. Although some studies are available on the characterization of FLA genes from different species, both woody and herbaceous, no detailed information is available on the FLA family of textile hemp (Cannabis sativa L.), an economically important fibre crop.
By searching the Cannabis genome and EST databases, we identified 23 CsaFLAs, which fall into four phylogenetic groups. A real-time qPCR analysis performed on stem tissues (isolated bast fibres and shivs sampled at three heights), hypocotyls (6, 9, 12, 15, 17 and 20 days old), whole seedlings, roots, leaves and female/male flowers of the monoecious fibre variety Santhica 27 indicates that the identified FLA genes are differentially expressed. Interestingly, some hemp FLAs are expressed during early phases of fibre growth (elongation), while others are more highly expressed in the middle and base of the stem and are thus potentially involved in secondary cell wall formation (fibre thickening). The bioinformatic analysis of the promoter regions shows that the FLAs upregulated in the younger regions of the stem share a conserved motif related to flowering control and regulation of photoperiod perception. The promoters of the FLA genes expressed at higher levels in the older stem regions, instead, share a motif putatively recognized by MYB3, a transcriptional repressor belonging to the MYB family subgroup S4.
These results point to the existence of a transcriptional network fine-tuning the expression of FLA genes in the older and younger regions of the stem, as well as in the bast fibres/shivs of textile hemp. In summary, our study paves the way for future analyses on the biological functions of FLAs in an industrially relevant fibre crop.
Arabinogalactan proteins (AGPs) are cell surface glycoproteins belonging to the hydroxyproline-rich glycoprotein superfamily ( and references therein) which are involved in many aspects of plant development, e.g. pattern formation, phytohormone interaction, tissue differentiation, reproduction, response to (a)biotic stresses, cell expansion and secondary cell wall deposition [2, 3]. These heavily glycosylated proteins are subdivided into four main classes: classical AGPs, AG peptides, Lys-rich AGPs and fasciclin-like AGPs (FLAs) [3–6].
FLAs are characterized by the occurrence of one or two AGP domains, as well as one or two fasciclin (FAS) domains . FAS domains were first identified in the fruit fly Drosophila melanogaster and later found in many other organisms, from bacteria to higher plants to animals . Although a consensus sequence for the FAS domains is lacking, two regions are highly conserved, named H1 and H2 (of ca. 10 amino acids) . Additionally, most FLAs show an N-terminal signal peptide and a C-terminal glycosylphosphatidylinositol (GPI) membrane anchor [5, 7], mediating attachment to the cell surface.
FLAs constitute multigene families in plants: for example, 21 FLAs have been identified in thale cress, 24 in rice, 35 in poplar, 34 in wheat, 19 in cotton, 33 in Chinese cabbage and 18 in eucalypt [5, 7–11]. Molecular studies focused on FLAs are important, since they increase our understanding of the molecular functions of this protein family: the available literature on the topic has shown that FLAs in plants are not only related to tissue-specific functions, but also involved in generalized responses to environmental constraints, both biotic and abiotic [3, 7, 11, 12].
Additionally, a strong body of evidence in the literature has highlighted the importance of FLAs in regulating aspects linked to cell wall biosynthesis and, more generally, to stem mechanics in both herbaceous and woody species, as well as fibre growth. For instance, in Arabidopsis, insertional mutants of Atfla11 and Atfla12 and Atfla11/fla12 double mutants show modified stem mechanics, due to a decrease in cellulose, arabinose and galactose in secondary cell walls . Likewise, in Eucalyptus, FLAs belonging to the subgroup A [5, 12] are involved in stem mechanics : in particular EgrFLA2 is linked to cellulose microfibril angle. In poplar, antisense expression of PtFLA6 alters secondary cell wall composition in the xylem, by affecting the biosynthesis of lignin and cellulose . In cotton, GhFLA1 is involved in fibre initiation and elongation: its overexpression increases fibre length, while its silencing results in shorter fibres with an altered primary cell wall composition . In the fibre crop flax, some FLAs were shown to be upregulated at the snap point, a physical region marking the transition from elongation to cell wall thickening, hence confirming the potential function of these genes in the regulation of fibre development [15, 16].
The molecular steps involved in the regulation of bast fibre initiation, development and intrusive growth comprise many still unexplored aspects [24–26]; hence an increased knowledge in these mechanisms would favor the development of biotechnological tools focused on bast fibre improvement.
In the light of the above-mentioned relationships between FLAs and cell wall-related processes and considering the industrial applications of C. sativa, we here sought to identify and study the expression patterns of hemp FLA genes in the different stem tissues, as well as in other organs. By using bioinformatics coupled to RT-qPCR, we show that some FLA genes are highly expressed in bast fibres. Moreover, we identify groups of FLAs, upregulated either at the top or the bottom of the stem, which share putative conserved elements in their promoters. Our study therefore lays the foundation to further molecular analyses on a unique family of proteins in an important herbaceous crop.
Plant material and growth conditions
A hemp monoecious fibre variety (C. sativa cv. Santhica 27) was studied in this work. Plants were grown and sampled as described in . Briefly, after six weeks of growth in controlled chambers, samples were taken along three stem regions localized at different heights with respect to the “snap point” (i.e. an empirically defined reference region marking the transition from elongation to secondary cell wall thickening; ). The “TOP” segment internode corresponds to the region right below the apex (above the snap point), the “MID” (middle) segment is the internode containing the snap point and the “BOT” (bottom) segment is located two internodes below the “MID” sample (for clarity, a cartoon depicting the sampling strategy is shown in Fig. 1b). A segment of 2.5 cm was collected in the middle of each internode to limit the variation in gene expression due to the varying developmental stages of the cell types.
Fibres were separated from the shivs by peeling the cortical tissues and by quickly processing them as described in . The shivs were directly plunged in liquid nitrogen and stored at −80 °C. The number of independent biological replicates is four, with the exception of the BOT core tissues, for which the biological replicates are three. A total of 13 plants were pooled for each replicate.
Two leaves (sampled below the TOP region from 4 biological replicates, each composed of a pool of 16 plants) were frozen in liquid nitrogen after removal of the midrib with a scalpel and subsequently stored at −80 °C. Hemp seedlings were obtained by germinating the seeds for 2 days at 25 °C (16 h 25 °C/8 h 20 °C light/dark cycles) on moist cotton wool; four biological replicates, each composed of 15–20 seedlings were frozen in liquid nitrogen and stored at −80 °C until RNA extraction. A pool of 4–5 female and male flowers sampled from 4 biological replicates, each composed of a pool of 5 plants (grown at 60% humidity with a 10 h light 25 °C/14 h dark 20 °C cycle during 5 weeks) were sampled, immediately plunged in liquid nitrogen and stored at −80 °C. Roots from four biological replicates, each composed of 16 plants, were extensively rinsed with tap water to remove soil particles, then blotted dry, directly frozen in liquid nitrogen and stored at −80 °C.
The hypocotyls, aged from 6 to 20 days after sowing, were grown and sampled as described in . Three biological replicates, each consisting of a pool of 20 hypocotyls, were used.
Identification of CsaFLA genes using bioinformatics
In order to identify the FLA genes in C. sativa (hereafter referred to as CsaFLAs for both the genes and the corresponding proteins), different databases were searched: the Medicinal Plant Genomics Resource (http://medicinalplantgenomics.msu.edu/mpgr_external_blast.shtml) and the Cannabis sativa Genome Browser Gateway (http://genome.ccbr.utoronto.ca/cgi-bin/hgBlat?command=start&org=C.+sativa&db=finola1&hgsid=93256). CsaFLAs were identified by using orthologous FLA protein sequences of Arabidopsis thaliana and Populus trichocarpa. These sequences were used to perform a BLAT analysis against the hemp Finola and Purple Kush database (Cannabis Genome Browser Gateway; ) and a BLASTP in the MPGR database. Several incomplete sequences were retrieved when using the MPGR database; however, it was possible to deduce their full-length sequences by querying either the Cannabis Genome Browser Gateway or the EST database at NCBI (dbEST; available at http://www.ncbi.nlm.nih.gov/dbEST/).
In silico and phylogenetic analyses of CsaFLA protein sequences
Putative FAS domains were identified with the Motif Scan algorithm (http://myhits.isb-sib.ch/cgi-bin/motif_scan), N-terminal signal peptides were identified with SignalP (http://www.cbs.dtu.dk/services/SignalP/) and SignalBlast (http://sigpep.services.came.sbg.ac.at/signalblast.html); the subcellular localization was predicted with TargetP (http://www.cbs.dtu.dk/services/TargetP/).
The big-PI Plant Predictor program (; available at http://mendel.imp.ac.at/gpi/plant_server.html) was used to identify the glycosylphosphatidylinositol (GPI) anchor. The 3D homology models of the hemp FLA 10 and FLA 11 were generated with iTASSER Suite ( using 4ut1 and 1o70 as targets respectively; available at http://zhanglab.ccmb.med.umich.edu/I-TASSER/) employing LOMETS, SPICKER and TM-align. The models were then refined using REMO by optimizing the backbone hydrogen-bonding networks and FG-MD by removing the steric clashes and improving the torsion angles. The H1 and H2 conserved regions, motifs and residues implicated in adhesion in both proteins were manually annotated according to Johnson et al. . The final structures showing various domains, conserved regions, motifs and residues involved in adhesion were visualized with Swiss PDB Viewer v4.1 . Conserved motifs in the CsaFLA promoter sequences (retrieved at the Cannabis sativa Genome Browser Gateway) were identified using the MEME Suite 4.11.2 (; available at http://meme-suite.org/doc/cite.html?man_type=web). The identified motifs were subsequently analyzed with Tomtom (; available at http://meme-suite.org/tools/tomtom) for a comparison against the available motifs in the JASPAR CORE plant database 2016 . For the phylogenetic analysis, full-length sequences were aligned with ClustalOmega (http://www.ebi.ac.uk/Tools/msa/clustalo) and the generated alignment submitted to PHYML (http://www.phylogeny.fr) to obtain a maximum likelihood phylogenetic tree. The Maximum Likelihood tree was constructed using an aLRT (approximate likelihood ratio test) for non-parametric branch support, based on a Shimodaira-Hasegawa-like procedure. The tree was visualized with iTOL-Interactive Tree Of Life (http://itol.embl.de/). Intron-exon junctions were visualized with Gene Structure Display Server 2.0 (GSDS, http://gsds.cbi.pku.edu.cn/) .
Immunohistochemical analyses were performed on resin-embedded tissue sections, as previously described . The LM14 antibody (PlantProbes) was diluted 1:10 in milk protein (MP)/PBS (5% w/v). Sections were incubated for 1.5 h, rinsed three times in PBS and subsequently incubated for 1.5 h with the anti-rat IgG coupled to FITC (Sigma) diluted 100-fold in MP/PBS.
RNA extraction and RT-qPCR
Total RNA was extracted using a modified CTAB extraction protocol combined with an RNeasy Plant Mini Kit (Qiagen) according to . The RNA concentration and quality were measured by using a Nanodrop ND-1000 (Thermo Scientific) and a 2100 Bioanalyzer (Agilent), respectively. One microgram of RNA was retrotranscribed into cDNA using the ProtoScript II RTase (NEB) and random primers, according to the manufacturer’s instructions.
The cDNA was diluted to 2 ng/μL and 2 μL were used for the RT-qPCR analysis in 384-well microplates. An automated liquid handling robot (epMotion 5073) was used to prepare the 384-well microplates (10 μL final volume). A tissue maximization design was used to prepare the microplates. The expression of each CsaFLA was normalized using 5 reference genes (tubulin, CDPK, RAN, clathrin and F-box, which geNORMPLUS identified as sufficient for appropriate data normalization) for the stem tissues, as described in , and 3 (RAN, TIP41 and F-box) for the other tissues (leaves, seedlings, flowers and roots). For statistical analysis, the normalized relative quantities exported from qBasePLUS were log2 transformed. A one-way ANOVA was carried out using IBM SPSS Statistics v19. Tukey’s HSD was performed as a post-hoc test. The normal distribution of the data was verified with a Kolmogorov–Smirnov test.
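The normalization step described above can be illustrated with a minimal sketch. It assumes the commonly used model in which a target gene's relative quantity is divided by the geometric mean of the reference genes' relative quantities before log2 transformation; qBasePLUS additionally applies efficiency correction and inter-run calibration, which are not reproduced here. All Cq values, the calibrator Cq values and the function names are hypothetical and serve only as an illustration.

```python
import math


def geometric_mean(values):
    """Geometric mean of a list of positive numbers."""
    return math.exp(sum(math.log(v) for v in values) / len(values))


def relative_quantity(cq, cq_calibrator, efficiency=2.0):
    """Relative quantity RQ = E^(Cq_calibrator - Cq); E = 2 assumes perfect doubling."""
    return efficiency ** (cq_calibrator - cq)


def log2_nrq(target_rq, reference_rqs):
    """log2 of the target RQ normalized to the geometric mean of the reference RQs."""
    return math.log2(target_rq / geometric_mean(reference_rqs))


# cDNA input per reaction, from the text: 2 ng/uL x 2 uL in a 10 uL reaction.
template_ng = 2 * 2

# Hypothetical Cq values for one sample (not measured data).
target_rq = relative_quantity(cq=24.0, cq_calibrator=26.0)
reference_cqs = {"RAN": 19.0, "TIP41": 20.0, "F-box": 21.0}
reference_rqs = [relative_quantity(cq, cq_calibrator=20.0) for cq in reference_cqs.values()]

log2_value = log2_nrq(target_rq, reference_rqs)
print(f"cDNA per reaction: {template_ng} ng, log2 NRQ: {log2_value:.2f}")
```

Using the geometric rather than the arithmetic mean keeps a single highly abundant reference gene from dominating the normalization factor.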
Primers were designed using Primer3Plus (http://www.bioinformatics.nl/cgi-bin/primer3plus/primer3plus.cgi/) and verified with the OligoAnalyzer 3.1 tool from Integrated DNA Technologies (http://eu.idtdna.com/calc/analyzer). Primer efficiencies were checked via qPCR using a serial five-fold dilution of cDNA (25, 5, 1, 0.2, 0.04, 0.008 ng/μL). The primer sequences, amplicon length and Tm, amplification efficiencies and R² are indicated in Additional file 1: Table S1.
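As an illustration of how an amplification efficiency and R² can be derived from such a dilution series, the sketch below fits Cq against log10(concentration) and applies the standard relation E = 10^(-1/slope) - 1. The concentrations are the six points of the five-fold series given above; the Cq values are hypothetical (an ideal assay in which each five-fold dilution adds log2(5) cycles), and the function names are illustrative rather than taken from any software used in the study.

```python
import math


def linear_fit(xs, ys):
    """Ordinary least-squares fit y = slope * x + intercept, with R^2."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    r_squared = sxy ** 2 / (sxx * syy)
    return slope, intercept, r_squared


def amplification_efficiency(slope):
    """Efficiency from the standard-curve slope; 1.0 corresponds to 100 %."""
    return 10 ** (-1.0 / slope) - 1.0


# Five-fold serial dilution from the text: 25, 5, 1, 0.2, 0.04, 0.008 ng/uL.
concentrations = [25 / 5 ** i for i in range(6)]

# Hypothetical Cq values for an ideal assay (not measured data).
cqs = [18.0 + i * math.log2(5) for i in range(6)]

slope, intercept, r_squared = linear_fit([math.log10(c) for c in concentrations], cqs)
efficiency = amplification_efficiency(slope)
print(f"slope = {slope:.3f}, efficiency = {efficiency:.1%}, R^2 = {r_squared:.4f}")
```

With real data, the slope departs from the ideal value of about -3.32, and the resulting efficiency and R² are what would be reported per primer pair.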
Sequencing of some representative CsaFLA promoters
To determine the homology of the promoter sequences of the variety Santhica 27 with those from the PurpleKush and Finola reference genomes, primers were designed on 3 representative genes (CsaFLA2-7-16) using the available sequences at the Cannabis sativa Genome Browser to perform nested PCRs (Additional file 2: Table S2). Genomic DNA was extracted from stem tissues (whole internodes) by using a CTAB-based protocol coupled to the NucleoSpin Plant II kit (Macherey-Nagel). Briefly, 500 μl of extraction buffer (2% CTAB, 2.5% PVP-40, 2 M NaCl, 100 mM Tris-HCl pH 8.0, 25 mM EDTA and 10 μl RNase) were added to 100 mg of finely ground sample and the slurry was vortexed vigorously. After an incubation step at 60 °C for 10 min, 20 μl β-ME/ml buffer were added and the samples were further incubated for 20 min at 60 °C. Subsequently, 500 μl chloroform/isoamyl alcohol 24:1 were added, the samples were vortexed and centrifuged at RT for 10 min at 10,000 g. To the aqueous phase, 2/3 volume of cold isopropanol was added and the DNA was precipitated for 1 h at −20 °C. After this stage, the NucleoSpin Plant II columns were used to bind the DNA and the manufacturer’s instructions were followed to elute genomic DNA.
PCRs were performed using 50 ng DNA and the Q5 Hot Start High-Fidelity 2X Master Mix, following the manufacturer’s instructions. The optimal annealing temperatures were computed using the NEB Tm calculator (available at http://tmcalculator.neb.com/#!/).
PCR products were ligated into the pGEM-T Easy vector, following the manufacturer’s instructions, and transformed into JM109 chemically competent cells. Three positive clones for each gene promoter were grown overnight at 37 °C in LB medium supplemented with ampicillin (100 μg/ml). Plasmids were extracted using the QIAGEN plasmid miniprep kit and sequenced on an Applied Biosystems 3500 Genetic Analyser using the BigDye Terminator v3.1 Cycle Sequencing and the BigDye XTerminator Purification kits, according to the manufacturer’s instructions.
Identification of putative FLAs in C. sativa: Protein architecture and phylogenetic analysis
BLAST/BLAT analyses of the 21 A. thaliana sequences (AtFLAs) performed against the Medicinal Plant Genomics Resource, the NCBI EST and the Cannabis Genome Browser Gateway databases led to the identification of 23 CsaFLAs (Additional file 3: Table S3). It should be noted that, during the database queries, a contig, i.e. csa_locus_44222_iso_1_len_407_ver_2, which was initially called CsaFLA22 and retrieved at the Medicinal Plant Genomics Resource (MPGR), was also found. However, we believe that this partial gene was erroneously attributed to C. sativa, since we never amplified any product with different primers designed on it and the reported FPKM values at the MPGR are 0 for all the tissues examined. We discarded this gene from our analyses, but kept the original nomenclature given to the hemp FLA genes (i.e. CsaFLA1–24), as at this stage we cannot rule out the existence of this gene in textile hemp.
Proline, Alanine, Serine, and Threonine (PAST) proportions in CsaFLAs
Identities of CsaFLAs with AtFLAs
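The two quantities named in these table captions, the PAST proportion of a protein and its sequence identity to an Arabidopsis FLA, can be computed as in the following sketch. The sequences are short hypothetical strings, not CsaFLA or AtFLA sequences, and the identity definition used here (matches over aligned columns without gaps) is one of several conventions; the tools used for the published tables may apply a different denominator.

```python
def past_percentage(protein_seq):
    """Percentage of Pro, Ala, Ser and Thr residues in a protein sequence."""
    seq = protein_seq.upper()
    past_count = sum(seq.count(residue) for residue in "PAST")
    return 100.0 * past_count / len(seq)


def percent_identity(aligned_a, aligned_b):
    """Percent identity over aligned columns in which neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have the same length")
    pairs = [(a, b) for a, b in zip(aligned_a, aligned_b) if a != "-" and b != "-"]
    matches = sum(1 for a, b in pairs if a == b)
    return 100.0 * matches / len(pairs)


# Hypothetical toy sequences (not real FLA sequences).
toy_protein = "MASTPPAAGK"
toy_alignment = ("MAS-TPK", "MASGTAK")

print(f"PAST: {past_percentage(toy_protein):.1f} %")
print(f"Identity: {percent_identity(*toy_alignment):.1f} %")
```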
CsaFLA expression patterns in hemp tissues
An immunohistochemical analysis carried out with the LM14 antibody (recognizing AGPs) revealed that the epitope is distributed in different tissues of the hemp stem (Additional file 7: Figure S3): this result shows the broad distribution of these proteins in the different hemp stem tissues. In particular, in the bottom internode, AGPs are present in the core tissues (cell walls of fibres and vessels), cortical parenchyma/collenchyma and, notably, in the inner region of the fibres, i.e. the layer (plasma membrane) delimiting the fibre cell lumen.
In hemp bast fibres, the heat-map hierarchical clustering shows 5 major expression trends (Fig. 3): I) a group of genes (CsaFLA2-6-24) is upregulated at the middle internode containing the snap point (in the core, their expression decreases towards the base of the stem); II) CsaFLA1-4-7-8-10-20-23 are expressed at higher levels in the top internode and their expression decreases towards the bottom internode; III) two FLAs, CsaFLA5 and 21, are downregulated at the snap point; IV) three genes, CsaFLA9-11-17, tend to be upregulated at the snap point, although the pattern is less marked than in group I (and in the core, their expression increases towards the stem base); V) the last group comprises FLAs upregulated at the bottom (CsaFLA3-12-13-15-16-18-19).
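The shapes of these expression trends along the stem can be made concrete with a small rule-based sketch. This is not the hierarchical clustering used to build the heat map; it only labels a TOP/MID/BOT profile by its shape, and it does not separate a strong from a weak peak at the snap point. The tolerance, the gene labels and the log2 expression values are all hypothetical.

```python
def classify_trend(top, mid, bot, tolerance=0.5):
    """Label a log2 expression profile measured at the TOP, MID and BOT internodes."""
    if mid - top > tolerance and mid - bot > tolerance:
        return "peak at snap point"
    if top - mid > tolerance and bot - mid > tolerance:
        return "dip at snap point"
    if top - bot > tolerance:
        return "higher at top"
    if bot - top > tolerance:
        return "higher at bottom"
    return "flat"


# Hypothetical log2 normalized expression values (TOP, MID, BOT), not measured data.
profiles = {
    "gene_a": (0.0, 2.0, 0.5),
    "gene_b": (2.0, 1.0, 0.0),
    "gene_c": (1.5, 0.0, 1.2),
    "gene_d": (0.0, 1.0, 2.5),
}

for gene, (top, mid, bot) in profiles.items():
    print(gene, classify_trend(top, mid, bot))
```

A distance-based clustering of the full expression matrix, as in the published heat map, groups genes by overall similarity rather than by fixed rules, so group membership can differ from such a simple labelling.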
Identification of conserved motifs in the promoters of some CsaFLAs
Conserved motifs in the promoters of FLAs from group II and V (Tomtom/JASPAR CORE 2016 plants)
Group II, SOC1: transcription activator controlling flowering time. Probably also involved in photoperiod perception.
Group V, MYB3: repression of the phenylpropanoid biosynthesis-related genes. Response to salt stress, wounding, ABA, SA.
Domains, conserved regions, motifs and residues mediating adhesion in CsaFLAs from class A and C.
Solvent accessibility of residues within conserved motifs
The two hydrophobic residues preceding and following the [YF]H motif are thought to be implicated in mediating the adhesion of these proteins. It is interesting to note that these residues are generally aliphatic amino acids such as I, L and V; however, in FLA10 an Ala and a polar Ser are found in the first and second FAS domains, respectively (Fig. 6a).
It is noteworthy that the [YF]H residues are either completely or partially buried, whereas both residues flanking the [YF]H motif on the N-terminal side (AL, VL, LV) are completely solvent-exposed in both FLA10 and FLA11 (Table 4). In contrast, the aliphatic residues (VV) on the C-terminal side of the [YF]H motif in the H1 domain of FLA10 are either completely or partially buried; in the H2 domain, the polar Ser is partially buried, but Leu is exposed (Fig. 6a, space-filled). For FLA11, both VL residues on the C-terminal side of the [YF]H motif are solvent-exposed (Fig. 6b, space-filled). In general, the flanking residues of FLA11, and those located on the N-terminal side of the motif, are more favorably exposed to mediate adhesion than those of FLA10 and those located on the C-terminal side. This suggests that adhesion for both these proteins may be mediated via hydrophobic interactions.
The FLAs identified in C. sativa group into the previously described four phylogenetic classes (Fig. 2). A nomenclature of CsaFLAs is hereby also proposed which follows the Arabidopsis classification (i.e. when the phylogenetic tree highlighted clustering of a CsaFLA protein with a specific AtFLA, the same number was assigned to the C. sativa protein).
Within class A, the largest, it is possible to observe a separate clade represented by CsaFLA3-12-13-15-16-18-19 which is highly expressed at the snap point and in the older stem regions, both in the bast fibres and the shivs (Fig. 3). A subset of class A genes (CsaFLA3-9-11-13) was more expressed in the old hypocotyls (peaking at H17 with high values at H15 and H20). As previously shown , the hypocotyl undergoes secondary growth in H9 and later time-points. The phylogenetic position of this cluster of FLAs, together with their common expression pattern, might indicate a specific role in secondary growth. This group of genes may indeed represent hemp-specific single FAS domain FLAs specialized in secondary growth, in a manner analogous to what was previously shown in eucalypt and thale cress [11, 12]. Hemp is unfortunately recalcitrant to transformation, therefore homologous testing, as previously performed on e.g. eucalypt FLAs , is cumbersome. However heterologous testing in a more amenable system, e.g. Nicotiana tabacum, can confirm or refute the hypothesis.
It is here worth discussing also the phylogenetic position of CsaFLA11 in a clade grouping AtFLA11, EgrFLA2b and EgrFLA3b (Fig. 2). These genes were shown to affect stem mechanics, as well as cell wall architecture [11, 12]. The AtFLA11 transcript was detected in the xylem and interfascicular fibres in inflorescence stem, preceding the lignification of those two tissues ; CsaFLA11 also shows a gradual increase in expression towards the older regions of the stem and it is slightly more expressed in the older hypocotyl too (Fig. 5). This FLA represents another interesting candidate putatively involved in cell wall-related processes in textile hemp.
Within class C, CsaFLA4 and CsaFLA1 group together with the characterized orthologs from thale cress. AtFLA4 (SOS5) is involved in cell expansion  and AtFLA1 was shown to regulate root and shoot development in tissue culture . CsaFLA8 was more expressed in the TOP region of the stem, as well as in H6, suggesting a role in elongating tissue. However, it remains to be shown whether the hemp genes are involved in the same regulatory networks as in Arabidopsis.
The expression of the 23 CsaFLAs was first investigated in the different tissues of the stem, because we wanted to identify those genes specifically associated with a tissue type and a stem region. Among them, we would like to draw the reader’s attention to the first group of genes, represented by CsaFLA2-6-24, because they show a different expression profile in the bast fibres and the shivs. The expression in the shivs shows a decrease from the top to the bottom of the stem, while in the bast fibres their expression peaks at the snap point. This is quite interesting if we consider that the snap point is the region marking a shift in the stem mechanical properties, as it determines the transition from cell elongation to thickening. It was shown that the young stem regions of hemp at the vegetative stage of growth are characterized by the presence of ca. 66% glucose, while older regions have about 82%: this result confirms that during their transition from elongation to thickening, bast fibres require great amounts of glucose for the synthesis of cellulose. The 3 FLAs may therefore be involved in cell wall-related processes occurring during this transition. Additionally, this is in agreement with the flax microarray data showing upregulation of certain FLAs around the snap point and with the increased expression of poplar FLAs in tension wood, which, like bast fibres, is composed of a cellulosic G-layer [41, 42]. As previously discussed for poplar tension wood, specific FLAs with a GPI-anchor might be involved in the cytoskeleton-cell wall connections during fibre expansion/elongation. This would be the case of CsaFLA2 and CsaFLA6, which possess a GPI-anchor (Additional file 5). In the hypocotyl, CsaFLA2 was significantly more expressed in H6 (Fig. 5). FLAs might also be involved in triggering a cellular signal inducing the formation of the G-layer, via the cleavage of their GlcNAc oligosaccharides by the action of chitinases [22, 41].
It was shown that in flax stems, specific chitinases are highly expressed in bast fibres and may regulate G-layer formation in these cell types . Therefore, it is reasonable to assume that the concerted action of specific FLAs and chitinases may be involved in the transition from elongation to G-layer formation in hemp.
Groups II and V comprise FLAs whose expression in the bast fibres gradually decreases and increases, respectively, from the apical to the basal part of the stem. A similar trend was observed in the hypocotyls: CsaFLA8 (belonging to the stem group II) was more expressed in H6; CsaFLA13 (belonging to the stem group V) was more expressed in H15, H17 and H20 (Fig. 5). In addition, the hypocotyl expression pattern of CsaFLA3 was similar to that of CsaFLA13. Our study therefore identified specific FLAs likely involved in bast fibre elongation during intrusive growth (CsaFLA1-4-7-8-10-20-23) and others likely involved in secondary cell wall deposition during the thickening stage (CsaFLA3-12-13-15-16-18-19).
The expression of hemp FLAs was also investigated in other tissues, notably leaves, roots, male/female flowers and in seedlings (Fig. 4).
The genes belonging to group III in stem tissues (Fig. 3) are highly expressed in roots: within this cluster of FLAs are the orthologs of AtFLA7 and AtFLA11 (Fig. 2) for which a higher number of ESTs was retrieved in the roots of thale cress .
In reproductive organs, the RT-qPCR results show that some genes are highly expressed in male and female flowers. This suggests that some FLAs are involved in hemp inflorescence formation.
In order to investigate whether specific regulatory elements occurred in the promoters of the genes showing specific expression patterns in the stem tissues, we analyzed the genes from group I-II and V (Fig. 3). While for group I no conserved motifs could be obtained, 2 conserved sequences were found for group II and V (Table 2). A conserved motif recognized by the MADS box transcription factor SOC1 could be identified in the promoters of the genes upregulated in the apical stem regions: this finding suggests that they may be involved in a developmental program regulating the transition from vegetative to reproductive growth and/or the response to hormonal regulation (e.g. via gibberellin). In this respect it is noteworthy that in A. thaliana SOC1 was shown to control the annual growth habit : soc1 ful mutants show indeed woody growth reminiscent of the perennial lifestyle. Hence the FLAs upregulated at the top of the stems might belong to a regulatory circuit controlling elongation and suppressing secondary growth.
The genes in group V show the presence of a conserved motif putatively recognized by MYB3, which is an R2R3 MYB transcriptional repressor belonging to subgroup S4 together with the characterized AtMYB4. MYB4 negatively regulates phenylpropanoid biosynthesis (more specifically, in thale cress it is a negative regulator of hydroxycinnamic acid metabolism and it exerts its silencing function by displacing the activators binding to the MYB motifs present in many promoters of genes involved in the phenylpropanoid metabolism; ). It is therefore possible that the identified element is involved in the coordination of phenylpropanoid biosynthesis in bast fibres and might regulate the hypolignification observed in these cells [45, 46]. In our recently published transcriptomic dataset, we observed an upregulation of the SOC1 gene at the top (4-fold induction with respect to the bottom and 1.3-fold induction with respect to the middle) and of MYB4 at the bottom (1.7-fold induction with respect to the top and 4.6-fold induction with respect to the middle). This result therefore supports the existence of a putative regulatory circuit (controlling, among other genes, the expression of CsaFLAs) at the top and bottom of adult hemp plants.
In conclusion, our work has identified (at least) 23 genes coding for FLAs in textile hemp, some of which are specific to distinct stages of bast fibre development. Bioinformatics has highlighted the occurrence of conserved motifs in the promoters of genes upregulated either at the top or at the bottom of the stem. This finding points to the existence of a fine regulatory network controlling bast fibre elongation and cell wall composition. Future functional analyses carried out on heterologous systems will shed more light on the functions of the identified genes.
The authors wish to thank Aude Corvisy and Laurent Solinhac for their technical support.
The Fonds National de la Recherche, Luxembourg, (Project CANCAN C13/SR/5774202), is gratefully acknowledged for financial support. The funding agency had no role in the design of the study, in the collection, analyses, or interpretation of data, in the writing of the manuscript, and in the decision to publish the results.
Availability of data and materials
All data generated or analysed during this study are included in this published article and its supplementary information files.
GG conceived and designed the experiments; LM-P performed the experiments; MB contributed to the RT-qPCR and bioinformatics analyses; KSS performed protein modeling; GG, LM-P, SL, MB, St.L, KSS and J-FH analyzed the data; GG, LM-P, SL, MB, St. L, KSS and J-FH wrote the paper. All authors read and approved the final manuscript.
Ethics approval and consent to participate
Consent for publication
In March 2017, GG filed a patent (“Genetically engineering of plant fibres and plant thereof”) describing the promoters of the FLA genes in hemp, which might potentially pose a competing interest. The patent is owned by the Luxembourg Institute of Science and Technology, which has no affiliation to any commercial entity. The protection procedures of the associated intellectual property do not alter the adherence to BMC Genomics policies on sharing data and materials.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Tan L, Showalter A, Egelund J, Hernandez-Sanchez A, Doblin M, Bacic A. Arabinogalactan proteins and the research challenges for these enigmatic plant cell surface proteoglycans. Front Plant Sci. 2012;3:140.
- Seifert GJ, Roberts K. The Biology of Arabinogalactan Proteins. Annu Rev Plant Biol. 2007;58:137–61.
- Pereira AM, Pereira LG, Coimbra S. Arabinogalactan proteins: rising attention from plant biologists. Plant Reprod. 2015;28:1–15.
- Schultz CJ, Rumsewicz MP, Johnson KL, Jones BJ, Gaspar YM, Bacic A. Using Genomic Resources to Guide Research Directions. The Arabinogalactan Protein Gene Family as a Test Case. Plant Physiol. 2002;129:1448–63.
- Johnson KL, Jones BJ, Bacic A, Schultz CJ. The Fasciclin-Like Arabinogalactan Proteins of Arabidopsis. A Multigene Family of Putative Cell Adhesion Molecules. Plant Physiol. 2003;133:1911–25.
- Ellis M, Egelund J, Schultz CJ, Bacic A. Arabinogalactan-Proteins: Key Regulators at the Cell Surface? Plant Physiol. 2010;153:403–19.
- Zang L, Zheng T, Chu Y, Ding C, Zhang W, Huang Q, et al. Genome-Wide Analysis of the Fasciclin-Like Arabinogalactan Protein Gene Family Reveals Differential Expression Patterns, Localization, and Salt Stress Response in Populus. Front Plant Sci. 2015;6:1140.
- Faik A, Abouzouhair J, Sarhan F. Putative fasciclin-like arabinogalactan-proteins (FLA) in wheat (Triticum aestivum) and rice (Oryza sativa): identification and bioinformatic analyses. Mol Genet Genomics. 2006;276:478–94.
- Huang GQ, Xu WL, Gong SY, Li B, Wang XL, Xu D, et al. Characterization of 19 novel cotton FLA genes and their expression profiling in fiber development and in response to phytohormones and salt stress. Physiol Plantarum. 2008;134:348–59.
- Jun L, Xiaoming W. Genome-wide identification, classification and expression analysis of genes encoding putative fasciclin-like arabinogalactan proteins in Chinese cabbage (Brassica rapa L.). Mol Biol Rep. 2012;39:10541–55.
- MacMillan CP, Taylor L, Bi Y, Southerton SG, Evans R, Spokevicius A. The fasciclin-like arabinogalactan protein family of Eucalyptus grandis contains members that impact wood biology and biomechanics. New Phytol. 2015;206:1314–27.View ArticlePubMedGoogle Scholar
- MacMillan CP, Mansfield SD, Stachurski ZH, Evans R, Southerton SG. Fasciclin-like arabinogalactan proteins: specialization for stem biomechanics and cell wall architecture in Arabidopsis and Eucalyptus. Plant J. 2010;62:689–703.View ArticlePubMedGoogle Scholar
- Wang H, Jiang C, Wang C, Yang Y, Yang L, Gao X, et al. Antisense expression of the fasciclin-like arabinogalactan protein FLA6 gene in Populus inhibits expression of its homologous genes and alters stem biomechanics and cell wall composition in transgenic trees. J Exp Bot. 2015;66:1291–302.View ArticlePubMedGoogle Scholar
- Huang GQ, Gong SY, Xu WL, Li W, Li P, Zhang CJ, et al. A Fasciclin-Like Arabinogalactan Protein, GhFLA1, Is Involved in Fiber Initiation and Elongation of Cotton. Plant Physiol. 2013;161:1278–90.View ArticlePubMedPubMed CentralGoogle Scholar
- Roach MJ, Deyholos MK. Microarray analysis of flax (Linum usitatissimum L.) stems identifies transcripts enriched in fibre-bearing phloem tissues. Mol Genet Genomics. 2007;278:149–65.View ArticlePubMedGoogle Scholar
- Roach MJ, Deyholos MK. Microarray Analysis of Developing Flax Hypocotyls Identifies Novel Transcripts Correlated with Specific Stages of Phloem Fibre Differentiation. Ann Bot. 2008;102:317–30.View ArticlePubMedPubMed CentralGoogle Scholar
- Guerriero G, Hausman JF, Strauss J, Ertan H, Siddiqui KS. Lignocellulosic biomass: Biosynthesis, degradation, and industrial utilization. Eng Life Sci. 2016;16:1–16.View ArticleGoogle Scholar
- Behr M, Legay S, Zizková E, Motyka V, Dobrev PI, Hausman JF, et al. Studying Secondary Growth and Bast Fiber Development: The Hemp Hypocotyl Peeks behind the Wall. Front Plant Sci. 2016;7:1733.View ArticlePubMedPubMed CentralGoogle Scholar
- Andre CM, Hausman JF, Guerriero G. Cannabis sativa: The Plant of the Thousand and One Molecules. Front Plant Sci. 2016;7:19.View ArticlePubMedPubMed CentralGoogle Scholar
- Mangeot-Peter L, Legay S, Hausman JF, Esposito S, Guerriero G. Identification of Reference Genes for RT-qPCR Data Normalization in Cannabis sativa Stem Tissues. Int J Mol Sci. 2016;17:1556.Google Scholar
- Guerriero G, Sergeant K, Hausman JF. Integrated -Omics: A Powerful Approach to Understanding the Heterogeneous Lignification of Fibre Crops. Int J Mol Sci. 2013;14:10,958–78.View ArticleGoogle Scholar
- Mokshina N, Gorshkova T, Deyholos MK. Chitinase-Like and Cellulose Synthase Gene Expression in Gelatinous-Type Cellulosic Walls of Flax (Linum usitatissimum L.) Bast Fibers. PLOS ONE. 2014;9:e97949.View ArticlePubMedPubMed CentralGoogle Scholar
- Guerriero G, Mangeot-Peter L, Hausman JF, Legay S. Extraction of High Quality RNA from Cannabis sativa Bast Fibres: A Vademecum for Molecular Biologists. Fibers. 2016;4:23.Google Scholar
- Lev-Yadun S. Plant fibers: Initiation, growth, model plants, and open questions. Russ. J. Plant Physl. 2010;57:305–15.View ArticleGoogle Scholar
- Snegireva A, Chernova T, Ageeva M, Lev-Yadun S, Gorshkova T. Intrusive growth of primary and secondary phloem fibres in hemp stem determines fibre-bundle formation and structure. AoB Plants. 2015;7:plv061.Google Scholar
- Guerriero G, Hausman JF, Cai G. No Stress! Relax! Mechanisms Governing. Growth and Shape in Plant Cells. Int J Mol Sci. 2014;15:5094-5114.Google Scholar
- Gorshkova TA, Salnikov VV, Chemikosova SB, Ageeva MV, Pavlencheva NV, van Dam JEG. The snap point: a transition point in Linum usitatissimum bast fiber development. Ind Crops Prod. 2003;18:213–21.View ArticleGoogle Scholar
- van Bakel H, Stout JM, Cote AG, Tallon CM, Sharpe AG, Hughes TR, et al. The draft genome and transcriptome of Cannabis sativa. Genome Biol. 2011;12:R102.View ArticlePubMedPubMed CentralGoogle Scholar
- Eisenhaber B, Wildpaner M, Schultz CJ, Borner GHH, Dupree P, Eisenhaber F. Glycosylphosphatidylinositol Lipid Anchoring of Plant Proteins. Sensitive Prediction from Sequence- and Genome-Wide Studies for Arabidopsis and Rice. Plant Physiol. 2003;133:1691–701.View ArticlePubMedPubMed CentralGoogle Scholar
- Yang J, Yan R, Roy A, Xu D, Poisson J, Zhang Y. The I-TASSER Suite: Protein structure and function prediction. Nat Methods. 2015;12:7–8.View ArticlePubMedPubMed CentralGoogle Scholar
- Guex N, Peitsch MC. SWISS-MODEL and the Swiss-Pdb Viewer. An environment for comparative protein modeling. Electrophor. 1997;18:2714–23.View ArticleGoogle Scholar
- Bailey TL, Elkan C. Fitting a mixture model by expectation maximization to discover motifs in biopolymers. Edited by Edited by AAAI press 1994:28–36.Google Scholar
- Gupta S, Stamatoyannopoulos JA, Bailey TL, Noble WS. Quantifying similarity between motifs. Genome Biol. 2007;8:R24.View ArticlePubMedPubMed CentralGoogle Scholar
- Mathelier A, Fornes O, Arenillas DJ, Chen CY, Denay G, Lee J, et al. JASPAR 2016: a major expansion and update of the open-access database of transcription factor binding profiles. Nucleic Acids Res. 2016;44:D110–5.View ArticlePubMedGoogle Scholar
- Hu B, Jin J, Guo AY, Zhang H, Luo J, Gao G. GSDS 2.0: an upgraded gene feature visualization server. Bioinformatics. 2015;31:1296–7.View ArticlePubMedGoogle Scholar
- Bustin SA, Benes V, Garson JA, Hellemans J, Huggett J, Kubista M, et al. The MIQE Guidelines: Minimum Information for Publication of Quantitative Real-Time PCR Experiments. Clin Chem. 2009;55:611–22.View ArticlePubMedGoogle Scholar
- Ito S, Suzuki Y, Miyamoto K, Ueda J, Yamaguchi I. AtFLA11, a Fasciclin-Like Arabinogalactan-Protein, Specifically Localized in Screlenchyma Cells. Biosci Biotech Bioch. 2005;69:1963–9.View ArticleGoogle Scholar
- Shi H, Kim Y, Guo Y, Stevenson B, Zhu JK. The Arabidopsis SOS5 Locus Encodes a Putative Cell Surface Adhesion Protein and Is Required for Normal Cell Expansion. Plant Cell. 2003;15:19–32.View ArticlePubMedPubMed CentralGoogle Scholar
- Johnson KL, Kibble NAJ, Bacic A, Schultz CJ. A Fasciclin-Like Arabinogalactan-Protein (FLA) Mutant of Arabidopsis thaliana, fla1, Shows Defects in Shoot Regeneration. PLOS ONE. 2011;6:e25154.View ArticlePubMedPubMed CentralGoogle Scholar
- Crônier D, Monties B, Chabbert B. Structure and chemical composition of bast fibers isolated from developing hemp stem. J Agric Food Chem. 2005;53(21):8279–8289.Google Scholar
- Lafarguette F, Leplé JC, Déjardin A, Laurans F, Costa G, Lesage-Descauses MC, et al. Poplar genes encoding fasciclin-like arabinogalactan proteins are highly expressed in tension wood. New Phytol. 2004;164:107–21.View ArticleGoogle Scholar
- Gritsch C, Wan Y, Mitchell RAC, Shewry PR, Hanley SJ, Karp A. G-fibre cell wall development in willow stems during tension wood induction. J Exp Bot. 2015;66:6447–59.View ArticlePubMedPubMed CentralGoogle Scholar
- Melzer S, Lens F, Gennen J, Vanneste S, Rohde A, Beeckman T. Flowering-time genes modulate meristem determinacy and growth form in Arabidopsis thaliana. Nat Genet. 2008;40:1489–92.View ArticlePubMedGoogle Scholar
- Jin H, Cominelli E, Bailey P, Parr A, Mehrtens F, Jones J, et al. Transcriptional repression by AtMYB4 controls production of UV-protecting sunscreens in Arabidopsis. EMBO J. 2000;19:6150–61.View ArticlePubMedPubMed CentralGoogle Scholar
- Day A, Ruel K, Neutelings G, Cronier D, David H, Hawkins S, et al. Lignification in the flax stem: evidence for an unusual lignin in bast fibers. Planta. 2005;222:234–45.View ArticlePubMedGoogle Scholar
- Huis R, Morreel K, Fliniaux O, Lucau-Danila A, Fénart S, Grec S, et al. Natural Hypolignification Is Associated with Extensive Oligolignol Accumulation in Flax Stems. Plant Physiol. 2012;158:1893–915.View ArticlePubMedPubMed CentralGoogle Scholar
- Guerriero G, Behr M, Legay S, Mangeot-Peter L, Zorzan S, Ghoniem M, et al. Transcriptomic profiling of hemp bast fibres at different developmental stages. Sci Rep. 2017;7:4961.View ArticlePubMedPubMed CentralGoogle Scholar