Comparative transcriptome investigation of global gene expression changes caused by miR156 overexpression in Medicago sativa

Background Medicago sativa (alfalfa) is a low-input forage and potential bioenergy crop, and improving its yield and quality has always been a focus of the alfalfa breeding industry. Transgenic alfalfa plants overexpressing a precursor of alfalfa microRNA156 (MsmiR156) were recently generated by our group. These plants (miR156OE) showed enhanced biomass yield, reduced internodal length, increased shoot branching and trichome density, and a delay in flowering time. Transcripts of three SQUAMOSA-PROMOTER BINDING PROTEIN-LIKE (SPL) genes (MsSPL6, MsSPL12, and MsSPL13) were found to be targeted for cleavage by MsmiR156 in alfalfa. Results To further illustrate the molecular mechanisms underlying the effects of miR156 in alfalfa, two miR156OE genotypes (A11a and A17) were subjected to Next Generation RNA Sequencing with Illumina HiSeq. More than 1.11 billion clean reads were obtained from our available sequenced samples. A total of 160,472 transcripts were generated using Trinity de novo assembly and 4,985 significantly differentially expressed genes were detected in miR156OE plants A11a and A17 using the Medicago truncatula genome as reference. A total of 17 genes (including upregulated, downregulated, and unchanged) were selected for quantitative real-time PCR (qRT-PCR) validation, which showed that gene expression levels were largely consistent between qRT-PCR and RNA-Seq data. In addition to the established SPL genes MsSPL6, MsSPL12 and MsSPL13, four new SPLs; MsSPL2, MsSPL3, MsSPL4 and MsSPL9 were also down-regulated significantly in both miR156OE plants. These seven SPL genes belong to genes phylogeny clades VI, IV, VIII, V and VII, which have been reported to be targeted by miR156 in Arabidopsis thaliana. The gene ontology terms characterized electron transporter, starch synthase activity, sucrose transport, sucrose-phosphate synthase activity, chitin binding, sexual reproduction, flavonoid biosynthesis and lignin catabolism correlate well to the phenotypes of miR156OE alfalfa plants. Conclusions This is the first report of changes in global gene expression in response to miR156 overexpression in alfalfa. The discovered miR156-targeted SPL genes belonging to different clades indicate miR156 plays fundamental and multifunctional roles in regulating alfalfa plant development. Electronic supplementary material The online version of this article (doi:10.1186/s12864-016-3014-6) contains supplementary material, which is available to authorized users.


Background
Medicago sativa (alfalfa) is a perennial forage legume that is also a candidate low-input bioenergy crop due to its great yield potential and high energy value [1][2][3]. However, in order to fully realize alfalfa's potential, significant improvements to biomass yield and quality are needed to compete against high yielding grasses, such as switchgrass and miscanthus. Recently, we overexpressed a precursor of miRNA156 (MsmiR156) in alfalfa, and this led to up to a 2-fold increase in biomass yield, delayed flowering time, enhanced cellulose content and reduced lignin, producing an overall improvement in biomass quality [4]. In addition, three SQUAMOSA-PROMOTER BINDING PROTEIN-LIKE (SPL) genes (MsSPL6, MsSPL12, and MsSPL13) were found to be downregulated via transcript cleavage by miR156 in alfalfa [4]. All SPL proteins constitute a family of transcription factors which contain a highly conserved DNA binding domain of 76 amino acids (called SBP domain) with two zinc binding sites and a nuclear localization signal (NLS) [5]. Given the diversity of traits affected by overexpression of miR156 in alfalfa, it is critical to identify and characterize its downstream target genes, especially SPL genes and genes that are regulated by SPLs, as well as understand the functions and behaviours of SPL genes and their target genes by solidly linking each to one or more phenotypes exhibited by miR156OE plants.
In recent years, genome-wide global transcriptome analysis has become a powerful tool to uncover genes which control various traits in plants. For example, using transcriptome analysis, Zhou et al. (2014) discovered a candidate MYB transcription factor responsible for red leaf coloration in peaches [26]. From the de novo assembled Ipomoea nil (morning glory) transcriptome, genes in the phenylpropanoid biosynthesis pathway were identified and SSR markers were developed for deployment in breeding programs [27]. Transcriptome analysis of Lotus corniculatus identified genes involved in secondary metabolism [28]. Comparative transcriptome analysis of latex from two different rubber tree clones (Hevea brasiliensis Muell. Arg.) revealed new cues for the regulation of latex regeneration and duration of latex flow [29]. Similarly, in alfalfa, transcriptome analysis of resistant and susceptible alfalfa cultivars infected with root-knot nematode unveiled a number of differentially expressed common and cultivar-specific genes [30]. Identification of candidate genes related to fall dormancy in dormant and non-dormant alfalfa cultivars was also accomplished by analyzing the leaf transcriptomes of these two cultivars [31]. The M. sativa gene index 1.2 was used to investigate gene expression differences between M. sativa ssp. sativa (B47) and M. sativa ssp. falcata (F56) [32]. So far, there has been no reported transcriptome analysis for miR156OE alfalfa plants; however, using microarray hybridization, Xie et al. reported that the expression levels of 3008 genes were affected in leaves of miR156OE rice (Oryza sativa) [33]. Thus, analysis of genome-wide changes in gene expression profiles in contrasting alfalfa cultivars should not only reveal differentially expressed genes, but also provide insights into possible molecular mechanisms that underlie various traits in miR156OE alfalfa plants. These include delayed flowering time, enhanced biomass production, and increased shoot branching [4].
To illustrate changes in global gene expression induced by miR156 overexpression in alfalfa, we conducted RNA sequencing (RNA-Seq) on two miR156OE alfalfa genotypes (A11a and A17) generated in our previous study, and which showed reduced plant height and stem thickness; increased branching (main and lateral branches) and node numbers, as well as increased trichome density in leaves [4]. In addition, these plants showed delayed flowering time, a reduction in lignin content and an increase in cellulose content compared to WT control [4]. The present analysis investigates whether additional SPL genes are targeted for transcript cleavage by miR156, and what other genes are differentially expressed in miR156OE alfalfa plants.

A de novo assembled alfalfa transcriptome
In order to illustrate the role of miR156 in alfalfa plant development, WT and the two most prominent miR156OE genotypes A11a and A17 [4] were selected for Next Generation Sequencing at the transcriptome level to detect differentially expressed genes (DEGs). Since the full sequence of the alfalfa genome has not been reported yet, the de novo assembled alfalfa transcriptome was first obtained using all available HiSeq data sequenced by our group. After filtering for lowquality and problematic reads such as empty adapters, short reads and unpaired reads, 1.11 billion reads were assembled using the Trinity transcriptome assembly program [34]. A total of 160,472 transcripts ranging in size from 200 bp to 9673 bp were obtained (Additional file 1: Table S1) with an average length of 874.06 bp and an N50 of 1406 bp (Fig. 1). These represented 120,046 Trinity 'genes' with an average length of 735.43 bp and an N50 of 1117 bp (Fig. 1a). The majority (113,756) of these genes were in the range of 200-999 bp (Fig. 1b).

Differentially expressed genes in miR156OE genotypes
As miR156 overexpression affects a number of traits in alfalfa plants [4], we set out to identify DEGs that may be responsible for such traits. In total, relative to WT, 4,445 genes were significantly affected in genotype A11a which was roughly twice that observed in genotype A17 which had 2,294 DEGs (p < 0.005) (Additional file 2: Table S2). Of all DEGs, 1,754 (502 up-regulated and 1,212 downregulated) were differentially expressed in both genotypes (Fig. 2a). In order to narrow down the gene list, we focused on DEGs with at least a 2-fold change (FC). There were 2,055 (613 up-regulated and 1,442 down-regulated) and 1,069 (298 up-regulated and 771 down-regulated) DEGs with a 2 FC for A11a and A17, respectively, of which 721 (68 up-regulated and 637 down-regulated) overlapped between the two genotypes (Fig. 2b). Moreover, there were 425 (31 up-regulated and 394 down-regulated) and 276 (110 up-regulated and 166 down-regulated) DEGs with at least a 4 FC in genotypes A11a and A17, respectively, of which 137 (11 up-regulated and 126 down-regulated) overlap between the two genotypes (Fig. 2c).

Gene ontology (GO) enrichment analysis of DEGs
Gene ontology enrichment analysis was carried out to identify pathways that may be affected in miR156OE plants. Combining all of the >2FC DEGs from genotypes A11a and A17, approximately 12,514 GO terms were assigned (Fig. 3a). Many of these GO terms could reflect some traits affected by miR156 overexpression. Of the 132 GO terms in the molecular function category; nicotianamine synthase activity, ferredoxin-NAD(P) reductase activity, transcription regulatory region sequence-specific DNA binding, folic acid binding, starch synthase activity, starch binding, sucrosephosphate synthase activity, chitin binding and electron transporter (transferring electrons within the cyclic electron transport pathway of photosynthesis activity) (Fig. 3b) are of particular interest. For example, many transcription factors (such as SPL genes) which can bind specific DNA sequences [35] were found to be significantly downregulated, and this is closely related to the term of transcription regulatory region sequence-specific DNA binding. The GO terms cellular component, thylakoid, thylakoid membrane and chromosome (Fig. 3c) may be related to photosynthesis, which could affect miR156OE traits, such as elevated biomass production, and influence biosynthesis of sugar, starch and lignin [4]. Among the 17 functions classified as biological processes; response to water, sexual reproduction, flavonoid biosynthesis, sucrose transport, cellular copper ion homeostasis, and lignin catabolism (Fig. 3d) are the main interesting terms because they are related to the miR156OE traits such as delayed flowering time and effects on sugar, starch, lignin and cellulose contents [4,9]. The full list of the components for the three fractions (molecular function, cellular component and biological process) is shown in Additional file 3: Table S3.

Validation of RNA-Seq data by quantitative real time PCR
To validate the RNA-Seq data, we randomly selected 17 genes (including upregulated, downregulated, and unchanged) for expression analysis by quantitative real time PCR (qRT-PCR). All of the qRT-PCR primers were designed based on alfalfa transcripts which were assembled using the Trinity program (Additional file 1: Table  S1). The expression levels of the selected genes from the RNA-Seq analysis were compared to the qRT-PCR data in Table 1. In general, there was a strong correlation between the two sets of expression data. All genes selected for validation showed a similar expression trend (up-regulation, down-regulation, or unchanged) in the qRT-PCR and RNA-Seq analysis, and 14 of the 17 transcripts (82 %) displayed a similar level of expression change (Table 1). These results support a strong level of confidence in our RNA-Seq data.

Novel SPL targets of miR156 in alfalfa
In addition to the three previously reported miR156targeted SPL genes in alfalfa [4], our RNA-Seq analysis revealed four more significantly down-regulated SPL genes in both miR156OE genotypes, namely MsSPL2, MsSPL3, MsSPL4, and MsSPL9 (Additional file 2: Table  S2), which are homologous to M. truncatula SPL genes Medtr8g463140, Medtr2g014200, Medtr4g088555 and Medtr7g092930, respectively. We further tested the expression patterns of the four SPLs by qRT-PCR and found that the transcript fold changes were consistent with those detected by RNA-Seq (Figs. 4a, b, c, d and e). Similarly, RNA-Seq and qRT-PCR data showed that the previously reported MsSPL12 was also down-regulated in the miR156OE genotypes A11a and A17 ( Fig. 4a and g). Conversely, MsSPL6 and MsSPL13 were not detected with significant downregulation in the RNA-Seq analysis (Fig. 4a); however, their significant down-regulation in A11a and A17 was detected using qRT-PCR ( Fig. 4f and h). In summary, a total of seven SPL genes are significantly down-regulated in miR156OE genotypes. These genes are thus potential targets for transcript cleavage by miR156 in alfalfa.
To investigate whether miR156 directly targets the four newly discovered SPL genes, we identified their predicted miR156 recognition sites using sequence alignment and used a modified 5'-RACE technique [4] to test for transcript cleavage. Among the twenty clones sequenced for each gene, transcript cleavage was detected in all four SPLs outside of their predicted miR156 target sites: 54 bp upstream in nine MsSPL2 clones (Fig. 5a), 42 bp downstream in sixteen MsSPL3 clones (Fig. 5b), 143 bp upstream in eleven clones and 130 bp upstream in one clone of MsSPL4 (Fig. 5c), and 350 bp upstream in seven MsSPL9 clones (Fig. 5d). We also found that each of the four SPLs has a predicted nuclear    5).

Phylogeny of MsSPL genes in alfalfa
The conserved SBP domain was used to generate a phylogenetic tree for the SPL gene family in M. sativa and its close relatives M. truncatula and Glycine max, as well as the model plant Arabidopsis. The SPL genes can be grouped into eight main clades (Fig. 6a). Clades I, II and III represent SPL genes that are not targeted by miR156 (not highlighted with colour in Fig. 6a), while genes from clades IV, V, VI, VII and VIII can undergo cleavage by miR156. The multitude of traits affected in alfalfa by miR156 overexpression could be explained by the fact that its SPL targets, i.e. MsSPL2/ 3/4, MsSPL6, MsSPL9, MsSPL12 and MsSPL13, belong to clades VI, IV, VIII, V and VII, respectively (Fig. 6a).
In silico analysis of the SBP domains that were used to generate our phylogenetic tree showed that six cysteines, four histidines, and eight arginines were absolutely conserved (Fig. 6b). The nucleotide sequence of the SBP domain is also shown in Additional file 4: Figure S1.
MsmiR172 is downregulated in miR156OE alfalfa plants MiR156 and miR172 signals are integrated at the SPL3, SPL4, SPL5 and SPL9 genes in the model plant Arabidopsis [36,37]. Consistent with this finding, our RNA-Seq data revealed that the M. sativa miR172 precursor, MsmiR172 (which is homologous to M. truncatula Medtr2g101400), was significantly downregulated in both miR156OE genotypes (p < 0.00005) (the downregulation FC for genotype A11a and A17 are 11.09 and 8.33, respectively) relative to the WT control (Additional file 2: Table S2). In addition, three miR172-targeted genes homologous to Medtr7g117690, Medtr7g100590 and Medtr2g093060, which encode AP2 domain transcription factors [38], were significantly downregulated in both A11a and A17 genotypes (Additional file 2: Table S2). Whereas the MsmiR172 precursor was shown to be 76.82 % identical to its M. truncatula homologue, the mature sequence was completely conserved (Additional file 5: Figure S2).
Tissue-specific expression of miR156, miR172, and miR156-targeted SPL genes in alfalfa To gain an insight into how miR156 and its target genes are regulated in alfalfa, we evaluated the expression of Relative gene transcript levels were analyzed using the 2 -ΔCT method. Means of three independent biological repeats were used in this study. The student t test was used to analyze the significant differences of each tested gene between WT and miR156OE genotypes A11 and A17 (*p < 0.05, **p < 0.01) miR156, its target SPL genes, and miR172 in four tissue types at different developmental time points from the juvenile stage (10 day-old rooted cuttings) to just before flowering. Expression analysis showed that miR156 was primarily expressed in the leaves, with the highest levels observed at the earliest time point, 10 days (Fig. 7a). In contrast, miR172 could be detected in all tissue types, except for roots, at 10 days, with the highest levels observed in stems, just before flowering, at 40 days (Fig. 7b). In general, MsSPL6 and MsSPL13 had an opposite expression pattern to miR156, with the highest transcript levels observed in the leaves at 40 days (Fig. 7c, e). On the other hand, MsSPL12 (Fig. 7d), MsSPL2 (Fig. 7f ), MsSPL3 (Fig. 7g), MsSPL4 (Fig. 7h) and MsSPL9 (Fig. 7i) had diverse expression profiles. For example, MsSPL3 was expressed most strongly in roots at 40 days.

Altered expression of flowering pathway-related genes in miR156OE alfalfa
Our RNA-Seq data revealed that several flowering-related genes, including LEAFY (LFY), FLOWERING LOCUS T (FT), FRUITFULL (FUL), SUPPRESSOR OF OVEREXPRE SSION OF CONSTANS1 (SOC1) and APETALA1 (AP1), were down-regulated in the leaves of miR156OE genotypes relative to the WT control (Additional file 2: Table S2, their corresponding gene IDs in M. truncatula are Medtr3g098560, Medtr7g006690, Medtr4g109830, Medtr7g075870 and Medtr8g066260). Transcript sequences of the aforementioned genes were obtained from our alfalfa de novo assembled transcriptome and then used to design qRT-PCR primers for gene expression analysis. All five genes were significantly downregulated in the miR156OE genotypes A11a and A17 relative to WT (Table 1 and Fig. 8a-f, two discovered homologous MsFT-1 and MsFT-2 in alfalfa). Since the above-mentioned flowering related genes are potentially regulated by SPL genes, the upstream 2000 bp of their promoter sequences were screened to examine if they contain the "GTAC" core sequence of SPL binding element, which can be specifically recognized by SBP domain [35]. Among these five genes, two of them, LEY and FUL, contain 5 and 6 "GTAC" elements, respectively (Additional file 6 Document 1).

Discussion
We previously generated six alfalfa genotypes (A16, A8a, A8, A11, A17 and A11a) with increased miR156 expression  [4]. Two of the genotypes, A17 and A11a (with a 1600and 3400-fold increase in miR156, respectively), were chosen for RNA-Seq analysis. Compared to WT, these two genotypes displayed the most pronounced phenotypes, such as increased number of main/lateral branches and nodes, decreased plant height and internode length as well as delayed flowering, but the extent of these phenotypic changes was different between these two genotypes [4], presumably due to different miR156 levels. Our transcriptomic analysis, showed that overexpression of miR156 can affect some similar categories of downstream genes; genes differentially expressed in both genotypes, and which may affect similar phenotypes, but different miR156 expression levels could also affect expression of some unique genes in each genotype, which may affect the degree of phenotypic change relative to WT control. Therefore, it appears that diverse levels of miR156 expression may affect alfalfa traits differently.
Genes that are commonly downregulated in both A17 and A11a genotypes include four additional miR156targeted SPL genes (MsSPL2, MsSPL3, MsSPL4 and MsSPL9), in addition to the previously reported ones (MsSPL6, MsSPL12 and MsSPL13) [4]. In Arabidopsis, miR156 regulates 10 out of 16 Arabidopsis SPL genes that belong to the same clades as those silenced by miR156 in alfalfa [13,39]. In addition, in both alfalfa and Arabidopsis, the MsSPL2/3/4 and AtSPL3/4/5 cleavage sites were found at the 3'UTR, and that of MsSPL9 in the open reading frame region. However, unlike findings in Arabidopsis [13], cleavage sites for MsSPL2/3/4 and MsSPL9 in alfalfa were not detected within the predicted miR156 target region. In alfalfa, the detected cleavage sites were located either upstream or downstream of the predicted target sites. This may be explained by the RNA-induced silencing complex sliding through the transcript during miRNA-directed cleavage of the target [40]. Other studies have also shown similar results regarding cleavage site variation. For example, in a transcriptome-wide identification of miRNA targets using a degradome sequencing approach, a lower percentage of cleavage sites was found at the expected sequences for some conserved miRNAs in rice [41]. Also, OsSPL14 was found to be cleaved by OsmiR156 beyond its target site [18]. Although the cleavage sites for the newly discovered MsSPL2, MsSPL3, MsSPL4 and MsSPL9 in alfalfa are different from those in Arabidopsis, these MsSPLs belong to the same phylogenetic tree clades and share highly similar nucleotide and amino acid sequences with AtSPLs (Additional file 7: Figure S3 and Additional file 8: Figure S4), indicating these SPLs may perform similar functions in both alfalfa and Arabidopsis.
Based on the DEG list, the GO terms (such as effect of electron transporter, starch synthase activity, sucrosephosphate synthase activity, chitin binding, sexual reproduction, flavonoid biosynthesis, sucrose transport and lignin catabolism) are closely related to miR156OE alfalfa phenotypes, namely enhanced shoot branching, delayed flowering time and elevated biomass production [4], which involve a large number of biological pathways. Among the differentially expressed SPL genes, we hypothesize that MsSPL2/3/4 (clade VI) may perform similar functions as AtSPL3/4/5 in Arabidopsis (Fig. 6a), because both of these two groups of SPLs are relatively small in size (420-550 bp) and contain complementary sequences of miR156 in the 3' UTR. Leaves of Arabidopsis plants that overexpress AtSPL3/4/5 can develop adult characteristics faster than WT control [13,42]. AtSPL3/ 4/5 also functions by integrating signals from the autonomous photoperiod, age and Gibberellic acid (GA) pathways to redundantly promote the reproductive transition [7,[43][44][45][46][47][48]. Under short day conditions, the three genes (AtSPL3/4/5) are negatively regulated by miR156 in an age-dependent manner, and are positively regulated by SOC1 through the GA pathway [36,46]. Under long day conditions, however, SOC1, FT, and FLOWERING LOCUS D (FD) positively regulate AtSPL3/ 4/5 in leaves in response to photoperiod signals [46]. SPL proteins are also known to indirectly activate FT expression through the direct binding of the inflorescence meristem gene FUL, and directly activate transcription of FUL, AP1, and LFY in the shoot apical meristem of Arabidopsis [44,45,49,50]. Similarly, the above-mentioned flowering pathway-related genes (LFY, FT, FUL, SOC1 and AP1) were found to be significantly downregulated in miR156OE alfalfa plants (Fig. 8), suggesting that certain flowering mechanisms regulated by SPLs may be common in both Arabidopsis and alfalfa. In addition, some findings were also reported in snapdragon (Antirrhinum majus), where AmSBP1, an ortholog of AtSPL3/4/5, is involved in initiating flower development within the inflorescence [51]. Silencing of AmSBP1 eliminates flowering completely and causes an increase in vegetative branching under long day conditions [51]. On the other hand, mutations in the AtSPL3/4/5 ortholog COLORLESS NON-RIPENING resulted in fruits that failed to ripen [6], which may be considered as a novel function for SPL genes [5].
MsSPL9, discovered in this study, to be regulated by miR156 in alfalfa, belongs to clade VIII. This gene and its ortholog, AtSPL15, play redundant roles in regulating vegetative phase change and reproductive transition in Arabidopsis [14,42]. Obvious phenotypes were observed in the double spl9 spl15 Arabidopsis mutant with increased numbers of vegetative rosette leaves, rounder Means of three independent biological repeats were used in this study. The Student's t test was used to analyze the significant differences of each of the tested genes between WT and miR156OE genotypes A11a and A17 (*p < 0.05, **p < 0.01) leaf shape and delayed flowering time compared to WT [14]. In addition, overexpression of AtSPL9 in hyponastic leaves1 mutants -which have lower miR156 expressioncaused complete loss of the juvenile phase [52]. Except for phase change, the plastochron length was also affected in the late flowering atspl9spl5 double mutants, suggesting dissociation between growth and development [14]. Furthermore, genetic evidence indicates the involvement of AtSPL9 in petal trichome initiation via activation of TRICHOMELESS 1 (TCL1) and anthocyanin pigment accumulation in vegetative stems [15,53]. This TCL1 gene was also significantly downregulated in our miR156OE alfalfa plants (Fig. 8f ). In Arabidopsis and Patchouli (Pogostemon acblin), SPL9 is involved in the regulation of sesquiterpene biosynthesis [54]. Furthermore, our qRT-PCR results show that the SESQ UITERPENE SYNTHASE gene was downregulated in miR156OE plants (Table 1). AtSPL9 and AtSPL15 are two homologue genes in Arabidopsis and their functions are redundant, however, only one orthologue of MsSPL9 was discovered in alfalfa (Fig. 6a). If this is the case, without the redundant homologous gene of MsSPL15, there will be an obvious phenotype in MsSPL9 loss-offunction alfalfa genotypes. An ongoing project in our group is producing MsSPL9 overexpression as well as MsSPL9 loss-of-function alfalfa genotypes. By investigating these transgenic alfalfa genotypes, we will be able to determine if MsSPL9 also plays similar roles in regulating alfalfa vegetative phase change and reproductive transition, plastochron length, trichome development, anthocyanin pigment accumulation and sesquiterpene synthesis in alfalfa.
Expression analysis revealed that the transcripts of most of the SPL genes tested were detectable in roots. A recent publication reported that AtSPL3, AtSPL9 and AtSPL10 were involved in the repression of lateral root growth, and that miR156/SPLs module participates in lateral root primordia progression [55]. This is consistent with our results, which showed relatively high SPL transcript levels were detected in roots where miR156 transcript was undetectable.

Conclusion
In summary, this is the first report on the effect of miR156 overexpression on global gene expression in alfalfa. At least 2403 genes were differentially expressed by at least 2-fold in miR156OE compared to WT alfalfa. Gene ontology analysis showed that the types of genes that are significantly regulated in miR156OE genotypes are closely related to biological processes that can impact the phenotypes observed in miR156OE alfalfa, including enhanced shoot branching, increased trichome density, a delay in flowering time and elevated biomass production. The de novo assembled alfalfa transcriptome will add to the limited publicly available alfalfa genomics resources, and will allow for easier identification of alfalfa gene sequences. Four additional SPLs (MsSPL2/3/ 4 and MsSPL9) were discovered to be targeted for silencing by miR156 in alfalfa. Based on the phylogenetic tree analysis, all the current detected SPL genes in the miR156OE plants belong to five different clades, indicating that miR156 plays fundamental and multifunctional roles in regulating alfalfa plant development. It will be crucial to validate the functions of each SPL gene belonging to different clades to more fully understand the functions of miR156 in determining alfalfa traits.

Plant materials and growth conditions
The WT alfalfa clone N4.4.2 [56] was obtained from Dr. Daniel Brown (Agriculture and Agri-Food Canada). The miR156OE genotypes (A11a and A17) were obtained from our previous study [4]. All the alfalfa plants were grown under greenhouse conditions of 21-23°C, 16 h light per day, light intensity of 380-450 W/m 2 (approximately 500 W/m 2 at high noon time), and a relative humidity of 70 %.

Propagation of alfalfa by stem cuttings
In order to ensure all plant materials were at a similar developmental stage, prior to vegetative propagation by rooted stem cuttings, alfalfa plants were cut back 2-3 times and grown for 2 months to synchronize growth. For each time point, at least four rooted cuttings (biological replicates) were used in this study. For each replicate, about 3-4 stem sections containing 2 nodes each were cut and inserted into moistened growing media (Pro-Mix, Mycorrhizae TM , Premier Horticulture Inc., Woodstock, ON, Canada) in a 5-inche pot. Propagation domes (Ontario grower's supply, London, ON, Canada) were used to cover the pots which were kept in the greenhouse for 3 weeks to allow rooting from the cutstem.

Sequence verification of alfalfa miR172 precursor
A precursor of miR172 (approximately 300 bp) was amplified from an alfalfa cDNA template using a pair of primers (Additional file 9: Table S4), which were designed based on a miR172 precursor in a M. truncatula sequence database (mtr-MIR172a: MI0005600, with a verified genome location) [57]. The PCR product was then sequenced, and the sequence was compared to that of M. truncatula.
Extraction of total RNA, reverse transcription and quantitative real time PCR Different tissues of shoots, leaves, stems and roots of alfalfa plants were collected at 10, 20, 30, and 40 days after synchronizing their growth. Total RNAs were extracted using PowerPlantRNA Isolation Kit (MO BIO Laboratory, Mississauga, ON, Canada) or RNeasy Plant Mini Kit (QIAGEN) and 2 μg was used to generate cDNA through reverse transcription using oligo(dT) 15 and gene-specific reverse primers with SuperScript® III Reverse Transcriptase kit (Invitrogen TM ). The transcript levels of miR156 and miR172 were analyzed by stem loop qRT-PCR [58] and SPL genes by normal qRT-PCR using a CFX96 TouchTM Real-Time PCR Detection System (Bio-Rad). The cDNA was diluted with water (1:3) and qRT-PCR was carried out following PerfeCta SYBR Green FastMix (Quanta Biosciences, Canada) instructions. Each reaction consisted of 2 μl of cDNA template, 0.3 μl each of both gene-specific forward and reverse primers (10 μM) (Additional file 9: Table S4), and topped up to 10 μl ddH 2 O. Two reference genesacetyl CoA carboxylase 1 (ACC1) and acetyl CoA carboxylase 2 (ACC2) [4] were used to normalize the transcript levels in qRT-PCR. Finally, transcript levels of the respective genes were analyzed using a relative quantification 2 -ΔCt method [59].

RNA sequencing
Total RNA was extracted using PowerPlant® RNA Isolation Kit (MO BIO Laboratories, Inc.), RNA concentrations were determined using a NanoDrop 2000C (Thermo Scientific), and the quality of RNA samples was assessed by agarose gel electrophoresis. Four biological replicates were used for each sample. The RNA library was constructed and sequenced on an Illumina Hi-Seq 2500 using paired-end 101 bp reads at the Centre for Applied Genomics (Sick Kids Hospital, Toronto, Canada). Briefly, before library construction, the integrity of RNA samples was confirmed on an Agilent Bioanalyzer 2100 RNA Nano chip (Agilent Technologies) and an RNA library prepared using the Illumina TruSeq mRNA Library Preparation protocol. The poly(A) RNA from 500 ng of total RNA was enriched with oligo dT beads and then fragmented to convert to double stranded cDNA. One ul of each of the final RNA libraries was loaded on a Bioanalyzer 2100 DNA High Sensitivity chip (Agilent Technologies) to check for size, and the RNA libraries were quantified by qRT-PCR using the Kapa Library Quantification Illumina/ABI Prism Kit protocol (KAPA Biosystems). Finally, six libraries were pooled in one lane with equimolar quantities and sequenced on an Illumina HiSeq 2500 platform using a Rapid Run Mode flowcell (Illumina).

Differential expression analysis
Using M. truncatula as reference genome, differential expression was determined using published protocols [60]. Firstly, the QC analyses were carried out for all the raw reads using FastQC program. Since the quality of all the sequenced reads is identical, two representative results for per-base quality are shown in Additional file 10 Document 2. Raw sequence reads were then 5' trimmed on quality score (Q > 30), adapter sequences removed and short reads dropped using custom Perl scripts. All filtered and properly paired reads were then mapped to the M. truncatula genome using TopHat (v2.0.10). The fragment alignments generated by TopHat were used as input files for Cufflink (v2.2.1) and further analyzed through the recommended pipeline to detect the differentially expressed genes between miR156OE and WT [60]. Features with false discovery rate < 0.2 (20 % false positive rate) were considered differentially expressed between conditions. The p-value (>0.005) was used for rejecting the null hypothesis that value2 is equal to value1. More detailed methods and parameters for analyzing RNA-Seq data are listed in Additional file 10: Document 2.

Transcriptome de novo assembly
De novo assembly of M. sativa transcriptome was performed using the Trinity program as previously described [34]. Compared with other de novo assemblers, Trinity is able to recover more full-length transcripts across a broad range of expression levels [34]. Briefly, the 1.11 billion clean RNA-Seq reads mentioned above were used as input for de novo assembly. BLAST searches (E < 10E-5) were conducted against the NCBI Nr (http:// www.ncbi.nlm.nih.gov/), Swissprot (http:// www.expasy.ch/sprot/), KEGG (http://www.genome.jp/ kegg/) and COG (http://www.ncbi.nlm.nih.gov/COG/) databases. The used parameters for assembling transcriptome were described in Additional file 10: Document 2.
GO enrichment analysis M. truncatula GO terms were downloaded from GO Analysis Toolkit and Database for Agriculture Community (AGRI go, http://bioinfo.cau.edu.cn/agriGO/download.php). All the genes identified with significant differential expression (p < 0.005) and FC > 2 in this study were used as input to carry out GO enrichment analysis. The enriched GO terms were summarized and plotted following the published REVIGO protocol [61]. The ratios of molecular functions, cellular component and biological process were calculated based on the number of GO terms.
Detection of miR156 cleavage sites in MsSPL2, MsSPL3, MsSPL4, and MsSPL9 transcripts The cleavage sites in alfalfa SPL genes were detected using a modified 5' rapid amplification of cDNA end (5'-RACE) as previously reported [62]. The experiment was conducted using FirstChoice® RLM-RACE Kit