The NAC family transcription factor GmNAC42–1 regulates biosynthesis of the anticancer and neuroprotective glyceollins in soybean

Background: Glyceollins are isoflavonoid-derived pathogen-inducible defense metabolites (phytoalexins) from soybean (Glycine max L. Merr) that have important roles in providing defense against pathogens. They also have impressive anticancer and neuroprotective activities in mammals. Despite their potential usefulness as therapeutics, glyceollins are not economical to synthesize and are biosynthesized only transiently and in low amounts in response to specific stresses. Engineering the regulation of glyceollin biosynthesis may be a promising approach to enhance their bioproduction, yet the transcription factors (TFs) that regulate their biosynthesis have remained elusive. To address this, we first aimed to identify novel abiotic stresses that enhance or suppress the elicitation of glyceollins and then used a comparative transcriptomics approach to search for TF gene candidates that may positively regulate glyceollin biosynthesis.
Results: Acidity stress (pH 3.0 medium) and dehydration exerted prolonged (week-long) inductive or suppressive effects on glyceollin biosynthesis, respectively. RNA-seq found that all known biosynthetic genes were oppositely regulated by acidity stress and dehydration, but known isoflavonoid TFs were not. Systemic acquired resistance (SAR) genes were highly enriched in the geneset. We chose to functionally characterize the NAC (NAM/ATAF1/2/CUC2)-family TF GmNAC42–1 that was annotated as an SAR gene and a homolog of the Arabidopsis thaliana (Arabidopsis) indole alkaloid phytoalexin regulator ANAC042. Overexpressing and silencing GmNAC42–1 in elicited soybean hairy roots dramatically enhanced and suppressed the amounts of glyceollin metabolites and biosynthesis gene mRNAs, respectively. Yet, overexpressing GmNAC42–1 in non-elicited hairy roots failed to stimulate the expressions of all biosynthesis genes. Thus, GmNAC42–1 was necessary but not sufficient to activate all biosynthesis genes on its own, suggesting an important role in the glyceollin gene regulatory network (GRN). The GmNAC42–1 protein directly bound the promoters of biosynthesis genes IFS2 and G4DT in the yeast one-hybrid (Y1H) system.
Conclusions: Acidity stress is a novel elicitor and dehydration is a suppressor of glyceollin biosynthesis. The TF gene GmNAC42–1 is an essential positive regulator of glyceollin biosynthesis. Overexpressing GmNAC42–1 in hairy roots can be used to increase glyceollin yields > 10-fold upon elicitation. Thus, manipulating the expressions of glyceollin TFs is an effective strategy for enhancing the bioproduction of glyceollins in soybean.
Electronic supplementary material: The online version of this article (10.1186/s12864-019-5524-5) contains supplementary material, which is available to authorized users.


Background
In 1939 K.O. Mueller et al. reported that metabolites that were elicited in potato upon inoculation with an incompatible race of Phytophthora infestans subsequently provided resistance to a compatible race [1]. Since then, the pathogen-inducible defense metabolites that have been identified from numerous plant species have collectively been referred to as 'phytoalexins'. Some phytoalexins have essential roles in defending agricultural crops against major pathogens. A classic example is the glyceollins of soybean that provide resistance to the oomycete Phytophthora sojae [2][3][4]. For decades researchers have studied the genetic regulation of phytoalexin elicitation by pathogens. Efforts have recently focused on identifying the transcription factors (TFs) that activate phytoalexin biosynthesis, a goal that has been confounded by the myriad of plant responses that occur synchronously in response to pathogens. Phytoalexins are biosynthetically diverse among plant species and include the isoflavonoid-derived glyceollins from soybean, the phenylpropanoid stilbenes from grapevine, the phenolic aldehyde gossypol from cotton, the terpenoid momilactones and phytocassanes from rice, and the indole alkaloid camalexin from Arabidopsis [5][6][7][8][9][10]. Since the TFs that activate the biosynthesis of phytoalexins in different plant species belong to different gene families and/or are non-homologous, for decades an important question has remained whether phytoalexin TFs are as diverse as the biosynthetic pathways that they regulate. Yet, several excellent reviews highlight that phytoalexins share common abiotic elicitors [11][12][13]. This could suggest conserved regulatory pathways and TFs among plant species despite the biosynthetic heterogeneity of phytoalexins.
Highly conserved abiotic elicitors of phytoalexins include heavy metals, herbicides, and UV irradiation. UV elicits stilbene phytoalexins in grapevine, Cissus antarctica, and Cannabis sativa [14], the flavonoid and diterpenoid phytoalexins in rice [15,16], camalexin in Arabidopsis [17], and glyceollins in soybean [18]. In rice, loss-of-function mutants of the JA biosynthesis gene allene oxide cyclase (aos) or jasmonic acid-amido synthetase (osjar1-2) showed an almost complete loss of sakuranetin elicitation in response to UV [19]. Yet, the diterpenoid phytoalexins of rice were not affected in JA biosynthesis mutants. Copper chloride (CuCl2) elicitation of sakuranetin, momilactone, and diterpenoid phytoalexins in rice was dramatically reduced by JA biosynthesis inhibitors [20]. The heavy metal silver nitrate (AgNO3) elicited glyceollin accumulation in soybean by reducing its degradation and by enhancing the hydrolysis of isoflavone-glycoside conjugates that compete with glyceollins for the common biosynthetic intermediate daidzein [21]. AgNO3 was shown to antagonize many plant development processes by inhibiting ethylene perception [22]. Yet, glyceollin elicitation by AgNO3 was largely independent of ethylene signaling [21]. Herbicides such as acifluorfen elicit at least in part via the reactive oxygen species (ROS) signaling pathway(s). The ups1 loss-of-function mutant of Arabidopsis defective in ROS signaling had reduced camalexin levels in response to acifluorfen [23]. ups1 also had reduced camalexin levels in response to Pseudomonas syringae and P. syringae pv maculicola (Psm), suggesting a shared biotic and abiotic elicitation pathway. In soybean, treatments with JA, ethylene, P. sojae WGE, or hydroxyl radical (a ROS) were highly effective at priming glyceollin biosynthesis in cells distal to the point of treatment, whereas SA was not [23,24].
In contrast to the abiotic stresses and signaling molecules that have conserved roles in eliciting phytoalexins, the TFs found to regulate phytoalexin biosynthesis have varied widely among plant species. GaWRKY1 activated gossypol biosynthesis in cotton [8]. GaWRKY1 transcripts were induced by methyl jasmonate (MeJA) and Verticillium dahliae but not by SA or H2O2. GaWRKY1 transcripts were co-expressed both spatially and temporally with gossypol biosynthesis genes and GaWRKY1 was able to directly bind the promoter of (+)-δ-cadinene synthase (CAD1) in the Y1H system. Another WRKY-family TF, namely AtWRKY33, was identified from Arabidopsis to directly bind and activate the promoter of the camalexin biosynthesis gene PAD3 [25]. WRKY33 transcripts were induced by the ROS-inducing herbicide paraquat, SA, and necrotrophic fungal pathogens [10]. GaWRKY1 and AtWRKY33 were not homologous since the proteins they encode had more than 20 other proteins that were more similar by reciprocal BLASTPs.
The R2R3-type MYB TF genes VvMYB14 and VvMYB15 from grapevine were co-induced with stilbene biosynthesis genes in response to UV irradiation, wounding, and the pathogen Plasmopara viticola [26]. The proteins directly bound the promoter of STILBENE SYNTHASE (STS) in transient gene reporter assays using grapevine suspension cells and induced the accumulation of stilbenes when overexpressed in grapevine hairy roots [26]. Homologs of VvMYB14 and VvMYB15 in Arabidopsis did not regulate camalexin biosynthesis but rather cold tolerance and defense-induced lignification, respectively [27,28]. Double and triple mutants of the Arabidopsis R2R3 MYBs AtMYB34, AtMYB51, and AtMYB122 had reduced camalexin levels upon elicitation with UV, AgNO3, and a PAMP isolated from Pythium aphanidermatum (PaNie) [29]. However, these three MYBs were unable to bind camalexin biosynthesis gene promoters and feeding the triple mutant plant with a biosynthetic intermediate restored camalexin accumulation, suggesting that AtMYB34, AtMYB51, and AtMYB122 did not regulate camalexin biosynthesis directly but rather an upstream process in the elicitation pathway [29]. The constitutive overexpression of the sorghum R2R3 MYB gene yellow seed (y1) in maize resulted in the ectopic accumulation of 3-deoxyanthocyanidins in vegetative tissues only upon challenge with Colletotrichum graminicola [5]. VvMYB15 and VvMYB14 were not homologs of y1 since reciprocal BLASTPs revealed 5-20 proteins that were more similar.
RNAi silencing of the bHLH-family TF gene OsMYC2 from rice almost completely eliminated the elicitation of sakuranetin in response to JA treatment [6]. OsMYC2 directly activated the promoter of a sakuranetin biosynthesis gene in transient transactivation assays in rice leaves [6]. Transcripts of another bHLH TF gene from rice, namely OsDPF, were inducible in rice leaves by UV, CuCl2, and blast infection [9]. OsDPF directly activated the promoters of phytocassane and momilactone biosynthesis genes in transient transactivation assays in rice leaves. Overexpressing OsDPF resulted in increased expression of all diterpenoid biosynthetic genes and the accumulation of momilactones and phytocassanes, whereas decreased levels were observed in RNAi knock-down lines. Two homologous JA-inducible bHLHs, TSAR1 and TSAR2, were identified to directly activate triterpene saponin biosynthesis genes in Medicago truncatula [9]. TSAR1 and TSAR2 were not among the top 20 most similar proteins compared to OsDPF or OsMYC2, and OsDPF was only the 10th most similar to OsMYC2.
A NAC-type TF gene, AtANAC042, was identified from Arabidopsis by T-DNA insertion mutagenesis; the mutant had reduced levels of camalexin biosynthesis gene expression and metabolites when elicited with the ROS-inducing herbicide acifluorfen, bacterial flagellin, or A. brassicicola [7]. Bacterial flagellin stimulated the accumulation of AtANAC042 transcripts at the elongation zone of the root (the site of camalexin biosynthesis), and the induction was abolished in the presence of either MeJA, a general kinase inhibitor (K252a), or a Ca2+-chelator (BAPTA).
Collectively, these studies have demonstrated that phytoalexin biosynthetic pathways are regulated by disparate, non-homologous TFs in different plant species, raising the question of whether any TF has a conserved role in regulating the biosynthesis of phytoalexins in plants. Here, we used a comparative transcriptomics approach on soybean that was exposed to novel abiotic stresses and identified a conserved phytoalexin regulator.

Materials and methods
Chemicals
(−)-Glyceollin I was from Dr. Paul Erhardt (University of Toledo). Soybean isoflavonoid standards were purified and characterized according to [21]. Isoflavone standards were from Extrasynthese (France). Solvents were LC-MS grade (Fisher).

Plant materials and growth conditions
Soybean seeds were obtained from the USDA-GRIN soybean germplasm collection and from Elroy Cober (Agriculture and Agri-Food Canada). Harosoy 63 seeds (16 per batch) were sterilized in 30 mL of 70% ethanol, 0.2% Triton X (v/v) for 5 min on a mixer wheel, rinsed thrice with sterile water, and imbibed overnight. The imbibate was then discarded to remove growth inhibitors and seeds were transferred to water-soaked sterile vermiculite (250 mL in volume) in 500 mL beakers. The beaker tops were covered with ring-shaped sterile cheesecloth and then with plastic wrap to ensure aseptic growth. The cheesecloth permitted passage of air between the plastic wrap and the beaker top, and the ring shape permitted the passage of light from above the beaker. Seedlings were grown at 22°C under a 16 h photoperiod using cool white T5 fluorescent lights (500 μE m−2 s−1). At the first trifoliate leaf stage (~8 d old), seedling roots were gently rinsed with sterile water to remove vermiculite and were transferred to stress treatments.

Stress treatments
For all stress treatments, the roots of five seedlings were wrapped together in a germination paper (Sartorius AG, Göttingen, Germany) saturated with half-strength Murashige and Skoog (MS) medium (pH 5.8) containing vitamins and 1% (w/v) sucrose unless indicated otherwise. The wrapped seedlings were transferred to a 100 mL beaker containing 50 mL of the above medium for the control, cold, heat, wounding and UV-C treatments. Each of the 100 mL beakers was then placed inside a sterile 500 mL beaker and the 500 mL beaker tops were again covered with a ring-shaped cut of sterile cheesecloth overlaid with plastic wrap. The volume of the medium in the basin of the 100 mL beaker was maintained daily for all treatments, with the exception of the dehydration treatment. For dehydration, the medium-saturated germination paper was allowed to dry gradually in the 100 mL beaker containing no medium. All seedlings were grown under the temperature and lighting conditions listed above unless otherwise indicated. For heat and cold treatments, the 500 mL beakers were transferred to 37 and 15°C, respectively. For high carbon stress, the growth medium in the 100 mL beaker was replaced with 3% sucrose in water. For flooding, control medium was maintained up to the level of the hypocotyl-root junction throughout the 9 d treatment. For phosphate deprivation (−P), half-strength MS medium (pH 5.8) that lacked phosphate was used (Caisson Labs, Smithfield, UT). For UV-C treatment, seedlings in beakers were exposed to a 30 W G30T8 germicidal light (Philips, NV) every day for 1 h. For acidity stress, seedlings were transferred to half-strength MS medium at pH 3.0 (acidified with HCl).
After 9 d of treatment (unless indicated otherwise), the five seedlings per treatment were unwrapped and separated, flash-frozen in liquid nitrogen, lyophilized to dryness, individually ground to a fine powder, and stored at −80°C for metabolite and RNA extractions. The stored tissue powder was lyophilized again for 1 h prior to weighing.
For hairy root experiments, only secondary roots that grew to 3-6 cm on selection media were considered transgenic and were used for WGE treatments. Roots were cut into 1-cm pieces then overlaid with sterile water (mock) or wall glucan elicitor (WGE) that was extracted from P. sojae according to [21]. For RNA extraction, 100 mg of fresh tissue was harvested on ice and freeze-dried prior to storage at −80°C. For metabolite analyses, fresh hairy root tissues (~100 mg) were extracted immediately upon harvesting without lyophilization.

Isoflavonoid analysis
For analysis of seedlings, lyophilized tissue powder (12 mg) was extracted with 80% ethanol (10 μL mg−1 dry tissue) and isoflavonoid identifications were done by UPLC-PDA-MSn as indicated in [21]. Four seedlings per treatment were individually extracted for metabolite analysis. Metabolite analyses of pH 3.0 medium, dehydration stress, and control treatments were confirmed by three independent experiments.
Hairy roots were extracted with 80% ethanol (1 μL mg−1 fresh weight, FW) as described [21]. For all hairy root experiments, five biological replicates were analyzed per treatment. Two independent transformation experiments were analyzed per DNA construct. Absolute amounts of isoflavonoids were determined by comparison of the UPLC-PDA peak areas to a concentration curve of purified or authentic standards as described in [21].
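The standard-curve quantification described above can be illustrated with a minimal sketch. It assumes a linear detector response and uses made-up standard concentrations and peak areas; the function names (fit_line, amount_per_mg) and all numbers are hypothetical and are not taken from [21].

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


def amount_per_mg(peak_area, slope, intercept, extract_ul_per_mg):
    """Convert a sample peak area to ng of metabolite per mg of tissue.

    The curve gives the extract concentration (ng per uL); multiplying by
    the extraction ratio (uL solvent per mg tissue) gives ng per mg.
    """
    conc_ng_per_ul = (peak_area - intercept) / slope
    return conc_ng_per_ul * extract_ul_per_mg


# Hypothetical standards: concentration (ng/uL) versus UPLC-PDA peak area.
std_conc = [1.0, 5.0, 10.0, 50.0]
std_area = [200.0, 1000.0, 2000.0, 10000.0]
slope, intercept = fit_line(std_conc, std_area)

# Hairy roots were extracted at 1 uL per mg FW; seedlings at 10 uL per mg.
hairy_root_ng_per_mg = amount_per_mg(4000.0, slope, intercept, 1.0)
seedling_ng_per_mg = amount_per_mg(4000.0, slope, intercept, 10.0)
print(hairy_root_ng_per_mg, seedling_ng_per_mg)
```

The same peak area corresponds to a tenfold larger per-mg amount in the seedling extracts because ten times more solvent was used per mg of tissue.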

RNA extraction and qRT-PCR
Total RNA was isolated from lyophilized tissue powder using the Spectrum Plant Total RNA Kit (Sigma-Aldrich, St. Louis, MO, USA) as described [21]. Total RNA (500 ng) was treated with DNase I (Amplification grade, Invitrogen, Carlsbad, CA, USA) to remove genomic DNA and cDNA was synthesized using SuperScript II Reverse Transcriptase (Invitrogen). cDNA templates were diluted 4-fold with water and qRT-PCR was conducted as described [21]. All qRT-PCR experiments included four biological replicates and two technical replicates. Primers used in this study are listed in Additional file 1: Table S1.
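The text does not state how relative expression was calculated from the qRT-PCR data beyond citing [21]. As an illustration only, the widely used 2^(−ΔΔCt) calculation is sketched below, assuming 100% amplification efficiency and a single reference gene; the function name and the Ct values are hypothetical.

```python
def mean(values):
    return sum(values) / len(values)


def fold_change_ddct(target_treated, ref_treated, target_control, ref_control):
    """Relative expression by the 2^(-ddCt) method.

    Each argument is a list of Ct values (for example, technical replicates),
    which are averaged before normalization to the reference gene.
    Assumes the amplification efficiency is 100% for both genes.
    """
    dct_treated = mean(target_treated) - mean(ref_treated)
    dct_control = mean(target_control) - mean(ref_control)
    ddct = dct_treated - dct_control
    return 2.0 ** (-ddct)


# Hypothetical Ct values for one treated and one control biological replicate.
fc = fold_change_ddct(
    target_treated=[22.0, 22.2],
    ref_treated=[18.0, 18.0],
    target_control=[26.0, 26.2],
    ref_control=[18.0, 18.0],
)
print(fc)
```

A target that reaches threshold four cycles earlier in the treated sample, after normalization, corresponds to a 16-fold increase under these assumptions.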

RNA-seq
Total RNA was extracted from the powder of individual seedlings as described above. Three individual seedlings per stress treatment and their respective controls were used to make a total of 12 libraries for RNA-seq analysis. RNA samples were sent to the Genomics Core Facility of West Virginia University for library preparation. The quality of each RNA sample was determined using an RNA Nano 6000 Chip and an Agilent 2100 Bioanalyzer (Santa Clara, CA). RNA samples with an Integrity Number (RIN) greater than 8.0 were used to prepare the libraries. Following quantification of RNA using a Qubit fluorometer, libraries were constructed from 750 ng using the mRNA stranded library prep kit (KAPA Biosystems) as per manufacturer's protocol with nine cycles of PCR. The completed cDNA libraries were quantified using a Qubit and pooled in equimolar ratios prior to sequencing at the Marshall University Genomics Core. The 100 bp paired-end reads were generated using a HiSeq1500 system (Illumina). Eight libraries were sequenced per lane in high-output mode.
Data filtering was carried out to eliminate adapter sequences and/or low-quality reads. The quality of raw reads was determined using FastQC software (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) and clean reads were then mapped/aligned to the Glycine max reference genome (Gmax_275_V2.0.fa, https://phytozome.jgi.doe.gov/pz/portal.html) using the STAR RNA-seq aligner [30] in default mode based on the current gene annotation. Only the paired mapped reads were considered for further analyses. Reads were quantified using featureCounts [31]. Differentially expressed genes (DEGs) were identified using a negative binomial distribution in DESeq2 [32]. Multiple-hypothesis correction was conducted with the Benjamini-Hochberg procedure, and an adjusted P value threshold of 0.05 was applied to control the false discovery rate (FDR). Principal component analysis, heatmaps, and clustering of the samples were used to check the robustness of the analysis. For the identification of gene homologs, genes were considered to be homologous if their predicted protein sequences were the best matches in reciprocal BLASTPs.
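For readers unfamiliar with the Benjamini-Hochberg adjustment named above, a minimal pure-Python sketch follows. DESeq2 performs this adjustment internally; the code below is not the DESeq2 implementation, and the P values are hypothetical.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted P values in the input order.

    Each P value is scaled by m / rank (rank 1 = smallest P value), and a
    running minimum taken from the largest rank downward keeps the adjusted
    values monotonic.
    """
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


# Hypothetical raw P values for four genes.
raw_p = [0.01, 0.04, 0.03, 0.005]
padj = benjamini_hochberg(raw_p)
significant = [i for i, p in enumerate(padj) if p < 0.05]
print(padj, significant)
```

Genes whose adjusted P value falls below 0.05 would be retained as DEGs at an expected FDR of 5%.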

Cloning
The GmNAC42-1 ORF was PCR amplified from the cDNA of Harosoy63 seedlings treated with pH 3.0 medium (9 dat) by the attB Adapter PCR protocol (Invitrogen, Carlsbad, CA) using Phusion polymerase (Thermo Fisher Scientific) and primers (Additional file 1: Table S1). The amplicon was cloned into the donor vector pDONR221 using BP Clonase II (Invitrogen, Carlsbad, CA) and following sequencing was LR recombined downstream of GFP in the pGWB6 vector to assay subcellular localization and downstream of the GAL4 activation domain in the pDEST-GADT7 vector for Y1H. For silencing, a 227-bp region of exon 2 of GmNAC42-1 was amplified from cDNA and BP cloned into pDONR221, which after sequencing was LR subcloned into the RNAi vector pANDA35HK. Hairpin integrations were confirmed by sequencing.

Soybean hairy roots
Transgenic soybean hairy roots were produced according to [33] with some modifications. Relatively large Williams 82 soybean seeds without cracks were surface sterilized with 70% isopropyl alcohol (v/v) for 30 s and 10% commercial bleach (6.0% (v/v) sodium hypochlorite) for 5 min with gentle agitation, then rinsed three times in sterile MilliQ-filtered water (EMD Millipore, MA). Seeds were transferred to germination paper saturated with germination and co-cultivation (GC) medium (half-strength MS salts (Caisson Labs, UT), 1% sucrose, pH 5.8, and MS vitamins) in a sterile Petri dish and germinated for 3 d in the dark, then transferred to cool white T5 fluorescent lights (100 μE s−1 m−2) at 24°C, a condition that was used for all subsequent soybean transformation steps.
Following pre-culture on LB-agar plates containing 50 mg L−1 kanamycin and hygromycin, Agrobacterium rhizogenes strain K599 containing the empty vector or construct DNA was resuspended to an OD600 of 0.5-0.8 in phosphate buffer (0.01 M Na2HPO4, 0.15 M NaCl, pH 7.5) containing 100 μM acetosyringone. Cotyledons were gently twisted off of 6-7 d old seedlings. The apical meristem and hypocotyl were excised and several 1 mm-deep cuts were made across the adaxial surface of the cotyledon with a scalpel previously dipped in the Agrobacterium solution. Twenty-four to 36 cotyledons were inoculated per DNA construct. Cotyledons were placed adaxial-side-down on germination paper saturated with GC medium containing 100 μM acetosyringone and co-cultivated for 3 d at 22°C under low light (65 μE s−1 m−2) on a 16 h photoperiod. Cotyledons were then cultured adaxial-side-up on hairy root growth (HRG) medium (half-strength MS salts, 3% sucrose (w/v), pH 5.8, with gelzan (2.4 g L−1; Sigma-Aldrich, MO), MS vitamins (2.5 mL L−1), and timentin (500 mg L−1)). Fourteen to 21 d later, transgenic primary roots with 2-3 cm secondary roots were transferred to and selected on HRG containing 50 mg L−1 kanamycin and hygromycin. Only secondary roots that grew to 3-6 cm were considered transgenic and were used for treatments. All hairy root experiments were conducted two times independently; representative results are shown.

Subcellular localization
Soybean hairy roots transformed with nGFP-pGWB6 or nGFP-NAC42-1-pGWB6 were harvested and stained with propidium iodide according to the manufacturer's instructions (Sigma-Aldrich, St. Louis, MO, USA). Three-to-four roots per genotype per two independent transformation events were analyzed and a representative result is shown. Confocal images were acquired using a Nikon A1R Si confocal laser with N-SIM-E, a TiE inverted research microscope, and NIS Elements software. Imaging was performed using an Apo oil 60× objective, plus 1.5× optical zoom, and 6× digital zoom. Excitation and emission spectra were 488 nm and 500-550 nm for GFP and 488 nm and 570-620 nm for propidium iodide, respectively.

Novel abiotic stresses that regulate glyceollin biosynthesis
To gain insight into how abiotic stresses regulate glyceollin biosynthesis in soybean we first searched for a control growth condition that would allow us to measure the inductive and suppressive effects of abiotic stress treatments on glyceollin biosynthesis. We grew soybean seedlings under two light intensities, 10 and 500 μmol m−2 s−1, which we refer to here as low and high light, respectively. We also compared seedlings grown on soil to those grown in liquid half-strength Murashige and Skoog (MS) medium that can be readily manipulated to provide nutrient and chemical stresses (see Methods). In addition to glyceollins, we also measured the levels of two key biosynthetic intermediates, two additional phytoalexins that have potent anti-pathogenic and/or medicinal activities, and two constitutively biosynthesized isoflavone-glycoside conjugates known to compete with glyceollins for biosynthetic intermediates. Specifically, we measured the levels of glyceollin I, glyceollin II, glyceollin III, and phaseol that are biosynthesized from the intermediate daidzein, and β-prenyl genistein that is biosynthesized from genistein (Fig. 1). We also measured the levels of an unknown metabolite that exhibited UV absorbance properties similar to isoflavonoids but did not represent any of the 57 (iso)flavonoid standards that we have in our library.
The MS medium high light condition was the only condition that elicited measurable amounts of all phytoalexins (Fig. 2a). The MS low light condition had greater amounts of glyceollins I and II but lacked phaseol and β-prenyl genistein, and thus may not be suitable for evaluating the specificity of the effects of abiotic stresses on the glyceollin pathway. Glyceollins were absent or in trace amounts in seedlings grown on soil, either under the high or low light conditions. Based on these results, we selected the MS medium high light as the control condition to evaluate the effects of abiotic stresses on glyceollin biosynthesis.
Seedlings were transferred to eight abiotic stress conditions and the amounts of total phytoalexins were enhanced significantly by pH 3.0 medium, UV-C, and dehydration compared to the control (ANOVA, Tukey post hoc test, P < 0.01) (Fig. 2b). pH 3.0 medium stimulated the greatest increase, having 22.7-fold greater amounts of total phytoalexins compared to the control and significantly greater amounts compared to all other treatments. UPLC-PDA chromatograms revealed major increases in the levels of glyceollins for pH 3.0 medium, and major reductions in the amounts of 6-O-malonyldaidzin for dehydration and pH 3.0 medium that were not observed for the UV-C treatment (Fig. 2c). pH 3.0 medium and dehydration predominantly caused increases in the amounts of glyceollin III and glyceollin II (Fig. 2d). Overall, pH 3.0 medium had the greatest increase in glyceollin amounts, with glyceollin III becoming 25% of the total measured isoflavonoid content.

Acidity stress enhances and dehydration suppresses glyceollin biosynthesis
Pathogens generally elicit maximum glyceollin biosynthesis within 24-48 h of inoculation, then the levels rapidly decline [4,35]. To understand the dynamics of the regulation of glyceollin biosynthesis by pH 3.0 medium and dehydration, we measured metabolite levels at regular intervals up to 9 dat.
Following the transfer of seedlings to the control condition, we observed a gradual accumulation of glyceollins and phaseol peaking at 6 dat (Fig. 3a). In contrast, β-prenyl genistein rapidly decreased up to 3 dat then remained constant thereafter. Two elicitation patterns distinguished pH 3.0 medium from the control. Glyceollin III and phaseol exhibited sharp increases from 6 dat to 9 dat, whereas glyceollins I and II exhibited delayed and prolonged accumulation (Fig. 3a). Elicitation of these daidzein-derived phytoalexins was accompanied by decreases in daidzein and its glycosyl-conjugates, namely daidzin and 6-O-malonyldaidzin. Genistein and derived isoflavonoids were not increased by pH 3.0 medium. In sharp contrast, dehydration caused a sustained suppression of all daidzein-derived isoflavonoids over the 9 d period with up to a 106.8-fold suppression of glyceollin I at 6 dat (Fig. 3a). This major suppressive effect was not observed for genistein-derived metabolites.
To determine whether pH 3.0 medium and dehydration stresses regulated glyceollin biosynthesis gene transcripts, we measured the expression of key biosynthetic genes by quantitative reverse transcriptase-polymerase chain reaction (qRT-PCR). Specifically, we measured the expressions of isoflavone synthase 1 (IFS1) and IFS2, isogenes for the biosynthesis of isoflavones (Fig. 1). We also measured the expressions of isoflavone 2′-hydroxylase (I2'H) and glycinol 4-dimethylallyltransferase (G4DT), genes for the biosynthesis of all daidzein-derived phytoalexins and glyceollin I, respectively [36,37]. pH 3.0 medium upregulated all gene transcripts at 9 dat. The levels ranged from 4.4- to 20.7-fold greater than the control for I2'H and IFS2, respectively (Fig. 3b). By contrast, dehydration stress had reduced levels of all gene transcripts at 6 dat, ranging from 2.2- to 11.7-fold less than the control for IFS2 and I2'H, respectively.

Acidity and dehydration stresses oppositely regulate all known glyceollin biosynthesis genes
To investigate whether pH 3.0 medium and dehydration oppositely regulated all known glyceollin biosynthesis genes, we conducted RNA-seq comparing genes upregulated by pH 3.0 medium to those downregulated by dehydration. pH 3.0 medium upregulated 3242 and dehydration downregulated 9129 genes more than 2-fold, respectively (P < 0.05) (Additional file 3: Table S3 and Additional file 4: Table S4). By comparing the two gene lists, we found that 1058 genes were in common (Fig. 4a & Additional file 5: Table S5). All 27 known glyceollin biosynthesis genes spanning from phenylalanine ammonia lyase (PAL) to the glycinol:dimethylallyl diphosphate (DMAPP) transferases G4DT and G2DT [37,38] were upregulated by pH 3.0 medium and downregulated by dehydration, respectively (Table 1). Since DMAPP is derived from either the cytosolic mevalonate pathway or the plastidic methylerythritol phosphate (MEP) pathway, we checked our lists for these genes. pH 3.0 and dehydration stresses oppositely regulated genes for all steps of the MEP pathway up to DMAPP formation, whereas no mevalonate genes were differentially regulated (Table 1).
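The comparison of the two gene lists can be sketched as a set intersection. The example below assumes each differential expression result is stored as a log2 fold change and an adjusted P value per gene; the gene identifiers, values, and function name are hypothetical placeholders rather than data from this study.

```python
import math


def split_by_direction(results, min_fold=2.0, alpha=0.05):
    """Return (upregulated, downregulated) gene sets.

    results maps gene ID -> (log2 fold change, adjusted P value).
    A gene is kept if it changes more than min_fold in either direction
    and its adjusted P value is below alpha.
    """
    cutoff = math.log2(min_fold)
    up = {g for g, (lfc, padj) in results.items() if padj < alpha and lfc > cutoff}
    down = {g for g, (lfc, padj) in results.items() if padj < alpha and lfc < -cutoff}
    return up, down


# Hypothetical results for four placeholder genes under each stress.
acidity = {
    "gene_A": (3.1, 0.001),
    "gene_B": (1.4, 0.010),
    "gene_C": (0.4, 0.200),
    "gene_D": (2.0, 0.040),
}
dehydration = {
    "gene_A": (-2.5, 0.002),
    "gene_B": (-0.5, 0.030),
    "gene_C": (-3.0, 0.001),
    "gene_D": (-1.2, 0.200),
}

acid_up, _ = split_by_direction(acidity)
_, dehyd_down = split_by_direction(dehydration)

# Genes upregulated by acidity AND downregulated by dehydration.
oppositely_regulated = acid_up & dehyd_down
print(sorted(oppositely_regulated))
```

Applied to the full result tables, an intersection of this kind would correspond to the set of genes shared between the two lists.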
Since our RNA-seq analyses found that pH 3.0 medium and dehydration regulated glyceollin biosynthesis at the level of transcription, we hypothesized that TF genes required for the activation of those biosynthesis genes would also be present in our geneset. Yet, none of the previously identified isoflavonoid TF genes were found. Those absent included TF genes identified by QTL mapping of isoflavonoid amounts, namely GmMYBJ3 (Glyma.06g193600) and GmMYB29 (Glyma20g35180) [39,40]. Also absent were TFs that activated the biosynthesis of chalcone synthase-derived isoflavonoids during seed development, namely GmMYB176 (Glyma.05G032200) and GmCYP1 (Glyma.11G098700) [41,42].

Comparative transcriptomics identifies candidate transcription factors for the regulation of glyceollin biosynthesis
To better understand the pathways that were oppositely regulated by acidity and dehydration stresses, we analyzed the ontologies of the 1058 oppositely regulated genes (Fig. 4a). Signal transduction was the most common category of ontology (31.4% of genes, Fig. 4b).
When the signal transduction category was broken down into ontologies, the greatest proportion (28.3%) were annotated as systemic acquired resistance (SAR) (Fig. 4c). SAR is a component of the plant immune system whereby tissues distant from a pathogen infection site become primed (sensitized) to more rapidly activate resistance responses the second time the plant encounters the pathogen. Gene ontology (GO) enrichment analysis indicated that SAR genes were significantly enriched (P < 1.0 − 10) and included those involved in salicylic acid (SA)-dependent and SA-independent signaling pathways, in addition to jasmonic acid (JA) and ethylene signaling pathways (GO:0009627, GO:0009862, GO:0009864, GO:0009871, and GO:0010112). The SAR genes included homologs of AGD2-LIKE DEFENSE RESPONSE PROTEIN 1 (ALD1) and FLAVIN-DEPENDENT MONOOXYGENASE1 (FMO1) that were indispensable for SAR in Arabidopsis (Table 2) [43][44][45]. ALD1 encodes an enzyme that synthesizes the non-protein amino acid pipecolic acid (Pip) from Lys upon pathogen attack [45]. FMO1 converts Pip to N-hydroxypipecolic acid (NHP) [46] and is needed for Pip to orchestrate priming of pathogen responses by SA-dependent and SA-independent pathways [47]. The SAR genes also included homologs of signaling and TF genes that had roles in regulating the elicitation of the indole alkaloid phytoalexin camalexin in Arabidopsis. PHYTOALEXIN DEFICIENT4 (PAD4) is a lipase-like gene required for SA-dependent elicitation of camalexin in response to microbial pathogens [48]. SIGMA FACTOR BINDING PROTEIN 1 (SIB1) encodes a TF that activates the expression of AtWRKY33, a direct regulator of camalexin biosynthesis genes [49]. However, homologs of AtWRKY33 (namely Glyma.02G232600 and Glyma.14G200200) were not found in our gene set, nor were they significantly upregulated by pH 3.0 medium alone.
Fig. 3 Time course of phytoalexin and isoflavonoid biosynthesis during acidity and dehydration stresses. a Isoflavonoid levels by UPLC-PDA over time after transfer to the control condition, pH 3.0 medium, or dehydration stress. Error bars represent standard error of the mean. b Isoflavonoid biosynthesis gene expressions at 6 and 9 dat measured by qRT-PCR. a Significantly greater and b significantly less than control, paired Student's t-test (P < 0.01). Error bars represent standard error of the mean.
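The ontology breakdown above can be made concrete with a short calculation. The sketch below converts the reported percentages into approximate gene counts and shows one common way to score enrichment, a one-sided hypergeometric test. The text does not state which enrichment statistic was used, so `hypergeom_sf` is an illustrative assumption, and the counts assume each percentage applies to the full parent set.

```python
from math import comb

# Figures taken from the text: 1058 oppositely regulated genes,
# 31.4% in signal transduction, 28.3% of those annotated as SAR.
TOTAL_GENES = 1058
signal_transduction = round(TOTAL_GENES * 0.314)  # approximate count
sar_genes = round(signal_transduction * 0.283)    # approximate count


def hypergeom_sf(k, population, annotated, drawn):
    """P(X >= k): chance of seeing at least k annotated genes when
    `drawn` genes are sampled from `population`, of which `annotated`
    carry the annotation. Hypothetical helper, not the authors' method."""
    upper = min(annotated, drawn)
    total = comb(population, drawn)
    tail = sum(
        comb(annotated, i) * comb(population - annotated, drawn - i)
        for i in range(k, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    print(signal_transduction, sar_genes)
    # Toy numbers only: 10 genes, 4 annotated, 3 drawn, at least 2 hits.
    print(hypergeom_sf(2, 10, 4, 3))
```

A real analysis would need the genome-wide background count of SAR-annotated genes, which is not given in this section.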

NAC42-type TFs are upregulated with glyceollins by abiotic and biotic elicitors
We conducted qRT-PCR to gain insight into whether the NAC42-type TFs that were identified by our transcriptomics analysis may be involved in regulating glyceollin biosynthesis. qRT-PCR confirmed that the three GmNAC42s were upregulated by pH 3.0 medium and downregulated by dehydration (Fig. 5a-b).
The predicted GmNAC42 proteins were 68.5-85.8% similar to each other and 54.3-56.7% similar to ANAC042/JUB1, with GmNAC42-1 being the most similar (Additional file 6: Table S6). The N-terminal halves of these proteins contained the conserved NAM domain (pfam02365) putatively involved in dimerization and binding DNA (Additional file 7: Fig. S1). The N-terminal halves of the GmNAC42s were highly similar to ANAC042/JUB1 (76.2-83.3%), whereas the C-terminal halves putatively involved in protein-protein interactions were highly divergent (30.5-34.9% similarity) (Additional file 6: Table S6). A phylogenetic analysis of the predicted GmNAC42 proteins with characterized NACs revealed that the GmNAC42s were most closely related to VvNAC42_5 (Fig. 5c). VvNAC42_5 is an SA-independent powdery mildew responsive gene from grapevine (Vitis vinifera) [50]. Also in this cluster were proteins that positively regulate drought stress responses, namely SlJUB1 and DlNAC1 [51,52].
To probe further whether GmNAC42s may be positive regulators of glyceollins, we assessed whether their gene expressions were upregulated by the wall glucan elicitor (WGE) from P. sojae.
Treatment of soybean hairy roots with WGE resulted in maximum accumulation of glyceollins at 24 h after treatment (Fig. 5d). qRT-PCR found that all three GmNAC42s were upregulated 9.6- to 14.4-fold at this time, along with the glyceollin biosynthesis gene G4DT (Fig. 5e). GmNAC42-1 was the most highly upregulated.
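For readers unfamiliar with how qRT-PCR fold changes such as those above are typically derived, the following is a minimal sketch of the widely used 2^(-ΔΔCt) calculation. This section does not state the quantification method, so the formula, the function name `fold_change`, and the Ct values are illustrative assumptions. The calculation also assumes roughly 100% primer efficiency.

```python
def fold_change(ct_target_treated, ct_ref_treated,
                ct_target_control, ct_ref_control):
    """Relative expression by the 2^(-ddCt) method (assumed, not from the source).

    Each Ct is a threshold cycle; the reference gene normalizes for input RNA.
    """
    delta_treated = ct_target_treated - ct_ref_treated
    delta_control = ct_target_control - ct_ref_control
    delta_delta = delta_treated - delta_control
    return 2 ** (-delta_delta)


if __name__ == "__main__":
    # Hypothetical Ct values: the target crosses threshold 3 cycles
    # earlier in treated tissue, with an unchanged reference gene.
    print(fold_change(22.0, 18.0, 25.0, 18.0))
```

Under these assumptions, a target that reaches threshold three cycles earlier after treatment corresponds to an eight-fold upregulation.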

GmNAC42-1 regulates glyceollin biosynthesis in response to Phytophthora sojae WGE
We chose to investigate the function of GmNAC42-1 since it is the soybean homolog of ANAC042, an indole alkaloid phytoalexin regulator from Arabidopsis, and since its gene expression coincided with the elicitation of glyceollin biosynthesis. If GmNAC42-1 positively regulates glyceollin biosynthesis, silencing its expression in elicited tissues should reduce the accumulation of glyceollin metabolites and biosynthesis gene transcripts. Conversely, overexpressing GmNAC42-1 should increase the accumulation of glyceollins and their biosynthesis gene transcripts. To test this, we produced soybean hairy roots harboring an RNA interference (RNAi) construct that encoded a hairpin dsRNA identical to a 227 bp region of exon 2 of GmNAC42-1 and roots that overexpressed the GmNAC42-1 open reading frame (ORF) via the constitutive cauliflower mosaic virus promoter (p35S). A 2.0-fold silencing of GmNAC42-1 decreased the accumulations of glyceollin biosynthesis gene transcripts IFS1, IFS2, and G4DT 1.8- to 2.4-fold (Fig. 6a). Off-target silencing was observed for GmNAC42-2 but not for GmNAC42-3. The overexpression of GmNAC42-1 upregulated IFS1, IFS2, and G4DT from 2.1- to 8.3-fold in roots treated with WGE or mock (H2O) (Fig. 6b-c).

GmNAC42-1 localizes to the nucleus and directly binds the promoters of glyceollin biosynthesis genes
To determine whether the subcellular localization of the GmNAC42-1 protein was consistent with its putative role as a TF, we cloned its ORF downstream of an N-terminal GFP tag and expressed the translational fusion in soybean hairy roots using the constitutively active CaMV-35S promoter (p35S) [53]. nGFP-GmNAC42-1 localized to the nucleus as shown by co-localization with propidium iodide fluorescence (red arrowheads, Fig. 7a-c). By contrast, GFP expressed by the empty vector localized to the cytosol and other extra-nuclear compartments (Fig. 7d-f).
To test whether the GmNAC42-1 protein could directly bind the promoters of glyceollin biosynthesis genes, the ORF was also cloned downstream of the GAL4 activation domain and expressed in yeast harboring several 500 bp segments of IFS2 or G4DT promoters (Fig. 7g). GmNAC42-1 weakly activated the G4DT promoter segment closest to the transcription start site (G4DTpro1) that had one predicted NAC binding element (T/ATTGACT/C), failed to activate the segment that lacked the element (G4DTpro1), and strongly activated both IFS2 promoter segments that each had several elements (Fig. 7h).
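The predicted NAC binding element can be located in a promoter segment with a simple pattern search. The sketch below reads T/ATTGACT/C as (T or A)TTGAC(T or C), which is an interpretation of the notation rather than something the text spells out, and scans both DNA strands. The function names and test sequences are hypothetical, and this is not the prediction tool used in the study.

```python
import re

# Assumed reading of the element T/ATTGACT/C: [T or A]TTGAC[T or C].
# The lookahead lets overlapping matches be reported.
NAC_ELEMENT = re.compile(r"(?=([TA]TTGAC[TC]))")
ELEMENT_LENGTH = 7
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def find_nac_elements(promoter):
    """List (start, strand, site) for every element on either strand.

    `start` is the 0-based position on the forward strand where the
    site begins, regardless of which strand carries the match.
    """
    seq = promoter.upper()
    hits = [(m.start(), "+", m.group(1)) for m in NAC_ELEMENT.finditer(seq)]
    for m in NAC_ELEMENT.finditer(reverse_complement(seq)):
        forward_start = len(seq) - (m.start() + ELEMENT_LENGTH)
        hits.append((forward_start, "-", m.group(1)))
    return sorted(hits)


if __name__ == "__main__":
    # Hypothetical toy sequences, not real soybean promoters.
    print(find_nac_elements("GGGTTTGACTCCC"))
    print(find_nac_elements("CCGGTCAATCC"))
```

Counting hits per 500 bp segment in this way would reproduce the "one element", "no element", and "several elements" descriptions, provided the same motif definition is used.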

GmNAC42-1 is required for full elicitation of glyceollin biosynthesis
In this study, we found that transcripts of the NAC-family TF gene GmNAC42-1 were upregulated with glyceollin biosynthesis genes and metabolites when soybean tissues were elicited by acidity stress or the biotic elicitor WGE from P. sojae. They were also downregulated with glyceollin biosynthesis genes and metabolites by dehydration stress. The overexpression and silencing of GmNAC42-1 in WGE-treated hairy roots enhanced and suppressed, respectively, the expressions of the isoflavone biosynthetic genes IFS1 and IFS2, the glyceollin-specific gene G4DT, and the accumulation of glyceollin metabolites. Since G4DT is specifically involved in glyceollin biosynthesis, the results suggest that GmNAC42-1 is a regulator of glyceollin elicitation and not the biosynthesis of constitutively accumulating isoflavone conjugates. However, overexpressing or silencing GmNAC42-1 did not affect the expression levels of I2'H, one of the key genes required for glyceollin biosynthesis [54]. Further, overexpression of GmNAC42-1 in the absence of WGE did not result in the accumulation of glyceollins. Thus, our results showed that GmNAC42-1 is required for the full elicitation of glyceollin biosynthesis in response to P. sojae WGE, but is not sufficient to upregulate all glyceollin biosynthesis genes.
The nGFP-GmNAC42-1 fusion protein localized to the nucleus in the absence of an elicitor treatment and thus did not rely on elicitor treatment for nuclear localization as observed for the phytoalexin TF AtWRKY33 or the NAC-family TFs StNTP1 and StNTP2 [55,56]. Since GmNAC42-1 is essential for full elicitation of glyceollins, we suggest that GmNAC42-1 acts in concert with at least one other TF to coordinately activate all glyceollin biosynthetic genes. Further, by upregulating some but not all glyceollin genes, GmNAC42-1 could also function in SAR to prime soybean tissues distal to an inoculation site for subsequent rapid/high-level elicitation [23,57,58]. A subsequent direct inoculation of the primed tissues would activate the expressions or activity of one or more additional TFs that upregulate I2'H and other glyceollin biosynthesis genes that are not regulated by GmNAC42-1 alone. In that case, overexpressing GmNAC42-1 could serve as an alternative to spraying the lactofen-containing herbicide Cobra that primes glyceollin biosynthesis to increase resistance against pathogens such as white mold, the causal agent of sclerotinia stem rot, without adversely affecting yield [59,60]. Future experiments should test whether overexpressing GmNAC42-1 in soybean plants primes glyceollin biosynthesis without adverse effects on yield as well. Since the rapidity of glyceollin elicitation is a major factor that distinguishes resistant to P. sojae (Rps) soybean genotypes from nearly-isogenic susceptible genotypes [61][62][63][64], experiments should also test whether overexpressing GmNAC42-1 enhances the rapidity of glyceollin elicitation in response to compatible P. sojae (Rps) genotypes.

GmNAC42-1 and a conserved phytoalexin elicitation pathway
The regulation of phytoalexins by pathogens and specific abiotic stresses suggests that elicitation is highly complex and may require multiple signaling pathways. This study in soybean identified acidity stress (pH 3.0 medium) and dehydration as novel regulators of phytoalexin biosynthesis. Transcriptome analysis found that the genes upregulated by acidity stress and downregulated by dehydration were reminiscent of pathogen responses, with SAR genes being highly overrepresented. The SAR genes included homologs of Arabidopsis ALD1 and FMO1 that synthesize the systemic signaling molecules Pip and its derivative N-hydroxypipecolic acid (NHP) to orchestrate priming of pathogen responses [46,47], and homologs of the Arabidopsis lipase-like gene PAD4 and TF gene ANAC042 that regulate the biosynthesis of camalexin [7,48]. Here, we found that GmNAC42-1 is the soybean homolog of ANAC042 and is required for full elicitation of glyceollins. The results suggest a conserved phytoalexin elicitation pathway for phenylpropanoid-derived glyceollins in soybean and indole alkaloid-derived camalexin in Arabidopsis that requires NAC42 TFs.
Further, our investigation of Lager's transcriptome dataset [65] demonstrated that ANAC042 and its target camalexin biosynthesis genes (namely CYP71A12, CYP71A13 and CYP71B15/PAD3) [7] were upregulated by long-term acidity stress, suggesting that NAC42-dependent induction of phytoalexins may be a conserved response to acidity stress.
Fig. 6 Overexpression and silencing of GmNAC42-1 in soybean hairy roots. a Gene expressions in WGE-treated Williams 82 hairy roots undergoing RNAi silencing of GmNAC42-1. b Gene expressions in WGE-treated hairy roots overexpressing GmNAC42-1. c Gene expressions in mock-treated hairy roots overexpressing GmNAC42-1. Measurements were 24 h after treatment by qRT-PCR. a Significantly greater and b significantly less than control, paired Student's t-test (P < 0.01). d Amounts of phytoalexins and constitutive isoflavonoids in soybean hairy roots undergoing RNAi silencing of GmNAC42-1 24 h after treatment with WGE or H2O. e Metabolite amounts from hairy roots overexpressing GmNAC42-1. Different letters show significant differences by single factor ANOVA, Tukey post hoc test, P < 0.01.
More insight into the NAC42 pathway could be drawn from the fact that glyceollin biosynthesis was elicited by the treatment of soybean cotyledons with hydroxyl radical (a ROS) [24] and camalexin elicitation by the ROS-inducing herbicide acifluorfen required ANAC042 [7]. The ROS-inducing herbicide lactofen systemically primes glyceollin biosynthesis [59]. ROS accumulation is stimulated by various phytoalexin elicitors such as pathogens, heavy metals, and UV irradiation [66][67][68]. Further, the acidification of growth media from pH 5.0 to 4.5 stimulated ROS production in seedlings of barley and Scots pine [69,70] and in MS medium containing Plantago shoots [71]. Also, genes that positively regulate ROS (GO:2000377 and GO:2000379) were overrepresented in the soybean and Arabidopsis transcriptome responses to long-term acidity stress. Thus, the NAC42 pathway may be a conserved ROS signaling pathway responsible for phytoalexin elicitation in response to various abiotic and biotic elicitors. It is tempting to speculate that major TFs that regulate acidity and dehydration responses may regulate GmNAC42-1 since the stresses oppositely regulate GmNAC42-1 transcripts. STOP1 is a zinc finger TF that is a major regulator of protective responses to acidity stress [72,73]. STOP1 also stimulates ROS production [74]. Yet, STOP1 homologs were not found in the soybean transcriptome response to long-term acidity stress (9 dat), and ANAC042 was not downregulated in an Arabidopsis stop1 mutant at 1 dat [72]. This could imply that NAC42 induction of phytoalexins is downstream of ROS signaling and not directly regulated by STOP1. ABA is a major regulator of dehydration responses in part through the activity of ABA-responsive element (ABRE)-binding TFs [75]. Our transcriptome dataset shows that dehydration is a powerful negative regulator of glyceollin biosynthesis and GmNAC42-1, raising the possibility that both are negatively regulated by ABA.
We found that ABREs were present in the promoter regions (~1000 bp upstream of the transcription start site) of several glyceollin biosynthesis genes, but no ABREs were observed in the GmNAC42-1 promoter (data not shown). Thus, dehydration may regulate glyceollin biosynthesis at multiple levels.

Co-option of phytoalexin biosynthesis by NAC42
Phytoalexin TF genes of the NAC, MYB, bHLH, and WRKY families have been identified from Arabidopsis, rice, cotton, maize and grapevine [5][6][7][8][9][10]. Yet none of these TF genes were homologous among plant species. The phytoalexins elicited in these species were biosynthetically diverse and included indole alkaloids, momilactones and phytocassanes, terpenoid aldehydes, deoxyanthocyanidins, and stilbenoids, respectively. Thus, it has remained a question whether any phytoalexin TFs are conserved in plants or whether they are as diverse as the biosynthetic pathways that they regulate. Here, we found that GmNAC42-1 is required for the full activation of glyceollin biosynthesis in soybean. Its homolog ANAC042 is needed for the full elicitation of camalexin biosynthesis in Arabidopsis [7]. Glyceollins are isoflavonoid derivatives derived from phenylalanine, whereas camalexin is an indole alkaloid biosynthesized from tryptophan. It is possible that NAC42 TFs regulate genes in the shikimate pathway that produces phenylalanine and tryptophan. Yet, our overexpression and silencing experiments demonstrated that GmNAC42-1 regulated isoflavonoid- and glyceollin-specific biosynthetic genes through the direct binding of their promoters. While our promoter sequence analyses identified the putative NAC-binding element T/ATTGACT/C within 1 kb of the translation start sites of the camalexin-specific biosynthetic genes CYP71A12 and CYP71A13 that were regulated at the mRNA level by ANAC042 [7], the DREB2A element that was suggested to be the target of ANAC042/JUB1 [76] was not found in those regions nor within glyceollin biosynthetic gene promoters. If NAC42 TFs indeed bind the T/ATTGACT/C element in glyceollin- and camalexin-specific biosynthetic genes, this would suggest that phytoalexin biosynthesis pathways were co-opted into stress-inducible regulation by NAC42 TFs.
Our future work will focus on characterizing the recognition elements and DNA binding domains of GmNAC42-1 and ANAC042 that are required to activate phytoalexin biosynthesis.