Genomic and transcriptomic analysis of carbohydrate utilization by Paenibacillus sp. JDR-2: systems for bioprocessing plant polysaccharides

Background Polysaccharides comprising plant biomass are potential resources for conversion to fuels and chemicals. These polysaccharides include xylans derived from the hemicellulose of hardwoods and grasses, soluble β-glucans from cereals and starch as the primary form of energy storage in plants. Paenibacillus sp. JDR-2 (Pjdr2) has evolved a system for bioprocessing xylans. The central component of this xylan utilization system is a multimodular glycoside hydrolase family 10 (GH10) endoxylanase with carbohydrate binding modules (CBM) for binding xylans and surface layer homology (SLH) domains for cell surface anchoring. These attributes allow efficient utilization of xylans by generating oligosaccharides proximal to the cell surface for rapid assimilation. Coordinate expression of genes in response to growth on xylans has identified regulons contributing to depolymerization, importation of oligosaccharides and intracellular processing to generate xylose as well as arabinose and methylglucuronate. The genome of Pjdr2 encodes several other putative surface anchored multimodular enzymes including those for utilization of β-1,3/1,4 mixed linkage soluble glucan and starch. Results To further define polysaccharide utilization systems in Pjdr2, its transcriptome has been determined by RNA sequencing following growth on barley-derived soluble β-glucan, starch, cellobiose, maltose, glucose, xylose and arabinose. The putative function of genes encoding transcriptional regulators, ABC transporters, and glycoside hydrolases belonging to the corresponding substrate responsive regulon were deduced by their coordinate expression and locations in the genome. These results are compared to observations from the previously defined xylan utilization systems in Pjdr2. The findings from this study show that Pjdr2 efficiently utilizes these glucans in a manner similar to xylans. From transcriptomic and genomic analyses we infer a common strategy evolved by Pjdr2 for efficient bioprocessing of polysaccharides. Conclusions The barley β-glucan and starch utilization systems in Pjdr2 include extracellular glycoside hydrolases bearing CBM and SLH domains for depolymerization of these polysaccharides. Overlapping regulation observed during growth on these polysaccharides suggests they are preferentially utilized in the order of starch before xylan before barley β-glucan. These systems defined in Pjdr2 may serve as a paradigm for developing biocatalysts for efficient bioprocessing of plant biomass to targeted biofuels and chemicals. Electronic supplementary material The online version of this article (doi:10.1186/s12864-016-2436-5) contains supplementary material, which is available to authorized users.


Background
The bacterium Paenibacillus sp. JDR-2 (Pjdr2) originally isolated from sweetgum wood (Liquidambar styraciflua) disks exposed to surface soils has been shown to completely utilize the lignocellulosic polymer glucuronoxylan (GX n ). Previous studies showed that growth on minimal media supplemented with polymeric xylan was preferred to that on simple sugars such as xylose, glucose, or arabinose [1,2]. These studies indicated that efficient xylan utilization is attributable, in part, to a 157 kDa GH10 β-1,4-endoxylanase (Xyn10A 1 ) containing carbohydrate binding modules (CBM) for binding to polysaccharides and surface layer homology (SLH) domains for cell-association. The efficiency of utilization was such that the products of xylan hydrolysis were rapidly assimilated as they were formed [2,3]. These early findings suggest that Pjdr2 utilizes glucuronoxylan in a vectorial manner with an unidentified mechanism for coupling surface localized polymer hydrolysis to rapid oligoxyloside transport into the cell.
PCR screening for genes encoding enzymes typically involved in utilization of GX n led to the identification of the aldouronate utilization gene cluster (Fig. 1a). This cluster of genes encodes three intracellular glycoside hydrolases, a GH67 α-glucuronidase, a GH10 endoxylanase and a GH43 β-xylosidase (Agu67A, Xyn10A 2 and Xyn43B 1 , respectively), as well as regulatory proteins and ABC (ATP-binding cassette) transporters ( Fig. 1a and Additional file 1).
Through in silico analysis this gene cluster is predicted to contain multiple promoters and catabolite repression elements (cre) although the entire region has only a single detected terminator following the last gene, xyn43B 1 . Importantly, the aldouronate utilization gene cluster was coordinately regulated with the distally located xyn10A 1 , supporting the role of this large surface anchored xylanase in membrane localized glucuronoxylan hydrolysis [4]. More recent studies showed that Pjdr2 utilized polymeric GX n at a rate 2.8 times higher than the GX n derived aldouronate, aldotetrauronic acid, suggesting a functional coupling of primary xylan hydrolysis to oligosaccharide transport into the cell [3].
Analysis of the sequenced genome [5] for carbohydrate active enzymes (CAZy) through the CAZy database (http://www.cazy.org/) [6] identified a number of genes encoding enzymes predicted to be involved in xylan utilization. These include five genes encoding GH10 and one encoding GH11 xylanases with expected, and in three cases demonstrated, endo-β-1,4-xylanase activity for xylan main chain hydrolysis. Genes encoding accessory enzymes for xylan depolymerization including β-xylosidases, αglucuronidases, α-L-arabinofuranosidases, and acetyl esterases are also predicted to be numerous. For example, the genome of Pjdr2 contains 25 genes encoding a broad class of GH43 enzymes including β-xylosidases and α-L-arabinofuranosidases. There are also four genes encoding α-glucuronidases of which three are in the Fig. 1 Polysaccharide utilization regulons in Pjdr2. Genomic organization of xylan utilization genes (a) assigned a role in xylan utilization in Pjdr2. For a complete list see Additional file 1 [1]. Genetic organization of barley β-glucan (b) and starch (c) utilization regulons consisting of genes encoding extracellular multi-modular cell-associated or secreted glycoside hydrolases for depolymerization, ABC transporters for assimilation of the generated oligosaccharides, and intracellular glycoside hydrolases for further processing and metabolism. SBP, solute binding protein; IMP, inner membrane protein; BPD, binding protein dependent; GH, glycoside hydrolase. Locus tag annotated as Pjdr2_#### abbreviated to only consist of the numeric portion, #### GH115 family and one in the GH67 family. Recombinant enzymes encoded by several of these genes have been characterized [2,3,7]. Several other CAZy families commonly involved in xylan utilization are also highly represented. The cumulative potential xylan degrading capacity of Pjdr2 supports its demonstrated abilities for complete xylan utilization.
More recent efforts have focused on transcriptomic analysis of Pjdr2 while growing on xylan substrates. Proteins involved in xylan utilization were identified based on gene expression levels [1]. Primary findings of that study revealed that Pjdr2 secretes only two endoxylanases for xylan depolymerization. The resulting mixture of oligosaccharides generated by Xyn10A 1 [2] and Xyn11 (unpublished) consists of neutral oligoxylosides and, depending on the xylan type, a mixture of substituted xylooligosaccharides including aldouronates and similarly substituted oligoarabinoxylosides. Genes predicted to encode ABC transporters were found to have increased transcript levels during growth on xylans and it was proposed that these enable transport of this complex mixture of oligoxylosides into the cell for intracellular hydrolysis to monomeric sugars for catabolism [2][3][4]8]. The observed high expression level of genes encoding intracellular oligosaccharide degrading accessory enzymes during growth on xylans supported this hypothesis (Additional file 1). Comparative transcriptomics following growth of Pjdr2 on dicot-derived hardwood (sweetgum) GX n and a monocot-derived grass (sorghum) glucuronoarabinoxylan (GAX n ) indicated that systems with distinct enzymes and ABC transporters are employed for utilization of oligoarabinoxylosides as compared to aldouronates [1].
Based on the earlier physiological characterization and the most recent transcriptomic study, the proposed model for xylan utilization indicates that Pjdr2 relies on transport and intracellular degradation of xylooligosaccharides. This system in Pjdr2 shows similarity to the polymeric sugar utilization by other bacteria including Geobacillus, Thermotoga and Clostridium [9][10][11] and stands in contrast to the more classical paradigm for fungi that requires complete extracellular conversion of polysaccharides to monosaccharides [12]. The role of the cell-associated endoxylanase in the xylan utilization systems represents an alternative paradigm to that observed in cellulolytic bacteria such as Clostridium thermocellum in which glycoside hydrolases comprise a cell-associated complex as opposed to individual enzymes [13][14][15][16].
Recent studies show that Pjdr2 is also capable of efficient utilization of other biomass derived polysaccharides including barley β-glucan and starch [17,18]. Genome analysis indicates that these polysaccharide utilization systems include extracellular glycoside hydrolases with modular architecture for cell-association and carbohydrate binding. We present here an overview of a broad transcriptomic study characterizing Pjdr2 gene regulation in response to growth on barley β-glucan and starch as well as their constituent disaccharide sugars, cellobiose and maltose. The results are additionally considered in regards to the previously studied xylanutilization system, providing a comparison of these three polysaccharide-utilization systems with respect to transport and catabolism of the products of depolymerization as well as their monosaccharide constituents. Comparison of these three polysaccharide utilization systems of Pjdr2 indicate a reliance upon cell-associated glycoside hydrolases with CBM's for interacting with polysaccharides and SLH domains for cell-association. Furthermore, identification of 29 genes within the Pjdr2 genome encoding proteins involved in carbohydrate utilization that contain sets of SLH domains supports an evolutionary path leading to the secretion of cellassociated glycoside hydrolases. This system is efficient in the depolymerization of polysaccharides at the cell surface and is found in Pjdr2 as well as related bacteria including Clostridium [19][20][21], Caldicellulosiruptor [22] and Thermoanaerobacter [23].

Experimental design
For this transcriptome study, we sought a greater understanding of how Pjdr2 utilizes polymeric sugars. Genome analysis and polysaccharide growth studies supported efficient utilization of the polysaccharides soluble β-glucan and starch. Through bioconversion these abundant biomass-derived sugar polymers may contribute to the production of value-added chemical or fuels. To obtain a broad understanding of how these polysaccharides are utilized by Pjdr2, total RNA was prepared from earlymid exponentially growing cultures growing on these polysaccharides as well as their limit enzymatic hydrolysis products and their constituent simple sugars. The sample preparation and RNA-seq data acquisition portions of this work overlap with a recently published xylan utilization transcriptome and the results presented here are compared with this earlier work to provide perspective, draw conclusions and identify themes which define the efficient manner in which Pjdr2 utilizes polysaccharides [1]. The saccharides used in the study that provided the final comparative data set include barley βglucan (B) and cellobiose (C) representing β-configured glucans, starch (S) and maltose (M) representing αconfigured glucans, sweetgum glucuronoxylan (SG) and sorghum glucuronoarabinoxylan (SO) representing different xylan types and the constituent monosaccharides of these polysaccharides, including glucose (G), xylose (X) and arabinose (A). Further, each condition was routinely compared to a yeast extract (YE) control condition which consisted of 0.5 % YE without added carbohydrate and a sweetgum xylan with no YE (SGnoYE) control. Throughout the manuscript where specific genes are considered, their respective transcript levels or their encoded proteins are routinely identified with an accompanying abbreviated locus tag accession number consisting of the four digit number, e. g. the locus tag Pjdr2_0001 would be described as 0001. The total data set normalized RPKM (Reads Per Kilobase per Million reads sequenced) values were compared by fold changes taken from the ratio of the condition in consideration over the YE control condition unless otherwise stated. Data was judged to be significant given a 4-fold change and a p-value, < 0.05.

Genes involved in barley β-glucan utilization
Recent studies have shown that Pjdr2 may utilize soluble β-glucans [18]. Barley β-glucan consists of a linear polysaccharide chain of β-1,4 linked glucose frequently and regularly interrupted with β-1,3-linked glucose [24]. This polysaccharide lacks side chain substitutions such as those found in xylan, hence the extracellular degradation of barley-β-glucan is correspondingly less complex, presumably requiring fewer enzymes. The genome of Pjdr2 encodes three GH16 enzymes (genes 0951, 0952, and 0824) annotated as licheninases or laminarinases through domain analysis. All three enzymes are predicted to be secreted and while one consists of a singular GH16 catalytic module, the other two have an extensive multimodular architecture. Both of these modular enzymes (genes 0951 and 0824) contain triplicate N-terminal SLH domains, presumably for cell surface localization, and multiple CBM's similar to those observed for the Xyn10A 1 enzyme involved in xylan utilization (Fig. 4).
During growth on barley-β-glucan, the genes encoding the multimodular Bgl16A 1 (gene 0951) and the non-modular Bgl16A 2 (gene 0952) increased 80-fold and 25-fold respectively compared to growth on the yeast extract control (YE) not supplemented with carbohydrate ( Table 1). The bgl16A 3 gene (0824) encoding the second large multimodular GH16 enzyme is expressed only at low levels on all substrates tested. Bgl16A 1 shares 34 % amino acid similarity with the catalytic domain of the GH16 laminarinase from Thermotoga maritima (UniProt accession: Q9WXN1) [25] and Bgl16A 2 shares 71 % similarity to a probable licheninase from Bacillus subtilis (UniProt accession: P04957) ( Table 2) [26]. In support of these annotations, recombinant Bgl16A 1 has been shown to have activity against barley β-glucan and laminarin as both substrates contain the requisite β-1,3-glucan linkage while recombinant Bgl16A 2 shows its highest activity on barley β-glucan [18].
Hydrolysis of barley β-glucan with GH16 laminarinase and licheninase enzymes is expected to liberate β-1,3/1,4 mixed linkage glucooligosaccharides. Increased transcript levels for two predicted ABC transporter gene cassettes were observed during growth on barley β-glucan compared to the YE control. The first cassette consisting of genes 0949, 0950 and 0953, flanks the enzyme encoding genes bgl16A 1 and bgl16A 2 described above and showed greater than a 1200-fold increase in transcript levels during growth on barley β-glucan (Table 3). These barley β-glucan utilization genes constitute an apparent operon specifically responsive to growth on soluble β-1,3 (4)-glucans and no other tested substrate. This operon is directly linked to a β-glucan-responsive set of putative transcriptional regulators (genes 0947 and 0948) located immediately upstream but transcribed in the opposite direction. Together, these seven genes, 0947 through 0953, constitute the glucan utilization gene cluster (Fig. 1b). The second ABC transporter gene cassette consisting of genes 5314, 5315, and 5316 has increased expression on barley β-glucan and was also increased on xylan [1]. This overlapping regulation will be discussed below.
Comparison of the barley β-glucan utilization system described here to the xylan utilization system described previously [1] implies a missing enzymatic component for barley β-glucan utilization. Pjdr2 appears to transport the mixed linkage oligosaccharide products of the two secreted GH16 endoglucanases in a manner similar to xylan utilization. However, unlike the defined intracellular oligosaccharide processing in the xylan utilization systems, there is no clear evidence for increased expression of genes encoding enzymes for intracellular hydrolysis of glucooligosaccharides contributing to the barley β-glucan utilization system. Genome analysis has identified several genes encoding enzymes which could be involved in the further processing of intracellular glucooligosaccharides (e.g. the genome encodes fifteen GH3 enzymes), but none of these genes are confidently assigned to this role based on the transcriptomic data. One candidate, gene 0317, encoding an intracellular GH3 β-glucosidase attains elevated transcript levels with growth on barley-β-glucan and cellobiose. The comparatively high expression level on YE resulted in a limited relative increase of just 2.7-fold with barley β-glucan, although compared to growth on arabinose or glucose this gene had a 10 and 5.4-fold increase in expression, respectively (Table 1).

Genes with increased expression during growth on cellobiose
Cellobiose is thought to represent a primary limit product of mixed linkage β-glucan utilization and was chosen for study to discriminate between utilization of the barley β-glucan polymer and its hydrolysis products.   Cellobiose as a growth substrate results in increased expression of several genes encoding putative ABC transporters in Pjdr2. The genes 5960, 5961 and 5962 show greater than 118-fold increase in expression on cellobiose relative to YE. Both glucose and barley β-glucan also induce the genes encoding this transporter although to a lower extent (Table 3). From the transcriptome data it is not known precisely how cellobiose is converted to glucose for entry into glycolysis. As detailed above for the intracellular processing of glucooligosaccharides resulting from barley β-glucan utilization the putative intracellular GH3 β-glucosidase (gene 0317) may serve this role. This gene is expressed on cellobiose at nearly the same increased level as found with growth on barley β-glucan relative to other sugars, but not YE (Table 1). In addition, gene 0750 encoding a putative intracellular β-xylosidase (Xyn43B 2 ) that was earlier predicted to be involved in xylan utilization due to its 100-fold increase on xylan relative to YE (Additional file 1) is found in this work to be increased 177-fold during growth on cellobiose [1] (Table 1). This gene may encode the enzyme primarily responsible for hydrolysis of cellobiose. If so, this putative xylosidase either has dual substrate specificity or it actually encodes a GH43 β-glucosidase the expression of which is induced by cellobiose and to a lesser extent xylobiose. The GH43 family does not as yet contain an enzyme with a reported β-glucosidase activity. The expression of xyn43B 2 is also increased on barley β-glucan by 3-fold (Table 1) relative to YE and may contribute as well to the hydrolysis of the glucooligosaccharides derived from this polymer.

Genes involved in starch utilization
Pjdr2 grows very efficiently on starch [17]. Utilization of this α-1,4-linked glucose storage polysaccharide appears similar to barley β-glucan as this polysaccharide is also chemically simple relative to xylans with fewer enzymes required for degradation to glucose. The genome of Pjdr2 encodes four GH13 amylases. Three of these, Amy13A 1 , Amy13A 2 , and Amy13A 3 have significantly increased transcript levels ranging from 55-fold to over 100-fold increased expression during growth on starch (Table 1). Both Amy13A 1 (gene 0774) and Amy13A 2 (gene 5200) are predicted to be secreted and primarily responsible for endo-hydrolysis of native starch. The amy13A 2 gene encodes a large multimodular enzyme including SLH domains and CBM's for cell surface proximal substrate localization (Fig. 4) while amy13A 1 encodes only a catalytic domain. The starch utilization system in Pjdr2 also has a predicted intracellular amylase, Amy13A 3 (gene 0783) presumably to complete the degradation of the transported, intracellular maltodextrins. Amy13A 1 shares 46 % amino acid sequence identity with the extracellular amylase from Bacillus megaterium (UniProt accession: P20845) ( Table 2) [27,28] and Amy13A 2 shares 33 % identity over a large portion of its modular sequence with an amylopullanase from Thermoanaerobacter pseudethanolicus (UniProt accession: P38939) [29]. Amy13A 3 shares 47 % amino acid identity with an intracellular maltogenic amylase from B. subtilis (UniProt accession: O06988) [30] where it is thought to function in the conversion of maltotriose and larger maltodextrins to maltose and glucose ( Table 2).
A single ABC transporter gene cassette showed increased transcript levels during growth on starch relative to YE (Table 3). The genes for this transporter (genes 0771, 0772 and 0773) are just upstream of amy13A 1 (0774) and form a predicted operon (Fig. 1c). The solute binding protein (gene 0771) of this transporter shares 33 % amino acid identity with a maltodextrin binding protein from Bacillus subtilis 168 (Table 2) [31]. This putative maltose/maltodextrin ABC transporter gene cassette was shown to be markedly up-regulated during growth on xylans (Fig. 3b, Additional file 2) [1]. However, the genomic localization of this ABC gene cluster within a predicted operon containing the gene encoding extracellular amylase suggests its primary function is that of a maltodextrin transporter (Fig. 1c). This overlap in regulation will be further discussed below.
The high amino acid sequence identity between Amy13A 3 and the maltogenic amylase from B. subtilis suggest that this enzyme might process transported maltodextrins to glucose and maltose [30]. As a component of a complete starch utilization system, gene 1149 encodes a putative α-glucan phosphorylase (MalP) allowing for phosphorolytic cleavage of intracellular maltose [32,33]. Expression levels of this gene on starch compared to YE yielded insignificant results (p-value, 0.403), but a 9.1-fold transcript increase is observed relative to growth on glucose (p-value, 0.004) ( Table 1). Transcript data for the fourth predicted amylase encoding gene, 1045, was not considered statistically significant (p-value, 0.635) and did not appear to exhibit dynamic regulation on starch, barley β-glucan or xylan and linear RPKM values were comparatively low (1.2-1.6).
Genes with increased expression during growth on maltose As the primary hydrolysis limit product of starch, maltose was included in this study to distinguish physiological features for efficient starch utilization. The transporter genes described above as part of the putative starch utilization operon are also upregulated. In addition, genes 5589, 5590 and 5591 encoding a second ABC transporter are up-regulated approximately 10-fold on maltose over YE (Table 3). Once internalized, maltose would be expected to follow the pathway similar to that predicted in starch utilization; however, for this growth condition expression of the gene encoding the MalP protein is not increased relative to any other growth condition.
This finding reveals a difference between the intracellular processing of maltodextrins derived from starch hydrolysis by the surface localized multimodular Amy13A 2 and maltose directly assimilated. A focused search failed to identify homologs of genes known for the conversion of maltose or maltose-6-phosphase (e. g. glucose phosphorylase or 6-phospho-alpha-glucosidase). Two other genes upstream of those encoding the maltose specific transporter identified above code for proteins annotated as an oxidoreductase (gene 5587) and a hypothetical protein (gene 5588) and the predicted operon appears related to the thuAB encoding operon involved in trehalose utilization in Agrobacterium tumefaciens [34] (Table 1). This suggests that Pjdr2 converts maltose to 3-keto-maltose.

Monosaccharide assimilation and metabolism
From genome analysis [5], intracellular metabolism of the hexose, glucose, and the pentoses, xylose and arabinose, are expected to follow through the Embden-Meyerhof-Parnas (EMP) pathway and pentose phosphate pathway (PPP), respectively, for entrance into the tricarboxylic acid (TCA) cycle. Following transport of arabinose through the previously identified arabinose responsive ABC transporter [1], this sugar may be converted to ribulose-5-phosphate by the arabinose isomerase and ribulose kinase enzymes. In Pjdr2, the gene 2502 (Table 4) attains a 24-fold increase in transcript level with growth on arabinose and 4.9-fold on sorghum MeGAX n (Additional file 2). Based on transcript levels the candidate ribulose kinase enzyme is encoded by gene 4209. This enzyme is a distant homolog (~21 % ID) to the AraB protein which is a component of the L-arabinan utilization system of Geobacillus stearothermophilus [35] (Table 2). Transcript levels of gene 4209 are increased 17-fold on arabinose (Table 4) and 3-fold on sorghum MeGAX n relative to YE (Additional file 2). This gene does not show an increased transcript level on other carbohydrate growth conditions used in this study. The genes 0977, 0978 and 0979 encoding an ABC transporter are primarily responsive to xylose resulting in an average of 135-fold increase in transcript level relative to YE (Table 4). These genes also showed significant but much lower fold increases on glucose and arabinose. Additionally, a predicted symporter encoded by gene 1340 shares 49 % identity with the AraE xylose and arabinose symporter in B. subtilis (Table 2) [36]. Expression of this gene is responsive to xylose resulting in a 162-fold increase. This gene is also expressed on cellobiose (Additional file 2), glucose and arabinose although to a much lower extent than observed on xylose ( Table 4). Conversion of xylose to xylulose-5-phosphate follows a similar path as arabinose since genes encoding xylose isomerase (gene 5159) and xylulose kinase (gene 5158) result in nearly a 100-fold and 63-fold increase in expression, respectively, on xylose compared to YE controls (Table 4). Growth on xylans also resulted in transcript increases of 35-fold and greater for these two genes (Additional file 2).
While the genes that encode the transporters that import xylose and arabinose can be identified based on homology and increased transcript levels, a system for efficient glucose assimilation is less apparent. Genes encoding three putative ABC transporters showed increased transcript levels with growth on glucose, but for only two of these ABC transporters (genes 0472, 0473 and 0474 and genes 2400, 2401 and 2402) is it possible that glucose may be the target sugar for transport. For both of these transporter gene sets transcript is increased not only on glucose, but also similarly increased on arabinose, xylose and cellobiose (Tables 3 and 4). The 0472-0474 gene set is increased more significantly at approximately 20-fold relative to the YE control, while the 2400-2402 gene set is just greater than the significance cutoff of 4-fold (Table 4 and Additional file 2). The third ABC transporter gene set whose transcript is significantly increased with growth on glucose (genes 0977, 0978 and 0979) is assigned as a xylose transporter. While its expression is significant with growth on glucose, it is very low relative to growth on xylose. Analysis for phosphotransferase systems (PTS) reveals two operons (gene sets 2007-2010 and 6221-6226) encoding all protein components of a complete PTS system. Based on homology, the 6221-6226 gene set appears very likely to be a mannitol transport system, while the 2007-2010 set encodes a EIIA component (gene 2010) which is annotated as a glucose superfamily transporter, and a separate protein product (gene 2009) encoding the EIIBC components annotated as an N-acetylglucosamine specific transporter (Additional file 2). None of the genes encoding the complete PTS system components have increased transcript responsive to growth on glucose relative to YE. Interestingly, two unlinked PTS system components (gene 3804 annotated as an Enzyme I complex and gene 0174 annotated as HPr phosphocarrier protein) have relatively high constitutive expression levels, but their roles are unclear (Additional file 2). The analysis for potential glucose specific transporters is not conclusive from this data. Once transported into the cell, conversion of glucose to glucose-6-phosphate for entry into glycolysis appears to be mediated by only a single enzyme: a glucokinase (gene 0170) which yields an average RPKM value of 128 ± 15 over all the tested growth conditions (Table 4). This physiological data underscores original research which showed that Pjdr2 does not efficiently utilize simple sugars in minimal salt media [2].
Overlapping regulation: starch > xylan > soluble β-glucan Unexpectedly, the combined data for barley β-glucan, starch and xylan reveals a regulatory connection for utilization of these polymers. This can be seen in quantitative comparisons of the expression of genes encoding the secreted multimodular GH16, GH13, and GH10 endolytic enzymes and those encoding their associated substrate binding proteins that serve as a representative of the specific ABC transporter for the saccharides generated by these enzymes on the cell surface (Fig. 2). Growth on xylans, both GX n and GAX n , supports the enhanced expression of genes associated with utilization of xylans and starch but not those associated with the utilization of soluble β-glucan. While barley β-glucan induces genes related to its extracellular degradation and assimilation, these results show that it also induces 8 of the 13 glycoside hydrolase genes involved in xylan utilization (Table 1) [1]. Furthermore, while growth on xylan does not induce any soluble β-glucan utilization genes it does induce genes encoding all of the GH13 α-amylases and the ABC transporter considered to be involved in starch depolymerization and transport for utilization ( Fig. 3b and Additional file 2) [1] with the exception of the putative α-glucan phosphorylase gene, malP. Following growth on starch, Pjdr2 does not induce genes for either xylan or barley β-glucan utilization. These relationships are represented in the heat map shown in Fig. 3b in which expression of genes encoding ABC transporter proteins as well as accessory enzymes for intracellular metabolism of assimilated oligosaccharides are shown. These findings may be due to a metabolic substrate preference in a manner similar to glucose mediated catabolite repression, or result from evolved enzyme systems for utilization of polysaccharides that are typically associated. In cereal grains, these three carbohydrates can be found together, with xylan and β-glucan localized more to the cell wall and outer layers, and the starch consolidated in the endoplasm [37]. The model that currently describes this relationship (Fig. 3a) can be described as starch first, xylan second and barley β-glucan third. From the observed coordinate gene expression, Pjdr2 appears prepared to utilize multiple polysaccharides (Fig. 3b).
In consideration of the expression of the genes encoding intracellular xylanases, e.g. Xyn43B 1 and Xyn8 [1] on barley β-glucan, it is possible that these enzymes may have a bifunctional role in the intracellular hydrolysis of β-glucooligosaccharides, thereby, providing an additional route for the intracellular processing of the barley β-glucan derived glucooligosaccharides and cellobiose.
Other genes involved in transport also show overlapping regulation. The predicted ABC transporter previously annotated as a "multiple sugar transport system" consisting of the genes 5314, 5315 and 5316 is shown to have increased expression on barley β-glucan ( Table 3) similar to that observed for xylan [1]. This is the only ABC transporter gene set that follows the pattern of expression during growth on barley β-glucan as that observed for the xylan specific glycoside hydrolase genes (Fig. 3b), supporting the possibility that it might be bifunctional in substrate recognition. One other gene encoding an ABC transporter component also follows this pattern. Gene 1322 (Table 3) encoding an inner membrane component, UgpE (BPD transport system IMP, Fig. 1a), of the aldouronate utilization gene cluster [1,4] Fig. 3 Overlapping regulation of polysaccharide utilization genes in Pjdr2. Schematic representation (a) of the regulatory connections between the studied polysaccharide substrates. Growth condition responsive genes (b) for barley β-glucan, starch and xylans were compared by hierarchical clustering relative to expression on the yeast extract control. High expression, red; low expression, blue. LT, Locus tag annotated as Pjdr2_#### abbreviated to only consist of the numeric portion, #### β-glucan. Studies are underway to elucidate these overlapping regulatory connections.

Overlapping regulation: cellobiose and xylobiose
Some genes with increased transcript levels during growth on cellobiose were found to also have increased expression levels with growth on xylans. The gene cluster 5596, 5597 and 5598 (Table 3) encodes an ABC transporter annotated as an "unknown carbohydrate transporter" and has been assigned a potential role in xylan utilization based on increased transcript levels (Additional file 1) [1]. Analysis of growth on cellobiose indicates these genes are expressed at a level comparable to that on xylan. A corollary to this finding is the observation that growth on cellobiose also resulted in increased transcript levels for genes 0728, 0729 and 0730 encoding an ABC transporter previously assigned a putative function in xylooligosaccharide (X 2 and X 3 ) transport [1]. From the similar level of expression on both xylan and cellobiose, it is proposed that these transporters may be specific for disaccharides such as cellobiose and the primary neutral product of enzymatic xylan hydrolysis, xylobiose. These findings indicate that the transporters of β-configured oligosaccharides may be promiscuous in their substrate recognition.

Proteins with SLH domains
Enzyme systems utilized by Pjdr2 for the extracellular processing of xylan, barley β-glucan and starch share a common theme. These systems include extracellular cellassociated multimodular glycoside hydrolases to generate oligosaccharides that are released in close proximity to the bacterial cell wall. This functionality is mediated by surface layer homology (SLH) domains that anchor the enzymes to the cell surface [2,38,39] and carbohydrate binding modules (CBM) that presumably associate the enzyme with the target polysaccharide [40]. These cellsurface proximal oligosaccharides are then efficiently transported with substrate specific ABC transporters and further hydrolyzed within the cell to monosaccharides for introduction into catabolism. This surface localization represents a strategy for competitive utilization of these polysaccharides. As part of this work we sought to define the roles of SLH domains in Pjdr2 in the processing of plant polysaccharides.
In total, there are 77 genes encoding proteins with regions homologous with SLH domains. Of these, 73 have two or more consecutive SLH domains which is the minimum set thought to be required for tight binding to the cell wall [41]. Of the 77 SLH domain containing proteins, 29 are predicted to be involved in carbohydrate processing (Table 5) as indicated through domain analysis. In this smaller set, the average calculated protein size is nearly 193 kDa and the average predicted pI is 4.74 with a standard deviation of just 0.10. Domain and BLASTp analysis (Table 5) shows the diversity of functions of associated carbohydrate active enzymes among these SLH proteins (Table 5). In the current transcriptomic data set, only a single SLH-bearing gene has been identified being involved in the catalysis of either xylan (gene 0221), barley β-glucan (gene 0951) or starch (gene 5200) utilization from the 29 identified SLH-encoding carbohydrate processing genes (Table 5). Of the other SLH domain containing proteins in this list it can be seen that Pjdr2 may utilize numerous other polysaccharides with the same strategy. Some of these include arabinan, galactomannan, chitin, pectin and hyaluronan (Fig. 4).

Polysaccharide utilization in Pjdr2
From the studies presented here for the utilization of barley-derived β-glucan and starch, we observe a similar strategy evolved by Pjdr2 as illustrated in the earlier xylan transcriptome report [1]. In each case coordinately expressed gene sets have been identified (Fig. 1) and central to each encoded enzyme system is a multimodular glycoside hydrolase containing carbohydrate binding modules which afford interaction with polysaccharide substrates and a triplicate set of SLH domains for cell surface localized formation of oligosaccharides (Fig. 4).
For processing of β-1,3(4)-glucans, contiguous genes encoding transcriptional regulators, ABC transporters, the multimodular cell-associated Bgl16A 1 and the secreted non-modular Bgl16A 2 catalytic domain along with an associated ABC transporter comprise a β-glucan utilization regulon. In this case both secreted enzymes digest barley β-glucan to tri-, tetra-, penta-and hexasaccharides, and laminarin to mono-, di-, tri-and tetrasaccharides indicating similar functions for both enzymes [18]. These oligosaccharides resulting from extracellular barley β-glucan hydrolysis and cellobiose (from either barley β-glucan or growth on cellobiose) are presumably transported into the cell where they are subsequently degraded to monosaccharides by the action of a GH3 endoglucanase and/or a novel GH43 enzyme (Xyn43B 2 ) with β-glucosidase functionality.
For starch processing, a regulon encoding a putative maltodextrin ABC transporter together with the nonmodular Amy13A 1 managed starch utilization. Encoded distally, the multimodular cell-associated amylase Amy13A 2 likely produces small maltodextrins proximal to the cell surface. These may then be taken up and processed by intracellular maltogenic Amy13A 3 to yield maltose and glucose. Final conversion of maltose is thought to occur through the action of an α-glucan phosphorylase yielding glucose and glucose-1 phosphate.
For the soluble β-glucan, starch and xylan utilization systems, two endo-acting hydrolases may work synergistically with each other for efficient depolymerization of the specific polymeric substrate to oligosaccharides. The modular property of the larger enzyme allows  Annotated in the CAZy database as a GH115 f Secretion was deduced by detection of a signal peptide using the Signal-P server g Molecular weight (MW) and isoelectric point (pI) predictions were obtained through the ProtParam tool available through the ExPASy web server h Putative function is based either on the predicted function from domain assignment or a justification for assignment as a protein that is involved with sugar manipulations generation of oligosaccharides close to the cell surface without diffusion into the medium and hence couples the depolymerization process with assimilation by ABC transporters for intracellular processing and metabolism. These systems for polysaccharide utilization with minimized secretion of extracellular glycoside hydrolases coupled to transport of oligosaccharides in lieu of simple monomeric sugars potentially affords a significant conservation of cellular energy in the form of ATP as described for the processing of cellulose by C. thermocellum.
Based on increased expression levels of genes during growth on multiple polysaccharides a regulatory connection is observed between utilization of barley β-glucan, starch and xylans. Barley β-glucan induces genes involved in extracellular depolymerization and assimilation specific to soluble β-1,3(4)-glucan. However, it also induces many of the genes shown to play a prominent role in the xylan utilization systems [1]. Although xylans do not induce genes specific to barley β-glucan utilization, they do induce genes belonging to the starch utilization system. When Pjdr2 was grown on starch, no genes specific to xylan or barley β-glucan utilization were found to be induced. These studies show the transcriptional induction and repression strategies evolved in Pjdr2 for utilizing a variety of polysaccharides.
Interestingly, induction of the starch utilization genes with growth on xylan results in increased expression of amy13A 1 and amy13A 3 while the amy13A 2 gene encoding the large surface anchored amylase is expressed just enough to meet the significance selection cutoff (4-fold). This same pattern is also observed with growth on maltose. It appears the elevated expression of the amy13A 2 gene is specific for starch and the non-starch substrates which activate the expression of the starch utilization regulon (including amy13A 3 ) may poise Pjdr2 for rapid response to starch availability.
SLH domains appear to play a vital role in interaction of Pjdr2 with its native environment. The 77 SLH domain-containing proteins encoded in the genome of Pjdr2 highlight the expanded use of this domain for cell wall associations and also hints to a modus operandi, at least regarding an approach to polymeric substrate utilization.

Conclusion
The genome of Pjdr2 comprises genes encoding extracellular cell-associated depolymerizing enzymes to bioprocess various plant polysaccharides and these include xylans, soluble β-glucans, starch, and also arabinans and galactomannans (Fig. 4). The polysaccharide utilization systems in Pjdr2 serve as potential candidates for further evaluation or for introduction into other related fermentative bacteria to serve as biocatalysts to achieve direct conversion of non-cellulosic biomass to desired products. Preliminary studies have shown the ability of Pjdr2 to produce fermentative products including lactate, acetate, and ethanol from xylans, β-1,3(4)-glucans, and starch, under oxygen limiting conditions (unpublished). The potential of Pjdr2 to produce individual cellassociated glycoside hydrolases for processing noncellulosic polysaccharides is an alternative strategy to the cell-associated cellulosome complexes evolved by cellulolytic Clostridium [14]. Pjdr2 may be considered for direct bioprocessing of hemicelluloses or may be co-cultured with cellulolytic organisms tolerant of  Table 5. Coding sequence locus tag accession numbers are provided as Pjdr2_#### microaerophilic conditions for conversion of biomass to targeted products. Pjdr2 is a candidate for further development as a biocatalyst for consolidated bioprocessing of biomass derived from energy crops and agricultural residues to targeted biofuels and chemicals.

Reagents
The carbohydrates xylose, arabinose, glucose, maltose and cellobiose were of the highest purity available. The starch was purchased from Sigma-Aldrich (St. Louis, MO, USA) and was reported to be pure. Soluble low-viscosity barley β-glucan (Product No. P-BGBL, Lot 100402a) was purchased from Megazyme International (Wicklow, Ireland) and was reported to contain < 0.1 % arabinoxylan and < 0.31 % starch. The SG and SO xylans were purified from ground sweetgum wood [2,43] and sorghum stalk bagasse [7] as previously described [44,45] using standard procedures.

Growth of Pjdr2
Pjdr2 was routinely cultivated and growth for RNA isolation as described previously [1][2][3][4]. A total of 11 growth conditions were considered in this study. These include barley β-glucan (B) and starch (S) along with their representative dimeric and simple sugars cellobiose (C), maltose (M) and glucose (G), respectively. The sample preparation and RNA-seq data acquisition portion of this manuscript overlaps with a recently published xylan utilization transcriptome [1] which studied the polysaccharide and simple sugar substrates sweetgum wood glucuronoxylan (SG), sorghum stalk glucuronoarabinoxylan (SO), xylose (X) and arabinose (A). Together these paired studies included yeast extract (YE) and sweetgum wood glucuronoxylan without YE supplementation (SGnoYE) as control growth conditions for a total of eleven conditions studied. The characteristics of all growth conditions were defined prior to RNA studies to determine the early mid-exponential phase for harvesting the cells for RNA isolation. Cells were inoculated into 2 ml of 1 % yeast extract (YE) with Zucker-Hankin (ZH) [46] salt medium in 16×100 mm culture tubes and grown overnight at 30°C, with an orbital rotation of 250 rpm or with a Roto-torque positioned at a 45°angle set at high mode and speed 8. After 24 h the optical density at 600 nm (OD 600 ) was measured and cells were harvested (13,000 rpm, 1 min) to start a sub-culture with 2 % inoculum in 15 ml of 1 % YE in ZH medium in a 250 ml flask. These cultures were grown at 30°C and 250 rpm using a G-2 gyrotary shaker (New Brunswick Scientific) for approximately 6 h to an OD 600 of 0.5-0.8. The cells were harvested to make an initial inoculation with starting OD 600 of 0.04 in 15 ml of desired growth media for this study. These sample cultures were then grown at 30°C and 250 rpm until the empirically predetermined early to mid-exponential harvest time for a given condition, generally OD 600 0.4-0.8 for carbon supplemented and 0.25 for yeast extract. Growth conditions for RNA sequencing (RNA-seq) transcriptomic analysis consisted of ZH media with 0.5 % carbohydrate and 0.5 % YE, except in two control conditions. The YE control consisted of 0.5 % YE with no carbohydrate and SG control (SGnoYE) with 0.5 % sweetgum GX n in ZH medium without YE. The culture aliquots used for RNA isolation were streaked onto xylan agar plates to confirm the purity of the cultures.

RNA isolation
For each of the 11 growth conditions, three parallel cultures were grown for a total of 33 RNA isolations. The amount of cells harvested from each culture was determined through empirical analysis of previous growth studies. Total RNA was isolated from early midexponential growing cultures using the RNeasy Protect Bacteria Mini Kit from Qiagen (Valencia, CA, USA) without the use of the RNA Protect reagent. Cells were lysed according to protocol four in the RNAprotect Bacterial Reagent Handbook (2nd edition) with application of lysozyme and Proteinase K as instructed. RNA was purified from the resulting cell lysate using the RNeasy column (protocol seven of the handbook) with oncolumn RNase-free DNase treatment to remove traces of DNA (Appendix B of the handbook) or in some cases using the TURBO-DNA-free kit from Ambion (Life Technologies, Carlsbad, CA, USA). Total RNA was quantified by absorbance at 260 nm and the purity was assessed with the 260/280 nm absorbance ratio. Absence of DNA in the RNA preparations was verified by PCR.
In some cases the RNeasy on-column DNase treatment was applied and treated again if required. The RNA preparations were submitted for Bioanalyzer (Agilent, Santa Clara, CA, USA) analysis at the University of Wisconsin Biotechnology Center or The University of Florida Interdisciplinary Center for Biotechnology Research to verify the absence of RNA degradation. The RNA specific quantification of the samples was performed with the Qubit fluorimeter (Life Technologies, Carlsbad, CA) prior to sample submission to the Joint Genome Institute (JGI), Walnut Creek, CA,.
RNA sequencing and data analysis RNA sequencing was performed by the Joint Genome Institute (JGI), US Department of Energy, Walnut Creek, CA, as previously described [1]. Briefly, rRNA-depleted RNA was fragmented using divalent cations and high temperature. Fragmented RNA was reverse transcribed using random hexamers and Superscript II (Invitrogen) followed by second strand synthesis. The fragmented cDNA was treated to allow end-pair A-tailing adapter ligation and 10 cycles of PCR. Libraries were quantified by qPCR. The libraries were sequenced using the Illumina HiSeq sequencing platform utilizing a TruSeq paired-end cluster kit, v3, and 161 Illumina's cBot instrument to generate clustered flow cells for sequencing. Sequencing of the flow cells was performed on the Illumina HiSeq2000 sequencer using a TruSeq SBS sequencing kit 200 cycles, v3, following a 2×100 indexed run recipe. Raw sequence read data was filtered using BBDuk (filterk = 27, trimk = 27; https://sourceforge.net/ projects/bbmap/) to remove Illumina adapters, known Illumina artifacts, phiX, trim Illumina adapters from the right end of the read and quality-trim the right end of the read to Q6. Resulting reads containing one or more 'N' , or with quality scores (before trimming) averaging less than 10 over the read, or length under 33 bp after trimming, were discarded.
The filtered raw data files were analyzed using ArrayStar ver. 12.2 software from DNASTAR (Madison, Wisconsin). The results from all growth conditions (in triplicates) were mapped to the annotated genome, averaged and normalized across the entire 11 conditions for comparisons. ANOVA analysis was performed to assess gene data quality with respect to the global dataset. The final output was provided as RPKM values (Reads Per Kilobase per Million reads sequenced). In ArrayStar, statistical analysis for comparisons of two conditions was performed with the moderated t-test and adjusted p-values were calculated using the FDR (Benjamini Hochberg) method [47]. Unless otherwise stated, the expression of genes discussed in this study is based upon a fold difference relative to YE control. Data with p-values less than 0.05 were considered to be significant.

Gene annotation and analysis
Functional roles were assigned to genes based on analysis by BLASTp from NCBI (http://www.ncbi.nlm.nih.gov/) [48,49], IMG database (http://img.jgi.doe.gov/) or Pfam (http://pfam.xfam.org/) [50]. For genome analysis of the SLH containing genes, the first and third SLH domains of xylanase XynA1 were blasted against the genome and the two resulting datasets combined and made nonredundant. Operon predictions were based on in silico analysis using the PePPER webserver (http://server.molgenrug.nl/) [51]. The genes from the genome of Pjdr2 are identified by their locus tags. The locus tags are identified as Pjd2_####, where #### represents the 4-digit gene number used for gene identification in this study. The filtered raw data was processed using ArrayStar. This processed data is available in the supplemental material (Additional file 2). The data represented in the Tables have expression data rounded off to 1 or 2 decimal points, and the p-values converted to 2 decimal points. Raw data is available as described in Availability of Supporting Data section.