Glucose-lactose mixture feeds in industry-like conditions: a gene regulatory network analysis on the hyperproducing Trichoderma reesei strain Rut-C30

Abstract

Background

The degradation of cellulose and hemicellulose molecules into simpler sugars such as glucose is part of the second generation biofuel production process. Hydrolysis of lignocellulosic substrates is usually performed by enzymes produced and secreted by the fungus Trichoderma reesei. Studies identifying transcription factors involved in the regulation of cellulase production have been conducted but no overview of the whole regulation network is available. A transcriptomic approach with mixtures of glucose and lactose, used as a substrate for cellulase induction, was used to help us decipher missing parts in the network of T. reesei Rut-C30.

Results

Experimental results on the Rut-C30 hyperproducing strain confirmed the impact of sugar mixtures on the enzymatic cocktail composition. The transcriptomic study shows a temporal regulation of the main transcription factors and an impact of lactose concentration on the transcriptional profile. A gene regulatory network built using BRANE Cut software reveals three sub-networks related to i) a positive correlation between lactose concentration and cellulase production, ii) a particular dependence of β-glucosidase regulation on lactose and iii) a negative regulation of the development process and growth.

Conclusions

This work is the first transcriptomic study of the effects of pure and mixed carbon sources in a fed-batch mode. Our study exposes a co-orchestration of xyr1, clr2 and ace3 for cellulase and hemicellulase induction and production, a fine regulation of the β-glucosidase and a decrease of growth in favor of cellulase production. These conclusions provide us with potential targets for further genetic engineering leading to better cellulase-producing strains in industry-like conditions.

Background

Given current pressing environmental issues, research around green chemistry and sustainable alternatives to petroleum is receiving increased attention. A promising substitute to fossil fuels resides in second generation bio-ethanol, an energy source produced through fermentation of lignocellulosic biomass. One of the key challenges for industrial bio-ethanol production is to improve the competitiveness of plant biomass hydrolysis into fermentable sugars, using cellulosic enzymes.

The filamentous fungus Trichoderma reesei, because of its high secretion capacity and cellulase production capability, is the most used microorganism for the industrial production of cellulolytic enzymes. The T. reesei QM6a strain, isolated from the Solomon Islands during the Second World War [1], was improved through a series of random mutagenesis experiments [2–5]. Among the variety of mutant strains, Rut-C30 is currently known as the reference hyper-producer [6, 7], and its cellulase production is 15-20 times that of QM6a [8]. Comparison of the genomes of the Rut-C30 strain and its ancestor QM6a brings to light numerous mutations including 269 SNPs, eight InDels, three chromosomal translocations, five large deletions and one inversion [9–14]. Among them, only a few mutations have been proved to be directly linked to the hyper-producer phenotype [10, 15], the most striking one being the truncation of the gene cre1 [9]. CRE1 is the main regulator of catabolite repression, which mediates the preferred assimilation of carbon sources of high nutritional value such as glucose over others [16]. The truncated form retains the first 96 amino acids, which results in a partial release of catabolite repression [9] and, more surprisingly, turns CRE1 into an activator [17]. While most specificities (mutations, deletions, etc.) of the genetic background of Rut-C30 are seemingly unrelated to the production of cellulases [13], their impact should not be totally neglected and should be assessed according to a dedicated experimental design.

In T. reesei, the expression of cellulases is regulated by a set of various transcription factors. Besides the carbon catabolite repressor CRE1, the most extensively studied is the positive regulator XYR1, which is needed to express most cellulase and hemicellulase genes [18, 19]. Other transcription factors involved in biomass utilization have been characterized: ACE1 [20], ACE2 [21], ACE3 [22], BGLR [15], HAP 2/3/5 complex [23], PAC1 [24], PMH20, PMH25, PMH29 [22], XPP1 [25], RCE1 [26], VE1 [27], MAT1-2-1 [28], VIB1 [29, 30], RXE1/BRLA [31] and ARA1 [32]. Moreover, transcription factors involved in the regulation of cellulolytic enzymes have also been characterized in other filamentous fungi: CLR-1 and CLR-2 in Neurospora crassa [33] or AZF1 [34], PoxHMBB [35], PRO1, PoFLBC [36] and NSDD in Penicillium oxalicum [37, 38]. Their respective functions have not yet been established in T. reesei. Among the mentioned regulators, some are specific to cellulase or xylanase genes, or to carbon sources, while others are global regulators, e.g. PAC1, which is reported to be a pH response regulator. This profusion of transcription factors reveals the complexity of the regulatory network controlling cellulase production. A better understanding of the links between regulators could be a major key to improving the industrial production of enzymes.

Gene Regulatory Network (GRN) inference methods are computational approaches mainly based on gene expression data and data science to build representative graphs containing meaningful regulatory links between transcription factors and their targets. GRN may be useful to visualize sketches of regulatory relationships and to unveil meaningful information from high-throughput data [39]. We employed BRANE Cut [40], a Biologically-Related Apriori Network Enhancement method based on graph cuts, previously developed by our team. It has been proven to provide robust meaningful inference on real and synthetic datasets from [41, 42]. In complement to classical analysis, such as differential expression or gene clustering, the graph optimization of BRANE Cut on T. reesei RNA-seq is likely to cast a different light on relationships between transcription factors and targets.

While cellulose is the natural inducer of cellulase production, the authors of [43] showed that, in Trichoderma reesei, lactose can act as a cellulase inducer. For this reason, this carbon source is generally used in industry to induce cellulase production in T. reesei. Efficient enzymatic hydrolysis of cellulose requires the synergy of three main catalytic activities: cellobiohydrolase, endoglucanase and β-glucosidase. The cellobiohydrolases cleave D-glucose dimers from the ends of the cellulose chain. Endoglucanases randomly cut the cellulose chain, providing new free cellulose ends which are the starting points for cellobiohydrolases to act upon. β-Glucosidases hydrolyze cellobiose to glucose, thereby preventing inhibition of the other enzymes by cellobiose [44]. In T. reesei, β-glucosidase activity [45, 46] has generally been found to be quite low in cellulase preparations [47]. This causes cellobiose accumulation, which in turn leads to cellobiohydrolase and endoglucanase inhibition. To overcome this low activity, different strategies have been tried: supplementation of the enzymatic cocktail with exogenous β-glucosidase [48, 49], construction of recombinant strains overexpressing the native enzyme [47, 50, 51], expression of more active enzymes, or modification of the inducing process to promote the production of β-glucosidase. This last approach was performed by using various sugar mixtures to modify the composition of the enzymatic cocktail [52]. Thus, an increase of β-glucosidase activity in the cocktail can be achieved by using a glucose-lactose mixture, which is also favorable in terms of cost.

In the present study, fed-batch cultivation experiments of the T. reesei Rut-C30 strain, using lactose, glucose and mixtures of both, were performed. We chose to analyze this reference strain for industrial production because of its superior cellulase production capacity. The other reference strain for academic studies, QM9414, has for instance a much lower productivity (amount of extracellular protein and cellulase activity) [7]. Rut-C30 is impaired in CRE1-dependent catabolite repression, which modifies the regulatory network. The cre1 truncation is what makes this strain interesting, while also complicating the understanding of its mechanisms. Our objective is therefore to analyze the transcriptomes of a hyperproducing strain fed with different sugar mixtures under industry-like conditions. As observed previously, productivity increased with the proportion of lactose in the mixture, and a higher β-glucosidase activity was measured in the mixture conditions compared to pure lactose. To explore the molecular mechanisms underlying these results, a transcriptomic study was performed at 24 h and 48 h after the onset of cellulase production triggered by the addition of the inducing carbon source lactose. An overall analysis reveals a significant impact of lactose/glucose ratios on the number of differentially expressed genes and, to a lesser extent, of sampling times. According to the subsequent clustering analysis, three main gene expression profiles were identified: genes up- or down-regulated according to lactose concentration, and genes over-expressed in the presence of lactose but independently of its proportion in the sugar mix. Interestingly, the expression profiles of these gene sets overlap the productivity and β-glucosidase activity curves, confirming a transcriptomic basis for the observed phenotypes. As transcription factors were identified in all transcriptomic profiles, we decided to deepen our understanding of the regulatory network operating during cellulase production in T. reesei Rut-C30. A systems biology analysis with BRANE Cut network selection was carried out to infer links between differentially regulated transcription factors and their targets. The results highlight three sub-networks, one directly linked to cellulase genes, one matching β-glucosidase expression and the last one connected to developmental genes.

Results

Cellulase production is increased with lactose proportion but β-glucosidase activity is higher in glucose-lactose mixture

In order to study its transcriptomic behavior on various carbon sources, T. reesei Rut-C30 was cultivated in fed-batch mode in a miniaturized experimental device called “fed-flask” [53], allowing us to obtain up to 6 biological replicates with minimal equipment. Cultures were first operated for 48 h in batch mode on glucose for initial biomass growth (resulting in around 7 g L⁻¹ biomass dry weight), then fed with different lactose/glucose mixtures, i.e. pure glucose (G100), pure lactose (L100), a 75 % glucose + 25 % lactose mixture (G75-L25), and a 90 % glucose + 10 % lactose mixture (G90-L10).

As expected, pure lactose feed resulted in the highest protein production, with 2.6 g L⁻¹ protein produced during fed-batch, at a specific protein production rate (qP) of 7.7 ± 1.1 mg g⁻¹ h⁻¹ (Fig. 1a and b). The final protein concentration on pure lactose may appear low (≈3 g L⁻¹), but the specific productivity is high, similar to that obtained in a bioreactor. In addition, as displayed in Additional file 1, the whole fed substrate is converted into proteins as no biomass is produced during the pure lactose feeding. Hence, despite the low value of protein concentration obtained in our “fed-flask” conditions, these observations show that cellulase induction is at its maximum level. Glucose feed resulted in almost no protein production (qP 15 times lower than on lactose) but in biomass growth (4.2 g L⁻¹ biomass produced during fed-batch, see Additional file 1), while glucose/lactose mixtures resulted in intermediate profiles, with 0.6 g L⁻¹ protein produced on 10 % lactose (G90-L10), and 1.4 g L⁻¹ protein produced on 25 % lactose (G75-L25). We then determined the filter paper and β-glucosidase activities at 48 h after the beginning of fed-batch (Fig. 1c and d): filter paper activity is correlated to lactose amounts whereas β-glucosidase activity is higher in carbon mixture. The obtained results are in accordance with the ones obtained in [53], allowing us to assume the absence of residual sugar accumulation in the medium during the fed-batch.
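
As a rough plausibility check on the reported qP, the figures quoted above can be combined in a back-of-the-envelope calculation. The sketch below assumes a fed-batch duration of 48 h (the time at which activities were measured) and a constant biomass of about 7 g L⁻¹ on pure lactose; both are simplifying assumptions, and the authors' actual rate calculation may differ (for instance a slope fitted over several time points).

```python
def specific_production_rate(protein_g_per_l, biomass_g_per_l, hours):
    """Average specific production rate in mg protein per g biomass per hour."""
    return protein_g_per_l * 1000.0 / (biomass_g_per_l * hours)

# Values quoted in the text for the pure lactose feed (L100).
PROTEIN_L100 = 2.6   # g/L of protein produced during fed-batch
BIOMASS = 7.0        # g/L dry weight after the batch phase (approximate)
FEED_HOURS = 48.0    # assumed fed-batch duration (not stated as such in the text)

qp_l100 = specific_production_rate(PROTEIN_L100, BIOMASS, FEED_HOURS)

# The text states that qP on pure glucose is 15 times lower than on lactose.
qp_g100_from_text = 7.7 / 15.0

print(f"estimated qP on L100: {qp_l100:.2f} mg/g/h")
print(f"implied qP on G100:   {qp_g100_from_text:.2f} mg/g/h")
```

Under these assumptions the estimate falls within the reported interval of 7.7 ± 1.1 mg g⁻¹ h⁻¹, which suggests that the quoted concentration, biomass and rate are mutually consistent.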

Fig. 1

Protein production on different sugar sources in fed-batch mode. a monitoring of protein concentration during fed-batch. For the different glucose-lactose content in feed (G100, G90-L10, G75-L25, L100), b reports the specific protein production rate, c the final β-glucosidase activity and d the final filter paper activity. Reported values are average and standard deviation of the biological replicates

Differentially expressed gene identification

This study aims at better understanding the effect of lactose on the transcriptome of T. reesei Rut-C30, but not during the early lactose induction as in [54]. For this reason, we chose to extract RNA at 24 h and 48 h after the fed-batch start for further transcriptomic analysis.

Analysis of glucose, lactose and mixture effects was performed to identify differentially expressed (DE) genes between conditions. Specifically, to refine the understanding of the lactose effect on cellulase production, gene expression on various lactose proportions (G90-L10, G75-L25, L100) at 24 h and 48 h was differentially evaluated against gene expression obtained on a pure sugar, i.e. glucose (G100) or lactose (L100), at 24 h and 48 h. The comparison to both pure glucose and pure lactose feeds leads to ten comparisons (summarized in the circuit design displayed in Additional file 2). The use of two distinct reference conditions increases the chances of identifying relevant gene expression clusters by exploring a wider range of gene expression patterns. The number of DE genes obtained for each comparison is displayed in Fig. 2. For better intelligibility of the results, we focus on DE genes compared to the pure glucose (G100) reference.
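
One reading of this design that yields exactly ten comparisons is sketched below. It assumes that each lactose-containing feed is compared with each pure-sugar reference at the same time point, and that a condition is never compared with itself; the authoritative design is the circuit in Additional file 2, so this enumeration is only an illustration.

```python
from itertools import product

LACTOSE_FEEDS = ["G90-L10", "G75-L25", "L100"]  # conditions under study
REFERENCES = ["G100", "L100"]                   # pure-sugar references
TIMES = ["24h", "48h"]

def enumerate_comparisons():
    """List (condition, reference) pairs, assuming same-time-point comparisons."""
    pairs = []
    for ref, feed, time in product(REFERENCES, LACTOSE_FEEDS, TIMES):
        if feed == ref:
            continue  # L100 is not compared with itself
        pairs.append((f"{feed}@{time}", f"{ref}@{time}"))
    return pairs

comparisons = enumerate_comparisons()
for condition, reference in comparisons:
    print(f"{condition} vs {reference}")
print(len(comparisons), "comparisons")
```

Six comparisons use G100 as the reference and four use L100, which matches the count of ten given in the text.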

Fig. 2

Differentially expressed genes of Rut-C30 on various carbon source mixtures. Number of over- (up, in red) and under-expressed (down, in green) genes on different mixed carbon source media (G90-L10, G75-L25, L100) at 24 h and 48 h

From a global overview, at 24 h, 427 genes are differentially expressed and the number of DE genes increases with the level of lactose. In addition, these DE genes are up-regulated. Results obtained at 48 h lead to 552 DE genes, and this number also increases with the level of lactose. These results, showing that the number of differentially expressed genes increases with the lactose level and from 24 h to 48 h, are in accordance with the specific protein production rate results previously presented (cf. Fig. 1). Note that the increase over time is largely a consequence of the threshold of 2 on the log fold-change. Indeed, at 24 h, some genes are considered as not differentially expressed although they are close to the threshold, and they then appear as DE at 48 h.
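
The threshold effect described above can be illustrated with a minimal sketch. It assumes that the threshold of 2 is applied symmetrically to a base-2 log fold-change and it ignores any statistical test that the actual pipeline may also apply; the gene names and values are hypothetical toy data.

```python
LOG_FC_THRESHOLD = 2.0  # threshold quoted in the text, assumed symmetric

def split_de_genes(log_fc_by_gene, threshold=LOG_FC_THRESHOLD):
    """Return (up, down) gene lists whose log fold-change passes the threshold."""
    up = sorted(g for g, v in log_fc_by_gene.items() if v >= threshold)
    down = sorted(g for g, v in log_fc_by_gene.items() if v <= -threshold)
    return up, down

# Hypothetical log fold-changes versus pure glucose at the two sampling times.
log_fc_24h = {"geneA": 2.4, "geneB": 1.9, "geneC": -2.1, "geneD": 0.3}
log_fc_48h = {"geneA": 2.8, "geneB": 2.2, "geneC": -2.5, "geneD": 0.4}

up_24h, down_24h = split_de_genes(log_fc_24h)
up_48h, down_48h = split_de_genes(log_fc_48h)
print("24 h:", up_24h, down_24h)
print("48 h:", up_48h, down_48h)
```

In this toy example `geneB` sits just below the threshold at 24 h and crosses it at 48 h, so the DE count grows although its expression changed only slightly.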

We then focused on the intertwined effects, i.e. the impact of time for each carbon source mixture. On pure lactose (L100), the number of DE genes increases between 24 h and 48 h. On the contrary, for both the minimal and the intermediate level of lactose (i.e. G90-L10 and G75-L25), the number of DE genes decreases between 24 h and 48 h. We observe that this decrease between the early and the late time samplings on low lactose quantity is mainly due to a decrease in the number of over-expressed genes. This result suggests that a late process appears only on pure lactose.

Finally, we checked whether the genes mutated in Rut-C30, by comparison to QM6a, are differentially expressed in our conditions (see Additional file 3). While the total number of mutated genes at the genome scale is 166 (1.8 %), we found only 12 of them among the differentially expressed genes (1.8 %). Hence, we cannot conclude that there is an enrichment of mutated genes among those responsible for cellulase production on lactose. This result is consistent with [54], which demonstrates the weak impact of random mutagenesis on transcription profiles related to cellulase induction and the protein production system.
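
The absence of enrichment can be made explicit with a one-sided hypergeometric test, sketched below. The genome size is not given in this section: the value of 9129 genes is an assumption chosen to be compatible with 166 mutated genes representing about 1.8 %, and the 650 DE genes are taken as the sample. The authors do not state which test, if any, they used.

```python
from math import comb

def hypergeom_upper_tail(k, population, successes, draws):
    """P(X >= k) when drawing `draws` genes without replacement."""
    total = comb(population, draws)
    upper = min(successes, draws)
    tail = sum(
        comb(successes, i) * comb(population - successes, draws - i)
        for i in range(k, upper + 1)
    )
    return tail / total

GENOME_SIZE = 9129     # assumption, not stated in the text
MUTATED_GENES = 166    # mutated genes at the genome scale
DE_GENES = 650         # genes DE in at least one comparison
MUTATED_AND_DE = 12    # mutated genes that are also DE

expected = DE_GENES * MUTATED_GENES / GENOME_SIZE
p_value = hypergeom_upper_tail(MUTATED_AND_DE, GENOME_SIZE, MUTATED_GENES, DE_GENES)
print(f"expected overlap by chance: {expected:.1f}")
print(f"P(overlap >= {MUTATED_AND_DE}) = {p_value:.2f}")
```

With these numbers the expected overlap by chance is close to the observed 12 genes, so the tail probability is far from any usual significance level.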

Subsequent analyses are based on the 650 genes identified as DE in at least one of the ten studied comparisons.

Gene clustering and functional analysis

To detect functional changes on lactose, we performed a clustering on the previously selected 650 genes. For this purpose, each gene is related to a ten-point expression profile corresponding to the ten log2 expression ratios (base-2 logarithm of expression ratios between two conditions according to the circuit design detailed in Additional file 2). Gene clustering was performed using an aggregated K-means classifier (detailed in the Materials and Methods section). Among the five distinct profiles identified (Fig. 3 and Additional file 3 for the exhaustive list of genes), three main trends appear when we compare the gene expression on lactose relative to that on glucose. The first trend encompasses genes under-expressed on lactose, in a monotonic manner at 24 h and 48 h, and is found in two clusters, denoted by \(\mathcal {D}_{+}\) and \(\mathcal {D}_{-}\) (\(\mathcal {D}\) for down-regulation). Conversely, observed in two other clusters named \(\mathcal {U}_{+}\) and \(\mathcal {U}_{-}\) (\(\mathcal {U}\) for up-regulation), the second trend refers to genes over-expressed on lactose in a monotonic manner at 24 h and 48 h. The last trend concerns genes over-expressed on lactose, but where the amount of lactose affects the gene expression in an uneven manner. This trend is recovered in a unique cluster denoted by \(\mathcal {U}_{\simeq }\).
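
The three trends can be made concrete with a small rule-based sketch. This is not the aggregated K-means used in the study: it is a hypothetical classifier applied to three log2 ratios versus pure glucose (ordered by increasing lactose, at a single time point), intended only to show what "monotonic" and "uneven" mean for a profile. All values are toy data.

```python
def classify_trend(ratios):
    """Classify log2 ratios vs pure glucose, ordered as 10 %, 25 %, 100 % lactose."""
    pairs = list(zip(ratios, ratios[1:]))
    strictly_up = all(a < b for a, b in pairs)
    strictly_down = all(a > b for a, b in pairs)
    if min(ratios) >= 0 and strictly_up:
        return "up, monotonic"    # the more lactose, the more induction
    if max(ratios) <= 0 and strictly_down:
        return "down, monotonic"  # the more lactose, the more repression
    if sum(ratios) > 0:
        return "up, uneven"       # over-expressed, but not lactose-proportional
    return "other"

toy_profiles = {
    "induced": [0.5, 1.2, 3.0],
    "repressed": [-0.4, -1.0, -2.5],
    "peak_on_G75_L25": [1.0, 2.5, 1.2],
    "flat_high": [2.0, 2.0, 2.0],
}
labels = {name: classify_trend(p) for name, p in toy_profiles.items()}
for name, label in labels.items():
    print(name, "->", label)
```

The strength-based split between the `+` and `-` clusters is not reproduced here; it would require an additional magnitude criterion that the clustering derives from the data.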

Fig. 3

Heatmap and median profiles of clustered genes. Clustering results on the 650 differentially expressed genes: cluster \(\mathcal {D}_{+}\) (green), \(\mathcal {D}_{-}\) (dark green) for down-regulation, \(\mathcal {U}_{\simeq }\) (orange), \(\mathcal {U}_{+}\) (red) and \(\mathcal {U}_{-}\) (dark red) for up-regulation. We have highlighted the median profile of the corresponding cluster in black and left the median profiles of the other clusters in grey in the background to facilitate visual comparison

Genes monotonically down-regulated across lactose amount

As mentioned above, genes having a monotonic under-expression regarding the amount of lactose are grouped in clusters \(\mathcal {D}_{+}\) (64 genes: 10 %) and \(\mathcal {D}_{-}\) (254 genes: 39 %). These genes are repressed on lactose: the more lactose, the stronger the repression. The main difference between these two clusters is in the levels of under-expression: genes in cluster \(\mathcal {D}_{+}\) are on average more strongly under-expressed than genes in cluster \(\mathcal {D}_{-}\). In addition, we note that cluster \(\mathcal {D}_{-}\), for which the under-expression is weaker, contains a larger number of genes than cluster \(\mathcal {D}_{+}\). This result suggests that lactose moderately affects the behavior of a large number of genes while only a few genes are strongly impacted by lactose concentration. It is also interesting to note that the differential expression of transcription factors is lower than that of genes not identified as such. This observation confirms that even a weak modification of transcription factor expression can lead to a strong modification in the expression of their targets.

More specifically, cluster \(\mathcal {D}_{+}\) is enriched in genes related to proteolysis and peptidolysis processes (IDs 22210, 22459, 23171, 106661, 124051) and contains three genes encoding cell wall proteins (IDs 74282, 103458, 122127). Interestingly, no transcription factors are detected in this cluster.

Cluster \(\mathcal {D}_{-}\), whose median profile exhibits a slight repression across lactose concentrations, encompasses transcription factors whose orthologs are involved in development: Tr–WET-1 (ID 4430, [55]), Tr–PRO1 (ID 76590, [56, 57]) and Tr–ACON-3 (ID 123713, [58]). We recall that the Tr–XXX notation refers to the gene in T. reesei for which the ortholog in another species is XXX (see the “Functional analysis” section in Materials and Methods). We also found 11 genes involved in proteolysis and peptidolysis processes, five genes encoding cell wall proteins (IDs 80340, 120823, 121251, 121818 and 123659), two genes encoding hydrophobin proteins (hfb2 and hfb3) and two genes involved in the cell adhesion process (IDs 65522 and 70021). Nine genes involved in the G-protein coupled receptor (GPCR) signaling pathway are also recovered in this cluster. It is important to note that, in addition to the three already mentioned, 11 other transcription factors are also present (PMH29, RES1 [59], Tr–AZF-1 (ID 103275) and IDs 55272, 59740, 60565, 63563, 104061, 105520, 106654, 112085). We also found the xylanase XYN2, with a strong repression observed on pure lactose in comparison to pure glucose, while its expression seems insensitive to low lactose concentration.

Genes monotonically up-regulated across lactose amount

We recall that clusters \(\mathcal {U}_{+}\) (78 genes: 12 %) and \(\mathcal {U}_{-}\) (201 genes: 31 %) contain genes whose over-expression is monotonic with respect to lactose: the more lactose, the stronger the induction. The main difference between the expression profiles of these two clusters is the level of over-expression: genes in cluster \(\mathcal {U}_{+}\) are more activated than genes belonging to cluster \(\mathcal {U}_{-}\). The same remark as previously applies: a large number of genes is moderately impacted by lactose (cluster \(\mathcal {U}_{-}\)) while only a few genes are strongly affected by lactose concentrations (cluster \(\mathcal {U}_{+}\)). As observed for down-regulated genes, the differential expression of the transcription factors is weaker than that of their targets.

In cluster \(\mathcal {U}_{+}\), whose median profile shows a strong induction across lactose concentrations, 26 CAZymes are found, of which 23 belong to the large glycoside hydrolase (GH) family. We recover the principal CAZymes known to be induced in lactose condition: the two cellobiohydrolases CBH1 and CBH2, two endoglucanases CEL5A and CEL7B, one lytic polysaccharide monooxygenase (LPMO) CEL61A, two xylanases XYN1 and XYN3, as well as the mannanase MAN1 and the β-galactosidase BGA1. In addition, we found three specific carbohydrate transporters CRT1, XLT1 and ID 69957 and three putative ones (IDs 56684, 67541, and 106556). Interestingly, we found the transcription factor YPR1, which is the main regulator for yellow pigment synthesis [60]. These results, showing a lactose-dependent increase in the expression of genes related to endoglucanase and cellobiohydrolase activities, corroborate the phenotype observed in the study of [52]. Indeed, its authors show a rise of the specific endoglucanase and cellobiohydrolase activity positively correlated with lactose concentration and cellulolytic enzyme productivity.

Cluster \(\mathcal {U}_{-}\), distinguishable by its median profile showing a slight induction across lactose concentrations, contains 17 genes involved in carbohydrate metabolism, of which 16 belong to the large GH family. Among these genes, we identified three β-glucosidases (two extracellular, CEL3D and CEL3C, and one intracellular, CEL1A), the xylanase XYN4, and the acetyl xylan esterase AXE1. We also found 14 Major Facilitator Superfamily (MFS) transporters. In addition, seven transcription factors are found in this cluster, including XYR1, the main regulator of cellulase and hemicellulase genes [19], CLR2 (ID 23163), identified as a regulator of cellulases but not hemicellulases in Neurospora crassa [33], Tr–FSD-1 (ID 28781), ID 121121 and three others with no associated mechanism (IDs 72780, 73792, 106706).

Uneven up-regulation across lactose amount

In cluster \(\mathcal {U}_{\simeq }\) (53 genes: 8 %), we found globally over-expressed genes but with a non-monotonic behavior regarding lactose concentration. A more detailed study of this cluster reveals three typical characteristics in the gene expression profiles. A tenth of the genes show an uneven behavior with a high over-expression in all G90-L10, G75-L25 and L100 conditions, without significant difference according to the amount of lactose. This kind of profile suggests that the up-regulation is uncorrelated with lactose concentration itself but triggered by lactose detection only. Then, one third of the genes show a high over-expression on the two carbon source mixtures G90-L10 and G75-L25 while no differential expression is observed on pure lactose compared to pure glucose. The transcription factor ID 105805 follows this profile. These two trends of gene expression profiles could be fully explained by the CRE1-dependent catabolite repression impairment, and they are not addressed in the discussion. Finally, a little more than half of the genes have a significantly stronger over-expression on G75-L25 than on G90-L10 and L100. Interestingly, we found one endoglucanase CEL12A, one LPMO CEL61B, and three β-glucosidases: two extracellular ones with a signal peptide, CEL3E and BGL1, and one intracellular β-glucosidase, CEL1B, potentially involved in cellulase induction. We also found the β-xylosidase BXL1 and the transcription factor ACE3, which share this profile. We observe a strong correlation between the transcriptomic behavior found in our study and the phenotype highlighted in [52]. Indeed, the specific β-glucosidase activity is highest for intermediate amounts of lactose, while this activity decreases on glucose or lactose alone. Correspondingly, our transcriptomic study shows the highest over-expression of genes encoding β-glucosidases (cel3e, bgl1 and cel1b) on the intermediate mix of lactose and glucose, while their expression decreases when lactose is present in too low or too high a concentration.
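
The cluster sizes and percentages quoted in this section can be cross-checked against the total of 650 DE genes with a short bookkeeping sketch. The ASCII labels (`U~` for the uneven cluster) are shorthand introduced here, not notation from the study.

```python
# Cluster sizes as quoted in the text.
CLUSTER_SIZES = {"D+": 64, "D-": 254, "U+": 78, "U-": 201, "U~": 53}

total_genes = sum(CLUSTER_SIZES.values())
shares = {name: round(100 * size / total_genes) for name, size in CLUSTER_SIZES.items()}

print("total:", total_genes)
for name, share in shares.items():
    print(f"{name}: {CLUSTER_SIZES[name]} genes ({share} %)")
```

The five clusters partition the 650 genes exactly, and the rounded shares reproduce the percentages given in the text (10, 39, 12, 31 and 8 %).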

Note that a large proportion of genes belonging to the up-regulated clusters are recovered in the co-expressed genomic regions observed in [22]. The biological coherence of the clustering results encourages us to pursue the transcriptomic study through a gene regulatory network. The use of a network inference approach is motivated by the wish to better understand the links between DE transcription factors, but also to highlight strong links with the help of an alternative definition of proximity, and thus to make concrete the relationships foreseen through the clustering.

Network inference

From the set of DE genes, we built a gene regulatory network with the combination of CLR [61] and BRANE Cut [40, 62] inference methods. Where appropriate, we evaluated the discovered TF-target interactions by performing a promoter analysis of the plausible targets given by the inferred network, with the Regulatory Sequence Analysis Tool (RSAT) [63]. More details on the complete methodology for both the inference and the promoter analysis are provided in the Materials and Methods section.
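
For readers unfamiliar with CLR (context likelihood of relatedness), the scoring step can be sketched as follows: each pairwise similarity is converted into a z-score against the background of each of the two genes, negative z-scores are clipped to zero, and the two are combined. The sketch takes an arbitrary symmetric similarity matrix as input (CLR is usually applied to mutual information); the toy matrix, the exclusion of the diagonal from the background and the use of the population standard deviation are assumptions, and this is not the implementation used in the study.

```python
from math import sqrt
from statistics import mean, pstdev

def clr_scores(sim):
    """Turn a symmetric similarity matrix into CLR-style scores."""
    n = len(sim)
    background = []
    for i in range(n):
        row = [sim[i][j] for j in range(n) if j != i]  # diagonal excluded
        background.append((mean(row), pstdev(row)))
    scores = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(n):
            if i == j:
                continue
            mu_i, sd_i = background[i]
            mu_j, sd_j = background[j]
            z_i = max(0.0, (sim[i][j] - mu_i) / sd_i) if sd_i > 0 else 0.0
            z_j = max(0.0, (sim[i][j] - mu_j) / sd_j) if sd_j > 0 else 0.0
            scores[i][j] = sqrt(z_i ** 2 + z_j ** 2)
    return scores

# Hypothetical similarities between four genes.
toy_sim = [
    [0.0, 0.9, 0.1, 0.2],
    [0.9, 0.0, 0.2, 0.1],
    [0.1, 0.2, 0.0, 0.3],
    [0.2, 0.1, 0.3, 0.0],
]
toy_scores = clr_scores(toy_sim)
```

A pair scores highly only if its similarity stands out from the background of both genes, which is what makes the resulting weights suitable as input to a subsequent edge selection step.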

Network enhancement thresholding performed by BRANE Cut post-processing [40] selected 161 genes (including 15 transcription factors) and inferred 205 links (Fig. 4). To help network interpretation, we applied the same color code as for the clustering (Fig. 3). We observe a coherence between the function and the expression behavior of genes linked into modules, thus corroborating the clustering results. As we will see in detail in the following network analysis, we reveal potential links between three mechanisms grouped in modules (SubN1, SubN2, and SubN3) and related to cellulase activation, β-glucosidase expression and repression of developmental processes.

Fig. 4

Inferred network. Network built with BRANE Cut from expression profiles of the differentially expressed genes. BRANE Cut selected 205 edges involving 161 genes. Node colors correspond to cluster labels: \(\mathcal {U}_{+}\) (red, genes highly and monotonically up-regulated on lactose), \(\mathcal {U}_{-}\) (dark red, genes slightly and monotonically up-regulated on lactose), \(\mathcal {U}_{\simeq }\) (orange, genes up-regulated and non-monotonically on lactose), \(\mathcal {D}_{-}\) (dark green, genes slightly and monotonically down-regulated on lactose) and \(\mathcal {D}_{+}\) (green, genes highly and monotonically down-regulated on lactose). Bigger nodes with bold frame correspond to genes coding for a transcription factor while smaller nodes with thin frame correspond to genes not identified to code for a transcription factor

First of all, the global study of the network shows interactions between genes sharing the same gene expression profile. The 161 genes selected by BRANE Cut cover a relatively small number of biological processes. This is especially true for about half of the 15 retained transcription factors, for which only two main biological processes are identified: development (Tr–WET-1, Tr–PRO1 and Tr–ACON-3; IDs 4430, 76590 and 123713) and carbohydrate-related mechanisms (XYR1, PMH29, ACE3 and CLR2).

In addition, we observe a large proportion of genes related to the enzymatic cocktail for cellulase production. In terms of interaction, we predominantly observed links between genes up-regulated in a monotonic manner (\(\mathcal {U}_{-}\)/\(\mathcal {U}_{-}\) and \(\mathcal {U}_{-}\)/\(\mathcal {U}_{+}\) interactions) and related to cellulase production. A second observation refers to enriched \(\mathcal {U}_{\simeq }\)/\(\mathcal {U}_{\simeq }\) interactions, i.e. between genes up-regulated in an uneven way. Note that we also found an interesting proximity with \(\mathcal {U}_{-}\)/\(\mathcal {U}_{\simeq }\) interactions, with inverse expression profiles. The genes involved mainly relate to cellulase and β-glucosidase production. Finally, a significant number of interactions are found between genes belonging to cluster \(\mathcal {D}_{-}\) and related to development mechanisms. Here again, links are also observed between genes having antagonistic expression profiles, mainly related to cellulase production and development (\(\mathcal {D}_{-}\)/\(\mathcal {U}_{-}\) interactions). Figure 4 displays the inferred network with highlights on the three sub-networks SubN1, SubN2 and SubN3, extracted from the combination of the above observations and the clustering results. We now focus on each identified sub-network for a more detailed analysis.
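
The tally of interaction types by cluster pair can be sketched as below. The cluster labels of the genes follow the clustering results given earlier, but the edge list is a hypothetical toy example and not the inferred network; `U~` is an ASCII shorthand for the uneven cluster.

```python
from collections import Counter

def count_edge_types(edges, cluster_of):
    """Count undirected edges by the (sorted) pair of cluster labels they join."""
    counts = Counter()
    for gene_a, gene_b in edges:
        pair = tuple(sorted((cluster_of[gene_a], cluster_of[gene_b])))
        counts[pair] += 1
    return counts

# Cluster labels taken from the clustering section.
cluster_of = {
    "xyr1": "U-", "clr2": "U-",
    "cel7a": "U+", "cel6a": "U+",
    "ace3": "U~", "bgl1": "U~", "cel3e": "U~",
}
# Toy edges for illustration only.
toy_edges = [
    ("xyr1", "cel7a"), ("xyr1", "cel6a"), ("clr2", "cel7a"),
    ("ace3", "bgl1"), ("ace3", "cel3e"),
]
edge_types = count_edge_types(toy_edges, cluster_of)
for pair, count in edge_types.most_common():
    print("/".join(pair), count)
```

Sorting the two labels makes the count independent of edge orientation, which is appropriate when only the co-occurrence of clusters along a link is of interest.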

Sub-network SubN1 encompasses eight genes associated with the carbohydrate metabolism process. Among them, cel5a, cel6a, cel7a and cel7b are specifically related to cellobiohydrolase and endoglucanase activities. It also includes four carbohydrate transporters: CRT1, known to be responsible for lactose uptake and to have a pivotal role in the lactose induction of cellulase genes [22, 64, 65], and three carriers. These genes are linked to the transcription factor XYR1, known to be the main actor during the cellulase production process. It also appears specifically linked to a galacturonic acid reductase GAR1, a helicase (ID 35202), a glycoside hydrolase XYN6 [66], a secreted hydrolase CIP1 and Tr–FSD-1 (ID 28781), known to pertain to sexual development. The network highlights the action of another transcription factor, CLR2, which is known in other species to participate in cellulase production [33]. The two transcription factors XYR1 and CLR2 seem to be highly correlated and share a large number of cellulose-oriented targets. This sub-network is related to the genes involved in cellulase production whose up-regulation increases with lactose concentration. Based on sub-network SubN1, we performed a promoter analysis. Using the plausible targets of XYR1 and CLR2 independently, we significantly recovered the degenerate binding site 5′-GGC(A/T)₃-3′, previously identified in [67] as the binding site specific to XYR1. We also found an enriched non-degenerate motif, 5′-GTTACA-3′, which differs from the XYR1 motif. A straightforward hypothesis is to attribute this new motif to CLR2, and a simple statistical test suggests that this motif might be specific to CLR2. Details regarding this analysis are provided in Additional file 4.
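
Counting occurrences of these two motifs in promoter sequences can be sketched with regular expressions. The sketch reads the degenerate site as GGC followed by three A/T positions, scans both strands and allows overlapping matches; the promoter sequences are hypothetical toy strings, and this is not the RSAT procedure used in the study.

```python
import re

XYR1_MOTIF = "GGC[AT]{3}"    # degenerate site, read as GGC + three A/T positions
CANDIDATE_MOTIF = "GTTACA"   # non-degenerate motif reported in the text

_COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]

def count_sites(seq, pattern):
    """Count matches of `pattern` on both strands, overlaps included."""
    regex = re.compile(f"(?={pattern})")  # lookahead keeps overlapping hits
    seq = seq.upper()
    return len(regex.findall(seq)) + len(regex.findall(reverse_complement(seq)))

# Hypothetical promoter fragments.
toy_promoter = "TTGGCTAAGTTACACC"
toy_reverse_only = "TGTAAC"

xyr1_hits = count_sites(toy_promoter, XYR1_MOTIF)
candidate_hits = count_sites(toy_promoter, CANDIDATE_MOTIF)
reverse_hits = count_sites(toy_reverse_only, CANDIDATE_MOTIF)
print(xyr1_hits, candidate_hits, reverse_hits)
```

Raw counts alone do not establish enrichment: a comparison against background promoters, as done by dedicated tools, is needed before a motif can be attributed to a transcription factor.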

Sub-network SubN2 contains nine genes involved in carbohydrate metabolism, some of which are specifically related to β-glucosidase and cellulase activities: bgl1, cel3e, cel12a and cel61b. Interestingly, these genes are linked to the transcription factor ACE3 and have the particularity of being maximally over-expressed on G75-L25. We observe that seven genes belonging to cluster \(\mathcal {D}_{-}\) are also present in this sub-network and are predominantly linked to the transcription factor PMH29, which has recently been identified as playing a role in cellulase activity [22]. We notice that these genes are maximally under-expressed on G75-L25, which is the inverse profile of ace3 and its linked genes, suggesting a dependence between ACE3 and the transcription factor PMH29.

The sub-network SubN3 reveals seven transcription factors, including two which have been identified as participating in the development process in other species: Tr–WET-1 (ID 4430) and Tr–PRO1 (ID 76570). Interestingly, three other genes, EsdC, pro41 and hpr1, also pertaining to the development process, are linked to pro1. In addition, genes in this sub-network are mainly down-regulated on lactose and related to metabolism, secretion, transport and cell surface. This sub-network seems to reveal some interesting links between the repression of development and cellulase production that will be investigated in more detail in the “Discussion” section.

Results provided by this inferred network and the promoter analysis are in agreement with present knowledge on Trichoderma reesei, particularly for cellulase production. The additional results given by BRANE Cut are coherent with the literature on closely related species, especially regarding results that suggest a potential link between development and cellulase production and a particular behavior of the β-glucosidase. Table 1 provides some relevant references that corroborate the network generated by BRANE Cut. The coherence of the DE analysis, as well as of the clustering and inference results, with current knowledge allows us to use these results for prediction. In the following “Discussion” section, we thus formulate some postulates regarding the cellulase production mechanism in T. reesei, with respect to these three main results.

Table 1 BRANE Cut network validation from literature. Direct link refers to genes identified as involved in cellulase production (CP), while indirect refers to genes having a side effect on CP

Discussion

A cellulase production directly linked to the lactose concentration

The gene xyr1 is widely reported to play the role of the major activator of cellulase production in T. reesei [19]. As expected, we recovered in our network links between XYR1 and the main cellulolytic enzymes (especially the two main cellulases CBH1 and CBH2). In Neurospora crassa, cellulases are regulated by CLR-2 specifically, while Tr–XLNR, the ortholog of xyr1, is responsible for hemicellulase expression [33, 69]. Thus, the regulation of cellulases and hemicellulases is performed through two independent pathways. While the genes responsible for this regulation are present in T. reesei, their behavior appears to be different, as they show a coupled action on the regulation of both cellulases and hemicellulases, suggesting a different regulatory network in T. reesei compared to N. crassa.

Although observed in different T. reesei strains and culture conditions, authors in [70] and [71] have identified links between the xyr1 and clr2 genes. Interestingly, we also found in our data a strong correlation between xyr1 and clr2, suggesting a common regulation on lactose. We found a significant number of regulatory links between clr2 and cellulolytic enzymes. Unlike in N. crassa, where authors in [33] demonstrate a distinct effect of clr2 and xlnR on cellulase and hemicellulase respectively, our data analysis shows that clr2 seems to be complementary to xyr1 for the activation of both cellulases and hemicellulases in T. reesei Rut-C30. Thus, even though gene ID 26163 is the ortholog of clr-2 in N. crassa, this observation argues for a different behavior in T. reesei Rut-C30.

Another difference between T. reesei and N. crassa regarding clr2 is its location on the genome. Contrary to N. crassa, clr2 in T. reesei pertains to a physical cluster, located on chromosome III [72], and containing the lactose permease CRT1, established as essential for cellulase induction on a lactose substrate as it allows lactose uptake [22, 64]. Due to this proximity between clr2 and crt1, we may assume a regulation of crt1 by CLR2. In N. crassa, the ortholog of crt1 is sud26, and encodes a sugar transporter which is located next to a transcription factor of unknown function TF-48.

In N. crassa, clr2 is repressed by the carbon catabolite repression [33]. We do not know if such an extrapolation to T. reesei is valid, but interestingly, the Rut-C30 strain has a partial release of catabolite repression due to the truncation of cre1, allowing us to suggest a possible release of the repression of clr2, leading to a low expression of CLR2 and CRT1, and thus to a low lactose uptake. This low level of lactose would be sufficient to initiate the induction of cellulases through the increased expression of XYR1 and CLR2. Without the lactose inducer, we assume that this low expression of CLR2 and CRT1 is not sufficient to intensively produce cellulases, and their expression remains low.

As established in [22], the gene ace3 is known in T. reesei to be involved in cellulase induction on lactose. Furthermore, as presented in [73], ace3 seems to interact with xyr1 to initiate cellulase production. Based on our data and their interpretation, especially regarding the strong correlation between clr2 and xyr1, we may suppose an additional interaction between ace3 and clr2. This result is also corroborated by the fact that the invalidation of ace3 in [73] leads to a decrease of XYR1 and CLR2 expression. However, we note that the expression of ACE3 is not directly correlated with the lactose concentration, as the maximal expression of ACE3 is obtained on a mixture of glucose and lactose (G75-L25). Thus, the regulation of XYR1 by ace3 could be complemented by another mechanism necessary for cellulase induction on pure lactose, without glucose.

Gene expression profiles of bgl1, cel3e and cel1b follow β-glucosidase activity

A previous study showed that sugar mixtures influence the composition of the enzymatic cocktail of T. reesei [52]. A higher β-glucosidase activity was observed in the presence of a glucose-lactose mixture compared to pure lactose. This result, obtained in the CL847 strain, is confirmed here in the reference hyper-producing Rut-C30 strain.

In the transcriptome obtained on the various glucose-lactose mixtures, a group of DE genes (\(\mathcal {U}_{\simeq }\)) has an expression profile correlated to β-glucosidase activity. These genes are overexpressed on lactose but without correlation with the amount of lactose, and their maximal expression is reached at the intermediate level of lactose (G75-L25). We suppose that these results cannot be fully explained by the impairment of CRE1-dependent catabolite repression. Among these genes, three β-glucosidases are identified, two of which are extracellular (bgl1 and cel3e) while the other is intracellular (cel1b). It has been shown previously that, in the presence of lactose, the extracellular enzyme activity is mainly produced by bgl1 [74]. Our results seem to demonstrate that, for a full expression of bgl1, the presence of lactose is required independently of glucose. Nothing is known about the regulation of cel3e, but its expression profile is similar to that of bgl1. These two genes have previously been identified as co-regulated by the same substrate [75]. There is therefore a correlation between the expression of these genes and the enzymatic activity of BGL1. It would thus be interesting to delete cel3e to study the impact of its absence on the global extracellular β-glucosidase activity in a glucose-lactose mixture.

In the regulatory network, bgl1 and cel3e are connected to both ace3 and pmh29. However, ace3 has a profile similar to that of the previously mentioned β-glucosidases (cel3a and cel3e), while pmh29 is anti-correlated. It would therefore be interesting to explore the role of these two transcription factors in the control of CEL3A/BGL1 and CEL3E under glucose-lactose induction conditions. The roles of ace3 and pmh29 in cellulase regulation have recently been explored [22]. However, the difference in genetic background (QM6a and QM9414) and experimental conditions (100 % lactose batch) does not allow the results of these experiments to be extrapolated to the regulation observed here.

Another β-glucosidase, CEL1B, is present in cluster \(\mathcal {U}_{\simeq }\). This intracellular enzyme appears to play an essential role in lactose induction, since the joint invalidation of cel1b and cel1a, another intracellular β-glucosidase, abolishes the production of cellulases on lactose. However, invalidation of cel1b alone does not appear to have any effect, while invalidation of cel1a produces a delay in induction on lactose which is restored by galactose [76]. Surprisingly, the transcriptomic profile of cel1a is different from that of cel1b, since it belongs to the cellulase cluster \(\mathcal {D}_{-}\). The difference in their profiles could indicate a different response of these two genes depending on whether or not glucose is present. Thus, the expression of CEL1A could be negatively regulated by the presence of glucose and induced by lactose, while CEL1B could be induced by lactose but insensitive to the presence of glucose. As cel1b is also connected to the regulator ACE3, it would be interesting to explore the role of the ACE3 and PMH29 regulators in the expression of CEL1B.

A dedication to cellulase production to the detriment of growth

Strikingly, orthologs of transcription factor genes (IDs 4430, 76590 and 123713) described as involved in developmental processes have been identified in this transcriptomic study. All of them are part of cluster \(\mathcal {D}_{-}\) and are therefore down-regulated on lactose compared to glucose.

Firstly, ID 76590 is the ortholog of pro1 in Sordaria macrospora (67 % identity) and Podospora anserina (49 % identity), and the ortholog of adv-1 in Neurospora crassa (67 % identity). The gene Trpro1 is required for fruiting body development and cell fusion [56, 57]. In P. anserina, pro1 activates the sexual recognition pathway including the pheromone and receptor genes and is probably involved in the control of the entry in stationary phase [77]. In Penicillium oxalicum, deletion of pro1 (43 % identity) has been proved to increase cellulase production [37]. No similar phenotype has been described in other fungi. At low lactose concentration obtained in our experiments, Trpro1 is down regulated and linked in the GRN to hpr1, the mating type pheromone receptor.

Secondly, ID 123713 is the ortholog of MedA in Aspergillus nidulans (42 % identity), coding for a protein with unknown function, but required for normal asexual and sexual development. We determined that the N. crassa ortholog of MedA is acon-3, a gene required for early conidiophore development and female fertility. We also note that in [78], authors show that TrMedA is repressed by CRE-1 in QM9414 Δcre1, thus suggesting a role of the partial carbon catabolite derepression in the observed down-regulation of ID 123713. In N. crassa, acon-3 is positively regulated by the transcription factor ADA-6, involved in conidiation, sexual development, and oxidative stress response [58]. Interestingly, ypr1 (ID 102499), the yellow pigment regulator, which is DE in our data, displays 35 % identity with ada-6. In contrast to TrMedA, ypr1 is up-regulated on lactose and its regulatory function seems restricted to the sorbicillin cluster [60].

The gene with ID 4430 is the ortholog of wet-1 of N. crassa (72 % identity), and of WetA in A. nidulans (60 % identity) and Fusarium graminearum (43 % identity). In contrast to Aspergilli and F. graminearum, the wet-1 mutant is phenotypically similar to the wild-type strain, with no conidiation defect [55]. A regulatory cascade in which WetA is regulated by AbaA, itself regulated by BrlA, was described in Aspergillus [79]. The regulatory cascade between aba1 and wet-1 is preserved in N. crassa and F. graminearum. In P. decumbens, an industrial strain producing lignocellulolytic enzymes, the expression of cellulase genes is up-regulated in a BrlA deletion strain [68]. In T. reesei, rxe1 is involved in regulation of conidiation and modulated positively by the expression of xyr1 and cellulase and hemicellulase genes [31]. Unfortunately, the low percentage of identity (20 %) between rxe1 and BrlA does not allow us to go further regarding the regulatory link with Aba1 and then wet-1, as in the regulatory cascade described in Aspergillus. In addition, in our transcriptomic data, neither rxe1 nor Aba1 is differentially regulated, so the down-regulation of wet-1 does not seem to be dependent on these genes. Ultimately, further experiments would allow us to decipher the role of wet-1 in cellulase production and to determine whether there is a regulatory link between wet-1 and rxe1.

In Aspergillus nidulans, MEDA acts as a repressor of BrlA expression and is an activator of AbaA expression [80]. Although no direct regulatory relationship between MedA and WetA in T. reesei has been described, it is worth noting that these genes, both involved in the regulation of conidiation, are down-regulated on lactose. Interestingly, in A. niger, authors in [80] showed that secretion by the vegetative mycelium is repressed by sporulation, thus indicating an inverse link between conidiation and secretion. Thus, TrWetA and TrMedA down-regulation could be a result of the lactose fed-batch cultivation mode, where the carbon flux maintains a near-vegetative state without growth. Conversely, glucose feed resulted in biomass growth leading to conidiation.

Altogether, the down-regulation of Trpro1, Trwet1 and Tracon3 on lactose compared to glucose could reflect a balance between vegetative growth and sexual and asexual development. In the fed-batch condition, lactose is provided to maintain the biomass without growth. In contrast, starvation could create a path to conidiation, or glucose could redirect to sexual development. The equilibrium is maintained through the down-regulation of essential developmental transcription factors.

Finally, links between development and cellulase production have already been observed in previous studies, but in other strains and conditions [28, 82]. We noticed that these links are also recovered in a strain having a highly modified genomic background and cultivated in industry-like conditions. Hence, such a mechanism seems highly preserved across very heterogeneous strains and conditions.

Conclusions

This study is the first to consider the effect of various carbon sources (glucose/lactose mixtures) in a fed-batch mode on the transcriptome of T. reesei Rut-C30. In such conditions, we highlighted an interdependence between crucial transcription factors (XYR1, CLR2 and ACE3) known to participate in cellulase and hemicellulase production. We also correlated the transcriptome to the β-glucosidase activity observed in a previous study [52] and revealed a repression of the development process during cellulase production. These conclusions provide us with plausible targets for further genetic engineering leading to better cellulase-producing strains.

Methods

Strain and media

T. reesei RUT-C30 (ATCC 56765) was received from ATCC in October 2013, spread on PDA plates and incubated until sporulation. Spores were harvested with a 50 % glycerol solution, then stored at −80 °C. The spore solution concentration was 6 × 10⁹ mL⁻¹. Culture media were prepared according to [53] (case with 25 mM dipotassium phthalate) and supplemented with 12.5 g L⁻¹ glucose. Feeding solutions (stoichiometric mix of carbon and nitrogen sources) were prepared according to [52].

Fed-flask cultivations

Fed-flask cultivation was performed according to [52] with a few modifications. For each replicate, a Fernbach flask was prepared with 250 mL culture medium and inoculated with around 10⁷ spores mL⁻¹. The initial growth phase on glucose lasted around 48 h and resulted in around 7 g L⁻¹ biomass. Immediately after glucose exhaustion, empty 250 mL Erlenmeyer flasks were filled with 50 mL broth per flask, then fed at 0.3 mL h⁻¹ (using Dasgip MP8 peristaltic pumps) with different sugar solutions (one flask fed with pure lactose, one flask fed with pure glucose, one flask fed with a mixture of glucose and lactose). Pure glucose (G100) feed and pure lactose (L100) feed were replicated 6 times, the 75 % glucose + 25 % lactose mixture (G75-L25) was replicated 4 times, and the 90 % glucose + 10 % lactose mixture (G90-L10) was replicated 2 times. Incubation was performed in an Infors rotary shaker at 30 °C and 150 rpm. Analyses (biomass dry weight, protein concentration, sugar concentration, enzymatic activities) were performed according to [52].

RNA-seq library preparation and analysis

Library preparation and RNA-seq data acquisition

The pipeline used for library preparation and RNA-seq data acquisition is similar to those presented in [29, 54]; we summarize the main steps here. Libraries were prepared using the strand-specific RNA-seq library preparation TruSeq Stranded mRNA kit (Illumina). They were multiplexed by 6×6 flowcell lanes for 50 bp read sequencing on a HiSeq 1500 device (Illumina). The Eoulsan pipeline [83] was used for read analysis. For each of the 36 samples, an average of 35±10 million reads passing the Illumina quality filter was obtained. Poly-N read tails were trimmed, and reads with fewer than 40 bases or with a mean quality lower than 30 were removed. Read alignment and gene expression quantification were performed as previously described, with the Trichoderma reesei genome annotation version 2 from the Joint Genome Institute. The RNA-seq gene expression data and raw fastq files are available on the GEO repository (www.ncbi.nlm.nih.gov/geo/) under accession number: GSE82287.

Normalization and differentially expressed genes identification

RNA-seq data normalization and differential analysis were performed using the DESeq Bioconductor R package (version 1.8.3) [84]. The normalization method implemented in DESeq assumes that only a small number of genes are differentially expressed and corresponds to a median scale normalization.
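As an illustration of what a median scale normalization does, the following minimal Python sketch computes per-sample size factors as the median, over genes, of the ratio between a sample's count and the gene's geometric mean across samples. It is not the DESeq code itself: the function names (`size_factors`, `normalize`) and the toy counts are hypothetical, and skipping genes with a zero count is an assumption made here so that the geometric mean is defined.

```python
import math
from statistics import median


def size_factors(counts):
    """Median-of-ratios size factors.

    counts: dict mapping gene name -> list of raw read counts (one per sample).
    Returns a list with one size factor per sample.
    """
    n_samples = len(next(iter(counts.values())))
    # Assumption: genes with a zero count in any sample are ignored here,
    # because their geometric mean (computed through logs) is undefined.
    usable = [row for row in counts.values() if all(c > 0 for c in row)]
    log_geo_means = [sum(math.log(c) for c in row) / n_samples for row in usable]
    factors = []
    for j in range(n_samples):
        log_ratios = [math.log(row[j]) - g for row, g in zip(usable, log_geo_means)]
        factors.append(math.exp(median(log_ratios)))
    return factors


def normalize(counts):
    """Divide each sample's counts by its size factor."""
    factors = size_factors(counts)
    return {gene: [c / f for c, f in zip(row, factors)] for gene, row in counts.items()}


# Toy data (hypothetical): sample 2 was sequenced twice as deeply as sample 1.
toy_counts = {"geneA": [10, 20], "geneB": [100, 200], "geneC": [5, 10], "geneD": [0, 3]}
toy_factors = size_factors(toy_counts)
toy_normalized = normalize(toy_counts)
```

Because the median is taken over all usable genes, a minority of strongly regulated genes does not shift the size factors, which is where the assumption of few differentially expressed genes comes in.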

The differential analysis relies on a statistical model, and more precisely on the negative binomial distribution with variance and mean related by local regression. This approach allows us to identify, for each gene, whether the observed difference in read counts is significant. An adjustment for multiple testing with the procedure of Benjamini and Hochberg [85] was also performed. We considered a gene to be differentially expressed when the adjusted p-value was lower than 0.001 and the absolute value of the log2(FC) was higher than 2. Here, FC refers to the fold change of the read counts for the tested condition against the read counts for the reference condition. In this way, we independently compared, at 24 h and 48 h, the read counts obtained on G75-L25 and G90-L10 to those obtained on G100 or L100. In addition, read counts obtained on L100 were also compared to those obtained on G100. This approach, sketched in the circuit design displayed in Additional file 2, leads to ten possibilities for a gene to be identified as differentially expressed.
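The selection rule and the count of ten comparisons can be made concrete with a short Python sketch. It assumes raw p-values and log2 fold changes are already available (the negative binomial test itself is not reproduced), and the function names `benjamini_hochberg` and `is_de` are hypothetical. The thresholds are those stated in the text.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted p-values, in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value to the smallest so that adjusted values
    # stay monotone: adj(rank) = min over ranks r >= rank of p(r) * m / r.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


def is_de(padj, log2fc, alpha=0.001, min_abs_log2fc=2.0):
    """Selection rule from the text: adjusted p < 0.001 and |log2(FC)| > 2."""
    return padj < alpha and abs(log2fc) > min_abs_log2fc


# The ten comparisons: five (test, reference) pairs at each of two time points.
TIMES = ("24h", "48h")
PAIRS = [(test, ref) for test in ("G75-L25", "G90-L10") for ref in ("G100", "L100")]
PAIRS.append(("L100", "G100"))
CONTRASTS = [(time, test, ref) for time in TIMES for test, ref in PAIRS]
```

A gene enters the DE set as soon as `is_de` holds for at least one of the entries of `CONTRASTS`.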

Gene expression matrix construction

For clustering and network inference, the establishment of a relevant gene expression matrix is needed.

For this purpose, we used results from the differential analysis. More precisely, we selected the subset of genes which are identified as differentially expressed in at least one of the ten studied comparisons. We decided to remove genes having at least one missing value over the ten comparisons. Doing this, we selected 650 genes for which a complete expression profile was available, composed of ten log2 expression ratio values, leading to the gene expression matrix used to carry out the clustering. We note that, in this matrix, the fold change is computed on the average of the read counts across the biological replicates for a given condition (test or reference). For the network inference part, we chose to work with a slightly modified version of this expression matrix, while keeping the same initial set of 650 DE genes. To strengthen the relevance of the metric used in network inference methods, we chose to keep all biological replicates separate for the tested conditions, while all replicates of a reference condition were pooled, with pure glucose or pure lactose chosen as reference conditions. In other words, the log fold change is computed between the read count coming from a biological replicate of the test condition and the averaged read counts of the reference condition. Hence, for a given comparison, we obtained as many log fold changes as biological replicates. In order to limit the variability caused by this approach, we removed genes for which a biological replicate has a null read count. As a result, the final matrix contains 593 genes, where for each gene the expression profile contains 32 components. This procedure allows us to deal with expression profiles having a sufficient number of components to obtain a more reliable inferred network.
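The replicate-wise fold change used for the network inference matrix can be sketched in a few lines of Python. This is an illustrative reading of the description above, not the authors' script: the function name `replicate_log2fc` is hypothetical, and returning `None` when the pooled reference mean is zero is an extra assumption added to keep the logarithm defined.

```python
import math


def replicate_log2fc(test_replicates, reference_replicates):
    """One log2 fold change per test replicate, against the pooled reference.

    test_replicates: read counts of one gene in each replicate of the test condition.
    reference_replicates: read counts of the same gene in the reference condition.
    Returns None when the gene must be discarded (a null read count in a
    test replicate, as stated in the text).
    """
    if any(count == 0 for count in test_replicates):
        return None
    reference_mean = sum(reference_replicates) / len(reference_replicates)
    if reference_mean == 0:
        # Assumption: a gene with no reads in the reference is also discarded.
        return None
    return [math.log2(count / reference_mean) for count in test_replicates]


# Toy example (hypothetical counts): two test replicates, four reference replicates.
profile = replicate_log2fc([8, 16], [2, 2, 2, 2])
```

Concatenating the lists obtained for every comparison gives one row of the expression matrix, with as many components per comparison as there are test replicates.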

Clustering and functional analysis of differentially expressed genes

Clustering

As previously mentioned, clustering is performed on the 650 genes. Each gene is characterized by its ten-component expression profile. The following approach was entirely performed using the Multi Experiment Viewer (MeV) software [86]. Firstly, a hierarchical clustering allows us to estimate the optimal number K of clusters hidden in the data. With the Euclidean distance metric and the average linkage method, results suggest K=5 clusters. Then, the K-means algorithm (originating in [87]) is used to obtain a final gene classification. As this method is sensitive to initialization, we performed ten independent runs of K-means with random initialization; the Euclidean distance is used for each run. Results are subsequently aggregated into five consensus clusters. The aggregation is constrained by a co-occurrence threshold, fixed to 80 %. As a result, the 650 genes are completely classified into five clusters and no gene was left unassigned.
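The aggregation of several K-means runs under a co-occurrence threshold can be illustrated as follows. This Python sketch takes the cluster labels of each run as given and merges genes that fall in the same cluster in at least 80 % of the runs. Merging through connected components (a union-find structure) is an assumption made for this illustration; the exact aggregation rule implemented in MeV may differ, and the name `consensus_clusters` is hypothetical.

```python
from itertools import combinations


def consensus_clusters(runs, threshold=0.8):
    """Group genes that co-occur in the same cluster in >= threshold of the runs.

    runs: list of label lists, one per K-means run, all in the same gene order.
    Returns a sorted list of groups, each a list of gene indices.
    """
    n_genes = len(runs[0])
    parent = list(range(n_genes))

    def find(i):
        # Find the representative of i, compressing the path on the way.
        while parent[i] != i:
            parent[i] = parent[parent[i]]
            i = parent[i]
        return i

    for i, j in combinations(range(n_genes), 2):
        together = sum(1 for labels in runs if labels[i] == labels[j])
        if together / len(runs) >= threshold:
            parent[find(i)] = find(j)

    groups = {}
    for i in range(n_genes):
        groups.setdefault(find(i), []).append(i)
    return sorted(groups.values())


# Toy example (hypothetical labels): five runs over four genes.
toy_runs = [
    [0, 0, 1, 1],
    [1, 1, 0, 0],
    [0, 0, 1, 1],
    [0, 0, 1, 1],
    [0, 1, 1, 1],
]
toy_groups = consensus_clusters(toy_runs)
```

Note that only co-membership matters, so the arbitrary numbering of clusters in each run (run 2 above swaps labels 0 and 1) has no effect on the result.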

Functional analysis

A functional analysis was performed through a full expert annotation of the classified genes. More precisely, our annotation is based on the functional annotation from the T. reesei v2.0 JGI portal [88], including EC annotation and KEGG pathway, SignalP, InterPro and GO files. This first annotation was complemented by functional annotation derived from orthologs in several species (S. cerevisiae, P. anserina, S. pombe, A. nidulans and N. crassa). The orthologs were obtained from FungiPath (www.fungipath.i2bc.paris-saclay.fr/). Finally, the annotation was updated regularly using the literature, in addition to the more recent annotation given in [89]. By convention in this manuscript, we denote by Tr–XXX the gene in T. reesei for which the ortholog in another species is XXX. Otherwise, genes are labeled as unknown. This functional annotation allows us to manually provide meaning to the clustering results. Note that we deliberately chose to use protein IDs of the reference genome QM6a. However, in Additional file 3, we provide the correspondence with Rut-C30 protein IDs given by [14].

Network inference and promoter analysis

Network inference

Network inference was performed using the gene expression matrix containing 593 genes (and 32 differential expression levels) as input. We first obtained a complete weighted network \(\mathcal {G}(\mathcal {V}, \mathcal {E} ; \omega)\), linking all genes \(\mathcal {V}\) by links \(\mathcal {E}\) with weight ω. This step was performed using the CLR (Context Likelihood of Relatedness) algorithm [61]. The weights ωi,j, assigned to each pair (i,j) of genes, are based on the mutual information metric, which quantifies the mutual dependence or the information shared between expression profiles of genes i and j. From this complete gene network, a threshold selects the most relevant gene links. For this purpose, we used the network enhancement algorithm BRANE Cut [40]. Briefly, each edge ei,j in the complete network is labeled by a variable xi,j set to 1 if the link has to be in the final network, and 0 otherwise. By optimizing a cost function over the variable \(\boldsymbol {x} = \left (x_{i,j}\right)_{i \in \mathcal {V}, j \in \mathcal {V}}\), the minimizer x gives us the optimal set of links of the final graph. In order to select the relevant links, biological and structural constraints are encoded in the cost function. Indeed, in addition to favoring strongly weighted edges, this post-processing method prefers links around labeled transcription factors (TF). Moreover, thanks to an additional constraint, links between a gene and a pair of transcription factors are also preferentially selected when this pair is identified as co-regulators. As a result, we obtain an inferred network composed of 161 genes and 205 edges.
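To make the CLR weighting step more tangible, the sketch below starts from an already computed mutual information matrix and derives CLR weights: each mutual information value is turned into a z-score against the background of each of the two genes, negative z-scores are clipped to zero, and the two are combined as a Euclidean norm. The estimation of mutual information is not shown. The final `threshold_edges` function is a plain cutoff used here only as a stand-in; it is not BRANE Cut, whose cost function also favors edges around transcription factors. All function names, and the choice to exclude the diagonal from the background, are assumptions of this sketch.

```python
import math
from statistics import mean, pstdev


def clr_scores(mi):
    """CLR weights from a symmetric mutual information matrix (list of lists)."""
    n = len(mi)
    background = []
    for i in range(n):
        # Assumption: the background of gene i excludes its self-information.
        others = [mi[i][k] for k in range(n) if k != i]
        background.append((mean(others), pstdev(others)))

    def z(i, value):
        mu, sigma = background[i]
        if sigma == 0:
            return 0.0
        return max(0.0, (value - mu) / sigma)

    weights = [[0.0] * n for _ in range(n)]
    for i in range(n):
        for j in range(i + 1, n):
            score = math.sqrt(z(i, mi[i][j]) ** 2 + z(j, mi[i][j]) ** 2)
            weights[i][j] = weights[j][i] = score
    return weights


def threshold_edges(weights, cutoff):
    """Naive edge selection by cutoff (a stand-in, not BRANE Cut)."""
    n = len(weights)
    return [(i, j) for i in range(n) for j in range(i + 1, n) if weights[i][j] >= cutoff]


# Toy mutual information matrix (hypothetical values) for three genes.
toy_mi = [
    [0.0, 1.0, 3.0],
    [1.0, 0.0, 1.0],
    [3.0, 1.0, 0.0],
]
toy_weights = clr_scores(toy_mi)
toy_edges = threshold_edges(toy_weights, cutoff=1.0)
```

In this toy case only the pair of genes 0 and 2 stands out from both backgrounds, so it is the only edge kept.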

Promoter analysis

The promoter analysis was performed using the Regulatory Sequence Analysis Tools (RSAT) software [63]. From each set of genes to study (linked to a specific TF), promoter sequences from −1 to −1000 upstream bases are retrieved using the retrieve sequence tool. From these sequences, a detection of over-represented oligonucleotides was performed using the oligo-analysis tool. We used the reference sequence set of Trichoderma reesei as background model. As mentioned in [90], this choice is driven by the fact that the input sequences (the query) are a subset of a larger collection (the reference). As a result, we obtain a list of over-represented oligonucleotides (from hexa- to octanucleotides) and several larger motifs assembled from the previous ones using the pattern assembly tool. Significance and count matrices are also obtained at this stage and lead to the establishment of sequence logos of the binding motifs. In order to detect the occurrences of the previously discovered patterns, we used the string-based pattern matching (dna-pattern) tool. It provides a list of features indicating the positions of the motifs in the input sequences. A suitable way to deal with these data is to visualize them using the feature map tool. On the feature map, the presence of overlapping or close motifs is commonly a good indication of the relevance of the discovered motif. This methodology suggests that the initial set of tested genes contains a binding site for the linked TF. From the given occurrences, we also computed the average number of discovered sites on the tested subset of genes. Then, in order to assess statistical significance, we performed two statistical analyses: one based on the promoter sequences of the whole genome, the other based on a set of random promoter sequences. For both statistical analyses, the occurrences are also computed and averaged over the number of involved sequences. A t-test was then carried out to assess the significance of the average number of discovered sites on the tested subset. Significance was declared for a p-value lower than 0.05.
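The occurrence counting and the comparison of averages can be sketched with the Python standard library. This is an illustration under stated assumptions, not the RSAT implementation: the motif 5'-GGC(A/T)3-3' is read as GGC followed by three A or T bases, sites are counted on both strands, the test statistic is Welch's (the text does not specify the t-test variant), and the function names are hypothetical. The p-value is not computed here, since it requires the t distribution, which is not in the standard library.

```python
import math
import re
from statistics import mean, variance

# Assumed reading of 5'-GGC(A/T)3-3'; the lookahead lets overlapping sites count.
XYR1_MOTIF = re.compile(r"(?=(GGC[AT]{3}))")
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(sequence):
    """Reverse complement of an upper-case DNA string."""
    return sequence.translate(COMPLEMENT)[::-1]


def count_sites(sequence, pattern=XYR1_MOTIF):
    """Number of motif occurrences on both strands of one promoter sequence."""
    sequence = sequence.upper()
    forward = len(pattern.findall(sequence))
    reverse = len(pattern.findall(reverse_complement(sequence)))
    return forward + reverse


def welch_t(sample_a, sample_b):
    """Welch's t statistic comparing the means of two samples of site counts."""
    standard_error = math.sqrt(
        variance(sample_a) / len(sample_a) + variance(sample_b) / len(sample_b)
    )
    return (mean(sample_a) - mean(sample_b)) / standard_error


# Toy promoters (hypothetical sequences), one count per sequence.
target_counts = [count_sites(s) for s in ("GGCTAAGGCATT", "TTAGCCGGCAAA", "GGCTTT")]
background_counts = [count_sites(s) for s in ("ACGTACGT", "GGCTAA", "CCCCGGGG")]
```

The statistic returned by `welch_t(target_counts, background_counts)` would then be compared to a t distribution to obtain the p-value tested against the 0.05 threshold.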

Availability of data and materials

The datasets used and analyzed during the current study are available on the GEO repository (www.ncbi.nlm.nih.gov/geo/) under accession number: GSE82287. We also used two additional sources: the functional annotation from the T. reesei v2.0 JGI portal (https://mycocosm.jgi.doe.gov/Trire2/Trire2.home.html) and the orthologs from FungiPath (http://fungipath.i2bc.paris-saclay.fr/).

Abbreviations

CLR: Context likelihood of relatedness

DE: Differentially expressed

FC: Fold change

GO: Gene ontology

GRN: Gene regulatory network

JGI: Joint genome institute

LPMO: Lytic polysaccharide monooxygenase

MFS: Major facilitator superfamily

RSAT: Regulatory sequence analysis tool

TF: Transcription factor

EC: Enzyme commission

References

  1. 1

    Bischof RH, Ramoni J, Seiboth B. Cellulases and beyond: the first 70 years of the enzyme producer Trichoderma reesei. Microb Cell Fact. 2016;15(106). https://doi.org/10.1186/s12934-016-0507-6.

  2. 2

    Eveleigh DE, Montenecourt BS. Increasing yields of extracellular enzymes In: Perlman D, editor. Advances in Applied Microbiology, vol 25. New York, London, Toronto, Sydney, San Francisco: Academic Press: 1979. p. 57–74. https://doi.org/10.1016/S0065-2164(08)70146-1.

    Google Scholar 

  3. 3

    Kawamori M, Morikawa Y, Shinsha Y, Takayama K, Takasawa S. Preparation of mutants resistant to catabolite repression of Trichoderma reesei. Agri Biol Chem. 1985; 49(10):2875–9. https://doi.org/10.1080/00021369.1985.10867203.

    CAS  Google Scholar 

  4. 4

    Kawamori M, Morikawa Y, Takasawa S. Induction and production of cellulases by L-sorbose in Trichoderma reesei. Appl Microbiol Biotechnol. 1986; 24:449–53. https://doi.org/10.1007/BF00250321.

    CAS  Google Scholar 

  5. 5

    Durand H, Clanet M, Tiraby G. Genetic improvement of Trichoderma reesei for large scale cellulase production. Enzym Microb Technol. 1988; 10:341–6. https://doi.org/10.1016/0141-0229(88)90012-9.

    CAS  Article  Google Scholar 

  6. 6

    Kubicek CP, Mikus M, Schuster A, Schmoll M, Seiboth B. Metabolic engineering strategies for the improvement of cellulase production by Hypocrea jecorina. Biotechnol Biofuels. 2009; 2(1):19. https://doi.org/10.1186/1754-6834-2-19.

    PubMed  PubMed Central  Article  CAS  Google Scholar 

  7. 7

    Peterson R, Nevalainen H. Trichoderma reesei RUT-C30 — thirty years of strain improvement. Microbiology. 2012; 158(1):58–68. https://doi.org/10.1099/mic.0.054031-0.

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  8. 8

    Bisaria VS, Ghose TK. Biodegradation of cellulosic materials: substrates, microorganisms, enzymes and products. Enzym Microb Technol. 1981; 3:90–104. https://doi.org/10.1016/0141-0229(81)90066-1.

    CAS  Article  Google Scholar 

  9. 9

    Ilmén M, Thrane C, Penttilä M. The glucose repressor gene cre1 of Trichoderma: isolation and expression of a full-length and a truncated mutant form. Mol Gen Genet. 1996; 251(4):451–60. https://doi.org/10.1007/BF02172374.

    PubMed  PubMed Central  Google Scholar 

  10. 10

    Geysens S, Pakula T, Uusitalo J, Dewerte I, Penttilä M, Contreras R. Cloning and characterization of the glucosidase II alpha subunit gene of Trichoderma reesei: a frameshift mutation results in the aberrant glycosylation profile of the hypercellulolytic strain Rut-C30. Appl Environ Microbiol. 2005; 71(6):2910–24. https://doi.org/10.1128/AEM.71.6.2910-2924.2005.

    CAS  PubMed  PubMed Central  Article  Google Scholar 

  11. 11

    Seidl V, Gamauf C, Druzhinina IS, Seiboth B, Hartl L, Kubicek CP. The Hypocrea jecorina (Trichoderma reesei) hypercellulolytic mutant RUT C30 lacks a 85 kb (29 gene-encoding) region of the wild-type genome. BMC Genom. 2008; 9:327. https://doi.org/10.1186/1471-2164-9-327.

    Article  CAS  Google Scholar 

  12. 12

    Le Crom S, Schackwitz W, Pennacchio L, Magnuson JK, Culley DE, Collett JR, Martin J, Druzhinina IS, Mathis H, Monot F, Seiboth B, Cherry B, Rey M, Berka R, Kubicek CP, Baker SE, Margeot A. Tracking the roots of cellulase hyperproduction by the fungus Trichoderma reesei using massively parallel DNA sequencing. Proc Nat Acad Sci USA. 2009; 106(38):16151–6. https://doi.org/10.1073/pnas.0905848106.

    CAS  PubMed  Article  Google Scholar 

  13. 13

    Vitikainen M, Arvas M, Pakula T, Oja M, Penttilä M, Saloheimo M. Array comparative genomic hybridization analysis of Trichoderma reesei strains with enhanced cellulase production properties. BMC Genom. 2010; 11:441. https://doi.org/10.1186/1471-2164-11-441.

  14.

    Koike H, Aerts A, LaButti K, Grigoriev IV, Baker SE. Comparative genomics analysis of Trichoderma reesei strains. Ind Biotechnol. 2013; 9(6):352–67. https://doi.org/10.1089/ind.2013.0015.

  15.

    Nitta M, Furukawa T, Shida Y, Mori K, Kuhara S, Morikawa Y, Ogasawara W. A new Zn(II)2Cys6-type transcription factor BglR regulates β-glucosidase expression in Trichoderma reesei. Fungal Genet Biol. 2012; 49:388–97. https://doi.org/10.1016/j.fgb.2012.02.009.

  16.

    Ronne H. Glucose repression in fungi. Trends Genet. 1995; 11(1):12–7. https://doi.org/10.1016/S0168-9525(00)88980-5.

  17.

    Rassinger A, Gacek-Matthews A, Strauss J, Mach RL, Mach-Aigner AR. Truncation of the transcriptional repressor protein Cre1 in Trichoderma reesei Rut-C30 turns it into an activator. Fungal Biol Biotechnol. 2018; 5:15. https://doi.org/10.1186/s40694-018-0059-0.

  18.

    Derntl C, Gudynaite-Savitch L, Calixte S, White T, Mach RL, Mach-Aigner AR. Mutation of the Xylanase regulator 1 causes a glucose blind hydrolase expressing phenotype in industrially used Trichoderma strains. Biotechnol Biofuels. 2013; 6:62. https://doi.org/10.1186/1754-6834-6-62.

  19.

    Stricker AR, Grosstessner-Hain K, Würleitner E, Mach RL. Xyr1 (xylanase regulator 1) regulates both the hydrolytic enzyme system and D-xylose metabolism in Hypocrea jecorina. Eukaryot Cell. 2006; 5(12):2128–37. https://doi.org/10.1128/ec.00211-06.

  20.

    Aro N, Ilmén M, Saloheimo A, Penttilä M. ACEI of Trichoderma reesei is a repressor of cellulase and xylanase expression. Appl Environ Microbiol. 2003; 69(1):56–65. https://doi.org/10.1128/AEM.69.1.56-65.2003.

  21.

    Aro N, Saloheimo A, Ilmén M, Penttilä M. ACE II, a novel transcriptional activator involved in regulation of cellulase and xylanase genes of Trichoderma reesei. J Biol Chem. 2001; 276(26):24309–14.

  22.

    Häkkinen M, Valkonen MJ, Westerholm-Parvinen A, Aro N, Arvas M, Vitikainen M, Penttilä M, Saloheimo M, Pakula TM. Screening of candidate regulators for cellulase and hemicellulase production in Trichoderma reesei and identification of a factor essential for cellulase production. Biotechnol Biofuels. 2014; 7(1):14. https://doi.org/10.1186/1754-6834-7-14.

  23.

    Zeilinger S, Ebner A, Marosits T, Mach R, Kubicek CP. The Hypocrea jecorina HAP 2/3/5 protein complex binds to the inverted CCAAT-box (ATTGG) within the cbh2 (cellobiohydrolase II-gene) activating element. Mol Genet Genomics. 2001; 266:56–63. https://doi.org/10.1007/s004380100518.

  24.

    He R, Ma L, Li C, Jia W, Li D, Dongyuan Z, Shulin C. Trpac1, a pH response transcription regulator, is involved in cellulase gene expression in Trichoderma reesei. Enzyme Microb Technol. 2014; 67:17–26. https://doi.org/10.1016/j.enzmictec.2014.08.013.

  25.

    Derntl C, Rassinger A, Srebotnik E, Mach RL, Mach-Aigner AR. Xpp1 regulates the expression of xylanases, but not of cellulases in Trichoderma reesei. Biotechnol Biofuels. 2015; 8:112. https://doi.org/10.1186/s13068-015-0298-8.

  26.

    Cao Y, Zheng F, Wang L, Zhao G, Chen G, Zhang W, Liu W. Rce1, a novel transcriptional repressor, regulates cellulase gene expression by antagonizing the transactivator Xyr1 in Trichoderma reesei. Mol Microbiol. 2017; 105(1):65–83. https://doi.org/10.1111/mmi.13685.

  27.

    Liu K, Dong Y, Wang F, Jiang B, Wang M, Fang X. Regulation of cellulase expression, sporulation, and morphogenesis by velvet family proteins in Trichoderma reesei. Appl Microbiol Biotechnol. 2016; 100(2):769–79. https://doi.org/10.1007/s00253-015-7059-2.

  28.

    Zheng F, Cao Y, Wang L, Lv X, Meng X, Zhang W, Chen G, Liu W. The mating type locus protein MAT1-2-1 of Trichoderma reesei interacts with Xyr1 and regulates cellulase gene expression in response to light. Sci Rep. 2017; 7:17346. https://doi.org/10.1038/s41598-017-17439-2.

  29.

    Ivanova C, Ramoni J, Aouam T, Frischmann A, Seiboth B, Baker SE, Le Crom S, Lemoine S, Margeot A, Bidard F. Genome sequencing and transcriptome analysis of Trichoderma reesei QM9978 strain reveals a distal chromosome translocation to be responsible for loss of vib1 expression and loss of cellulase induction. Biotechnol Biofuels. 2017; 10(1):209. https://doi.org/10.1186/s13068-017-0897-7.

  30.

    Zhang F, Zhao X, Bai F. Improvement of cellulase production in Trichoderma reesei Rut-C30 by overexpression of a novel regulatory gene Trvib-1. Bioresour Technol. 2018; 247:676–83. https://doi.org/10.1016/j.biortech.2017.09.126.

  31.

    Wang L, Lv X, Cao Y, Zheng F, Meng X, Shen Y, Chen G, Liu W, Zhang W. A novel transcriptional regulator RXE1 modulates the essential transactivator XYR1 and cellulase gene expression in Trichoderma reesei. Appl Microbiol Biotechnol. 2019; 103(11):4511–23. https://doi.org/10.1007/s00253-019-09739-6.

  32.

    Benocci T, Aguilar-Pontes MV, Kun RS, Lubbers RJM, Lail K, Wang M, Lipzen A, Ng V, Grigoriev IV, Seiboth B, Daly P, de Vries RP. Deletion of either the regulatory gene ara1 or metabolic gene xki1 in Trichoderma reesei leads to increased CAZyme gene expression on crude plant biomass. Biotechnol Biofuels. 2019; 12:81. https://doi.org/10.1186/s13068-019-1422-y.

  33.

    Coradetti ST, Craig JP, Xiong Y, Shock T, Tian C, Glass NL. Conserved and essential transcription factors for cellulase gene expression in ascomycete fungi. Proc Nat Acad Sci USA. 2012; 109(19):7397–402. https://doi.org/10.1073/pnas.1200785109.

  34.

    Campos Antoniêto AC, Nogueira KMV, de Paula RG, Nora LC, Cassiano MHA, Guazzaroni M-E, Almeida F, da Silva TA, Ries LNA, de Assis LJ, Goldman GH, Silva RN, Silva-Rocha R. A novel Cys2His2 Zinc finger homolog of AZF1 modulates holocellulase expression in Trichoderma reesei. mSystems. 2019; 4(4). https://doi.org/10.1128/msystems.00161-19.

  35.

    Xiong Y-R, Zhao S, Fu L-H, Liao X-Z, Li C-X, Yan Y-S, Liao L-S, Feng J-X. Characterization of novel roles of a HMG-box protein PoxHmbB in biomass-degrading enzyme production by Penicillium oxalicum. Appl Microbiol Biotechnol. 2018; 102(8):3739–53. https://doi.org/10.1007/s00253-018-8867-y.

  36.

    Yao G, Li Z, Wu R, Qin Y, Liu G, Qu Y. Penicillium oxalicum PoFlbC regulates fungal asexual development and is important for cellulase gene expression. Fungal Genet Biol. 2016; 86:91–102. https://doi.org/10.1016/j.fgb.2015.12.012.

  37.

    Zhao S, Yan Y-S, He Q-P, Yang L, Yin X, Li C-X, Mao L-C, Liao L-S, Huang J-Q, Xie S-B, Nong Q-D, Zhang Z, Jing L, Xiong Y-R, Duan C-J, Liu J-L, Feng J-X. Comparative genomic, transcriptomic and secretomic profiling of Penicillium oxalicum HP7-1 and its cellulase and xylanase hyper-producing mutant EU2106, and identification of two novel regulatory genes of cellulase and xylanase gene expression. Biotechnol Biofuels. 2016; 9(203). https://doi.org/10.1186/s13068-016-0616-9.

  38.

    He Q-P, Zhao S, Wang J-X, Li C-X, Yan Y-S, Wang L, Liao L-S, Feng J-X. Transcription factor NsdD regulates the expression of genes involved in plant biomass-degrading enzymes, conidiation and pigment biosynthesis in Penicillium oxalicum. Appl Environ Microbiol. 2018. https://doi.org/10.1128/aem.01039-18.

  39.

    Rapaport F, Zinovyev A, Dutreix M, Barillot E, Vert J-P. Classification of microarray data using gene networks. BMC Bioinformatics. 2007; 8(1):35. https://doi.org/10.1186/1471-2105-8-35.

  40.

    Pirayre A, Couprie C, Bidard F, Duval L, Pesquet J-C. BRANE Cut: biologically-related a priori network enhancement with graph cuts for gene regulatory network inference. BMC Bioinformatics. 2015; 16(1):369. https://doi.org/10.1186/s12859-015-0754-2.

  41.

    Marbach D, Prill RJ, Schaffter T, Mattiussi C, Floreano D, Stolovitzky G. Revealing strengths and weaknesses of methods for gene network inference. Proc Nat Acad Sci USA. 2010; 107(14):6286–91. https://doi.org/10.1073/pnas.0913357107.

  42.

    Marbach D, Costello JC, Küffner R, Vega NM, Prill RJ, Camacho DM, Allison KR, The DREAM5 Consortium, Kellis M, Collins JJ, Stolovitzky G. Wisdom of crowds for robust gene network inference. Nat Meth. 2012; 9(8):796–804.

  43.

    Farkaš V, Šesták S, Grešík M, Kolarova N, Labudová I, Bauer Š. Induction of cellulase in Trichoderma reesei grown on lactose. Acta Biotechnol. 1987; 7(5):425–9. https://doi.org/10.1002/abio.370070510.

  44.

    Aro N, Pakula T, Penttilä M. Transcriptional regulation of plant cell wall degradation by filamentous fungi. FEMS Microbiol Rev. 2005; 29(4):719–39.

  45.

    Berlin A, Maximenko V, Gilkes N, Saddler J. Optimization of enzyme complexes for lignocellulose hydrolysis. Biotechnol Bioeng. 2007; 97(2):287–96. https://doi.org/10.1002/bit.21238.

  46.

    Chen M, Zhao J, Xia L. Enzymatic hydrolysis of maize straw polysaccharides for the production of reducing sugars. Carbohydr Polym. 2008; 71(3):411–5. https://doi.org/10.1016/j.carbpol.2007.06.011.

  47.

    Nakazawa H, Kawai T, Ida N, Shida Y, Kobayashi Y, Okada H, Tani S, Sumitani J-I, Kawaguchi T, Morikawa Y, Ogasawara W. Construction of a recombinant Trichoderma reesei strain expressing Aspergillus aculeatus β-glucosidase 1 for efficient biomass conversion. Biotechnol Bioeng. 2012; 109(1):92–9.

  48.

    Pallapolu VR, Lee YY, Garlock RJ, Balan V, Dale BE, Kim Y, Mosier NS, Ladisch MR, Falls M, Holtzapple MT, Sierra-Ramirez R, Shi J, Ebrik MA, Redmond T, Yang B, Wyman CE, Donohoe BS, Vinzant TB, Elander RT, Hames B, Thomas S, Warner RE. Effects of enzyme loading and β-glucosidase supplementation on enzymatic hydrolysis of switchgrass processed by leading pretreatment technologies. Bioresour Technol. 2011; 102(24):11115–20. https://doi.org/10.1016/j.biortech.2011.03.085.

  49.

    Del Pozo MV, Fernández-Arrojo L, Gil-Martínez J, Montesinos A, Chernikova TN, Nechitaylo TY, Waliszek A, Tortajada M, Rojas A, Huws SA, Golyshina OV, Newbold CJ, Polaina J, Ferrer M, Golyshin PN. Microbial β-glucosidases from cow rumen metagenome enhance the saccharification of lignocellulose in combination with commercial cellulase cocktail. Biotechnol Biofuels. 2012; 5(1):73. https://doi.org/10.1186/1754-6834-5-73.

  50.

    Fang H, Zhao R, Li C, Zhao C. Simultaneous enhancement of the beta–exo synergism and exo–exo synergism in Trichoderma reesei cellulase to increase the cellulose degrading capability. Microb Cell Fact. 2019; 18(9). https://doi.org/10.1186/s12934-019-1060-x.

  51.

    Zhang J, Zhong Y, Zhao X, Wang T. Development of the cellulolytic fungus Trichoderma reesei strain with enhanced β-glucosidase and filter paper activity using strong artifical cellobiohydrolase 1 promoter. Bioresour Technol. 2010; 101(24):9815–8. https://doi.org/10.1016/j.biortech.2010.07.078.

  52.

    Jourdier E, Cohen C, Poughon L, Larroche C, Monot F, Ben Chaabane F. Cellulase activity mapping of Trichoderma reesei cultivated in sugar mixtures under fed-batch conditions. Biotechnol Biofuels. 2013; 6(1):79.

  53.

    Jourdier E, Poughon L, Larroche C, Monot F, Ben Chaabane F. A new stoichiometric miniaturization strategy for screening of industrial microbial strains: application to cellulase hyper-producing Trichoderma reesei strains. Microb Cell Fact. 2012; 11(1):1–11.

  54.

    Poggi-Parodi D, Bidard F, Pirayre A, Portnoy T, Blugeon C, Seiboth B, Kubicek CP, Le Crom S, Margeot A. Kinetic transcriptome analysis reveals an essentially intact induction system in a cellulase hyper-producer Trichoderma reesei strain. Biotechnol Biofuels. 2014; 7(1). https://doi.org/10.1186/s13068-014-0173-z.

  55.

    Boni AC, Ambrósio DL, Cupertino FB, Montenegro-Montero A, Virgilio S, Freitas FZ, Corrocher FA, Gonçalves RD, Yang A, Weirauch MT, Hughes TR, Larrondo LF, Bertolini MC. Neurospora crassa developmental control mediated by the FLB-3 transcription factor. Fungal Biol. 2018; 122(6):570–82. https://doi.org/10.1016/j.funbio.2018.01.004.

  56.

    Masloff S, Pöggeler S, Kück U. The pro1+ gene from Sordaria macrospora encodes a C6 zinc finger transcription factor required for fruiting body development. Genetics. 1999; 152(1):191–9.

  57.

    Fu C, Iyer P, Herkal A, Abdullah J, Stout A, Free SJ. Identification and characterization of genes required for cell-to-cell fusion in Neurospora crassa. Eukaryot Cell. 2011; 10(8):1100–9. https://doi.org/10.1128/EC.05003-11.

  58.

    Sun X, Wang F, Lan N, Liu B, Hu C, Xue W, Zhang Z, Li S. The Zn(II)2Cys6 transcription factor ADA-6 regulates conidiation, sexual development, and oxidative stress response in Neurospora crassa. Front Microbiol. 2019; 10:750. https://doi.org/10.3389/fmicb.2019.00750.

  59.

    Fan F, Ma G, Li J, Liu Q, Benz JP, Tian C, Ma Y. Genome-wide analysis of the endoplasmic reticulum stress response during lignocellulase production in Neurospora crassa. Biotechnol Biofuels. 2015; 8(66). https://doi.org/10.1186/s13068-015-0248-5.

  60.

    Derntl C, Rassinger A, Srebotnik E, Mach RL, Mach-Aigner AR. Identification of the main regulator responsible for synthesis of the typical yellow pigment produced by Trichoderma reesei. Appl Environ Microbiol. 2016; 82(20):6247–57. https://doi.org/10.1128/AEM.01408-16.

  61.

    Faith JJ, Hayete B, Thaden JT, Mogno I, Wierzbowski J, Cottarel G, Kasif S, Collins JJ, Gardner TS. Large-scale mapping and validation of Escherichia coli transcriptional regulation from a compendium of expression profiles. PLoS Biol. 2007; 5(1):54–66. https://doi.org/10.1371/journal.pbio.0050008.

  62.

    Pirayre A, Couprie C, Duval L, Pesquet J-C. BRANE Clust: cluster-assisted gene regulatory network inference refinement. IEEE/ACM Trans Comput Biol Bioinforma. 2018; 15(3):850–60. https://doi.org/10.1109/TCBB.2017.2688355.

  63.

    Thomas-Chollier M, Sand O, Turatsinze JV, Janky R, Defrance M, Vervisch E, Brohee S, van Helden J. RSAT: regulatory sequence analysis tools. Nucleic Acids Res. 2008; 36(Web Server issue):119–27. https://doi.org/10.1093/nar/gkn304.

  64.

    Ivanova C, Bååth JA, Seiboth B, Kubicek CP. Systems analysis of lactose metabolism in Trichoderma reesei identifies a lactose permease that is essential for cellulase induction. PLoS ONE. 2013; 8(5):62631. https://doi.org/10.1371/journal.pone.0062631.

  65.

    Zhang W, Kou Y, Xu J, Cao Y, Zhao G, Shao J, Wang H, Wang Z, Bao X, Chen G, Liu W. Two major facilitator superfamily sugar transporters from Trichoderma reesei and their roles in induction of cellulase biosynthesis. J Biol Chem. 2013; 288(46):32861–72. https://doi.org/10.1074/jbc.M113.505826.

  66.

    Biely P, Puchart V, Stringer MA, Mørkeberg Krogh KBR. Trichoderma reesei XYN VI — a novel appendage-dependent eukaryotic glucuronoxylan hydrolase. FEBS J. 2014; 281(17):3894–903.

  67.

    Furukawa T, Shida Y, Kitagami N, Mori K, Kato M, Kobayashi T, Okada H, Ogasawara W, Morikawa Y. Identification of specific binding sites for XYR1, a transcriptional activator of cellulolytic and xylanolytic genes in Trichoderma reesei. Fungal Genet Biol. 2009; 46(8):564–74.

  68.

    Qin Y, Bao L, Gao M, Chen M, Lei Y, Liu G, Qu Y. Penicillium decumbens BrlA extensively regulates secondary metabolism and functionally associates with the expression of cellulase genes. Appl Microbiol Biotechnol. 2013; 97(24):10453–67. https://doi.org/10.1007/s00253-013-5273-3.

  69.

    Benocci T, Aguilar-Pontes MV, Zhou M, Seiboth B, de Vries RP. Regulators of plant biomass degradation in ascomycetous fungi. Biotechnol Biofuels. 2017; 10(152). https://doi.org/10.1186/s13068-017-0841-x.

  70.

    Pakula TM, Nygren H, Barth D, Heinonen M, Castillo S, Penttilä M, Arvas M. Genome wide analysis of protein production load in Trichoderma reesei. Biotechnol Biofuels. 2016; 9(1). https://doi.org/10.1186/s13068-016-0547-5.

  71.

    Bischof R, Fourtis L, Limbeck A, Gamauf C, Seiboth B, Kubicek CP. Comparative analysis of the Trichoderma reesei transcriptome during growth on the cellulase inducing substrates wheat straw and lactose. Biotechnol Biofuels. 2013; 6(1):127. https://doi.org/10.1186/1754-6834-6-127.

  72.

    Li W-C, Huang C-H, Chen C-L, Chuang Y-C, Tung S-Y, Wang T-F. Trichoderma reesei complete genome sequence, repeat-induced point mutation, and partitioning of CAZyme gene clusters. Biotechnol Biofuels. 2017; 10(170). https://doi.org/10.1186/s13068-017-0825-x.

  73.

    Zhang J, Chen Y, Wu C, Liu P, Wang W, Wei D. The transcription factor ACE3 controls cellulase activities and lactose metabolism via two additional regulators in the fungus Trichoderma reesei. J Biol Chem. 2019; 294:18435–50. https://doi.org/10.1074/jbc.RA119.008497.

  74.

    Mach RL, Seiboth B, Myasnikov A, Gonzalez R, Strauss J, Harkki AM, Kubicek CP. The bgl1 gene of Trichoderma reesei QM 9414 encodes an extracellular, cellulose-inducible β-glucosidase involved in cellulase induction by sophorose. Mol Microbiol. 1995; 16(4):687–97. https://doi.org/10.1111/j.1365-2958.1995.tb02430.x.

  75.

    Häkkinen M, Arvas M, Oja M, Aro N, Penttilä M, Saloheimo M, Pakula TM. Re-annotation of the CAZy genes of Trichoderma reesei and transcription in the presence of lignocellulosic substrates. Microb Cell Fact. 2012; 11(1):134. https://doi.org/10.1186/1475-2859-11-134.

  76.

    Xu J, Zhao G, Kou Y, Zhang W, Zhou Q, Chen G, Liu W. Intracellular β-glucosidases CEL1a and CEL1b are essential for cellulase induction on lactose in Trichoderma reesei. Eukaryot Cell. 2014; 13(8):1001–13. https://doi.org/10.1128/EC.00100-14.

  77.

    Gautier V, Tong L, Nguyen T-S, Debuchy R, Silar P. PaPro1 and IDC4, two genes controlling stationary phase, sexual development and cell degeneration in Podospora anserina. J Fungi. 2018; 4(3):85. https://doi.org/10.3390/jof4030085.

  78.

    Antoniêto A, De Paula R, Castro L, Silva-Rocha R, Persinoti G, Silva R. Trichoderma reesei cre1-mediated carbon catabolite repression in response to sophorose through RNA sequencing analysis. Curr Genomics. 2015; 17:1–1. https://doi.org/10.2174/1389202917666151116212901.

  79.

    Yu J-H. Regulation of development in Aspergillus nidulans and Aspergillus fumigatus. Mycobiology. 2010; 38(4):229–37. https://doi.org/10.4489/MYCO.2010.38.4.229.

  80.

    Krijgsheld P, Bleichrodt R, van Veluw GJ, Wang F, Müller WH, Dijksterhuis J, Wösten HAB. Development in Aspergillus. Stud Mycol. 2013; 74:1–29. https://doi.org/10.3114/sim0006.

  81.

    Krijgsheld P, Nitsche BM, Post H, Levin AM, Müller WH, Heck AJR, Ram AFJ, Altelaar AFM, Wösten HAB. Deletion of flbA results in increased secretome complexity and reduced secretion heterogeneity in colonies of Aspergillus niger. J Proteome Res. 2013; 12(4):1808–19. https://doi.org/10.1021/pr301154w.

  82.

    Metz B, Seidl-Seiboth V, Haarmann T, Kopchinskiy A, Lorenz P, Seiboth B, Kubicek CP. Expression of biomass-degrading enzymes is a major event during conidium development in Trichoderma reesei. Eukaryot Cell. 2011; 10(11):1527–35. https://doi.org/10.1128/EC.05014-11.

  83.

    Jourdren L, Bernard M, Dillies M-A, Le Crom S. Eoulsan: a cloud computing-based framework facilitating high throughput sequencing analyses. Bioinformatics. 2012; 28(11):1542–3. https://doi.org/10.1093/bioinformatics/bts165.

  84.

    Anders S, Huber W. Differential expression analysis for sequence count data. Genome Biol. 2010; 11(10):106. https://doi.org/10.1186/gb-2010-11-10-r106.

  85.

    Benjamini Y, Hochberg Y. Controlling the false discovery rate: a practical and powerful approach to multiple testing. J R Stat Soc Ser B Stat Methodol. 1995; 57(1):289–300.

  86.

    Howe E, Holton K, Nair S, Schlauch D, Sinha R, Quackenbush J. MeV: MultiExperiment Viewer. In: Ochs FM, Casagrande TJ, Davuluri VR, editors. Biomedical Informatics for Cancer Research. Boston: Springer; 2010. p. 267–77.

  87.

    Steinhaus H. Sur la division des corps matériels en parties. Bull Acad Polon Sci. 1956; IV(12):801–4.

  88.

    Martinez D, Berka RM, Henrissat B, Saloheimo M, Arvas M, Baker SE, Chapman J, Chertkov O, Coutinho PM, Cullen D, Danchin EGJ, Grigoriev IV, Harris P, Jackson M, Kubicek CP, Han CS, Ho I, Larrondo LF, de Leon AL, Magnuson JK, Merino S, Misra M, Nelson B, Putnam N, Robbertse B, Salamov AA, Schmoll M, Terry A, Thayer N, Westerholm-Parvinen A, Schoch CL, Yao J, Barabote R, Nelson MA, Detter C, Bruce D, Kuske CR, Xie G, Richardson P, Rokhsar DS, Lucas SM, Rubin EM, Dunn-Coleman N, Ward M, Brettin TS. Genome sequencing and analysis of the biomass-degrading fungus Trichoderma reesei (syn. Hypocrea jecorina). Nat Biotechnol. 2008; 26(5):553–60.

  89.

    Kubicek CP, Steindorff AS, Chenthamara K, Manganiello G, Henrissat B, Zhang J, Cai F, Kopchinskiy AG, Kubicek EM, Kuo A, Baroncelli R, Sarrocco S, Ferreira Noronha E, Vannacci G, Shen Q, Grigoriev IV, Druzhinina IS. Evolution and comparative genomics of the most common Trichoderma species. BMC Genom. 2019; 20:485.

  90.

    Defrance M, Janky R, Sand O, van Helden J. Using RSAT oligo-analysis and dyad-analysis tools to discover regulatory signals in nucleic sequences. Nat Protoc. 2008; 3(10):1589–603.

Acknowledgements

We would like to thank Dimitri Ivanoff, Sabine Prigent and Thiziri Aouam for technical assistance.

Funding

The RNA-seq libraries and sequencing were supported by the France Génomique national infrastructure, funded as part of the “Investissements d’Avenir” program managed by the Agence Nationale de la Recherche (contract ANR-10-INBS-09).

Author information

Contributions

AP analyzed and interpreted the RNA-seq data through a series of bioinformatics analyses (differential expression, clustering, network inference, GO enrichment, promoter analysis) and participated in writing the article. LD contributed to the design of the experiment and reviewed the manuscript. FB coordinated the study, interpreted the data analyses and reviewed the manuscript. EJ and AM designed and supervised the study and drafted the manuscript. CB, CF and SP carried out the RNA-seq experiments and bioinformatics. All authors read and approved the manuscript.

Corresponding author

Correspondence to Aurélie Pirayre.

Ethics declarations

Ethics approval and consent to participate

Not applicable

Consent for publication

Not applicable

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Additional file 1

Study of the biomass concentration during the fed-batch. This PNG file contains experimental results regarding the study of the Rut-C30 biomass concentration at 0 h, 24 h, 48 h and 120 h during the fed-batch on G100, G75-L25, G90-L10 and L100.

Additional file 2

Circuit design. This PDF file contains an illustration of the methodology used to perform the differential analysis.

Additional file 3

List of mutated and/or differentially expressed genes. This Excel file contains two sheets. The first sheet lists the differentially expressed genes, with information on gene name, gene function, orthologs in various species (S. cerevisiae, A. nidulans and N. crassa), whether the gene is a transcription factor, expression ratios and the label of the cluster to which the gene belongs. The second sheet lists the genes mutated in Rut-C30 by comparison to QM6a, and identifies those that are differentially expressed in our conditions.

Additional file 4

Promoter analysis of clr2. This Excel file contains three sheets. The first sheet gathers the results of the clr2 promoter analysis, based on the sub-network SubN1 generated by BRANE Cut [40]. The second sheet displays the pattern feature map, and the third contains the statistical analysis of the discovered promoter sequence.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Pirayre, A., Duval, L., Blugeon, C. et al. Glucose-lactose mixture feeds in industry-like conditions: a gene regulatory network analysis on the hyperproducing Trichoderma reesei strain Rut-C30. BMC Genomics 21, 885 (2020). https://doi.org/10.1186/s12864-020-07281-8

Keywords

  • Trichoderma reesei Rut-C30
  • Carbon sources
  • Cellulases
  • Transcriptome
  • Fed-batch fermentation
  • Data science
  • Gene regulatory network