Differential expression in leaves of Saccharum genotypes contrasting in biomass production provides evidence of genes involved in carbon partitioning

Background: The development of biomass crops aims to meet industrial yield demands in order to optimize profitability and sustainability. Achieving these goals in an energy crop like sugarcane relies on breeding for sucrose accumulation, fiber content and stalk number. To expand the understanding of the biological pathways related to these traits, we evaluated gene expression in two groups of genotypes contrasting in biomass composition.

Results: First visible dewlap leaves were collected from 12 genotypes, six per group, to perform RNA-Seq. We found a high number of differentially expressed genes, showing how hybridization in a complex polyploid system caused extensive modifications in genome functioning. We found evidence that differences in transposition and defense-related genes may arise due to the complex nature of the polyploid Saccharum genomes. Genotypes within both biomass groups showed substantial variability in genes involved in photosynthesis. However, most genes coding for photosystem components or for phosphoenolpyruvate carboxylases (PEPCs) were upregulated in the high biomass group. Sucrose synthase (SuSy) coding genes were upregulated in the low biomass group, showing that this enzyme class can be involved in sucrose synthesis in leaves, similarly to sucrose phosphate synthase (SPS) and sucrose phosphate phosphatase (SPP). Genes in pathways related to the biosynthesis of cell wall components, as well as expansin-coding genes, showed low average expression levels and were mostly upregulated in the high biomass group.

Conclusions: Together, these results show differences in carbohydrate synthesis and carbon partitioning in the source tissue of distinct phenotypic groups. Our data from sugarcane leaves revealed how hybridization in a complex polyploid system resulted in noticeably different transcriptomic profiles between contrasting genotypes.


Mapping and quantification
Mapping rates of preprocessed reads obtained with Salmon [1] are shown in Table 1.

Table 2: Results of the differential expression analysis in the three proposed tests: (i) low biomass group compared to high biomass group; (ii) ANOVA-like test using genotypes within the high biomass group; (iii) ANOVA-like test using genotypes within the low biomass group.

We evaluated Gene Ontology enriched terms in each of these tests, in the following order:

Tests
• Low biomass genotypes compared to the high biomass genotypes (Figure 3)
• Differences within the high biomass group (main document)
• Differences within the low biomass group (main document)

Figure 4: Heatmap of differential expression for genes associated with transposition. Each column in the heatmap represents the gene fold changes for one contrast: (i) Low vs High is the comparison of the low biomass group to the high biomass group; (ii) US85-1008 vs other high is the comparison of US85-1008 to the wild high-fiber canes; (iii) White Transparent vs other low is the contrast of White Transparent against the low-fiber hybrids.
Using the DEGs common to the three contrasts, the enrichment analysis corroborates differences in stress response and transposition (Figure 5). To avoid the enrichment of these frequently recurring terms, we performed a functional enrichment analysis using the genes common to the high and low group contrasts, removing those also found in the fiber contrast. With this we verified that sugarcane genotypes, even within the same phenotypic group, differ in cell wall biogenesis (Figure 6).
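The gene-set filtering described above can be sketched with plain set operations. This is only an illustration: the gene identifiers are hypothetical, and reading "the fiber contrast" as the low versus high biomass comparison is an assumption, not something stated explicitly here.

```python
def shared_degs(*deg_sets):
    """Genes differentially expressed in every contrast given."""
    return set.intersection(*map(set, deg_sets))


def within_group_only(within_high, within_low, between_groups):
    """Genes shared by both within-group tests but absent from the
    between-group contrast (assumed here to be the 'fiber contrast')."""
    return (set(within_high) & set(within_low)) - set(between_groups)


# Hypothetical gene identifiers, for illustration only.
low_vs_high = {"g1", "g2", "g3", "g7"}
within_high = {"g2", "g3", "g4", "g5"}
within_low = {"g3", "g4", "g5", "g6"}

common_all = shared_degs(low_vs_high, within_high, within_low)
filtered = within_group_only(within_high, within_low, low_vs_high)

print(sorted(common_all))
print(sorted(filtered))
```

The first set would feed the enrichment summarized in Figure 5 and the second the one in Figure 6, under the assumption stated above.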

Co-expression enrichment
Our co-expression network was built with the genes that passed the expression filter. We obtained 16 modules, of which eleven showed enrichment of 289 Gene Ontology terms. Table 3 presents the Gene Ontology terms enriched in each co-expression module. To check the most common words, we created a word cloud from the words with a frequency greater than one in each enriched module (Figure 7).
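The word-frequency step behind the word cloud can be sketched as follows. The tokenization rule and the example term names are assumptions chosen for illustration; they are not the terms listed in Table 3.

```python
import re
from collections import Counter


def frequent_words(go_term_names, min_count=2):
    """Count words across the GO term names of one module and keep
    those occurring at least `min_count` times."""
    words = []
    for name in go_term_names:
        # Lowercase and split on anything that is not a letter.
        words.extend(re.findall(r"[a-z]+", name.lower()))
    counts = Counter(words)
    return {word: n for word, n in counts.items() if n >= min_count}


# Hypothetical enriched terms for a single module.
module_terms = [
    "cell wall biogenesis",
    "plant-type cell wall organization",
    "response to stress",
]

cloud_input = frequent_words(module_terms)
print(cloud_input)
```

The resulting dictionary of word counts is the kind of input a word cloud renderer typically expects.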
We used Gene Set Enrichment Analysis (GSEA) and permuted the genes of the modules 10,000 times in the ranked log fold change (LFC) lists of the following contrasts: (i) low against high biomass; (ii) US85-1008 compared to the mean of SES205A and IN84-58; (iii) White Transparent compared to the mean of RB72454 and SP80-3280. We found that module 16 was enriched with genes with high absolute LFC values in the three contrasts (Figure 9). Genes in module 16 were positively correlated with genes of high LFC in the biomass contrast and in the comparison of US85-1008 with the two S. spontaneum genotypes. Ranked genes from the contrast comparing the S. officinarum White Transparent to the hybrids were negatively correlated with genes within module 16.
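To make the permutation procedure concrete, the following is a minimal sketch of a GSEA-style running-sum enrichment score with gene-set permutation. It is not the software used for the analysis: the weighting exponent, the two-sided p-value and the toy LFC values are all assumptions made for illustration.

```python
import random


def enrichment_score(ranked_genes, scores, gene_set, p=1.0):
    """Signed maximum deviation from zero of the weighted running sum."""
    hits = [g in gene_set for g in ranked_genes]
    n_hits = sum(hits)
    n_miss = len(ranked_genes) - n_hits
    if n_hits == 0 or n_miss == 0:
        return 0.0
    weights = [abs(s) ** p if h else 0.0 for s, h in zip(scores, hits)]
    total = sum(weights)
    if total == 0:  # all member scores are zero: fall back to equal weights
        weights = [1.0 if h else 0.0 for h in hits]
        total = float(n_hits)
    running, best = 0.0, 0.0
    for w, h in zip(weights, hits):
        running += w / total if h else -1.0 / n_miss
        if abs(running) > abs(best):
            best = running
    return best


def permutation_p_value(lfc_by_gene, gene_set, n_perm=10_000, seed=0):
    """Observed score and a two-sided empirical p-value obtained by
    drawing random gene sets of the same size as the module."""
    ranked = sorted(lfc_by_gene, key=lfc_by_gene.get, reverse=True)
    scores = [lfc_by_gene[g] for g in ranked]
    members = set(gene_set) & set(ranked)
    observed = enrichment_score(ranked, scores, members)
    rng = random.Random(seed)
    extreme = 0
    for _ in range(n_perm):
        null_set = set(rng.sample(ranked, len(members)))
        if abs(enrichment_score(ranked, scores, null_set)) >= abs(observed):
            extreme += 1
    return observed, (extreme + 1) / (n_perm + 1)


# Toy data: 20 hypothetical genes with LFC from 2.0 down to -1.8.
lfc = {f"gene{i}": 2.0 - 0.2 * i for i in range(20)}
module = {"gene0", "gene1", "gene2", "gene3"}  # hypothetical module members

es, p_value = permutation_p_value(lfc, module)
print(f"ES = {es:.3f}, p = {p_value:.4f}")
```

A positive score means the module genes concentrate at the top of the ranked list (high LFC) and a negative score means they concentrate at the bottom, which is the sense in which the correlations with module 16 are described above.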
To visualize the expression profile of each module, we assessed the expression level of the eigengenes (Figure 10). We observed that at least five modules were marked by an expression peak or valley for a single genotype. Module 16 contains genes with higher expression in sucrose-rich genotypes, opposite to module 3. In both cases US85-1008 was in the high expression group. The profile of the eigengene of module 16 indicates higher expression in the low biomass group, but without substantial variability among the samples within the group. According to the GSEA, in this module the low-biomass genotypes did not contain genes with high LFC.
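For readers unfamiliar with eigengenes, the sketch below computes one under the usual definition, namely the first principal component across samples of the standardized module expression matrix. That definition is an assumption here, since this section does not spell it out, and the expression values are invented to mimic a module with a peak in a single sample.

```python
import math


def standardize(values):
    """Center and scale one gene's expression across samples (sample SD)."""
    mean = sum(values) / len(values)
    var = sum((x - mean) ** 2 for x in values) / (len(values) - 1)
    sd = math.sqrt(var)
    return [(x - mean) / sd if sd > 0 else 0.0 for x in values]


def normalize(vec):
    """Scale a vector to unit length; a zero vector is returned unchanged."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec] if norm > 0 else list(vec)


def module_eigengene(expr, n_iter=1000, tol=1e-12):
    """First principal component across samples of a genes-by-samples matrix."""
    z = [standardize(row) for row in expr]
    n_samples = len(z[0])
    mean_profile = [sum(row[j] for row in z) / len(z) for j in range(n_samples)]
    v = normalize(mean_profile)
    if not any(v):
        v = normalize(z[0])  # fallback start if the mean profile is flat
    for _ in range(n_iter):
        # Power iteration on Z^T Z without forming the matrix explicitly.
        loadings = [sum(a * b for a, b in zip(row, v)) for row in z]
        w = normalize([sum(l * row[j] for l, row in zip(loadings, z))
                       for j in range(n_samples)])
        converged = max(abs(a - b) for a, b in zip(w, v)) < tol
        v = w
        if converged:
            break
    # Orient the eigengene so that it follows the average expression profile.
    if sum(a * b for a, b in zip(v, mean_profile)) < 0:
        v = [-x for x in v]
    return v


# Hypothetical module: three genes measured in four samples,
# all with a marked peak in the third sample.
expr = [
    [1.0, 1.0, 5.0, 1.0],
    [2.0, 2.0, 9.0, 2.0],
    [0.0, 1.0, 6.0, 0.0],
]
eigengene = module_eigengene(expr)
print([round(x, 3) for x in eigengene])
```

A module whose eigengene has one value far from the others, as in this toy case, corresponds to the single-genotype peaks or valleys described above.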

Pathway analysis with Gene Ontology terms and MapMan4
We explored the pathways provided by MapMan4 to associate up- and downregulated DEGs with metabolic processes. We first used all the isoforms of a gene to map to the functional annotation BINs in Mercator4. Next, in MapMan we used the log fold change of the DEGs identified in each contrast evaluated. Here we present the results of the metabolism overview and lignin pathways.

Figure 11: Metabolism overview mapping using the log fold change of the DEGs from the low biomass genotypes compared to the high biomass group. Significantly upregulated genes are colored in red, and downregulated genes in blue.

Figure 20: Expression of DEGs coding for expansins. Each column in the heatmap represents the gene fold changes for one contrast: (i) Low vs High is the comparison of the low biomass group to the high biomass group; (ii) US85-1008 vs other high is the comparison of US85-1008 to the wild high-fiber canes; (iii) White Transparent vs other low is the contrast of White Transparent against the low-fiber hybrids. Differentially expressed genes of the contrast are indicated by asterisks.

Figure 21: Heatmap for MapMan sucrose metabolism: synthesis (A) and degradation (B). Each column in the heatmap represents the gene fold changes for one contrast: (i) Low vs High is the comparison of the low biomass group to the high biomass group; (ii) US85-1008 vs other high is the comparison of US85-1008 to the wild high-fiber canes; (iii) White Transparent vs other low is the contrast of White Transparent against the low-fiber hybrids. Differentially expressed genes of the contrast are indicated by asterisks.
Figure 22: Expression of DEGs coding for sucrose transport proteins (A) and sugar transporters (B). Each column in the heatmap represents the gene fold changes for one contrast: (i) Low vs High is the comparison of the low biomass group to the high biomass group; (ii) US85-1008 vs other high is the comparison of US85-1008 to the wild high-fiber canes; (iii) White Transparent vs other low is the contrast of White Transparent against the low-fiber hybrids. Differentially expressed genes of the contrast are indicated by asterisks.

Expression of investigated genes
In this subsection we present the log of counts per million (logCPM) in the three contrasts for genes investigated for their biological relevance. The expression of four genes encoding sucrose synthase (SuSy) is shown in Table 4.
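As a reminder of what logCPM measures, the sketch below converts the raw counts of one library to log2 counts per million. The prior count of 0.5 follows one common convention and is an assumption; the exact offset used for the values in Table 4 is not stated here, and the counts are invented.

```python
import math


def log_cpm(counts, prior=0.5):
    """log2 counts per million for one library, with a small prior
    count added so that zero counts remain finite."""
    library_size = sum(counts)
    return [
        math.log2((c + prior) / (library_size + 2 * prior) * 1e6)
        for c in counts
    ]


# Hypothetical raw counts for four genes in a single sample.
sample_counts = [0, 10, 90, 900]
values = log_cpm(sample_counts)
print([round(v, 2) for v in values])
```

Because the values are normalized by library size, they can be compared across samples sequenced at different depths, although not directly across genes of different lengths.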