Polysome profiling during hESCs cardiomyogenic differentiation
The hES-NKX2–5eGFP/w reporter human embryonic stem cell (hESC) line [27] was used to derive cardiomyocytes using a developmentally staged protocol [2, 28] that includes the induction of a cardiac mesoderm population on days 3 and 4 and a NKX2–5+/cTNT+ population by day 15 (Fig. 1a and b). Cardiomyogenesis progression was followed by flow cytometry using CD56 as a mesoderm marker [29] on day 4 (cut off < 40%) and NKX2–5/eGFP expression on day 9 as a cardiac progenitor marker (cut off < 50%) (Fig. 1b). Beating clusters were observed after 10 days of differentiation (Fig. 1c and Online Additional file 1: Video S1), yielding a population of 50–60% cTNT+ cardiomyocytes on day 15 (D15) (Fig. 1b). At day 20, cTNI immunostaining showed the striations characteristic of sarcomere structures (Fig. 1c).
To investigate the differential association of mRNAs with polysomes and, therefore, post-transcriptional changes in gene expression during cardiac differentiation, we first performed polysome profiling on days D0, D1, D4, D9 and D15, which represent pluripotency, embryoid body (EB) aggregation, cardiac mesoderm, cardiac progenitor and cardiomyocyte stages, respectively (Fig. 1a). After 10 min of cycloheximide treatment, active ribosomes got arrested with associated RNAs and we analyzed cellular extracts by ultracentrifugation in a sucrose gradient (Fig. 1d). Differential density throughout the gradient allowed the isolation of ribosome-free (fractions 1–3) and polysome-bound (fractions 9–22) RNAs. Based on the polysome profile, pooled ribosome-free and polysome-bound RNA fractions were sequenced using the Illumina platform, yielding nearly 30 million reads for each sample. Approximately 70–80% of the reads were mapped onto the reference genome (GRCh38), and more than 17,000 genes were detected in each type of fraction (Additional file 2: Figure S1). As a control, D15 cells were also treated with puromycin to disassemble the polysomes and cardiomyocyte markers were evaluated by qPCR (Additional file 2: Figure S2).
Correspondence analysis (COA) showed that samples were grouped according to the type of RNA fraction (ribosome-free vs. polysome-bound) (Fig. 1e) and according to the day of differentiation (D0, D1, D4, D9 and D15) (Fig. 1f and g). Polysome-bound samples showed more distinct groups relative to the day of differentiation, indicating high similarity between translated genes in experimental replicates (Fig. 1g). On the other hand, ribosome-free samples showed more dispersion and less similarity between the replicates.
The translatome delineates cardiomyogenic gene expression
During heart development, temporal gene expression changes occur to define each step of cardiogenic commitment [6, 30]. RPKM values (reads per kilobase per million mapped reads) for polysome-bound RNA fraction showed the expression levels of known lineage-specific genes throughout cardiac differentiation (Fig. 2a) which were confirmed by qPCR (Additional file 2: Figures S2 and S3). Pluripotency marker genes, including SOX2, POU5F1 (OCT4) and NANOG, were expressed at higher levels on D0 and D1 and down-regulated at following time-points. The mesoderm marker genes T and EOMES, and early cardiac gene MESP1 were highly specifically expressed on D4. Expression of cardiac-related genes such as GATA4, NKX2–5 and TBX5 as well as cardiomyocyte-specific sarcomeric genes TNNI, TNNT, MYL7 and MYH6 were observed from D9 to D15. In contrast, the endoderm (PECAM1 and PDX1) and ectoderm (PAX6 and FOXP2) marker genes did not change their polysome association during cardiomyogenic differentiation (Fig. 2a).
Comparisons between each differentiation time-point and the preceding time-point, considering an overall FDR of ≤0.05 and − 2 ≥ logFC ≥2, identified differentially expressed genes (DEGs) in polysome-bound RNA fractions (Fig. 2b, data available in Additional file 3). Aggregation of embryoid bodies during the first 24 h of differentiation induced differential expression of 288 genes. Mesoderm commitment from D1 to D4 showed 1264 DEGs, and cardiac progenitor progression to D9 showed 1582 DEGs. The final step of differentiation analysis on D15 showed 743 DEGs compared to D9. In general, the majority of DEGs were up-regulated, except for D15 compared to D9, where the number of down-regulated genes was slightly higher than up-regulated genes (Fig. 2b). Similar numbers of genes and patterns of up- and down-regulated genes were shown in the ribosome-free samples (Additional file 2: Figure S4A and Additional file 4). Regarding the protein-coding and non-coding genes in polysome-bound samples, approximately 20% of DEGs in each time-point analysis are annotated as non-coding RNAs, in which 39% correspond to “lincRNA”, 22% to “antisense” and 18% to “processed pseudogene” RNAs (Additional file 2: Figure S4B).
Gene Ontology (GO) analysis of polysome-bound up-regulated DEGs (FDR ≤ 0.05 and logFC ≥2) revealed developmental and cardiac-related “biological process” (BP) during cardiac differentiation (Fig. 2c). BP terms p-values (−log10) of DEGs for each time-point compared to the preceding one are represented in Fig. 2c heatmap. “Cardiac muscle tissue morphogenesis” and “regulation of muscle contraction” are highly enriched on D9 (compared to D4) and only slightly enriched on D15 (compared to D9), indicating that on D9, most cardiac characteristics are already committed. This pattern is similar for “muscle system process” and “muscle contraction”. “Extracellular matrix organization” seems to have important roles in two distinct phases: mesoderm commitment (D4) and cardiomyocyte final differentiation (D15) (Fig. 2c and Additional file 2: Figure S5).
To assess more information about variations in gene expression during cardiac differentiation, we performed gene clustering using logCPM (counts per million mapped reads). Distinct pattern expression groups were shown: genes with decreased expression during the differentiation, called pluripotency-related cluster and enriched in early developmental BP terms, such as “anterior/posterior axis specification” and “BMP signaling pathway”; and genes with increased expression during cardiac differentiation, called cardiac-related cluster and enriched in lineage specific commitment terms, such as “muscle tissue development” and “regulation of heart contraction” (Fig. 2d).
Strong gene expression coordination is observed during mesoderm-to-cardiac progenitor commitment
When considering polysome-bound RNAs, the largest gene expression variation showing 1582 DEGs occurred during the D4 to D9 shift, which represents mesoderm-to-cardiac progenitor commitment (Fig. 2b). GO analysis revealed that some of D4 up-regulated BP terms were also enriched in the D9 down-regulated analysis (Fig. 3a and Additional file 2: Figure S5), such as “pattern specification process” and “embryonic morphogenesis”. These findings suggest a crucial gene expression regulation at this stage. Comparisons between D4 up-regulated and D9 down-regulated genes showed 217 in common, which are related to “mesoderm development” and “embryonic pattern specification” BP terms (Fig. 3b).
Moreover, an expression gene clustering pattern showed a distinct group of genes highly and specifically expressed on D4 (Fig. 3c). Those genes are called mesoderm-related genes and are enriched on developmental processes, such as “pattern specification process”, “regionalization” and “somitogenesis”. Altogether, these results indicate that the developmental progress of mesoderm-to-cardiac progenitor is carefully regulated and can be assessed by polysome-bound RNA analysis.
Cardiac commitment is intensely tuned by differential mRNA association with polysomes
Gene expression fluctuations could be a consequence of coordination or lack thereof between transcription and translation rate changes. To assess the post-transcriptional regulation during the cardiomyogenic differentiation, we performed the two step-analysis: (1) DEGs were identified through the comparison between each differentiation time-point and the preceding time-point, considering an overall FDR ≤ 0.05, − 2 ≥ logFC ≥2 and RPKM > 1 on ribosome-free or polysome-bound samples (Additional files 3 and 4, respectively); and (2) ribosome-free DEGs and polysome-bound DEGs were compared and classified according to the following categories (Additional file 5). Genes that were up- or down-regulated in both fractions were labeled “up-coordinated” or “down-coordinated”. Transcripts that were up-regulated in ribosome-free fraction but might be neutralized by post-transcriptional mechanisms were labeled “up-buffered”, or in the opposite case, as “down-buffered”. Moreover, genes showing that their polysome association was increased or decreased, were labeled “up-loaded” or “down-loaded”, respectively (Fig. 4a).
We used the coordinated, buffered and loaded classification of DEGs and included one more label category considering the final gene expression as coordinated, post-transcriptional positive or negative regulation (Fig. 4a). Therefore, when compared to the preceding time-point, genes up- or down-coordinated were labeled being under coordinated regulation, genes up-buffered and down-loaded under post-transcriptional negative regulation and genes down-buffered and up-loaded under post-transcriptional positive regulation. Approximately 60–80% of DEGs showed one or another kind of post-transcriptional regulation, positive or negative, suggesting a crucial role of this level of gene expression control (Fig. 4b). Interestingly, during the initial steps of differentiation (D0-D1 and D1-D4), there was a prevalence of up- (10/19 and 142/271) or down-loaded (60/138 and 190/351) genes, suggesting a strong post-transcriptional regulation at these stages. During cardiac progenitor commitment on D9, numbers of coordinated, buffered and loaded genes were similar, either on positive or negative regulation. On the other hand, between D9 and D15 most genes were classified as buffered, in particular, on positive regulation (234/370), indicating that transcriptional variations might be controlled by post-transcriptional mechanisms. These results corroborate our previous findings showing that most of cardiac characteristics are already committed on D9 (Fig. 2).
Considering that Gene Ontology-annotated genes are usually protein-coding, we filtered our data and performed GO analysis using only protein-coding genes. Analysis of coordinately regulated genes showed well-established pathways during hESC differentiation, such as the Reactome pathway terms “POU5F1 (OCT4), SOX2, NANOG repress genes related to differentiation”, “Transcriptional regulation of pluripotent stem cells” and “Developmental Biology”, which were upregulated on D1 and D4 and down-regulated later (D9 and D15) (Additional file 2: Figure S6). Additionally, the cardiac-related pathway “Muscle contraction” and cardiac-specific pathway “Cardiac conduction” were coordinately up-regulated at D9 and D15 time-points (Additional file 2: Figure S6).
Genes classified as buffered or loaded showed enriched pathway terms with a diversity of biological processes (Fig. 5). The Reactome pathways “Developmental Biology” and “NCAM signaling for neurite out-growth” were up-loaded on D4 and down-loaded on D9, once more suggesting the critical regulation at this stage. For instance, developmental-related genes, such as NOTUM, CER1 and SOX17, appeared as up-loaded on D4 and down-loaded on D9 (Additional file 5 and Additional file 2: Figure S7A), indicating the polysomal loading regulation. The “M phase” and “Mitotic Metaphase and Anaphase” terms were shown as down-loaded on D15, while “Cyclin A/B1 associated events during G2/M transition” as down-buffered, indicating the fine adjustment of cell cycle during differentiation. The E2F Transcription Factor 1 plays a crucial role in the control of cell cycle [31] and was shown as involved in myoblast proliferation and differentiation through the auto-regulation loop with miR-20a-5p and miR-20b-5p [32]. E2F1 gene appeared less associated to polysomes on D9 (FDR < 0.05, logFC − 1.73, not included on down-loaded group), and down-buffered on D15, illustrating an initial polysomal dissociation followed by transcriptional down regulation of this gene (Additional files 3 and 4).
We have previously shown that “Extracellular matrix organization” seems to have an important role during differentiation (Fig. 2c), and it is probably also being post-transcriptionally regulated once it showed as enriched for up-loaded genes on D4, D9 and D15 (Fig. 5). Between the D15 up-loaded genes are LUM, COL6A3 and COL3A1 (Additional file 5). COL3A1 was already shown as post-transcriptionally regulated by the interaction of the heterogeneous nuclear ribonucleoprotein (hnRNP) A1 mRNA-binding protein with its 3′-UTR [33]. Interestingly, the cardiac-specific NKX2–5 gene was shown as up-buffered on D4, suggesting that its RNA was transcribed but not translated yet. On D9, this gene was shown as up-coordinated, suggesting, in this case, its transcription and translation (Additional file 2 and Additional file 5: Figure S7B). In addition, other crucial cardiac transcription factors, such as MEF2A and TBX5 appeared as up-loaded on D9, indicating their polysome-bounding increase on this stage (Additional file 5 and Additional file 2: Figure S7C).
To further characterize how recruitment to and dissociation from ribosomes influence gene expression, we performed polysome/ribosome-free ratio analysis using RPKM values. The results showed genes affected by polysome recruitment (FDR ≤ 0.05, logFC ≥2) or dissociation (FDR ≤ 0.05, − 2 ≥ logFC) during cardiomyocyte differentiation (Additional file 6). Some of them are also DEGs in polysome-bound or ribosome-free fractions, but most are not differentially expressed (Fig. 6a and Additional file 2: Figure S8). GO analysis for these non-differentially expressed and ratio-variated genes showed a variety of BP terms (Fig. 6b and Additional file 2: Figure S8). For instance, the development-related pathways JUN, Wnt and Notch were strongly regulated between D1, D4 and D9, were recruited from D1 to D4 and dissociated from D4 to D9 (Fig. 6b). The polysome/ribosome-free ratio of the JUN, Wnt and Notch pathway genes MTCH1, GALNT11, NCLN and TMEM237 were plotted to visualize the variations on D1, D4 and D9 as an example (Fig. 6c).
mRNA loading into polysomes fine-tunes crucial processes during hESC cardiomyogenesis
To better understand the changes in translation between hESC and cardiomyocytes, we performed Gene Ontology (GO) analysis with DEGs of D0 (hESC) vs. D15 (cardiomyocytes), considering FDR ≤ 0.05, − 1 ≥ logFC ≥1 and ribosome-free and polysome-bound data combined (Additional file 7). Genes down-regulated on D15 were enriched in Biological Process (BP) terms such as “rRNA processing”, “tRNA aminoacylation for protein translation” and “cytoplasmic translation”, which were grouped as “RNA-related terms” (Fig. 7a). Combining the genes annotated in this group (143 genes down-regulated on D15) and analyzing their post-transcriptional regulation classification, 44 (30.7%) of them were down-coordinated and 72 (50.3%) were down-loaded (Fig. 7b and Additional file 7). In addition, many ribosomal proteins were shown down-regulated on D15, predominantly down-loaded (90.9%) (Fig. 7c). Other translation machinery proteins were also down-regulated after cardiomyocyte commitment, for instance, the initiation factors EIF5AL1 and EIF4E1B, and elongation factor EEF1E1 (Additional file 7). On the other hand, the cardiac elongation factor EEF1A2 [34] was up-regulated on D15 (Additional file 7).
Moreover, to explore if non-differentially expressed genes could have been affected by the variation in polysome occupancy (recruitment vs. dissociation), we also compared the polysome/ribosome-free RPKM ratio between hESCs (D0) and cardiomyocytes (D15) (Additional file 6). Interestingly, genes related to translation processes also showed polysomal dissociation on D15 when compared to D0, illustrated by the decreased polysome/ribosome-free ratio (Fig. 7d and e). Among them, there are translation initiation factors (EIF4A3, EIF4E, EIF4B), ribosomal proteins (RPL6, RPL14) and RNA helicases (DDX52), whose ratios are plotted in Fig. 7f (Additional file 6). Some of these observations were confirmed by qPCR (Additional file 2: Figure S9).
To further confirm that the down-regulation of translation-related genes after cardiomyogenic differentiation could affect protein synthesis, we performed a protein synthesis quantification assay. Cells on D0 (hESC) and D15 (cardiomyocytes) were treated with O-propargyl-puromycin (OPP) which is incorporated into newly translated proteins and then fluorescently labeled. Quantification of fluorescence intensity showed a decrease in protein synthesis after cardiac commitment compared to undifferentiated cells (Fig. 7g and h). Taken together, these findings suggest a translation adjustment during hESC-to-cardiomyocyte differentiation.
Interestingly, other crucial cellular processes also showed regulation by post-transcriptional mechanisms. For instance, on D1 down-buffered and D15 up-buffered GO analysis, the Reactome pathway terms related to cellular metabolism were enriched, suggesting an important post-transcriptional regulation of this process during cardiomyogenic differentiation. Metabolic properties differ between cardiomyocytes and hESCs [35], thus, to understand the regulation of metabolic genes, we grouped the genes annotated as the Reactome pathway terms related to cellular metabolism and called them “Metabolism-related genes” (Additional file 2: Figure S10A and Additional file 7). Comparing the ribosome-free and polysome-bound data, 494 genes were up-regulated on D15, among which 243 (49.2%) were up-loaded, 46 (9.3%) were up-buffered and 205 (41.5%) were up coordinated (Additional file 2: Figure S10B). This likely reflects the post-transcriptional contribution to cardiomyocyte metabolic remodeling demonstrated by metabolic gene recruitment to polysomes.