Systems based analysis of human embryos and gene networks involved in cell lineage allocation
BMC Genomics volume 20, Article number: 171 (2019)
Little is understood of the molecular mechanisms involved in the earliest cell fate decision in human development, leading to the establishment of the trophectoderm (TE) and inner cell mass (ICM) stem cell population. Notably, there is a lack of understanding of how transcriptional networks arise during reorganisation of the embryonic genome post-fertilisation.
We identified a hierarchical structure of preimplantation gene network modules around the time of embryonic genome activation (EGA). Using network models along with eukaryotic initiation factor (EIF) and epigenetic-associated gene expression we defined two sets of blastomeres that exhibited diverging tendencies towards ICM or TE. Analysis of the developmental networks demonstrated stage specific EIF expression and revealed that histone modifications may be an important epigenetic regulatory mechanism in preimplantation human embryos. Comparison to published RNAseq data confirmed that during EGA the individual 8-cell blastomeres are transcriptionally primed for the first lineage decision in development towards ICM or TE.
Using multiple systems biology approaches to compare developmental stages in the early human embryo with single cell transcript data from blastomeres, we have shown that blastomeres considered to be totipotent are not transcriptionally equivalent. Furthermore we have linked the developmental interactome to individual blastomeres and to later cell lineage. This has clinical implications for understanding the impact of fertility treatments and developmental programming of long term health.
The preimplantation period is a unique window in development, when extensive remodelling of the genome and epigenome occur over the few days that the embryo is a free-living entity. This window is of critical importance allowing the fertilised oocyte to (i) correctly reprogram the male genome and activate the new embryonic genome, (ii) substantially reset the epigenome via a process that includes generalised demethylation and remethylation, and (iii) establish the first cell fate decision in development towards formation of the trophectoderm (TE), while maintaining the inner cell mass (ICM) stem cell population [1,2,3].
The processes underlying some of the key molecular and cellular decisions are not fully understood, including Embryonic Genome Activation (EGA), mechanisms underlying epigenome remodelling, and cell fate decisions governing blastomere development and differentiation into the ICM or trophectoderm (TE) tissues of the blastocyst. Early embryos inherit stored mRNA and proteins from the oocyte which guide development under maternal control until EGA. The degradation of particular inherited mRNAs may itself regulate the timing of EGA . The start of EGA in the human was originally defined as occurring between the 4-cell and 8-cell stage . However more recent studies have suggested that there are in fact three waves of EGA, occurring at the 2 cell, 4-cell and 8–10 cell stages, with the final representing the highest level of transcriptional activity . Epigenetic remodelling in the human embryo occurs post fertilisation, with embryonic DNA generally demethylated to 30–40% by the 2-cell stage, with minimal subsequent loss of methylation, and no re-methylation until the post-implantation stages . During this period of intensive remodelling, cell fate decisions are initiated which lead to segregation of human ICM and TE. It is currently unknown as to whether individual blastomeres from 8-cell embryos are pre-determined to become ICM or TE, but it is known that considerable heterogeneity exists between blastomeres at this point, due in part to the absence of gap junctional communication prior to this stage [8, 9]. The reported variation in different transcript subsets between individual human 8-cell blastomeres has been interpreted as indicating that some may arrest whilst others develop . In particular, individual 8-cell blastomeres were reported to display significant variation in down-regulation of maternally-expressed genes, which may indicate early lineage specification. In contrast, other studies have shown that blastomeres from 5- to 8-cell human embryos are not pre-determined to become either ICM or TE . We hypothesised that an analysis of the global human embryo transcriptome using a combination of network analyses, dimensional scaling and change point analysis would allow us to identify the transcriptional co-ordination involved in the key early developmental event of blastomere cell fate specification.
Oocytes and embryos
Human oocytes and embryos were donated to research with fully informed patient consent in writing and approval from South Manchester Research Ethics Committee under Human Fertility and Embryology Authority research licence R0026. Fresh oocytes and embryos surplus to the clinical IVF treatment programme were obtained from Saint Mary’s Hospital Manchester, graded and prepared as described in Shaw et al 2013  (Fig. 1a).
Embryo sample preparation and microarray analysis of transcriptome
Upon donation to research, embryos were cultured in G1 medium (Vitrolife, Goteborg, Sweden) covered with a thin layer of mineral oil (Ovoil, Origio, Malov, Denmark) until day 3, followed by G2 medium (Vitrolife) from day 3 to blastocyst at 37 °C, 6% CO2 in a humidified atmosphere. Oocytes and embryos at a range of developmental stages, including four oocytes, four 4-cell embryos, three 8-cell embryos, 32 individual blastomeres obtained from four eight cell embryos (blastomeres) and four blastocysts were lysed and reverse transcribed as previously described [8, 13] and cDNA was prepared by polyA-PCR amplification . Embryos were only used if they both reached the appropriate development stage for time in culture and were morphologically normal. Of the 32 8-cell blastomeres, 8 disaggregated from a single 8-cell stage embryo were used to generate single cell transcriptomic data, the remaining 24 disaggregated from 3 complete sets of 8 individual blastomeres were used for qPCR validation. 8-cell stage embryos used for blastomere isolation were equivalent in stage and appearance to those used for single embryo analysis and, to further reduce bias, we analysed only blastomere sets in which we recovered all 8 blastomeres from an 8-cell embryo. For blastomere disaggregation, whole embryos were exposed to Acid Tyrode’s Solution (Sigma) for ~ 3 s intervals until zona breakdown occurred and individual blastomeres were isolated via mechanical disaggregation. The polyA-PCR technique amplifies all polyadenylated RNA in a given sample, preserving the relative abundance in the original sample [15, 16]. A second round of amplification using EpiAmp (Epistem, Manchester, UK) and biotin-16-dUTP labelling using EpiLabel (Epistem) was performed at the Paterson Cancer Research Institute Microarray facility, as previously described . For each sample, our minimum inclusion criterion was the expression of β-actin evaluated by gene specific PCR. Labelled PolyA-cDNA was hybridised to the GeneChip® Human Genome U133 Plus 2.0 Array (Affymetrix, SantaClara, CA, USA) and data initially visualised using MIAMIVICE software . The statistical and graphical R computing language was used together with Bioconductor packages [17, 18], to assess quality control of microarray data (Array Quality Metrics package)  (Additional file 1: Supplementary Methods and Additional file 2: Figure S1) and normalisation undertaken using the Mas5 algorithm . Principal component analysis (PCA) was conducted using Partek Genomics Suite software (St. Louis, MO, USA), with IsoMap cross-validation undertaken using Qlucore Omics Explorer 2.3 (Qlucore, Lund, Sweden).
Differential gene expression was determined using ANOVA (Fig. 1b). Benjamini-Hochberg false discovery rate (FDR) corrected p-values ≤0.05 were considered differentially expressed . Genes were defined as expressed at a particular developmental stage if present in at least 75% (or two of three) samples (Additional file 3: Table S1). Normalised eukaryotic initiation factor (EIF) family member expression levels were extracted from the arrays for relative expression across development.
Gene specific PCR amplification of a panel of pluripotency, polarity and hippo-signalling genes was assessed by qPCR using Power SYBR Green Master Mix (Applied Biosystems) according to the manufacturer’s instructions. Primers were designed to produce a product between 50 and 200 base pairs within the first 500 base pairs of the gene to prevent amplification bias. Each 10 μl reaction contained 2 ng polyA-cDNA template and the cycling was performed in an ABI 7300 real-time PCR system. A 10 min denaturation step at 95 °C was followed by 39 PCR cycles comprising 30s denaturation at 95 °C, 30s annealing at 60 °C and 35 s extension at 72 °C. An extended annealing step of 10 min at 72 °C finalised the run. Identification of a single peak at melt curve analysis signified gene expression. Reactions were performed in triplicate for each individual sample. ΔCt was calculated as 40-Ct value for data presentation purposes. Some samples showed low levels of gene expression however genes were not classed as ‘expressed’ above background level unless detected with 37 or fewer cycles of real-time PCR. We only included a sample in the analysis data set if greater than 3 target genes were defined as expressed. Associations between expressed genes were analysed for significance using a chi squared test. Genes displaying p ≤ 0.05 were defined as differentially regulated.
Networks of genes associated with embryonic development
We generated human protein-protein interaction (PPI) networks based on differentially expressed genes in Cytoscape  by inference of protein: protein interactions using the BioGrid database  with the BioGrid Plugin for Cytoscape . This network generation approach uses a process of inference where a minimum number of connecting proteins are added between proteins with differential gene expression to form a fully connected PPI network model. As a result we capture a section of the human interactome that is associated with transcriptomic differences. Networks enriched for differentially expressed genes were constructed for each developmental stage. The Cytoscape plugin Moduland  was used to identify overlapping modules of protein:protein interactions within each developmental network and modular hierarchy was determined using the centrality score (a quantitative measurement of proximity to the centre of the network – the more central a protein is the more it can influence the effect of all other proteins in the network).
Epigenetic regulator genes were compiled from the Qiagen Epigenetic Chromatin Remodelling Factors PCR array (Qiagen) and the Epigenetic Modification Enzymes PCR array (Qiagen), supplemented with genes identified by the network analysis with a known epigenetic role. A hypergeometric distribution was applied to assess enrichment of individual 8-cell blastomeres for chromatin modification enzymes/remodelling factors. Causal Networks (Z-scores >І2І) were used to assess regulatory elements within the networks  (Ingenuity Pathway Analysis [IPA], Qiagen).
Gene expression barcoding of individual 8-cell blastomeres
Frozen robust multi-array analysis (fRMA)  was used to define absolute expression by comparison to publically available microarray datasets within R and an expression barcode was defined for each specific tissue and cell type [28, 29]. This GeneBarcode was used to generate single sample networks for the individual blastomeres. Using GeneBarcode, we considered transcripts expressed in 2–5 blastomeres, resulting in 5044 genes for downstream analysis.
Change point analysis of developmental gene expression
Our embryonic development transcriptomic data (Oocyte, 4-cell, 8-cell and Blastocyst) were filtered by using PCA analysis (Qlucore Omics Explorer 2.3), and 1676 probes were selected with a 1.2 cut-off for the projection score . Then change point analysis was performed in order to identify times in embryonic development when there was a distinct shift in the distribution of gene expression, using the “changepoint” package  within R . The transcript expression levels across the stages between two change points were replaced by the average value within this interval and the transcripts were then clustered using hierarchical clustering.
We analysed previously published single-cell RNAseq data from 81 individual 8-cell human blastomeres . Transcripts per million (TPM) expression values were visualised in Qlucore Omics Explorer 2.3 and outliers were removed. These were normalised for inter-individual variation (mean centred by the embryo donor couple) and by application of a projection score  of 0.21 to remove noise (n = 59). Gene sets identified were analysed for Gene Ontology and Canonical pathway enrichment using DAVID  and mapped onto our differentially regulated 8-cell blastomere network modules.
Hierarchy of network modules across embryo development
We carried out single embryo and blastomere polyA-PCR, global transcriptome profiling on replicate oocytes (n = 4) and embryos at the 4-cell (n = 4), 8-cell (n = 3) and blastocyst (n = 4) stage, together with 8 individual blastomeres disagregated from one 8-cell embryo (Figs. 1, 2a). PCA analysis revealed that the intact 8-cell embryos and the individual 8-cell blastomeres demonstrated overlapping transcriptomes (Fig. 2a), indicating that the embryo disaggregation procedure for isolating blastomeres had not significantly altered the blastomeres’ expression profiles.
Over the developmental series of embryonic stages we identified 7 key transitions in differentially expressed gene probe-sets using change point analysis (Fig. 2b). Genes expressed in the oocyte and 4-cell embryo but not subsequently (Changepoint group 7; CP7) were considered to be maternally expressed only, those expressed in the oocyte and subsequently at 8-cell (CP6), or oocyte/4-cell and subsequently at blastocyst (CP1), were considered to be expressed both maternally and embryonically following EGA, while those not expressed in the oocyte but only at subsequent stages were assumed to reflect EGA at early (CP3), mid (CP4 and CP5) and late (CP2) developmental stages. The gene ontology associated with each change point group was determined (Additional file 3: Table S2).
Gene networks of the transition for each embryonic stage were constructed by inference to the human interactome network (BioGRID version 3.2.110) (Additional file 2: Figure S2A and B) and a hierarchy of gene modules were defined using the topological property of centrality (Additional file 2: Figures S2C and D). Overlapping network modules were identified and centrally connected genes were used to assess the crosstalk between modules (Additional file 2: Figure S2E and F). We tracked centrally connected genes across our human embryo samples, highlighting features which are maintained throughout development or unique to a certain embryonic stage (Fig. 2c).
Modules expressed in embryos post-EGA, or conserved in both the 8-cell and blastocyst, were deemed to reflect transcription following early and/or later EGA in preparation for implantation and continuing development (Fig. 2a). Definition of network modules in the 8-cell embryo and blastocyst (Additional file 2: Figure S2A-D) were used as a basis to identify upstream regulatory elements (Additional file 2: Figure S2E and F). This approach revealed relationships between TRIM28/KAP1, MDM2, HDAC2 and TP53 and the networks of genes they regulate to be of central importance during 8-cell to blastocyst development (Additional file 1: Supplementary results, Additional file 3: Table S3, Additional file 2: Figure S3). Network models defined using inference to the human interactome were confirmed and the robustness of the identified gene modules were assessed by comparison to co-expression networks (Additional file 2: Figure S4).
Mapping TE or ICM gene expression signatures to individual 8-cell blastomeres
We defined the transcriptional signatures underlying maintenance of pluripotency and cell lineage separation in the human embryo by comparing the transcriptomes of individual 8-cell blastomeres (B1-B8, Fig. 3). From these data we constructed interaction networks for each blastomere to reveal a hierarchy of modules of highly connected genes (Fig. 3a). We identified upstream regulatory patterns for each blastomere and overlaid the upstream regulatory genes onto blastomere network modules (Additional file 2: Figure S5); the high level of overlap (p-value range 0.01–1.2 × 10− 23) provided confidence in our gene identifications. Module similarity between the blastomeres was visualised via connectivity in circos plots based on the degree of module overlap (shared ≥5 genes) (Fig. 3b). B4 and B8 were less similar to other blastomeres, whilst B1 shared the greatest amount of similarity to other blastomeres, in particular B3, B5 and B7 (Fig. 3c).
To assess if the differences between the blastomeres represented transcriptional priming towards either ICM or TE lineage, we repeated the same analysis with the inclusion of TE (n = 5) and ICM (n = 5) samples (from Stevens et al 2018 ) (Fig. 3d). Based on the degree of module overlap (shared ≥5 genes), 5 blastomeres are similar to the ICM, B1, B5 and B6 showed statistically significant overlap (p-value range 8.6 × 10− 22 to 2.4 × 10− 82). To verify our circos plots, we visualised the blastomere, TE and ICM global expression profiles using IsoMap dimensional scaling (Fig. 3e). These analyses revealed similarities in global gene expression between blastomere B3, B4, B7, B1, B6 and B5 with ICM, and blastomere B8 and B2 with TE, providing independent validation of the overlaps shown in the network derived circos plot and indicating that the TE expression profile was similar to B2, B5 and B8. B2 was therefore identified as TE-like in all three independent analyses, suggesting that blastomere cell fate may be primed at this stage.
8-cell blastomeres are not transcriptionally equivalent
In order to explore blastomere cell fate and potential mechanisms in more detail we carried out single embryo and blastomere polyA-PCR and qPCR on a further 24 individual blastomeres disaggregated from 3 complete sets of 8 individual blastomeres. We screened for the expression of genes associated with the Hippo signalling pathway which is involved in specifying TE cell fate  (LATS2, YAP, TEAD4, TAZ, CDX2), cell polarity  (PARD3, PARD6A, EZRIN, PRICKLE2 and CDX2) and cell pluripotency  (SOX2 and OCT4) (Additional file 2: Figure S6). No embryo was devoid of LATS2, with expression detected within at least 2 of the 8-cell blastomeres. The blastomeres lacking LATS2 but expressing TAZ/YAP, TEAD4 and CDX2 may have greater potential to give rise to future TE. EZRIN was the only polarity gene expressed in the majority of 8-cell blastomeres; however levels of expression varied greatly between individual cells. We observed no clustering of gene expression by embryo and the differences in expression of genes involved in hippo signalling, polarity and pluripotency pathways between the individual blastomeres verified the finding from whole transcriptome data that 8-cell blastomeres were not transcriptionally equivalent.
Expression of eukaryotic initiation factors (EIFs) at the time of EGA
Expression and activity of EIFs is critical to successful EGA . Whole transcriptome gene expression of the EIF family was significantly upregulated in the 8-cell and blastocyst (Fig. 4a) and this expression pattern closely followed the general wave of transcripts initiated during EGA . Altogether, 45 EIFs were expressed during pre-implantation development (Fig. 4a). Nineteen EIFs were differentially regulated between the 8-cell embryo and blastocyst; EIF2S2, EIF3I, and EIF43M are up-regulated at the 8-cell, EIF4E, EIF4E2 and EIF4G2 are up-regulated in the blastocyst (with EIF4E, EIFE2 and EIFG3 regulated by the TRIM28 network, Additional file 2: Figure S3A), and EIF4A3 was upregulated in the 8-cell embryo compared to both the 4-cell and blastocyst stage embryo (all FDR modified p-value ≤0.05) (Fig. 4b).
The role of EGA in blastomere identity was assessed by examining that transcriptome of individual 8-cell blastomeres to reveal if they displayed varying levels of EIFs. Half of the EIFs differentially regulated in the 8-cell embryo and blastocyst, were present in all individual 8-cell blastomeres, including the 8-cell-specific EIF4A3. However the remaining EIFs displayed varying expression levels, with B1 expressing all (10/10) of these EIFs and B2 (3/10) and B8 (5/10) expressing the least number of EIF family members (Fig. 4c).
Enriched epigenetic signature at the 8-cell stage
As epigenetic modifier genes such as TRIM28 were differentially expressed during preimplantation development (Additional file 2: Figure S3A), we constructed networks of chromatin modifying enzymes/remodelling factors (Additional file 3: Table S4). More Epigenetic regulatory genes were expressed in the 8-cell embryo (102 genes) compared to the blastocyst (40 genes). Only two genes, HDAC2 and YY1, were upregulated at both stages compared to the 4-cell embryo. Both of these genes were identified as key 8-cell and blastocyst network members (Additional file 2: Figure S2E and F). HDAC2 is a downstream target of the blastocyst TRIM28 network (Additional file 2: Figure S3A), whilst YY1 is a centrally connected gene (Additional file 2: Figure S2C and D) in the 8-cell and blastocyst embryo. Overall, the larger subset of histone acetyltransferases, methyltransferases and deacetylases identified in the 8-cell embryo, indicated that these genes play a part in epigenetic remodelling at this stage.
Due to the upregulated epigenetic-associated gene expression at the 8-cell stage, we assessed the expression of epigenetic regulatory genes within the individual 8-cell blastomeres (Fig. 5). Individual 8-cell blastomeres were significantly enriched (P < 3.4 × 10− 20) for chromatin modification enzymes and remodelling factors. TRIM28 network genes, HDAC2 and ZSCAN4 were expressed in all blastomeres. However global epigenetic gene expression patterns revealed two groups of individual 8-cell blastomeres; B3/B4/B6, and B1/B2/B5/B7/B8.
Comparison to published single blastomere RNAseq data
To validate and extend our findings of blastomere transcriptional heterogeneity, we analysed single-cell RNAseq data from 81 individual 8-cell human blastomeres . After outlier removal, 59/81 published blastomere datasets of high quality (embryos in which ≥4 of the 8 blastomeres were recovered) were used in further analysis. A PCA of the remaining blastomeres highlighted that the greatest variation in gene expression existed between the individual embryos rather than the individual blastomeres (Fig. 6a). Once samples were normalised for inter-embryo variation we were able to detect differences between individual blastomeres regardless of their embryo origin (Fig. 6b). We identified the presence of 4 sets of genes; of which group 3 was enriched (p = 0.0012) for Pluripotency of Embryonic Stem Cells (Additional file 1: Supplementary Results, Additional file 2: Figure S7 and Additional file 3: Table S5).
Although there is an increasing body of information about preimplantation human development, relatively little is understood of molecular mechanisms and in particular there is little quantitative information on regulatory gene networks governing the first cell fate decisions in development. Our analysis extends understanding in this area significantly by identifying gene networks involved in EGA and blastomere transcriptional identity and cell fate (Fig. 7).
Gene networks involved in development during EGA
In humans the major wave of EGA occurs between the 4-cell and 8-cell stage [5, 6], although the mechanisms by which this activation is translated into downstream regulation of embryonic development are unclear. We identified 7 distinct patterns of expression in the human embryonic transcriptome corresponding to different early or late patterns around the time of EGA. Eukaryotic initiation factors (EIFs) are closely associated with transcription activation in mammalian development . We report here a pattern of EIF expression which closely follows EGA, and may be responsible for translating its impact. EIF expression peaked in the 8-cell embryo, which agrees with data of Vasenna et al, who identified three waves of transcriptional activation in humans; the last wave occurring at the 6 to 8-cell stage being responsible for the transcription of the largest number of genes . EIF2S2, EIF3I, and EIF4M are up-regulated in the 8-cell embryo, with only EIF2S2 having been previously detected in hESCs and mESCs . EIF expression and activity seems likely to be one of the main mechanisms driving expression of the embryonic developmental programme post-EGA.
Using multiple methods of analysis provides a high confidence framework for understanding the relationship between gene networks at different stages of human preimplantation development. This revealed that TRIM28/KAP1, MDM2, HDAC2 and TP53 and the networks of genes they regulate are of central importance at the 8-cell to blastocyst transition. TRIM28 recruits chromatin modification enzymes and remodelling factors to form a repressive chromatin complex and its expression is essential in mice . We report that TRIM28 controls a central network of 46 genes, regulating MYC, TP53 and MDM2, suggesting this is essential in embryonic development across species, including humans.
Our analysis showed that genes involved in epigenetic regulation were enriched during late EGA. This expression coincides with the highly transient period of genomic and epigenomic reorganisation [43,44,45,46]. We observed the greatest epigenetic signature enrichment in the 8-cell embryo, with more histone deacetylases, methyltransferase and demethylases expressed than DNA methyltransferases and demethylases. DNA modifications are thought to be the main form of epigenetic regulation applied later in development when cells differentiate into particular cell types and genes exhibit long term repression, whereas histone modifications are the major form of gene silencing in early differentiation . The removal of histone modifications is enzymatically easy and so may be preferable during dynamic periods of genome reorganisation, whereas removal of such groups from DNA carry the risk of deleterious DNA mutations. Our analysis of transcriptomic data suggest that in the preimplantation human embryo histone modifications may be the main epigenetic regulatory mechanism during the transient and dynamic period of genome reorganisation. Although, of course, protein synthesis and activity of epigenetic regulators may not necessarily follow transcript expression and further epigenomic analysis would be required to confirm this observation. However, modifications to the epigenome are closely correlated with the transcriptome in developing systems . Epigenetic regulatory networks and chromatin modifying genes in particular were expressed at the 8-cell stage, and in light of their potential role in gene silencing, we explored their association with blastomere identity. Epigenetic gene expression patterns revealed two groups of 8-cell blastomeres (Fig. 7).
Networks involved in 8 cell blastomere identity and fate
To understand the molecular mechanisms by which individual 8-cell blastomeres achieve diverging cellular identities and either retain pluripotency or differentiate towards TE, we assessed the transcriptome of individual 8-cell stage blastomeres using gene network analyses. It has been previously shown that blastomeres have different transcriptional [48,49,50] and epigenomic identities . In this manuscript we have defined the relationship with respect to both the global transcriptome and to gene families involved in specifying cell fate. Analysis of genes involved in the hippo signalling pathway, cell polarity and pluripotency showed that no individual blastomere exhibited an identical expression profile to any other blastomere within the same embryo or to those of other 8-cell embryos. The generation of single sample networks for the individual blastomeres global transcriptomes allowed us to quantify the differences/similarities in modules between blastomeres and to map these with high confidence to the embryonic developmental interactomes identified earlier. Groups of blastomeres were identified as more or less similar to each other, and to global transcriptomes of ICM and/or TE which may represent transcriptional priming towards one or other lineage (Fig. 7).
We examined if the developmental changes identified around the time of EGA mapped onto the blastomere transcriptome, i.e. is blastomere transcriptional identity driven by EGA? The EIF family genes involved in translating EGA expression, revealed a close relationship with blastomere identity. Half of the post-EGA differentially expressed EIFs displayed disparate expression, with blastomere B1 expressing all EIFs and blastomere B2 and B8 expressing the least EIFs, suggesting that loss of EIFs may be a potential first point of differentiation [9, 12].
We accounted for more variation in the data by the application of dimensionality reduction techniques (PCA and IsoMap), which demonstrated a direct relationship between the global transcriptomic profiles of six blastomeres with ICM, of which all but B3 show differentially regulated network module similarities to ICM (Fig. 7). IsoMap also identified a direct relationship in the global genetic profiles of blastomeres B8 and B2 with TE, and blastomere B2 showed differentially regulated network module similarities to TE.
The network approaches used in this work are resistant to random error due to the scale free nature of the interactome  and therefore form a good basis for the comparison of blastomeres where inter individual variation has been recognised. A similar separation of the blastomeres is apparent after comparison to the most detailed publicly available RNAseq dataset from Petropoulos et al . Overall, the blastomeres enriched in the Petropoulos RNAseq group 3 genes have a stronger divergence towards ICM, whereas the blastomeres enriched in RNAseq group 1 genes have no particular lineage bias (Fig. 7). Having established the likely transcriptional bias of blastomeres towards ICM or TE, we further suggest that the pluripotent state leading to ICM is associated with maintenance of EIF expression, and epigenetic status.
We have applied a combination of systems biology approaches to identify a high confidence framework of human preimplantation embryonic networks focussed on EGA and its relationship to blastomere cell fate. This will be uniquely valuable for understanding early human development, where classical experimental design cannot be easily applied due to the rare and protected nature of human embryos. Moreover our gene regulatory framework highlights points of susceptibility during human embryo development, particularly around gene networks involved in EGA and blastomere fate. This provides the scientific community with the opportunity to explore the mechanisms underlying early programming of development, long term health e.g. according to the Developmental Origins of Health and Disease (DOHaD), and a baseline of normal development against which ART practitioners can assess the impact of new clinical technologies [53, 54].
Utilising multiple layers of computational evidence, we have detected two sets of blastomeres within an 8-cell embryo which exhibit diverging tendencies towards ICM and TE (Fig. 7). Overall our results suggest that the majority of blastomeres are still pluripotent and are unlikely to be lineage-specified or committed. This observation agrees with previous studies indicating that 5–8 cell blastomeres showed equal expression in all ICM, stem-ness and TE markers , and with RNAseq data, revealing relatively little transcriptional divergence at the 8-cell stage . However we have identified significantly more heterogeneity between blastomeres than has been previously reported and through the application of our network module approach we have detected biologically significant biases with functional relevance which may prime cells for the earliest cell fate determination. Our approach allows a greater depth of predicted functional analysis than other previously used single dimensional approaches.
Embryonic Genome Activation
Eukaryotic Initiation Factors
False Discovery Rate
Inner Cell Mass
Principal component analysis
Transcripts per million
Morgan HD, Santos F, Green K, Dean W, Reik W: Epigenetic reprogramming in mammals. Hum Mol Genet 2005, 14 Spec No 1:R47–R58.
Ly L, Chan D, Trasler JM. Developmental windows of susceptibility for epigenetic inheritance through the male germline. Semin Cell Dev Biol. 2015;43:96–105.
De Paepe C, Krivega M, Cauffman G, Geens M, Van de Velde H. Totipotency and lineage segregation in the human embryo. Mol Hum Reprod. 2014;20:599–618.
Latham KE, Schultz RM. Embryonic genome activation. Front Biosci. 2001;6:D748–59.
Braude P, Bolton V, Moore S. Human gene expression first occurs between the four- and eight-cell stages of preimplantation development. Nature. 1988;332:459–61.
Vassena R, Boue S, Gonzalez-Roca E, Aran B, Auer H, Veiga A, Izpisua Belmonte JC. Waves of early transcriptional activation and pluripotency program initiation during human preimplantation development. Development. 2011;138:3699–709.
Guo H, Zhu P, Yan L, Li R, Hu B, Lian Y, Yan J, Ren X, Lin S, Li J, et al. The DNA methylation landscape of human early embryos. Nature. 2014;511:606–10.
Bloor DJ, Metcalfe AD, Rutherford A, Brison DR, Kimber SJ. Expression of cell adhesion molecules during human preimplantation embryo development. Mol Hum Reprod. 2002;8:237–45.
Brison DR, Sturmey RG, Leese HJ. Metabolic heterogeneity during preimplantation development: the missing link? Hum Reprod Update. 2014;20:632–40.
Wong CC, Loewke KE, Bossert NL, Behr B, De Jonge CJ, Baer TM, Reijo Pera RA. Non-invasive imaging of human embryos before embryonic genome activation predicts development to the blastocyst stage. Nat Biotechnol. 2010;28:1115–21.
Galan A, Montaner D, Poo ME, Valbuena D, Ruiz V, Aguilar C, Dopazo J, Simon C. Functional genomics of 5- to 8-cell stage human embryos by blastomere single-cell cDNA analysis. PLoS One. 2010;5:e13615.
Shaw L, Sneddon SF, Zeef L, Kimber SJ, Brison DR. Global gene expression profiling of individual human oocytes and embryos demonstrates heterogeneity in early development. PLoS One. 2013;8:e64192.
Shaw L, Sneddon SF, Brison DR, Kimber SJ. Comparison of gene expression in fresh and frozen-thawed human preimplantation embryos. Reproduction. 2012;144:569–82.
Brady G, Iscove NN. Construction of cDNA libraries from single cells. Methods Enzymol. 1993;225:611–23.
Al-Taher A, Bashein A, Nolan T, Hollingsworth M, Brady G. Global cDNA amplification combined with real-time RT-PCR: accurate quantification of multiple human potassium channel genes at the single cell level. Yeast. 2000;17:201–10.
Iscove NN, Barbara M, Gu M, Gibson M, Modi C, Winegarden N. Representation is faithfully preserved in global cDNA amplified exponentially from sub-picogram quantities of mRNA. NatBiotechnol. 2002;20:940–3.
RCoreTeam. R: a language and environment for statistical computing. Vienna, Austria: R Foundation for Statistical Computing; 2016. https://www.r-project.org/
Gentleman RC, Carey VJ, Bates DM, Bolstad B, Dettling M, Dudoit S, Ellis B, Gautier L, Ge Y, Gentry J, et al. Bioconductor: open software development for computational biology and bioinformatics. Genome Biol. 2004;5:R80.
Kauffmann A, Gentleman R, Huber W. arrayQualityMetrics--a bioconductor package for quality assessment of microarray data. Bioinformatics. 2009;25:415–6.
Pepper SD, Saunders EK, Edwards LE, Wilson CL, Miller CJ. The utility of MAS5 expression summary and detection call algorithms. BMC Bioinformatics. 2007;8:273.
Benjamini Y, Drai D, Elmer G, Kafkafi N, Golani I. Controlling the false discovery rate in behavior genetics research. Behav Brain Res. 2001;125:279–84.
Cline MS, Smoot M, Cerami E, Kuchinsky A, Landys N, Workman C, Christmas R, Avila-Campilo I, Creech M, Gross B, et al. Integration of biological networks and gene expression data using Cytoscape. NatProtoc. 2007;2:2366–82.
Chatr-Aryamontri A, Breitkreutz BJ, Oughtred R, Boucher L, Heinicke S, Chen D, Stark C, Breitkreutz A, Kolas N, O'Donnell L, et al. The BioGRID interaction database: 2015 update. Nucleic Acids Res. 2015;43:D470–8.
Saito R, Smoot ME, Ono K, Ruscheinski J, Wang P-L, Lotia S, Pico AR, Bader GD, Ideker T. A travel guide to Cytoscape plugins. Nat Methods. 2012;9:1069.
Szalay-Beko M, Palotai R, Szappanos B, Kovacs IA, Papp B, Csermely P. ModuLand plug-in for Cytoscape: determination of hierarchical layers of overlapping network modules and community centrality. Bioinformatics. 2012;28:2202–4.
Kramer A, Green J, Pollard J Jr, Tugendreich S. Causal analysis approaches in ingenuity pathway analysis. Bioinformatics. 2014;30:523–30.
McCall MN, Bolstad BM, Irizarry RA. Frozen robust multiarray analysis (fRMA). Biostatistics. 2010;11:242–53.
McCall MN, Jaffee HA, Zelisko SJ, Sinha N, Hooiveld G, Irizarry RA, Zilliox MJ. The gene expression barcode 3.0: improved data processing and mining tools. Nucleic Acids Res. 2014;42:D938–43.
Zilliox MJ, Irizarry RA. A gene expression bar code for microarray data. Nat Methods. 2007;4:911–3.
Fontes M, Soneson C. The projection score--an evaluation criterion for variable subset selection in PCA visualization. BMC Bioinformatics. 2011;12:307.
Killick R, Eckley IA. Changepoint: an R package for Changepoint analysis. J Stat Softw. 2014;58:1–19.
Petropoulos S, Edsgard D, Reinius B, Deng Q, Panula SP, Codeluppi S, Plaza Reyes A, Linnarsson S, Sandberg R, Lanner F. Single-cell RNA-Seq reveals lineage and X chromosome dynamics in human preimplantation embryos. Cell. 2016;165:1012–26.
Huang DW, Sherman BT, Tan Q, Kir J, Liu D, Bryant D, Guo Y, Stephens R, Baseler MW, Lane HC, Lempicki RA. DAVID bioinformatics resources: expanded annotation database and novel algorithms to better extract biology from large gene lists. Nucleic Acids Res. 2007;35:W169–75.
Stevens A, Smith H, Garner T, Minogue B, Sneddon S, Shaw L, Oldershaw R, Bates N, Brison D, Kimber S. Interactome comparison of human embryonic stem cell lines with the inner cell mass and trophectoderm. In: bioRxiv; 2018.
Nishioka N, Inoue K, Adachi K, Kiyonari H, Ota M, Ralston A, Yabuta N, Hirahara S, Stephenson RO, Ogonuki N, et al. The hippo signaling pathway components Lats and yap pattern Tead4 activity to distinguish mouse trophectoderm from inner cell mass. Dev Cell. 2009;16:398–410.
Suzuki A, Ohno S. The PAR-aPKC system: lessons in polarity. J Cell Sci. 2006;119:979–87.
Niwa H, Toyooka Y, Shimosato D, Strumpf D, Takahashi K, Yagi R, Rossant J. Interaction between Oct3/4 and Cdx2 determines trophectoderm differentiation. Cell. 2005;123:917–29.
De Sousa PA, Watson AJ, Schultz RM. Transient expression of a translation initiation factor is conservatively associated with embryonic gene activation in murine and bovine embryos. Biol Reprod. 1998;59:969–77.
Latham KE. Mechanisms and control of embryonic genome activation in mammalian embryos. Int Rev Cytol. 1999;193:71–124.
Sonenberg N, Hinnebusch AG. Regulation of translation initiation in eukaryotes: mechanisms and biological targets. Cell. 2009;136:731–45.
Van Hoof D, Passier R, Ward-Van Oostwaard D, Pinkse MW, Heck AJ, Mummery CL, Krijgsveld J. A quest for human and mouse embryonic stem cell-specific proteins. Mol Cell Proteomics. 2006;5:1261–73.
Messerschmidt DM, de Vries W, Ito M, Solter D, Ferguson-Smith A, Knowles BB. Trim28 is required for epigenetic stability during mouse oocyte to embryo transition. Science. 2012;335:1499–502.
Reik W. Stability and flexibility of epigenetic gene regulation in mammalian development. Nature. 2007;447:425–32.
Reik W, Dean W, Walter J. Epigenetic reprogramming in mammalian development. Science. 2001;293:1089–93.
Xie W, Schultz MD, Lister R, Hou Z, Rajagopal N, Ray P, Whitaker JW, Tian S, Hawkins RD, Leung D, et al. Epigenomic analysis of multilineage differentiation of human embryonic stem cells. Cell. 2013;153:1134–48.
Messerschmidt DM, Knowles BB, Solter D. DNA methylation dynamics during epigenetic reprogramming in the germline and preimplantation embryos. Genes Dev. 2014;28:812–28.
Stevens A, Hanson D, Whatmore A, Destenaves B, Chatelain P, Clayton P. Human growth is associated with distinct patterns of gene expression in evolutionarily conserved networks. BMC Genomics. 2013;14:547.
Piras V, Tomita M, Selvarajoo K. Transcriptome-wide variability in single embryonic development cells. Sci Rep. 2014;4:7137.
Shi J, Chen Q, Li X, Zheng X, Zhang Y, Qiao J, Tang F, Tao Y, Zhou Q, Duan E. Dynamic transcriptional symmetry-breaking in pre-implantation mammalian embryo development revealed by single-cell RNA-seq. Development. 2015;142:3468–77.
Zdravkovic T, Nazor KL, Larocque N, Gormley M, Donne M, Hunkapillar N, Giritharan G, Bernstein HS, Wei G, Hebrok M, et al. Human stem cells from single blastomeres reveal pathways of embryonic or trophoblast fate specification. Development. 2015;142:4010–25.
Chavez SL, McElroy SL, Bossert NL, De Jonge CJ, Rodriguez MV, Leong DE, Behr B, Westphal LM, Reijo Pera RA. Comparison of epigenetic mediator expression and function in mouse and human embryonic blastomeres. Hum Mol Genet. 2014;23:4970–84.
Xue Z, Huang K, Cai C, Cai L, C-y J, Feng Y, Liu Z, Zeng Q, Cheng L, Sun YE, et al. Genetic programs in human and mouse early embryos revealed by single-cell RNA sequencing. Nature. 2013;500:593.
Harper J, Cristina Magli M, Lundin K, Barratt CLR, Brison D. When and how should new technology be introduced into the IVF laboratory? Hum Reprod. 2012;27:303–13.
Brison DR, Roberts SA, Kimber SJ. How should we assess the safety of IVF technologies? Reprod BioMed Online. 2013;27:710–21.
We would like to thank the IVF patients who kindly donated their embryos to this research programme, and the clinic staff at the Department of Reproductive Medicine, St Mary’s Hospital, Manchester, Manchester Fertility, Manchester, and the Hewitt Centre, Liverpool Women’s Hospital, Liverpool, who made this possible. We would especially like to thank our Senior Research nurse, Claudette Wright for co-ordinating these activities.
The work presented here was conducted as part of two EU projects: FP7-HEALTH-2011-TWO-STAGE-278418, EPIHEALTH and FP7-PEOPLE-2012-ITN, PITN-GA-2012-317146, EPIHEALTHNET. In accordance with FP7, no new human embryos were used in this research. We also acknowledge funding from the Medical Research Council (MR/L004992/1), Central Manchester NHS Foundation Trust, the NIHR Manchester Clinical Research Facility and the University of Manchester. The funding bodies provided support for the work described in this manuscript all design, analysis and interpretation of data was carried out by the authors.
Availability of data and materials
The datasets generated and analysed during the current study are available in the GEO repository (GSE110693).
Ethics approval and consent to participate
Ethical approval was obtained from the South Manchester (NHS) Research Ethics Committee under Human Fertility and Embryology Authority research licence R0026.
Consent for publication
The authors declare that they have no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplemental Methods and results (DOCX 3560 kb)
Figure S1. Un-normalised Affymetrix microarrays A) Boxplots represent summaries of the signal intensity distributions of the arrays. Each box corresponds to one array. B) Boxplot outlier detection was performed by computing the Kolmogorov-Smirnov statistic Kabetween each array’s distribution and the distribution of the pooled data. None of the samples have medians higher than 1.05, which would represent a low quality array C) Density distributions of the log2intensities grouped by the matching type of the probes. The blue line shows a density estimate (smoothed histogram) from intensities of perfect match probes (PM), the grey line, one from the mismatch probes (MM). D) RNA digestion plot. The shown values are computed from the pre-processed data. Each array is represented by a single line. E) MA plots (M = log2 (I1)-log2 (I2), A = 1/2(log2 (I1) + log2 (I2)), where I1 is the intensity of the array studied, and I2 is the intensity of a “pseudo”-array that consists of the median across arrays. The mass of the distribution in an MA plot should be concentrated along the M = 0 axis, and there should be no trend in M as a function of A. Shown are first the 4 arrays with the highest values of Da, then the 4 arrays with the lowest values. F) An example of feature intensities representing the arrays’ spatial distributions (M). Figure S2. A and B) Embryonic genome activation (EGA) interaction networks, differential regulation within the 8-cell and blastocyst compared to the 4-cell. C and D) Metanodes of the 8-cell and blastocyst compared to the 4-cell, metanodes as defined by the Cytoscape plugin ‘Moduland’, metanodes represent genes most central within each module. Red genes represent up-regulation, green nodes represent down-regulation and pink nodes represent non-differentially regulated genes but baseline expressed direct interaction partners. E and F) Tables of network module members. Metanodes represent the most centrally connected gene within a module alongside the next 10 centrally connected genes within each module. Modules are ranked in order from most to least centrally connected within the specific developmental network. Yellow highlighted genes are also identified as Ingenuity causal network genes. Figure S3. A) TRIM28 upstream regulatory network in blastocysts. MDM2 is the only target gene regulated by both upstream regulators MYC and TP53. TRIM28 together with MYC and TP53, may represent the upstream transcriptional control network over the MDM2 module in the blastocyst and provide upstream regulation of epigenetic networks. B) MDM2 is identified as a key module and upstream regulator at the blastocyst stage. MDM2 together with 22 participating regulators, controls the expression of 93 differentially regulated genes within the blastocyst. The participating regulator, transcription factor GATA3, is up-regulated 867-fold in the blastocyst. Pink nodes represent genes identified within module analysis and red nodes represent genes identified within both module and upstream regulatory network analysis. Pathway analysis of the MDM2 module reveals statistically over-represented (p ≤ 0.05) Reactome pathways, ordered from 1 to 10 according to their significance p-value. The MDM2 module is biologically relevant, with 6/10 of the top MDM2 module genes being members of the hedgehog signalling ‘on state’ pathway. Figure S4. A) 107 co-expression functional modules B) The frequency a module of a specific size was detected using co-expression analysis. C) Overlap of the intra-modular hubs between different methods. In order to have enough genes for the comparison between different methods, all the co-expression modules for the robustness evaluation were selected by including more than 5 genes. The diagonal of the table indicates the numbers of the total genes in each method; the lower triangular matrix shows the numbers of overlapping genes between the corresponding two methods; the upper triangular matrix shows the hypergeometric p-values for the numbers of overlapping genes. Figure S5. Each panel represents a single 8-cell blastomeres top 25 network modules (columns) and the top 10 centrally connected genes within each module. Blastomere networks and modules identified using the absolute expression values of 8-cell blastomeres. Modules are ranked in order from most (left) to least (right) centrally connected within the specific blastomere network. The most centrally connected gene within each module are shown in bold and the remaining genes are ranked from most (top) to least (bottom) centrally connected within a specific module. Blue highlighted genes are also identified as upstream regulatory genes. Figure S6. qPCR expression (ΔCt) of Hippo signalling, pluripotency and polarity genes across three sets of 8-cell blastomeres. The first set of blastomers are labelled A1-A8, the second set of blastomeres are labelled B1-B8 and the final set of blastomeres are labelled C1-C8. ΔCt was calculated as 40-Ct. Positive and negative bars represent standard error of the mean. Figure S7. Heat map of individual 8-cell blastomere RNAseq data extracted from Petropoulos et al. Heatmap displays individual 8-cell blastomeres on the horizontal axis and genes on the vertical axis. Individual blastomeres are clustered according to gene expression similarity. After outlier removal we used 59 of the 81 published samples. Embryo origin normalised and variance filter applied (0.21) to exclude noise. Resulting in 588 probes, separated into four groups based on hierarchal clustering. Red represents increased gene expression and blue represents decreased gene expression. (PPTX 6500 kb)
Supplemental Tables (XLSX 274 kb)
About this article
Cite this article
Smith, H.L., Stevens, A., Minogue, B. et al. Systems based analysis of human embryos and gene networks involved in cell lineage allocation. BMC Genomics 20, 171 (2019). https://doi.org/10.1186/s12864-019-5558-8
- Human embryos
- Gene networks
- Cell lineage