Gene expression throughout a vertebrate's embryogenesis

Background Describing the patterns of gene expression during embryonic development has broadened our understanding of the processes and patterns that define morphogenesis. Yet gene expression patterns have not been described throughout vertebrate embryogenesis. This study presents statistical analyses of gene expression during all 40 developmental stages in the teleost Fundulus heteroclitus using four biological replicates per stage. Results Patterns of gene expression for 7,000 genes appear to be important as they recapitulate developmental timing. Among the 45% of genes with significant expression differences between pairs of temporally adjacent stages, significant differences in gene expression vary from as few as five to more than 660. Five adjacent stages have disproportionately more significant changes in gene expression (> 200 genes) relative to other stages: four to eight and eight to sixteen cell stages, onset of circulation, pre and post-hatch, and during complete yolk absorption. The fewest differences among adjacent stages occur during gastrulation. Yet, at stage 16, (pre-mid-gastrulation) the largest number of genes has peak expression. This stage has an over representation of genes in oxidative respiration and protein expression (ribosomes, translational genes and proteases). Unexpectedly, among all ribosomal genes, both strong positive and negative correlations occur. Similar correlated patterns of expression occur among all significant genes. Conclusions These data provide statistical support for the temporal dynamics of developmental gene expression during all stages of vertebrate development.


Background
Much effort has been expended to define developmental stages: cellular and morphological hallmarks of critical points during embryogenesis. Stages, unlike developmental time alone, provide insights into cellular and molecular processes as simple as the eight-cell stage or as complex as the onset of circulation. Although numerous aspects of development have been discovered through studies of diverse species [1][2][3], a comprehensive analysis of gene expression for each separate stage of vertebrate development is lacking.
Among vertebrates, developmental processes are shared [4,5]; thus insights from fish inform human studies [4,5]. Fundulus heteroclitus, similar to zebrafish and medaka (rice fish), has external development and transparent eggs, which facilitate associations between morphological developmental changes and patterns of gene expression. Unlike many other developmental models, F. heteroclitus has a~14 day development that allows greater precision in defining specific stages than is possible in vertebrate species with short developmental times. Finally, this species has large population sizes, a well-described phylogeny, and locally adapted populations making it an exceptional model for environmental and evolutionary studies [6].
To provide quantitative and statistical analyses of development, we used four biological replicates from all 40 developmental stages (from fertilization to free swimming larvae, Figure 1A; Additional File 1 shows the full in vivo morphological atlas and Additional File 2 shows late organodifferentiation histology) of F. heteroclitus. These biological replicates provide the data for statistical analyses of the expression of 6,857 genes throughout embryogenesis and a better understanding of the differences among stages and developmental pathways. Although 6,857 genes are not the full complement of Significance of Effect (-log p-value) 10 Log Fold-Differences between Stages 2 1 0.4 <0. 13 2.3 >7 Expression Level 1 2 vs. 8 9 vs. 15 16 vs. 22 23 vs. 29 30 vs. 36 37 vs. 2 3 vs. 9 1 0 vs. 16 17 vs. 23 24 vs. 30 31 vs. 37 38 vs. 3 4 vs. 10 11 vs. 17 18 vs. 24 25 vs. 31 32 vs.
C. Hierarchical clustering of forty developmental stages based on shared gene expression patterns. Three majors clusters are colored yellow, red and blue and correspond to gene expression patterns. D. Pairwise differences between adjacent stages of F. heteroclitus development. Significances of differences as -log 10 (p-values) are plotted against log 2 differences in expression of adjacent stages. -log 10 (p-values) range from 0 to 8 and log 2 differences in expression range from -3 to 3 (-8-fold to 8-fold differences in expression). Numbers of significant genes (p < 0.01) that differ between stages are shown. Colors in these plots correspond to the colors of the gene tree in 1B.
genes expressed in vertebrates, they provide a statistically robust measure of differences between stages, which is important for experimental sciences that explore embryo responses to altered environments, chemical exposures and physiological differences.

Results and Discussion
Expression of 1,607 genes (23% of 6,857) differs significantly (p < 0.01) among all stages. Our a priori query addressed the number of significant differences between adjacent stages: expression of 3,062 genes (45%) differs significantly between stages. Hierarchical clustering of these genes ( Figure 1B and 1C) groups stages into three main clusters: 1) stages 1-13: fertilization-early gastrula; 2) stages 14-25: blastoderm-heart formation; 3) stages 26-40: onset of circulation-late organogenesis and posthatch. These groups are in developmental order with a few exceptions ( Figure 1C). Thus, the patterns of gene expression recapitulate the series of developmental stages.
The number of genes that alter expression between adjacent stages differs widely, from 5 to 665 ( Figure 1D Additional File 3). At a critical p-value of 0.01, one expects approximately 70 differences due to type-1 errors. Fifteen pairs of adjacent stages have less than this false expectation. Three of these pairs of adjacent stages with few significant genes occur among gastrulation stages (12-13, 16-17, 17-18) and two are prior to the 4-cell stage (1-2, 2-3). Not surprisingly, few differences in expression occur between fertilization and first cell division. In contrast, the lack of changes in expression throughout formation of the germ layers during gastrulation is unexpected.
Expression of 610 (8.9%) and 461 (6.7%) genes are significantly different between the 4-cell (stage 4) and 8cell (stage 5) stages and the 8-cell and 16-cell (stage 6) stages, respectively. During zebrafish embryogenesis, these stages correspond to the timing of maternal gene degradation and onset of embryo gene activity [7]. The large number of significantly differently expressed genes suggests similar timing for F. heteroclitus. Notice, a large number of genes show both significant increases and decreases from stage 4 to 5 ( Figure 1D). However, between stages 5 and 6, most significant differences (83%) are increases in expression (negative value for log 2 values of stage 5 -log 2 values of stage 6) suggesting that initiation of embryonic gene expression becomes more dominant at stage 6.
The greatest number of differentially expressed genes (665, 9.7%) occurs between pre-hatching (stage 35) and hatching (stage 36), which validates our incentive to distinguish these two stages in contrast to previous work [8]. Other notable differences occur between stages 38 and 39 (294, 4.3% of genes), when most of the yolk is consumed by the free-swimming Fundulus larvae (eleutheroembyros [9]) and between stages 25 and 26 (223, 3.3% of genes), marked by the onset of circulation.
Times to stage have large variances (Table 1 Additional File 4 gives full stage descriptions). For example, stages 5 and 6 are on average one hour apart yet have a combined standard deviation greater than one hour. Variability in times to stage becomes more pronounced as later stages become longer (Table 1). For stages 35 and 36, with 665 differences in expression, the mean times to stage differ by fourteen hours, which is nearly equal to the standard deviation for each stage. Thus, using time alone rather than developmental markers can lead to misinterpretations of gene expression changes.
The wave of maximum gene expression throughout development provides insight into how the magnitude of expression relates to developmental processes ( Figure  2AAdditional File 3). Four distinct quadrants, similar to the hierarchical clustering results, are formed: stages 1-13 (unfertilized egg-early gastrula), stages 14-25 (blastoderm-onset of circulation), stages 26-35 (growth and organodifferentiation-pre-hatching), and stages 36-40 (hatching and growth). Pre-mid gastrulation (stage 16) has the most genes (738) with peak expression, followed by pre-hatching (stage 35, 521 genes) and the 8-cell stage (stage 5, 400 genes). Thus, although most genes have a significant increase in expression between 8 and 16 cell stages, the greatest number of genes reaches maximum expression during pre-mid gastrulation. Notably, the lack of many significant differences among gastrulation stages (15-18, Figure 1) corresponds to the large block of maximum expression. These data suggest that both qualitative differences in expression (initiation of new gene expression) and quantitative differences (as reflected in peak gene expression) are important for defining developmental processes.
Genes with peak expression during stage 16 (pre-mid gastrula) are significantly enriched for ribosomal genes (p < 2.7 × 10 -16 , Fisher exact test, Table 1), which show highly positively correlated expression patterns ( Figure  2BAdditional File 5). Interestingly, these ribosomal genes are significantly negatively correlated with many of the other ribosomal genes expressed during development. Stage 16 also is significantly enriched for genes involved in oxidative phosphorylation, fatty acid metabolism, and translation as well as proteases (p < 0.0425, 0.0060, 0.0049, and 0.0067, respectively, Fisher exact tests, Table 1). Taken together, these data suggest that peak gene expression during gastrulation enhances highenergy demands during cellular proliferation and protein synthesis and turnover.
Among the 1,607 genes significantly differently expressed across all stages, almost half (49.2%) have a significant correlation coefficient (> 0.4 or < -0.4,  . Several post-hatch up regulated genes have important functions in muscle tissue development and movement including parvalbumins alpha and beta (calcium-binding proteins involved in muscle relaxation have 5-9-fold higher expression posthatch [10,11]), myosin regulatory light chain, skeletal muscle isoform (6.1-fold increase [12]), myosin light chain 3, skeletal muscle gene (5.2-fold increase [13]), myosin binding protein C (3.4-fold increase [14,15]) and troponins I, T and C (~3-fold increases [16]). Increased transcript levels of these genes suggest increases in movement and muscle activity of a free-swimming Fundulus compared to restricted movement within a chorion microenvironment before hatching. In addition, the creatine-kinase system is important for energy delivery in skeletal and cardiac muscle [17], and the 3-fold up regulation of muscle type creatine-kinase post-hatch indicates increases in metabolic activity and ATP consumption resulting from skeletal muscle activity caused by swimming.
Low transcript levels pre-hatch, and higher levels posthatch of both retinal-cone rhodopsin-sensitve cGMP (2.4-fold increase) and photosystem I reaction center subunit II-like gene (3.6-fold increase) are associated with vision [18,19]. Since both genes are induced by light and Fundulus embryos are exposed to higher light intensity after hatching than within the chorion [20], the up regulation of these two genes suggest embryo photoreceptor system response to a brighter external environment. Finally, several up regulated genes post-hatch suggest changes in metabolic activity of the free-swimming Fundulus. These include nucleoside diphosphate kinase NBR-B (6-fold increase), required for nucleoside triphosphate synthesis (other than ATPs) and necessary for lipid and polysaccharide synthesis, protein elongation, signal transduction and microtubule polymerization [21,22], trifunctional purine biosynthetic protein adenosine-3 (5.6-fold increase) which plays a major role in purine biosynthesis [23,24] and fatty acid-binding protein required for lipid transport and metabolism. Both liver and heart fatty acid binding protein genes are up regulated in Fundulus post hatch (liver: 2.3-fold, heart: 2-fold), and the increased expression levels likely reflect changes taking place within the last two stages of Fundulus development marked by the transition period between complete yolk consumption and increasing dependence on external food sources.

Conclusions
Statistical analyses of nearly 7,000 Fundulus genes during all 40 stages of vertebrate embryogenesis highlight the temporal dynamics of developmental gene expression and stage clustering. Analyses of differences in gene expression between adjacent stages and onset of peak gene activity emphasize the importance of correctly identifying stages during embryogenesis. These data show clear differences between the transitions among stages: 4-16 cell stages have many more significant changes than similar stages (e.g., onset of early blastula has 1/10 the number of significant genes); pre-post hatching involves the largest number of significant temporal changes in gene expression associated with transition from egg enclosed embryos to free swimming  larvae. These statistical analyses are possible because gene expression was quantified with biological replication among well-defined stages.

Fish field collection and maintenance
Adult Fundulus heteroclitus were captured from King's Creek, VA (37°17' 54.04"N; 76°25' 32.06"W) using minnow traps and transported under controlled temperature and aeration conditions to NCSU Aquatic Laboratory. Fish were maintained at 20°C and 15 ppt salinity in 40 gallon flow-though re-circulating tanks under a pseudosummer light cycle (16 h light/8 h dark) for 2 months prior to embryo culturing. Effluent from the tanks was passed through an activated charcoal filter system, and 20% of the water was changed weekly. Fish were fed (brine shrimp flake, blood meal flake, and Spirulina flake -FOD, Aquatic Biosystems), and checked for health status daily.

Fish spawning, fertilization, development, and egg collection for atlas, histology and gene expression
To minimize variability, we used embryos from a single Fundulus population, developing under controlled laboratory conditions. Mature females were stripped of their eggs and sperm from mature males were collected in separate beakers. Eggs from multiple females were fertilized by sperm from multiple males, and excess sperm were removed. Fertilized embryos were maintained in Petri dishes half submerged in 15 ppt filtered seawater in a 25°C environmental chamber under light during the first two stages of development (2-cell stage). Embryos that successfully reached the 2-cell stage within 3 hours were incubated under a 16 hour light/8 hour dark photoperiod at 25°C in the environmental chamber (818 Low Temperature Illuminated Incubator, Precision Scientific, USA). Fertilization success and embryo progress was monitored daily by examining representative stages during pre-determined time periods ([8]; internal data) using a dissecting stereo microscope (Nikon SME1500, Japan). Time to stage, normal versus abnormal development, and mortality also were recorded. Unfertilized eggs, malformed and/or dead embryos were removed from the population, and times and stages of arrest and abnormal development were recorded accordingly. Once the normally developing embryos reached pre-determined developmental stages, embryos were photographed using a Micropublisher 5.0 RTV Camera (QImaging) fitted on the stereo microscope, immediately placed in pre-chilled 1.5 ml microfuge tubes and snapfrozen at -80°C for later RNA analyses.

Average time to stage and heart rate
To determine the average time to stage for all 40 stages, three embryos from 10 different families (each family consisted of offspring from a single female and male cross) were monitored in individual 20-ml scintillation vials. Identification of each stage was determined using a dissecting stereo microscope (Nikon SME1500, Japan) at 70-80× magnification. Multiple images of developing embryos were taken at different phases of each developmental stage. Images were captured with the Micropublisher 5.0 RTV Camera (QImaging), and catalogued, stored, and analyzed using QCapture Pro imaging software.
We calculated average times during which >50% of observed embryos showed most of the morphological characteristics of a particular stage. The embryos were observed during pre-determined time periods based on the reported Fundulus developmental data [8] and our preliminary results.
The same embryos used to determine average time to stage were used to determine heart rate during early organogenesis (stage 31) and pre-hatching (stage 35). A beating heart is formed, with both chambers completely differentiated and in full view by stage 31, and the heart rate can be accurately determined from that stage on. Individual embryos were placed on a depression slide under the dissecting stereo microscope for 1 minute prior to taking heart rate measurements so that the stressed embryo could re-establish resting heart beat (most Fundulus embryos temporarily arrest their heart beat due to a sudden change of environment, such as transfer from the petri dish to a well-lit slide surface). The heart rate of each embryo was measured by counting the number of heart beats for 30 seconds (preliminary results showed no change in the average heart beat when counts were taken at either 30 second or 1 minute intervals).
Differences among embryos were analyzed with Prism Statistical Software using one-way Analysis of Variance (1-way ANOVA, p < 0.05) for time-to-stage and heart rate differences among embryo groups (families), respectively. A pairwise t-test (p < 0.05) was used to test the differences of the means between families for both timeto-stage and heart rate.

Histology
Decisions to refer to a structure as a specific organ or tissue were made using at least three criteria: 1. Spatial -position, relationship to other structures; 2. Temporal -time at which the structure first appears; 3. Features of its tissue and cellular components. All of these criteria were made possible by the publications of earlier works, which have defined stages in development of various fish species [8,[25][26][27][28]. Observations made with stereoscopic dissection microscopes at relatively low magnification are sufficient to provide a list of structures for early cellular stages of development. Similarly, as organ systems begin to appear, take on pigmentation and/or move, their presence provides markers for consistent recognition of specific developmental landmarks.
What has been less common is provision of detailed histological sections with sufficient resolving power to recognize organs, tissues and their component cell types. Despite the strengths of differentiating stains and greater magnification and resolution, the ability to orient sections within the changing architectural plan of a developing embryo is essential for accurate characterization to organ, tissue and cellular levels of organization. In this work, we used horizontal-longitudinal, sagittal and transverse sections. Horizontal-longitudinal sections cleave the embryo into dorsal and ventral portions. This is analogous to the frontal sections of mammalian organisms. However, since the embryo is curved over the yolk sac, a single plane of section is unlikely to be maintained through the length of the embryo. Sagittal sections cleave the embryo at the midline creating equal right and left halves of the organism. Sections to the side of the sagittal section are referred to as parasagittal. Transverse sections separate rostral from caudal portions of the embryo.
Due to the fact that the embryo and the outer surface of its associated yolk are curved, spatial relationships are difficult to define in entirety. This is a result of the embryo being inside spherical membranes and positioned flat upon a spherical surface subsequently maintained through processing by the cross-linking of proteins in the fixed embryo. Given the above, histological sections yielding true planes completely defining all of the above relationships are rare, if they exist at all. Thus, we often were faced with sections skewed to some degree, along dorsal-ventral, lateral to contralateral, lateral to medial, and/or horizontal to longitudinal orientations. For continuity and for overall representation, we regard relationships seen nearly in their entirety at lower magnification (10× objective) as valuable and these are followed by analysis of smaller areas using higher magnification and for the most part, differentiating power (i.e., the ability to distinguish one structure from its neighbors) giving us the potential to label organs, tissues and cells.
We chose late organogenesis of Fundulus development as a representative stage of major histological structures. During this stage, the heart chambers are fully differentiated and all the major organ systems are developed and fully functional. Upon confirming the stage and measuring the heart rate, the embryo pictures were taken and catalogued, the embryos were fixed in 10% neutral buffered formalin for 24-48 hours and stored overnight in 30% sucrose. The embryos were punctured through the chorion once, using the tip of a hypodermic needle, transferred to the mesh tissue cassettes, and allowed to fix longer overnight due to the thickness of chorionic membranes.
Embryos were embedded in paraffin, trimmed into 100 micron blocks and reinfiltrated in paraffin, and then reimbedded into the block. Tissues were then embedded for sectioning, which was done at 5 microns and placed on Silanized coated slides. Embryos were stained with hemotoxylin and eosin. Histological sections were viewed under a Nikon Eclipse E600 microscope, and the images were taken using Lumenera Infinity 2 (model #2-2C) 2.0 megapixel, 12 fps, CCC color camera. Digital images were analyzed using Eclipse Net Version 1.16.5 software.

Embryo RNA isolation, amplification, and labeling
Pools of frozen embryos collected at each developmental stage were used for RNA isolation, labeling, and microarray hybridization. Four pools of 25 embryos were used for stages 1-10, four pools of 15 embryos were used for stages 11-15, and 4 pools of 10 embryos were used for stages . Embryo RNA was extracted using a TRIzol ® buffer (Invitrogen, Carlsbad, CA, USA). Purified RNA was quantified with a spectrophotometer, and RNA quality was assessed by gel electrophoresis. RNA for hybridization was prepared by one round of amplification (aRNA) using the Amino Allyl MessageAmp aRNA Kit (Ambion, Austin, TX, USA) to form copy template RNA by T7 amplification. Amino-allyl UTP was incorporated into targets during T7 transcription, and resulting amino-allyl aRNA was coupled to Cy3 and Cy5 dyes (GE Healthcare, Piscataway, NJ, USA).
Labeled aRNA samples (2 pmol dye/ul) were hybridized to slides in 10 ul of hybridization buffer [50% formamide buffer, 5× SSPE, 1% sodium dodecyl sulfate, 0.2 mg/ml bovine serum albumin, 1 mg/ml denatured salmon sperm DNA (Sigma), and 1 mg/ml RNAse free poly(A) RNA (Sigma)] for 44 hours at 42°C. Slides were prepared for hybridization by blocking in 5% ethanoloamine, 100 mM Tris pH 7.8, and 0.1% SDS added just before use for 30 minutes at room temperature, washed for one hour in 4× SSC, 0.1% SDS at 50°C, and then boiled for 2 minutes in distilled water to denature the cDNAs. Resulting 16 bit Tiff Images were quantified using ImaGene ® (Biodiscovery, Inc.) spotfinding software. Controls and any gene that did not have at least one individual with a signal greater than the average signal from all herring sperm control spots (non-specific hybridization signal) plus one standard deviation were removed prior to statistical analyses. In total, 6,789 genes were analyzed.

Microarrays
Amplified cDNA sequences for 7,000 genes from F. heteroclitus cDNA libraries were spotted onto epoxide slides (Corning Inc., Corning, NY, USA) using an inkjet printer (Aj100, ArrayJet, Scotland, UK). The cDNAs used for the arrays were derived from libraries made from all 40 stages of Fundulus development, immediately post-hatch whole larvae, and adult tissues. Each slide contained four spatially separated arrays of~7,000 spots (genes) including controls. All spotted genes were sequenced and represent all of the unique contigs [29] isolated from the cDNA libraries. Thus, even if multiple sequences were annotated identically, they were treated as different genes. Multiple sequences with the same annotation do not contig together because: 1) they really are the same gene, but the sequences do not overlap, 2) they represent duplicate genes with different chromosomal locations, or 3) they share a high similarity (and hence are named based on this similarity) but are not the same gene. We erred on the side of caution and treated every gene-spot as unique. Each spot was analyzed as a separate gene for all analyses except the ribosomal correlations. For ribosomal correlations, only ribosomal genes that also had unique names were correlated to minimize correlations among potentially the same gene or recent gene duplicates. These ribosomal genes were chosen arbitrarily as the first gene in the gene list with a unique name.

Experimental design for microarrays
A double loop design was used for the microarray hybridizations where each sample is hybridized to 2 arrays using both Cy3 and Cy5 labelled fluorophores [30,31]. The loop consisted of Cy3 and Cy5 labelled embryo aRNAs from 4 biological pools for each of 40 stages (S). In total, 160 biological pools were hybridized to 80 microarrays. Each array had different combinations of biological pools [32]. The double loop formed was S1 S2 S3 ... S40 S1 and S40 S39 S38 ... S2 S1 S40 where each arrow represents a separate hybridization (array) with the biological pool at the base of the arrow labeled with Cy3 and the biological pool at the head of the arrow labelled with Cy5.
To control for batch effects, the biological replicates for each stage were randomized on arrays and were not processed simultaneously, the same batch of arrays was used for all hybridizations, and all arrays were processed within three days.

Embryonic gene expression
Log 2 measures of gene expression were normalized using a linear mixed model in JMP Genomics 3.2 (SAS, Cary, NC, USA) to remove the effects of dye (fixed effect) and array (random effect) following a joint regional and spatial Lowess transformation in MAANOVA version 0.98.8 for R to account for both intensity and spatial bias (Additional File 8 shows representative MA plots after normalization) [33]. The linear mixed model was of the form y ij = μ + A i + D j + (AxD) ij + ε ij where, y ij is the signal from the i th array with dye j, μ is the sample mean, A i and D j are the overall variation in arrays (arrays 1-80) and dyes (Cy3 and Cy5), (AxD) ij is the array × dye interaction and ε ij is the stochastic error [34,35].
Residuals from the above model were used in a linear mixed model to test for differences between stages on a gene-by-gene basis. The model was r ijk = μ + A i + D j + T k + ε ijk where T k is the k th treatment (stage 1-40, 39 d.f.), the D j effect is fixed (1 d.f.) and the A i effect is random (79 d.f., leaving 40 d.f. for the residual error). We used a similar analysis to test for differences between pre and post-hatch embryos except the k th treatment represented pre-hatch (stages 1-35) and post-hatch (stages [36][37][38][39][40] For all mixed model analyses, we used a nominal p-value cut-off for significant genes of p < 0.01. Using this p-value reveals more genes that may be differentially expressed but risks identifying genes that may be false positives. Microarray data have been deposited in NCBI's Gene Expression Omnibus [36] and are accessible through GEO Series accession number GSE21372 http://www.ncbi.nlm. nih.gov/geo/query/acc.cgi?acc=GSE21372.
Hierarchical clustering used JMP Genomics 3.2, Cluster 3.0 for Mac OS X, and Java TreeView version 1.0.8 [37]. Correlation analyses were done in JMP Genomics 3.2 and MATLAB version 7.2 was used for visualization. For peak expression, genes were ordered by their time of peak expression and standardized least square means were visualized. Separate peaks were defined as within the 90% CI of the maximum transcript level measured and at least 3 stages away from another peak.