Fig. 1 | BMC Genomics

Fig. 1

From: Associations between nucleosome phasing, sequence asymmetry, and tissue-specific expression in a set of inbred Medaka species

a. Focal tissue types during embryogenesis and lineage separation. b. Venn diagram of the three sets of representative TSSs detected in the respective tissue types in our SAGE data: blastulae, testes, and liver. The number of TSSs is shown for each subset. c. Schematic showing the periodicity of nucleosomes upstream or downstream of a representative TSS, based on the autocorrelation analysis that quantifies nucleosome positioning consistency (Methods). d. Spearman’s rank correlation coefficient matrix for the data derived from the Hd-rR (Hd) and HNI (HN) strains. The individual values can be found in Additional file 12: Table S1. Gene expression (Exp), breadth of TSS cluster (Br), nucleosome periodicity (Np), and nucleosome positioning consistency (Na; measured by autocorrelation) were monitored in blastula (Bla), liver (Liv), and testes (Tes). The single-nucleotide count of each nucleobase (A, C, G, and T), nucleosome periodicity (Np), and nucleosome consistency (Na) were calculated separately for the regions upstream (Up) and downstream (Dw) of TSSs. The difference between the nucleobase counts upstream and downstream of TSSs is called the sequence asymmetry of nucleotides (Sa). As an example of the nomenclature, Hd_A_Dw denotes the incidence in the Hd-rR strain of nucleotide A downstream of TSSs. We also report other sequence composition features, A+T (A or T), C+G, AA+TT, and CC+GG, and their sequence asymmetry values; however, these are not independent of the single-base composition features, and we therefore did not use them, to avoid redundancy (see Additional file 1: Figure S1c).
We considered 1128 pairwise combinations of 48 parameters as candidate hypotheses. Some comparisons show positive or negative association because of technical aspects of the quantitation (e.g., breadth and expression level in a given tissue are associated because highly expressed genes are sampled more deeply), while others show positive or negative association because of their intrinsic definitions (e.g., T content upstream and G content upstream are expected to be negatively associated). A Bonferroni correction, a standard method for multiple hypothesis testing, allows each hypothesis to be tested rigorously. A significance level of 5 %/1128 (~4.4 × 10⁻⁵) would require Spearman’s rank correlation coefficient r to satisfy |r| > 0.03; when |r| > 0.1, the p-value is below 10⁻⁴³. The white box shows a high correlation among nucleosome consistency downstream of TSSs in each tissue type (Hd/HN_Na_Bla/Liv/Tes_Dw), sequence asymmetry (Hd_Sa_A/T), A/T single-nucleotide counts upstream of TSSs (Hd_A/T_Up), gene expression level (Hd/HN_Exp_Bla/Liv/Tes), and breadth of TSS cluster (Hd/HN_Br_Bla/Liv/Tes). The three yellow boxes indicate high correlations among the periodicity of nucleosome positioning upstream (or downstream) of TSSs in the three tissues of the two strains, and a high correlation among the consistency of nucleosome positioning upstream of TSSs. The cyan boxes indicate that nucleosome positioning consistency upstream of TSSs is negatively correlated with sequence asymmetry and with the counts of A and T. The two purple boxes are similar to the white box, except that the C/G single-nucleotide counts upstream of TSSs (Hd_C/G_Up) are negatively correlated with the other parameters in the white box.
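
The arithmetic behind the Bonferroni threshold quoted above can be checked with a minimal sketch. It assumes that the 1128 hypotheses are all unordered pairs of the 48 parameters; the function name `bonferroni_threshold` is hypothetical and is not taken from the article.

```python
from math import comb


def bonferroni_threshold(n_parameters: int, alpha: float = 0.05) -> tuple[int, float]:
    """Return the number of pairwise tests and the per-test significance level.

    Assumption: every unordered pair of parameters is one hypothesis,
    so the number of tests is "n_parameters choose 2".
    """
    n_tests = comb(n_parameters, 2)
    # Bonferroni: divide the family-wise significance level by the number of tests.
    return n_tests, alpha / n_tests


n_tests, threshold = bonferroni_threshold(48)
print(n_tests)               # number of pairwise comparisons
print(f"{threshold:.2e}")    # per-test significance level
```

With 48 parameters this gives 1128 comparisons and a per-test level of roughly 4.4 × 10⁻⁵, matching the values in the caption. The step from this level to a minimum |r| depends on the number of TSSs, which the caption does not state, so it is not reproduced here.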
