Gene promoters show chromosome-specificity and reveal chromosome territories in humans

Gagniuc, Paul; Ionescu-Tirgoviste, Constantin

doi:10.1186/1471-2164-14-278

Research article
Open access
Published: 24 April 2013

BMC Genomics volume 14, Article number: 278 (2013)

Abstract

Background

Gene promoters have guided evolutionary processes for millions of years. They appear to have been the main engine responsible for integrating mutations favorable to environmental conditions. In cooperation with transcription factors and other biochemical components, these regulatory regions dictate the synthesis frequency of RNA molecules. Predominantly in the last decade, it has become clear that nuclear organization impacts gene regulation. To better understand the connections between Homo sapiens chromosomes and their gene promoters, we analyzed 1200 promoter sequences using our Kappa Index of Coincidence method.

Results

In order to measure the structural similarity of gene promoters, we used two-dimensional image-based patterns obtained through Kappa Index of Coincidence (Kappa IC) and (C+G)% values. The center of weight of each promoter pattern indicated a structural similarity between promoters of each chromosome. Furthermore, the proximity of chromosomes seems to be in accordance with the structural similarity of their gene promoters. The arrangement of chromosomes according to the Kappa IC values of their promoters shows a striking symmetry between chromosome length and the structure of the promoters located on them. High Kappa IC and (C+G)% values of gene promoters were also directly associated with the most frequent genetic diseases. Taking these observations into consideration, we propose a general hypothesis for the evolutionary dynamics of the genome. In this hypothesis, heterochromatin and euchromatin domains exchange DNA sequences according to a difference in the rate of Slipped Strand Mispairing and point mutations.

Conclusions

In this paper we showed that gene promoters appear to be specific to each chromosome. Furthermore, the proximity between chromosomes seems to be in accordance with the structural similarity of their gene promoters. Our findings are based on comprehensive data from the Transcriptional Regulatory Element Database and a new computer model whose core uses the Kappa Index of Coincidence.

Background

Inside the body, somatic cells exercise their overall functions in G₀ phase (the period between cell divisions) [1–3]. During this phase, individual chromosomes are impossible to distinguish by light or electron microscopy. When cells are terminally differentiated, some of them, such as myocytes, the majority of neuronal cell types or pancreatic beta cells, enter a permanent (quiescent) G₀ phase. Other cell types, such as glial cells or hepatocytes, exhibit a temporary G₀ phase and divide under controlled conditions. However, less is known about the precise location of chromosomes and their relationship with the inner nuclear membrane and the nuclear pores through which molecules are trafficked. Inside the nucleus of specialized cells, the spatial arrangement of chromosomes in G₀ phase plays an important role in the regulation of gene expression patterns [4, 5]. The nucleus lacks membrane compartmentalization [6, 7]. In telophase, mitotic chromosomes unfold into the chromatin state [8, 9]. Immediately after the nuclear membrane is formed, heterochromatin is allocated to the nuclear periphery, whereas euchromatin is generally located towards the nuclear interior. In G₀ phase, chromatin shows different states of condensation, such as constitutive heterochromatin, facultative heterochromatin and euchromatin [10, 11]. Constitutive heterochromatin consists of permanently condensed DNA, usually containing multiple short repeats and a low gene density. Facultative heterochromatin represents a temporary DNA condensation state, located at the surface of the heterochromatin landscape [12, 13]. The active part of the nucleus (gene-rich areas), where DNA is transcribed to mRNA, is represented by the euchromatin domain. The relaxed structure of euchromatin allows regulatory proteins and RNA polymerase complexes to bind to DNA for transcription initiation and elongation of mRNA [14]. 
Euchromatin domains that are never stored as facultative heterochromatin are usually under active transcription and contain housekeeping genes, which are crucial for basic cell functions [15]. Genes embedded inside facultative heterochromatin can transit to and from euchromatin, depending on the functions that the cell needs to perform at certain time intervals or under the action of certain external stimuli. It is recognized that many active genes that are brought into or near heterochromatin landscapes become repressed, and that their transcriptional reactivation occurs by relocation to the nuclear interior [16–18]. Nevertheless, other studies show that some genes are transcriptionally active close to the nuclear periphery [19–21]. Electron microscopy images show a lack of heterochromatin around nuclear pores [22]. Although active inside euchromatin, some inducible genes from the nuclear interior are relocated near nuclear pores for a fast response under the action of certain stimuli [23–27]. However, facultative heterochromatin represents one of many mechanisms through which cells start or stop the expression of certain genes. Heterochromatin is also critical in morphogenesis and differentiation. In embryogenesis, chromatin establishes different structural landscapes depending on cell specialization. For instance, Hox gene clusters [28, 29] are responsible for the spatial structure of the body. In humans, these genes are located on chromosomes 7 (HOXA gene cluster), 17 (HOXB gene cluster), 12 (HOXC gene cluster) and 2 (HOXD gene cluster). In embryogenesis, Hox genes are brought to the surface into the euchromatin domain in order to be expressed in a sequential manner [30, 31]. Polycomb-group proteins and other biochemical mechanisms reshape chromatin depending on the cell type, allowing a favorable positioning of these genes inside the euchromatin domain [32]. 
In terminally differentiated somatic cells, Hox genes are permanently silenced by their inclusion inside heterochromatin domain. Moreover, modulation of gene expression through chromatin structure is not limited only to single genes or gene clusters. For instance, in female morphogenesis an X chromosome is silenced through its condensation inside facultative heterochromatin [33–35] (the Barr body), while the active X chromosome is included in euchromatin domain. In G₀ phase, genes of common function can colocalize inside the nuclear space in order to share the same transcription machinery [36]. Thus, these genes may be incorporated into the same transcription factory or in close neighboring transcription factories [37, 38]. It appears that these active regions are positioned between chromosome territories.

In this paper we tried to identify structural features of gene promoters located on different chromosomes in the human genome. Our hypothesis was based on the fact that promoter sequences are more exposed to the biochemical transcription machinery and may therefore reflect chromosome boundaries much better. Previous approaches to promoter analysis include motif sequences and other structural parameters, such as DNA curvature, bendability, stability, nucleosome positioning or comparison of various DNA sequences [39–46]. Nevertheless, a clear association between promoter nucleotide sequences and chromosome territories has never been hypothesized. The purpose of our work was to establish a possible functional significance of promoter sequences which may explain the dynamic relationship between different chromosome territories.

Methods

In our approach we used 1200 promoter sequences (50 random promoters from each chromosome) from the Transcriptional Regulatory Element Database [47, 48]. We were mainly interested in the regions flanking the putative TSS, ranging from -700b to 299b. We used Visual Basic to develop a software program for promoter analysis called PromKappa (Promoter analysis by Kappa). The source code of this program is provided in Additional file 1. We used a sliding window approach to extract two types of values, namely Kappa Index of Coincidence (Kappa IC) and (C+G)%.

Kappa index of coincidence

The Index of Coincidence principle is based on letter frequency distributions and has been used for the analysis of natural-language plaintext in cryptanalysis [49]. Kappa Index of Coincidence is a form of Index of Coincidence used for matching two text strings. We adapted Kappa IC for the analysis of a single DNA sequence. This adaptation is used for calculating the level of “randomization” of a DNA sequence. Kappa IC is sensitive to various degrees of sequence organization such as simple sequence repeats (SSRs) or short tandem repeats (STRs) [50]. The formula for Kappa IC is shown below, where sequences A and B have the same length N. The sum ∑ is incremented by 1 only when nucleotide A[i] from sequence A matches the corresponding nucleotide B[i] from sequence B. Q represents the number of letters in the alphabet (in our case Q=4).

\mathrm{Kappa}_{IC} = \frac{\sum_{i = 1}^{N} [A_{i} = B_{i}]}{N / Q}
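
As an illustration of the two-sequence formula above, the following Python sketch counts positional matches and divides by N / Q. It is not taken from PromKappa; the function name `kappa_ic_pair` and the error handling are assumptions made for this example.

```python
def kappa_ic_pair(a: str, b: str, q: int = 4) -> float:
    """Kappa IC of two equal-length sequences: matches / (N / Q).

    q is the alphabet size (4 for DNA). Hypothetical helper, for illustration only.
    """
    if len(a) != len(b):
        raise ValueError("sequences must have the same length")
    if not a:
        raise ValueError("sequences must be non-empty")
    # The Iverson bracket [A_i = B_i] contributes 1 for each matching position.
    matches = sum(1 for x, y in zip(a, b) if x == y)
    return matches / (len(a) / q)


if __name__ == "__main__":
    print(kappa_ic_pair("ACGT", "ACGT"))  # identical sequences
    print(kappa_ic_pair("AAAA", "AATT"))  # half of the positions match
```

With this normalization, two identical sequences give a value equal to Q, while two unrelated random sequences with uniform base frequencies are expected to give a value close to 1.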

With small changes, the same method for measuring the Index of Coincidence has been applied for only one sequence, in which the sequence was actually compared with itself, as shown below in the algorithm implementation.

function KIC(A)
    T = 0
    C = 0
    N = length(A) - 1
    for u = 1 to N
        B = A[u + 1] … A[length(A)]
        for i = 1 to length(B)
            if A[i] = B[i] then C = C + 1
        next i
        T = T + (C / length(B) × 100)
        C = 0
    next u
    KIC = Round((T / N), 2)
end function

Where A represents the sliding window content, N is the number of shifts applied to it (the window length minus one), B is the part of A that starts at position u+1, C counts the coincidences between sequence B and sequence A for the current shift, and T accumulates the percentage of coincidences over all shifts. The returned value is the mean of these percentages, rounded to two decimals.
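
The self-comparison described above can be sketched in Python as follows. This is a minimal illustration and not the authors' PromKappa (Visual Basic) code; the function name `kappa_ic_self` is hypothetical, and the sketch assumes that each shifted copy B runs to the end of the window, so that the last shift still contains one nucleotide.

```python
def kappa_ic_self(window: str) -> float:
    """Self-comparison Kappa IC of one window, as a percentage (0 to 100)."""
    n = len(window) - 1  # number of shifts
    if n < 1:
        raise ValueError("window must contain at least two nucleotides")
    total = 0.0
    for u in range(1, n + 1):
        shifted = window[u:]  # B: the window with its first u nucleotides dropped
        # zip stops at the shorter string, so A[i] is compared with B[i]
        # for i = 1 .. length(B), as in the pseudocode.
        matches = sum(1 for x, y in zip(window, shifted) if x == y)
        total += matches / len(shifted) * 100
    return round(total / n, 2)


if __name__ == "__main__":
    print(kappa_ic_self("AAAA"))  # homopolymer: every shift matches fully
    print(kappa_ic_self("ACGT"))  # no repeated structure at any shift
    print(kappa_ic_self("ATAT"))  # dinucleotide repeat: only the shift by 2 matches
```

Repetitive windows score high because many shifts realign the repeat with itself, which is why the measure responds to SSRs and STRs.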

Cytosine and guanine content

We extracted C+G values from each sliding window considering the nucleotide frequencies from the entire promoter sequence. In the first stage, to determine the (C+G)% content for the entire promoter sequence we used the formula:

CG_{TOT} = \left(\frac{100}{(A + T + C + G)_{TOT}}\right) \times (C + G)_{TOT}

Where “TOT” (total) designates the promoter sequence. CG_TOT represents the percentage of cytosine and guanine, (A+T+C+G)_TOT represents the sum of occurrences of A, T, C and G, and (C+G)_TOT represents the sum of occurrences of C and G. In the next stage we used the value of CG_TOT to calculate the (C+G)% content from the sliding window (SW):

CG_{SW} = \left(\frac{CG_{TOT}}{(A + T + C + G)_{SW}}\right) \times (C + G)_{SW}

Where CG_SW represents the percentage of cytosine and guanine from the sliding window. In this stage, the CG_SW value is relative to CG_TOT. The expression (A+T+C+G)_SW represents the sum of occurrences of A, T, C and G in the sliding window sequence, and (C+G)_SW represents the sum of C and G occurrences in the sliding window sequence. Nevertheless, in our implementation we also included the option to extract CG_SW values without considering CG_TOT.
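
The two formulas can be sketched in Python as below. The function names `cg_total` and `cg_window` and the `scale` parameter are assumptions introduced for this example. Passing `scale=100` is one way to model the option of extracting CG_SW without considering CG_TOT, which may differ from how PromKappa exposes that option.

```python
def _counts(seq: str) -> tuple[int, int]:
    """Return (number of A/T/C/G characters, number of C/G characters)."""
    s = seq.upper()
    acgt = sum(s.count(base) for base in "ATCG")
    cg = s.count("C") + s.count("G")
    return acgt, cg


def cg_total(promoter: str) -> float:
    """CG_TOT: (C+G)% over the entire promoter sequence."""
    acgt, cg = _counts(promoter)
    if acgt == 0:
        raise ValueError("sequence contains no A, T, C or G")
    return 100 / acgt * cg


def cg_window(window: str, scale: float = 100.0) -> float:
    """CG_SW: C+G content of one window, scaled by `scale`.

    Use scale=cg_total(promoter) for the value relative to CG_TOT,
    or scale=100 for a plain percentage.
    """
    acgt, cg = _counts(window)
    if acgt == 0:
        raise ValueError("window contains no A, T, C or G")
    return scale / acgt * cg


if __name__ == "__main__":
    promoter = "AACCGGTT"
    total = cg_total(promoter)
    print(total, cg_window("CCGG", scale=total), cg_window("ACGT"))
```

Counting only A, T, C and G in the denominator means that ambiguous symbols such as N do not dilute the percentage.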

Promoter analysis

By extracting Kappa IC percentages and C+G content from a sliding window (window size of 30 nt and a step of 1 nt), we were able to measure localized values along the promoter sequences (Figure 1A,B). Kappa Index of Coincidence values were plotted on a graph against (C+G)% values, which form a recognizable pattern for each promoter sequence (Figure 1C). The x-coordinate of each point was represented by a (C+G)% value and the y-coordinate by the corresponding Kappa IC value. As expected, a large window size produced smooth promoter patterns, whereas a small window size generated sharp and distinguishable promoter characteristics. These patterns are composed of clusters of various sizes on the y-axis (Figure 1C and Additional file 2). The center of weight of each pattern was plotted on a graph designed to show the distribution of promoters for each chromosome. Furthermore, in order to observe the boundaries within which Homo sapiens promoters are included, we used 8,515 gene promoters from EPD [51, 52] (Eukaryotic Promoter Database) to produce a more general distribution (Figure 1D and Additional file 3). In this case we used a color scheme to highlight the denser surfaces. Red areas represent clusters of similar promoters, while blue areas represent unique or rare promoters.
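
A minimal Python sketch of the pattern extraction described above is given here. The window size (30 nt) and step (1 nt) come from the text. The function names are hypothetical, the plain (C+G)% is used without scaling by CG_TOT, and the center of weight is assumed to be the arithmetic mean of the point coordinates, which the text does not state explicitly.

```python
def kappa_ic_self(window: str) -> float:
    """Self-comparison Kappa IC of one window, as a percentage."""
    n = len(window) - 1
    total = 0.0
    for u in range(1, n + 1):
        shifted = window[u:]
        matches = sum(1 for x, y in zip(window, shifted) if x == y)
        total += matches / len(shifted) * 100
    return round(total / n, 2)


def cg_percent(window: str) -> float:
    """Plain (C+G)% of one window, counting only A, T, C and G."""
    s = window.upper()
    acgt = sum(s.count(base) for base in "ATCG")
    return 100 / acgt * (s.count("C") + s.count("G")) if acgt else 0.0


def promoter_pattern(seq: str, window_size: int = 30, step: int = 1):
    """Return one (x, y) = ((C+G)%, Kappa IC) point per sliding window."""
    points = []
    for start in range(0, len(seq) - window_size + 1, step):
        window = seq[start:start + window_size]
        points.append((cg_percent(window), kappa_ic_self(window)))
    return points


def center_of_weight(points):
    """Assumed definition: the mean x and mean y of all pattern points."""
    if not points:
        raise ValueError("pattern has no points")
    n = len(points)
    return (sum(x for x, _ in points) / n, sum(y for _, y in points) / n)


if __name__ == "__main__":
    pattern = promoter_pattern("ACGT" * 10)  # 40 nt toy sequence, not real data
    print(len(pattern), center_of_weight(pattern))
```

A sequence of length L yields L - 30 + 1 windows with these defaults, so a 1000 nt promoter region would contribute 971 points to its pattern.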

Results

We first investigated if some promoter patterns occur more often on certain chromosomes. Secondly we determined if chromosome territories could be revealed by using Kappa IC. In the third analysis we examined the distribution of Kappa IC values against the number of genetic diseases associated with each chromosome.

Gene promoters show chromosome-specificity

Our first observation regarding promoter-chromosome specificity originated from a direct correlation between Kappa IC values and (C+G)% (Additional file 4). For the majority of chromosomes, promoter regions show almost proportional Kappa IC and CG% values relative to each other (Figure 2A). Promoters with the largest Kappa Index of Coincidence are placed on chromosome 4, while promoters from chromosomes 11 and 16 have almost the same Kappa Index of Coincidence and relatively close variations of cytosine and guanine content. Promoters with the lowest Index of Coincidence are located on chromosome Y (Figure 2B). The order of chromosomes by promoter Kappa Index of Coincidence is shown in Figure 2C,D. Interestingly, chromosomes X and Y contain promoters with the lowest CG% and Kappa Index of Coincidence values. Promoter regions with the highest Kappa Index of Coincidence values (i.e. chromosomes 4, 5, 7, 21) contain various SSR and STR structures (Figure 2B). This suggests that, in their evolution, promoters located on these chromosomes experienced few point mutations and accumulated more Slipped Strand Mispairing (SSM) mutations [53].

In contrast, promoter regions with the lowest Kappa Index of Coincidence values (i.e. chromosomes Y, X, 12, 8) contain more interspersed nucleotides (A,T,C,G ≈ 25%) and fewer SSR and STR structures (Figure 2B). Accordingly, this suggests that, in their evolution, promoters located on these chromosomes have accumulated a multitude of random point mutations, thus disrupting SSR structures like poly(dA:dT) or poly(dC:dG) tracts [54, 55] into shorter elements. Although without immediate consequences, point mutations that occur in promoter regions gradually change gene expression patterns and, consequently, the relation of their genes within certain biological pathways.

Heterochromatin and euchromatin are two main evolutionary forces

Chromosomes such as 1, 9, 16 or the Y-chromosome contain large regions of constitutive heterochromatin [56–58]. In terms of evolution, across generations the X-chromosome is also occasionally a part of heterochromatin (the Barr body). Our results suggest that promoters located on chromosomes which contain regions frequently included in heterochromatin exhibit only average to low Kappa Index of Coincidence values (Figure 2B). This in turn suggests that, among other roles, heterochromatin also acts as a shield for the inner core against point mutations originating from outside the nucleus. Although controversial, the “bodyguard” model [59] of heterochromatin appears to be partially true, not as a purely protective role but rather as a layered evolutionary mechanism in which some vital regions of the genome are exposed for rapid phenotypic changes (i.e. tissue-specific genes) and those regions which need less change are more protected (i.e. housekeeping genes). It is known that mammalian housekeeping genes evolve more slowly than tissue-specific genes [60]. Furthermore, it is also accepted that non-coding regions suffer more mutations than coding regions [61]. Evolutionarily, chromatin structure may influence the distribution of point mutations or other mutational events in the promoter sequence. A chromatin-dependent distribution of point mutations can lead to a gradual shift in gene expression. Gene promoters located mainly inside the euchromatin domain remain prone to stable SSM mutations, favoring the maintenance of SSR or STR structures in the promoter regions. For instance, poly(dA:dT) tracts inside promoters were often associated with high gene expression levels, while a disruption of poly(dA:dT) tracts into shorter elements had the opposite effect [62]. Although SSM mutations may appear with an equal probability in all promoters during DNA replication, it seems that only SSRs or STRs of promoters stored inside euchromatin are preserved. 
Accordingly, functional SSRs or STRs of promoters stored inside heterochromatin are gradually deteriorated by point mutation events. In most organisms, constitutive heterochromatin is usually associated with chromosomal areas of repetitive DNA sequences (commonly around the chromosome centromere and near telomeres), which seem to confer an overall trigger pattern for a tight colloid-like formation between nucleosomes [63, 64]. However, functional areas (promoters and genes) that have a lower predisposition for tight nucleosome packing are more susceptible to point mutations inside heterochromatin than classical repetitive DNA sequences. Based on the overall promoter-chromosome specificity distributions (Figure 2), our hypothesis for a possible evolutionary dynamics of the eukaryotic nucleus implies a permanent exchange of DNA areas between heterochromatin and euchromatin domains (Figure 3). Inside heterochromatin (Figure 3A), DNA repetitions degraded by point mutations lose their overall ability for tight nucleosome packing. Inside euchromatin (Figure 3B), SSM mutations favor DNA repetitions, which over time gain a predisposition for tight nucleosome packing and ultimately allow for heterochromatin formation. Nevertheless, in such a hypothesis the selection pressure may decide the speed by which some DNA areas are brought to the surface into the heterochromatin landscapes.

Chromosome territories in humans

What surprised us in particular was the symmetry of the chromosome order when chromosomes are arranged by promoter Kappa IC values (Figure 2D, blue “amphora”-shaped semi-circles). Generally, chromosomes were numbered according to their size. In Figure 2D we show an abstracted model in which chromosomes are ordered by the Kappa IC values of their promoters (colored in blue). In this model, however, the blue arrows follow the order of chromosomes according to their size (starting from chromosome 4, which contains promoters with the highest Kappa IC values). Thus, the arrows that connect more distant chromosomes in this order show a proportionally larger semi-circle radius (a radius proportional to the relative distance between them). The apparent 2-fold symmetry on the y-axis (between chromosomes 4–11 and chromosomes 19-Y) further suggests that there is a correlation between chromosome length and the structure of the gene promoters located on them (Figure 2D and Additional file 5). In contrast, when chromosomes were ordered by the (C+G)% values of promoters following the same rules, we could not observe any obvious symmetries (Figure 2D, red arrows). Figure 2C shows the order of chromosomes and their position relative to one another when they are arranged separately by the two values.

Chromosomal territories have cell-type specificity [65]. Relying exclusively on sequence composition, our promoter distributions may show which chromosomes are most frequently adjacent inside the nucleus in G₀ phase. The human genome codes for ~2600 transcription factors [66]. However, the number of available transcription factors (and consequently the number of transcription factories) expressed at any given time is relative to each cell type. Genes located relatively close to each other in the nuclear space have a greater probability of being incorporated into the same transcription factory [67, 68]. In this regard, our results suggest that gene promoters with similar structures (i.e. similar DNA-binding sites and SSRs) seem to be included in the same transcription factories. This further implies that genes with different promoter structures, although close in the nuclear space, may be included in different transcription factories. Interestingly, the order of chromosomes by promoter Kappa IC values partially coincides with the chromosomal territories of human fibroblast nuclei in G₀ phase observed by Bolzer et al. [69] (Figure 4A). The MDS (multidimensional scaling) plot from Bolzer et al. provides a 2D distance map of the mean locations of the IGCs (fluorescence intensity gravity centers) of all heterologous chromosome territories (CTs) established from 54 G₀ nuclei. Here, we notice some similarity of distribution for certain groups of chromosomes, such as chromosomes 1 and 4, or chromosomes 11 (containing the beta globin gene cluster) and 16 (containing the alpha globin gene cluster) (Figure 4A,B). In order to obtain an overview of this correlation with the results presented by Bolzer et al. regarding the mean locations of chromosomes in G₀ phase (Figure 4A), we have subdivided their distribution into two main sectors. 
We have chosen two circular perimeters: a first perimeter (perimeter 1), which incorporates the chromosomes found at the extremity of their distribution, and a smaller circular perimeter (perimeter 2), which includes the chromosomes that are closer to the zero point (the middle of the chart). In our distribution (Figure 4B), we marked all points present in perimeter 1 with green dots and all points present in perimeter 2 with red dots. We noticed that peripheral dots (red color) from our distribution correspond to the perimeter 2 area of the Bolzer et al. distribution, whereas central dots (green color) from our distribution correspond to perimeter 1 of the Bolzer et al. distribution. Furthermore, the interchromosomal contact probabilities between pairs of chromosomes presented by Lieberman-Aiden et al. [70], showing that chromosomes 16, 17, 19, 20, 21 and 22 preferentially interact with each other, were also correlated with our results. In our distribution of gene promoters, these chromosomes are located very close to each other and are relatively united by a single diagonal line (except chromosome 22, which is slightly below chromosome 19; see Figure 4B), suggesting a similar conclusion. Although many factors may be involved, this comparison of observed vs. calculated positions suggests that DNA sequence composition dictates the overall positions of chromosomes in G₀ phase. In this regard, areas of chromosomes that contain gene promoters with common structures (i.e. Kappa IC and (C+G)% values) seem to position themselves next to each other, relative to each cell type. A more detailed distribution of promoters belonging to each chromosome is shown in Figure 5, which may further detail the chromosomal areas of interaction.

Promoter Kappa IC values vs. genetic diseases

A more intriguing association was found between the number of genetic diseases per chromosome and promoter Kappa IC and (C+G)% values (Figure 6A,B). Although the number of genetic diseases associated with individual chromosomes may exceed several hundred, we used a list of common types of genetic diseases provided by NCBI [71]. It seems that high Kappa IC and (C+G)% values of gene promoters are directly associated with the number of classic genetic diseases. Exceptions to this relative proportion are chromosomes 21, 22 and X, which exhibit asynchronous values between Kappa IC, (C+G)% and the number of common genetic diseases per chromosome (Figure 6A,B).

Discussion

Gene promoters are located upstream of the TSS (Transcription Start Site). A typical promoter region consists of a core promoter and regulatory domains. The association of transcription factors within a promoter precedes RNA synthesis [72]. Accordingly, the structure of a promoter is recognized by the presence of known promoter elements, such as the TATA box, GC-box, CCAAT-box, BRE and INR box [73]. In order to elucidate evolutionary relationships, many comparisons have been made between gene promoters of different species. Nevertheless, correlations between promoters of genes located on different chromosomes of the same species have been poorly studied. In this regard, we have chosen a different approach to analyze promoter sequences by using two-dimensional image-based patterns obtained through Kappa Index of Coincidence (Kappa IC) and (C+G)% values [74]. Each pattern is composed of vertically aligned clusters of Kappa IC (y-axis) and (C+G)% (x-axis) values. The vertical positions of these clusters form a promoter pattern which has a specific form for each promoter sequence. Their shape is explained by the presence of different structures such as simple sequence repeats (SSRs) or short tandem repeats (STRs). In order to investigate a possible relationship between promoters of genes located on different chromosomes, we have plotted the center of weight of 1200 promoter patterns (Figure 5A-X). The center of weight of each promoter pattern indicates an average between all SSRs and STRs present in the promoter sequence. An explanatory model of an image-based promoter pattern can reveal some visual insights into different promoter regions, such as the locations of all SSRs and STRs (Figure 7A-F). We have also noticed the directions and the angles of these promoter distributions, which may suggest an evolutionary tendency (Figure 1D).

The haploid human genome occupies a nuclear volume of approximately 1000 μm³ and comprises 3.2 billion base pairs of compacted DNA [75–77]. Nucleosomes compact and regulate access to DNA by assuming specific positions [78, 79]. The interaction between nucleosomes that incorporate functional sequences located at great distances inside the nucleus is provided by a favorable positioning of other nucleosomes that incorporate non-coding sequences. Accordingly, an overall picture begins to take shape, namely that the evolutionary process cannot tolerate non-functional information. Although many studies show that refined mechanisms involved in the dynamics of the nucleus are ATP (adenosine-5'-triphosphate) dependent processes [80, 81], we wondered if self-organization processes and other biophysical phenomena could be even more involved than previously thought. Nevertheless, DNA-guided self-organization processes that may concern chromatin mobility will be of utmost importance for our understanding of the dynamics of the nucleus.

In a recent study, we have suggested that eukaryotic genomes may exhibit at least 10 classes of promoters [82]. In future research we wish to highlight the distribution of these promoter classes on each chromosome. Furthermore, we are also interested to observe the differences between Kappa IC values of introns and exons related to each chromosome in order to understand if the relative proportions presented here will remain constant.

Conclusions

In this paper a comprehensive analysis was undertaken for promoter sequences from Homo sapiens. In our approach we used 1200 promoter sequences (50 random promoters from each chromosome) from the Transcriptional Regulatory Element Database. In order to measure the structural similarity of gene promoters, we used two-dimensional image-based patterns obtained through Kappa Index of Coincidence (Kappa IC) and (C+G)% values. The center of weight of each promoter pattern indicated an average between all SSRs and STRs present in the promoter sequence. A distribution of these average values showed that gene promoters appear to be specific to each chromosome. Furthermore, the proximity between chromosomes seems to be in accordance with the structural similarity of their gene promoters. Although chromosomes are positioned differently depending upon each cell type, they exhibit a predisposition for a standard arrangement. High Kappa IC and (C+G)% values of gene promoters were also directly associated with the most frequent genetic diseases. Taking these observations into consideration, we propose a general hypothesis for the evolutionary dynamics of the genome. In this hypothesis, heterochromatin and euchromatin domains exchange DNA sequences according to a difference in the rate of mutations.

References

Mendelsohn ML: Autoradiographic analysis of cell proliferation in spontaneous breast cancer of C3H mouse. III. The growth fraction. J Natl Cancer Inst. 1962, 28: 1015-1029.
Zetterberg A, Larsson O: Kinetic analysis of regulatory events in G1 leading to proliferation or quiescence of Swiss 3T3 cells. Proc Natl Acad Sci USA. 1985, 82: 5365-5369. 10.1073/pnas.82.16.5365.
Coller HA: What's taking so long? S-phase entry from quiescence versus proliferation. Nat Rev Mol Cell Biol. 2007, 8 (8): 667-70. 10.1038/nrm2223.
Jackson DA: The anatomy of transcription sites. Curr Opin Cell Biol. 2003, 15 (3): 311-7. 10.1016/S0955-0674(03)00044-9.
Jackson DA: The amazing complexity of transcription factories. Brief Funct Genomic Proteomic. 2005, 4 (2): 143-57. 10.1093/bfgp/4.2.143.
Cremer T, Cremer C: Chromosome territories, nuclear architecture and gene regulation in mammalian cells. Nat Rev Genet. 2001, 2: 292-301. 10.1038/35066075.
Hetzer MW, Walther TC, Mattaj IW: Pushing the envelope: structure, function and dynamics of the nuclear periphery. Annu Rev Cell Dev Biol. 2005, 21: 347-80. 10.1146/annurev.cellbio.21.090704.151152.
Verschure PJ: Positioning the genome within the nucleus. Biol Cell. 2004, 96 (8): 569-77. 10.1016/j.biolcel.2004.07.001.
Tumbar T, Sudlow G, Belmont AS: Large-scale chromatin unfolding and remodeling induced by VP16 acidic activation domain. J Cell Biol. 1999, 145 (7): 1341-54. 10.1083/jcb.145.7.1341.
Avramova Z: Heterochromatin in Animals and Plants. Similarities and Differences. Plant Physiol. 2002, 129 (1): 40-9. 10.1104/pp.010981.
Brown SW: Heterochromatin. Science. 1966, 151: 417-425. 10.1126/science.151.3709.417.
Cremer T, et al: Chromosome territories – a functional nuclear landscape. Curr Opin Cell Biol. 2006, 18: 307-316. 10.1016/j.ceb.2006.04.007.
Oberdoerffer P, Sinclair D: The role of nuclear architecture in genomic instability and ageing. Nat Rev Mol Cell Biol. 2007, 8: 692-702. 10.1038/nrm2238.
Butler JEF, Kadonaga JT: The RNA polymerase II core promoter: a key component in the regulation of gene expression. Genes Dev. 2002, 16: 2583-2592. 10.1101/gad.1026202.
Tamaru H: Confining euchromatin/heterochromatin territory: jumonji crosses the line. Genes Dev. 2010, 24 (14): 1465-78. 10.1101/gad.1941010.
Zink D, et al: Transcription-dependent spatial arrangements of CFTR and adjacent genes in human cell nuclei. J Cell Biol. 2004, 166: 815-825. 10.1083/jcb.200404107.
Williams RR, et al: Neural induction promotes large-scale chromatin reorganisation of the Mash1 locus. J Cell Sci. 2006, 119: 132-140. 10.1242/jcs.02727.
Kosak ST, et al: Subnuclear compartmentalization of immunoglobulin loci during lymphocyte development. Science. 2002, 296: 158-162. 10.1126/science.1068768.
Reddy KL, et al: Transcriptional repression mediated by repositioning of genes to the nuclear lamina. Nature. 2008, 452: 243-247. 10.1038/nature06727.
Finlan LE, et al: Recruitment to the nuclear periphery can alter expression of genes in human cells. PLoS Genet. 2008, 4: e1000039-10.1371/journal.pgen.1000039.
Kumaran RI, Spector DL: A genetic locus targeted to the nuclear periphery in living cells maintains its transcriptional competence. J Cell Biol. 2008, 180: 51-65. 10.1083/jcb.200706060.
Akhtar A, Gasser SM: The nuclear envelope and transcriptional control. Nat Rev Genet. 2007, 8: 507-517. 10.1038/nrg2122.
Dieppois G, et al: Cotranscriptional recruitment to the mRNA export receptor Mex67p contributes to nuclear pore anchoring of activated genes. Mol Cell Biol. 2006, 26: 7858-7870. 10.1128/MCB.00870-06.
Brickner JH, Walter P: Gene recruitment of the activated INO1 locus to the nuclear membrane. PLoS Biol. 2004, 2: e342-10.1371/journal.pbio.0020342.
Ahmed S, et al: DNA zip codes control an ancient mechanism for targeting genes to the nuclear periphery. Nat Cell Biol. 2010, 12: 111-118. 10.1038/ncb2011.
Casolari JM, et al: Genome-wide localization of the nuclear transport machinery couples transcriptional status and nuclear organization. Cell. 2004, 117: 427-439. 10.1016/S0092-8674(04)00448-9.
Taddei A: Active genes at the nuclear pore complex. Curr Opin Cell Biol. 2007, 19: 305-310. 10.1016/j.ceb.2007.04.012.
Noordermeer D, Leleu M, Splinter E, Rougemont J, De Laat W, Duboule D: The dynamic architecture of Hox gene clusters. Science. 2011, 334 (6053): 222-5. 10.1126/science.1207194.
Tschopp P, Duboule D: A genetic approach to the transcriptional regulation of Hox gene clusters. Annu Rev Genet. 2011, 45: 145-66. 10.1146/annurev-genet-102209-163429.
Chambeyron S, Bickmore WA: Chromatin decondensation and nuclear reorganization of the HoxB locus upon induction of transcription. Genes Dev. 2004, 18 (10): 1119-30. 10.1101/gad.292104.
Pearson JC, et al: Modulating Hox gene functions during animal body patterning. Nat Rev Genet. 2005, 6: 893-904. 10.1038/nrg1726.
Bantignies F, et al: Polycomb-dependent regulatory contacts between distant Hox loci in Drosophila. Cell. 2011, 144: 214-226. 10.1016/j.cell.2010.12.026.
Rougeulle C, Avner P: Controlling X-inactivation in mammals: what does the centre hold? Semin Cell Dev Biol. 2003, 14: 331-340.
Plath K, Mlynarczyk-Evans S, Nusinov DA, Panning B: Xist RNA and the mechanism of X chromosome inactivation. Annu Rev Genet. 2002, 36: 233-278. 10.1146/annurev.genet.36.042902.092433.
Barr ML, Bertram EG: A Morphological Distinction between Neurones of the Male and Female, and the Behaviour of the Nucleolar Satellite during Accelerated Nucleoprotein Synthesis. Nature. 1949, 163 (4148): 676-7. 10.1038/163676a0.
Thompson M, et al: Nucleolar clustering of dispersed tRNA genes. Science. 2003, 302: 1399-1401. 10.1126/science.1089814.
Osborne CS, Chakalova L, Brown KE, Carter D, Horton A, Debrand E, Goyenechea B, Mitchell JA, Lopes S, Reik W, Fraser P: Active genes dynamically colocalize to shared sites of ongoing transcription. Nat Genet. 2004, 36 (10): 1065-71.
Razin SV, Gavrilov AA, Pichugin A, Lipinski M, Iarovaia OV, Vassetzky YS: Transcription factories in the context of the nuclear and genome organization. Nucleic Acids Res. 2011, 39 (21): 9085-92. 10.1093/nar/gkr683.
Chang WC, Lee TY, Huang HD, Huang HY, Pan RL: PlantPAN: Plant promoter analysis navigator, for identifying combinatorial cis-regulatory elements with distance constraint in plant gene groups. BMC Genomics. 2008, 9: 561-10.1186/1471-2164-9-561.
Yamamoto YY, Yoshioka Y, Hyakumachi M, Obokata J, Yoshiharu Y: Characteristics of Core Promoter Types with respect to Gene Structure and Expression in Arabidopsis thaliana. DNA Res. 2011, 18: 333-42. 10.1093/dnares/dsr020.
Fukue Y, Sumida N, Nishikawa J, Ohyama T: Core promoter elements of eukaryotic genes have a highly distinctive mechanical property. Nucleic Acids Res. 2004, 32: 5834-5840. 10.1093/nar/gkh905.
Florquin K, Saeys Y, Degroeve S, Rouzé P, Van de Peer Y: Large-scale structural analysis of the core promoter in mammalian and plant genomes. Nucleic Acids Res. 2005, 33: 4255-4264. 10.1093/nar/gki737.
Kanhere A, Bansal M: Structural properties of promoters: similarities and differences between prokaryotes and eukaryotes. Nucleic Acids Res. 2005, 33: 3165-3175. 10.1093/nar/gki627.
Yamamoto YY, Ichida H, Abe T, Suzuki Y, Sugano S, Obokata J: Differentiation of core promoter architecture between plants and mammals revealed by LDSS analysis. Nucleic Acids Res. 2007, 35: 6219-6226. 10.1093/nar/gkm685.
Dineen DG, Wilm A, Cunningham P, Higgins DG: High DNA melting temperature predicts transcription start site location in human and mouse. Nucleic Acids Res. 2009, 37: 7360-7367. 10.1093/nar/gkp821.
Carninci P, Sandelin A, Lenhard B, Katayama S, Shimokawa K, Ponjavic J, Semple CA, Taylor MS, Engström PG, Frith MC, Forrest AR, Alkema WB, Tan SL, Plessy C, Kodzius R, Ravasi T, Kasukawa T, Fukuda S, Kanamori-Katayama M, Kitazume Y, Kawaji H, Kai C, Nakamura M, Konno H, Nakano K, Mottagui-Tabar S, Arner P, Chesi A, Gustincich S, Persichetti F, et al: Genome-wide analysis of mammalian promoter architecture and evolution. Nat Genet. 2006, 38: 626-635. 10.1038/ng1789.
Jiang C, Xuan Z, Zhao F, Zhang MQ: TRED: a transcriptional regulatory element database, new entries and other development. Nucleic Acids Res. 2007, 35 (Database issue): D137-40.
Zhao F, Xuan Z, Liu L, Zhang MQ: TRED: a Transcriptional Regulatory Element Database and a platform for in silico gene regulation studies. Nucleic Acids Res. 2005, 33 (Database issue): D103-7.
Friedman WF: The index of coincidence and its applications in cryptology. Department of Ciphers. Publ 22. 1922, Geneva, Illinois, USA: Riverbank Laboratories
Kashi Y, King DG: Simple sequence repeats as advantageous mutators in evolution. Trends Genet. 2006, 22 (5): 253-9. 10.1016/j.tig.2006.03.005.
Schmid CD, Perier R, Praz V, Bucher P: EPD in its twentieth year: towards complete promoter coverage of selected model organisms. Nucleic Acids Res. 2006, 34 (Database issue): D82-5.
Périer RC, Praz V, Junier T, Bonnard C, Bucher P: The eukaryotic promoter database (EPD). Nucleic Acids Res. 2000, 28 (1): 302-303. 10.1093/nar/28.1.302.
Levinson G, Gutman GA: Slipped-Strand Mispairing: A Major Mechanism for DNA Sequence Evolution. Mol Biol Evol. 1987, 4 (3): 203-221.
Suter B, Schnappauf G, Thoma F: Poly(dA:dT) sequences exist as rigid DNA structures in nucleosome-free yeast promoters in vivo. Nucleic Acids Res. 2000, 28: 4083-4089. 10.1093/nar/28.21.4083.
Koch KA, Thiele DJ: Functional analysis of a homopolymeric (dA-dT) element that provides nucleosome access to yeast and mammalian transcription factors. J Biol Chem. 1999, 274: 23752-23760. 10.1074/jbc.274.34.23752.
Podgol'nikova OA, Grigor'eva NM, Bliumina MG: Heterochromatic regions of human chromosomes 1, 9, 16 and Y and the phenotype. Genetika. 1984, 20 (3): 496-500.
Kuznetsova SM: Polymorphism of heterochromatin areas on chromosomes 1, 9, 16 and Y in long-lived subjects and persons of different ages in two regions of the Soviet Union. Arch Gerontol Geriatr. 1987, 6 (2): 177-86. 10.1016/0167-4943(87)90010-0.
Hsu LY, Benn PA, Tannenbaum HL, Perlis TE, Carlson AD: Chromosomal polymorphisms of 1, 9, 16, and Y in 4 major ethnic groups: a large prenatal study. Am J Med Genet. 1987, 26 (1): 95-101. 10.1002/ajmg.1320260116.
Hsu TC: A possible function of constitutive heterochromatin: the bodyguard hypothesis. Genetics. 1975, 79 (Suppl): 137-50.
Zhang L, Li WH: Mammalian housekeeping genes evolve more slowly than tissue-specific genes. Mol Biol Evol. 2004, 21 (2): 236-9.
Ludwig MZ: Functional evolution of noncoding DNA. Curr Opin Genet Dev. 2002, 12: 634-639. 10.1016/S0959-437X(02)00355-6.
Iyer V, Struhl K: Poly(dA:dT), a ubiquitous promoter element that stimulates transcription via its intrinsic DNA structure. EMBO J. 1995, 14: 2570-2579.
Blower MD, Sullivan BA, Karpen GH: Conserved organization of centromeric chromatin in flies and humans. Dev Cell. 2002, 2: 319-330. 10.1016/S1534-5807(02)00135-1.
Lohe AR, et al: Mapping simple repeated DNA sequences in heterochromatin of Drosophila melanogaster. Genetics. 1993, 134 (4): 1149-74.
Marella NV, Bhattacharya S, Mukherjee L, Xu J, Berezney R: Cell type specific chromosome territory organization in the interphase nucleus of normal and cancer cells. J Cell Physiol. 2009, 221 (1): 130-8. 10.1002/jcp.21836.
Babu MM, Luscombe NM, Aravind L, Gerstein M, Teichmann SA: Structure and evolution of transcriptional regulatory networks. Curr Opin Struct Biol. 2004, 14 (3): 283-91. 10.1016/j.sbi.2004.05.004.
Chuang CH, Belmont AS: Close encounters between active genes in the nucleus. Genome Biol. 2005, 6 (11): 237-10.1186/gb-2005-6-11-237.
Kang J, Xu B, Yao Y, Lin W, Hennessy C, Fraser P, Feng J: A dynamical model reveals gene co-localizations in nucleus. PLoS Comput Biol. 2011, 7 (7): e1002094-10.1371/journal.pcbi.1002094.
Bolzer A, Kreth G, Solovei I, Koehler D, Saracoglu K, Fauth C, Müller S, Eils R, Cremer C, Speicher MR, Cremer T: Three-dimensional maps of all chromosomes in human male fibroblast nuclei and prometaphase rosettes. PLoS Biol. 2005, 3 (5): e157-10.1371/journal.pbio.0030157.
Lieberman-Aiden E, et al: Comprehensive mapping of long-range interactions reveals folding principles of the human genome. Science. 2009, 326 (5950): 289-293. 10.1126/science.1181369.
National Center for Biotechnology Information (US): Genes and Disease. 1998, Bethesda (MD)
Emerson BM: Specificity of gene regulation. Cell. 2002, 109: 267-270. 10.1016/S0092-8674(02)00740-7.
Cooper SJ, Trinklein ND, Anton ED, Nguyen L, Myers RM: Comprehensive analysis of transcriptional promoter structure and function in 1% of the human genome. Genome Res. 2006, 16 (1): 1-10.
Gagniuc P, Cristea PD, Tuduce R, Ionescu-Tîrgovişte C, Gavrila L: DNA patterns and evolutionary signatures obtained through Kappa Index of Coincidence. Rev Roum Sci Techn Électrotechn et Énerg. 2012, 57 (1): 100-109.
Bednar J, et al: Nucleosomes, linker DNA, and linker histones form a unique structural motif that directs the higher-order folding and compaction of chromatin. PNAS. 1998, 95: 14173-14178. 10.1073/pnas.95.24.14173.
Fischle W, et al: Histone and chromatin cross-talk. Curr Opin Cell Biol. 2003, 15: 172-183. 10.1016/S0955-0674(03)00013-9.
Kornberg RD: Chromatin structure: A repeating unit of histones and DNA. Science. 1974, 184: 868-871. 10.1126/science.184.4139.868.
Chodavarapu RK, Feng S, Bernatavichute YV, Chen PY, Stroud H, Yu Y, Hetzel JA, Kuo F, Kim J, Cokus SJ, Casero D, Bernal M, Huijser P, Clark AT, Krämer U, Merchant SS, Zhang X, Jacobsen SE, Pellegrini M: Relationship between nucleosome positioning and DNA methylation. Nature. 2010, 466 (7304): 388-92. 10.1038/nature09147.
Milani P, Chevereau G, Vaillant C, Audit B, Haftek-Terreau Z, Marilley M, Bouvet P, Argoul F, Arneodo A: Nucleosome positioning by genomic excluding-energy barriers. Proc Natl Acad Sci USA. 2009, 106 (52): 22257-62. 10.1073/pnas.0909511106.
Smith CL, Peterson CL: ATP-dependent chromatin remodeling. Curr Top Dev Biol. 2005, 65: 115-148.
Elgin SC: Heterochromatin and gene regulation in Drosophila. Curr Opin Genet Dev. 1996, 6 (2): 193-202. 10.1016/S0959-437X(96)80050-5.
Gagniuc P, Ionescu-Tirgoviste C: Eukaryotic genomes may exhibit up to 10 generic classes of gene promoter. BMC Genomics. 2012, 13: 512-10.1186/1471-2164-13-512.


Acknowledgments

This work was supported by a grant of the Romanian National Authority for Scientific Research, CNCS-UEFISCDI, project number PN-II-ID-PCE-2011-3-0429.

Author information

Authors and Affiliations

Institute of Genetics, University of Bucharest, Bucharest, 060101, Romania
Paul Gagniuc
National Institute of Diabetes, Nutrition and Metabolic Diseases “N.C. Paulescu”, Bucharest, Romania
Constantin Ionescu-Tirgoviste


Corresponding author

Correspondence to Paul Gagniuc.

Additional information

Competing interests

The authors declare that they have no competing interests.

Authors’ contributions

PG conceived the study and participated in its design and coordination. PG created the algorithms and the software used in the analysis. CIT assembled the chromosome-specific promoter files and manually checked the correctness of each promoter sequence. PG and CIT participated in the promoter sequence analysis and drafted the manuscript. Both authors have verified the accuracy of the data. Both authors read and approved the final manuscript.

Electronic supplementary material

12864_2012_5042_MOESM1_ESM.zip

Additional file 1: Binaries and source code files of PromKappa (Promoter analysis by Kappa Index of Coincidence) software used for promoter pattern analysis. (ZIP 3 MB)

Additional file 2: Examples of image-based promoter patterns. (DOC 143 KB)

Additional file 3: A complete set of 8,512 gene promoters from The Eukaryotic Promoter Database. (ZIP 2 MB)

Additional file 4: Distribution of numerical values. (XLS 49 KB)

Additional file 5: Chromosomes ordered by Kappa IC and (C+G)% mean values of their gene promoters. (PPT 340 KB)


Rights and permissions

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


About this article

Cite this article

Gagniuc, P., Ionescu-Tirgoviste, C. Gene promoters show chromosome-specificity and reveal chromosome territories in humans. BMC Genomics 14, 278 (2013). https://doi.org/10.1186/1471-2164-14-278


Received: 18 June 2012
Accepted: 26 February 2013
Published: 24 April 2013
DOI: https://doi.org/10.1186/1471-2164-14-278
