In silico comparative genomic analysis of GABAA receptor transcriptional regulation
© Joyce. 2007
Received: 24 January 2007
Accepted: 30 June 2007
Published: 30 June 2007
Subtypes of the GABAA receptor subunit exhibit diverse temporal and spatial expression patterns. In silico comparative analysis was used to predict transcriptional regulatory features in individual mammalian GABAA receptor subunit genes, and to identify potential transcriptional regulatory components involved in the coordinate regulation of the GABAA receptor gene clusters.
Previously unreported putative promoters were identified for the β2, γ1, γ3, ε, θ and π subunit genes. Putative core elements and proximal transcriptional factors were identified within these predicted promoters, and within the experimentally determined promoters of other subunit genes. Conserved intergenic regions of sequence in the mammalian GABAA receptor gene cluster comprising the α1, β2, γ2 and α6 subunits were identified as potential long range transcriptional regulatory components involved in the coordinate regulation of these genes. A region of predicted DNase I hypersensitive sites within the cluster may contain transcriptional regulatory features coordinating gene expression. A novel model is proposed for the coordinate control of the gene cluster and parallel expression of the α1 and β2 subunits, based upon the selective action of putative Scaffold/Matrix Attachment Regions (S/MARs).
The putative regulatory features identified by genomic analysis of GABAA receptor genes were substantiated by cross-species comparative analysis and now require experimental verification. The proposed model for the coordinate regulation of genes in the cluster accounts for the head-to-head orientation and parallel expression of the α1 and β2 subunit genes, and for the disruption of transcription caused by insertion of a neomycin gene in the close vicinity of the α6 gene, which is proximal to a putative critical S/MAR.
There exist at least 16 GABAA receptor subunit isoforms in mammals, each encoded by a separate gene. These isoforms have been categorised into classes based upon sequence similarity: six in the α subunit class, three β, three γ, and one each of δ, ε, θ and π. Typically two α, two β and one γ subunit assemble to form the GABAA receptor, with a GABA binding site located at each α-β interface. The most widely expressed and most common receptor subtype is a combination of two α type 1 subunits with two β type 2 and one γ type 2, which constitute in the region of 40% of receptors in the mammalian brain . Alternatively spliced subunit variants also contribute to the diversity of GABAA receptor composition.
GABAA receptor subtypes are distributed differentially within both cell type and region in the CNS, some subtypes being widespread whilst others have a very restricted expression profile. This, and the observation that expression of receptor subtypes varies with developmental stages, indicates that they each fulfil specific physiological roles. Furthermore, the subunit composition of GABAA receptor populations is not static within regions of the adult CNS, and alterations of subunit expression are observed in response to exposure to a large number of neuroactive compounds .
These complexities in GABAA receptor subtype expression are determined primarily at the level of transcription, by cell-specific coordinate gene expression [3, 4]. Typically, both tissue-specific and ubiquitous transcription factors are required to activate and control the expression of a gene. Many genes which are expressed only in the nervous system contain a 21-bp neuron-restrictive silencer element (NRSE) motif. This element binds neuron-restrictive silencing factor (NRSF, also known as RE1 silencing transcription factor, REST) to repress gene expression. Since this factor is expressed primarily in non-neuronal tissues, the NRSE acts as a gene silencer in these tissues. The GABAA subunit γ2 gene contains an NRSE site in the first intron, which was shown to bind NRSF and repress expression in non-neuronal cell lines. NRSE-like sequences also exist in the genes of GABAA subunits α1, α5, α6, δ and β3, each downstream of the transcription start site (TSS).
Generic cis-acting regulatory elements can also provide tissue-specific transcriptional regulation by binding to factors that are present in a tissue-specific manner. The SP1 binding site is recognized by a family of transcription factors, and has similar binding affinity for the SP1, SP3 and SP4 factors. Whilst SP1 and SP3 are ubiquitously expressed, SP4 is relatively brain-specific. SP3 acts as either an activator or a repressor, depending on promoter context. One possible explanation of the neural specificity of the α4 subunit promoter is that, in non-neuronal cells, SP1 and SP3 bind the SP1 site in the proximal promoter to suppress transcription, whereas in neurons SP3 and SP4 bind at the site to activate transcription.
Alternative splicing of the GABAA receptor subunit gene transcripts provides for mRNA subunit variants. The α2 subunit, for example, exhibits a complex pattern of alternative splicing, with distinct promoter regions for the alternative mRNA isoforms. Although the resulting encoded protein sequence is the same, variations in the stability of mRNA isoforms can affect translational efficiency; these variations and the use of alternative promoters allow fine control of α2 subunit expression during brain development. Multiple promoters also exist for the α5 and β3 subunit genes, and provide for further diversity in their transcriptional and post-transcriptional control.
The genes encoding vertebrate GABAA receptor subunits are organised into several gene clusters on different chromosomes, which appear to have arisen through duplication and subsequent translocation of an ancestral gene cluster. Two clusters on human chromosomes 4 and 5 are composed of two α genes, one β, and one γ (Figure 1). Another cluster on chromosome 15 is composed of one α, one β and one γ gene. Evidence that the clusters have arisen by means of gene duplication events is provided by the conservation of gene order, intergenic distance and the head-to-head orientation of the α and β subunit genes in each cluster. Based on conservation of genomic organization of GABAA receptor gene clusters, the ε and γ subunit genes appear to have a common ancestor.
This organisation into clusters may have been preserved to provide a mechanism for facilitating coordinate gene expression, by allowing adjacent clustered genes to share regulatory elements. Genes in clusters may be co-regulated in part by the establishment of chromatin domains which are insulated from surrounding genetic material by chromatin boundaries. The genes for the components of the most common GABAA receptor subunit configuration, α1-β2-γ2, lie within the same cluster on chromosome 5, along with the α6 gene. All of the genes in this cluster are highly expressed, at least in granule cells, in contrast to the genes clustered on chromosomes 4 and 15, suggesting a common transcriptional regulatory mechanism. However, the α6 gene has a much more restricted pattern of expression, showing that it also retains independent regulatory features.
Co-regulated genes may each possess a copy of the same transcriptional regulatory features, or they may share common distal features such as enhancers or Locus Control Regions (LCRs). LCRs are regions of genomic DNA which are able to exert long-range enhanced transcription of linked genes in a tissue-specific and copy-number dependent manner (i.e. proportionate to the number of gene copies in transgenic experiments). They are composite features, characterised biochemically by a series of DNase I hypersensitive sites (HSs), some of which contain arrays of transcriptional factor binding sites. There is no single model for the long-range transcriptional control exerted by LCRs. They are thought both to possess classic enhancer activity, recruiting the transcriptional machinery to gene promoters, and to play a role in establishing and propagating chromatin opening. LCRs are composed of varying numbers of both tissue-specific HSs and non-cell specific HSs, some of which are non-functional, others fulfilling distinct roles in the augmentation of transcription. The HSs that constitute an LCR are not necessarily clustered in a region of 1–2 kb, as they are in the canonical human β-globin gene locus. They may consist of HSs spread over large distances, and be interspersed with the genes they control.
Experimental techniques (RNA TRAP and 3C) have verified that the human β-globin LCR comes into direct contact with the DNA of the expressed gene whilst the locus is transcriptionally active, with intervening DNA looping out and away from the active region. This contact presumably facilitates chromatin-remodelling activity in the vicinity of the gene, and also allows the LCR enhancer elements and proximal promoter to cooperatively recruit transcriptional machinery proteins, as per the classic model for enhancer function. There is evidence that the human β-globin LCR interacts with only one gene promoter at a time and that it may alternate between two or more promoters, depending on the stage of development.
LCRs, unlike simpler enhancer elements, operate in an orientation-dependent manner with respect to the genes they control. It seems improbable, therefore, that a canonical LCR could control both the α1 and β2 genes, which, unlike the genes in other LCR-controlled clusters, are oriented in a head-to-head configuration. It seems more likely that these genes are co-regulated by shared elements which are not orientation-specific in relation to the genes they activate. Some genes migrate beyond their immediate chromosomal territories to transcription factories, which are shared sites of transcription enriched in RNA Pol II and transcriptional factors. Transcriptional enhancer elements located between the α1 and β2 genes could exert their effects on both genes in a transcription factory located outside of the immediate chromosome territory. The chromatin of the coordinately expressed genes in the HoxB cluster locates in loops which extend outside of the chromosomal territory in expressing cells, but not in non-expressing cells.
Chromatin is a dynamic structure that modulates access to DNA during gene replication and transcription. Whereas housekeeping genes are generally partitioned into chromosomal segments which are constitutively in an open conformation, tissue-specific genes tend to exist in segments which are facultatively opened. Groups of genes may be regulated autonomously by organising into chromatin domains, which are maintained independently from their surroundings and demarcated by dynamic chromatin boundaries. Several models have been proposed to explain how chromatin boundaries are established. Barrier activity may be accomplished by the creation of a nucleosome gap to interrupt repressive histone modification, or barrier proteins may disrupt histone modifications associated with activation to prevent the propagation of heterochromatin. It seems clear that chromatin domains are maintained by the tethering of a loop of chromatin to a fixed structure, isolating it from propagating histone modification processes associated with chromatin condensation and transcriptional repression [16, 17]. In eukaryotes, this structure is the nuclear matrix (or nuclear scaffold), a network of proteins that provides a framework for organising chromatin. Filaments of the nuclear matrix provide structural support for the formation of loops of DNA during replication and transcription. Attachment of genomic DNA to the nuclear matrix also places genes in the domain in proximity to chromatin-modifying complexes and the transcriptional factors which enhance expression. Scaffold/Matrix Attachment Regions (S/MARs) are eukaryotic DNA sequences capable of specific binding to nuclear proteins that are part of the nuclear matrix. Their interactions with the nuclear matrix provide anchor points from which chromatin loop domains can be established, and maintain the open chromatin conformation necessary for transcription.
The use of S/MARs as nuclear matrix anchors occurs in vivo in a discriminative and tissue-specific manner. Experimental evidence suggests that a subset of S/MAR sequences may function as relatively static nuclear matrix anchors, whilst other S/MARs act more dynamically to draw the chromatin loop into transcriptional machinery on the nuclear matrix surface, which is then scanned in search of promoters using a chromatin reeling model, thus promoting transcriptional initiation (figure 21). A distinction is made between the relatively fixed structural S/MARs serving as anchors, and functional S/MARs which operate dynamically under the control of the transcriptional regulatory system to draw genes into the transcriptional machinery. S/MARs may also play a role in demethylation of DNA regions to trigger chromatin opening. It seems reasonable to hypothesise that the coordinate and tissue-specific regulation of genes in the GABAA receptor clusters may be achieved at least in part by the S/MAR-directed establishment of chromatin domains.
In two separate experiments, the insertion of neomycin hybrid genes into exon 8 of the α6 subunit gene region brought about a parallel reduction in the expression of α1 and β2 in the forebrain of α6 subunit gene knockout mice. Expression of γ2 mRNA was unaffected by the neomycin gene insert. This suggests that the insert may have interfered with one or more long-distance regulatory features directing the coordinate expression of the α1 and β2 subunit genes. This feature could be an enhancer or an LCR common to both genes and responsible for their observed parallel expression profiles. Long-range disruption of neighbouring genes was also demonstrated by the insertion of the PGK-Neo hybrid gene into the granzyme B and β-globin gene locus, which also reduced expression of multiple genes within the locus at distances greater than 100 kb.
Any model proposed for the coordinate regulation of α1 and β2 subunit gene expression needs to incorporate the observation that their expression is down-regulated upon insertion of the neomycin gene in the vicinity of the α6 subunit gene. Uusi-Oukari et al hypothesise that the inserted sequence may contain cis-acting elements which interact with an LCR to reduce its effectiveness as an enhancer of α1 and β2 transcription. The presence of an additional 2 kb or 5 kb of sequence could also critically alter chromatin loop configuration or chromatin remodelling processes; or the neomycin gene inserts in the vicinity of an enhancer could disrupt assembly of, or migration to, a transcription factory, causing a parallel decrease in expression of both genes.
Investigation of the transcriptional regulation of GABAA receptor subunit genes can provide an understanding of mechanisms underlying the spatial and temporal patterns of GABAA receptor subunit expression, which contribute to the aetiology of a wide range of neurological disorders and their responses to drug treatment. Steiger & Russek have conducted a comprehensive review of GABAA receptor subunit gene regulatory features. In addition to summarising experimentally determined promoters, they used the MatInspector and NNPP programs to predict additional promoter regions and transcription factor binding sites. However, the published predictions were based upon analysis of a single (mostly human) genome for each subunit.
The purpose of the current work was to utilise a wide range of promoter prediction software to analyse each mammalian GABAA receptor subunit gene, and to use comparative genomic analysis to further substantiate the predictions made by these tools. Comparative analysis can serve to eliminate false positives from the large result sets typically created by Transcription Factor Binding Site (TFBS) prediction programs, and to provide powerful evidence for the functionality of putative TFBSs. Functional TFBSs are likely to be conserved, and to be located in equivalent positions in multiple sequence alignments of homologous sequences, whereas false positives are not. Additionally, this work attempts to look beyond the regulatory elements for individual genes to propose models for higher levels of transcriptional regulation of clustered GABAA receptor genes, focusing upon the coordinate expression of genes in the α1, β2, γ2 and α6 subunit cluster. The methods used for predicting features responsible for coordinate expression included comparative analysis of the cluster to reveal homologous regulatory features in intergenic regions of DNA, prediction of distal regulatory features as regions of DNase I hypersensitivity, and prediction of Scaffold/Matrix Attachment Regions (S/MARs).
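The comparative filtering idea described above can be illustrated with a minimal Python sketch. It assumes that predicted sites are supplied per species as ungapped coordinates and that a gapped multiple alignment is available; the names `Site`, `column_map`, `to_columns` and `conserved_sites` are hypothetical and are not taken from any of the programs used in this work. A site in the reference species is retained only if the same factor is also predicted at overlapping alignment columns in every other species.

```python
from typing import Dict, List, NamedTuple


class Site(NamedTuple):
    factor: str  # transcription factor name, e.g. "SP1"
    start: int   # 0-based start in the ungapped sequence
    end: int     # exclusive end in the ungapped sequence


def column_map(aligned: str) -> List[int]:
    """Map each ungapped residue index to its column in the alignment."""
    return [col for col, ch in enumerate(aligned) if ch != "-"]


def to_columns(site: Site, aligned: str) -> range:
    """Alignment columns spanned by a site given in ungapped coordinates."""
    cols = column_map(aligned)
    return range(cols[site.start], cols[site.end - 1] + 1)


def conserved_sites(alignment: Dict[str, str],
                    predictions: Dict[str, List[Site]],
                    reference: str) -> List[Site]:
    """Keep reference sites supported by the same factor at overlapping
    alignment columns in every other species (hypothetical criterion)."""
    kept = []
    for site in predictions[reference]:
        ref_cols = set(to_columns(site, alignment[reference]))
        supported = True
        for species, aligned in alignment.items():
            if species == reference:
                continue
            hit = any(
                other.factor == site.factor
                and ref_cols & set(to_columns(other, aligned))
                for other in predictions.get(species, [])
            )
            if not hit:
                supported = False
                break
        if supported:
            kept.append(site)
    return kept


# Toy example: the SP1 site is predicted in both species, the AP1 site is not.
alignment = {"human": "ACGT--GGGCGGAA", "mouse": "AC--TTGGGCGGAA"}
predictions = {
    "human": [Site("AP1", 0, 4), Site("SP1", 4, 10)],
    "mouse": [Site("SP1", 4, 10)],
}
print(conserved_sites(alignment, predictions, "human"))
```

Requiring support in every species is a deliberately strict choice; a looser rule (for example, support in a majority of species) would trade specificity for sensitivity.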
The α1 subunit gene is strongly expressed in all brain regions, being a component of the most abundant GABAA receptor subtype, α1β2γ2. Kang et al have experimentally isolated a 60 bp minimal promoter element in the 5' flanking DNA sequence of the human α1 subunit gene. It was demonstrated that reduction in transcription of the subunit gene following chronic benzodiazepine exposure is brought about by direct repression of the activity of this promoter. Bateson et al have identified the TSS for the chicken GABAA α1 subunit gene, and reported a number of putative promoter elements, including a TATA box 30 bp upstream of the TSS, an SP1 binding site and a reverse CCAAT box. The promoter region is GC-rich. The TSS for the rat gene has also been experimentally determined, and lies 5 bp 3' of the chicken TSS (A.N. Bateson, personal communication, unpublished).
The α2 subunit gene is strongly expressed only in the hippocampus and hypothalamus. Enhanced mRNA levels occur early in CNS development, and decline postnatally with a concurrent rise in α1 gene expression. Fuchs & Celepirovic have investigated the rat α2 subunit gene, and identified six α2 subunit mRNA isoforms with distinct 5'-end UTRs (the resulting protein sequence is not affected), generated from three alternative first exons by means of alternative splicing and alternative promoter usage. The isoforms are named AB8, SPL2, SPL3, SPL4, SPL5 and the non-spliced variant nonSPL. About 70% of expressed mRNA is of the AB8 isoform. Multiple Start Sites (MSS) for each isoform were identified within each of the three alternative first exons. The AB8 isoform putative promoter region is GC-rich, and contains putative INR elements and TF binding sites, including SP1 binding sites. Unusually, the AB8 promoter region is located downstream from the transcription initiation sites.
The α3 subunit gene is expressed throughout the adult brain. Like the α2 gene, enhanced expression occurs early in CNS development and declines postnatally with an accompanying rise in α1 expression. The promoter region and major TSS of the mouse α3 subunit gene have been experimentally determined. The highly unusual promoter region contains a series of GA repeats about 40 bp upstream from the major TSS, which bind unspecified nuclear proteins and directly augment transcription, and in which lie several minor TSSs. An adjacent series of three GC repeats, forming part of a putative E2F sequence motif, also appears to contribute to promoter activity.
The α4 subunit gene is expressed variably in different regions of the brain. Elevated expression levels are observed in animal models of temporal lobe epilepsy, and during withdrawal from alcohol and progesterone treatment, suggesting that the subunit may play a role in neuronal hyperexcitability. Ma et al have determined the minimal promoter, major TSS and start codon ATG for the mouse α4 subunit gene. They also identified two SP1 sites within the proximal promoter region which are critical for high-level promoter activity in vivo, and bind to the TFs SP3 and SP4 to augment α4 subunit gene expression in neuronal cells. Other putative SP1, AP1, c-Myb, and E-box binding sites did not alter transcription levels when deleted. Roberts et al have demonstrated increases in mRNA levels of α4, and in early growth response factor 3 (EGR3) expression, in response to induced epilepsy, accompanied by increased binding of EGR3 to α4 subunits in dentate granule cells. Also, EGR3 knockout mice exhibit reduced α4 hippocampal mRNA expression. These data demonstrate the role of EGR3 as a major regulator of α4 subunits, and implicate the factor in epileptogenesis.
Expression of α5 subunit receptors is restricted primarily to the hippocampus. Kim et al have identified three isoforms of human α5 subunit mRNA, resulting from three alternative first exons. They have demonstrated that each exon appears to be regulated by a different promoter, located in intronic sequence regions immediately upstream. The differential transcriptional activation by alternative promoters may determine the alternative usage of the first exon isoform. No core promoter element motifs are observed in these promoter regions; however, both regions are enveloped within a single CpG island and contain a large number of putative TF binding sites including SP-1, AP-1 and AP-2 sites.
The α6 subunit gene is expressed exclusively in cerebellar granule cells , and as such provides a useful model for studying how neuron-specific gene expression is regulated. McLean et al  have identified several minor and one major TSS with a surrounding INR in homologous rat and mouse sequences. They have determined a 155 bp minimal promoter in the rat DNA, within which a 60 bp region, 70 bp upstream of the TSS, enhances expression in cerebellar granule cells only. This region is positionally dependent with respect to the TSS, and contains a conserved NF-1 (NFI, nuclear factor 1) motif. They have also identified a downstream negative regulatory region which is active in fibroblasts but inactive in cerebellar granule cells. Wang et al  have determined that NFI-A factor is abundant in cerebellar granule cells and binds to the α6 subunit gene promoter in vivo in the mouse gene, confirming its critical role in α6 subunit gene expression and the differentiation of cerebellar granule neurons. They have also identified the NFI motif TGCCAAAAC within the α6 gene promoter region which binds NFI-A proteins.
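As an illustration of how a short motif such as the NFI sequence TGCCAAAAC reported above can be located in a promoter sequence, the following Python sketch searches both strands for exact matches. The function names and the test sequence are hypothetical, and exact matching is a simplifying assumption; prediction programs typically use position weight matrices that tolerate mismatches.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence."""
    table = str.maketrans("ACGTacgt", "TGCAtgca")
    return seq.translate(table)[::-1]


def find_motif(seq: str, motif: str = "TGCCAAAAC"):
    """Return sorted (position, strand) pairs for exact motif matches.

    Positions are 0-based offsets on the forward strand; "-" hits mark
    where the reverse complement of the motif occurs.
    """
    seq = seq.upper()
    motif = motif.upper()
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Hypothetical sequence carrying one forward and one reverse-strand copy.
example = "AATGCCAAAACGGTTGTTTTGGCA"
print(find_motif(example))
```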
The β1 subunit is present in a high proportion of the GABAA receptors from the hippocampus and cerebral cortex, and a very small proportion of cerebellar GABAA receptors. The human β1 subunit gene contains a TATA-less core promoter region of 270 bp. The core promoter is neural-specific and contains an INR element at the major TSS which is critical to promoter activity. Upstream of the INR, a GRE consensus sequence has a neural-specific positive regulatory effect, and CCAAT (NF-Y) and IK2 motifs have negative regulatory effects. Persistent receptor activation by the GABA neurotransmitter induces down-regulation of β1 subunit mRNA expression. The exposure to GABA appears to directly reduce promoter activity by modulating the binding of sequence-specific basal TFs to the promoter's INR element, thus regulating formation of the Pre-Initiation Complex (PIC).
The β3 subunit is the most abundant β isoform in GABAA receptors of the hippocampus, and is also widely expressed in the cerebral cortex and cerebellum. It is frequently co-localised with either a β1 or β2 subunit in the same GABAA receptor, in proportions which vary with brain region . The human β3 subunit gene contains two alternate first exons, 1 and 1A, which are expressed differentially during development and between different brain regions, and encode two dissimilar signal peptide-like sequences. Exon 1A lies upstream of exon 1 and encodes a peptide signal sequence. Transcription of exon 1A is initiated by multiple start sites in a pyrimidine-rich promoter region, which lies in between the alternative first exons, and binds SP1 and other nuclear factors. Nuclear factor binding sites overlap each of these transcription start sites, which may be involved in differential expression of the alternative first exons. The human and rat coding and promoter regions are highly conserved .
The γ2 subunit gene is expressed widely throughout the brain, occurring at high levels early on in CNS development . Mu and Burt  have found two major transcription start sites in a TATA-less promoter region of the mouse γ2 subunit gene. There are two subtypes of the subunit, which are formed by alternate splicing which is regionally and developmentally regulated. The first intron of both the mouse and human γ2 genes contains a conserved neuron-restrictive silencer element (NRSE) site, which was shown to bind to NRSF and direct expression in neuronal cells, and to repress expression in non-neuronal cells. Another sequence element, the "Gamma Promoter Element" (GPE1), which also promotes expression in neuron-like cells, lies 70 bp downstream of the second major TSS. Mu and Burt  report that "probably only a portion of the 24 bp sequence is involved". The first 12 bp matches the cAMP-responsive element binding protein (CREB) consensus sequence (program P-MATCH).
The γ3 subunit gene is expressed at a low level in the cerebellum and hippocampus. The gene promoter has not yet been experimentally determined. Analysis of the 2000 bp region upstream of the GenBank computationally-derived TSS for the human sequence by the PromoterInspector, McPromoter, NNPP and Promoter 2 programs gives a consensus promoter prediction within a region approximately 550–100 bp upstream of the GenBank putative TSS and CDS. This region is GC-rich and contains no TATA sequences. The program CPGPlot predicts a 554 bp CpG island which matches closely the putative promoter region. The γ3 subunit gene promoter may therefore lie within a CpG island, in common with a number of other GABAA receptor subunit genes such as α2, α4, α5 and δ. Although such 5'-end CpG islands commonly extend downstream of the promoter into the transcription unit, this does not appear to be the case for any of the GABAA subunit genes.
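The kind of calculation that underlies CpG island prediction can be sketched in a few lines of Python. The example below computes GC content and the observed/expected CpG ratio, (number of CpG × window length) / (number of C × number of G), over sliding windows, and reports windows that exceed the widely used thresholds of 50% GC and a ratio of 0.6. The function names are hypothetical, and whether these thresholds and the 100 bp window match the CPGPlot settings used in this work is an assumption.

```python
def cpg_stats(seq: str):
    """Return (GC percent, observed/expected CpG ratio) for a sequence."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")
    gc_percent = 100.0 * (c + g) / n if n else 0.0
    # Expected CpG count under independence is (C count * G count) / length.
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_percent, obs_exp


def cpg_island_windows(seq: str, window: int = 100,
                       min_gc: float = 50.0, min_oe: float = 0.6):
    """Return start positions of windows exceeding both thresholds."""
    starts = []
    for start in range(0, len(seq) - window + 1):
        gc, oe = cpg_stats(seq[start:start + window])
        if gc > min_gc and oe > min_oe:
            starts.append(start)
    return starts


# Hypothetical sequence: a CpG-dense stretch followed by an AT-rich stretch.
toy = "CG" * 60 + "AT" * 60
print(cpg_stats("CGCGCGCGCG"))
print(cpg_island_windows(toy)[:3])
```

A full island caller would additionally merge adjacent qualifying windows and enforce a minimum island length (commonly 200 bp); that step is omitted here for brevity.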
Expression of the GABAA receptor δ subunit gene is most abundant in granule cells of the cerebellum and dentate gyrus. The 5'-end of the murine δ subunit gene contains a region with CpG island features, and lacks canonical promoter elements such as the TATA box. The rat δ subunit gene and promoter have been characterised by Motejlek et al, who have mapped the major TSS, which is enclosed within an INR element CCACTCT. The promoter region lies within a CpG island. They have also identified a 22-bp purine-rich element, present in seven partially-overlapping copies, approximately 1800 nucleotides upstream of the TSS. The element is bound by a novel "brain-specific factor" (BSF1) that is present predominantly in cerebellar granule cells, correlating with GABAA δ subunit expression; however, it is not clear whether the factor plays any role in transcriptional regulation of the δ subunit gene.
Alternative splicing of the ε subunit gene yields a number of different transcript isoforms. Expression of the full protein sequence occurs abundantly only in discrete areas of the adult brain, such as the hypothalamus and locus ceruleus. In other tissues, truncated subunit protein sequences are transcribed. The ε subunit gene promoter has not been characterised. The programs McPromoter, NNPP, Cister and Promoter 2 predicted no promoter in the 2000 bp region upstream of the GenBank computationally-derived CDS for the human GABAA ε subunit gene sequence; however, the program PromoterInspector predicted a 324 bp promoter within a region approximately 400 bp upstream of the GenBank computationally-derived CDS. For the mouse sequence, PromoterInspector also predicts a 200 bp promoter in a region 180 bp upstream of the GenBank CDS.
The putative region appears to contain no fully conserved canonical core promoter features; however, the mouse and rat sequences have an INR motif CCAGACC 45 bp upstream of the GenBank CDS, whereas the human sequence contains an INR motif at the beginning of the putative promoter region. For all species, the region has the features of a CpG island (CPGPlot), and contains conserved SP1 binding site sequences.
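The INR motifs quoted in this work (CCAGACC, and CCACTCT for the δ subunit gene) both fit the commonly cited mammalian initiator consensus YYANWYY, where Y is C or T, W is A or T and N is any base. The following Python sketch scans a sequence for that consensus by translating IUPAC codes into a regular expression. The function names and the example sequence are hypothetical, and it is an assumption that the prediction programs used here apply this exact consensus.

```python
import re

# Subset of IUPAC nucleotide codes expressed as regex character classes.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "Y": "[CT]", "R": "[AG]", "W": "[AT]", "S": "[CG]", "N": "[ACGT]",
}


def iupac_to_regex(consensus: str) -> str:
    """Translate an IUPAC consensus string into a regex pattern."""
    return "".join(IUPAC[ch] for ch in consensus.upper())


def find_inr(seq: str, consensus: str = "YYANWYY"):
    """Return (position, matched sequence) for every consensus match.

    A lookahead is used so that overlapping matches are also reported.
    """
    pattern = re.compile("(?=(" + iupac_to_regex(consensus) + "))")
    return [(m.start(), m.group(1)) for m in pattern.finditer(seq.upper())]


# Hypothetical flanking sequence around the CCAGACC motif.
print(find_inr("GGGCCAGACCGGG"))
```

Because the consensus is short and degenerate, matches occur frequently by chance, which is one reason cross-species conservation of the motif position is a useful additional filter.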
GABAA receptor θ subunits show very similar distributions to ε subunits throughout the brain, and may also vary significantly between species . Novel α3θε GABAA receptors with unique electrophysiological and pharmacological properties may form in monoaminergic neuronal cell-groups . There is no experimental data for the θ subunit promoter. Promoter prediction programs were used to search both mouse and human sequences. No promoters were predicted by the MatInspector or McPromoter programs in either sequence. The program NNPP predicts two core promoters in the human sequence approximately 1500–1600 bp upstream of the GenBank TSS. Each putative promoter has a TATA motif. No promoter was predicted by NNPP in the homologous mouse sequence, and the TATA sequences are not conserved.
The π subunit is expressed in several non-neuronal tissues, and most abundantly in the uterus.
The 5'-end of this gene displays some characteristics of a multiple start site promoter region such as that observed in the α2 subunit gene. Here also, there is no TATA sequence or well-defined TSS, but a number of INR motifs form a transcription initiation window of weak transcription start sites, which again are not fully conserved across species. Sequence identity in this region is lower than in the promoters of GABAA receptor genes with a single TSS, and individual start sites are not as well conserved. This appears to be characteristic of MSS promoters, and is presumably a result of relaxed selective pressure due to a degree of redundancy introduced by MSS transcription. It is also quite possible that the computationally derived GenBank gene origin is incorrect, and that the TSS and promoter lie outside this region of sequence.
The human α1, β2, γ2 and α6 subunit gene cluster is located on chromosome 5q33 and spans 500–700 kb. The genes lie in the order β2 - α6 - α1 - γ2, with β2 and α6 in a head-to-head orientation. α6 high-level expression is restricted to cerebellar granule cells, and is probably regulated independently from the other genes in the cluster. Wang et al have determined that NFI-A factor is abundant in cerebellar granule cells and binds the α6 subunit promoter, playing a critical role in α6 gene expression. The γ2 subunit gene's expression profile overlaps with that of α1 and β2, but is yet more widespread, combining with several subtype variants in addition to the most common 2α1-2β2-1γ2 configuration. Transcriptional control of γ2 is therefore at least partially independent from that of α1 and β2. Its promoter is known to contain an NRSF binding site and a "Gamma Promoter Element", both of which direct expression in neuronal cells.
The α1 and β2 genes are widely expressed in neural tissue, and have almost identical expression profiles. The question therefore arises as to whether they have the same expression profiles by virtue of each possessing common transcriptional regulatory features in individual promoters, or whether coordinate regulation is achieved by the action of shared long-range transcriptional regulatory elements such as enhancers, silencers or an LCR. The α1 subunit gene core promoter is well characterised, and contains a canonical promoter, with TATA box, INR and DPE motifs, and an upstream SP1 site, typical of ubiquitously expressed housekeeping genes. There are also putative sites for NFI and NRSE in the α1 subunit gene, but there is no experimental evidence that they are required for tissue-specific expression (indeed, the NRSE motif is not conserved in cross-species alignments). Although the β2 subunit gene core promoter is undetermined, computational analysis here provides evidence that this subunit is under the control of a very similar promoter, with TATA box, INR and DPE core elements, and also an upstream SP1 site. Based upon this predicted promoter, the β2 subunit expression profile may closely match that of the α1 subunit, at least in part because both genes possess the same core promoter characteristics.
Both proximal and long-range regulatory features are more highly conserved between species by evolutionary selective pressure than is the surrounding non-regulatory DNA. Functional long-range regulatory features in a gene cluster should therefore stand out as relatively short regions of intergenic sequence which are more highly conserved than the surrounding background DNA. In order to investigate whether the α1 and β2 subunit genes share common long-range regulatory elements, the rVista program was used to perform cross-species, comparative analysis of the human chromosome 5 GABAA cluster locus, and intergenic homologous regions were isolated for further analysis.
The rVista program was also used to compare human and mouse DNA in the GABAA cluster, to identify potential regulatory features limited to mammalian species. This revealed a number of intergenic homologous regions with sequence identity greater than 75% which are not conserved in chicken. These were further analysed for putative neuronal or ubiquitous TFBS motifs. In addition to the motifs found in the conserved chicken homolog, these regions contain motifs for STAT (signal transducer and activator of transcription), CEBP (CCAAT/enhancer binding protein), CDP (CCAAT displacement protein), NF (Nuclear Factor) Kappa B, activator protein 4 (AP4), upstream stimulating factor (USF), activating transcription factor (ATF) and enhancer factor 4 (E4F). An analysis of conserved intergenic regions is summarised in figure 18. Each of these conserved regions represents a potential distal regulatory element for one or more of the genes in the cluster, which could be inactivated by the neomycin gene insert that was observed to cause disruption of α1 and β2 expression.
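The identity criterion used above can be illustrated with a minimal sketch. This is not the rVista algorithm; it simply slides a fixed window along a pre-computed pairwise alignment and merges overlapping windows whose identity exceeds a threshold. The function name, the window length and the toy sequences are assumptions made for illustration, and gap columns are counted as mismatches.

```python
def conserved_regions(aln_a, aln_b, window=100, threshold=0.75):
    """Return merged (start, end) alignment-column intervals whose
    windowed identity exceeds `threshold`.

    aln_a and aln_b are two rows of the same pairwise alignment
    (equal length, '-' for gaps). Gap columns never count as matches.
    """
    if len(aln_a) != len(aln_b):
        raise ValueError("aligned sequences must have equal length")
    regions = []
    for start in range(len(aln_a) - window + 1):
        end = start + window
        matches = sum(
            1 for x, y in zip(aln_a[start:end], aln_b[start:end])
            if x == y and x != "-"
        )
        if matches / window > threshold:
            if regions and start <= regions[-1][1]:
                # Overlaps or abuts the previous hit: extend it.
                regions[-1] = (regions[-1][0], end)
            else:
                regions.append((start, end))
    return regions


# Toy alignment: 20 identical columns followed by 20 mismatched columns.
human = "ACGT" * 5 + "A" * 20
mouse = "ACGT" * 5 + "T" * 20
print(conserved_regions(human, mouse, window=10, threshold=0.75))
```

In practice the window length would be chosen to suit the expected size of the regulatory elements, and intervals overlapping annotated exons would be discarded to leave only intergenic candidates.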
Regions of DNase I hypersensitivity within a gene locus, which are not associated with core promoters, are indicators of distal cis-regulatory elements such as enhancers or silencers. Clusters of HSs provide markers for possible LCRs controlling transcription in the locus. As there is currently no HS prediction tool available, the locus of the GABAA chromosome 5 cluster was analysed based upon published locations of predicted HSs (coordinates are from the April 2003 human genome assembly). Although this data is based upon analysis of erythroid sequences, it was hypothesised that a proportion of the predicted sites would also be DNase I hypersensitive in other cell lines [41, 42], including neuronal cells.
The program MAR-Wiz was used to predict S/MARs within and around the human chromosome 5 GABAA receptor gene cluster, and in the homologous mouse cluster on chromosome 11. Results with a cut-off score of 0.7 were considered.
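As an illustration of this filtering step, the sketch below applies a score cut-off to a list of predicted regions and labels each retained region by its position relative to the gene cluster. The record layout, the function names and all coordinates are hypothetical and do not reflect MAR-Wiz output; the cut-off is assumed here to be inclusive (score of 0.7 or above).

```python
from typing import NamedTuple


class Prediction(NamedTuple):
    start: int    # first base of the predicted region
    end: int      # last base of the predicted region
    score: float  # prediction score, assumed to lie between 0 and 1


def filter_predictions(predictions, cutoff=0.7):
    """Keep predictions scoring at or above the cut-off (assumed inclusive)."""
    return [p for p in predictions if p.score >= cutoff]


def classify(prediction, cluster_start, cluster_end):
    """Label a prediction as flanking or lying within the gene cluster."""
    if prediction.end < cluster_start:
        return "upstream flank"
    if prediction.start > cluster_end:
        return "downstream flank"
    return "within cluster"


# Hypothetical coordinates, for illustration only.
predictions = [
    Prediction(100, 400, 0.65),
    Prediction(1000, 1300, 0.70),
    Prediction(5000, 5300, 0.92),
]
for p in filter_predictions(predictions):
    print(p, classify(p, cluster_start=900, cluster_end=4000))
```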
(a) Firstly, the region surrounding the gene cluster is anchored to the nuclear matrix by flanking S/MARs, which establish a tissue-specific chromatin domain and define the boundaries of transcriptional activity by preventing the propagation of chromatin condensation, thus shielding the locus from the silencing effects of neighbouring chromatin. Tethering of the domain to the nuclear matrix may also bring it into proximity with chromatin modifying processes which participate in opening of the chromatin loop domain to potentiate transcription.
(b) A central, functional S/MAR in the vicinity of the α6 subunit gene promoter migrates to the nuclear matrix, under the control of accessory transcriptional regulatory factors. This causes the potentiated chromatin to associate with transcriptional machinery proximal to the nuclear matrix. The location is approximately 100 kb from a region containing a cluster of predicted HSs, which was also observed to be rich in conserved putative TFBSs upon comparative analysis of the human and mouse genomes. The central S/MAR may also play a role in the demethylation of DNA regions to trigger further chromatin conformational changes.
(c) The transcriptional machinery reels in the open chromatin in either direction until a promoter is encountered, at which point gene transcription initiates. Whilst the α6 and γ2 genes in the cluster are regulated independently by other cell-specific TFs, a shared complement of TFs and corresponding cis-regulatory elements presumably accounts for the identical expression profile of the β2 and α1 subunit genes. Distal enhancer elements, some of which are shared by both β2 and α1 subunit genes, may loop back into the region of the transcriptional machinery to provide tissue-specific transcriptional augmentation.
(d) Transcriptional elongation causes the functional S/MAR to dissociate from the vicinity of the nuclear matrix, so that once the transcription is complete, a new cycle of transcription can be initiated. Selection of either the β2 or α1 gene for transcription may be random, or there may be an unknown mechanism of alternating the expression of β2 and α1 to ensure parallel expression levels.
The model provides an explanation for a number of observations: the head-to-head configuration of the β2 and α1 subunit genes, their parallel expression profiles, and the parallel down-regulation of expression observed upon insertion of the neomycin gene at exon 8 of the α6 subunit gene, which is in the vicinity of the proposed functional S/MAR, and is presumably close enough to interact with it and to reduce its effectiveness. The basic model could also incorporate expression of the α6 subunit, assuming that additional gene-specific factors contribute to produce a different tissue-specific expression profile. However, it is probable that the functional S/MAR would not be critical for γ2 gene transcription, since its expression is unaffected by the neomycin gene insert and subsequent S/MAR disruption.
In silico comparative analysis of GABAA receptor subunit genes was performed to predict potential regulatory features and in particular to identify the means of coordinate regulation in the gene cluster comprising the α1, β2, γ2 and α6 subunits. Bioinformatics resources were used to generate a number of predictions which were substantiated by cross-species comparative analysis and are subject to wet-laboratory verification. Putative promoters were identified for the β2, γ1, γ3, ε, θ and π subunits, which may be experimentally verified by cloning of the identified DNA segment and screening for promoter activity using luciferase reporter gene assays. Putative core elements and proximal TFs were identified within these predicted promoters, and within the experimentally determined promoters of the other subunits.
A region of predicted DNase I hypersensitive sites within the GABAA receptor cluster on human chromosome 5 represents a candidate site for transcriptional regulatory features controlling one or more genes in the cluster. The experimental procedures RNA TRAP and 3C could be used to verify whether this region comes into contact with the DNA of the expressed genes whilst transcriptionally active, as described by established models for enhancer action.
Given the disparate orientations of the genes in this cluster, it seems unlikely that they are under the control of a canonical, orientation-dependent LCR. Based upon a putative promoter identified for the β2 subunit gene, it is possible that its expression profile closely matches that of the α1 subunit at least in part by possessing the same core promoter characteristics. The model proposed here for their coordinate regulation is based upon the selective use of S/MARs, in which the chromosomal sequence surrounding the gene cluster is first anchored to the nuclear matrix by flanking S/MARs to establish the boundaries of transcriptional activity, and further directed by another functional S/MAR. Spatial and temporal fine-control of transcriptional activity may be achieved by gene-specific factors and cis-regulatory elements, but other distal, intergenic putative regulatory elements were isolated and may be common to both α1 and β2 genes. The model accounts for a number of features of the gene cluster and its regulation, including the orientation of the genes, and disruption of α1 and β2 subunit gene transcription by the insertion of a neomycin gene in the close vicinity of the α6 gene, which is proximal to a critical S/MAR. A first step in verification of this model would be the use of fluorescence in situ hybridisation (FISH) techniques to visualise the localisation of labelled predicted S/MARs on the nuclear matrix.
For each GABAA receptor subunit, DNA sequence data was obtained from the NCBI website for analysis. In most cases, the sequences were the latest versions of genomic DNA from the Entrez Gene database. Where the promoter region or gene transcription start site is undetermined, the sequences are normally annotated in GenBank with an mRNA start point, derived by automated computational analysis, sometimes with supporting experimental evidence. The first 1000 base pairs upstream of the given 5'-end were typically used for initial analysis. This region of sequence was presumed to contain the core promoter region and proximal regulatory elements. A number of promoter prediction programs were used, including PromoterInspector, NNPP, McPromoter, CISTER and Promoter 2. These are popular, freely available programs which each use a different promoter-predicting methodology: neural nets, HMMs and context-based approaches. Typically, several of these programs were used on the sequence to build up a consensus prediction for the core promoter region, which was then further analysed for individual regulatory features. Default options and parameters were used with these programs unless stated otherwise. To search for TFBSs in promoter regions, the Transcription Element Search System (TESS), P-MATCH and MatInspector were used. These programs each provide searches restricted by organism and tissue type; the search results were restricted to binding sites for vertebrate factors which are either ubiquitous or neuron-specific.
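The extraction of the upstream region described above can be sketched as follows. This is a minimal illustration rather than the procedure actually used: the function names are hypothetical, the annotated 5'-end is assumed to be given as a 0-based index into a genomic sequence held as a string, and for a gene on the reverse strand the upstream region is taken from the higher coordinates and reverse-complemented.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Return the reverse complement of a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def upstream_region(genomic_seq, five_prime_end, strand="+", length=1000):
    """Return up to `length` bases immediately upstream of the 5'-end.

    five_prime_end is the 0-based index of the annotated mRNA start.
    The result is always written 5'->3' on the coding strand and is
    shorter than `length` if the sequence boundary is reached first.
    """
    if strand == "+":
        start = max(0, five_prime_end - length)
        return genomic_seq[start:five_prime_end]
    if strand == "-":
        end = min(len(genomic_seq), five_prime_end + 1 + length)
        return reverse_complement(genomic_seq[five_prime_end + 1:end])
    raise ValueError("strand must be '+' or '-'")


# Toy sequence, for illustration only.
toy = "AAAACCCCGGGGTTTT"
print(upstream_region(toy, 8, "+", length=4))  # bases 4..7 of the forward strand
print(upstream_region(toy, 7, "-", length=4))  # bases 8..11, reverse-complemented
```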
Comparative analysis can provide powerful evidence for the in vivo functionality of TF binding, and provides a means to eliminate false positives from the large result sets typically created by PWM-based TFBS prediction programs. Functional TF binding sites are likely to be situated in conserved non-coding regions, and furthermore, to be located in equivalent positions across genomes. For each GABAA receptor subunit, sequences from several species were used to generate multiple sequence alignments of the gene promoter regions to identify conserved features. The alignments were performed using the CLUSTALW (1.81) Multiple Sequence Alignments program using default options unless stated otherwise.
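The principle of filtering PWM hits by positional conservation can be shown with a small sketch. It is not the method implemented by TESS, P-MATCH or MatInspector: the matrix, the score threshold and the toy alignment are assumptions, sequences are assumed to be upper case, and a hit is retained only if it starts at the same alignment column in both species.

```python
import math


def log_odds_matrix(counts, pseudocount=1.0, background=0.25):
    """Convert per-position base counts into a log2-odds weight matrix."""
    matrix = []
    for column in counts:
        total = sum(column.get(b, 0) for b in "ACGT") + 4 * pseudocount
        matrix.append({
            b: math.log2(((column.get(b, 0) + pseudocount) / total) / background)
            for b in "ACGT"
        })
    return matrix


def scan(seq, matrix, min_score):
    """Return start indices of windows scoring at least min_score."""
    width = len(matrix)
    hits = []
    for i in range(len(seq) - width + 1):
        word = seq[i:i + width]
        if any(b not in "ACGT" for b in word):
            continue  # skip windows containing ambiguous bases
        if sum(matrix[j][b] for j, b in enumerate(word)) >= min_score:
            hits.append(i)
    return hits


def column_map(aligned):
    """Map each ungapped sequence index to its alignment column."""
    return [col for col, ch in enumerate(aligned) if ch != "-"]


def conserved_hits(aligned_a, aligned_b, matrix, min_score):
    """Alignment columns at which both sequences have a matrix hit."""
    cols_a, cols_b = column_map(aligned_a), column_map(aligned_b)
    hits_a = {cols_a[i] for i in scan(aligned_a.replace("-", ""), matrix, min_score)}
    hits_b = {cols_b[i] for i in scan(aligned_b.replace("-", ""), matrix, min_score)}
    return sorted(hits_a & hits_b)


# Toy TATA-like matrix and alignment, for illustration only.
tata = log_odds_matrix([{"T": 10}, {"A": 10}, {"T": 10}, {"A": 10}])
print(conserved_hits("GG-TATACC", "GGCTATACC", tata, min_score=6.0))
```

Requiring an identical start column is deliberately strict; a tolerance of a few columns could be allowed where alignments are uncertain.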
The success of the comparative sequence analysis approach is largely dependent upon the selection of species which are at a suitable evolutionary distance. The sequence difference between closely related species will not provide any meaningful filtering of results, whilst comparison of highly unrelated species will be unlikely to reveal any conserved binding sites. In each case, sequences from human and mouse gene promoter regions were used with those for other selected available species. Sequence data for human and mouse is available for all GABA receptor subtypes, and the species are generally at a suitable evolutionary distance for the effective filtering of results. Sequence conservation between these species in non-coding regions was taken as additional evidence for the biological significance of predicted regulatory features. Quoted sequence identity scores were derived by using the alistat program provided as part of the HMMER 2.2g HMM analysis package (hmmer.wustl.edu), using the Clustal X alignments as input. In most cases, only putative TF binding sites and promoter features which are largely conserved in all species in the alignment are reported in the results section. The filtering of TFBS predictions by species, cell specificity and finally by conservation in alignments reduces the number of predicted TFBSs to realistic levels, albeit at the risk of eliminating true positives from the result sets.
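A sequence identity score for an alignment can be computed along the following lines. This sketch does not claim to reproduce the alistat calculation; it assumes one common definition, in which identity is the fraction of matching residues over the columns where neither sequence has a gap, averaged over all sequence pairs. The function names and toy alignment are hypothetical.

```python
from itertools import combinations


def pairwise_identity(a, b):
    """Fraction of identical residues over columns where neither row is gapped."""
    if len(a) != len(b):
        raise ValueError("aligned sequences must have equal length")
    compared = matches = 0
    for x, y in zip(a.upper(), b.upper()):
        if x == "-" or y == "-":
            continue
        compared += 1
        if x == y:
            matches += 1
    return matches / compared if compared else 0.0


def mean_identity(alignment):
    """Average pairwise identity over all pairs of rows in an alignment.

    alignment maps a sequence name to its aligned (gapped) sequence.
    """
    pairs = list(combinations(alignment.values(), 2))
    if not pairs:
        raise ValueError("at least two sequences are required")
    return sum(pairwise_identity(a, b) for a, b in pairs) / len(pairs)


# Toy alignment, for illustration only.
toy = {"human": "ACGT", "mouse": "ACGT", "rat": "ACGA"}
print(round(mean_identity(toy), 3))
```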
Co-regulated genes may each possess the same promoter features, or they may share common regulatory features. Whilst analysis of the individual gene promoters can identify the former, prediction of long-range features responsible for the coordinate expression of several genes requires analysis of the whole gene cluster sequence. Regions of intergenic sequence which are more highly conserved than the surrounding background DNA would be strong candidates for such functional regulatory features. The rVista 2.0 program was used to perform cross-species, comparative analysis of the human chromosome 5 GABAA cluster locus in order to identify potential intergenic regulatory elements in addition to the core promoters of individual genes. The program offers a combination of signal-based TFBS searches with comparative sequence analysis to reduce the number of false positive matches and to provide supporting evidence for site functionality.
HSs occur over short stretches of DNA (typically ~250 bp), and are perhaps two orders of magnitude more sensitive to DNase I than bulk chromatin. Formation of hypersensitivity is a result of the interaction of multiple trans-acting factors bound to cis-regulatory elements, and is taken as a reliable indicator of functional transcriptional features, such as promoters and enhancers, in non-coding DNA.
Noble et al. used a Support Vector Machine (SVM) to recognize HSs in genomic sequence data. The SVM was used to predict HSs for all non-repetitive sequence in the human genome, partitioned into 225 bp segments. High positive predictive values in subsequent experimental validation of the predicted HSs suggested to the authors that "elements identified by the SVM might represent a class of HSs that are active in many tissues or are even constitutive." Crawford et al. have used high-throughput experimental analysis to identify clusters of HSs in CD4+ T cells (lymphocytes) for a representative sample of sequences from the human genome. Whilst 10% of the HSs thus identified are only detectable in lymphocytes, the remaining 90% were confirmed as HSs for all tested cell types. Whilst there is as yet no generally available tool for predicting HSs, this data suggests that a high percentage of sites of DNase I hypersensitivity in one tissue type could also be so in other tissue types, including neuronal tissue. Based upon this hypothesis, the predicted HSs were analysed as possible markers for LCRs and other distal regulatory elements such as enhancers or silencers, controlling transcription in the locus of the chromosome 5 GABAA gene cluster.
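The partitioning of sequence into 225 bp segments can be sketched as below. This illustrates only the segmentation step, not the SVM or the procedure used by Noble et al.: the function names are hypothetical, and repetitive sequence is assumed to be soft-masked (lower case) or hard-masked (N), which is an assumption about the input rather than something stated in the source.

```python
def fixed_segments(seq, size=225):
    """Split a sequence into consecutive, non-overlapping segments.

    Returns (start, segment) pairs; a trailing fragment shorter than
    `size` is dropped.
    """
    return [(i, seq[i:i + size]) for i in range(0, len(seq) - size + 1, size)]


def non_repetitive_segments(seq, size=225, max_masked_fraction=0.0):
    """Keep segments whose masked fraction does not exceed the limit.

    Masked bases are assumed to be lower case (soft-masked) or 'N'.
    """
    kept = []
    for start, segment in fixed_segments(seq, size):
        masked = sum(1 for ch in segment if ch.islower() or ch == "N")
        if masked / size <= max_masked_fraction:
            kept.append((start, segment))
    return kept


# Toy sequence: one unmasked segment, one soft-masked segment, one short tail.
toy = "A" * 225 + "a" * 225 + "C" * 100
print([start for start, _ in fixed_segments(toy)])
print([start for start, _ in non_repetitive_segments(toy)])
```

Each retained segment would then be passed to a classifier; the classifier itself is outside the scope of this sketch.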
S/MARs are found either in non-transcribed regions, at the borders of chromatin domains, or in close association with non-coding transcription elements such as enhancers or introns. S/MARs themselves are often rich in TF binding sites, with a local over-representation of specific AT-rich motifs [50, 51]. MAR-Wiz (http://futuresoft.org/MarFinder/) is a web-based tool for predicting S/MARs based upon a number of analysis rules including origin of replication, TG-richness, curved DNA, kinked DNA, and topoisomerase II recognition. MAR-Wiz was used to predict S/MAR regions in the gene cluster locus, as potential anchor points delimiting chromatin loop domains.
Gene Regulation Prediction Software utilised
CpG island detection
TESS, Transcription Element Search System: PSSM-based TF binding site search
PSSM-based TF binding site search
MATCH and P-MATCH: PSSM-based TF binding site search
NNPP, Neural Network Promoter Prediction: uses neural networks to predict basal promoter region and TSS
TSS prediction, limited to TATA class promoters
Uses neural networks with genetic algorithms to predict vertebrate Pol II promoter regions
Context-based prediction of eukaryotic Pol II promoter regions
Markov chain/neural net based promoter and TSS prediction program
HMM-based cis-element search, lists interacting TFs (promoter modules) and TF class summary
NCBI Entrez Gene: searchable database of genes
Cister, Cis-element Cluster Finder: HMM-based cis-element cluster search
Phylogenetic footprinting, combines database searches with comparative sequence analysis
S/MAR motif sequence search
Chromosome Conformation Capture
Activator protein 4
Activating transcription factor
Brain specific factor 1
CCAAT displacement protein
CCAAT/enhancer binding protein
Central nervous system
cAMP-responsive element binding protein
Downstream promoter element
E4F enhancer factor 2, 4
Early growth response factor 3
Estrogen response element
Fluorescence in situ hybridization
GABAA γ-aminobutyric acid receptor type A
Glucocorticoid response element
HS DNase I hypersensitive site
Locus Control Region
Ligand-gated ion channel
Multiple start site
Nicotinic acetylcholine receptor
NF-1 Nuclear factor I family of transcription factors (CAAT box binding site)
NRSF Neuron-restrictive silencer element/factor
Open reading frame
Octamer binding protein
Phosphoglycerine kinase neomycin resistance hybrid gene
Positional weight matrix
REST RE1-silencing transcription factor
Scaffold/Matrix Attachment Region
SP1, 2 etc Specificity protein 1, 2
Signal transducer and activator of transcription
Support vector machine
Transcription factor binding site
Tagging and recovery of associated proteins
Transcription start site
Upstream stimulating factor
I would like to thank Dr Alan Bateson for originally proposing the project upon which this paper is based, and for his helpful comments and suggestions during its preparation.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.