Combined analysis of expression data and transcription factor binding sites in the yeast genome
© Nagaraj et al. 2004
Received: 30 April 2004
Accepted: 26 August 2004
Published: 26 August 2004
The analysis of gene expression using DNA microarrays provides genome-wide profiles of the genes controlled by the presence or absence of a specific transcription factor. However, the question arises of whether a change in the level of transcription of a specific gene is caused by the transcription factor acting directly at the promoter of the gene or through regulation of other transcription factors working at the promoter.
To address this problem we have devised a computational method that combines microarray expression and site preference data. We have tested this approach by identifying functional targets of the a1-α2 complex, which represses haploid-specific genes in the yeast Saccharomyces cerevisiae. Our analysis identified many known or suspected haploid-specific genes that are direct targets of the a1-α2 complex, as well as a number of previously uncharacterized targets. We were also able to identify a number of haploid-specific genes which do not appear to be direct targets of the a1-α2 complex, as well as a1-α2 target sites that do not repress transcription of nearby genes. Our method has a much lower false positive rate when compared to some of the conventional bioinformatic approaches.
These findings show advantages of combining these two forms of data to investigate the mechanism of co-regulation of specific sets of genes.
A bioinformatic approach to identifying cis-regulatory elements controlling transcription has become feasible with the availability of complete genome sequences and large scale expression data using high-throughput methods such as microarrays [1,2] and SAGE. The expression data provides a list of genes whose expression is significantly modified under a particular condition. However, this data does not indicate whether these genes are direct targets of a particular transcription factor or if the changes in expression are the result of an indirect effect caused by altering the expression of other transcription factors that work directly at the promoter. Using information about sequence preference for binding of particular transcription factors, one can identify possible regulatory binding sites within a sequenced genome. However, this approach does not indicate if the sites are functional. We have therefore developed an algorithm that combines both of these approaches to distinguish between the direct and indirect targets that are regulated by a particular transcription factor.
We have applied this methodology to study the transcriptional regulatory system that specifies cell mating-type in the yeast Saccharomyces cerevisiae. Yeast have three cell types, haploid a and α cells, and the a/α diploid, that differ in their ability to mate and in the proteins they express. Cell mating-type is determined in part by α2 and a1, which are cell-type-specific proteins that are members of the homeodomain (HD) DNA-binding family. In an a/α diploid cell, α2 binds with a1 to form a heterodimer complex that represses transcription of haploid-specific genes. The crystal structures of the α2 HD binding DNA alone and in complex with a1 have been solved, providing models for how these complexes bind DNA [6,7]. Biochemical and mutational analysis of each protein and their DNA-binding sites have defined the requirements for DNA recognition by this complex [8–10]. Genome-wide expression analysis has also been performed on each of the different cell types. The combination of these resources has allowed us to develop and test algorithms to identify target sites for the a1-α2 complex. Previous work, using a relatively simple binding site search program, identified targets for the α2-Mcm1 complex, which represses a-cell-type specific genes in α and a/α cells. The more advanced methods described in this paper have helped identify several novel targets of the a1-α2 complex that may be involved in cell-type specific processes. Interestingly, we identified several genes that are repressed in diploid cells but do not appear to be direct targets of the a1-α2 complex, suggesting that these genes are controlled by another transcriptional regulatory factor that is directly or indirectly regulated by the a1-α2 complex. We have also identified a number of a1-α2 target sites that do not repress adjacent genes. The combination of site preference and expression data is therefore a valuable tool to identify direct functional targets of a transcription factor or complex.
To generate an algorithm that combines microarray expression data and mutational analysis of binding sites, we first defined a scoring method that ranks gene expression data. We utilized the microarray expression data from Galitski and coworkers for gene expression in the a and α haploid and a/α diploid cells, as well as various polyploids. Since the a1-α2 complex should be absent in any of the homozygous a or α type polyploids (a, aa, ..., α, αα, ..., etc.) we expect the expression of haploid-specific genes in these cells to be much higher than in cells that are heterozygous for the MAT locus (aα, aaα, aαα, aaαα, etc.). Thus, one term in the scoring function rewards lower expression in heterozygous cell types compared to the homozygous cell types (see the Methods section for details). We expect that most of the haploid-specific genes will be expressed equally in both a and α cell types. Consequently, we have introduced a second term in the scoring function that penalizes such differences in expression in the two haploid cell types. This scoring function would identify haploid-specific genes that are repressed in diploid cells, but would not indicate if these genes are direct targets of the a1-α2 repressor complex.
To identify genes from this ranking that are directly repressed by the a1-α2 complex we used the available mutational data on the a1-α2-binding site. In these experiments, the effects of single base pair mutations of the a1-α2 consensus binding site were measured by assaying their ability to repress transcription of a heterologous promoter and by electrophoretic mobility shift DNA-binding assays (EMSA). Under the assumption that the level of expression is proportional to how often that site is unoccupied, we used the effects of the single base mutations to estimate the parameters for the binding energy of sites with different bases at each position. We then used this information to search for potentially strong binding sites in the promoter regions (in practice, 800 bp upstream of the translation start site) of every gene in the genome. This search provides us with a list of genes with putative a1-α2 binding sites in the promoter, irrespective of functionality.
Potential a1-α2 Binding Sites in Haploid-specific Genes
In comparison to higher eukaryotes, most yeast promoters are relatively small and contain activator or repressor binding sites within several hundred base pairs of the start site of the open reading frame (ORF) of the gene. However, there are a few genes, like HO, whose regulation is controlled by a region several Kb long. Consequently, we did a separate search looking for additional a1-α2 binding sites that are within a region 1.5 Kb upstream of the ORF. Most of the sites identified in this search were well above the threshold value and were not bound by the a1-α2 complex in ChIP assays (data not shown). However, the search identified one site upstream of the AMN1/CST13 gene that was a potential target site. The ChIP analysis verified this as a functional target site for the a1-α2 complex in vivo (Figure 1, Table 1).
Haploid-specific Genes that Do Not Contain a1-α2 Target Sites
We compare the performance of our algorithm with that of the weight matrix method [15–18]. In our study, we derived our parameters from a set of artificial sequences. Usually, the weight matrix has to be constructed from a set of known sites. We calculate the weight matrix for a1-α2 from regulatory elements upstream of several known target genes: HO, GPA1, FUS3, AXL1, STE5, RME1 and MATα1. As usual, one is faced with a choice of threshold weight matrix score for selecting putative sites in the yeast genome. For a stringent threshold that corresponds to the top 16 targets, we recovered all the genes, other than RME1, used in construction of the weight matrix. However, we did not recover most of the other genuine targets identified, and verified, in this study. If we set the threshold to be lax enough to include RME1, we obtained 55 candidate genes, including STE18 and RDH54, but still missed targets like STE4. It is likely that most of the 55 putative targets are false positives, as evidenced by the lack of haploid-specific regulation in the corresponding gene expression data.
Overall, we find our method to be more successful than the weight matrix method. The use of mutational data as opposed to literature based data for sequence preference possibly accounts for part of the success (an advantage we may not have for some other transcription factors). However, much of our success has to do with cutting down of false positive rates by using microarray data judiciously.
Potential a1-α2 target sites in ORFs
Since the a1-α2 complex was able to bind to these sites with weak to moderate affinity in vitro, it is possible these sites may partially repress transcription on their own. To test this model, we cloned these sites into the context of the CYC1 promoter driving expression of a lacZ gene and measured the ability of the sites to repress transcription of the reporter in diploid cells. The sites from the CDC25 and URB1 ORFs did not repress transcription of the reporter promoter in diploid cells (Fig. 3A). However, the site from the PRM8 ORF, which showed the highest binding affinity among the sites found in ORF regions, weakly (2.8-fold) repressed the reporter promoter. This result indicates that this site can function as a repressor site in vivo if placed in the proper context. We next tested whether a1-α2 bound to these sites in the normal genomic context in vivo by ChIP assays. None of the sites in the ORF regions were bound by the a1-α2 complex (Fig. 3B and Table 3). This result indicates that while they are competent for weak binding and repression in a heterologous promoter, they are unable to repress transcription in their normal genomic context.
Potential a1-α2 Target Sites in the Promoters of Non-Haploid-specific Genes
Genome-wide gene expression data using SAGE or DNA microarrays has provided a wealth of information on the regulation of genes under certain conditions or by specific transcription factors. The combination of this information with sequence analysis programs has enabled researchers to identify potential regulatory sites. For example, in a pioneering paper, Tavazoie et al. clustered expression data and used multiple local sequence alignment algorithms on the promoter regions of the co-clustered genes to discover regulatory motifs. This approach has been further refined by using Bayesian networks to incorporate additional constraints regarding relative positions and the orientations of the motifs. Another approach has been to break the genes into modules and perform module assignments and motif searches at the same time via an expectation maximization algorithm (as opposed to clustering first and finding motifs later) [21,22]. Although these approaches have worked well at identifying potential target sites, one drawback is that the expression patterns have to cluster well for these methods to work. For a small number of microarray experiments, this may not always be the case. A method that does not utilize clustering is a regression model based analysis to locate "words" in the promoter that correlate with modulation of expression. However, this approach is restricted to retrieving functional consensus binding sites in the promoter regions, and for transcription factors with low sequence specificity it needs to be modified. Most of these approaches attack the difficult problem of what to do when relatively little is known about the regulatory system and sequence recognition by the protein. Consequently they develop pattern recognition algorithms that are essentially unsupervised.
Our focus has been to take advantage, as much as possible, of knowledge about the biological system and use that information combined with expression analysis to identify potential target sites. The minor loss of generality of the tools resulting from such an approach is more than offset by its predictive power.
To determine if the changes in expression of a specific gene are the result of a transcription factor working at the promoter we developed an algorithm that combines expression data with information on the binding site preference for a transcription factor. As a test for this algorithm we identified genes in yeast that are direct targets for regulation by the a1-α2 repressor complex. We also used this method to identify genes that are repressed in diploid cells but that are not direct targets of the complex, as well as functional a1-α2 binding sites that do not appear to repress transcription in their genomic context. The combination of these sets of findings has provided insight into the regulatory network and mechanism of repression by the a1-α2 complex.
The primary goal of this study was to identify genes that are direct targets for repression by the a1-α2 complex. There are two major functional subsets among the a1-α2 target genes identified in this analysis (Table 1). One, not surprisingly, involves genes that are required for various processes in mating of the two haploid cell-types. These include components of the mating pheromone signal transduction pathway, such as GPA1, STE18, STE4, and STE5, which are activated in response to the binding of pheromone from the other cell type. This group also includes genes further down that pathway, such as FAR1 and FUS3, which are required for cell-cycle arrest before mating. A number of these genes have previously been shown or suspected to be under the control of the a1-α2 repressor complex [25,26]. Repression of these genes in diploid cells is biologically important because it prevents further mating by diploid cells. If diploid cells mate they would form triploids or higher ordered genomic polyploids, which are genetically unstable during meiosis and therefore detrimental to cell survival.
The second subset of genes identified in the analysis is associated with mating type switching and recombination. The HO gene is a known target of the a1-α2 complex and its promoter contains 10 binding sites of varying affinity. Repression of HO is essential in diploid cells because it prevents switching of one of the MAT loci to form homozygous a/a or α/α diploid cells. Although diploid in genomic content, cells homozygous for the MAT loci are competent to mate and therefore would form higher order genomic polyploids that are genetically unstable. We have also shown that NEJ1, which is involved in non-homologous end-joining (NHEJ), is a direct target for the a1-α2 complex [27,28]. It has been proposed that repression of the NHEJ pathway may promote homologous recombination and crossing over in diploid cells. In addition, we found that RDH54, a gene involved in double-stranded DNA break repair, is a direct target for the a1-α2 complex. This result is somewhat unexpected because RDH54 is required for meiosis and null mutants show significantly reduced spore viability. It is likely that the a1-α2 complex only partially reduces the level of expression of the gene and that diploid cells require a lower level of activity of the protein.
We also identified several genes that fell outside of these two subsets. One is RME1, which encodes a transcriptional repressor of IME1, the master regulator of meiosis [30–32]. a1-α2-mediated repression of RME1 is required to allow cells to enter the meiotic pathway in diploid cells. Interestingly, we also found that PDE1 and MET31 are weakly, but reproducibly, direct targets for repression by the a1-α2 complex. The Pde1 protein is a low affinity cAMP phosphodiesterase that appears to have a role in response to stress and cell aging. Repression of PDE1 in diploids may partially account for the difference of starvation response between haploids and diploids. Met31 is a zinc finger DNA-binding protein that activates genes involved in sulfur metabolism. It is unclear why this gene would be a target for the a1-α2 complex.
It is possible that the presence of an a1-α2 target site upstream of a gene that has lower expression in diploid cells was fortuitous and that these sites were not functional targets. However, if this was the case then there would be little pressure to conserve these binding sites through evolution. Several closely related species of yeast have been sequenced and comparison of the corresponding promoter regions has led to the discovery of conserved regulatory motifs [35,36]. Although lack of conservation does not imply non-functionality, significant conservation strongly argues for functionality of a putative regulatory element. To investigate this possibility, we performed a phylogenetic comparison to infer whether these sites are preserved among six sequenced Saccharomyces species using the PhyloGibbs program. The program identified the a1-α2 binding site among a promoter set including many known haploid-specific genes (HO, NEJ1, GPA1, STE4, and STE18). This analysis also showed that the a1-α2 binding sites in the RDH54, PDE1 and MET31 promoters are strongly conserved among multiple species, suggesting that these sites play an important functional role.
Our analysis identified a number of haploid-specific genes that do not appear to be direct targets of the a1-α2 repressor complex (Table 2). Genes in this list do not contain a recognizable a1-α2-binding site and, with the exception of NEM1, are not detectably bound by the a1-α2 complex in the ChIP assays. It is possible that a1-α2 indirectly turns off these genes by repressing an activator protein that is required for their expression. However, besides MET31, there were no obvious genes coding for activator proteins that were direct targets of the a1-α2 complex. It is possible that the haploid-specific genes without a1-α2 sites are indirectly repressed through more complex mechanisms that involve repression of RME1.
We also identified potential a1-α2-binding sites in the genome that do not appear to repress expression of nearby genes. Although sites from the PRM8, PRM9, CDC25, and LSM1 promoters appear to be moderate binding sites for the a1-α2 complex in vitro, ChIP and heterologous reporter assays showed that these sites are neither bound by the proteins nor functional as repressor sites in vivo. Many of these sites lie in open reading frames of actively transcribed genes and so it is possible that transcription through the binding site or the chromatin structure of the region prevents high affinity binding by the complex. The model that the genomic context of these sites is important for their regulatory activity is further supported by our results that show that some of these sites, such as COX13 and REX2, function as strong a1-α2 dependent repressor sites in the context of the heterologous promoter. Although the a1-α2 complex is bound to the COX13 site in vivo it does not appear to repress transcription of this gene in diploid cells. Interestingly, this binding site is very close to the end of the coding region of IME4, an inducer of meiosis that is expressed in diploid cells. The IME4 gene is only expressed in diploid cells and it was thought that the a1-α2 complex may be indirectly activating its expression by repressing a repressor protein, such as RME1. However, the fact that a1-α2 binds to the downstream region of this gene suggests that it may play a direct role in its expression.
Our data shows that the algorithm we have developed is useful in sorting between direct and indirect targets of a transcription factor. Although we have used mutational data to define the binding site for the a1-α2 complex, in principle binding site sequences derived from site selection experiments may also be used. This analysis may also complement genome-wide ChIP studies to identify the target sites of the transcription factor.
In summary, we show that combining microarray data with motif analysis lets us distinguish between the genes that are direct targets of a transcription factor and those that are modulated because of secondary effects. We get excellent agreement of the computational predictions with location analysis by ChIP experiments. We find most of the direct targets of the a1-α2 complex to be involved in the mating pathway, mating type switching, recombination and meiosis. We also found a few weak targets that are possibly involved in sensing and control of the metabolic state. We also see that the sites we predict solely based on single species data are often evolutionarily conserved in other species of Saccharomyces.
We define a scoring algorithm that ranks gene expression patterns. For gene g, the score is given in terms of the expression in different types of cells (a, α and a/α):
\[ \mathrm{Score}(g) = \operatorname{sgn}\!\left(\frac{X_a(g) + X_\alpha(g)}{2} - X_{a/\alpha}(g)\right)\left[\frac{X_a(g) + X_\alpha(g)}{2} - X_{a/\alpha}(g)\right]^2 - A\,\bigl(X_a(g) - X_\alpha(g)\bigr)^2. \]
We initially used the logarithm of the expression level of the gene g in the three cell types for the variables $X_t(g)$, with t indicating the type. We have since found that using the complete expression data for the polyploids is a better strategy. In the final results shown in the paper, $X_a(g)$ is the average of the log expression for a, aa and aaa. Likewise, $X_\alpha(g)$ is the average from α, αα and ααα. $X_{a/\alpha}(g)$ comes from averaging log expression over aaα, aαα and aaαα. The polyploid averaged quantities tend to be less noisy (demonstrated, for example, by the quantities $X_a$ and $X_\alpha$ being close to each other for the generic gene, which is not regulated by cell type). This, in turn, allows easier detection of genuine haploid-specific targets.
The purpose served by the different terms in the overall score is as follows. The first term scores well when expression in diploids is lower than the average expression in haploids. The second term penalizes the gene if the expression levels in the two types of haploids are very different. A is chosen to be large enough so that known a-specific genes and α-specific genes score worse than known haploid-specific genes, but not large enough to overwhelm the first term. The optimal A is about 10. Comparison of the performance of our algorithm for A = 1 and A = 10 shows that the biologically known sites almost always stay near the top, but further down in the list the second choice is better. The exception is a special gene: MATα1. Since MATα1 is not present in the MATa type cell, its expression differs between the two haploid types and it is penalized by the second term. The cumulative probability for any gene to have a higher score than a gene g is $P_{\mathrm{exprn}}(g)$, namely, the fraction of genes with a score higher than that of g.
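The scoring and ranking steps described above can be sketched in a few lines of Python. This is a minimal illustration, not the authors' code: the function names, the use of natural logarithms, and the three toy genes are assumptions made for the example.

```python
import bisect
import math

def mean_log_expression(levels):
    """Average log expression over a group of cell types (e.g. a, aa, aaa).
    The base of the logarithm is an assumption; the text does not state it."""
    return sum(math.log(v) for v in levels) / len(levels)

def expression_score(x_a, x_alpha, x_dip, A=10.0):
    """Score(g): signed square of (haploid mean - heterozygous mean),
    minus A times the squared difference between the two haploid types."""
    d = (x_a + x_alpha) / 2.0 - x_dip
    sign = (d > 0) - (d < 0)
    return sign * d * d - A * (x_a - x_alpha) ** 2

def p_exprn(scores):
    """For each gene, the fraction of genes with a strictly higher score."""
    ordered = sorted(scores.values())
    n = len(ordered)
    return {g: (n - bisect.bisect_right(ordered, s)) / n
            for g, s in scores.items()}

# Hypothetical averaged log-expression values (X_a, X_alpha, X_a/alpha).
toy = {
    "haploid_specific": (2.0, 2.0, 0.0),
    "a_specific": (2.0, 0.0, 0.0),
    "unregulated": (1.0, 1.0, 1.0),
}
toy_scores = {g: expression_score(*x) for g, x in toy.items()}
toy_ranks = p_exprn(toy_scores)
```

With A = 10 the a-specific toy gene is pushed below the unregulated one by the second term, which is the behaviour the choice of A is meant to produce.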
This scoring function would rank haploid-specific genes high but may not select out genes that are directly regulated by the a1-α2 repressor complex. In order to select for genes with an upstream region containing a strong a1-α2 repressor site, we used the available binding site mutational data. In these experiments, repression of a heterologous promoter, incorporating single site mutations of a consensus binding site of the a1-α2 repressor, was measured. Under the assumption that the degree of repression is inversely proportional to how often that site is unoccupied, we derived the expression $1/\mathrm{Repression} \propto \left[1 - 1/(1 + e^{\beta E(S)}/z)\right] \approx e^{\beta E(S)}/z$, assuming near saturation of binding. The symbol $z$ represents the fugacity and $\beta$ is the inverse of $k_B T$, $k_B$ being the Boltzmann constant. The binding (free) energy is given by $E(S) = \sum_{i,b} \varepsilon_{ib} S_{ib}$, within the single base model [15,16]. The index $i$ runs over the positions in the motif and $b$ runs over the bases A, C, G, T. $S_{ib}$ is 1 or 0 depending upon whether the $i$-th base is $b$ or not. The parameters $\varepsilon_{ib}$ represent effects of single base changes on the binding (free) energy. They are related to the weight matrix parameters [17,18] widely used to characterize variable motifs. Note that $e^{\beta E(S)}/z$ would more commonly be represented as $(K/[\mathrm{Protein}]) \cdot \exp(\Delta G(S)/RT)$ in the biochemistry literature [18,39]. The independent base model is only an approximation and mutations in nearest neighbor sites could produce effects that we cannot estimate from the existing data. There is a better separation between well-known sites and generic sequences if an extra penalty is added to the score for neighboring base pairs which both differ from the consensus. In this way every base different from consensus and neighboring another base different from consensus draws an additional penalty to the binding energy score. This parameter was set empirically to ln(2).
Although this method prevented many false positives, it also penalized a few genuine candidates, such as the binding sites in the promoters of FAR1 and MATα1. Thus, from the effect of the single base mutations, we estimated the parameters $\varepsilon_{ib}$. Armed with these parameters, we found the probability $P(E|\varepsilon, L)$ that a random sequence of a certain length, $L$, would have a subsequence of binding energy greater than $E$. For gene g, the strongest site in the upstream region of length $L$ would have binding energy $E_g$. Low values of $P_{\mathrm{binding}}(g) = P(E_g|\varepsilon, L)$ indicated the presence of a good binding site.
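The binding-site model above can be sketched as follows. This is a hedged illustration under several assumptions that are not in the text: energies are expressed in units of $k_B T$ as penalties relative to the consensus (so lower means stronger binding), random sequences are drawn with equal base frequencies, only one strand is scanned, and the probability is estimated by simple Monte Carlo sampling rather than by whatever procedure the authors used. The 4-bp consensus and its parameters are toy values.

```python
import math
import random

BASES = "ACGT"
NEIGHBOR_PENALTY = math.log(2)  # extra cost per mismatch adjacent to another mismatch

def epsilon_from_repression(r_consensus, r_mutant):
    """Energy penalty (in kT) of a single-base mutant, using
    1/Repression proportional to exp(beta*E)/z near saturation."""
    return math.log(r_consensus / r_mutant)

def site_energy(site, eps, consensus):
    """Independent-base energy plus ln(2) for every mismatched base
    that neighbors another mismatched base."""
    mismatch = [b != c for b, c in zip(site, consensus)]
    energy = sum(eps[i][b] for i, b in enumerate(site))
    for i, is_mismatch in enumerate(mismatch):
        left = i > 0 and mismatch[i - 1]
        right = i + 1 < len(mismatch) and mismatch[i + 1]
        if is_mismatch and (left or right):
            energy += NEIGHBOR_PENALTY
    return energy

def best_site_energy(seq, eps, consensus):
    """Energy of the strongest (lowest-penalty) site in a sequence."""
    w = len(consensus)
    return min(site_energy(seq[k:k + w], eps, consensus)
               for k in range(len(seq) - w + 1))

def p_binding(e_g, eps, consensus, length, trials=2000, seed=0):
    """Monte Carlo estimate of the chance that a random sequence of the
    given length contains a site at least as strong as e_g."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(trials):
        seq = "".join(rng.choice(BASES) for _ in range(length))
        if best_site_energy(seq, eps, consensus) <= e_g:
            hits += 1
    return hits / trials

# Toy parameters: every mismatch costs 1.5 kT (hypothetical values).
TOY_CONSENSUS = "ACGT"
TOY_EPS = [{b: (0.0 if b == c else 1.5) for b in BASES} for c in TOY_CONSENSUS]
```

For a real promoter one would call `best_site_energy` on the 800 bp upstream region and pass the result to `p_binding` with `length=800`; a small returned value then plays the role of a low $P_{\mathrm{binding}}(g)$.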
A weight matrix [15–18] search for binding sites was performed using a set of known sites to construct the matrix. Each matrix entry, $w_{ia}$, was set to $\log((f_{ia} + \delta)/(P_a + \delta))$, where $f_{ia}$ is the frequency with which base $a$ appears in the $i$-th position in the known sites, $P_a$ is the frequency with which base $a$ appears in the promoters of genes and $\delta$ is a small number added to ensure that the weight matrix score is finite even when $f_{ia} = 0$. Each subsequence, $S_{ia}$, of length 20 in each promoter was assigned a weight matrix score $\sum_{i,a} S_{ia} w_{ia}$. After a threshold score is chosen, sites scoring above that threshold are declared to be binding sites.
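The weight matrix construction and scan can be written compactly. The sketch below follows the formula in the text; the value of δ, the uniform background and the 4-bp toy sites are assumptions chosen for the example (the search described above used sites of length 20).

```python
import math

BASES = "ACGT"

def build_weight_matrix(known_sites, background, delta=0.01):
    """w[i][a] = log((f_ia + delta) / (P_a + delta)), where f_ia is the
    frequency of base a at position i among the known sites."""
    n = len(known_sites)
    width = len(known_sites[0])
    matrix = []
    for i in range(width):
        column = {}
        for a in BASES:
            f_ia = sum(site[i] == a for site in known_sites) / n
            column[a] = math.log((f_ia + delta) / (background[a] + delta))
        matrix.append(column)
    return matrix

def scan(promoter, matrix, threshold):
    """Return (position, score) for every window scoring above threshold."""
    width = len(matrix)
    hits = []
    for k in range(len(promoter) - width + 1):
        score = sum(matrix[i][promoter[k + i]] for i in range(width))
        if score > threshold:
            hits.append((k, score))
    return hits

# Toy example with hypothetical sites and a uniform background.
toy_sites = ["ACGT", "ACGT", "ACGA"]
toy_background = {a: 0.25 for a in BASES}
toy_matrix = build_weight_matrix(toy_sites, toy_background)
toy_hits = scan("TTACGTTT", toy_matrix, threshold=0.0)
```

The choice of `threshold` is the free parameter discussed in the comparison with our method: a stricter value returns fewer sites and a laxer one admits more false positives.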
An automated procedure for generating primers flanking a specified site in the genome sequence, σ, was implemented. To each pair of numbers, $d_u$ and $d_d$, representing primer distances upstream and downstream of the candidate binding site respectively, and primer lengths $l_u$ and $l_d$, a score is assigned via
Values of $d_u$, $d_d$, $l_u$ and $l_d$ are restricted to those whose corresponding primers have GG, GC, CG, or CC at the end nearest the candidate site. Primers are identified by selecting the values of $d_u$, $d_d$, $l_u$ and $l_d$ which maximize $S$.
[A web-based interface to this algorithm is available at http://hill-226-174.rutgers.edu/]
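The constraint on primer ends and the final maximization can be sketched as below. The scoring formula itself is not reproduced in this text, so the sketch takes it as a caller-supplied function `score(d_u, d_d, l_u, l_d)`; that argument, the coordinate convention (distances measured from the site boundary to the nearest primer end) and all function names are assumptions for illustration only.

```python
from itertools import product

STRONG_ENDS = {"GG", "GC", "CG", "CC"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")

def reverse_complement(seq):
    return seq.translate(COMPLEMENT)[::-1]

def allowed_primer_pairs(genome, site_start, site_end, distances, lengths):
    """Yield ((d_u, d_d, l_u, l_d), upstream_primer, downstream_primer) for
    every combination whose primers end in GG, GC, CG or CC on the side
    nearest the site. The site occupies genome[site_start:site_end]."""
    for d_u, d_d, l_u, l_d in product(distances, distances, lengths, lengths):
        up_end = site_start - d_u
        up_start = up_end - l_u
        down_start = site_end + d_d
        down_end = down_start + l_d
        if up_start < 0 or down_end > len(genome):
            continue
        upstream = genome[up_start:up_end]
        downstream_top = genome[down_start:down_end]
        # The G/C dinucleotide set is closed under complementation, so the
        # downstream primer can be checked on the top strand.
        if upstream[-2:] in STRONG_ENDS and downstream_top[:2] in STRONG_ENDS:
            yield (d_u, d_d, l_u, l_d), upstream, reverse_complement(downstream_top)

def pick_primers(genome, site_start, site_end, distances, lengths, score):
    """Return the allowed combination that maximizes the supplied score,
    or None if no combination satisfies the end constraint."""
    candidates = allowed_primer_pairs(genome, site_start, site_end,
                                      distances, lengths)
    return max(candidates, key=lambda c: score(*c[0]), default=None)
```

In practice `distances` and `lengths` would cover realistic ranges (for example primer lengths around 20 nt), and `score` would encode whatever trade-offs the original scoring expression captures.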
Chromatin immunoprecipitation (ChIP) was carried out as described previously with the following modifications. One liter of JRY103 (MATα/MATa ade2-1/ADE2 HIS3/his3-11,15 leu2-3,112/leu2-3,112 trp1-1/trp1-1 ura3-1/ura3-1 ash1Δ::LEU2/ash1Δ::LEU2) and JRY118 (MATα/mataΔ::TRP1 ade2-1/ADE2 HIS3/his3-11,15 leu2-3,112/leu2-3,112 trp1-1/trp1-1 ura3-1/ura3-1 ash1Δ::LEU2/ash1Δ::LEU2) cultures were grown to an A600 of 0.5 and treated with 1% formaldehyde for 20 min at RT on a rotating shaker at low speed. Cells were collected and washed twice with cold 1X TBS. Equal volumes of cells were aliquoted into ten 1.5 ml microfuge tubes and washed once with 1.5 ml of cold 1X TBS. The pellets in each tube were resuspended in 400 μl of lysis buffer (50 mM HEPES, pH 7.4, 150 mM NaCl, 1 mM EDTA, 1% Triton X-100, 0.1% Na-Deoxycholate) plus 1 mM PMSF, 1 mM benzamidine, 1X Protease inhibitor cocktail from Roche (Cat No. 1873580) and the manufacturer-recommended concentration of protease inhibitor cocktail from SIGMA (Cat No. P 8215). 200 μl of glass beads were added to each tube and the cells were lysed using a multitube vortexer at full speed for 30 min at 4°C. The lysate was transferred to a new tube, 400 μl of lysis buffer was added, and the mixture was vortexed briefly. The lysates were centrifuged at 12,000 g for 10 min at 4°C and the supernatants were sonicated at 30% output for four 10 sec cycles with intermittent cooling on ice.
The lysates were cleared by centrifugation at 12,000 g for 10 min and 1 mM PMSF was added to the samples. A 1/10th volume aliquot was removed and frozen to be used as total chromatin control. The remaining sample was precleared by the addition of 25 μl recombinant protein G-agarose beads, incubated while nutating for 30 min, and the supernatant was collected after centrifugation at 12,000 g for 5 min. 1 μl of rabbit anti-α2 antiserum (a gift from A. Johnson, UCSF) was added to each supernatant and incubated for 12 h on a nutator at 4°C. To immunoprecipitate α2, 50 μl of recombinant protein G agarose beads (Roche) was added to the samples and nutated for 90 minutes at 4°C. The protein G beads were pelleted, washed once in low salt buffer (0.1% SDS, 1% Triton X-100, 20 mM Tris pH 8.0, 2 mM EDTA and 150 mM NaCl), once in high salt buffer (composition same as low salt + 500 mM NaCl), once in LiCl buffer (0.25 M LiCl, 1% IGEPAL, 1X TE and 1% Na-Deoxycholate) and twice with 1X TE (pH 8.0). The immunoprecipitated DNA was eluted twice with 250 μl of elution buffer (1% SDS and 0.1 M NaHCO3) and the eluates were pooled (500 μl final volume). To this, 20 μl of 5 M NaCl was added and the sample was incubated for 12 h at 65°C. To remove the crosslinks, 10 μl of 0.5 M EDTA, 20 μl of 1 M Tris-HCl, pH 7.5 and 2 μl of proteinase K (10 mg/ml) were added and the sample was incubated for 45 minutes at 45°C. The DNA samples were extracted once with phenol:chloroform:isoamyl alcohol and the DNA was ethanol precipitated, washed once with 70% ethanol and resuspended in 50 μl (IP) or 500 μl (TC) TE.
Purified DNA from the immunoprecipitated samples was subjected to multiplex PCR amplification with primers specific for the STE6 promoter as a positive control for the immunoprecipitation of α2 and the YDL223C ORF as a negative control for nonspecific immunoprecipitation, along with the specific primers for candidate α2-a1 target sites. PCRs were carried out in 50 μl containing 10 pmols of each primer, 0.2 mM dNTPs, 2 mM MgCl2, 1X Eppendorf Taq buffer, 0.5X Taq Master buffer and 2.5 U of Eppendorf Taq polymerase. The amplifications were carried out at 94°C for 1 min and 30 secs, followed by 25 cycles of 94°C for 30 secs, 52°C for 1 min, and 72°C for 30 secs and a final extension step of 7 min at 72°C. The PCR products were separated on 2.5% agarose gels.
Oligonucleotides containing the predicted a1-α2 binding sites from within the ORFs of URB1, PRM8, PRM9, YKL162C and CDC25 and the promoters of COX13, REX2, LSM1, and FMP14 were synthesized, one strand was end-labeled with [γ-32P]-ATP, and then annealed with excess cold complementary oligonucleotide. The HO(10) and HO(8) a1-α2 sites within Upstream Regulatory Sequence 1 (URS1) of the HO promoter were used as strong and weak binding sites respectively. The EMSA was performed as described previously, using a constant 1.4 μM a1 and five-fold titrations of α2 starting at 82 nM in protein dilution buffer (50 mM Tris pH 7.6, 1 mM EDTA, 500 mM NaCl, 10 mM 2-mercaptoethanol, 10 mg/ml bovine serum albumin).
Oligonucleotides containing a1-α2 binding sites were synthesized with 5' overhangs to allow cloning into the XhoI site of pTBA23 (2μ URA3 Ampr), a reporter plasmid containing a CYC1-lacZ fusion. Reporter constructs were transformed into JRY103 and JRY118 and the β-galactosidase activity was measured on three independent transformants, as described previously.
SAGE: Serial Analysis of Gene Expression
ORF: Open Reading Frame
NHEJ: Non-Homologous End Joining
EMSA: Electrophoretic Mobility Shift Assay
PCR: Polymerase Chain Reaction
PMSF: Phenyl Methyl Sulfonyl Fluoride
We thank Alexander Johnson for the rabbit α2 antibody and Yvette Green for help with some of the initial ChIP assays. We also thank Rahul Siddharthan for running PhyloGibbs on the candidate direct target promoter set. JM was supported by a Charles and Johanna Busch predoctoral fellowship. This work was partially supported by a grant from the National Institutes of Health to AKV (GM49265).
This article is published under license to BioMed Central Ltd. This is an open-access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.