BMC Genomics BioMed Central Methodology article A toolbox for epitope-tagging and genome-wide location analysis in

Background: Candida albicans is a diploid pathogenic fungus not yet amenable to routine genetic investigations. Understanding aspects of the regulation of its biological functions and the assembly of its protein complexes would lead to further insight into the biology of this common disease-causing microbial agent.

Results: We have developed a toolbox allowing in vivo protein tagging by PCR-mediated homologous recombination with TAP, HA and MYC tags. The transformation cassettes were designed to accommodate a common set of integration primers. The tagged proteins can be used to perform tandem affinity purification (TAP) or chromatin immunoprecipitation coupled with microarray analysis (ChIP-CHIP). Tandem affinity purification of C. albicans Nop1 revealed the high conservation of the small processome composition in yeasts. Data obtained with in vivo TAP-tagged Tbf1, Cbf1 and Mcm1 recapitulate previously published genome-wide location profiling by ChIP-CHIP. We also designed a new reporter system for in vivo analysis of transcriptional activity of gene loci in C. albicans.

Conclusion: This toolbox provides a basic setup to perform purification of protein complexes and increase the number of annotated transcriptional regulators and genetic circuits in C. albicans.

Published: 2 December 2008. BMC Genomics 2008, 9:578 doi:10.1186/1471-2164-9-578. Received: 5 September 2008. Accepted: 2 December 2008. This article is available from: http://www.biomedcentral.com/1471-2164/9/578. © 2008 Lavoie et al; licensee BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.


Background
Candida albicans is an important human fungal pathogen because of its clinical significance as well as its use as an experimental model for scientific investigation [1]. This opportunistic pathogen is a natural component of the human skin, gastrointestinal and genitourinary flora, but it can sporadically cause a variety of infections. Although many Candida infections are not life-threatening (oral thrush and vaginal candidiasis, for example), immunosuppressed patients can be subjected to potentially lethal systemic infections, and therefore Candida infections are a major public health concern [2,3]. C. albicans can also colonize various biomaterials, and readily forms dense, complex biofilms that are resistant to most antifungal agents. Because of the challenges of drug resistance [4] and the eukaryotic nature of C. albicans that makes it similar to its human host, extensive efforts are underway to identify new drug targets for therapeutic intervention; many of these take advantage of tools from the genomic era [5][6][7]. Because of some of its unique biological features, C. albicans is also becoming attractive for studies of more fundamental aspects of genome maintenance, regulatory biology and morphogenesis [8][9][10].
The diploid nature and the absence of a complete sexual cycle in C. albicans reduce the ease with which genetic manipulations can be achieved [11]. Therefore, biochemical, cell biological and genomic analyses of gene products provide alternate strategies to improve our understanding of this pathogen. In particular, obtaining a better insight into how specific protein complexes assemble in C. albicans would help define targets for drug development. For example, the large-scale definition of protein complexes by tandem affinity purification (TAP), as previously done in S. cerevisiae [12,13], should reveal the architecture of biochemical networks and protein machines specific to C. albicans. This approach would also give insight into the evolution of protein complexes in ascomycetes. As well, the availability of tools allowing comprehensive functional analysis of transcription factors (TFs) on a genome-wide scale could enhance cellular studies. Genome-wide location analysis, or ChIP-CHIP analysis (chromatin immunoprecipitation (ChIP) followed by DNA microarrays (CHIP)), allows the identification of direct targets of a defined TF on a genomic scale [14]. This approach has been comprehensively applied to the model budding yeast S. cerevisiae in order to map its transcriptional regulatory network [15,16]. While first developed in S. cerevisiae, ChIP-CHIP has since been applied to other organisms including C. albicans [17,18]. In addition, C. albicans has recently been established as an interesting model organism for the study of the evolution of transcriptional regulatory networks. In fact, it appears that C. albicans differs significantly in its mode of gene regulation from the well-characterized S. cerevisiae transcriptional circuits. It has recently served as a comparative system for studying the evolution of the mating type control circuit and the evolution of ribosomal protein (RP) regulation [19,20].
Furthermore, changes occurring in the activity of TFs seem to account for part of acquired drug resistance in C. albicans [21,22]. Moreover, several TFs play a critical role in the C. albicans morphological transitions and in biofilm formation [10,23]. Therefore, a detailed understanding of the transcriptional regulation mechanisms in C. albicans would be valuable for basic research purposes as well as for improving our understanding of drug resistance and for aiding in the development of new antifungals.
Here we report the construction of a new set of PCR-based epitope-tagging vectors for C. albicans, which we successfully applied to perform a biochemical characterization of the small processome subunit via Nop1 tandem affinity purification as well as genome-wide location analysis of the model TFs Tbf1, Cbf1 and Mcm1.

Results and discussion
A set of PCR cassettes for protein tagging
The choice of selectable marker genes in our cassettes, URA3, HIS1 and ARG4, was guided by the availability of the C. albicans strains BWP17 and SN76, which carry the corresponding auxotrophic mutations (ura3/ura3; his1/his1; arg4/arg4) [24,25]. Our cassettes permit the use of a single 120-bp primer pair (20 bp of vector sequences and 100 bp from the gene to be tagged) to tag genes with three epitopes (Fig. 1A). In addition, our PCR strategy is compatible with the previously published pFA-XFP-tagging system [26]. With this vector set, the tagged protein is expressed under the control of its own regulatory sequences and chromatin environment and thus at its normal physiological levels, thereby maximizing the biological relevance of biochemical analyses.
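The primer layout described above (100 bp of gene-specific homology joined to 20 bp of vector sequence) can be sketched in a few lines of Python. This is only an illustration of the arithmetic of primer assembly for a C-terminal fusion: the two 20-nt vector sequences below are hypothetical placeholders, not the actual annealing sites of the cassettes, and the function name and its arguments are invented for this example.

```python
# Minimal sketch of 120-nt tagging primer assembly (100 nt homology + 20 nt vector).
# VECTOR_FWD and VECTOR_REV are HYPOTHETICAL placeholders, not the real cassette sites.
VECTOR_FWD = "ACGTACGTACGTACGTACGT"  # placeholder, 20 nt
VECTOR_REV = "TGCATGCATGCATGCATGCA"  # placeholder, 20 nt
HOMOLOGY_LEN = 100

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def design_tagging_primers(locus, stop_codon_start,
                           vector_fwd=VECTOR_FWD, vector_rev=VECTOR_REV):
    """Build primers for a C-terminal fusion.

    locus: coding-strand genomic sequence around the gene's 3' end.
    stop_codon_start: 0-based index of the first base of the stop codon.
    """
    stop_end = stop_codon_start + 3
    if stop_codon_start < HOMOLOGY_LEN or len(locus) - stop_end < HOMOLOGY_LEN:
        raise ValueError("need 100 nt of sequence on each side of the stop codon")
    # Forward primer: last 100 nt of the ORF (stop codon excluded), then vector.
    upstream = locus[stop_codon_start - HOMOLOGY_LEN:stop_codon_start].upper()
    # Reverse primer: reverse complement of the 100 nt after the stop codon, then vector.
    downstream = locus[stop_end:stop_end + HOMOLOGY_LEN]
    forward = upstream + vector_fwd
    reverse = reverse_complement(downstream) + vector_rev
    return forward, reverse


if __name__ == "__main__":
    toy_locus = "A" * 100 + "TAA" + "C" * 100  # artificial sequence for illustration
    fwd, rev = design_tagging_primers(toy_locus, stop_codon_start=100)
    print(len(fwd), len(rev))
```

Because the vector-annealing portion is shared by all cassettes, only the two 100-nt homology arms change from gene to gene, which is why a single primer pair per gene can serve for the TAP, HA and MYC tags.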
The HA and MYC epitope tags have been used to tag proteins in various organisms and for various applications [27,28]. They represent highly immunogenic peptides with little biological activity that can then be used in immunoprecipitation/co-immunoprecipitation experiments to validate protein complex formation. The TAP tag was more recently developed in order to obtain highly purified protein samples for mass spectrometry analysis of protein complexes [29]. Tagged strains were first validated by PCR and subsequently by western blotting (Fig. 2A and 2B), and we confirmed that the tagged constructs are easily immunoprecipitated (data not shown). Doubly tagged strains can also be obtained with combinations of selective markers offered by the BWP17 and the SN76 strain backgrounds. We produced Tbf1-HA/Cbf1-Myc and reciprocally tagged strains (Fig. 2C).

Tandem affinity purification of Nop1
Since its introduction in 1999, the TAP tag has been used in various organisms to perform protein-complex purifications [29]. In S. cerevisiae, it has been used to assess protein-protein interactions on a large scale [12,30]. We performed a standard TAP procedure using TAP-tagged Nop1, a component of the small processome subunit in S. cerevisiae [13]. Mass spectrometric analysis of in-solution digested TCA-precipitated proteins after the TAP procedure identified 18 C. albicans proteins, and in-gel digestion of four SYPRO Ruby-stained SDS-PAGE bands led to the identification of an additional protein (Fig. 2D and Table 1). In total, 19 C. albicans proteins were identified, 16 of which are bona fide components of the S. cerevisiae small sub-unit processome (SSU processome) (Table 1 and Fig. 2E) [13]. In addition, three ribosomal proteins were co-purified with Nop1; ribosomal proteins are common false positives in protein complex purification procedures (Table 1) [31]. We compared the coverage generated through our approach to that of the orthologous Nop1-TAP purification in S. cerevisiae. The C. albicans Nop1-TAP bait retrieved 16 proteins (33%) of the annotated SSU processome subunits while its S. cerevisiae ortholog retrieved 24 (50%) (http://www.thebiogrid.org/SearchResults/summary/32040), so the coverage of the S. cerevisiae set was higher. However, the proportion of core components of the SSU processome retrieved in C. albicans (84%) surpasses the proportion retrieved in S. cerevisiae by all published affinity-capture followed by mass spectrometry studies (37%) [32]. It is likely that the stringency of our protocol accounts for the lower coverage and higher specificity of the set of hits retrieved; modifications of the purification conditions could allow for the desired specificity. Overall, it is apparent that a basic TAP protocol is amenable to C. albicans proteins. This approach will be of great utility in the analysis of function of currently poorly defined proteins and in the study of the evolution of protein complexes in yeasts.

Figure 1. Plasmid constructs for in vivo protein tagging in C. albicans. A) Two-step insertion of tags by 1-PCR and 2-homologous recombination. B) PCR confirmation of tagged constructs. YFG stands for Your Favorite Gene.
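The coverage and specificity percentages quoted for the Nop1 purification can be checked with simple arithmetic. The sketch below assumes that the 84% figure corresponds to 16 SSU processome components out of the 19 proteins identified, and that the annotated SSU processome contains 48 subunits; the latter number is not stated in the text and is only inferred from "24 (50%)". Variable names are illustrative.

```python
# Arithmetic check of the Nop1-TAP coverage and specificity figures.
# ASSUMPTION: 48 annotated SSU processome subunits, inferred from 24 hits = 50%.

def percent(part, whole):
    """Return part/whole as a percentage."""
    return 100.0 * part / whole


identified_ca = 19    # proteins identified in the C. albicans Nop1-TAP
ssu_hits_ca = 16      # of which are SSU processome components
ssu_hits_sc = 24      # SSU processome components retrieved in S. cerevisiae
annotated_ssu = 48    # assumed size of the annotated SSU processome

specificity_ca = percent(ssu_hits_ca, identified_ca)   # share of hits that are SSU components
coverage_ca = percent(ssu_hits_ca, annotated_ssu)      # share of the complex recovered
coverage_sc = percent(ssu_hits_sc, annotated_ssu)

print(f"C. albicans specificity: {specificity_ca:.0f}%")
print(f"C. albicans coverage:    {coverage_ca:.0f}%")
print(f"S. cerevisiae coverage:  {coverage_sc:.0f}%")
```

Under these assumptions the three values round to 84%, 33% and 50%, consistent with the figures given in the text.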

ChIP-CHIP analysis of in vivo tagged transcription factors
One potential application of TAP-tagged TFs is their use in ChIP-CHIP to map their DNA-binding targets in vivo. ChIP-CHIP was therefore performed for three TAP-tagged TFs: Tbf1, Cbf1 and Mcm1. For Tbf1 and Cbf1, the results obtained were compared to previously published data using ectopically expressed Tbf1-HA and Cbf1-HA constructs [20]. The location profiles of TAP-tagged TFs were essentially identical to the previous results and confirmed that the RP regulon is dominated by the Myb transcription factor Tbf1 working in conjunction with Cbf1 (Fig. 3A and 3B) [20]. [...] of the pre-replication complex (Pre-RC). The Mcm1 paralog Arg80 arose after the whole-genome duplication of the yeast lineage and is involved in the regulation of arginine metabolism genes [8]. The overlap of our Mcm1 genome-wide binding data in yeast state cells with a previously published Mcm1 ChIP-CHIP done on tiling microarrays is highly significant (p-value = 2.7E-11), with 198 genes common between studies [8] (Fig. 3D). We confirm that C. albicans Mcm1 occupies the roles of S. cerevisiae Mcm1 and Arg80, since genes from the arginine biosynthesis pathway such as ARG3/4/5-6/8, CAR1 and the zinc transcription factor ARG81 are enriched in our set (p-value = 5.52E-07; Table 2). Genes involved in the biosynthesis of the other basic amino acid, lysine (LYS1/9/21/22 and 144), and in catabolic proline processing (PUT1 and 2) are also enriched in both our gene list and that of Tuch et al. (2008) (Table 2). In addition, we confirm that Mcm1 binds strongly to its own promoter (Table 2). We finally confirm [...] (Fig. 3D). These major differences are likely due to variations between the two independently conducted studies.

Figure 3. Validation of the Cbf1, Tbf1 and Mcm1 TAP-tagged TFs by ChIP-CHIP analysis.
Tuch and collaborators 1) treated their cells with pheromone, 2) used a polyclonal rabbit antiserum raised against an Mcm1 peptide, 3) used signal ratios of Cy5-labelled IP versus Cy3-labelled whole-cell extract prior to performing the IP and 4) used tiling arrays to determine the promoter targets of Mcm1. Our procedure is somewhat different: we 1) used YPD-grown cells, 2) used the IgG bead-protein A interaction to enrich target regions, 3) compared each experimental IP to a mock IP performed in untagged cells and 4) used full-genome arrays with smaller coverage (about two probes/intergenic region). All these experimental changes could affect, to different degrees, the final result of a ChIP-CHIP. First, treating cells with pheromone might dramatically alter Mcm1 binding at some point in the cell cycle and pheromone response elements and deplete it at others (such as the Mcm complex genes). Second, polyclonal anti-Mcm1 antibodies can have cross-specificities that would not be corrected for by using the whole-cell extract as a control instead of a mock IP or an IP performed with a preimmune serum. Third, the different resolutions of the two studies could account for some overlooked targets in our experiments.
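The significance of an overlap between two target gene lists, such as the 198 genes shared between the two Mcm1 studies, is commonly assessed with a hypergeometric test. The sketch below shows one way to compute such a p-value; it is not necessarily the test used to obtain the value reported above, and because the sizes of the two gene lists and of the gene universe are not given in this section, the example uses small toy numbers rather than the study's data.

```python
# Minimal sketch of a hypergeometric overlap test between two gene lists.
# The statistical test actually used in the study is not specified in this section.
from math import comb


def hypergeom_overlap_pvalue(population, list_a, list_b, overlap):
    """Probability of observing at least `overlap` shared genes.

    population: number of genes that could have been called in either study.
    list_a, list_b: sizes of the two target gene lists.
    overlap: number of genes present in both lists.
    """
    upper = min(list_a, list_b)
    total = comb(population, list_b)
    # Sum the upper tail P(X = k) for k = overlap .. min(list_a, list_b).
    tail = sum(
        comb(list_a, k) * comb(population - list_a, list_b - k)
        for k in range(overlap, upper + 1)
    )
    return tail / total


if __name__ == "__main__":
    # Toy numbers for illustration only.
    p = hypergeom_overlap_pvalue(population=10, list_a=4, list_b=3, overlap=2)
    print(f"p-value = {p:.4f}")
```

A small p-value indicates that the shared targets are unlikely to arise by chance, but it does not by itself explain targets found in only one study; the experimental differences listed above remain the more plausible source of those.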
We also performed Mcm1-TAP ChIP-CHIP during the yeast-to-hyphal transition triggered by serum at 37°C. Mcm1 was recruited to a limited set of genes (36) under these conditions (Fig. 3D). Notably, it was enriched in the promoters of ALS3, HWP1 and ECE1 after hyphal induction (see Additional files 1 and 2).
Thus, our genome-wide location data showed significant consistency with previously published results on the three TFs. This suggests that our chromosomally tagged alleles are functional, that the TAP-IgG pull-down protocol is applicable to ChIP-CHIP as previously reported in S. cerevisiae 26, and that this protocol is comparable to a ChIP method based on an anti-HA (for Cbf1 and Tbf1) or rabbit polyclonal (for Mcm1) antibody IP.

In vivo beta-galactosidase reporter assays
The beta-galactosidase reporter system described here has already been applied to the problem of RP gene regulation in Hogues et al. (2008). It allows the integration of reporter constructs in the actual chromatin environment of the gene of interest. Here, we describe its usefulness in the quantitative study of active gene loci. We produced pLYS9-lacZ and pOPI3-lacZ strains, and beta-galactosidase reporter assays were performed with YPD-grown cells. Our results show variable levels of reporter activity across these three promoters (Fig. 4). The multiple cloning site of our pFA-lacZ-URA3 construct allows the easy creation of various cis-regulatory mutants, as exemplified by the RPL11 promoter (pRPL11) [20].

Conclusion
Here, we have presented a new set of tools for functional characterization of C. albicans gene products by biochemical methods. We believe that the availability of such tools will greatly help future understanding of the biology of this important human pathogen.

Construction of chromosomal tagging PCR cassettes
We first adapted the S. cerevisiae PCR TAP-tagging vector pEB1340 [34] for its use in C. albicans by substituting the S. cerevisiae HIS3 marker with the C. albicans URA3, HIS1 and ARG4 genes from the previously published pFA-GFP plasmid series [26] (Fig. 1A). Subcloning of the C. albicans auxotrophic markers was done by ligation of AscI-PmeI fragments in AscI-PmeI digested pEB1340. We then derived triple HA or MYC epitope-tagging vectors in the same pFA plasmid backbones by cloning oligonucleotides encoding the HA or MYC epitope tags and containing XmaI and AscI sites (Table 3) between the XmaI and AscI sites of the pEB1340 plasmid. Auxotrophic markers URA3, HIS1 and ARG4 from plasmids pFA-XFP [26] were then subcloned into the HA and Myc constructs between AscI and PmeI.
The beta-galactosidase reporter was constructed by subcloning a PstI-MluI fragment corresponding to the Streptococcus thermophilus lacZ ORF from plasmid placpoly [35] between the PstI and AscI sites of plasmid pFA-XFP-URA3 [26].

Construction of C. albicans epitope-tagged strains
Cell growth, transformation, and DNA preparation were carried out using standard procedures [26,36]. Transformants were selected on -Ura, -His or -Arg selective plates. Correct integration was verified by PCR, sequencing and finally Western blotting (Fig. 1A). The rate of correct integration was comparable to a previous study using similar conditions and was in the range of 40-80% depending on the gene locus considered [26]. We used this strategy to C-terminally fuse Tbf1, Cbf1 and Mcm1 with a TAP tag and to introduce an HA or a MYC tag to the C-terminus of Tbf1 and Cbf1.

Western blotting
Whole cell extracts were obtained by boiling cells at 2 ODs in loading buffer with 100 mM DTT for 10 minutes. Proteins were then separated on a 10% SDS-PAGE gel and transferred to a PVDF membrane (Millipore). Antibodies were prepared in TBS-0.05% Tween20 5% skim milk powder. A rabbit polyclonal antibody directed against the TAP-tag (Open Biosystems) was used at 0.5 μg/ml while monoclonal antibodies anti-HA (12CA5) and anti-Myc (9E10) were used at 5 μg/ml. HRP-conjugated goat anti-rabbit and anti-mouse secondary antibodies (Santa Cruz) were used at 0.04 μg/ml. The HRP signal was revealed with Immobilon™ HRP substrate (Millipore).

TAP purifications and mass spectrometry
Tandem affinity purifications were performed as described (http://depts.washington.edu/yeastrc/pages/plasmids.html) and the purified proteins were then precipitated with trichloroacetic acid (TCA). For mass spectrometry analysis of the TAP-purified proteins, the digestion was performed using trypsin (Promega) in 50 mM ammonium bicarbonate for 4 hours at 37°C and the digest was dried down. One quarter of the TCA precipitate was loaded on a 10% SDS-PAGE gel. The gel was stained with SYPRO Ruby according to the manufacturer's instructions (Invitrogen). Excised protein bands were processed as described [37]. Samples were resolubilized in 5% acetonitrile 0.2% formic acid and analyzed on an Eksigent nanoLC system coupled to a Thermo LTQ-Orbitrap MS instrument with a home-made C18 pre-column (5 mm × 300 μm) and an analytical column (10 cm × mm × 300 μm i.d. Jupiter 3 μm C18). Sample injection was 10 μl. The digest was first loaded on the pre-column at a flow rate of 4 μl/min and subsequently eluted onto the analytical column using a gradient from 10% to 60% aqueous acetonitrile (0.2% formic acid) over 56 min at 600 nl/min. Database searches were performed against a non-redundant fungal database using Mascot version 2.1 (Matrix Science).

ChIP-CHIP analysis of Tbf1, Cbf1 and Mcm1 by TAP-IgG pull-down in C. albicans
Chromatin immunoprecipitation (ChIP) experiments were performed with chromosomally tagged Tbf1-TAP, Cbf1-TAP and Mcm1-TAP as described [20]. Cells were grown to an optical density at 600 nm of 0.6 in 50 ml of YPD or YPD with 10% FBS for Mcm1-TAP in hyphal state. We followed the ChIP protocol available at http://www.ircm.qc.ca/microsites/francoisrobert/en/317.html with the following exceptions: chromatin was sonicated to an average 300 bp, and 700 μl of whole-cell extract (WCE) were incubated with IgG-Sepharose beads (GE Healthcare). Tagged ChIPs were labeled with Cy5 dye and untagged (mock) ChIPs were labeled with Cy3 dye and were then co-hybridized to our full-genome arrays.
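With the tagged ChIP in the Cy5 channel and the mock ChIP in the Cy3 channel, enrichment at each probe is usually expressed as a log2(Cy5/Cy3) ratio. The sketch below illustrates that calculation with a simple median centering step and an arbitrary cutoff. The normalization actually applied in this work is the one described in [20]; the median centering, the threshold of 1.0 and all function names here are assumptions made for illustration.

```python
# Minimal sketch of two-colour ChIP-chip enrichment scoring.
# ASSUMPTIONS: median centering and a log2 cutoff of 1.0 are illustrative choices,
# not the normalization procedure of the study (see [20]).
from math import log2
from statistics import median


def log_ratios(cy5, cy3):
    """log2(tagged IP / mock IP) for each probe; intensities must be positive."""
    return [log2(tagged / mock) for tagged, mock in zip(cy5, cy3)]


def median_center(values):
    """Shift values so that their median is zero."""
    centre = median(values)
    return [value - centre for value in values]


def enriched_probes(probe_ids, cy5, cy3, threshold=1.0):
    """Return (probe, centered log2 ratio) pairs at or above the threshold."""
    centered = median_center(log_ratios(cy5, cy3))
    return [(pid, ratio) for pid, ratio in zip(probe_ids, centered)
            if ratio >= threshold]


if __name__ == "__main__":
    # Toy intensities for three hypothetical intergenic probes.
    probes = ["probe_1", "probe_2", "probe_3"]
    cy5_signal = [800.0, 100.0, 200.0]   # tagged ChIP
    cy3_signal = [100.0, 100.0, 100.0]   # mock ChIP
    print(enriched_probes(probes, cy5_signal, cy3_signal))
```

Using a mock IP from untagged cells as the reference channel means that regions binding non-specifically to the IgG beads tend to cancel out in the ratio.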
Candida albicans full-genome arrays, hybridization, scanning and normalization
Our C. albicans full-genome microarrays contain single spots of 5,423 intergenic 70-mer oligonucleotide probes combined with 6,394 intragenic 70-mer oligonucleotide probes already in use in our C. albicans ORF microarray [20,38]. We designed the 5,423 probes that correspond to the promoter regions of most of the genes in the C. albicans Genome Assembly 21 by using the same algorithm that was successfully applied to the development of our C. albicans ORF oligonucleotide arrays with added weight provided for regions of high homology among Candida species. Our lab has developed full-genome (ORF and intergenic) arrays for use in location profiling experiments and the DNA microarrays were processed and analysed as previously described [20].