Novel insights into the unfolded protein response using Pichia pastoris specific DNA microarrays
- Alexandra Graf†1,
- Brigitte Gasser†1,
- Martin Dragosits1,
- Michael Sauer2,
- Germán G Leparc3,
- Thomas Tüchler3,
- David P Kreil3 and
- Diethard Mattanovich1, 2
© Graf et al; licensee BioMed Central Ltd. 2008
Received: 08 February 2008
Accepted: 19 August 2008
Published: 19 August 2008
DNA microarrays are regarded as a valuable tool for basic and applied research in microbiology. However, for many industrially important microorganisms the lack of commercially available microarrays still hampers physiological research. For example, our understanding of protein folding and secretion in the yeast Pichia pastoris presently depends largely on conclusions drawn by analogy to Saccharomyces cerevisiae. To close this gap for a yeast species employed for its high capacity to produce heterologous proteins, we developed full genome DNA microarrays for P. pastoris and analyzed the unfolded protein response (UPR) in this yeast species, as compared to S. cerevisiae.
By combining the partially annotated gene list of P. pastoris with de novo gene finding, a list of putative open reading frames was generated, for which an oligonucleotide probe set was designed using the probe design tool TherMODO (a thermodynamic model-based oligoset design optimizer). To evaluate the performance of the novel array design, microarrays carrying the oligo set were hybridized with samples from cultures treated with dithiothreitol (DTT) or from a strain overexpressing the UPR transcription factor HAC1, both compared with a wild type strain in normal medium as untreated control. The DTT treatment data were compared with literature data for S. cerevisiae and revealed similarities, but also important differences, between the two yeast species. Overexpression of HAC1, the most direct control for UPR genes, resulted in significant new understanding of this important regulatory pathway in P. pastoris, and in yeasts generally.
The differences observed between P. pastoris and S. cerevisiae underline the importance of DNA microarrays for industrial production strains. P. pastoris reacts to DTT treatment mainly by the regulation of genes related to chemical stimulus, electron transport and respiration, while the overexpression of HAC1 induced many genes involved in translation, ribosome biogenesis, and organelle biosynthesis, indicating that the regulatory events triggered by DTT treatment only partially overlap with the reactions to overexpression of HAC1. The high reproducibility of the results achieved with two different oligo sets is a good indication for their robustness, and underlines the importance of less stringent selection of regulated features, in order to avoid a large number of false negative results.
Transcriptomics, the parallel quantification of many or all transcripts of an organism under given conditions, has become a favorite tool for basic research. Messenger RNA regulation patterns of model organisms under many different conditions have become available in recent years. However, these methods are still not applicable to many industrially important organisms, mainly due to the lack of DNA microarrays targeting them. A typical example is the yeast Pichia pastoris, which is widely applied for the production of recombinant proteins. Several approaches have been taken to derive transcriptomic data without specific microarrays. Sauer et al. applied heterologous hybridization of P. pastoris samples to Saccharomyces cerevisiae microarrays. Alternative methodological concepts like Transcript Analysis with the Aid of Affinity Capture (TRAC) may be applied preferentially to subsets of the transcriptome, provided that genome sequence data are available. If this is not the case, total cDNA may be utilized as a source of probes, either by applying expressed sequence tags to microarrays or by employing RNA fingerprinting such as cDNA-amplified fragment length polymorphism (cDNA-AFLP), which has recently been applied to Trichoderma reesei. These unannotated methods have the disadvantage that specific hits can only be identified after sequencing their respective probes.
Therefore, oligonucleotide microarrays have become the method of choice for many applications, although their design depends on the availability of a genomic sequence with good gene identification and annotation. The genome sequence of P. pastoris has not been published yet. The data available from Integrated Genomics (IG, Chicago, IL, USA) contain a partial gene identification and annotation, so additional effort in this direction was a necessary first step towards the development of comprehensive DNA microarrays for this yeast species. A wide choice of computational gene finders is currently available; they can be classified into intrinsic and extrinsic prediction programs. Intrinsic, or de novo, gene finders use only information from the sequences to be studied, building statistical models to distinguish between coding and non-coding regions of the genome on the basis of biological sequence patterns [9–11]. Extrinsic gene finders use homology search to determine where protein coding regions are located in the genome. Their applicability is therefore limited to organisms that have correctly annotated homologs in current databases. Because of this limitation, it is common to integrate homology search with de novo prediction. Most state-of-the-art gene finders use a form of Hidden Markov Model (HMM), differing in the implementation and complexity of the model as well as in the ease with which users can adapt the application to their needs.
It is well known that cross-hybridization can confound microarray results, rendering good probe design an essential requirement for accurate microarray analyses. The specificity of oligonucleotides is determined by the Gibbs free energy (ΔG) of the hybridization reaction between potential binding partners. Highly specific probes will bind their target transcript much more strongly than any other transcript. Considering that microarray experiments are non-equilibrium measurements, it is desirable that microarray probes exhibit uniform thermodynamic properties, which many probe design tools aim to achieve by demanding a narrow distribution of the probe-target melting temperature Tm. Ideally, probes should have a uniform binding free energy at the hybridization temperature Thyb.
Previous studies have demonstrated that industrial production strains may behave quite differently from laboratory strains and model organisms, which emphasizes the importance of analytical tools for industrially relevant strains and species. As an example, the unfolded protein response (UPR), a regulation circuit of high relevance for heterologous protein production in eukaryotic cells, has been shown to be regulated differently in P. pastoris compared to S. cerevisiae, which is the typical model species for hemiascomycete yeasts. The development of specific microarrays for P. pastoris was intended to allow a detailed analysis of UPR regulation in P. pastoris. Since in previous transcriptomics work with S. cerevisiae the UPR was induced by addition of either dithiothreitol (DTT) or tunicamycin, this work aimed at a comparison of DTT induced gene regulation in P. pastoris to that in S. cerevisiae published by Travers et al. Finally, we aimed at comparing DTT induced regulation with the regulatory response to overexpression of HAC1, the transcription factor controlling the UPR. The transcriptional response to HAC1 overexpression has not been studied in yeasts so far, so we expected valuable data to better define the core UPR regulated transcriptome.
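The link between binding free energy and hybridization temperature can be illustrated with the standard thermodynamic relation ΔG(T) = ΔH − TΔS. The following is a minimal sketch, not the model used by any of the probe design tools discussed here: the function name and the enthalpy and entropy values are hypothetical, and only the hybridization temperature of 65°C is taken from the protocol used in this work.

```python
def gibbs_free_energy(delta_h_kcal, delta_s_cal_per_k, temp_celsius):
    """Return the duplex free energy in kcal/mol at the given temperature.

    Uses dG = dH - T * dS, with T in kelvin and dS given in cal/(mol K).
    """
    temp_kelvin = temp_celsius + 273.15
    return delta_h_kcal - temp_kelvin * delta_s_cal_per_k / 1000.0


# Hypothetical probes: (dH in kcal/mol, dS in cal/(mol K)). Illustrative values only.
probes = {"probe_a": (-480.0, -1300.0), "probe_b": (-450.0, -1220.0)}
T_HYB = 65.0  # hybridization temperature in degrees Celsius

free_energies = {
    name: gibbs_free_energy(dh, ds, T_HYB) for name, (dh, ds) in probes.items()
}
# A uniform probe set would keep this spread small.
spread = max(free_energies.values()) - min(free_energies.values())
```

Two probes with similar melting temperatures can still differ in ΔG at Thyb, which is why a design criterion based on ΔG at the hybridization temperature is the stricter one.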
Results and Discussion
Gene prediction and Oligo Design
[Table: Comparison of gene finder performance on yeast genomic sequence data. Column: Positive prediction value (%).]
In a WU-BLASTN search against S. cerevisiae, 6,374 sequences predicted by GeneMark and 3,964 of the IG predictions produced hits, using an E value (expectation value) of < 10⁻⁴, a hit length > 100 nucleotides and an identity of > 50% as thresholds. To reduce the redundancy within the data set, the predicted genes were clustered into groups sharing more than 90% similarity using cd-hit. From a total of 31,896 candidate sequences (GeneMark and IG predictions), 22,020 cd-hit groups were obtained. The cluster file showed that some of the clusters had to be analyzed further before target sequences could be selected for the oligo design. After the removal of all sequences that had a short length and a low prediction value, complex clusters were defined as clusters for which the minimum relative length of all sequences was smaller than 0.9. A total of 2,612 clusters fell into this category and were excluded at the first design stage.
Finally, 19,508 predicted target sequences remained to be tested in the first microarray experiments. OligoArray 2.1 was able to design oligonucleotide probes for 17,161 sequences, with probes ranging in length from 57 to 60 nucleotides.
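The two filters described in this paragraph can be sketched as follows. This is an illustrative sketch only: the class and function names are hypothetical, and "relative length" is assumed here to mean the length of a sequence divided by the length of the longest sequence in its cluster, which the text does not state explicitly.

```python
from dataclasses import dataclass


@dataclass
class BlastHit:
    query: str
    e_value: float
    hit_length: int   # nucleotides
    identity: float   # percent


def passes_thresholds(hit, max_e=1e-4, min_length=100, min_identity=50.0):
    """Apply the E value, hit length and identity thresholds from the text."""
    return (
        hit.e_value < max_e
        and hit.hit_length > min_length
        and hit.identity > min_identity
    )


def is_complex_cluster(lengths, min_relative_length=0.9):
    """Flag a cluster whose shortest member is below 0.9 of its longest member.

    Assumption: relative length = sequence length / longest length in the cluster.
    """
    return min(lengths) / max(lengths) < min_relative_length


# Hypothetical example data.
hits = [
    BlastHit("pred_0001", 1e-10, 250, 72.0),
    BlastHit("pred_0002", 1e-3, 300, 80.0),   # fails the E value threshold
]
kept = [h.query for h in hits if passes_thresholds(h)]
```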
Validation arrays for the first list of predicted transcript sequences (Same-Same experiment)
With these probes 4 × 44 K slides were produced on the Agilent microarray platform and employed for an initial validation of the predicted transcript sequences by hybridization with the Pool samples of P. pastoris (for preparation of Pool samples see Material and Methods). One slide had to be discarded because of quality issues. For the remaining 12 arrays the number of probes showing a signal varied between 10,708 and 15,598. Of these, 7,980 had a signal on all 12 arrays, and only 951 probes showed no hybridization on all 12 arrays.
Second, curated list of predicted target sequences and second oligo design
The results of the initial validation arrays were used to adapt the list of predicted genes, keeping all predictions for which a hybridization signal was observed on all arrays, all predictions with significant sequence similarity to annotated genes, and all sequences with an average gene prediction score > 0.5. This approach allows for the fact that not all genes will have been actively expressed in the target samples. Additionally, predicted transcripts resulting from a subsequent analysis of the complex clusters were included at this stage. Of the 2,612 complex clusters that were not included in the design for the first batch of arrays, only 223 contained more than 2 sequences, and for a further 14 no subsequence match of at least 60 nucleotides could be found within the last 1000 bases at the 3'-end. These 237 clusters were manually curated, while the rest could be automatically reduced to one sequence. To make full use of the 15,208 features available on the Agilent microarray platform, it was decided to also include predicted sequences with a somewhat lower gene prediction score that showed a hybridization signal in at least 8 of the 12 arrays. Finally, a selected set of 15,253 predicted transcript sequences was used as targets for probe design of a comprehensive P. pastoris microarray. While this list is clearly larger than the expected number of open reading frames (6,000–7,000), as judged by comparison with other yeast species, we intentionally included more putative transcript sequences: false positives with a distinct sequence will not negatively affect microarray design or experiments, whereas falsely excluding a potential transcript target would be damaging.
Oligonucleotide probes were designed using a probe design tool developed in-house, a thermodynamic model-based oligoset optimizer ('TherMODO'). TherMODO designed probes for 15,035 sequences, of which only 665 were predicted as having cross-hybridization potential. The TherMODO design was compared to probe design with eArray. The distributions of ΔG and Tm of both designs are shown in Additional file 1. The TherMODO designed probes are clearly more uniform with respect to the Gibbs free energy ΔG, indicating a superior hybridization performance.
Biological evaluation of the new microarrays
The performance of the new arrays was examined by a hybridization experiment using samples for which transcript regulation data had been obtained before. The biological question evaluated was the regulatory response of P. pastoris to constitutive overexpression of the active form of S. cerevisiae HAC1, the transcription factor controlling UPR target genes. By this approach, the regulation of 52 genes which had been studied before using TRAC could be verified, with 80% of these genes showing the same regulation pattern for both methods (genes highlighted in bold in Additional file 2). This correlation is statistically significant based on a regression analysis (p = 8.8 × 10⁻⁶).
Comparison of UPR induction by DTT in P. pastoris and S. cerevisiae
In order to compare the effects of DTT treatment in S. cerevisiae with those in P. pastoris, the data published by Travers et al. for 60 min treatment of S. cerevisiae with DTT were evaluated alongside our results for P. pastoris. All S. cerevisiae genes which were listed in that study and for which homologs in P. pastoris were identified were classified as upregulated, downregulated or unregulated. In order to compare the two data sets, a cutoff of 1.5 fold differential expression was set in both to define regulated genes. A significance threshold on p-values could not be employed, as these data were not provided for S. cerevisiae. Of these genes, 48% showed the same classification (regulated or unregulated) in P. pastoris as in S. cerevisiae.
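The classification and comparison described above can be sketched as follows. This is a minimal illustration, assuming fold changes are available as plain ratios per gene; the function names, gene identifiers and values are hypothetical placeholders, not data from either study.

```python
def classify(fold_change, cutoff=1.5):
    """Classify a gene by fold change relative to the untreated control."""
    if fold_change > cutoff:
        return "up"
    if fold_change < 1.0 / cutoff:
        return "down"
    return "unregulated"


def count_similar(fc_species_a, fc_species_b, cutoff=1.5):
    """Return (number of genes with the same class, number of shared genes)."""
    shared = sorted(fc_species_a.keys() & fc_species_b.keys())
    same = sum(
        classify(fc_species_a[g], cutoff) == classify(fc_species_b[g], cutoff)
        for g in shared
    )
    return same, len(shared)


# Hypothetical fold changes for three homologous genes.
fc_a = {"gene1": 2.3, "gene2": 0.5, "gene3": 1.1}
fc_b = {"gene1": 1.8, "gene2": 1.2, "gene3": 0.9}

same, total = count_similar(fc_a, fc_b)
percent_similar = 100.0 * same / total
```

Because no p-values enter the comparison, small fold changes close to the cutoff can flip a gene between classes; the resulting percentage should be read with that in mind.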
[Table: Similarity of gene regulation between P. pastoris and S. cerevisiae upon DTT treatment. Columns: No. of similarly regulated/total; % similar regulation. Rows: Core oligosaccharide synthesis; Disulfide bond formation; Fatty acid metabolism; Vacuolar Protein Sorting; Cell Wall Biogenesis.]
Overexpression of Hac1 triggers a different regulation pattern compared to DTT treatment
In most previous studies of the UPR in lower eukaryotic cells, treatment with DTT or tunicamycin, or heterologous protein expression, has been employed to trigger the UPR. This study clearly indicates that the set of regulatory events triggered by DTT treatment only partially overlaps with the reactions to constitutive expression of the activated form of the UPR transcription factor Hac1 (see Figures 2 and 3). Interestingly, in both treatments as many genes were down-regulated as up-regulated, a fact that has been neglected to some extent in the existing literature.
Those genes appearing beyond the threshold (p-value < 0.05 and FC > 1.5) were subjected to a more detailed comparison between the effects of DTT treatment and Hac1 induced regulation. The relative numbers of up- and downregulated genes in each GO biological process term, based on the SGD GO slim tool, are depicted in Figure 4.
A pattern common to both treatments is the down-regulation of major metabolic processes like carbohydrate, amino acid and lipid metabolism, as well as that of vitamins, cofactors and aromatic and heterocyclic compounds. This indicates that the UPR substantially decreases both catabolic and anabolic processes. On the other hand, both treatments lead to up-regulation of protein folding and vesicular transport. These effects are in line with the published literature, indicating the cellular reaction towards alleviation of the UPR [4, 25, 26, 17].
As expected, the genes coding for classical UPR targets are induced both in Hac1 overproducing and in DTT stressed cells, and genes underlined in the following paragraphs have been identified as UPR targets in previous studies. In particular, the ER folding catalysts PDI1 and ERO1, the DnaJ homologs JEM1 and SCJ1, the ER resident chaperones CNE1 (calnexin), KAR2/BiP and LHS1, and the mitochondrial chaperones HSP60 and SSC1 are significantly up-regulated in both conditions. Within the functional group 'protein modification', the majority of up-regulated genes belong to core oligosaccharide synthesis (DPM1, DIE2), the oligosaccharyltransferase complex (OST1, OST2, OST3, SWP1, STT3, WBP1), glycoprotein processing (ALG2, ALG7, SEC53), GPI anchor biosynthesis (GPI2, GPI14, PSA1) and Golgi/O-linked glycosylation (PMT1, PMT2, PMT4, PMT6). Besides these, several genes coding for the translocon pore complex (SEC61, SEC62, SEC63, SEC72, SSS1), which aid the translocation of nascent polypeptides into the ER, are induced. Higashio and Kohno describe the stimulation of ER-to-Golgi transport through the UPR by inducing COPII vesicle formation. In this context, we see SEC23, SEC24, SFB2, YIP3, and ERV2 upregulated. However, genes for proteins building the COPI coatomer, which are required for retrograde Golgi-to-ER transport, also show increased transcription levels upon ER stress in our experiments (COP1, RET2, SEC21, SEC27).
While we cannot give any information on ERAD regulation, as HRD1 is the only annotated gene of this protein degradation process (up-regulated in the Hac1 strain), we observed the down-regulation of some components involved in the assembly of the 20S core of the 26S proteasome (ADD66, PRE1, PRE4, SCL1) and of ubiquitin UBI4 upon constitutive UPR activation. In this context, Shaffer et al. describe reduced degradation of newly synthesized proteins in XBP1-overexpressing human Raji cells.
Induction of genes encoding cytosolic chaperones (Cns1, Jjj3, Hsp82, Ssa1, Ssa2, Sse1, Ydj1, Zuo1) can only be seen in the Hac1-overproducing strain. Additionally, the ER-resident Pdi homolog Mpd1 and two members of the PPIases (FPR4 and CPR6) are only up-regulated in the engineered strain, but not upon DTT addition.
One of the most striking patterns is the significant up-regulation of a large number of genes with functions in ribosomal biogenesis (233 genes assigned to the GO-categories 'ribosome biogenesis and assembly' and 'RNA metabolic process'). Most of these genes contribute to rRNA processing (RRP family) and ribosome subunit nuclear export and assembly, while the ribosomal proteins (RPS and RPL families) themselves are not among the regulated genes for P. pastoris (see Additional file 2). No genes with a function in mRNA decay show increased transcription levels. The induction of the above functional categories came as a surprise, as translational down-regulation of proteins involved in ribosomal biogenesis was recently reported when S. cerevisiae cells were treated with DTT. In contrast, the transcription levels of 9 out of the 16 mRNAs listed by these authors are enhanced in our study. Transcriptional down-regulation of ribosomal proteins during ER stress conditions was also revealed when reanalysing the raw data provided by Travers et al. However, Shaffer et al. describe an increase in total protein synthesis as well as in the number of assembled ribosomes upon the overexpression of the mammalian Hac1 homolog XBP1 in Raji cells, but did not observe upregulation of genes related to ribosome biogenesis. A similar effect was observed after XBP1 overexpression in CHO-K1 cells. These results may be an indication that the positive effect of overexpression of the UPR transcription factor on heterologous protein production [33, 34, 16, 35] results not just from stimulation of folding and secretion of proteins but also from stimulation of their synthesis. The induction of protein folding related genes upon Hac1 overexpression is in line with the literature on UPR effects, while an impact on organelle biosynthesis other than ER and Golgi has so far only been described for mammalian cells.
The stimulatory effects of XBP1 induction on ribosomes and organelle synthesis in mammalian cells like lymphocytes have been attributed to their function as dedicated protein factories. On the other hand the UPR in lower eukaryotes should rather serve to alleviate the load of unfolded, aggregation prone protein. It will be of interest in the future to investigate whether Hac1 stimulates ribosome biogenesis in other yeasts and fungi as well, and whether this leads to increased translation.
In this context, it is worthwhile to mention the induction of two pathways leading to the unusual post-translationally modified amino acid derivatives diphthamide and hypusine which are exclusively found in eukaryotic translation elongation factors 2 (eEF2) and 5 (eEF5), respectively [36, 37]. As these biosynthetic pathways are rather complex, and outstanding in the otherwise downregulated group of 'amino acid biosynthesis', this induction underlines the increased demand for protein synthesis.
Furthermore, we observe that ER stress leads to increased transcription of genes coding for the large and small subunits of the mitochondrial ribosomes (MRPS, RSM and MRPL families), mitochondrial translation initiation and elongation factors (IFM1, MEF1, MEF2) and mitochondrial DNA polymerase (MIP1). Several essential constituents of the mitochondrial inner membrane presequence translocase (TIM family) are also up-regulated, indicating an increased need for protein import into the mitochondria. Similarly, XBP1 was shown to increase mitochondrial mass and function in two types of mammalian cells.
While previous studies analysing UPR regulation mainly focus on up-regulated genes, more than half of the genes identified as regulated in our study are strongly down-regulated (at least 1.5 fold). As can be seen in Figure 4, anabolic processes such as vitamin production, amino acid and aromatic compound biosynthesis, heterocycle metabolic processes, and carbohydrate, lipid and cofactor metabolism are among the most prominent repressed classes in both DTT-treated and Hac1-overproducing cells. The down-regulation of energy consuming biosynthetic pathways emerges as a general picture during ER stress conditions. However, the response to the folding perturbation agent DTT clearly differs strongly from constitutive UPR induction by Hac1-overproduction. In particular, the prominent down-regulation of genes belonging to 'electron transport' and 'cellular respiration' can easily be explained by the strong reducing capacity of DTT. Prominent members of the mitochondrial inner membrane electron transport chain such as subunits of the cytochrome c oxidase (COX4, COX5A, COX13) and the ubiquinol cytochrome-c reductase complex (COR1, QCR6, QCR7, QCR9, RIP1) are significantly repressed upon DTT treatment. Additionally, cytochrome c (CYC1), cytochrome c1 (CYT1) and cytochrome c heme lyase (CYC3) are repressed only upon DTT treatment (GO: 'generation of precursor metabolites and energy'). The reducing properties of DTT are most probably also the reason for the up-regulation of genes involved in the maintenance of 'cellular homeostasis', and the addition of DTT clearly provokes a 'response to a chemical stimulus'.
Genes of the 'protein modification' group that are down-regulated in both the Hac1 and the DTT data sets comprise mainly protein kinases (CDC5, CDH1, DBF2) and components of the ubiquitinylation complex (BUL1, CUL3) involved in cell cycle regulation driving the cells towards mitotic exit (CDC5, CDH1, MOB1). These effects are even more pronounced in the Hac1-strain, where several more histone modifying enzymes as well as cyclin-dependent protein kinases and components of the protein kinase C signalling pathway show reduced transcription levels compared to the wild type. In contrast to what has been reported for the filamentous fungi T. reesei and A. nidulans, genes encoding the histones H2A, H2B, H3 and H4 appear to be down-regulated upon secretion stress in P. pastoris.
No clear picture emerges regarding the regulation of 'lipid metabolism': While sterol and ergosterol biosynthesis tend to be inhibited, the production of sphingolipid precursor substances is enhanced. On the other hand, a down-regulation of the major cell wall constituents (β-1,3 glucanases BGL2 and EXG1, cell wall mannoproteins CCW12, CWP2 and TPI1, GPI-glycoproteins GAS1 and SED1, PST1) and genes coding for proteins required for the transport of cell wall components to the cell surface (SBE22) is manifest. Taken together, these results indicate a significant remodelling process regarding the P. pastoris cell envelope during ER stress conditions.
Interestingly, the major groups of metabolic genes were down-regulated upon Hac1 overexpression, indicating a decrease of the supply of metabolites. However, it should be noted that no reduction of the specific growth rate was observed as compared to the wild type strain (μ = 0.37 and 0.39 h⁻¹, respectively). A reduction of metabolic processes, and amino acid synthesis in particular, is contradictory to translation stimulation. Further research will be needed to elucidate the overall regulatory pattern of UPR with respect to protein synthesis.
Additional gene finding and annotation added to the available data for P. pastoris led to a list of approximately 4,000 genes with a putative identification of their function, and 11,000 more potential open reading frames. An oligonucleotide probe set was designed, the hybridization results were evaluated for reproducibility, and results from a biologically relevant analysis were tested for meaningfulness. In a direct comparison to S. cerevisiae employing DTT treatment for UPR induction, 45 out of 93 genes reacted similarly. The differences thus observed between P. pastoris and S. cerevisiae underline the importance of DNA microarrays for industrial production strains. HAC1 overexpression in P. pastoris leads to induction of many genes involved in translation: most genes of ribosome biogenesis, as well as many related to RNA metabolism and translation, were up-regulated, an effect that has not been observed in yeasts and filamentous fungi so far.
The upregulation of ribosomal biogenesis, RNA metabolism, translation, and organelle biosynthesis is specific for HAC1 overexpression and not observed with DTT treatment, while the latter leads specifically to the upregulation of genes related to chemical stimulus, and the downregulation in the groups electron transport and respiration, so that these reactions have to be regarded as specific for the treatment with a reducing agent rather than UPR regulated.
Gene Prediction and Sequence Selection
Gene prediction and the selection of sequences for oligonucleotide probes were based on sequenced contigs of the P. pastoris genome including predictions of protein coding genes, available through Integrated Genomics. The number of predicted genes was 5,425, of which 3,680 had an assigned function. The ORFs were made up of experimentally identified genes, as well as ORFs predicted by a proprietary gene finder.
To validate and possibly improve these predictions, de novo gene finding was conducted. First, three de novo gene finders (GeneMark, Glimmer3, GlimmerHMM) were tested on the genome sequence of S. cerevisiae (data from BioMart) to evaluate their performance on yeast genomes. As described in Results and Discussion, GeneMark was selected for further gene prediction on the P. pastoris genome sequence. To run the gene prediction it was necessary to train GeneMark on S. cerevisiae by building a matrix with transition probabilities for coding and non-coding regions used by the Hidden Markov Model (HMM) of the program. With the amount of data available we were able to generate a matrix of the 7th order. The genes of P. pastoris were predicted using the S. cerevisiae matrix and the lowest possible probability score cut-off (t = 0.05). In the initial stage of the microarray design the aim was to predict as many putative ORFs as possible. In this context a higher false positive rate was accepted in order to keep the false negative rate as low as possible.
The predicted sequences were merged with data from IG and clustered by running cd-hit with a similarity cut-off of 90%. For all of the resulting sequences a BLASTX search was done against S. cerevisiae using WU-BLAST. BLAST data were further filtered for length (cutoff 55 bp) and low prediction score. Clusters comprising more than one gene were represented by the longest sequence, or curated manually, if appropriate.
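To illustrate what a matrix of transition probabilities of a given order means, the sketch below estimates P(next base | preceding k bases) from training sequences. It is a simplified, homogeneous Markov chain written for illustration and is not GeneMark's actual training procedure; the function name and the toy sequence are hypothetical.

```python
from collections import Counter, defaultdict


def train_markov_model(sequences, order):
    """Estimate P(next base | preceding `order` bases) from training sequences."""
    counts = defaultdict(Counter)
    for seq in sequences:
        for i in range(order, len(seq)):
            context = seq[i - order:i]
            counts[context][seq[i]] += 1
    model = {}
    for context, next_counts in counts.items():
        total = sum(next_counts.values())
        model[context] = {base: n / total for base, n in next_counts.items()}
    return model


# Toy training sequence (hypothetical); order 2 is used here for readability.
model = train_markov_model(["ATGATGATC"], order=2)
```

A 7th order model conditions on 4^7 = 16,384 possible contexts, so each context needs enough observations in the training data; this is why the amount of available sequence limits the achievable order.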
From this first gene list (PpaV1) microarrays were analyzed as described below. Spots with a positive signal were determined using the mean plus one standard deviation of the negative control probes as a cut-off. Sequences were selected if they were positive in at least 8 out of 12 arrays. This criterion was chosen to fill the array capacity. Additionally all sequences with a probability score higher than 0.5 or having an annotation were kept for the second set of sequences (PpaV2).
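The selection rules for the second sequence set can be sketched as follows. This is a minimal illustration under stated assumptions: the function names and example numbers are hypothetical, and the sample standard deviation is used because the text does not specify which estimator was applied.

```python
from statistics import mean, stdev


def detection_cutoff(negative_control_signals):
    """Mean plus one standard deviation of the negative control probes."""
    return mean(negative_control_signals) + stdev(negative_control_signals)


def count_positive_arrays(signals, cutoffs):
    """Count arrays on which a probe's signal exceeds that array's cutoff."""
    return sum(signal > cutoff for signal, cutoff in zip(signals, cutoffs))


def keep_for_second_set(n_positive, score, annotated, min_arrays=8, min_score=0.5):
    """Keep a sequence if it is detected often enough, scores well, or is annotated."""
    return n_positive >= min_arrays or score > min_score or annotated


# Hypothetical negative control intensities for one array.
cutoff = detection_cutoff([10, 12, 14])
```

A sequence detected on only a few arrays is therefore still retained when its gene prediction score or annotation supports it, which matches the stated aim of keeping false negatives low.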
For the PpaV2 sequence set the program cd-hit-est was used to find all ORFs that had a global identity of > 80% with S. cerevisiae. WU-BLASTX and WU-TBLASTN searches were conducted against S. cerevisiae, using a low complexity filter and E < 10⁻⁷. For all the sequences that did not have a match with S. cerevisiae under these conditions the two BLAST searches were repeated against the SwissProt/TrEMBL database. A Perl script was developed to summarize and compare the BLAST results.
Oligo Design and Array platform
Oligos for the PpaV1 sequences were designed with the program OligoArray 2.1 to match the melting temperature distribution of Agilent's S. cerevisiae oligos on the Yeast Oligo Microarray (V2), design number 013384.
The oligo-set for the PpaV2 sequence set was designed using the thermodynamic model-based oligoset optimizer 'TherMODO'. This tool incorporates advanced quantitative models for probe-target binding region accessibility and position-dependent target labelling efficiency, and replaces the common greedy search algorithm by a global set optimization step, achieving high discrimination power for particularly uniform probe sets. Probes for Agilent arrays are limited to a maximum length of 60 nucleotides by the manufacturing process. For increased flexibility in the probe design, the oligoset design optimization considered probes ranging in length from 57 to 60 nucleotides.
These arrays were produced on Agilent 60 mer oligonucleotide high density arrays 4 × 44 K (with 42,034 available features) for PpaV1 and 8 × 15 K (with 15,208 available features) for PpaV2.
For the first batch of arrays a same-same design was used, employing six replicates each of Pool 1 and of Pool 2. The aim of this experiment was to determine which of the probes hybridize to P. pastoris targets. For the second batch of arrays a two-state comparison set up was chosen with 6 replicates for each experiment of which 3 were dye swapped.
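The benefit of the dye-swap replicates can be shown with a small sketch. This is a simplified illustration, not the linear model fitted in the actual analysis: it assumes an additive, array-independent dye bias on the log2 ratio, and all names and numbers are hypothetical.

```python
def orient_log_ratios(log_ratios, dye_swapped):
    """Flip the sign of log ratios from dye-swapped arrays so that all
    values are expressed as treatment versus control."""
    return [-m if swapped else m for m, swapped in zip(log_ratios, dye_swapped)]


def mean_log_ratio(log_ratios, dye_swapped):
    """Average the oriented log ratios across replicate arrays."""
    oriented = orient_log_ratios(log_ratios, dye_swapped)
    return sum(oriented) / len(oriented)


# Hypothetical gene: true log2 effect of 1.0 and an additive dye bias of 0.2.
TRUE_EFFECT = 1.0
DYE_BIAS = 0.2
forward = [TRUE_EFFECT + DYE_BIAS] * 3    # treatment labelled with the first dye
swapped = [-TRUE_EFFECT + DYE_BIAS] * 3   # labels exchanged, bias keeps its sign

estimate = mean_log_ratio(forward + swapped, [False] * 3 + [True] * 3)
```

With equal numbers of forward and swapped arrays the bias term cancels in the average, so the estimate recovers the assumed effect.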
Strains und Cultures
For the first batch of arrays the aim was to determine which of the predicted probes hybridize with targets from P. pastoris. To make sure that many genes were active it was important to pool samples from various conditions of the cells. Samples were taken from two different P. pastoris strains, X-33 and CBS2612, grown on different media and taken at both exponential and stationary growth phase. The media were YP Medium (1% yeast extract, 2% peptone and either 2% glucose, 2% glycerol or 0.5% methanol as carbon source), Buffered Minimal Medium (1.34% yeast nitrogen base, 4 × 10-5% biotin, 100 mM potassium phosphate pH 6.0 and either 2% glucose, 2% glycerol or 0.5% methanol as carbon source), and Buffered Minimal Medium described above supplemented with amino acids (0.005% of L-glutamic acid, L-methionine, L-lysine, L-leucine and L-isoleucine). The samples were combined into two pools with Pool 1 containing 18 samples from the exponential growth phase and Pool 2 containing 18 samples from the stationary phase. Both pools additionally contained seven chemostat samples of the strain X-33 3H6Fab, grown as in .
For the UPR experiments, strains GS115 HAC1, constitutively overproducing the activated form of S. cerevisiae Hac1, as described in Gasser et al. [33, 4], as well as GS115 transformed with the empty vector pGAPHIS (a histidine prototrophic isogenic strain of GS115) were cultivated in YPD (YP as above with glucose) at 28°C. After growing the cultures to an OD600 = 5.7, dithiothreitol (2.5 mM) was added where appropriate. After 1 more hour of cultivation, 1 ml culture was added to 0.5 ml precooled phenol solution (5% in absolute ethanol) and centrifuged immediately for 30 sec at 13.000 rpm. After discarding the supernatants the pellets were frozen at -80°C.
All samples were resuspended in 1 mL TRI Reagent (Sigma). After addition of 500 μL glass beads, cells were disrupted in a Thermo Savant Fastprep FP120 Ribolyzer by two treatments of 20 sec at 6.5 m s⁻¹. RNA was extracted with chloroform, precipitated with isopropanol, washed with 75% ethanol and dissolved in diethylpyrocarbonate-treated water. The extracted RNAs were quantified by absorbance at 260 and 280 nm. The quality of the RNA samples was verified with the Agilent Bioanalyzer 2100 and the RNA 6000 Nano Assay kit (Agilent Technologies, California).
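As a worked illustration of quantification by absorbance, the following sketch converts an A260 reading into an RNA concentration and computes the A260/A280 purity ratio. The conversion factor of 40 μg/mL per absorbance unit at a 1 cm path length is the usual convention for single-stranded RNA and is an assumption here, as the text does not state which factor was used; the function names and readings are hypothetical.

```python
def rna_concentration_ug_per_ml(a260, dilution_factor=1.0, path_length_cm=1.0):
    """Estimate RNA concentration from absorbance at 260 nm.

    Assumes the conventional factor of 40 ug/mL per A260 unit for RNA
    at a 1 cm path length (not stated in the text).
    """
    if path_length_cm <= 0:
        raise ValueError("path length must be positive")
    return a260 * 40.0 * dilution_factor / path_length_cm

def purity_ratio(a260, a280):
    """Return the A260/A280 ratio, a rough indicator of protein contamination."""
    if a280 <= 0:
        raise ValueError("A280 must be positive")
    return a260 / a280

# Hypothetical reading of a 1:50 diluted sample.
concentration = rna_concentration_ug_per_ml(0.5, dilution_factor=50)
ratio = purity_ratio(0.5, 0.25)
print(concentration, ratio)
```

A ratio close to 2.0 is generally taken to indicate RNA that is largely free of protein; integrity still has to be checked separately, as was done here with the Bioanalyzer.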
Labeling and Hybridization
Hybridization targets for P. pastoris microarrays were prepared according to Agilent's Two-Color Microarray-Based Gene Expression Analysis protocol (Version 5.5, February 2007). The labelled and amplified RNA was purified using RNeasy mini spin columns (Qiagen). The quality of the labelled cRNA was evaluated on the Agilent Bioanalyzer 2100, and the cRNA was quantified using an ND-1000 NanoDrop Spectrophotometer. Fragmented cRNA samples were applied to the individual arrays. The slides were placed in an Agilent hybridization oven and hybridized for 17 h at 65°C and 10 rpm.
Slides were scanned with an Agilent MicroArray Scanner and intensities were extracted using Agilent's Feature Extraction software (version 9.1). The resulting data were imported into R, where pre-processing and normalization were performed. In the pre-processing step all outliers and saturated spots were given a weight of zero. After plotting the data we decided to refrain from background correction, since it tends to add noise to the data. The data were normalized using locally weighted MA-scatterplot smoothing (LOESS), followed by a between-array scale normalization. Both functions are available in the limma package for R. For the selection of differentially expressed genes, linear models were fitted to the log-ratios of the expression data separately for each gene. An empirical Bayes approach was used to shrink the probe-wise sample variances towards a common value, yielding a moderated t-statistic per gene. P-values were corrected for multiple testing using Holm's method. Features were defined as differentially expressed if they had a p-value < 0.05. To identify stronger regulatory effects, an additional fold change (FC) cut-off of FC > 1.5 or FC < 1/1.5 was applied. Descriptions of the platform and array, as well as raw and processed data, were deposited at ArrayExpress under the accession numbers A-MEXP-1157.
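The selection step can be sketched as follows. The analysis described here was carried out in R with limma; the Python code below is only an illustration of Holm's step-down adjustment combined with a fold-change filter. It assumes that the significance threshold is applied to the adjusted p-values and reads the fold-change cut-off as a change of more than 1.5-fold in either direction, i.e. |log2 FC| > log2 1.5. Function names and example values are hypothetical.

```python
import math

def holm_adjust(pvalues):
    """Return Holm-adjusted p-values in the original order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, idx in enumerate(order):
        # The k-th smallest p-value (k = rank + 1) is multiplied by (m - k + 1).
        candidate = min(1.0, (m - rank) * pvalues[idx])
        # Enforce monotonicity of the adjusted values.
        running_max = max(running_max, candidate)
        adjusted[idx] = running_max
    return adjusted

def select_features(pvalues, log2_fc, alpha=0.05, fc_cutoff=1.5):
    """Indices of features passing both the adjusted p-value and the FC filter."""
    adjusted = holm_adjust(pvalues)
    threshold = math.log2(fc_cutoff)
    return [
        i for i, (p, m) in enumerate(zip(adjusted, log2_fc))
        if p < alpha and abs(m) > threshold
    ]

# Hypothetical example with three features.
selected = select_features([0.001, 0.001, 0.2], [1.0, 0.2, 2.0])
print(selected)
```

Holm's method controls the family-wise error rate and is therefore conservative for genome-wide data, so the resulting gene lists tend to be short but reliable.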
All annotated P. pastoris genes were assigned to GO biological process terms using the SGD GO slim tool, with P. pastoris specific genes being assigned to the term 'other'. For each group, the significance of the deviation of the number of up- or downregulated genes from the average was assessed with a Fisher test (Additional file 4).
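A test of this kind can be sketched for a single GO group as below. The sketch assumes that the "Fisher test" is Fisher's exact test on a 2 × 2 table (regulated versus not regulated, inside versus outside the group) and uses the common two-sided definition that sums the probabilities of all tables no more likely than the observed one. The function name and the counts are hypothetical.

```python
from math import comb

def fisher_exact_two_sided(a, b, c, d):
    """Two-sided Fisher's exact test for the 2 x 2 table [[a, b], [c, d]].

    a: regulated genes inside the group     b: unregulated genes inside the group
    c: regulated genes outside the group    d: unregulated genes outside the group
    """
    row1, row2, col1 = a + b, c + d, a + c
    total_tables = comb(row1 + row2, col1)

    def prob(x):
        # Hypergeometric probability of observing x regulated genes in the group.
        return comb(row1, x) * comb(row2, col1 - x) / total_tables

    lowest = max(0, col1 - row2)
    highest = min(row1, col1)
    p_observed = prob(a)
    # Small relative tolerance guards against floating-point ties.
    p_value = sum(
        prob(x) for x in range(lowest, highest + 1)
        if prob(x) <= p_observed * (1 + 1e-9)
    )
    return min(1.0, p_value)

# Hypothetical group: 3 of 4 member genes regulated, versus 1 of 4 outside it.
p = fisher_exact_two_sided(3, 1, 1, 3)
print(round(p, 4))
```

When many GO groups are tested in this way, the resulting p-values would themselves need a multiple-testing correction; whether one was applied to these tests is not stated in the text.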
This work was supported by the Austrian Science Fund (project No. I37-B03), the European Science Foundation (programme EuroSCOPE), and the Austrian Research Promotion Agency (programme FHplus).
The Vienna Science Chair of Bioinformatics gratefully acknowledges support by the Vienna Science and Technology Fund (WWTF), Baxter AG, Austrian Research Centres (ARC) Seibersdorf, and Austrian Centre of Biopharmaceutical Technology (ACBT). TT acknowledges partial funding by the GenAu BIN-II PhD programme.
- Ye RW, Wang T, Bedzyk L, Croker KM: Applications of DNA microarrays in microbial systems. J Microbiol Methods. 2001, 47 (3): 257-272. 10.1016/S0167-7012(01)00308-6.
- Sauer M, Branduardi P, Gasser B, Valli M, Maurer M, Porro D, Mattanovich D: Differential gene expression in recombinant Pichia pastoris analysed by heterologous DNA microarray hybridisation. Microb Cell Fact. 2004, 3 (1): 17. 10.1186/1475-2859-3-17.
- Rautio JJ, Kataja K, Satokari R, Penttila M, Soderlund H, Saloheimo M: Rapid and multiplexed transcript analysis of microbial cultures using capillary electophoresis-detectable oligonucleotide probe pools. J Microbiol Methods. 2006, 65 (3): 404-416. 10.1016/j.mimet.2005.08.010.
- Gasser B, Maurer M, Rautio J, Sauer M, Bhattacharyya A, Saloheimo M, Penttilä M, Mattanovich D: Monitoring of transcriptional regulation in Pichia pastoris under protein production conditions. BMC Genomics. 2007, 8: 179. 10.1186/1471-2164-8-179.
- Sims AH, Robson GD, Hoyle DC, Oliver SG, Turner G, Prade RA, Russell HH, Dunn-Coleman NS, Gent ME: Use of expressed sequence tag analysis and cDNA microarrays of the filamentous fungus Aspergillus nidulans. Fungal Genet Biol. 2004, 41 (2): 199-212. 10.1016/j.fgb.2003.11.005.
- Bachem CW, van der Hoeven RS, de Bruijn SM, Vreugdenhil D, Zabeau M, Visser RG: Visualization of differential gene expression using a novel method of RNA fingerprinting based on AFLP: analysis of gene expression during potato tuber development. Plant J. 1996, 9 (5): 745-753. 10.1046/j.1365-313X.1996.9050745.x.
- Arvas M, Pakula T, Lanthaler K, Saloheimo M, Valkonen M, Suortti T, Robson G, Penttila M: Common features and interesting differences in transcriptional responses to secretion stress in the fungi Trichoderma reesei and Saccharomyces cerevisiae. BMC Genomics. 2006, 7: 32. 10.1186/1471-2164-7-32.
- Genomics I: ERGO bioinformatics suite. [http://ergo.integratedgenomics.com/ERGO/]
- Besemer J, Borodovsky M: GeneMark: web software for gene finding in prokaryotes, eukaryotes and viruses. Nucleic Acids Res. 2005, 33 (Web Server issue): W451-4. 10.1093/nar/gki487.
- Majoros WH, Pertea M, Salzberg SL: TigrScan and GlimmerHMM: two open source ab initio eukaryotic gene-finders. Bioinformatics. 2004, 20 (16): 2878-2879. 10.1093/bioinformatics/bth315.
- Mathé C, Sagot MF, Schiex T, Rouzé P: Current methods of gene prediction, their strengths and weaknesses. Nucleic Acids Res. 2002, 30 (19): 4103-4117. 10.1093/nar/gkf543.
- Majoros WH, Pertea M, Salzberg SL: Efficient implementation of a generalized pair hidden Markov model for comparative gene finding. Bioinformatics. 2005, 21 (9): 1782-1788. 10.1093/bioinformatics/bti297.
- Pedersen JS, Hein J: Gene finding with a hidden Markov model of genome structure and evolution. Bioinformatics. 2003, 19 (2): 219-227. 10.1093/bioinformatics/19.2.219.
- Kreil DP, Russell RR, Russell S: Microarray oligonucleotide probes. Methods Enzymol. 2006, 410: 73-98. 10.1016/S0076-6879(06)10004-X.
- Erasmus DJ, van der Merwe GK, van Vuuren HJ: Genome-wide expression analyses: Metabolic adaptation of Saccharomyces cerevisiae to high sugar stress. FEMS Yeast Res. 2003, 3 (4): 375-399. 10.1016/S1567-1356(02)00203-9.
- Valkonen M, Penttilä M, Saloheimo M: Effects of inactivation and constitutive expression of the unfolded-protein response pathway on protein production in the yeast Saccharomyces cerevisiae. Appl Environ Microbiol. 2003, 69 (4): 2065-2072. 10.1128/AEM.69.4.2065-2072.2003.
- Travers KJ, Patil CK, Wodicka L, Lockhart DJ, Weissman JS, Walter P: Functional and genomic analyses reveal an essential coordination between the unfolded protein response and ER-associated degradation. Cell. 2000, 101 (3): 249-258. 10.1016/S0092-8674(00)80835-1.
- Lomsadze A, Ter-Hovhannisyan V, Chernoff YO, Borodovsky M: Gene identification in novel eukaryotic genomes by self-training algorithm. Nucleic Acids Res. 2005, 33 (20): 6494-6506. 10.1093/nar/gki937.
- University W: WU-BLAST. [http://blast.wustl.edu/]
- Li W, Godzik A: Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006, 22 (13): 1658-1659. 10.1093/bioinformatics/btl158.
- Rouillard JM, Zuker M, Gulari E: OligoArray 2.0: design of oligonucleotide probes for DNA microarrays using a thermodynamic approach. Nucleic Acids Res. 2003, 31 (12): 3057-3062. 10.1093/nar/gkg426.
- Arvas M, Kivioja T, Mitchell A, Saloheimo M, Ussery D, Penttila M, Oliver S: Comparison of protein coding gene contents of the fungal phyla Pezizomycotina and Saccharomycotina. BMC Genomics. 2007, 8: 325. 10.1186/1471-2164-8-325.
- Leparc GG, Tuechler T, Striedner G, Bayer K, Sykacek P, Hofacker I, Kreil DP: Model based probe set optimization for high-performance microarrays. Nucleic Acids Research. 2008, submitted.
- Agilent: eArray. [https://earray.chem.agilent.com/earray/]
- Guillemette T, van Peij NN, Goosen T, Lanthaler K, Robson GD, van den Hondel CA, Stam H, Archer DB: Genomic analysis of the secretion stress response in the enzyme-producing cell factory Aspergillus niger. BMC Genomics. 2007, 8: 158. 10.1186/1471-2164-8-158.
- Sims AH, Gent ME, Lanthaler K, Dunn-Coleman NS, Oliver SG, Robson GD: Transcriptome analysis of recombinant protein secretion by Aspergillus nidulans and the unfolded-protein response in vivo. Appl Environ Microbiol. 2005, 71 (5): 2737-2747. 10.1128/AEM.71.5.2737-2747.2005.
- Rossanese OW, Soderholm J, Bevis BJ, Sears IB, O'Connor J, Williamson EK, Glick BS: Golgi structure correlates with transitional endoplasmic reticulum organization in Pichia pastoris and Saccharomyces cerevisiae. J Cell Biol. 1999, 145 (1): 69-81. 10.1083/jcb.145.1.69.
- SGD: SGD Gene Ontology Slim Mapper. [http://db.yeastgenome.org/cgi-bin/GO/goSlimMapper.pl]
- Higashio H, Kohno K: A genetic link between the unfolded protein response and vesicle formation from the endoplasmic reticulum. Biochem Biophys Res Commun. 2002, 296 (3): 568-574. 10.1016/S0006-291X(02)00923-3.
- Shaffer AL, Shapiro-Shelef M, Iwakoshi NN, Lee AH, Qian SB, Zhao H, Yu X, Yang L, Tan BK, Rosenwald A, Hurt EM, Petroulakis E, Sonenberg N, Yewdell JW, Calame K, Glimcher LH, Staudt LM: XBP1, downstream of Blimp-1, expands the secretory apparatus and other organelles, and increases protein synthesis in plasma cell differentiation. Immunity. 2004, 21 (1): 81-93. 10.1016/j.immuni.2004.06.010.
- Payne T, Hanfrey C, Bishop AL, Michael AJ, Avery SV, Archer DB: Transcript-specific translational regulation in the unfolded protein response of Saccharomyces cerevisiae. FEBS Lett. 2008.
- Tigges M, Fussenegger M: Xbp1-based engineering of secretory capacity enhances the productivity of Chinese hamster ovary cells. Metab Eng. 2006.
- Gasser B, Maurer M, Gach J, Kunert R, Mattanovich D: Engineering of Pichia pastoris for improved production of antibody fragments. Biotechnol Bioeng. 2006, 94 (2): 353-361. 10.1002/bit.20851.
- Gasser B, Sauer M, Maurer M, Stadlmayr G, Mattanovich D: Transcriptomics-based identification of novel factors enhancing heterologous protein secretion in yeasts. Appl Environ Microbiol. 2007, 73 (20): 6499-6507. 10.1128/AEM.01196-07.
- Valkonen M, Ward M, Wang H, Penttilä M, Saloheimo M: Improvement of foreign-protein production in Aspergillus niger var. awamori by constitutive induction of the unfolded-protein response. Appl Environ Microbiol. 2003, 69 (12): 6979-6986. 10.1128/AEM.69.12.6979-6986.2003.
- Mattheakis LC, Sor F, Collier RJ: Diphthamide synthesis in Saccharomyces cerevisiae: structure of the DPH2 gene. Gene. 1993, 132: 149-154. 10.1016/0378-1119(93)90528-B.
- Wolff EC, Kang KR, Kim YS, Park MH: Posttranslational synthesis of hypusine: Evolutionary progression and specificity of the hypusine modification. Amino Acids. 2007, 33 (2): 341-350. 10.1007/s00726-007-0525-0.
- Overbeek R, Larsen N, Walunas T, D'Souza M, Pusch G, Selkov E, Liolios K, Joukov V, Kaznadzey D, Anderson I, Bhattacharyya A, Burd H, Gardner W, Hanke P, Kapatral V, Mikhailova N, Vasieva O, Osterman A, Vonstein V, Fonstein M, Ivanova N, Kyrpides N: The ERGO genome analysis and discovery system. Nucleic Acids Res. 2003, 31 (1): 164-171. 10.1093/nar/gkg148.
- (EBI) EBI, (CSHL) CSHL: BioMart. [http://www.biomart.org/biomart/martview/]
- Technology GI: GeneMark. [http://exon.gatech.edu/GeneMark/]
- (SIB) SIB: Swiss-Prot/TrEMBL. [http://www.expasy.ch/sprot/]
- Baumann K, Maurer M, Dragosits M, Cos O, Ferrer P, Mattanovich D: Hypoxic fed batch cultivation of Pichia pastoris increases specific and volumetric productivity of recombinant proteins. Biotechnol Bioeng. 2008, 100 (1): 177-83. 10.1002/bit.21763.
- Zahurak M, Parmigiani G, Yu W, Scharpf RB, Berman D, Schaeffer E, Shabbeer S, Cope L: Pre-processing Agilent microarray data. BMC Bioinformatics. 2007, 8: 142. 10.1186/1471-2105-8-142.
- Smyth GK, Speed T: Normalization of cDNA microarray data. Methods. 2003, 31 (4): 265-273. 10.1016/S1046-2023(03)00155-5.
- Smyth GK: Linear models and empirical bayes methods for assessing differential expression in microarray experiments. Stat Appl Genet Mol Biol. 2004, 3: Article3.
- Holm S: A Simple Sequentially Rejective Bonferroni Test. Scandinavian Journal of Statistics. 1979, 6: 65-70.
- EMBL-EBI: ArrayExpress. [http://www.ebi.ac.uk/microarray-as/aer/]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.