Gene expression and proteomic analysis of the formation of Phakopsora pachyrhizi appressoria

Background: Phakopsora pachyrhizi is an obligate fungal pathogen causing Asian soybean rust (ASR). A dual approach was taken to examine the molecular and biochemical processes occurring during the development of appressoria, specialized infection structures by which P. pachyrhizi invades a host plant. Suppression subtractive hybridization (SSH) was utilized to generate a cDNA library enriched for transcripts expressed during appressoria formation. Two-dimensional gel electrophoresis and mass spectrometry analysis were used to generate a partial proteome of proteins present during appressoria formation.
Results: Sequence analysis of 1133 expressed sequence tags (ESTs) revealed 238 non-redundant ESTs, of which 53% had putative identities assigned. Twenty-nine of the non-redundant ESTs were found to be specific to the appressoria-enriched cDNA library, and did not occur in a previously constructed germinated urediniospore cDNA library. Analysis of proteins against a custom database of the appressoria-enriched ESTs plus Basidiomycota EST sequences available from NCBI revealed 256 proteins. Fifty-nine of these proteins were not previously identified in a partial proteome of P. pachyrhizi germinated urediniospores. Genes and proteins identified fell into functional categories of metabolism, cell cycle and DNA processing, protein fate, cellular transport, cellular communication and signal transduction, and cell rescue. However, 38% of ESTs and 24% of proteins matched only to hypothetical proteins of unknown function, or showed no similarity to sequences in the current NCBI database. Three novel Phakopsora genes were identified from the cDNA library along with six potentially rust-specific genes. Protein analysis revealed eight proteins of unknown function, which possessed classic secretion signals. Two of the extracellular proteins are reported as potential effector proteins.
Conclusions: Several genes and proteins were identified that are expressed in P. pachyrhizi during appressoria formation. Understanding the role that these genes and proteins play in the molecular and biochemical processes in the infection process may provide insight for developing targeted control measures and novel methods of disease management.


Background
Asian soybean rust (ASR), caused by the fungal pathogen Phakopsora pachyrhizi Sydow & Sydow, is an aggressive foliar disease of soybeans. Initially identified in Asia, it has since spread to all major soybean-growing regions of the world, including the United States [1,2]. The impact of disease on crop yields is influenced by temperature and humidity, spore load introduced into a field, and the growth stage of soybeans when first infection occurs. Field trials in Brazil found yield loss averaging 37% when infection began at R5 growth stage and 67% when infection started at R2 [3]. Observations in Asia have reported yield losses up to 80% under high disease pressure and favorable environmental conditions [4,5].
While fungicide applications are able to reduce yield losses, a limited number of fungicides are available for foliar application on soybeans. The cost of fungicide applications, efficacy of treatments on maturing plants and dense canopies, and environmental impact are all considerations for the soybean grower. Six resistance genes (Rpp1-Rpp6) have been identified that provide resistance to P. pachyrhizi [6][7][8][9][10][11]. However, these genes display only race-specific resistance to selected isolates of P. pachyrhizi.
When a urediniospore makes contact with a soybean leaf, a single germ tube elongates across the leaf surface. At the tip of this germ tube a specialized infection structure, the appressorium, is formed. While most rust pathogens penetrate the host indirectly by entering through stomatal openings and then breaching the mesophyll cell wall, P. pachyrhizi is one of a few rusts that enter the host by direct penetration of the cuticle and epidermal cell wall [12]. Recent transmission electron microscopy confirmed that P. pachyrhizi uses mechanical force to penetrate the cuticle and digestive enzymes to penetrate the epidermal cell wall [13]. The fungus continues to grow invasively in the host forming haustoria, colonizing hyphae, and ultimately uredinia.
In a previous study, a cDNA library was utilized to evaluate gene expression during urediniospore germination of P. pachyrhizi [14]. A subsequent independent study examined the partial proteome of germinated urediniospores [15]. The next critical step in the infection cycle is appressoria formation. P. pachyrhizi infects over 90 species of legumes, and this broad host range is unique among the rusts [16]. Elucidating the events during appressoria formation could shed light on the mechanism allowing broad-spectrum interaction.
Obligate pathogens require a living host on which to survive and propagate, making it difficult to separate fungal-specific genes or proteins from those of the host. However, induction of appressoria by surface contact with artificial substrates is possible for some fungi, including P. pachyrhizi [17,18]. Despite the ability to induce appressoria formation in vitro, a review of the literature found appressoria-specific EST libraries generated only for the plant pathogens Magnaporthe grisea, Puccinia triticina, and Colletotrichum higginsianum [19][20][21].
Several studies have utilized bioinformatics to identify appressoria proteins from EST sequencing projects [21,22], but only a few have used a proteomics approach to identify differentially expressed proteins accumulated during appressoria formation. For example, comparison of protein expression patterns in germinating conidia to those observed in appressoria revealed five proteins that were up-regulated during appressoria formation in M. grisea [23]. Protein profiles of three developmental stages of Phytophthora infestans (cysts, germinated cysts, and appressoria-forming cysts) found 13 proteins to be up- or down-regulated during different developmental stages [24]. Likewise, another study of P. infestans identified four up-regulated genes and their protein products in cysts with appressoria [25].
This study identified ESTs and proteins present during, and possibly required for, appressoria formation in P. pachyrhizi. This is one of the few studies to evaluate genes or proteins present specifically during appressoria formation and the first to combine the two techniques. The comparison of identified transcripts to accumulated proteins allows for a more comprehensive analysis of the molecular and biochemical processes occurring during appressoria formation.

Methods
Fungal strain and growth conditions
P. pachyrhizi isolate Taiwan 72-1 was maintained at the USDA-ARS Foreign Disease-Weed Science Research Unit Plant Pathogen Containment Facility at Fort Detrick, MD [26] under Animal and Plant Health Inspection Service permit. Urediniospores were obtained and maintained as described previously [27].
Urediniospores were germinated by floating on the surface of sterile distilled water containing 50 μg/ml each of ampicillin and streptomycin in a 9" x 13" glass baking dish, for 6 h at room temperature in the dark. Germinated urediniospores were collected onto Whatman No. 1 filter paper (Whatman; Piscataway, NJ) and flash frozen in liquid nitrogen.
Appressoria were generated by application of 150 ml of a 100,000 urediniospore/ml suspension onto a 245 mm x 245 mm x 20 mm polystyrene dish, followed by incubation in the dark at room temperature for 6 h. Water was decanted from the plates and collected for protein extraction. This collected water will subsequently be referred to as the Appressoria Water Fraction (AWF). A sterile water wash was used to remove urediniospores that did not germinate from the plate. Plates were examined under a microscope before and after washing to confirm the presence of appressoria. Appressoria, germ tubes, and germinated urediniospores were collected by scraping the plate surface with a cell scraper (Nunc, Thermo Fisher Scientific; Rochester, NY) and immediately frozen and stored in liquid nitrogen.

RNA isolation and library construction
Germinated urediniospores and appressoria were each separately ground under liquid nitrogen with a mortar and pestle. Total RNA was isolated by phenol:chloroform extraction [28], and precipitated overnight with 12 M lithium chloride.
Purification of poly(A)+ mRNA, synthesis of cDNA, and library construction were performed by SeqWright (Houston, TX) following standard protocols for library construction via suppression subtractive hybridization [29]. The driver for the subtracted library was generated from 1400 μg of total RNA extracted from germinated urediniospores, and 1400 μg of total RNA from appressoria was used as the tester. The subtracted-cDNA pool was cloned into pBluescript II KS(+) vector (Stratagene; La Jolla, CA) and transformed into DH10B E. coli strain (Invitrogen; Carlsbad, CA). Average insert size for the library was determined by plasmid mini-prep, PCR using T3 and T7 primers, and gel electrophoresis of 15 randomly selected colonies from the library (performed by SeqWright).

Sequence analysis
Single-pass sequencing of clones from the cDNA library was performed on an Applied Biosystems 3700 DNA analyzer (Applied Biosystems; Framingham, MA) at the USDA-ARS Eastern Regional Research Center Nucleic Acids Facility (Wyndmoor, PA). Clones that did not share similarity to these P. pachyrhizi ESTs were subjected to additional, bidirectional sequencing, using anchored poly(T)N primers to read through poly(A)+ tails. Additional primers, as needed, were designed using Primer3 (http://frodo.wi.mit.edu/primer3/) [30]. Subsequent sequence analysis and compilation of full-length sequences was performed using Chromas 2.33 (Techelysium Pty; Helensvale, Australia). Putative functions were assigned to clones by BLASTX analysis against the NCBI non-redundant protein database. Putative functional categories were determined using the Munich Information Center of Protein Sequences (MIPS) Functional Catalogue (http://mips.helmholtz-muenchen.de/proj/funcatDB/search_main_frame.html).
Redundancy within the subtracted library was identified by BLASTN analysis of the library against itself. Redundant and overlapping ESTs were assembled into contigs using the program CAP3 (http://pbil.univ-lyon1.fr/cap3.php) [31].

Quantitative real-time RT-PCR analysis of transcript levels during the infection cycle
The soybean cultivar Williams 82 was inoculated with P. pachyrhizi isolate Taiwan 72-1, as described previously [27]. Leaves were collected from three biological replicates at 0, 6, 12, 24, 72, 168, and 336 h post inoculation (hpi). Freshly germinated urediniospores and appressoria, separate from that used to generate the cDNA library, were produced as described above. Total RNA was isolated from 100 mg of each sample and 100 mg of urediniospores using the RNeasy Mini Plant kit (Qiagen; Valencia, CA) following the manufacturer's protocol.
Six ESTs specific to the appressoria-enriched cDNA library and representing four putative functional categories and one unclassified protein were selected for transcript analysis. Three ESTs common to the appressoria-enriched cDNA library and the germinated urediniospore library [14] were also selected for analysis. Nucleotide sequences of each clone were compared to the Trace Archives for P. pachyrhizi Whole Genome Shotgun (WGS) sequences by BLASTN to identify the positions of potential introns. Primers were designed to span putative introns where possible, using the primer design program Primer3 (http://frodo.wi.mit.edu/primer3/).
To assess transcript levels, quantitative real-time RT-PCR (qRT-PCR) was performed using three biological replicates and two technical replicates for each template. First-strand cDNA synthesis was performed using the QuantiTect Reverse Transcription kit (Qiagen) following the manufacturer's protocol. Real-time PCR reactions were performed on the SmartCycler System (Cepheid; Santa Clara, CA) using the QuantiTect SYBR Green PCR kit (Qiagen) following the manufacturer's protocol. Sequences and annealing temperature for each primer set are listed in Table 1. Melt curve analysis was performed to verify specificity of the PCR products, and control reactions containing no template or DNA without reverse transcriptase were included. Absolute quantification of target molecules was conducted using the sigmoidal model, and lambda gDNA was used to generate the optical calibration factor (OCF) [32]. Primers for α-tubulin were included to assess RNA integrity and demonstrate functionality of the RT-PCR assay [33]. For each time point, the average number of target molecules and the standard deviation were calculated using the three biological replicates. Data were analysed by analysis of variance.
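The per-time-point summary and the analysis of variance described above can be sketched as follows. This is a minimal illustration rather than the authors' analysis pipeline: the function names and the target-molecule counts are hypothetical, and the one-way ANOVA is computed by hand so that only the Python standard library is needed.

```python
from statistics import mean, stdev


def summarize(replicates):
    """Return (mean, sample standard deviation) of the biological replicates."""
    return mean(replicates), stdev(replicates)


def one_way_anova_f(groups):
    """Return (F statistic, df between groups, df within groups)."""
    all_values = [v for g in groups for v in g]
    grand_mean = mean(all_values)
    # Variation of the group means around the grand mean.
    ss_between = sum(len(g) * (mean(g) - grand_mean) ** 2 for g in groups)
    # Variation of each replicate around its own group mean.
    ss_within = sum((v - mean(g)) ** 2 for g in groups for v in g)
    df_between = len(groups) - 1
    df_within = len(all_values) - len(groups)
    f_stat = (ss_between / df_between) / (ss_within / df_within)
    return f_stat, df_between, df_within


# Hypothetical target-molecule counts, three biological replicates per
# sample type (illustrative values only, not data from this study).
counts = {
    "urediniospores": [1000.0, 2000.0, 3000.0],
    "germinated": [4000.0, 5000.0, 6000.0],
    "appressoria": [7000.0, 8000.0, 9000.0],
}

for sample, replicates in counts.items():
    avg, sd = summarize(replicates)
    print(f"{sample}: mean={avg:.0f}, sd={sd:.0f}")

f_stat, df_b, df_w = one_way_anova_f(list(counts.values()))
print(f"F({df_b}, {df_w}) = {f_stat:.2f}")
```

The F statistic would then be compared against the F distribution with the reported degrees of freedom to decide significance at the chosen level.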

Quantitative real-time RT-PCR analysis of transcript levels during urediniospore germination, germ tube elongation, and appressoria formation
Urediniospores were germinated as described above on the surface of water for 6 and 24 h, and on an appressoria-inductive surface for 6 and 24 h. Total RNA was isolated from 100 mg of each sample and from 100 mg of urediniospores using the RNeasy Mini Plant kit (Qiagen) following the manufacturer's protocol. Twelve target genes common to the appressoria-enriched cDNA library and the germinated urediniospore cDNA library were selected for transcript analysis, and primers were designed as described above. Quantitative real-time RT-PCR was performed on three biological replicates and three technical replicates for each target using the SmartCycler System as described above for qRT-PCR during the infection cycle. Sequences and annealing temperature for each primer set are listed in Table 1. Primers for α-tubulin were included to assess RNA integrity and demonstrate functionality of the assay [33]. For each time point, the average number of target molecules and the standard deviation were calculated using the three biological replicates. Data were analysed by analysis of variance.

Protein extraction
Appressoria-enriched samples weighing 300 mg were ground to a fine powder in liquid nitrogen and suspended in 1 ml of isoelectric focusing (IEF) buffer containing 7 M urea, 2 M thiourea, 4% 3-[(3-cholamidopropyl)dimethylammonio]-1-propanesulfonate (CHAPS) and 25 mM dithiothreitol (DTT). The suspension was mixed thoroughly and incubated at room temperature for 30 min on a shaker at 100 rpm. Samples were centrifuged for 15 min at 14,000 g and the supernatant was collected. This protein sample will be referred to as the appressoria-enriched fraction (AEF). The AWF was collected as described above to a final volume of 500 ml per biological replicate and applied to 0.45 μm filter units (Millipore; Billerica, MA) to remove remnant urediniospores and germ tubes. Proteins were precipitated in 80% acetone at -20°C for 16 h, collected by centrifugation at 14,000 g for 20 min, and resuspended in 3 ml of resuspension buffer (10 mM Tris pH 7.0, 1 mM EDTA, 3 mM DTT, 250 μM phenylmethylsulfonyl fluoride (PMSF) and 10% glycerol (w/v)). Samples were dialyzed against 5 l of resuspension buffer for 16 h at 4°C, precipitated in acetone, and resuspended in IEF buffer. Protein quantification was performed using the Markwell assay [34] with bovine serum albumin (BSA) as a standard.

Two-dimensional gel electrophoresis and sample preparation
Two-dimensional gel electrophoresis (2-DE) was carried out as previously described [15]. Peptides were cleaned and spotted on Matrix Assisted Laser Desorption Ionization (MALDI) plates using the Digilab/Investigator ProMS MALDI Preparation Station (Genomics Solutions; Ann Arbor, MI) programmed for peptide cleanup using the C18 ZipTip procedure according to the manufacturer's recommendations (Millipore).

Mass spectrometry
Trypsin-digested proteins were subjected to mass spectrometry analysis using a 4700 Proteomics Analyzer instrument (MALDI-Time of Flight (TOF)/TOF) (Applied Biosystems) in the positive reflectron mode as previously described [15]. Spectra in the range of 800 to 4000 Da were acquired in MS mode by averaging 1000 spectra, and in MS/MS mode by averaging 2000 spectra. Up to 10 of the most intense ions were selected for MS/MS analysis using post-source decay (PSD) with 1 keV acceleration voltage. Criteria for ion selection were based on a signal-to-noise ratio cutoff of 20 and exclusion of all common trypsin autolysis peaks and common keratin contaminants. Instrument conversion of time-of-flight to mass (Da) for the monoisotopic ions was obtained with a tolerance of 50 ppm or better according to calibration with a peptide calibration mixture (Applied Biosystems). The MS/MS TOF calibration was optimized to 0.1 Da or better from the PSD of Glu1-fibrinopeptide B fragments.
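To make the 50 ppm mass tolerance concrete, the sketch below converts a parts-per-million tolerance into an absolute window in Da and checks whether an observed mass falls inside it. The function names and the example masses are hypothetical and are not taken from the instrument software.

```python
def ppm_error(observed_mass, theoretical_mass):
    """Signed mass error in parts per million."""
    return (observed_mass - theoretical_mass) / theoretical_mass * 1e6


def ppm_to_da(mass, tol_ppm=50.0):
    """Absolute width (in Da) of a ppm tolerance at a given mass."""
    return mass * tol_ppm / 1e6


def within_tolerance(observed_mass, theoretical_mass, tol_ppm=50.0):
    """True when the observed mass lies within the ppm tolerance."""
    return abs(ppm_error(observed_mass, theoretical_mass)) <= tol_ppm


# At the edges of the 800-4000 Da acquisition range, 50 ppm corresponds
# to windows of roughly 0.04 Da and 0.2 Da.
for mass in (800.0, 4000.0):
    print(f"{mass:.0f} Da: +/- {ppm_to_da(mass):.2f} Da")

# Hypothetical peptide masses used only to exercise the check.
print(within_tolerance(1000.04, 1000.0))  # about 40 ppm off
print(within_tolerance(1000.06, 1000.0))  # about 60 ppm off
```

Because the tolerance is relative, the absolute window widens with mass, which is why the MS/MS calibration above is stated separately in Da.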
Combined MS and MS/MS data were submitted for analysis using GPS Explorer version 3.6 software (Applied Biosystems) with MASCOT version 2.3.02 search engine (Matrix Science; Boston, MA) against a custom sequence database of EST sequences in FASTA format. The analysis criteria included the following variable modifications: methionine oxidation, formation of pyroglutamine from N-terminal glutamine, and carbamidomethylation of cysteine residues from the reduction and alkylation of proteins. Reported proteins from database searches of putative peptide/protein sequences are within the ≥95% confidence interval.
The database was constructed from a subset of the NCBI EST database compiled in March 2010 using the keywords Rust or Basidiomycota. The dataset was combined with EST sequences from the appressoria-enriched SSH library described above for a total of 280,119 sequences. Identification of proteins was validated by reanalyzing all samples using a decoy database composed of randomized entries of the corresponding original database.
[Table 1. Primer pairs used for real-time quantitative reverse transcription PCR analysis of expression patterns of Phakopsora pachyrhizi ESTs.]
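One way to picture a decoy database of randomized entries is sketched below. It assumes that each decoy is made by shuffling the residues of one original entry, which preserves length and composition; the text does not specify how the randomization was performed, and the search engine's own decoy generation may differ. The function name, the `DECOY_` prefix, and the example sequences are hypothetical.

```python
import random


def make_decoy(records, seed=0):
    """Return decoy (name, sequence) pairs with residues shuffled per entry.

    A fixed seed keeps the decoy set reproducible between runs.
    """
    rng = random.Random(seed)
    decoys = []
    for name, sequence in records:
        residues = list(sequence)
        rng.shuffle(residues)
        decoys.append(("DECOY_" + name, "".join(residues)))
    return decoys


# Hypothetical entries standing in for sequences from the custom database.
records = [
    ("est1", "MKTAYIAKQRQISFVK"),
    ("est2", "GAVLIPFWMSTCYNQD"),
]

for name, sequence in make_decoy(records, seed=1):
    print(name, sequence)
```

Matches against such entries can only arise by chance, so the number of significant decoy hits gives a rough indication of how many hits in the original database may be false positives.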

Database analysis
Protein scores are based on the sum of the ion scores for the peptide mass fingerprints and MS/MS of selected peptides. A match score greater than the protein score threshold of 75 is considered significant within a 0.05 probability. The full data set from these analyses is included in Additional file 1: Table S1.
Putative protein identities were assigned to ESTs based on BLAST searches against the NCBI nr protein database using open reading frames (ORFs) corresponding to the peptides identified in each accession. BLASTX was performed using EST accessions with the highest protein score to identify protein homologues. Proteins were categorized using the UniProt Protein Knowledge Database [35] and placed into functional categories using the Munich Information Center for Protein Sequences (MIPS) Functional Catalogue. Proteins with multiple functions were given secondary classifications. Peptide sequences, protein BLAST identities and functional categories for each protein are listed in Additional file 1: Table S1. A comparison of Additional file 1: Table S1 to a data set of P. pachyrhizi germinated urediniospore proteins [15] identified a subset of different proteins for further analysis.
Proteins with putative signal peptides containing peptide cleavage sites were identified using SignalP 3.0 [36]. TargetP [37] and PSORT [38] were used to predict intra- or extracellular localization of the proteins.

Results
Fungal strain and growth conditions
After 6 h on polystyrene plates, germination rates of urediniospores averaged 60%, with 77% of germinated urediniospores showing mature, melanized appressoria (Figure 1). Washing plates with distilled water raised the percent appressoria on each plate to an average of 86%.
Less than 1% of the urediniospores remaining on the plates following the wash step were ungerminated. The remainder of the urediniospores were germinated with either immature or no appressoria.
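As a rough worked example of the yield these figures imply, the sketch below multiplies the spore load applied per dish in the Methods (150 ml at 100,000 urediniospores/ml) by the average germination rate and the average fraction of germinated urediniospores bearing appressoria. It assumes the reported averages apply uniformly to a single dish before washing; the variable names are illustrative only.

```python
# Rough estimate of appressoria formed per dish, before the wash step.
# The inputs are averages reported in the text; treating them as exact
# values for one dish is an assumption made for illustration.
suspension_volume_ml = 150
spores_per_ml = 100_000
germination_rate = 0.60       # average germination after 6 h
appressoria_fraction = 0.77   # germinated spores with mature appressoria

spores_per_dish = suspension_volume_ml * spores_per_ml
germinated = spores_per_dish * germination_rate
with_appressoria = germinated * appressoria_fraction

# Fraction of all applied spores that end up bearing an appressorium.
overall_fraction = germination_rate * appressoria_fraction

print(f"spores applied per dish: {spores_per_dish:,}")
print(f"germinated: {germinated:,.0f}")
print(f"with appressoria: {with_appressoria:,.0f}")
print(f"overall fraction: {overall_fraction:.1%}")
```

Under these assumptions, somewhat under half of the applied urediniospores would carry a mature appressorium at harvest, which is the population the wash step then enriches.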

Sequence analysis and annotation of expressed genes
Single-pass sequencing of 1133 cDNA clones led to the identification of 1029 ESTs. The sequences of the P. pachyrhizi EST clones were submitted to NCBI as GenBank Accession numbers JK649959 to JK650987. The remaining 104 clones sequenced contained either no inserts or consisted of poor quality sequence and were discarded. A total of 238 non-redundant ESTs were identified (Additional file 2: Table S2), of which 169 appeared only once and 69 were represented by multiple clones at frequencies ranging from 2 to 481. The frequency of redundant clones is shown in Figure 2. Assembled sequences of the redundant ESTs were submitted to NCBI as GenBank Accession numbers JR863574 to JR863646. Singleton sequences ranged from 129 to 1217 bp, and assembled contigs ranged from 424 to 1781 bp.
BLASTN analysis of single-pass sequences identified 41 ESTs with no homology to P. pachyrhizi ESTs from germinated urediniospores and P. pachyrhizi-infected soybean leaves. Bi-directional sequencing was performed on these 41 ESTs, and the sequences were submitted to NCBI as GenBank Accession numbers JQ083238 to JQ083278. Of these, 29 non-redundant ESTs were identified with no homology to P. pachyrhizi. Twenty-seven ESTs were singletons, and two ESTs occurred twice. Sequence similarity comparisons of the 29 non-redundant ESTs to the NCBI non-redundant protein database identified 26 ESTs, which fell into eight functional categories (Table 2). Twenty-four ESTs shared identity to proteins from members of the fungal phylum Basidiomycota, and one EST had similarity to a member of the Ascomycota. The remaining EST had identity to an IS10 transposase from tomato. Three ESTs (Pp3394, Pp3734, Pp3842) showed no significant similarity to any protein entries in GenBank.
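The clone and EST counts reported above can be cross-checked with a few lines of arithmetic. The sketch below is only a bookkeeping aid: the helper name is hypothetical, and the redundancy figure uses one common definition (the share of ESTs that repeat an already observed sequence), which the text itself does not state.

```python
def accession_count(first, last):
    """Number of accessions in an inclusive range such as JK649959-JK650987.

    Assumes both accessions share a two-letter prefix followed by digits.
    """
    if first[:2] != last[:2]:
        raise ValueError("accession prefixes differ")
    return int(last[2:]) - int(first[2:]) + 1


clones_sequenced = 1133
clones_discarded = 104
ests = clones_sequenced - clones_discarded

singletons = 169
multi_clone_ests = 69
non_redundant = singletons + multi_clone_ests

# Share of ESTs that are repeat observations (assumed definition).
redundancy = 1 - non_redundant / ests

print(f"ESTs retained: {ests}")
print(f"accessions JK649959-JK650987: {accession_count('JK649959', 'JK650987')}")
print(f"non-redundant ESTs: {non_redundant}")
print(f"library redundancy: {redundancy:.1%}")
```

The retained EST count matches the size of the first GenBank accession range, and the singleton and multi-clone counts add up to the reported number of non-redundant ESTs.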

Quantitative real-time RT-PCR analysis of transcripts during soybean infection
Six ESTs specific to the appressoria-enriched cDNA library were selected for analysis of transcription levels during the infection process on soybean. Transcripts were assessed via real-time RT-PCR as a measure of target molecules for each of these ESTs (genes): Pp3684 (hydroxymethylglutaryl coenzyme A (HMG-CoA) reductase), Pp3495 (G2/M phase checkpoint control protein Sum2), Pp3243 (prefoldin subunit 5), Pp3282 (ubiquitin-protein ligase), Pp3004 (serine/threonine-protein kinase), and Pp3505 (a hypothetical protein). The sequences represented four protein functional categories and one unclassified hypothetical protein. Three ESTs common to both the appressoria-enriched cDNA library and the germinated urediniospore cDNA library were also selected for transcript analysis. These ESTs represented two protein functional categories and one unclassified hypothetical protein: contig2F (5-aminolevulinate synthase), contig2N (P-type cation-transporting ATPase), and Pp3186 (a hypothetical protein).
While all of the genes were expressed in urediniospores, germinated urediniospores, and appressoria, differences were observed (Figure 3). The α-tubulin gene had the highest transcription levels. Transcripts were not detected for any of the ESTs in infected soybean leaves immediately following inoculation at 0 hpi, but transcripts were detected at subsequent time points after inoculation.
Patterns of gene expression in urediniospores, germinated urediniospores, and appressoria fell into four groups. The first group consisted of ESTs Pp3004, Pp3495, contig2F, and contig2N, with the highest transcript levels in germinated urediniospores (Figure 3A). The transcript levels in appressoria ranged from 39 to 57% that of germinated urediniospores. Transcript levels in urediniospores were 11 to 22% that in germinated urediniospores.
The second group, comprised of ESTs Pp3186, Pp3282, Pp3684, and α-tubulin, had high transcript levels in both germinated urediniospores and appressoria (Figure 3B). The number of target molecules in appressoria ranged from 86 to 111% that of germinated urediniospores. Transcript levels in urediniospores were less than transcript levels in both germinated urediniospores and appressoria, but the difference was less than in the previous group. The transcript levels in urediniospores ranged from 34 to 79% that of germinated urediniospores and 32 to 83% that of appressoria.
The third and fourth groups were each comprised of a single EST. Pp3505 showed highest transcription in appressoria and lowest in urediniospores (Figure 3C), while Pp3243 showed no significant difference in transcription between urediniospores, germinating urediniospores, and appressoria (Figure 3D).
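The percentages used to describe these groups express the mean number of target molecules in one sample type relative to a reference sample type. A minimal sketch of that calculation is shown below; the function name and the counts are hypothetical and were chosen only so that the results fall inside the ranges quoted for the first group.

```python
def percent_of_reference(sample_count, reference_count):
    """Express a mean target-molecule count as a percentage of a reference."""
    if reference_count <= 0:
        raise ValueError("reference count must be positive")
    return 100.0 * sample_count / reference_count


# Hypothetical mean target-molecule counts for one EST (not measured values).
mean_counts = {
    "urediniospores": 1500.0,
    "germinated": 10000.0,
    "appressoria": 4800.0,
}

relative = {
    stage: percent_of_reference(count, mean_counts["germinated"])
    for stage, count in mean_counts.items()
}

for stage, pct in relative.items():
    print(f"{stage}: {pct:.0f}% of germinated urediniospores")
```

An EST with this profile would be assigned to the first group, since its level peaks in germinated urediniospores and is markedly lower in both other sample types.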
The largest group, consisting of ESTs Pp3042, Pp3205, Pp3222, Pp3717, Pp3944, Pp4000, and contig2S, had the lowest transcript levels in urediniospores (Figure 4A). Both germinated urediniospore and appressoria samples collected at 6 h yielded high transcript levels, with germinated urediniospores tending to be higher than appressoria. The transcript levels in germinated urediniospore and appressoria samples collected at 24 h were lower than transcript levels from germinated urediniospore and appressoria samples collected at 6 h.
The second group, ESTs Pp3409 and Pp3998, had the lowest transcript levels in appressoria samples collected at 6 and 24 h (Figure 4B). For Pp3409, the highest transcript level was from germinated urediniospores collected at 6 h, with the corresponding 24 h samples showing 33% fewer transcripts. Appressoria samples collected at 6 h and 24 h showed 75% and 85% fewer transcripts than germinated urediniospores at 24 h, while urediniospores showed 47% fewer transcripts. For Pp3998, the highest transcript level was from germinated urediniospores collected at 24 h, while the corresponding 6 h samples had 38% fewer transcripts. Appressoria collected at 6 h and 24 h showed 67% and 50% fewer transcripts than germinated urediniospores at 6 h. Transcript levels from urediniospores were 27% less than that found in 24 h germinated urediniospores.
The third group, consisting of ESTs Pp3888 and contig481, and α-tubulin, had less change in transcript levels across all samples than any of the other groups (Figure 4C). The highest transcript levels were observed in germinated urediniospores collected at 6 h.
The EST contig6B comprised the fourth group. The highest transcript levels were in urediniospores and the lowest transcript levels occurred in appressoria (Figure 4D). Transcript levels in appressoria at 6 h and 24 h were only 9% and 7% of those in urediniospores, respectively. Transcript levels in germinated urediniospores at 6 h were 66% of those in urediniospores, and decreased to 35% by 24 h.

Similarity to Magnaporthe grisea appressoria specific protein 3 (MAS3) and fungal proteins with predicted signal peptide sequences
BLASTX analysis revealed five ESTs with similarity to MAS3, which was the third most redundant EST from an appressoria cDNA library of Magnaporthe grisea [39]. SignalP predicted that all five ESTs contain putative signal peptide sequences. Contig481 had the highest identity to MAS3 at 46% (Figure 5). Among the ESTs, contig481 and contig2Y share 100% identity at the amino acid level, and contig19 has 70% identity to contig481. In addition, contig19 showed the highest identity (51%) to gEgh16 of Blumeria graminis. These ESTs also share identity to three secreted proteins from Melampsora larici-populina and to a putative secreted protein from Puccinia graminis f. sp. tritici (Figure 5). BLASTP analysis of the open reading frames of the ESTs revealed that all have similarity to the protein superfamily DUF3129, which is a eukaryotic family of proteins with no known function.
Figure 4. Absolute quantification of mRNA transcripts of selected ESTs during urediniospore germination, germ tube elongation, and appressoria formation. RNA was extracted from urediniospores (U), urediniospores germinated on the surface of water for 6 hours (G6) or 24 hours (G24), and urediniospores germinated on an appressoria-inductive surface for 6 hours (A6) or 24 hours (A24). The y-axis represents absolute expression of these transcripts. Error bars represent the standard deviation. Bars topped with different letters are significantly different (P < 0.05). Expression patterns fell into four basic groups. Group A had lowest expression in urediniospores and higher expression in germinated spores and appressoria at 6 hours. Group B showed lowest expression from appressoria at both time points. Group C had less variation in expression across samples. Group D showed the highest expression in urediniospores and lowest expression in appressoria at both time points.
A total of 256 protein spots were identified in replicate gels of the AEF and AWF. Eliminating redundancy within each sample type yielded 140 proteins: 115 and 25 proteins were identified from the AEF and AWF, respectively (Additional file 3: Table S3). Four of the proteins were unique to AWF, and 55 were found only in the AEF. A total of 119 different proteins were identified in this study, 59 of which were not found in the previous analysis of proteins from germinated urediniospores of P. pachyrhizi [15] (Table 3). These proteins have been designated as predicted Phakopsora pachyrhizi proteins (PHAP). A search of the custom EST database found that 110, or 92%, of the 119 proteins shared similarity to putative proteins from rusts. Of these 110 proteins, 90 had identity to P. pachyrhizi ESTs, 15 of which were from the appressoria-enriched cDNA library. The remaining 20 proteins had similarity to ESTs from Melampsora, Uromyces and Puccinia spp. Of the nine proteins that were not similar to any rust sequences, four had similarity to Ustilago maydis (cytochrome C peroxidase, GTP binding protein, a hypothetical protein, and nucleosome assembly protein), two to Sporobolomyces roseus (ATPase delta subunit and ubiquinol-cytochrome c reductase), one to Leucosporidium scottii (methylenetetrahydrofolate dehydrogenase), and two to Microbotryum violaceum (serine-threonine phosphatase and ubiquitin). BLASTN analysis of these nine proteins against the P. pachyrhizi trace archive sequences in GenBank revealed accessions with high identity to all nine proteins. The P. pachyrhizi genomic traces for all nine proteins shared significant amino acid similarity to Melampsora and Puccinia entries in the NCBI non-redundant protein database.

MALDI-TOF/TOF analysis
The proteins identified in this study fell into twelve functional categories (Table 3). The most abundant category of proteins at 24% was proteins with unknown function. Of the proteins with known function, proteins involved in metabolism and energy made up two of the largest groups at 19% and 7%, respectively. Proteins found in these two groups include several key components of the citric acid cycle and glycolysis such as isocitrate dehydrogenase, isocitrate lyase, acetyl-CoA C-acyltransferase, pyruvate dehydrogenase beta, and succinate-CoA ligase beta. Proteins involved in energy production included glycine dehydrogenase, NADP-dependent mannitol dehydrogenase, methylene-tetrahydrofolate dehydrogenase, and cytochrome C. An enzyme associated with glycogenesis, UTP-glucose-1-phosphate uridylyltransferase, was also identified.
Cell cycle and DNA processing accounted for 8% of proteins and included proteins involved in cell division and differentiation; specifically, cell division cycle protein cdc48, septin, and nuclear segregation protein. Protein fate accounted for 10% of proteins; cellular transport, 8%; cell rescue and defense, 8%; and protein synthesis, 7%. Regulation, transcription, protein binding, and one transposable element accounted for the remaining 9% of the proteins. Among proteins with a known function, 36% were associated with the mitochondria.
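The category shares quoted in the two preceding paragraphs can be tallied to confirm that they account for all identified proteins. The sketch below is a simple consistency check; the dictionary keys paraphrase the category names, and the last entry pools the four categories that the text reports together as the remaining 9%.

```python
# Percentage of identified proteins per functional category, as quoted
# in the text. The final entry pools the categories reported jointly.
category_percent = {
    "unknown function": 24,
    "metabolism": 19,
    "energy": 7,
    "cell cycle and DNA processing": 8,
    "protein fate": 10,
    "cellular transport": 8,
    "cell rescue and defense": 8,
    "protein synthesis": 7,
    "regulation, transcription, protein binding, transposable element": 9,
}

total = sum(category_percent.values())
known_function = total - category_percent["unknown function"]

# Rank categories from most to least abundant.
ranked = sorted(category_percent.items(), key=lambda item: item[1], reverse=True)

print(f"total: {total}%")
print(f"known function: {known_function}%")
for name, pct in ranked[:3]:
    print(f"{name}: {pct}%")
```

The shares sum to 100%, so about three quarters of the identified proteins could be assigned a functional category.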
Eight proteins listed in Table 3 were predicted to contain classical secretion signals [36]. Three proteins, PHAP0038, PHAP0073, and PHAP0113, shared similarity to well-characterized intracellular proteins related to trafficking and proteolysis. PHAP0038 was identified as a glucose-regulated protein that is secreted into the ER and is involved in the assembly of protein complexes for secretion or translocation to membranes [40]. PHAP0113 was identified as a vacuolar protease A with aminopeptidase activity [41], and PHAP0073 was identified as a neddylin/ubiquitin which is required for protein assembly in the ubiquitination pathway [42].
The remaining five proteins containing a predicted secretion signal were proteins of unknown function. BLAST searches of the custom EST database identified P. pachyrhizi ESTs containing full-length open reading frames for four of the proteins, and a U. maydis EST with an ORF for the other protein. Subsequent BLASTN analysis of the U. maydis EST sequence identified homologous P. pachyrhizi genomic sequences. BLASTX analysis using P. pachyrhizi EST sequences corresponding to PHAP0052 and PHAP0059 found significant similarity to a Melampsora larici-populina nuclear membrane hypothetical protein with a conserved putative stress response domain (EGG05479.1) and hypothetical protein containing a conserved fasciclin domain (EGG10923.1), respectively. BLASTX analysis using the P. pachyrhizi EST corresponding to PHAP0054 revealed similarity to a conserved hypothetical protein of U. maydis (XP756219.1).
Two of the P. pachyrhizi ESTs, gi|120521555 and gi|120521631, encode the small molecular weight proteins PHAP0129 and PHAP0055 with 132 and 71 amino acids, respectively. Both proteins possess secretion signals with no motifs for localization to organelles, suggesting they may function as extracellular fungal effectors [43]. EST gi|120521631 did not share similarity to any DNA or amino acid sequences currently in GenBank and is a unique P. pachyrhizi extracellular protein (PHAP0055). BLASTX analysis using the EST for PHAP0129 identified sequence similarity to ESTs from Puccinia triticina (gi|282831716) and Uromyces viciae-fabae (gi|164246325). The translated ORFs of three P. pachyrhizi ESTs (gi|120499561, gi|120520239, gi|120507777) were also found to share significant amino acid similarity to PHAP0129. Alignment of the predicted proteins of these six ESTs revealed two conserved regions (Figure 6).
Table 3 footnotes: 1 Predicted Phakopsora pachyrhizi protein (PHAP). 2 Proteins identified from the appressoria water fraction. 3 Proteins containing a secretion signal.

Discussion
This study identified ESTs and proteins present during, and possibly required for, appressoria formation in P. pachyrhizi. Spore germination, germ tube elongation, and appressoria formation are all part of a single complex physiological process from which individual steps cannot be readily uncoupled. Therefore, the collection of data specific to appressoria formation presents a technical challenge. Suppression subtractive hybridization (SSH) was utilized to generate a cDNA library enriched for transcripts specifically involved in appressoria formation, while reducing the occurrence of transcripts also associated with urediniospore germination and germ tube elongation. Generation of a cDNA library also allowed for the possible detection of transcripts that may be present in relatively low copy number. In contrast, 2-DE and MS analysis of proteins extracted from appressoria offered a profile of high abundance proteins, which may not be reflective of corresponding transcript levels. Combining these two techniques provided a clearer image of the molecular and biochemical processes occurring during germination, germ tube elongation, and appressoria formation than either method viewed alone. Comparison of the appressoria-enriched cDNA library to existing ESTs from germinated urediniospores of P. pachyrhizi revealed 29 non-redundant ESTs specific to the appressoria-enriched cDNA library. Ten of the 29 transcripts (35%) have roles in metabolism or cell cycling. This is to be expected given the importance of both autophagy and mitosis in appressoria formation. Autophagy is widely conserved among eukaryotes and is responsible for the degradation and recycling of proteins, organelles and cytoplasm in response to stress conditions that allow the cell to adapt to environmental or developmental changes [44].
In the absence of exogenous nutrients before host colonization, spore germination, germ tube elongation and appressoria formation must be fuelled by the breakdown of spore contents. Additionally, lipids, glycogen, and sugars from the spore are metabolized for the biosynthesis of glycerol in the appressorium, thereby generating the necessary turgor pressure to allow for penetration of the host cuticle. This breakdown and recycling of old cellular components has been shown to play a key role in pathogenicity of fungal pathogens, including M. grisea (M. oryzae), Colletotrichum orbiculare, and Fusarium graminearum [45][46][47]. The P. pachyrhizi appressoria-enriched ESTs (Table 2) include genes that encode proteins involved in fungal metabolism and autophagy, such as HMG-CoA reductase, a key enzyme for cholesterol biosynthesis, and glutamine synthetase, an important enzyme in amino acid metabolism. Expression analysis of HMG-CoA reductase indicates that this transcript is present at high levels in both germinated urediniospores and appressoria, but at relatively low levels in urediniospores prior to germination (Figure 3, EST Pp3684). HMG-CoA reductase has also been shown to be upregulated in appressoria of M. grisea [48]. Similarly, glutamine synthetase is highly expressed during pathogenesis of Colletotrichum gloeosporioides on Stylosanthes guianensis [49].
Cell cycling is pivotal for multicellular eukaryotes to coordinate the differentiation of tissues and organs. Appressoria formation requires switching from polarized hyphal growth, to expansion and cellular differentiation of appressoria, to ultimate resumption of polarized growth during penetration peg formation. Entry into mitosis, specifically the S phase, is required to regulate initiation of appressoria morphogenesis and conidial cell death in M. grisea [45]. Blockage of cell cycling at later stages of mitosis did not affect appressoria formation, suggesting that the checkpoint regulating cellular differentiation operates at the G2-M boundary [50]. A transcript for the G2/M phase checkpoint control protein SUM2 was identified among the P. pachyrhizi appressoria-enriched ESTs (Table 2). Interestingly, qRT-PCR analysis showed transcript levels in germinated urediniospores to be nearly twice that in appressoria. However, transcript levels in appressoria were more than four times that in urediniospores prior to germination (Figure 3, EST Pp3495).
Cyclic AMP-, MAP kinase-, and calcium/calmodulin-dependent signaling pathways are involved in the induction and development of appressoria [51][52][53]. In P. pachyrhizi, a putative serine/threonine protein kinase was identified and expressed in germinated urediniospores and appressoria (Figure 3, EST Pp3004). Serine/threonine kinases play a role in autophagy and fungal morphogenesis. Autophagy is blocked in mutants of the MgATG1 gene in M. grisea, resulting in reduced lipid turnover, inadequate appressorial turgor, reduced ability to penetrate and infect a host, and decreased conidiation [54].
An EST from the P. pachyrhizi appressoria-enriched cDNA library was found with 98% similarity to autophagy-related protein 8 (Atg8) from Moniliophthora perniciosa (GenBank accession ACD93204) [55]. During autophagy, membrane-bound autophagosomes are formed, then delivered to and fused with lysosomes or vacuoles, where their contents are degraded. Atg8 is one of two ubiquitin-like proteins required for autophagosome formation [56]. Targeted mutation of Atg8 in M. grisea arrested conidial cell death via autophagy and prevented production of penetration hyphae, thus preventing appressoria-mediated penetration of the host cuticle [45]. The role of Atg8 during germination and appressoria formation in P. pachyrhizi was supported by qRT-PCR analysis, which found increased transcript levels in germinated urediniospores and appressoria relative to urediniospores (Figure 4, EST Pp3944).
BLASTX analysis of appressoria-enriched ESTs revealed eight with similarity to hypothetical proteins of unknown function from P. graminis f. sp. tritici or M. larici-populina. These two fungi, along with P. pachyrhizi, are members of the Order Pucciniales. ESTs Pp3502 and contig2bb also showed similarity to other members of the Basidiomycota, while the other six ESTs may represent rust-specific transcripts. Gene expression analysis of EST Pp3505 revealed that the highest transcript levels occur in appressoria relative to urediniospores and germinated urediniospores (Figure 3, EST Pp3505), supporting a role for this transcript in appressoria formation of P. pachyrhizi. BLASTX analysis of three additional ESTs, Pp3394, Pp3734, and Pp3842, did not find significant similarity to any entries in the NCBI non-redundant protein database, suggesting that these transcripts represent genes specific to P. pachyrhizi.
Of the 238 non-redundant ESTs identified, 209 (88%) were found in common with previously sequenced ESTs from germinated urediniospores and infected soybean leaves. It is reasonable to expect that a portion of the genes involved in the cascade of events required for appressoria morphogenesis may be triggered early during spore germination and germ tube elongation. Additionally, some of the genes required for appressoria formation may be the same genes necessary for spore germination and germ tube elongation. Identification of genes known to play a role in appressoria formation, penetration peg formation, and early infection in both the appressoria-enriched ESTs and the germinated urediniospore ESTs reinforces this expectation. Analysis of transcripts in other plant pathogenic fungi has shown that greater changes in gene expression occur during spore germination, with fewer changes subsequently during germ tube elongation and appressoria formation [48,57].
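The library comparison above reduces to simple set arithmetic. The following is a minimal sketch that uses only the counts reported in the text; the helper `library_specific` and the idea of representing each library as a set of EST identifiers are illustrative assumptions, not a description of the pipeline actually used in the study.

```python
# Counts reported in the text.
total_nonredundant = 238            # non-redundant ESTs in the appressoria-enriched library
shared_with_other_libraries = 209   # also found among previously sequenced ESTs

# ESTs left over after removing the shared ones, and the shared percentage.
appressoria_specific = total_nonredundant - shared_with_other_libraries
shared_pct = 100 * shared_with_other_libraries / total_nonredundant


def library_specific(library, reference):
    """Return identifiers present in `library` but absent from `reference`.

    Hypothetical helper: both arguments are iterables of EST identifiers.
    """
    return set(library) - set(reference)


print(appressoria_specific)   # 29
print(round(shared_pct))      # 88
```

The 29 library-specific ESTs and the 88% shared fraction are therefore two views of the same comparison.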
A putative NADPH oxidase was identified among the ESTs common to both P. pachyrhizi cDNA libraries. NADPH oxidases generate reactive oxygen species (ROS) that are involved in various physiological processes and cellular differentiation in fungi [58]. Deletions of NADPH oxidase (Nox) genes block differentiation of sexual fruiting bodies in Aspergillus nidulans [59] and differentiation of ascogonia to perithecia in Podospora anserina [60]. In M. grisea, Nox1 and Nox2 mutants formed normal looking appressoria but failed to penetrate the host, suggesting a role for Nox-derived ROS in penetration peg formation [61]. In P. pachyrhizi, qRT-PCR analysis revealed little to no expression of NOX in urediniospores, while expression in both germinated urediniospores and appressoria was equally high at 6 h, and reduced by more than half at 24 h. This profile matches the proposed role of ROS as a signalling pathway in cellular differentiation. Such signalling is unnecessary in dormant spores, but in high demand during germination and appressoria formation. It has been shown in M. grisea that the generation of ROS occurs during conidial germination, appressoria development, and hyphal tip growth [61].
Another EST common to both P. pachyrhizi EST libraries was a putative subtilase-type proteinase. The transcript level of this subtilase-type proteinase in appressoria is nearly five times that observed in urediniospores, and the expression level in germinated urediniospores is nearly six times that of urediniospores. Subtilisin-like serine proteases were abundant in both a cDNA library and a serial analysis of gene expression (SAGE) of appressoria of M. grisea [62,63]. Targeted deletion of the vacuolar serine protease, SPM1, resulted in decreased sporulation and appressoria development, and attenuated infection [64]. P-type ATPases are integral membrane proteins required for the maintenance of phospholipid asymmetry in biological membranes, and they are important for infection-related morphogenesis, such as penetration peg formation [65]. P-type ATPase ESTs were identified in both P. pachyrhizi cDNA libraries. The penetration defective mutant, PDE1, of M. grisea exhibited reduced appressoria-mediated penetration and reduced disease symptoms on a susceptible host [65]. P-type ATPases may also be significant for the delivery of virulence-associated proteins. A mutation of MgApt2 in M. grisea inhibited the hypersensitive response in resistant rice cultivars, suggesting that the secretion of fungal proteins perceived by the host during a resistance response might also require MgApt2 [66]. The qRT-PCR analysis of the P-type ATPase in P. pachyrhizi showed increased transcript levels in germinated urediniospores and appressoria, compared with low levels in urediniospores, suggesting a possible role in germination and appressoria formation.
Analysis of gene expression by qRT-PCR found most transcripts to be present in urediniospores, usually at low levels relative to germinated urediniospores and appressoria (Figure 3 and Figure 4). The presence of most transcripts at low levels in dormant spores indicates that stabilized transcripts may be stored in the spore for translation during, or immediately following, germination [67]. Two P. pachyrhizi transcripts did not fit this profile. Transcript levels of EST Pp3205 (NADPH oxidase) in urediniospores were negative in two of three replicate runs. Contig6B had high transcript levels in urediniospores, with decreased levels in germinated urediniospores, and low levels in appressoria. This transcription pattern is consistent with the putative identification of contig6B as a conidiation-related protein 6 (CON6). The gene con-6 is expressed during conidiation in Neurospora crassa, but is not expressed in mycelium. It was shown that shortly after spore germination con-6 mRNA disappears and the CON6 polypeptide is rapidly degraded [68].
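Statements such as "nearly five times that observed in urediniospores" are relative expression values. This section does not state how they were computed, so the sketch below assumes the widely used 2^(-ΔΔCt) calculation with a reference gene and roughly 100% amplification efficiency; the function name and all Ct values are hypothetical and serve only to illustrate the arithmetic.

```python
def relative_expression(ct_target_sample, ct_ref_sample,
                        ct_target_calibrator, ct_ref_calibrator):
    """Fold change of a target transcript in a sample versus a calibrator.

    Assumes the 2^(-ddCt) approach: each target Ct is first normalized to a
    reference gene, then the sample is compared with the calibrator
    (for example, ungerminated urediniospores).
    """
    delta_ct_sample = ct_target_sample - ct_ref_sample
    delta_ct_calibrator = ct_target_calibrator - ct_ref_calibrator
    delta_delta_ct = delta_ct_sample - delta_ct_calibrator
    return 2.0 ** (-delta_delta_ct)


# Hypothetical Ct values: the target crosses threshold two cycles earlier in
# appressoria than in urediniospores, with an unchanged reference gene.
fold = relative_expression(22.0, 18.0, 24.0, 18.0)
print(fold)  # 4.0
```

Under these assumptions, a two-cycle difference in normalized Ct corresponds to a fourfold difference in transcript level.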
Of the 119 proteins identified from appressoria-enriched preparations, 59 (49.6%) were not identified in the previous study of proteins from germinated urediniospores [15]. The differential protein profiles may be indicative of metabolic and physiological differences between germination on water versus a solid substrate. Differential expression was not as great between the two cDNA libraries as it was between the two proteomes, suggesting a possible greater influence of surface contact on translation than on transcription. As with the ESTs, the majority of proteins with an identified putative function play a role in metabolism. Additionally, proteins involved in cell cycling, protein fate, and cellular transport are well represented. Several of these, including neddylin/ubiquitin, septin, and GTP binding protein/GTPase, are discussed below for their potential role in appressoria formation and early infection.
Isocitrate lyase was one of the proteins identified in this study, and is necessary for the utilization of fatty acids and is required for pathogenicity in Leptosphaeria maculans, M. grisea, and Colletotrichum lagenarium [69][70][71]. Isocitrate lyase is highly expressed in M. grisea during conidial germination, appressoria formation, and penetration peg formation, indicating that the glyoxylate cycle is stimulated at this time. Lipid metabolism is likely important for turgor generation in appressoria via the synthesis of glycerol.
Fifteen proteins identified in the appressoria-enriched preparations were also found as ESTs in the appressoria-enriched cDNA library. All fifteen of these represented common, well-characterized proteins such as actin, catalase, α-tubulin, and aldehyde reductase, and all were also present in both the germinated urediniospore cDNA library and the partial proteome of germinated urediniospores. It is possible that during germination and germ tube elongation, transcripts are accumulating without activation of subsequent translation. Alternatively, transcripts may be translated, but the protein turnover rate may be such that it precludes identification by 2-DE. The nature of the two techniques used may also explain some of the variation. Generation of an SSH cDNA library allows for the detection of transcripts that may be present in relatively low copy number, while 2-DE detects proteins of relatively high abundance.
While 59 proteins from the appressoria-enriched preparations did not share any sequence similarity with the 29 appressoria-enriched ESTs, commonality of function was identified. Neddylin/ubiquitin was identified among the proteins (Table 3, PHAP0073), while ubiquitin-protein ligase was found among the ESTs (Table 2, EST Pp3282). Both of these proteins are involved in the ubiquitination of proteins targeted for degradation in the proteasome. In conjunction with E1 (ubiquitin-activating enzyme) and E2 (ubiquitin-conjugating enzyme), neddylin acts as an intermediate step through a covalent bond with the target protein and forms a bridge between E2 and E3 (ubiquitin-protein ligase) [42]. Expression levels of the ubiquitin-protein ligase in appressoria and germinated urediniospores were more than double that of urediniospores (Figure 3, EST Pp3282).
A GTP binding protein (GTPase) was identified among the proteins (Table 3, PHAP0044) and ESTs (Table 1, EST Pp3042). Transcript analysis by qRT-PCR indicates that EST Pp3042 was highest during urediniospore germination and early germ tube elongation (Figure 4). GTP binding proteins have been shown to affect the formation of septa in infectious hyphae of U. maydis [72]. Septa are critical for appressoria formation and likely function as mechanical support for generation of turgor pressure necessary to differentiate appressoria and mediate mechanical penetration of the host cuticle [72]. Transcript levels of a putative septin were elevated in germinated urediniospores and appressoria at 6 hpi relative to urediniospores (Figure 4, EST Pp3222). Septin was also identified among the appressoria proteins (Table 3). Septins are conserved cytoskeletal GTPases with multiple functions, including organizational markers during cell division and polarized growth. In filamentous fungi, septins assemble into a wide variety of complexes, including those that form at growing hyphal tips and at the site of future septum formation. In M. oryzae, deposition of the septin ring defines the position of the appressorium septum prior to mitosis in the germ tube, and septin ring formation appears to be regulated by the DNA replication checkpoint that initiates appressorium morphogenesis [73]. In U. maydis, septin mutants have reduced symptom development on maize and produce fewer mature teliospores than wild-type [74], which is consistent with the requirement for septin during morphogenesis and cellular division.
A putative acyl-CoA dehydrogenase EST was found in both germinated urediniospore and appressoria cDNA libraries (Figure 4, EST contig2S), and acetyl-CoA acyltransferase was identified among the high abundance proteins (Table 3). These two enzymes are involved in the first and last steps of fatty acid β-oxidation, which is essential for the development and melanization of appressoria in M. grisea [75].
Five non-redundant EST contigs were identified with partial homology to one another and shared similarity to MAS3 of M. grisea and gEgh16 of Blumeria graminis f. sp. hordei. MAS3 and gEgh16 are both members of multigene families and are expressed during early infection [39,76,77]. Similarly, the five contigs may comprise a multigene family in P. pachyrhizi. Expression analysis of the most redundant transcript, contig481, revealed that this putative MAS3 homolog was highly expressed in germinated urediniospores and present in urediniospores at levels similar to those in appressoria (Figure 4, EST contig481).
Prior to removing redundancy, nearly half (49.6%) of the clones sequenced from the appressoria-enriched cDNA library showed similarity to MAS3 and gEgh16. The most abundant ESTs, assembled into contig481, showed 99% nucleotide similarity to the most abundant EST from the P. pachyrhizi germinated urediniospore cDNA library [14]. However, MAS3 was not identified within the partial proteomes of appressoria or germinated urediniospores [15]. Because 2-DE selectively yields high abundance proteins, the absence of MAS3 indicates that the protein is either absent or present at low levels during spore germination, germ tube elongation, and appressoria formation.
Deletion mutants of MAS3 in M. grisea were normal in germination, germ tube elongation, and appressoria formation, but were defective in penetration of the host and showed reduced virulence [39]. This suggests that the role of MAS3 arises post-appressoria formation.
While transcripts are present, it is not until the infection process reaches the stage of penetration peg formation or actual penetration that MAS3 plays a role during infection. There may be a signal from the mature appressoria or the host plant that triggers translation at the critical point between appressoria formation and penetration.
Eight putative extracellular proteins were identified in AEF and AWF (Table 3). Three of these are well-characterized intracellular proteins that share similarity to proteins in other fungi: glucose-regulated protein, neddylin/ubiquitin, and vacuolar protease A. Four proteins share similarity to hypothetical proteins found in other fungi. The remaining putative extracellular protein, PHAP0055, did not have any similarity to proteins in the NCBI non-redundant protein database, suggesting it is unique to P. pachyrhizi.
The P. pachyrhizi protein PHAP0059 contains a conserved fasciclin domain, which has been demonstrated to be involved in cell adhesion, appressoria turgor, and pathogenicity in M. oryzae [78,79]. Interestingly, PHAP0059 was only found in the AEF and not among the germinated urediniospore proteins. The AEF was generated on a hard surface, while urediniospores were germinated by floating on the surface of water. This suggests that contact with a hard surface may be necessary to induce signals that regulate transcription or protein processing for translation of cell adhesion proteins.
Two P. pachyrhizi proteins, PHAP0055 and PHAP0129, exhibit characteristics similar to those of putative effector proteins identified from a haustoria-enriched proteome of P. triticina [43]. Both are small molecular weight proteins, with no known function or conserved domains, and contain a predicted signal peptide cleavage site at the N-terminus. PHAP0129 also contains an even number of cysteine residues, which is characteristic of some plant-targeted fungal effector proteins [21]. In addition, PHAP0129 was found only in the AWF, further supporting its putative identification as an extracellular protein. Proteins with high identity to PHAP0129 were found in P. pachyrhizi, U. viciae-fabae and P. triticina, suggesting a putative effector protein family exists in rusts.
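The effector-like criteria described above (small size, a predicted signal peptide, no known domain, and optionally an even number of cysteines) can be expressed as a simple screen. The following is a minimal sketch under stated assumptions: the `Candidate` record, the 150-residue length cutoff, and the toy sequence are hypothetical, and the signal peptide and domain flags are assumed to come from external prediction tools rather than being computed here.

```python
from dataclasses import dataclass


@dataclass
class Candidate:
    name: str
    sequence: str             # predicted amino acid sequence, one-letter code
    has_signal_peptide: bool  # supplied by an external predictor (assumption)
    has_known_domain: bool    # supplied by a domain search (assumption)


def is_effector_like(candidate, max_length=150):
    """Small, secreted, and without a recognized domain.

    The max_length cutoff is an illustrative assumption, chosen only so that
    proteins of 71 and 132 amino acids would pass.
    """
    return (candidate.has_signal_peptide
            and not candidate.has_known_domain
            and len(candidate.sequence) <= max_length)


def even_cysteine_count(sequence):
    """True when the sequence has a nonzero, even number of cysteines."""
    n_cys = sequence.upper().count("C")
    return n_cys > 0 and n_cys % 2 == 0


# Toy sequence for illustration only; it is not a real P. pachyrhizi protein.
toy = Candidate("toy_secreted", "MKFSLCAAVCLLACGGC", True, False)
print(is_effector_like(toy), even_cysteine_count(toy.sequence))
```

The cysteine check is kept separate from the main screen because the text reports an even cysteine count only for PHAP0129.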