Deep and comparative analysis of the mycelium and appressorium transcriptomes of Magnaporthe grisea using MPSS, RL-SAGE, and oligoarray methods
- Malali Gowda†1,
- RC Venu†1,
- Mohan B Raghupathy1,
- Kan Nobuta2,
- Huameng Li1,
- Rod Wing3,
- Eric Stahlberg4,
- Sean Couglan5,
- Christian D Haudenschild6,
- Ralph Dean7,
- Baek-Hie Nahm8,
- Blake C Meyers2 (corresponding author) and
- Guo-Liang Wang1 (corresponding author)
© Gowda et al; licensee BioMed Central Ltd. 2006
Received: 14 June 2006
Accepted: 08 December 2006
Published: 08 December 2006
Rice blast, caused by the fungal pathogen Magnaporthe grisea, is a devastating disease causing tremendous yield loss in rice production. The public availability of the complete genome sequence of M. grisea provides ample opportunities to understand the molecular mechanism of its pathogenesis on rice plants at the transcriptome level. To identify all the expressed genes encoded in the fungal genome, we have analyzed the mycelium and appressorium transcriptomes using massively parallel signature sequencing (MPSS), robust-long serial analysis of gene expression (RL-SAGE) and oligoarray methods.
The MPSS analyses identified 12,531 and 12,927 distinct significant tags from mycelia and appressoria, respectively, while the RL-SAGE analysis identified 16,580 distinct significant tags from the mycelial library. When matching these 12,531 mycelial and 12,927 appressorial significant tags to the annotated CDS, 500 bp upstream and 500 bp downstream of CDS, 6,735 unique genes in mycelia and 7,686 unique genes in appressoria were identified. A total of 7,135 mycelium-specific and 7,531 appressorium-specific significant MPSS tags were identified, which correspond to 2,088 and 1,784 annotated genes, respectively, when matching to the same set of reference sequences. Nearly 85% of the significant MPSS tags from mycelia and appressoria and 65% of the significant tags from the RL-SAGE mycelium library matched to the M. grisea genome. MPSS and RL-SAGE methods supported the expression of more than 9,000 genes, representing over 80% of the predicted genes in M. grisea. About 40% of the MPSS tags and 55% of the RL-SAGE tags represent novel transcripts since they had no matches in the existing M. grisea EST collections. Over 19% of the annotated genes were found to produce both sense and antisense tags in the protein-coding region. The oligoarray analysis identified the expression of 3,793 mycelium-specific and 4,652 appressorium-specific genes. A total of 2,430 mycelial genes and 1,886 appressorial genes were identified by both MPSS and oligoarray.
The comprehensive and deep transcriptome analysis by MPSS and RL-SAGE methods identified many novel sense and antisense transcripts in the M. grisea genome at two important growth stages. The differentially expressed transcripts that were identified, especially those specifically expressed in appressoria, represent a genomic resource useful for gaining a better understanding of the molecular basis of M. grisea pathogenicity. Further analysis of the novel antisense transcripts will provide new insights into the regulation and function of these genes in fungal growth, development and pathogenesis in the host plants.
One of the most daunting challenges in the post-genomic era is to identify all transcribed regions in a sequenced genome. Although computational programs have played an important role in genome annotation, highly accurate prediction of the coding regions of many genes is still challenging. Therefore, experimental approaches such as ESTs (expressed sequence tags), full-length cDNA sequencing, SAGE (serial analysis of gene expression) [4, 5], MPSS (massively parallel signature sequencing) [6, 7], RATE (robust analysis of transcript ends) and microarrays [9, 10] are essential tools to validate the annotated putative transcriptional units (TUs) in the sequenced genomes.
Although EST and full-length cDNA sequencing are effective approaches for gene discovery and expression profiling, neither is cost-effective or comprehensive enough to isolate the majority of rare transcripts. Since the fabrication of most types of microarrays relies on sequences that are derived from ESTs, full-length cDNAs or computer program-predicted coding regions, many rare transcripts may be absent from these arrays. SAGE and MPSS are two powerful methods for genome-wide expression profiling and novel gene identification [4, 5, 7]. Both methods determine the sequence of short tags derived from a defined position at the 3' regions of expressed mRNAs in a cell. SAGE uses traditional sequencing technologies, while MPSS uses a novel bead-based hybridization procedure. The main advantage of MPSS over SAGE is that over one million signatures can be obtained from a library in comparison to 100,000 to 200,000 tags in a SAGE library. However, the SAGE method is technically simple and can be performed in most molecular laboratories, whereas MPSS involves many complex cloning steps, and its library construction and sequencing are performed only at Solexa, Inc. The newly developed 5'-RATE method simplifies tag cloning and sequencing procedures and should have a broad application for transcriptome analysis of complex genomes.
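The idea of a tag taken from a defined position near the 3' end can be illustrated with a minimal Python sketch. It assumes the usual SAGE/MPSS convention that the tag starts at the 3'-most anchoring enzyme site of the transcript; the function name, the toy transcript and the choice to return no tag when the site lies too close to the 3' end are illustrative assumptions, not details given in this paper.

```python
def three_prime_tag(transcript, site, extra):
    """Return the tag starting at the 3'-most anchoring site, or None.

    Assumption: the tag is the anchoring site plus `extra` downstream bases.
    """
    seq = transcript.upper()
    pos = seq.rfind(site)              # 3'-most occurrence of the site
    if pos == -1:
        return None                    # transcript has no anchoring site
    tag = seq[pos:pos + len(site) + extra]
    if len(tag) < len(site) + extra:
        return None                    # site too close to the 3' end (simplification)
    return tag

# Hypothetical transcript built from labelled pieces (not a real M. grisea sequence)
toy_transcript = (
    "ATGCCGTTAGCAATCGG"               # 5' region
    + "GATC" + "AAACCCGGGTTTA"        # Dpn II site + 13 bases
    + "CGTAAGG"
    + "CATG" + "TTTCCCAAAGGGTTTCC"    # Nla III site + 17 bases
    + "CAAAA"
)

mpss_like_tag = three_prime_tag(toy_transcript, "GATC", 13)   # GATC + 13 bases
sage_like_tag = three_prime_tag(toy_transcript, "CATG", 17)   # CATG + 17 bases
```

Because the two anchoring enzymes cut at different sites, the two tags for the same transcript come from different positions, which is relevant when tag locations are compared between the two methods.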
Magnaporthe grisea, the causal agent of rice blast disease, causes severe yield losses in cereals, including economically important crops such as rice, barley, wheat, and millets. This fungus completes its life cycle by conidia landing on the leaf surface, conidia germination, appressorium formation, penetration, establishment and new conidia formation within seven to ten days after infection [11, 12]. In addition to its economic importance, the recent release of the complete genome draft sequence and the characterization of several important avirulence/pathogenicity-related genes make M. grisea a model plant pathogen for studying host-pathogen interactions. Although 12,841 genes (Annotation release 5) are predicted in the genome of M. grisea, many of these genes have no experimental support. Identification and characterization of all expressed genes from the infectious (appressoria) and non-infectious (vegetative mycelia) structures are indispensable for understanding the mechanism of fungal pathogenesis and designing novel strategies to combat the disease.
The main objective of this study was to identify all expressed transcripts encoded in the M. grisea genome using MPSS, RL-SAGE and oligoarrays, and to compare the qualitative and quantitative measurements produced by these three methods. Using these three technologies, the M. grisea transcriptome in the vegetative mycelial stage and in the infectious appressorial stage was analyzed in depth, in which many novel sense and antisense transcripts and alternative splicing variants were discovered. We also compared the correlation coefficients of the gene expression patterns revealed by MPSS, RL-SAGE and oligoarray. Our results provide the most comprehensive analysis of the sense, antisense and alternative transcripts of a fungal genome published to date and represent a useful genomic resource for further detailed functional analysis of the expressed transcripts involved in M. grisea growth and pathogenesis.
Characterization of mycelium and appressorium transcriptomes of M. grisea by MPSS and RL-SAGE methods
Table 1. MPSS and RL-SAGE libraries from mycelial and appressorial stages of the 70-15 strain of M. grisea
Library | MPSS appressoria | MPSS mycelia | RL-SAGE mycelia
Growth stage | 24-hr after germination | 96-hr liquid culture | 96-hr liquid culture
Anchoring enzyme and tag size | GATC (Dpn II) + 13 bases | GATC (Dpn II) + 13 bases | CATG (Nla III) + 17 bases
No. of genomic virtual tags
No. of total tags per library
No. of unique tags per library
No. of copies per tag | 2 to 28,845 | 1 to 30,910 | 1 to 7,383
No. of significant tags
No. of non-significant tags
No. of significant antisense tags
An RL-SAGE library was constructed using the same RNA sample that was used for the construction of the mycelium MPSS library (Table 1). The average insert size of the RL-SAGE clones was approximately 1.1 kb. A total of 245,873 individual tags and 51,925 distinct tags were obtained from 7,292 sequence reads. Out of the total distinct tags, 16,580 (31.9%) were significant (≥ 2 copies) and 35,345 (68.1%) were non-significant (single copy) transcript tags.
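The copy-number rule used for the RL-SAGE library can be sketched in a few lines of Python. The function name and the toy tag sequences are hypothetical; only the threshold (two or more copies) and the library totals come from the text above.

```python
from collections import Counter

def split_by_copy_number(tag_counts, min_copies=2):
    """Split distinct tags into significant (>= min_copies) and the rest."""
    significant = {t: n for t, n in tag_counts.items() if n >= min_copies}
    non_significant = {t: n for t, n in tag_counts.items() if n < min_copies}
    return significant, non_significant

# Toy observations (hypothetical tags, not taken from the library)
observed = ["CATGAAA", "CATGAAA", "CATGCCC", "CATGTTT", "CATGTTT", "CATGTTT"]
sig, nonsig = split_by_copy_number(Counter(observed))

# Arithmetic check of the percentages reported for the RL-SAGE library
distinct_tags = 51_925
significant_tags = 16_580
pct_significant = round(100 * significant_tags / distinct_tags, 1)
pct_non_significant = round(100 * (distinct_tags - significant_tags) / distinct_tags, 1)
```

With the reported totals, the two percentages come out at 31.9 and 68.1, matching the figures in the paragraph.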
Matching analysis of the MPSS and RL-SAGE tags to M. grisea genomic and expressed sequences
We compared the MPSS and RL-SAGE data to both genomic and expressed sequences of M. grisea to identify the genes from which these tags were obtained. The whole genome sequence from the Broad Institute (Release 5.0, January 2006), the ESTs from TIGR (Release 5.0, April 28, 2004) and the COGEME database were used to match MPSS and RL-SAGE tags. The matching results are summarized below.
1) MPSS tags
Table 2. Matching analysis of MPSS and RL-SAGE tags to M. grisea EST and genomic sequences.
Columns: significant tags (%) and non-significant tags (%) for each library.
Rows (each match category with a "both sense + antisense" line): coding region (CDS); 500 bp upstream sequences; 500 bp downstream sequences; number of genes transcribed; novel transcript tags (TIGR ESTs); novel transcript tags (COGEME ESTs); putative intergenic tags; putative intergenic tags matching ESTs.
2) RL-SAGE tags
In comparison with the MPSS libraries, more significant tags were identified in the mycelial RL-SAGE library. About 65% of significant RL-SAGE tags matched the M. grisea genomic sequences (Table 2). Unlike the MPSS tags, only 31% (5,067) of the significant tags from the RL-SAGE library matched the M. grisea EST sequences (TIGR) and a small portion (16%) of the tags matched the CDS of the annotated genes (Table 2). We also observed that about 31% of the mycelial RL-SAGE significant tags matched the 500 bp downstream of the 3' region of the annotated genes and about 6% of the significant tags matched within 500 bp upstream of the 5' region of the annotated genes (Table 2). The high percentage of RL-SAGE tags located within the 500 bp downstream region might be due to the presence of more NlaIII sites (RL-SAGE) than DpnII sites (MPSS) in the 3' UTR region. Combining the tags that hit the CDS, 500 bp upstream and 500 bp downstream regions, we identified 6,028 genes in mycelia by the RL-SAGE method.
3) Identification of novel transcript tags in the MPSS and RL-SAGE libraries
A detailed analysis of the novel transcripts that did not match any sequences in the ESTs and annotated genes was performed (Table 2). The percentage of the novel tags in the appressorium MPSS, mycelium MPSS, and mycelium RL-SAGE libraries was about 35%, 26%, and 35%, respectively [see Additional File 2]. A total of 3,339 genes in the RL-SAGE library identified by CDS and their 500 bp upstream and downstream regions did not match the ESTs in either the TIGR or the COGEME database. Similarly, 3,186 and 2,298 genes identified in the MPSS appressorial and mycelial libraries, respectively, also did not show any matches in the TIGR or COGEME M. grisea ESTs. In the two MPSS libraries, about 67% and 76% of the non-significant appressorial and mycelial tags matched the genome sequence, respectively (Table 2). Even among the tags that did not match the genome sequence, many were significant tags, which might be derived from unsequenced regions or intron-exon junctions in the genome. Some of them may be true transcripts encoded in the genome that could not be matched due to sequencing errors in the genome or in the EST, MPSS or RL-SAGE tag sequencing. To validate the MPSS and RL-SAGE results, two genes (MGG_04847.5 and MGG_0490.5) without any EST matches in the public databases were amplified and cloned (data not shown), suggesting that the majority of the identified novel tags might be true transcripts.
Antisense transcripts for the annotated genes of M. grisea
1) Antisense transcript tags from the MPSS libraries
2) Antisense transcript tags from the RL-SAGE library
Although the total number of significant antisense transcript tags (3,558) corresponding to 3,100 genes identified from the mycelial RL-SAGE library was similar to that of the mycelial MPSS library, the tag distribution in the three regions of the annotated genes was different (Figure 1). The proportion of the RL-SAGE antisense tags located in the 500 bp downstream regions (71.2%) was nearly twice that of the MPSS library (38.7%) at the mycelial stage. As with the sense tags, this difference might be due to the use of different anchoring enzymes in MPSS (Dpn II) and RL-SAGE (Nla III) library construction. Among the identified antisense tags, 364 were present in both RL-SAGE and MPSS libraries, 1,730 were specific to the MPSS libraries, and 431 were specific to the RL-SAGE library. The antisense tags and their frequency are listed in Additional File 4.
3) Annotated CDS regions with sense and antisense MPSS tags
Table 3. Alternative sense and antisense ESTs, MPSS tags and RL-SAGE tags for M. grisea genes.
Columns: annotated M. grisea gene; sense tag, CATG+17 bases (no. of tags in mycelia); antisense tag, CATG+17 bases (no. of tags in mycelia); sense tag, GATC+17 bases (no. of tags in appressoria:mycelia); antisense tag, GATC+17 bases (no. of tags in appressoria:mycelia).
Genes listed: ribosomal protein S9; ribosomal protein L34; DEAD box protein; MST12 (transcription factor); ATP synthase alpha chain-like protein.
4) Annotated CDS regions with sense and antisense RL-SAGE tags
We found that nearly 10% of the annotated genes in the M. grisea genome had both sense and antisense tags from the mycelial RL-SAGE library (data not shown). The significantly lower number of tag pairs in the protein coding region in the RL-SAGE library is likely because the majority of the RL-SAGE tags were located in the 500 bp downstream region of the annotated genes (Figure 1). Some of the sense and antisense RL-SAGE tag pairs are also present in the MPSS libraries and public EST collections (Table 3).
Identification of alternative transcript tags in mycelia and appressoria
Since MPSS and RL-SAGE tags are derived from the 3' end of each transcript, the presence of more than one tag in the CDS of a gene suggests an alternative termination of the gene. These termination differences could be derived from either alternative polyadenylation or alternative splicing at the 3' end. All the transcripts from both cases are called alternative transcripts in this study. To determine the extent of transcriptional diversity in M. grisea, we assessed the proportion of the genes with alternative terminations in both libraries.
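The counting step described above, finding genes whose CDS is hit by more than one distinct tag, can be sketched as follows. This is a minimal illustration under the assumption that tag-to-gene matches are already available as pairs; the function name, gene labels and tag sequences are hypothetical.

```python
from collections import defaultdict

def genes_with_alternative_tags(tag_hits, min_tags=2):
    """Group distinct tags by gene and keep genes with >= min_tags tags.

    tag_hits: iterable of (tag_sequence, gene_id) pairs from CDS matching.
    Returns (dict of gene -> sorted distinct tags, fraction of genes kept).
    """
    tags_per_gene = defaultdict(set)
    for tag, gene in tag_hits:
        tags_per_gene[gene].add(tag)   # a set ignores repeated hits of one tag
    alternative = {
        gene: sorted(tags)
        for gene, tags in tags_per_gene.items()
        if len(tags) >= min_tags
    }
    fraction = len(alternative) / len(tags_per_gene) if tags_per_gene else 0.0
    return alternative, fraction

# Hypothetical matches: geneA has two distinct tags, geneB has one
hits = [
    ("GATCAAAA", "geneA"),
    ("GATCCCCC", "geneA"),
    ("GATCGGGG", "geneB"),
    ("GATCAAAA", "geneA"),
]
alt_genes, alt_fraction = genes_with_alternative_tags(hits)
```

The same function could be applied separately to sense and antisense matches to obtain the two kinds of proportions discussed in this section.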
1) Alternative sense and antisense MPSS tags in mycelia and appressoria
Table 4. Frequency of the alternatively transcribed sense and antisense tags for the coding regions of M. grisea.
Columns: no. of alternative tags; no. of antisense tags/gene.
Rows: no. of genes with ≥ 2 tags (%); no. of genes with single tag (%).
Similarly, many annotated genes had more than one antisense tag in the MPSS libraries (Table 4). Twenty percent of the annotated genes in appressoria and 32% in mycelia had more than one antisense tag. As with the sense transcript tags in mycelia, the same gene (MGG_13239.5) had thirteen alternative antisense tags (data not shown).
2) Alternative sense and antisense RL-SAGE tags in mycelia
About one-fourth (27.5%) of the annotated genes in mycelia were found to produce at least two alternative sense tags in the RL-SAGE library, which is less than that observed in the MPSS mycelial library (Table 4). In the RL-SAGE mycelial library, a hypothetical protein similar to reverse transcriptase (MGG_13890.5) was found to encode sixteen alternative transcript tags (data not shown). Many genes with known functions were found to encode alternative transcript tags [see Additional File 7]. A total of 10,629 alternative tags were commonly present in both RL-SAGE and MPSS [see Additional File 8].
Among the antisense RL-SAGE tags in the mycelium library, nearly a quarter (24%) had at least two alternative antisense transcript tags per gene (Table 4). A hypothetical gene (MGG_00329.5) was found to generate seven alternative antisense transcript tags (data not shown). Several genes with multiple sense and antisense alternative tags were also identified. For example, four sense tags and one antisense tag were obtained for the HSP70 gene (data not shown).
Characterization of the appressorium and mycelium transcriptomes by oligoarray hybridizations
To compare the transcriptional profiles generated from MPSS and RL-SAGE with that from oligoarray analysis, the same RNA samples used in MPSS and RL-SAGE library construction were hybridized to the M. grisea/O. sativa oligoarray. Using a stringent cutoff at a false discovery rate (FDR) of 0.05, which corresponds to a p-value of 0.001, 9,138 genes (43.9%) were identified as significantly expressed in mycelium and appressorium tissues. Among them, 8,569 probes are from M. grisea genes and 569 probes are from rice genes. The hybridizations with the rice genes were likely due to sequence similarity between housekeeping genes in both organisms. Of the 8,569 M. grisea genes, 4,652 (54%) and 3,917 (46%) were differentially (2-fold) up-regulated in appressoria and down-regulated in mycelia, respectively. We identified 846 M. grisea genes that were ≥ 3.0 fold significantly up-regulated in appressoria, and 792 genes that were ≤ 3.0 fold significantly down-regulated in mycelia. The top 20 highly and specifically up- and down-regulated genes in appressoria and mycelia are shown in Additional File 9.
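How an FDR cutoff translates into a p-value threshold can be illustrated with the Benjamini-Hochberg step-up procedure. The text does not state which FDR method was used, so the choice of Benjamini-Hochberg here is an assumption, and the p-values below are made up for illustration.

```python
def benjamini_hochberg(p_values, fdr=0.05):
    """Return one boolean per p-value: True if called significant at this FDR."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    # Find the largest rank k (1-based) with p_(k) <= (k / m) * fdr
    max_rank = 0
    for rank, idx in enumerate(order, start=1):
        if p_values[idx] <= rank / m * fdr:
            max_rank = rank
    # Everything ranked at or below max_rank is called significant
    significant = [False] * m
    for rank, idx in enumerate(order, start=1):
        if rank <= max_rank:
            significant[idx] = True
    return significant

# Hypothetical per-probe p-values (not from the oligoarray data)
toy_p_values = [0.001, 0.008, 0.039, 0.041, 0.042, 0.06, 0.074, 0.205]
calls = benjamini_hochberg(toy_p_values, fdr=0.05)
```

In this toy case only the two smallest p-values pass, which shows why an FDR of 0.05 can correspond to a much smaller raw p-value cutoff when many probes are tested.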
Qualitative analysis of the M. grisea transcriptome using MPSS, RL-SAGE, and oligoarray platforms
Quantitative analysis of the mycelium and appressorium transcriptomes using MPSS, RL-SAGE, and oligoarray platforms
Table 5. Comparison of M. grisea transcriptomes measured by MPSS, RL-SAGE and oligoarray approaches.
Columns: correlation coefficient using all tags; using only significant tags; using tags ≥ 10 copies; using tags ≥ 25 copies.
Rows (each comparison listed twice, with footnote 1 and with footnotes 2 to 5): Appressoria: MPSS vs microarray; Mycelia: MPSS vs microarray; Mycelia: RL-SAGE vs microarray; Mycelia: MPSS vs RL-SAGE.
Conversely, a moderate correlation coefficient (0.59) was obtained between MPSS and the oligoarray appressorial data without data filtering (Table 5). After the removal of the four outlying genes, the correlation coefficient increased from 0.65 to 0.74 for the MPSS tags with copy number ≥ 10 (Table 5). In mycelia, the correlation was only about 0.5 for the genes with 25 or more copies, even after the removal of ten outlying genes. Comparing mycelial MPSS and RL-SAGE tag data, a low correlation was observed without data filtering. After the removal of the 18 outliers from the dataset, the correlation coefficient increased from 0.068 to 0.4 between mycelial MPSS and RL-SAGE tags (Table 5).
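The effect of copy-number filtering and outlier removal on a correlation coefficient can be sketched as below. The sketch assumes a Pearson coefficient on untransformed values; the text does not state which coefficient or transformation was used, and the gene labels and numbers are invented for illustration.

```python
import math

def pearson(xs, ys):
    """Pearson correlation coefficient of two equal-length sequences."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)

def filtered_correlation(records, min_copies=0, outliers=()):
    """Correlate tag counts with array signals after filtering.

    records: (gene_id, tag_copies, array_signal) triples.
    min_copies: keep genes whose tag count is at least this value.
    outliers: gene ids to drop before computing the coefficient.
    """
    kept = [
        (copies, signal)
        for gene, copies, signal in records
        if copies >= min_copies and gene not in outliers
    ]
    xs, ys = zip(*kept)
    return pearson(xs, ys)

# Hypothetical data: low-copy genes g1 and g5 disagree strongly with the array
toy_records = [
    ("g1", 1, 900.0),
    ("g2", 12, 150.0),
    ("g3", 30, 310.0),
    ("g4", 55, 560.0),
    ("g5", 3, 20.0),
]
r_all = filtered_correlation(toy_records)
r_high_copy = filtered_correlation(toy_records, min_copies=10)
r_no_outlier = filtered_correlation(toy_records, outliers={"g1"})
```

In this toy data set the coefficient rises once low-copy tags or the outlying gene are excluded, mirroring the trend reported for high copy number tags.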
In summary, we found a low to moderate correlation among the expression data from the three platforms, especially those data between MPSS and oligoarray. In general, a better expression correlation was observed for high copy number tags in the MPSS and RL-SAGE libraries with their corresponding genes on the oligoarray.
Recent technological innovations have advanced genomics in an unprecedented way. Several complex genomes have been sequenced in recent years providing an excellent starting point to fully understand the genetic blueprint of an organism. However, identification of all the expressed portions of a sequenced genome is a challenging task, yet critical to the understanding of gene regulation and metabolic networks. The public availability of the whole genome sequence of M. grisea has established a solid foundation to further understand the pathogenicity mechanism of this notorious fungal plant pathogen which causes severe yield losses in rice growing countries. Elucidating the transcriptome of M. grisea may ultimately lead to the development of novel approaches for combating rice blast disease. In the last few years, many researchers have adopted various gene expression profiling techniques to characterize the M. grisea transcriptome under various conditions or in different cell types including EST sequencing, microarrays, and SAGE. However, these methods have only provided partial information about the M. grisea transcriptome due to technical limitations and the depth of the surveys performed in these studies. In this study, we employed three global and quantitative expression tools, namely MPSS, RL-SAGE and oligoarrays, to profile the M. grisea transcriptome at two developmental stages. In these experiments, the same RNA samples isolated from mycelia and appressoria were used so that the results from three different platforms could be readily compared. A total of 12,531 and 16,580 significant tags in mycelia have been identified by MPSS and RL-SAGE, respectively. In appressoria, 12,927 significant MPSS tags were identified. Many identified transcripts were not present in the existing EST or cDNA collections of M. grisea and many of them matched unannotated regions of the genome.
Both RL-SAGE and MPSS are tag-based approaches for transcriptome analysis and genome annotation. They are different from the conventional approach that focuses on the large clone collections following the principle of collecting a "representative clone" for each gene. Although conventional approaches are useful to catalog the expressed genes in certain tissues, especially moderately or highly expressed genes, many weakly expressed genes might have been missed in these collections. More importantly, it is not possible for the conventional approaches to address the questions regarding the dynamics of transcriptional regulation and regulatory principles like alternative promoter usage and splicing. In contrast, RL-SAGE and MPSS methods isolate and sequence short tags (17–21 bp) from the 3' regions of most transcripts. At least 100,000 RL-SAGE tags or a million MPSS tags can be easily obtained from these libraries.
In this study, we obtained approximately 1.3 and 1.4 million tags from the mycelial and appressorial MPSS libraries, respectively. The matching rate of the significant tags from each library to the M. grisea draft genome sequence was about 85%, suggesting that the MPSS data have a very deep coverage of the transcriptome. In contrast, only about 50 to 60% of the significant tags matched the existing EST collections in the public databases. A similar result was also obtained from the RL-SAGE library, suggesting that current M. grisea EST collections are incomplete. One possible explanation is that most M. grisea ESTs were sequenced from the 5' region of the transcripts, whereas most of the MPSS and RL-SAGE tags were derived from the 3' region of expressed genes. Due to sequencing cost limitations, we sequenced only about 7,000 clones and obtained a quarter million tags from the mycelium RL-SAGE library. The transcripts recovered from the MPSS and RL-SAGE methods were overlapping but not identical due to the use of different anchoring enzymes in the library construction. Interestingly, we found that the genome matching rate of RL-SAGE tags is lower than that of the MPSS tags. There are three possible reasons. First, sequencing errors might generate unmatched tags, especially for singleton tags. Second, the significant MPSS tags used for matching have ≥ 4 copies whereas the significant RL-SAGE tags have ≥ 2 copies, suggesting the MPSS tags selected for matching may be more reliable. Third, most of the RL-SAGE tags matched the putative 3' UTR region, which may be frequently targeted for RNA variation as reported in mammalian systems. Nevertheless, our results demonstrated that MPSS and RL-SAGE methods are powerful techniques for deep transcriptome analysis and novel gene discovery. The two methods are complementary and different types of transcripts could be identified from each of these methods.
One of the advantages of tag-based techniques is the detection of alternatively terminated transcripts in the RNA population. From both MPSS and RL-SAGE libraries, we found that many annotated genes have alternative transcript tags. Some of them have corresponding EST transcripts. The percentage of genes with evidence of alternative termination ranged from 27% in the RL-SAGE mycelium library to 35% in the MPSS mycelium library. The higher rate of alternative transcript tags in the MPSS library may be due to the fact that more MPSS tags (66%) than RL-SAGE tags (37%) matched the coding regions of the annotated genes. It has been previously shown that a high rate of alternative transcripts was found in the protein coding regions (74%), and a low rate (4%) of alternative transcripts was found in the 3' UTR. The reason for the lower rate of alternative transcripts in appressoria than in mycelia is not clear. In addition, cloning and sequencing confirmation of some alternative transcripts without EST support is required. Regulation of the alternative transcripts and functions of these sense and antisense alternative transcripts in M. grisea warrant more detailed analyses.
Naturally occurring antisense transcripts were first observed in prokaryotes and viruses and later found in eukaryotes. There is evidence for the involvement of antisense transcripts in alternative transcription [26, 27], RNA editing [28, 29], DNA methylation [30, 31], genomic imprinting [32, 33], and X-chromosome inactivation. In this study, many antisense tags were identified in mycelium or appressorium libraries, corresponding to more than 25% of the annotated genes. It is not clear at this point why the M. grisea genome encodes so many antisense transcripts. Like the sense tags, almost two-thirds of antisense transcripts detected in the RL-SAGE library were located in the 500 bp downstream regions, whereas the majority of antisense tags from the MPSS library were located within the protein coding regions. Surprisingly, we found that almost one-fifth of the annotated genes encode both sense and antisense transcripts. In these genes, antisense transcripts could form double stranded RNA (dsRNA) with their sense transcripts. If a dsRNA is formed, it could be degraded to form small interfering RNAs that could decrease sense RNA abundance. Alternatively, interference by RNA polymerase II transcription activity on the antisense strand could restrict sense-strand transcription.
Recently, a few studies have demonstrated the function of antisense transcripts in fungal growth and development. For example, the circadian clock gene in the fungus Neurospora, a close relative of M. grisea, is regulated by the presence of natural antisense transcripts [37, 38]. Casas-Flores et al expressed an antisense version of the pkr-1 gene of Trichoderma atroviride, encoding the regulatory subunit of protein kinase A (PKA), which resulted in a non-sporulating phenotype.
Through data mining for EST, MPSS and RL-SAGE antisense tags, we have identified longer antisense transcripts in M. grisea for the transcription factor MST12 (MGG_12958.5), a DEAD box-containing protein (MGG_12894.5), ribosomal protein S9 (MGG_12892.5), ribosomal protein L34 (MGG_05296.5) and ATP synthase alpha chain (MGG_07752.5). MST12 (a homolog of yeast STE12) is essential for host penetration and invasive growth, but is not required for appressorium formation [40, 41]. MST12 is regulated by MAP kinase downstream of the signal transduction cascade during pathogenesis [40, 41]. It will be interesting to know whether the MST12-mediated signal transduction cascade is regulated by an antisense mechanism. A detailed characterization of the MST12 antisense transcript may reveal its novel role in pathogenesis.
MPSS, SAGE, and oligoarrays are three widely used methods for transcriptome profiling. We performed qualitative and quantitative comparative analysis of the mycelial and appressorial transcriptomes revealed by the three methods. More than 40% of the annotated genes were detected by both MPSS and RL-SAGE methods. There was a good correlation in gene expression levels between the appressorium MPSS expression data and appressorium oligoarray data (0.67) and a moderate correlation between the mycelium MPSS and the mycelium oligoarray data (0.49) after removing several outlying genes in the datasets. However, the correlation between RL-SAGE and MPSS or oligoarray data was not significant. The low correlation between RL-SAGE and oligoarray might be because oligoarray probes are designed from the protein coding regions of the annotated genes, whereas a large number of RL-SAGE tags are located in the 500 bp downstream regions (putative 3' UTRs). We speculate that the low correlation between MPSS and RL-SAGE is due to the following two reasons. First, the use of different anchoring enzymes would change the location of the MPSS and RL-SAGE tags within a given transcript. We found that the majority of the MPSS tags matched the protein coding regions and the majority of RL-SAGE tags matched within 500 bp downstream (putative 3' UTRs). In some cases, because the M. grisea annotation is incomplete, the MPSS and RL-SAGE tags derived from the same transcript may be mapped to two different predicted genes. Second, MPSS and RL-SAGE use two completely different library construction and sequencing procedures. Any bias in the PCR amplification of synthesized cDNAs could lead to generation of different tag populations. Nonetheless, these data provide the first detailed analysis of transcriptional activity in an important fungal pathogen of plants, and constitute a starting point for large-scale functional analysis of many novel fungal genes identified in the study.
We sequenced one RL-SAGE library of mycelia and two MPSS libraries of appressoria and mycelia of M. grisea. Using the same RNA samples of appressoria and mycelia, oligoarray hybridization was performed to check if these three approaches can detect similar sets of expressed genes in M. grisea. The distinct transcripts detected by MPSS and RL-SAGE in appressoria and mycelia ranged from 12,000 to 16,000, which correspond to about 9,000 genes, representing 80% of the predicted genes in M. grisea. A low to moderate correlation among the expression data from the three platforms was observed. MPSS and RL-SAGE methods identified many novel sense and antisense transcripts, which are differentially expressed at the two important growth stages of M. grisea. The identified novel transcripts, especially those specifically expressed in appressoria, are a valuable genomic resource for a better understanding of the molecular basis of M. grisea pathogenicity. The established MPSS and RL-SAGE websites provide a useful genomics resource for the public. The microarray (GSM126989) and RL-SAGE (GSM127012) data were deposited at the NCBI-GEO website, and the MPSS data can be downloaded at the M. grisea MPSS website.
Fungal strains, growth conditions and RNA isolation
The M. grisea strain 70–15 was chosen for the transcriptome profiling because of the availability of its whole genome sequence. The mycelia of 70–15 were cultured in a liquid medium [0.2% (w/v) yeast extract and 1% (w/v) sucrose] for 72 h (28°C at 200 rpm). The harvested mycelia were filtered and ground for RNA isolation using the TRIzol method (Invitrogen, CA). For isolation of total RNA from the germinating appressoria, the mycelia of 70–15 were grown for two weeks on oatmeal agar plates and then the conidia were induced under white fluorescent light for five days. About 2 ml of conidia suspension (5 × 10^5 spores ml^-1) was sprayed on Falcon Petri plates (150 mm × 15 mm) (Falcon, NJ) and the lids were covered with moist filter papers. These plates were incubated at 28°C and appressorium formation was monitored under a microscope at 6 h intervals. Fungal tissue was harvested 24 h after incubation since over 90% of the conidia extended a germ tube from the basal and/or the apical cell. Using a sterile blade, appressoria were scraped and transferred quickly to the TRIzol solution, and then the suspension was centrifuged at 12,000 × g at 5°C for 5 min. About 2 g of appressorium pellet was collected and subjected to total RNA isolation. The poly (A+) mRNA was isolated from the total RNA using the Oligotex mRNA midi kit (Qiagen, CA).
MPSS library construction and sequence analysis
MPSS library construction was carried out at Solexa, Inc. (Hayward, CA) as described by Brenner et al. and Meyers et al. About 500 μg of total RNA isolated from mycelium and appressorium tissues (described above) was used in the MPSS library construction. The entire data set is available at the M. grisea MPSS website. All tags were normalized to tags per million (TPM) as described by Meyers et al.
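The TPM scaling mentioned above can be illustrated with a minimal sketch. It shows only the basic idea of dividing each tag's raw count by the library total and multiplying by one million; the full procedure of Meyers et al. includes further filtering and merging steps that are not reproduced here. The function name and the example tag sequences are hypothetical.

```python
def tags_per_million(raw_counts):
    """Scale raw tag counts so that the library sums to one million.

    raw_counts: dict mapping a tag sequence to its raw count in one library.
    This is only the basic scaling step, not the complete published pipeline.
    """
    total = sum(raw_counts.values())
    if total == 0:
        raise ValueError("library contains no tags")
    return {tag: count * 1_000_000 / total for tag, count in raw_counts.items()}


if __name__ == "__main__":
    # Hypothetical toy library with two distinct tags.
    library = {"GATCAAAAACCCCCGGGGG": 3, "GATCTTTTTGGGGGCCCCC": 1}
    for tag, tpm in tags_per_million(library).items():
        print(tag, tpm)
```

Because every library is rescaled to the same total, TPM values from the mycelial and appressorial libraries can be compared directly even if the two libraries differ in sequencing depth.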
RL-SAGE library construction and sequence analysis
About 50 ng of mRNA isolated from mycelium tissue was used for RL-SAGE library construction as described by Gowda et al. A total of 7,292 sequence reads from the RL-SAGE library were generated at the Arizona Genomics Institute. The ditags and distinct tags were extracted from these sequences using the SAGEspy program developed at the Ohio Supercomputer Center. All RL-SAGE tags from the mycelium library are available from the MGOS database.
Annotation of MPSS and RL-SAGE tags
The distinct MPSS and RL-SAGE tag sequences were matched to the M. grisea reference sequences, including the genomic DNA, annotated genes (CDS), and 500 bp upstream (putative 5'UTR) and downstream (putative 3'UTR) regions, that are available from the Broad Institute (version 5.0, released in January 2006). The EST dataset of M. grisea from the TIGR database (release 5.0 on April 28, 2004) was also used for matching MPSS and RL-SAGE tags. We also used a number of tools developed at the Ohio Supercomputer Center, the Magnaporthe grisea Oryza sativa (MGOS) database and the University of Delaware MPSS database for data analyses. We identified the antisense transcripts from MPSS and RL-SAGE tags by converting all of the tags to antisense orientation using a reverse-complementation procedure before matching to the various sequences of M. grisea.
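The reverse-complementation step can be sketched as follows. This is an illustrative example only, not the tools actually used at the Ohio Supercomputer Center or the MPSS database: it assumes exact substring matching against a single CDS given on its coding strand, and the function names and sequences are hypothetical.

```python
# Translation table for complementing DNA bases (upper and lower case).
_COMPLEMENT = str.maketrans("ACGTacgt", "TGCAtgca")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def classify_tag(tag, cds):
    """Classify a tag against one CDS given on its coding strand.

    Returns 'sense' if the tag itself occurs in the CDS, 'antisense' if
    only its reverse complement occurs, 'both' if each occurs, and 'none'
    otherwise. Exact matching is an assumption of this sketch.
    """
    sense = tag in cds
    antisense = reverse_complement(tag) in cds
    if sense and antisense:
        return "both"
    if sense:
        return "sense"
    if antisense:
        return "antisense"
    return "none"


if __name__ == "__main__":
    cds = "ATGGATCCGTTAACGCATTGA"  # hypothetical coding sequence
    print(classify_tag("GATCCGTTAA", cds))  # tag taken from the coding strand
    print(classify_tag("TGCGTTAACG", cds))  # tag taken from the opposite strand
```

Matching each tag in both orientations is what allows a single tag library to report sense and antisense transcription for the same annotated gene.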
The M. grisea oligoarray and the hybridization procedures
The M. grisea oligoarray chip containing 60-mer oligos representing a total of 21,885 probes was obtained from Agilent (G4137A; Wilmington, Delaware). Of the 21,885 probes, 13,666 are from the annotated genes of M. grisea and 7,144 probes are from the rice ESTs. The remaining 1,075 probes include quality controls, positive controls and negative controls. The total RNA was isolated from mycelia and appressoria using the TRIzol reagent (Invitrogen) following the manufacturer's suggested protocol. Prior to hybridization, the quality and quantity of the total RNA samples were confirmed by agarose gel electrophoresis and by spectrophotometry. In the oligoarray hybridization experiments, we included six technical replicates of one RNA sample from mycelial and appressorial tissue, of which three were dye-reversal. About 500 ng of total RNA was used as template for cRNA production, and cyanine dyes were incorporated using the Agilent low RNA input linear amp kit (5198-3523; Agilent). Typical yields from 500 ng of total RNA input by in vitro transcription were 15 μg of cRNA (15 pmol of cyanine dye incorporated per μg of cRNA). One μg of labeled cRNA (Cy3 and Cy5 labeled sample) was diluted to 175 μl and fragmented at 60°C for 30 min following the Agilent hybridization protocols (5184-3568; Agilent). Fragmented samples were diluted to 500 μl (30% formamide final concentration) and hybridized for 20 h at 40°C. Arrays were washed, dried and scanned on an Agilent G2565BA microarray scanner. The raw TIFF images were analyzed using the Agilent Feature Extraction software v 8.1 with the recommended default settings.
Microarray data and KOG analysis
To minimize the variation in probe labeling and detection, intensities of Cy3- and Cy5-labeled probes were normalized using subgrid LOWESS normalization. Spots with lower signal intensity than the negative controls or with intensities less than twice the average background for the channel were manually blocked (flagged) from further analysis. Correcting and normalizing for dye bias eliminated a large component of the variance, thus giving greater confidence in the evaluated (treatment) and reference (control) data for testing statistical significance. The genes with valid signals in all six replicates were exported to Partek Pro v.6.0 software (Partek Inc., Missouri). The normalized values were used to calculate the ratio of channel intensities (Cy5/Cy3), which were then log2 transformed. The transformed ratios were plotted as a histogram, with cutoffs at two standard deviations on either side of the mean. A 1.7-fold increase or decrease in signal intensity (± 0.77 on the log2 scale), taken from the histogram, was considered to indicate differentially expressed genes. The normalized data were then subjected to an ANOVA model using the Bonferroni method [45, 46]. The FDR (≤ 0.05) [47, 48] was calculated based on the p-value (≤ 0.001) from ANOVA. This stringent criterion limits the ability of the oligoarray experiment to detect small but biologically important changes between the appressorium and mycelium at approximately a 95% confidence level. The genes that are significantly and differentially expressed in the appressorial and mycelial oligoarray data were used to compare the appressorial MPSS data with the mycelial MPSS and RL-SAGE data. Appressorium and mycelium differentially expressed genes from the microarray analysis were functionally categorized using the euKaryotic Orthologous Groups (KOGs) database. The gene sequences were searched with BLAST against the KOG database with an E-value cutoff of 1.0e-20.
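The relationship between the 1.7-fold cutoff and the ± 0.77 log2 threshold, and the general shape of an FDR adjustment, can be sketched in a few lines. This is a hedged illustration rather than the Partek Pro workflow: the function names are hypothetical, the intensities are invented, and the Benjamini-Hochberg step-up procedure is shown only because it is one of the two FDR approaches cited [47, 48]; the text does not state which one was actually applied.

```python
import math

FOLD_CUTOFF = 1.7
LOG2_CUTOFF = math.log2(FOLD_CUTOFF)  # about 0.77 on the log2 scale


def log2_ratio(cy5, cy3):
    """Log2-transformed ratio of normalized channel intensities (Cy5/Cy3)."""
    return math.log2(cy5 / cy3)


def is_differential(cy5, cy3, cutoff=LOG2_CUTOFF):
    """True if the log2 ratio lies at or beyond the cutoff in either direction."""
    return abs(log2_ratio(cy5, cy3)) >= cutoff


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


if __name__ == "__main__":
    print(round(LOG2_CUTOFF, 2))
    print(is_differential(200.0, 100.0), is_differential(150.0, 100.0))
    print(benjamini_hochberg([0.01, 0.04, 0.03, 0.005]))
```

Working on the log2 scale makes increases and decreases symmetric: a 1.7-fold increase and a 1.7-fold decrease map to +0.77 and -0.77, so a single absolute-value test covers both directions.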
This work is supported by the Plant Genome Research Program of the National Science Foundation (#0321437 and #0115642). We thank Dr. Jin-Rong Xu for his advice on harvesting germinating appressoria and RNA isolation, and Tae-Ho Lee for functional classification analysis using KOG programs.
- Chen J, Sun M, Lee S, Zhou G, Rowley JD, Wang SM: Identifying novel transcripts and novel genes in the human genome by using novel SAGE tags. Proc Natl Acad Sci USA. 2002, 99: 12257-12262. 10.1073/pnas.192436499.
- Adams MD, Kerlavage AR, Fleischmann RD, Fuldner RA, Bult CJ, Lee NH, Kirkness EF, Weinstock KG, Gocayne JD, White O, et al: Initial assessment of human gene diversity and expression patterns based upon 83 million nucleotides of cDNA sequence. Nature. 1995, 377: 3-174.
- Kikuchi S, Satoh K, Nagata T, Kawagashira N, Doi K, Kishimoto N, Yazaki J, Ishikawa M, Yamada H, Ooka H, Hotta I, Kojima K, Namiki T, Ohneda E, Yahagi W, Suzuki K, Li CJ, Ohtsuki K, Shishiki T, Otomo Y, Murakami K, Iida Y, Sugano S, Fujimura T, Suzuki Y, Tsunoda Y, Kurosaki T, Kodama T, Masuda H, Kobayashi M, Xie Q, Lu M, Narikawa R, Sugiyama A, Mizuno K, Yokomizo S, Niikura J, Ikeda R, Ishibiki J, Kawamata M, Yoshimura A, Miura J, Kusumegi T, Oka M, Ryu R, Ueda M, Matsubara K, Kawai J, Carninci P, Adachi J, Aizawa K, Arakawa T, Fukuda S, Hara A, Hashizume W, Hayatsu N, Imotani K, Ishii Y, Itoh M, Kagawa I, Kondo S, Konno H, Miyazaki A, Osato N, Ota Y, Saito R, Sasaki D, Sato K, Shibata K, Shinagawa A, Shiraki T, Yoshino M, Hayashizaki Y, Yasunishi A: Collection, mapping, and annotation of over 28,000 cDNA clones from japonica rice. Science. 2003, 301: 376-379. 10.1126/science.1081288.
- Velculescu VE, Zhang L, Vogelstein B, Kinzler KW: Serial analysis of gene expression. Science. 1995, 270: 484-487. 10.1126/science.270.5235.484.
- Gowda M, Jantasuriyarat C, Dean RA, Wang GL: Robust-LongSAGE (RL-SAGE): a substantially improved LongSAGE method for gene discovery and transcriptome analysis. Plant Physiol. 2004, 134: 890-897. 10.1104/pp.103.034496.
- Brenner S, Johnson M, Bridgham J, Golda G, Lloyd DH, Johnson D, Luo S, McCurdy S, Foy M, Ewan M, Roth R, George D, Eletr S, Albrecht G, Vermaas E, Williams SR, Moon K, Burcham T, Pallas M, DuBridge RB, Kirchner J, Fearon K, Mao J, Corcoran K: Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays. Nat Biotechnol. 2000, 18: 630-634. 10.1038/76469.
- Meyers BC, Tej SS, Vu TH, Haudenschild CD, Agrawal V, Edberg SB, Ghazal H, Decola S: The use of MPSS for whole-genome transcriptional analysis in Arabidopsis. Genome Res. 2004, 14: 1641-1653. 10.1101/gr.2275604.
- Gowda M, Li H, Alessi J, Chen F, Pratt R, Wang GL: Robust analysis of 5'-transcript ends (5'-RATE): a novel technique for transcriptome analysis and genome annotation. Nucleic Acids Res. 2006, 34: e126. 10.1093/nar/gkl522.
- Schena M, Shalon D, Davis RW, Brown PO: Quantitative monitoring of gene expression patterns with a complementary DNA microarray. Science. 1995, 270: 467-470. 10.1126/science.270.5235.467.
- Coughlan SJ, Agrawal V, Meyers B: A comparison of global gene expression measurement technologies in Arabidopsis thaliana. Comp Funct Genom. 2004, 5: 245-252. 10.1002/cfg.397.
- Talbot NJ: Having a blast: exploring the pathogenicity of Magnaporthe grisea. Trends Microbiol. 1995, 3: 9-16. 10.1016/S0966-842X(00)88862-9.
- Howard RJ, Valent B: Breaking and entering: host penetration by the fungal rice blast pathogen Magnaporthe grisea. Annu Rev Microbiol. 1996, 50: 491-512. 10.1146/annurev.micro.50.1.491.
- Dean RA, Talbot NJ, Ebbole DJ, Farman ML, Mitchell TK, Orbach MJ, Thon M, Kulkarni R, Xu JR, Pan H, Read ND, Lee YH, Carbone I, Brown D, Oh YY, Donofrio N, Jeong JS, Soanes DM, Djonovic S, Kolomiets E, Rehmeyer C, Li W, Harding M, Kim S, Lebrun MH, Bohnert H, Coughlan S, Butler J, Calvo S, Ma LJ, Nicol R, Purcell S, Nusbaum C, Galagan JE, Birren BW: The genome sequence of the rice blast fungus Magnaporthe grisea. Nature. 2005, 434: 980-986. 10.1038/nature03449.
- Talbot NJ: On the trail of a cereal killer: Exploring the Biology of Magnaporthe grisea. Annu Rev Microbiol. 2003, 57: 177-202. 10.1146/annurev.micro.57.030502.090957.
- Magnaporthe grisea database. [http://www.broad.mit.edu/annotation/fungi/Magnaporthe]
- TIGR M. grisea gene Index. [http://www.tigr.org/tigr-scripts/tgi/T_index.cgi?species=m_grisea]
- Phytopathogenic Fungi and Oomycete EST database. [http://cogeme.ex.ac.uk/index.html]
- Agilent Magnaporthe grisea oligo arrays. [http://www.chem.agilent.com/scripts/literaturepdf.asp?iWHID=33284]
- Phylogenetic classification of proteins encoded in complete genomes. [http://www.ncbi.nlm.nih.gov/COG/]
- Ebbole DJ, Jin Y, Thon M, Pan H, Bhattarai E, Thomas T, Dean R: Gene discovery and gene expression in the rice blast fungus, Magnaporthe grisea: analysis of expressed sequence tags. Mol Plant Microbe Interact. 2004, 17: 1337-1347.
- Irie T, Matsumura H, Terauchi R, Saitoh H: Serial analysis of gene expression (SAGE) of Magnaporthe grisea: genes involved in appressorium formation. Mol Genet Genomics. 2003, 270: 181-189. 10.1007/s00438-003-0911-6.
- Harbers M, Carninci P: Tag-based approaches for transcriptome research and genome annotation. Nature Methods. 2005, 2: 495-502. 10.1038/nmeth768.
- Soanes DM, Talbot NJ: A bioinformatics tool for analysis of EST abundance during infection-related development by Magnaporthe grisea. Mol Plant Pathol. 2005, 6: 503-512. 10.1111/j.1364-3703.2005.00298.x.
- Hayashizaki Y, Kanamori M: Dynamic transcriptome of mice. Trends Biotechnol. 2004, 22: 161-167. 10.1016/j.tibtech.2004.02.003.
- Modrek B, Resch A, Grasso C, Lee C: Genome-wide detection of alternative splicing in expressed sequences of human genes. Nucleic Acids Res. 2001, 29: 2850-2859. 10.1093/nar/29.13.2850.
- Munroe SH, Lazar MA: Inhibition of c-erbA mRNA splicing by a naturally occurring antisense RNA. J Biol Chem. 1991, 266: 22083-22086.
- Sureau A, Soret J, Guyon C, Gaillard C, Dumon S, Keller M, Crisanti P, Perbal B: Characterization of multiple alternative RNAs resulting from antisense transcription of the PR264/SC35 splicing factor gene. Nucleic Acids Res. 1997, 25: 4513-4522. 10.1093/nar/25.22.4513.
- Peters NT, Rohrbach JA, Zalewski BA, Byrkett CM, Vaughn JC: RNA editing and regulation of Drosophila 4f-rnp expression by sas-10 antisense readthrough mRNA transcripts. RNA. 2003, 9: 698-710. 10.1261/rna.2120703.
- Kim DD, Kim TT, Walsh T, Kobayashi Y, Matise TC, Buyske S, Gabriel A: Widespread RNA editing of embedded Alu elements in the human transcriptome. Genome Res. 2004, 14: 1719-1725. 10.1101/gr.2855504.
- Tufarelli C, Stanley JA, Garrick D, Sharpe JA, Ayyub H, Wood WG, Higgs DR: Transcription of antisense RNA leading to gene silencing and methylation as a novel cause of human genetic disease. Nat Genet. 2003, 34: 157-165. 10.1038/ng1157.
- Lewis A, Mitsuya K, Umlauf D, Smith P, Dean W, Walter J, Higgins M, Feil R, Reik W: Imprinting on distal chromosome 7 in the placenta involves repressive histone methylation independent of DNA methylation. Nat Genet. 2004, 36: 1291-1295. 10.1038/ng1468.
- Moore T, Constancia M, Zubair M, Bailleul B, Feil R, Sasaki H, Reik W: Multiple imprinted sense and antisense transcripts, differential methylation and tandem repeats in a putative imprinting control region upstream of mouse Igf2. Proc Natl Acad Sci USA. 1997, 94: 12509-12514. 10.1073/pnas.94.23.12509.
- Sleutels F, Zwart R, Barlow DP: The non-coding Air RNA is required for silencing autosomal imprinted genes. Nature. 2002, 415: 810-813.
- Lee JT, Davidow LS, Warshawsky D: Tsix, a gene antisense to Xist at the X-inactivation centre. Nat Genet. 1999, 21: 400-404. 10.1038/7734.
- Nakayashiki H: RNA silencing in fungi: mechanisms and applications. FEBS Lett. 2005, 579: 5950-5957. 10.1016/j.febslet.2005.08.016.
- Eszterhas SK, Bouhassira EE, Martin DI, Fiering S: Transcriptional interference by independently regulated genes occurs in any relative arrangement of the genes and is influenced by chromosomal integration position. Mol Cell Biol. 2002, 22: 469-479. 10.1128/MCB.22.2.469-479.2002.
- Kramer C, Loros JJ, Dunlap JC, Crosthwaite SK: Role for antisense RNA in regulating circadian clock function in Neurospora crassa. Nature. 2003, 421: 948-52. 10.1038/nature01427.
- Crosthwaite SK: Circadian clocks and natural antisense RNA. FEBS Lett. 2004, 567: 49-54. 10.1016/j.febslet.2004.04.073.
- Casas-Flores S, Rios-Momberg M, Rosales-Saavedra T, Martinez-Hernandez P, Olmedo-Monfil V, Herrera-Estrella A: Cross talk between a fungal blue-light perception system and the cyclic AMP signaling pathway. Eukaryot Cell. 2006, 5: 499-506. 10.1128/EC.5.3.499-506.2006.
- Park G, Xue C, Zheng L, Lam S, Xu JR: MST12 regulates infectious growth but not appressorium formation in the rice blast fungus Magnaporthe grisea. Mol Plant Microbe Interact. 2002, 15: 183-92.
- Park G, Bruno KS, Staiger CJ, Talbot NJ, Xu JR: Independent genetic mechanisms mediate turgor generation and penetration peg formation during plant infection in the rice blast fungus. Mol Microbiol. 2004, 53: 1695-707.
- Magnaporthe MPSS database. [http://mpss.udel.edu/mg/]
- SageSpy software. [http://www.osc.edu/hpc/software/apps/sagespy.shtml]
- Magnaporthe Grisea Oryza Sativa. [http://www.mgosdb.org/]
- Kerr MK, Martin M, Churchill GA: Analysis of variance for gene expression microarray data. J Comput Biol. 2000, 7: 819-837. 10.1089/10665270050514954.
- Sahai H, Ageel MI: The Analysis of Variance. 2000, Boston: Birkhauser
- Benjamini Y, Hochberg Y: Controlling the false discovery rate: A practical and powerful approach to multiple testing. J Royal Stat Soc B. 1995, 57: 289-300.
- Storey JD, Tibshirani R: Statistical significance for genome wide studies. Proc Natl Acad Sci USA. 2003, 100: 9440-9445. 10.1073/pnas.1530509100.