De novo characterization of the gametophyte transcriptome in bracken fern, Pteridium aquilinum
© Der et al; licensee BioMed Central Ltd. 2011
Received: 24 May 2010
Accepted: 8 February 2011
Published: 8 February 2011
Because of their phylogenetic position and unique characteristics of their biology and life cycle, ferns represent an important lineage for studying the evolution of land plants. Large and complex genomes in ferns combined with the absence of economically important species have been a barrier to the development of genomic resources. However, high throughput sequencing technologies are now being widely applied to non-model species. We leveraged the Roche 454 GS-FLX Titanium pyrosequencing platform in sequencing the gametophyte transcriptome of bracken fern (Pteridium aquilinum) to develop genomic resources for evolutionary studies.
681,722 quality and adapter trimmed reads totaling 254 Mbp were assembled de novo into 56,256 unique sequences (i.e. unigenes) with a mean length of 547.2 bp and a total assembly size of 30.8 Mbp with an average read-depth coverage of 7.0×. We estimate that 87% of the complete transcriptome has been sequenced and that all transcripts have been tagged. 61.8% of the unigenes had blastx hits in the NCBI nr protein database, representing 22,596 unique best hits. The longest open reading frame in 52.2% of the unigenes had positive domain matches in InterProScan searches. We assigned 46.2% of the unigenes with a GO functional annotation and 16.0% with an enzyme code annotation. Enzyme codes were used to retrieve and color KEGG pathway maps. A comparative genomics approach revealed a substantial proportion of genes expressed in bracken gametophytes to be shared across the genomes of Arabidopsis, Selaginella and Physcomitrella, and identified a substantial number of potentially novel fern genes. By comparing the list of Arabidopsis genes identified by blast with a list of gametophyte-specific Arabidopsis genes taken from the literature, we identified a set of potentially conserved gametophyte specific genes. We screened unigenes for repetitive sequences to identify 548 potentially-amplifiable simple sequence repeat loci and 689 expressed transposable elements.
This study is the first comprehensive transcriptome analysis for a fern and represents an important scientific resource for comparative evolutionary and functional genomics studies in land plants. We demonstrate the utility of high-throughput sequencing of a normalized cDNA library for de novo transcriptome characterization and gene discovery in a non-model plant.
As the sister lineage to seed plants, ferns represent a critical clade for comparative evolutionary studies in land plants [1, 2]. In contrast to seed plants, ferns typically retain the ancestral condition for a suite of life history traits (e.g. the lack of secondary growth, homospory, motile sperm, and independent free-living gametophyte and sporophyte generations). Ferns are thus an important outgroup for studying the evolution of wood, seeds, pollen, flowers, and fruit among other economically important characteristics found in seed plants, as well as the evolution of development in these complex structures and the expansion of gene families associated with seed plant evolution (e.g. transcription associated proteins). For reasons not yet fully understood, ferns typically have much higher chromosome numbers and larger genomes than seed plants [1, 3, 4].
Understanding the factors that influence these differences and their evolutionary consequences will require developing genomic resources in ferns to provide the necessary comparative context to understand the evolution of these traits [3–5]. Additionally, because ferns have evolved and maintained free-living and photosynthetic gametophyte and sporophyte life stages, they are an ideal group for studies of life-cycle evolution in land plants and genome function in independent haploid and diploid phases.
Among the genomic tools available for non-model organisms, expressed sequence tags (ESTs) have proven to be a rapid and cost effective strategy to develop sequence markers for comparative evolutionary and functional studies. While taxonomic sampling of plants in genome-scale projects has expanded substantially with dramatic decreases in sequencing cost, and increases in throughput, the development of genomic resources in ferns has lagged far behind those of other plant groups. This deficit has primarily been attributed to technical and economic barriers associated with the complex and large genomes in ferns, but is compounded by the limited agronomic value of most ferns . To date (December 2010), genomic information in ferns is limited to a genetic linkage map  and a modest EST data set comprised of about 5,000 Sanger cDNA sequences  for Ceratopteris richardii, just over 30,500 ESTs for Adiantum capillus-veneris, and over 2,600 ESTs in Osmunda lancea. Fewer than 500 ESTs for Pteridium aquilinum have previously been sequenced and deposited in Genbank .
With the introduction of cost efficient and massively-parallel high-throughput sequencing technologies, genome-scale studies in non-model organisms are being actively pursued for gene discovery, expression profiling, SNP and SSR marker development, and studies in functional, comparative, and evolutionary genomics in taxa where little or no previous genomic information exists [10–27]. We chose the Roche 454 GS-FLX Titanium pyrosequencing technology to sequence a full length enriched normalized cDNA library for the gametophyte generation of the bracken fern, Pteridium aquilinum (L.) Kuhn.
Pteridium (family: Dennstaedtiaceae) is a cosmopolitan fern genus comprised of several closely related species that are well differentiated from other genera in the family. Pteridium aquilinum is the most widespread of the bracken species and is distributed throughout the northern hemisphere and Africa . Bracken is notorious as a weed in open fields and is toxic to people and livestock. Despite its toxicity, bracken is eaten as a delicacy in several parts of the world, and due to its often high local abundance and large coarse stature, is sometimes used as thatching or packing material. Because bracken is common, easily cultured and manipulated, and can have a major economic impact, it has become one of the most intensively studied fern species.
Bracken has been used as a model system for the study of the fern life cycle [29–37], gametophyte development and the pheromonal mechanism of sex determination [38–45], cyanogenesis , carcinogenesis [47–49], invasion ecology [50–52], and climate change . Pteridium aquilinum has a diploid chromosome count of 2n = 104 and a total genome size of about 9.8 Gbp .
This study was conceived to develop an extensive expressed gene sequence resource in ferns for evolutionary and functional genomics. We present the first comprehensive transcriptome characterization for a fern gametophyte, including an assessment of transcriptome coverage, gene family and functional representation, SSR identification, and a comparative analysis of gene sets across land plants.
Sequencing and de novo assembly
Number of sequences
Mean length (bp)
Standard deviation on length (bp)
Mode length (bp)
Median length (bp)
Range in length (bp)
2 - 624
78 - 624
86 - 3229
Total length (Mbp)
Assembly summary statistics
Primary assembly (MIRA)
Secondary assembly (MIRA+CAP3)
Number of reads assembled into contigs
Number of reads discarded during assembly
Number of 454 reads retained as singletons
Number of primary contigs (MIRA)
Number of secondary contigs (CAP3)
Total number of unique sequences (unigenes)
Mean unigene length (bp)
Largest unigene length (bp)
Total assembly length (Mbp)
Mean read depth coverage
Transcriptome coverage and data quality
Transcriptome coverage estimates: ESTcalc
Number of technologies
454 GS-FLX (Titanium)
Predicted assembly statistics
Total Assembled Sequence (MB)
Mean unigene length (bp)
Mean unigene length (longest unigene per gene, bp)
Singleton yield (%)
Percent transcriptome (%)
Percent of genes tagged (%)
Percent of genes with 90% coverage (%)
Percent of genes with 90% coverage by largest unigene (%)
Percent of genes with 100% coverage (%)
Percent of genes with 100% coverage by largest unigene (%)
Taxonomic distribution of unigene blastx hits in the nr database
Best blastx hit
Lowest common ancestor for blastx hits
Number of unigenes
Percent of unigenes with hits
Number of unigenes
Percent of Unigenes with hits
Artificial sequences, hits don't pass threshold, or taxon not assigned
Repetitive sequence characterization
A total of 2,679 perfect di-, tri-, tetra-, and pentanucleotide simple sequence repeats (SSRs) longer than 9, 8, 6, and 5 repeats, respectively, were identified in 2,285 unigenes (Additional file 5) using msatCommander . Sufficient flanking sequences existed to design high quality primers for 548 potentially amplifiable SSR loci. PCR primers were chosen using Primer3  as implemented in msatCommander  (Additional file 6). Since this RNA was extracted from gametophytes derived from spores collected from a single diploid sporophyte, we are unable to determine the level of variation present at these SSR loci in natural populations.
Repetitive transposon classification
Number of elements
Percentage of sequence
Rolling circle Helitrons
We have used high-throughput sequencing data to characterize the gametophyte transcriptome of Pteridium aquilinum, a species for which very little genomic data are available. These data represent an 865-fold increase over the expressed sequence data previously available for Pteridium in Genbank .
Because contaminant adapter/primer sequences, polyA/T repeats, and low complexity end sequences can substantially compromise de novo assembly and can be difficult to completely remove (KM Dlugosch, personal communication), we aggressively filtered and trimmed the reads beyond the default instrument-level processing routines at the cost of sequence information loss (approximately 11.6 Mbp were removed, representing 4.4% of the filter-passed data).
Considering the sheer quantity and depth of sequencing produced by next-generation sequencing platforms, we deemed this an acceptable level of loss to improve accuracy in the assembly. We also used a two-step assembly strategy to minimize redundancy in our final unigene sequence set. We adopted this approach because MIRA is able to handle the large number of reads produced by 454 sequencing and utilizes a multi-pass strategy to identify and correct sequencing and assembly errors to produce a highly accurate assembly, but is sensitive to uneven sequencing depth of coverage and allelic diversity, resulting in a high number of redundant contigs. CAP3 is a proven and efficient DNA sequence assembler that can be used to join highly similar overlapping sequences, but is unable to handle the large number of reads produced by new high-throughput sequencing platforms. By combining these two assembly tools, we were able to join contigs and singletons that failed to assemble in MIRA to reduce sequence-level redundancy in our final unigene sequence set.
In examining the taxonomic distribution of nr blastx hits for the unigenes, we identified only a small proportion of sequences with best blast hits or LCA assignments outside of the green plants. When we examine these hits in greater detail, we find that many of them only align to short conserved domains, are hypothetical proteins of unknown function from model organisms, or are genes which are conserved across broad taxonomic levels, such as cytochrome P450, alpha-tubulin and dynein proteins. Additionally, because no other fern genomes have been sequenced, some of these sequences may represent novel fern genes. Thus, the evidence indicates that there is very little heterospecific sequence contamination in these data.
While the simulations that underlay ESTcalc are based on the well characterized Arabidopsis thaliana floral transcriptome (approximately 18,000 genes with transcripts averaging 1,500 bp long) and assume perfect cDNA normalization and sequence assembly, Wall et al.  show that their results were highly predictive for empirical datasets from diverse eukaryotic species and tissues, making their simulations useful as a null model for predicting transcriptome coverage in other organisms. The predictions for transcriptome coverage produced by ESTcalc are largely consistent with that observed in our two-step assembly. However, the larger assembly size, greater number of unigenes, and shorter unigene lengths observed in our data set relative to the ESTcalc prediction may be explained by imperfect cDNA normalization or inefficient de novo assembly. Additionally, it is also becoming evident that with increased transcriptome sequencing throughput, it is possible to capture a richer, more nuanced, picture of transcriptome complexity (e. g. partially processed transcripts and alternative splice forms) [73–75]. This increased information content, however, presents significant challenges for de novo assembly and often results in a fragmented or partially redundant assembly [76, 77]. Also consistent with the ESTcalc estimate that we have tagged all of the transcripts present in this sample, our unigene accumulation curve shows that the rate of new unigene detection for this cDNA library has declined to the point that additional sequencing is unlikely to detect new genes, but may however serve to condense and join non-overlapping contigs in our assembly. Similar approaches to evaluate sufficient sampling in transcriptome projects have been used by other researchers when other information about the transcriptome is absent [15, 78].
The GO functional categories represented in the Pteridium gametophyte transcriptome are not significantly different from the suite of functional categories present in the full Arabidopsis genome GO annotation. Most of the unigenes annotated with a cellular component are localized to plastids or mitochondria, but a large number of them are also targeted for ribosomes or the plasma membrane (Figure 3). The molecular function of unigenes is heavily dominated by binding nucleic acids or proteins and metabolic activity, including hydrolase and kinase activity (Figure 3). The biological processes represented include all of the major cellular processes from transport and cellular organization to transcription, translation, and metabolism (Figure 3).
Visual examination of annotated/colored KEGG maps (not shown) indicates that we have captured all of the genes required for glycolysis, the citrate cycle, and plant hormone biosynthesis including gibberellin, abscisic acid, strigolactone, cytokinin, brassinosteroid, and auxin. We also detected Enzyme code signatures for most of the genes involved in nucleic and amino acid metabolism and chlorophyll biosynthesis.
The PlantTribes database contains an objective MCL cluster-based classification system for plant genes and gene families [60, 79, 80]. By identifying similar sequences in this classification system, we assigned unigene sequences with putative gene family identities. The most abundant of these gene families present in the unigene set was the pentatricopeptide repeat protein (PPR) family, with over 600 unigenes classified as PPR proteins. We were also able to identify 65 unigenes classified in the MADS-box transcription factor family. Using this classification, gene sequences from Pteridium can be extracted for gene families of interest for use in studies of gene family evolution or phylogenomics. The overlap in orthogroup membership and blast hits for proteins in Arabidopsis thaliana, Selaginella moellendorffii, and Physcomitrella patens is similar (Figure 4A/B), but some striking differences can be observed. In both the PlantTribes and blast-based Venn diagrams, most of the unigenes which were identified in Arabidopsis, Selaginella, or Physcomitrella are also shared across all three species. In the PlantTribes classification, most of the genes are shared with Arabidopsis (21,649 unigenes), the species in this comparison that shares the most recent common ancestor with Pteridium, while slightly fewer and approximately equal gene representation is shared with Selaginella and Physcomitrella (19,649 and 19,485 unigenes, respectively). This is in contrast to the blast-based examination of gene set overlap in which Arabidopsis again has the greatest number of unigenes with hits (23,148 unigenes), but Physcomitrella has hits with 6,122 more unigenes than Selaginella (Figure 4). At first this seems counterintuitive because Pteridium shares a more recent common ancestor with Selaginella than with Physcomitrella. This pattern may be explained by the expression of "gametophyte genes" in Pteridium that are conserved with genes in the Physcomitrella genome, however little is known about the expression and evolution of genes between sporophyte and gametophyte stages. Both Physcomitrella and Pteridium have maintained a homosporous life cycle with a large independent gametophyte stage. These life history differences may also play a role on the selective pressures and/or constraints influencing gene evolution and more work is needed to address these hypotheses.
In our examination of Arabidopsis gametophyte genes, we identified gametophyte-expressed homologs in Pteridium for over half (52.7%) of the previously characterized Arabidopsis gametophyte specific or enriched genes. This finding suggests that a highly specific suite of genes required for gamete production and syngamy may be conserved over long periods of evolutionary time, despite substantial differences in life cycle and reproductive strategies between Pteridium and Arabidopsis. It should be noted also that these conserved genes are not the genes required for meiosis because the tissues sampled in this study and those used to identify gametophyte specific genes in Arabidopsis were all post-meiotic. An in-depth study of sporophytic genes in Pteridium is needed to better understand the evolution and expression of genes between the sporophyte and gametophyte stages.
This study is the first comprehensive sequencing effort and analysis of gene function in the transcriptome of a fern and represents the most extensive expressed sequence resource available in ferns to date, nearly 16 times more data than exists for Adiantum capillus-veneris. These data are an important new scientific resource for comparative evolutionary studies in land plants and will be of great value for studies of genome structure and function in ferns. These data can be used to develop microarrays for gene expression assays or serve as a reference transcriptome for future RNA-seq experiments in Pteridium. As additional genome-scale projects in diverse plants are undertaken, these data will be of immense value in representing ferns, the sister clade to seed plants, in comparative genomic analyses.
Gametophyte culture, library preparation, and sequencing
Pteridium aquilinum ssp. aquilinum spores (collection number: Wolf 83; sourced from a single sporophyte individual collected in Norwich, UK) were sown onto sterile agar nutrient media containing Bold's macronutrients and Nitch's micronutrients (prepared as described by ) and grown under white light. Whole gametophytes including both vegetative and sexually mature male, female, and hermaphroditic individuals of various ages (up to 9 months from germination) were flash frozen in liquid nitrogen and ground to a fine powder. Total RNA was isolated using the Sigma Spectrum Plant Total RNA Kit, incorporating on-column DNase I (Qiagen) digestion during extraction to remove traces of genomic DNA. Total RNA was concentrated by precipitating in 2.5 M ammonium acetate and 70% ethanol, then resolubilizing the RNA pellet in RNase-free water to approximately 500 ng/μL. Total RNA was quantified and its quality verified using an Agilent Bioanalyzer 2100. Total RNA was sent to the Center for Genomics and Bioinformatics at Indiana University, Bloomington (IU CGB), where a normalized transcriptome (cDNA) library optimized for Roche 454 GS-FLX Titanium sequencing was prepared .
Briefly, full-length enriched cDNA was synthesized with the CloneTech SMART cDNA synthesis kit using modified 454-ready adapter/primer oligos (K Mockaitis, unpublished). The frequency of abundant cDNA species was reduced using the Evrogen Direct Trimmer normalization kit. Normalized cDNA was fragmented by sonication, blunt end repaired, and ligated to custom 454 sequencing adapters. Amplification of the sequencing library incorporated an adapter-mediated PCR suppression effect to preferentially amplify ligation products suitable for 454 sequencing . The final transcriptome library was size selected and 454 sequencing proceeded according to the manufacturers recommended protocol on 3 regions of a four region PicoTiter plate.
Sequence preprocessing, transcriptome assembly, and coverage assessment
Sequence reads generated in this study were deposited in the NCBI sequence read archive (SRA012887). Raw sequence reads that passed instrument software quality filters were trimmed of custom oligonucleotide adapter sequences (Justin Choi, unpublished, IU CGB). The resulting sequences were further processed with SeqClean  and SnoWhite v1.0.3  to remove low quality, short, and contaminant sequences, and to aggressively trim polyA/T sequences. Cleaned reads were assembled de novo in MIRA v3rc4 [55, 85] using a minimum percent identity of 94% to align reads, retaining singleton reads in the assembly (-OUT:sssip = yes). This primary assembly was passed through a secondary assembly step in CAP3 (95% identity, 25 bp overlap)  to reduce redundancy in the final assembly and join additional contigs. Custom perl scripts were used to extract summary information about the reads and assemblies (Tables 1 and 2; Additional file 7).
We utilized a web-based tool, ESTcalc , to estimate the predicted level of transcriptome coverage for our data set. Input parameters to ESTcalc require that we specify the sequencing technology used (or a combination of technologies) and either the total sequencing level (Mbp), or the number of reads and an estimate of read lengths. We used the best approximation for sequencing technology available (454 GS-FLX) and the empirical values observed for the cleaned sequence data (254 Mbp or 681,722 reads with an average of 372.6 bp/read) to obtain our estimates. The estimates reported were identical whether we parameterized on total sequence or supplied read length information as well.
To determine the number of eukaryotic ultra conserved orthologs (UCOs ) we captured in the Pteridium transcriptome data set, we queried a list of 357 UCO coding sequences from Arabidopsis (sequences available at: http://compgenomics.ucdavis.edu/compositae_reference.php) into the unigene set with an e-value threshold of 1e-10 using NCBI tblastx. These blast results were then parsed to determine then number of UCOs with a positive hit that returned an amino acid alignment greater than 30 residues long.
We assessed the changing rate of new gene detection as a function of sampling effort (unigene accumulation curve, Figure 2) using a bootstrapped random sampling protocol implemented in a custom perl script (Additional file 8). This script uses the empirical distribution of read number per unigene in our final assembly to randomly sample reads one at a time and tracks the total number of unigenes detected at each step. Because the order of sampling can impact the shape of this curve, we computed 1,000 replicate random sample orders and calculated the mean number of unigenes detected after each draw. To evaluate the level of variation in the number of unigenes detected, we also calculated the 95% confidence interval on the number of unigenes. Using this curve, we then estimated the number of reads it took to capture an average of 90%, 95%, and 99% of the unigenes and the average number of additional reads required to detect each of the last 10 unigenes.
To evaluate for the presence of potential contaminating sequences, we examined the taxonomic distribution of blastx hits for the unigene set in the NCBI nr protein database using an e-value threshold of 1e-10. The top 10 blast hits for each unigene were kept and examined in MEGAN v.3.7.2 . MEGAN is a tool built for the examination of metagenomic data sets and provides a number of useful functions to explore the information content of large blast results. The blast results for each unigene were mapped onto the NCBI taxonomy tree by examining just the best hit (lowest e-value) or by using the lowest common ancestor (LCA) algorithm . LCA was determined using at least three blast hits with a bitscore greater than 75 and within 10% of the top bitscore for that unigene.
The same blast search used to examine the taxonomic distribution of blast hits was used to identify putative homologous proteins and annotate each sequence with gene ontology (GO) terms using Blast2GO [65–67]. Blast2GO was also used to automatically handle InterProScan (IPS) searches to identify conserved protein domains in translations of the longest ORF in each unigene. Any GO terms associated with IPS hits were then merged into the blast-based GO annotation. GO terms were then used to map enzyme codes to each sequence. Enzyme codes were then used to automatically color and retrieve KEGG pathway maps [86, 87]. As a final step in examining a broad functional representation of the gametophyte transcriptome, GO terms were mapped to the reduced GO-slim ontology and visualized and explored with directed acyclic graphs (not shown) and summarized with filtered pie charts including GO categories represented by at least 150 sequences (Figure 3)
Unigenes were classified into tribe- and orthoMCL clusters in the PlantTribes2.0 database using a custom Perl pipeline (dePamphilis lab, unpublished) which queries each unigene against the complete inferred protein set from ten plant species that have complete sequenced genomes, using a blastx e-value threshold of 1e-10. Unigenes were assigned to MCL clusters based on the best blast hit. Species used for blast searches and gene clustering in the PlantTribes2.0 database include: Chlamydomonas reinhardtii v3.0, Physcomitrella patens v1.1, Selaginella moellendorffii v1.0, Oryza sativa v5.0, Sorghum bicolor v1.0, Vitis vinifera v1.0, Populus trichocarpa v1.0, Medicago truncatula v1.0, Carica papaya v1.0, and Arabidopsis thaliana (TAIR7). Meta-information about each assigned cluster was extracted from the database for each unigene and was output to a file. Simple text-based searches examined this information to retrieve gene family names and putative gene family functional data. The shared single copy tribes  for Arabidopsis, Vitis, Populus, and Oryza were identified in the PlantTribes2.0 database and the number of these tribes detected in the unigene set was determined by examining the pipeline output file. Orthogroup assignments for Pteridium unigenes were examined for cluster membership by Selaginella, Physcomitrella, and Arabidopsis to generate a Venn diagram showing putative gene level overlap (Figure 4A). Unigenes were also directly queried against each of these proteomes using a blastx e-value threshold of 1e-10 to examine the distribution of similar proteins in these three species. Venn diagrams were generated to graphically illustrate the overlap of unigenes for each proteome (Figure 4B). To screen for the presence of putative gametophyte genes, a list of male gametophyte specific genes in Arabidopsis was extracted from the microarray study of Honys and Twell  and a combined list of significantly enriched female gametophyte genes was compiled from the studies of Wuest et al.  and Yu et al. . These Arabidopsis gametophyte gene lists compiled from the literature were examined for overlap with the list of genes producing significant alignments with Pteridium unigenes in the blastx search against the Arabidopsis genome to produce a Venn diagram of gametophyte genes (Figure 5).
Funding for this research was provided by the Vice President for Research and the Center for Integrated Biosystems at Utah State University. The Department of Biology and the Ecology Center at Utah State University provided funds for publication processing charges. Keithanne Mockaitis directed the cDNA library preparation and sequencing at the Center for Genomics and Bioinformatics at Indiana University, and Justin Choi pre-processed the primary sequencing data. Michael Pfrender, Aaron Duffy, Katrina Dlugosch, Eric Wafula, Paul Cliften, Amanda Grusz, Jacob Davidson, and Kristal Watrous assisted with various aspects of this project or manuscript preparation.
- Barker MS, Wolf PG: Unfurling fern biology in the genomics age. Bioscience. 2010, 60: 177-185. 10.1525/bio.2010.60.3.4.View ArticleGoogle Scholar
- Pryer KM, Schneider H, Zimmer EA, Ann Banks J: Deciding among green plants for whole genome studies. Trends in Plant Science. 2002, 7: 550-554. 10.1016/S1360-1385(02)02375-0.PubMedView ArticleGoogle Scholar
- Barker MS: Evolutionary genomic analyses of ferns reveal that high chromosome numbers are a product of high retention and fewer rounds of polyploidy relative to angiosperms. American Fern Journal. 2009, 99: 136-137.Google Scholar
- Nakazato T, Barker MS, Rieseberg LH, Gastony GJ: Evolution of the nuclear genome of ferns and lycophytes. Biology and evolution of ferns and lycophytes. Edited by: Ranker TA, Haufler CH. 2008, Cambridge University Press, 175-198. full_text.View ArticleGoogle Scholar
- Nakazato T: Fern genome structure and evolution. American Fern Journal. 2009, 99: 134-136.Google Scholar
- Nakazato T, Jung MK, Housworth EA, Rieseberg LH, Gastony GJ: Genetic map-based analysis of genome structure in the homosporous fern Ceratopteris richardii. Genetics. 2006, 173: 1585-1597. 10.1534/genetics.106.055624.PubMed CentralPubMedView ArticleGoogle Scholar
- Salmi ML, Bushart TJ, Stout SC, Roux SJ: Profile and analysis of gene expression changes during early development in germinating spores of Ceratopteris richardii. Plant Physiology. 2005, 138: 1734-1745. 10.1104/pp.105.062851.PubMed CentralPubMedView ArticleGoogle Scholar
- Yamauchi D, Sutoh K, Kanegae H, Horiguchi T, Matsuoka K, Fukuda H, Wada M: Analysis of expressed sequence tags in prothallia of Adiantum capillus-veneris. Journal of Plant Research. 2005, 118: 223-227. 10.1007/s10265-005-0209-3.PubMedView ArticleGoogle Scholar
- NCBI GenBank dbEST. [http://www.ncbi.nlm.nih.gov/nucest]
- Alagna F, D'agostino N, Torchia L, Servili M, Rao R, Pietrella M, Giuliano G, Chiusano ML, Baldoni L, Perrotta G: Comparative 454 pyrosequencing of transcripts from two olive genotypes during fruit development. BMC Genomics. 2009, 10: 399-10.1186/1471-2164-10-399.PubMed CentralPubMedView ArticleGoogle Scholar
- Barakat A, DiLoreto DS, Zhang Y, Smith C, Baier K, Powell WA, Wheeler N, Sederoff R, Carlson JE: Comparison of the transcriptomes of American chestnut (Castanea dentata) and Chinese chestnut (Castanea mollissima) in response to the chestnut blight infection. BMC Plant Biology. 2009, 9: 51-10.1186/1471-2229-9-51.PubMed CentralPubMedView ArticleGoogle Scholar
- Bellin D, Ferrarini A, Chimento A, Kaiser O, Levenkova N, Bouffard P, Delledonne M: Combining next-generation pyrosequencing with microarray for large scale expression analysis in non-model species. BMC Genomics. 2009, 10: 555-10.1186/1471-2164-10-555.PubMed CentralPubMedView ArticleGoogle Scholar
- Collins L, Biggs P, Voelckel C, Joly S: An approach to transcriptome analysis of non-model organisms using short-read sequences. Genome Informatics. 2008, 21: 3-14. full_text.PubMedGoogle Scholar
- Hahn DA, Ragland GJ, Shoemaker DD, Denlinger DL: Gene discovery using massively parallel pyrosequencing to develop ESTs for the flesh fly Sarcophaga crassipalpis. BMC Genomics. 2009, 10: 234-10.1186/1471-2164-10-234.PubMed CentralPubMedView ArticleGoogle Scholar
- Hale MC, McCormick CR, Jackson JR, Dewoody JA: Next-generation pyrosequencing of gonad transcriptomes in the polyploid lake sturgeon (Acipenser fulvescens): the relative merits of normalization and rarefaction in gene discovery. BMC Genomics. 2009, 10: 203-10.1186/1471-2164-10-203.PubMed CentralPubMedView ArticleGoogle Scholar
- Kristiansson E, Asker N, Förlin L, Larsson DG: Characterization of the Zoarces viviparus liver transcriptome using massively parallel pyrosequencing. BMC Genomics. 2009, 10: 345-10.1186/1471-2164-10-345.PubMed CentralPubMedView ArticleGoogle Scholar
- Meyer E, Aglyamova GV, Wang S, Buchanan-Carter J, Abrego D, Colbourne JK, Willis BL, Matz MV: Sequencing and de novo analysis of a coral larval transcriptome using 454 GSFlx. BMC Genomics. 2009, 10: 219-10.1186/1471-2164-10-219.PubMed CentralPubMedView ArticleGoogle Scholar
- Novaes E, Drost DR, Farmerie WG, Pappas GJ, Grattapaglia D, Sederoff RR, Kirst M: High-throughput gene and SNP discovery in Eucalyptus grandis, an uncharacterized genome. BMC Genomics. 2008, 9: 312-10.1186/1471-2164-9-312.PubMed CentralPubMedView ArticleGoogle Scholar
- Parchman TL, Geist KS, Grahnen JA, Benkman CW, Buerkle CA: Transcriptome sequencing in an ecologically important tree species: assembly, annotation, and marker discovery. BMC Genomics. 2010, 11: 180-10.1186/1471-2164-11-180.PubMed CentralPubMedView ArticleGoogle Scholar
- Roeding F, Borner J, Kube M, Klages S, Reinhardt R, Burmester T: A 454 sequencing approach for large scale phylogenomic analysis of the common emperor scorpion (Pandinus imperator). Molecular Phylogenetics and Evolution. 2009, 53: 826-834. 10.1016/j.ympev.2009.08.014.PubMedView ArticleGoogle Scholar
- Rokas A, Abbot P: Harnessing genomics for evolutionary insights. Trends in Ecology and Evolution. 2009, 24: 192-200. 10.1016/j.tree.2008.11.004.PubMedView ArticleGoogle Scholar
- Vera JC, Wheat CW, Fescemyer HW, Frilander MJ, Crawford DL, Hanski I, Marden JH: Rapid transcriptome characterization for a nonmodel organism using 454 pyrosequencing. Molecular Ecology. 2008, 17: 1636-1647. 10.1111/j.1365-294X.2008.03666.x.PubMedView ArticleGoogle Scholar
- Wang W, Wang Y, Zhang Q, Qi Y, Guo D: Global characterization of Artemisia annua glandular trichome transcriptome using 454 pyrosequencing. BMC Genomics. 2009, 10: 465-10.1186/1471-2164-10-465.PubMed CentralPubMedView ArticleGoogle Scholar
- Zagrobelny M, Scheibye-Alsing K, Jensen NB, Møller BL, Gorodkin J, Bak S: 454 pyrosequencing based transcriptome analysis of Zygaena filipendulae with focus on genes involved in biosynthesis of cyanogenic glucosides. BMC Genomics. 2009, 10: 574-10.1186/1471-2164-10-574.PubMed CentralPubMedView ArticleGoogle Scholar
- Bettencourt R, Pinheiro M, Egas C, Gomes P, Afonso M, Shank T, Serrao Santos R: High-throughput sequencing and analysis of the gill tissue transcriptome from the deep-sea hydrothermal vent mussel Bathymodiolus azoricus. BMC Genomics. 2010, 11: 559-10.1186/1471-2164-11-559.PubMed CentralPubMedView ArticleGoogle Scholar
- Crawford JE, Guelbeogo WM, Sanou A, Traoré A, Vernick KD, Sagnon N, Lazzaro BP: De Novo Transcriptome Sequencing in Anopheles funestus using Illumina RNA-Seq Technology. PLoS ONE. 2010, 5: e14202-10.1371/journal.pone.0014202.PubMed CentralPubMedView ArticleGoogle Scholar
- O'Neil ST, Dzurisin JD, Carmichael RD, Lobo NF, Emrich SJ, Hellmann JJ: Population-level transcriptome sequencing of nonmodel organisms Erynnis propertius and Papilio zelicaon. BMC Genomics. 2010, 11: 310-PubMed CentralPubMedView ArticleGoogle Scholar
- Der JP, Thomson JA, Stratford JK, Wolf PG: Global chloroplast phylogeny and biogeography of bracken (Pteridium; Dennstaedtiaceae). American Journal of Botany. 2009, 96: 1041-1049. 10.3732/ajb.0800333.PubMedView ArticleGoogle Scholar
- DeMaggo AE, Raghavan V: Germination of bracken fern spore. Experimental Cell Research. 1972, 73: 182-186. 10.1016/0014-4827(72)90117-6.View ArticleGoogle Scholar
- Gottlieb JE: Development of the bracken fern, Pteridium aquilinum (L.) Kuhn.-- II. stelar ontogeny of the sporeling. Phytomorphology. 1959, 9: 91-105.Google Scholar
- Sheffield E: Apospory in the fern Pteridium aquilinum (L) Kuhn 1: low temperature scanning electron microscopy. Cytobios. 1984, 39: 171-176.Google Scholar
- Sheffield E: Cellular aspects of the initiation of asporous outgrowths in ferns. Proceedings of the Royal Society B. 1985, 86: 45-50.Google Scholar
- Sheffield E: Alternation of generations. Biology and evolution of ferns and lycophytes. Edited by: Ranker TA, Haufler CH. 2008, Cambridge University Press, 49-74. full_text.View ArticleGoogle Scholar
- Sheffield E, Bell PR: Cessation of vascular activity correlated with asporous development in Pteridium aquilinum (L.) Kuhn. New Phytologist. 1981, 88: 533-538. 10.1111/j.1469-8137.1981.tb04096.x.View ArticleGoogle Scholar
- Takahashi C: Chromosome study on induced apospory in the bracken fern. La Kromosomo. 1961, 48: 1602-1605.Google Scholar
- Webster BD, Steeves TA: Morphogenesis in Pteridium aquilinum (L.) Kuhn - general morphology and growth habit. Phytomorphology. 1958, 8: 30-41.Google Scholar
- Wynn JM, Small JL, Pakeman RJ, Sheffield E: An assessment of genetic and environmental effects on sporangial development in bracken [Pteridium aquilinum (L.) Kuhn] using a novel quantitative method. Annals of Botany. 2000, 85: 113-115. 10.1006/anbo.1999.1068.Google Scholar
- Ashcroft CJ, Sheffield E: The effect of spore density on germination and development in Pteridium, monitored using a novel culture technique. American Fern Journal. 2000, 90: 91-99. 10.2307/1547324.View ArticleGoogle Scholar
- Bell PR, Duckett JG: Gametogenesis and fertilization in Pteridium. Botanical Journal of the Linnean Society. 1976, 73: 47-78. 10.1111/j.1095-8339.1976.tb02012.x.View ArticleGoogle Scholar
- Näf U: On the physiology of antheridium formation in the bracken fern, Pteridium aquilinum (L.) Kuhn. Physiologia Plantarum. 1958, 11: 728-746.View ArticleGoogle Scholar
- Robertson FW: Reproductive behavior of cloned gametophytes of Pteridium aquilinum (L.) Kuhn. American Fern Journal. 2002, 92: 270-287. 10.1640/0002-8444(2002)092[0270:RBOCGO]2.0.CO;2.View ArticleGoogle Scholar
- Schneller JJ: Antheridiogens. Biology and evolution of ferns and lycophytes. Edited by: Ranker TA, Haufler CH. 2008, Cambridge University Press, 134-158. full_text.View ArticleGoogle Scholar
- Sobota AE, Partanen CR: The growth and division of cells in relation to morphogenesis in fern gametophytes I. photomorphogenetic studies in Pteridium aquilinum. Canadian Journal of Botany. 1966, 44: 497-506. 10.1139/b66-059.View ArticleGoogle Scholar
- Steeves TA, Sussex IM, Partanen CR: In vitro studies on abnormal growth of prothalli of the bracken fern. American Journal of Botany. 1955, 42: 232-245. 10.2307/2438559.View ArticleGoogle Scholar
- Whittier DP: The origin and development of apogamous structures in the gametophyte of Pteridium in sterile culture. Phytomorphology. 1962, 12: 10-20.Google Scholar
- Cooper-Driver GA, Swain T: Cyanogenic polymorphism in bracken in relation to herbivore predation. Nature. 1970, 260: 604-10.1038/260604a0.View ArticleGoogle Scholar
- Alonso-Amelot ME, Avendano M: Human carcinogenesis and bracken fern: a review of the evidence. Current Medicinal Chemistry. 2002, 9: 675-686.PubMedView ArticleGoogle Scholar
- Beniston RG, Campo MS: Quercetin elevates p27(Kip1) and arrests both primary and HPV16 E6/E7 transformed human keratinocytes in G1. Oncogene. 2003, 22: 5504-5514. 10.1038/sj.onc.1206848.PubMedView ArticleGoogle Scholar
- Potter DM, Baird MS: Carcinogenic effects of ptaquiloside in bracken fern and related compounds. British Journal of Cancer. 2000, 83: 914-920. 10.1054/bjoc.2000.1368.PubMed CentralPubMedView ArticleGoogle Scholar
- Ghorbani J, Le Duc MG, McAllister HA, Pakeman RJ, Marrs RH: Effects of experimental restoration on the diaspore bank of an upland moor degraded by Pteridium aquilinum invasion. Land Degredation and Development. 2007, 18: 659-669. 10.1002/ldr.804.View ArticleGoogle Scholar
- Rodrigues Da Silva UDS, Matos DMDS: The invasion of Pteridium aquilinum and the impoverishment of the seed bank in fire prone areas of Brazilian Atlantic forest. Biodiversity and Conservation. 2006, 15: 3035-3043. 10.1007/s10531-005-4877-z.View ArticleGoogle Scholar
- Schneider LC: Bracken fern invasion in southern Yucatan: a case for land-change science. Geographical Review. 2004, 94: 229-241. 10.1111/j.1931-0846.2004.tb00169.x.View ArticleGoogle Scholar
- Pakeman RJ, Marrs RH: Modelling the effects of climate change on the growth of bracken (Pteridium aquilinum) in Britain. Journal of Applied Ecology. 1996, 33: 561-575. 10.2307/2404985.View ArticleGoogle Scholar
- Tan MK, Thompson JA: Variation in genome size in Pteridium. Bracken 89: Bracken Biology and Management. Edited by: Thompson JA, Smith RT. 1990, Sydney, Australia: Australian Institute of Agricultural Science, 87-93.Google Scholar
- Chevreux B, Pfisterer T, Drescher B, Driesel AJ, Müller WE, Wetter T, Suhai S: Using the miraEST assembler for reliable and automated mRNA transcript assembly and SNP detection in sequenced ESTs. Genome Research. 2004, 14: 1147-1159. 10.1101/gr.1917404.PubMed CentralPubMedView ArticleGoogle Scholar
- Huang X, Madan A: CAP3: A DNA sequence assembly program. Genome Research. 1999, 9: 868-877. 10.1101/gr.9.9.868.PubMed CentralPubMedView ArticleGoogle Scholar
- Wall PK, Leebens-Mack J, Chanderbali AS, Barakat A, Wolcott E, Liang H, Landherr L, Tomsho LP, Hu Y, Carlson JE, Ma H, Schuster SC, Soltis DE, Soltis PS, Altman N, dePamphilis CW: Comparison of next generation sequencing technologies for transcriptome characterization. BMC Genomics. 2009, 10: 347-10.1186/1471-2164-10-347.PubMed CentralPubMedView ArticleGoogle Scholar
- Kozik A, Matvienko M, Kozik I, Van Leeuwen H, Van Deynze A, Michelmore R: Eukaryotic ultra conserved orthologs and estimation of gene capture In EST libraries [abstract]. Plant and Animal Genomes Conference. 2008, 16: P6-Google Scholar
- Duarte JM, Wall PK, Edger PP, Landherr LL, Ma H, Pires JC, Leebens-Mack J, Depamphilis C: Identification of shared single copy nuclear genes in Arabidopsis, Populus, Vitis and Oryza and their phylogenetic utility across various taxonomic levels. BMC Evolutionary Biology. 2010, 10: 61-10.1186/1471-2148-10-61.PubMed CentralPubMedView ArticleGoogle Scholar
- Wall PK, Leebens-Mack J, Müller KF, Field D, Altman NS, dePamphilis CW: PlantTribes: a gene and gene family resource for comparative genomics in plants. Nucleic Acids Res. 2008, 36: D970-6. 10.1093/nar/gkm972.PubMed CentralPubMedView ArticleGoogle Scholar
- Huson DH, Auch AF, Qi J, Schuster SC: MEGAN analysis of metagenomic data. Genome Res. 2007, 17: 377-386. 10.1101/gr.5969107.PubMed CentralPubMedView ArticleGoogle Scholar
- Der JP: Genomic Perspectives on Evolution in Bracken Fern. Dissertation. 2010, Utah State University, Department of BiologyGoogle Scholar
- Stern DB, Goldschmidt-Clermont M, Hanson MR: Chloroplast RNA metabolism. Annual Review of Plant Biology. 2010, 61: 125-155. 10.1146/annurev-arplant-042809-112242.PubMedView ArticleGoogle Scholar
- Weber AP, Weber KL, Carr K, Wilkerson C, Ohlrogge JB: Sampling the Arabidopsis transcriptome with massively parallel pyrosequencing. Plant Physiology. 2007, 144: 32-42. 10.1104/pp.107.096677.PubMed CentralPubMedView ArticleGoogle Scholar
- Conesa A, Götz S: Blast2GO: A comprehensive suite for functional analysis in plant genomics. International Journal of Plant Genomics. 2008, 2008: 619832-10.1155/2008/619832.PubMed CentralPubMedView ArticleGoogle Scholar
- Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M: Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005, 21: 3674-3676. 10.1093/bioinformatics/bti610.PubMedView ArticleGoogle Scholar
- Götz S, García-Gómez JM, Terol J, Williams TD, Nagaraj SH, Nueda MJ, Robles M, Talón M, Dopazo J, Conesa A: High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic Acids Res. 2008, 36: 3420-3435.PubMed CentralPubMedView ArticleGoogle Scholar
- Honys D, Twell D: Transcriptome analysis of haploid male gametophyte development in Arabidopsis. Genome Biology. 2004, 5: R85-10.1186/gb-2004-5-11-r85.PubMed CentralPubMedView ArticleGoogle Scholar
- Yu H, Hogan P, Sundaresan V: Analysis of the female gametophyte transcriptome of Arabidopsis by comparative expression profiling. Plant Physiology. 2005, 139: 1853-1869. 10.1104/pp.105.067314.PubMed CentralPubMedView ArticleGoogle Scholar
- Wuest S, Vijverberg K, Schmidt A, Weiss M, Gheyselinck J, Lohr M, Wellmer F, Rahnenführer J, von Mering C, Grossniklaus U: Arabidopsis female gametophyte gene expression map reveals similarities between plant and animal gametes. Curr Biol. 2010, 20: 506-512. 10.1016/j.cub.2010.01.051.PubMedView ArticleGoogle Scholar
- Faircloth BC: MSATCOMMANDER: detection of microsatellite repeat arrays and automated, locus-specific primer design. Molecular Ecology Resources. 2008, 8: 92-94. 10.1111/j.1471-8286.2007.01884.x.PubMedView ArticleGoogle Scholar
- Rozen S, Skaletsky H: Primer3 on the WWW for general users and for biologist programmers. Methods Mol Biol. 2000, 132: 365-386.PubMedGoogle Scholar
- Filichkin SA, Priest HD, Givan SA, Shen R, Bryant DW, Fox SE, Wong WK, Mockler TC: Genome-wide mapping of alternative splicing in Arabidopsis thaliana. Genome Res. 2010, 20: 45-58. 10.1101/gr.093302.109.PubMed CentralPubMedView ArticleGoogle Scholar
- Li P, Ponnala L, Gandotra N, Wang L, Si Y, Tausta SL, Kebrom TH, Provart N, Patel R, Myers CR, Reidel EJ, Turgeon R, Liu P, Sun Q, Nelson T, Brutnell TP: The developmental dynamics of the maize leaf transcriptome. Nature Genetics. 2010, 42: 1060-1067. 10.1038/ng.703.PubMedView ArticleGoogle Scholar
- Zhang G, Guo G, Hu X, Zhang Y, Li Q, Li R, Zhuang R, Lu Z, He Z, Fang X, Chen L, Tian W, Tao Y, Kristiansen K, Zhang X, Li S, Yang H, Wang J, Wang J: Deep RNA sequencing at single base-pair resolution reveals high complexity of the rice transcriptome. Genome Res. 2010, 20: 646-654. 10.1101/gr.100677.109.PubMed CentralPubMedView ArticleGoogle Scholar
- Kumar S, Blaxter ML: Comparing de novo assemblers for 454 transcriptome data. BMC Genomics. 2010, 11: 571-10.1186/1471-2164-11-571.PubMed CentralPubMedView ArticleGoogle Scholar
- Surget-Groba Y, Montoya-Burgos JI: Optimization of de novo transcriptome assembly from next-generation sequencing data. Genome Res. 2010, 20: 1432-1440. 10.1101/gr.103846.109.PubMed CentralPubMedView ArticleGoogle Scholar
- Cheung F, Haas B, Goldberg S, May G, Xiao Y, Town C: Sequencing Medicago truncatula expressed sequenced tags using 454 Life Sciences technology. BMC Genomics. 2006, 7: 272-10.1186/1471-2164-7-272.PubMed CentralPubMedView ArticleGoogle Scholar
- Chen F, Mackey AJ, Stoeckert CJ, Roos DS: OrthoMCL-DB: querying a comprehensive multi-species collection of ortholog groups. Nucleic Acids Res. 2006, 34: D363-8. 10.1093/nar/gkj123.PubMed CentralPubMedView ArticleGoogle Scholar
- Enright AJ, Kunin V, Ouzounis CA: Protein families and TRIBES in genome sequence space. Nucleic Acids Res. 2003, 31: 4632-4638. 10.1093/nar/gkg495.PubMed CentralPubMedView ArticleGoogle Scholar
- Farrar DR: Gemmiferous fern gametophytes-Vittariaceae. American Journal of Botany. 1974, 61: 146-155. 10.2307/2441184.View ArticleGoogle Scholar
- Schwartz T, Tae H, Yang Y, Mockaitis K, Vanhemert J, Proulx S, Choi J, Bronikowski A: A garter snake transcriptome: pyrosequencing, de novo assembly, and sex-specific differences. BMC Genomics. 2010, 11: 694-10.1186/1471-2164-11-694.PubMed CentralPubMedView ArticleGoogle Scholar
- The Gene Indices Project software tools: SeqClean. [http://compbio.dfci.harvard.edu/tgi/software/]
- SnoWhite: A cleaning pipeline for cDNA sequences. [http://kdlugosch.net/software/]
- Chevreux B, Wetter T, Suhai S: Genome sequence assembly using trace signals and additional sequence information. Computer Science and Biology: Proceedings of the German Conference on Bioinformatics (GCB). 1999, 99: 45-56.Google Scholar
- Kanehisa M, Goto S: KEGG: Kyoto encyclopedia of genes and genomes. Nucleic Acids Res. 2000, 28: 27-30. 10.1093/nar/28.1.27.PubMed CentralPubMedView ArticleGoogle Scholar
- Kanehisa M, Goto S, Hattori M, Aoki-Kinoshita KF, Itoh M, Kawashima S, Katayama T, Araki M, Hirakawa M: From genomics to chemical genomics: new developments in KEGG. Nucleic Acids Res. 2006, 34: D354-7. 10.1093/nar/gkj102.PubMed CentralPubMedView ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.