The highest-copy repeats are methylated in the small genome of the early divergent vascular plant Selaginella moellendorffii
© Chan et al; licensee BioMed Central Ltd. 2008
Received: 14 February 2008
Accepted: 12 June 2008
Published: 12 June 2008
The lycophyte Selaginella moellendorffii is a vascular plant that diverged from the fern/seed plant lineage at least 400 million years ago. Although genomic information for S. moellendorffii is starting to be produced, little is known about basic aspects of its molecular biology. To provide a first glimpse of the epigenetic landscape of this early divergent vascular plant, we used the methylation filtration technique. Methylation filtration genomic libraries select unmethylated DNA clones due to the presence of the methylation-dependent restriction endonuclease McrBC in the bacterial host.
We conducted a characterization of the DNA methylation patterns of the S. moellendorffii genome by sequencing a set of S. moellendorffii shotgun genomic clones, along with a set of methylation filtered clones. Chloroplast DNA, which is typically unmethylated, was enriched in the filtered library relative to the shotgun library, showing that there is DNA methylation in the extremely small S. moellendorffii genome. The filtered library also showed enrichment in expressed and gene-like sequences, while the highest-copy repeats were largely under-represented in this library. These results show that genes and repeats are differentially methylated in the S. moellendorffii genome, as occurs in other plants studied.
Our results shed light on the genome methylation pattern in a member of a relatively unexplored plant lineage. The DNA methylation data reported here will help in understanding the involvement of this epigenetic mark in fundamental biological processes, as well as the evolutionary aspects of epigenetics in land plants.
DNA methylation has been found throughout the plant kingdom, typically in cytosines, forming part of symmetric (CpNpG and CpG) and asymmetric (CpNpN) sites [1, 2]. The proportion of methylated cytosine in plants is variable, ranging from 6% in Arabidopsis to 25% in maize. DNA methylation has been associated with the inactivation of transposons and silencing of genes [5–10], and it has also been proposed that the function of DNA methylation is to decrease transcriptional "noise".
Because of the large size of many plant genomes, particularly those of important crops, gene-enriched sequencing strategies have been designed as an alternative to whole genome sequencing in an attempt to capture the so-called gene-space of such genomes. One of these gene-enrichment techniques, called methylation filtration (MF), takes advantage of the difference in methylation between plant genes and repeats. MF exploits the methylation-dependent restriction endonuclease McrBC (modified cytosine restriction) from E. coli [20, 21]. This enzyme digests DNA in sequences that contain two sites, each one consisting of a purine and a cytosine methylated in carbon 5, separated by 40–3000 bp. Therefore, using an mcrBC+ E. coli strain as a host to construct a genomic shotgun library, heavily methylated repetitive DNA is efficiently counter-selected, while hypomethylated low copy (i.e. genic) sequences are substantially over-represented. MF was first tested in maize, where it yielded a 6-fold enrichment for genes relative to a whole genome shotgun (WGS) library used as a control. Subsequently, MF was applied at large scale in maize [23, 24] and in sorghum, showing that approximately 95% of the genes in each genome were tagged (A. Chan et al., unpublished) and that most genes and regulatory elements are unmethylated in these two species. These results led to the suggestion that gene-enrichment and traditional genome sequencing techniques could be combined to efficiently sequence large plant genomes. Further analyses of the large-scale MF data in maize and sorghum also provided insights into the biology of transposable element methylation and activity [23–25]. Pilot MF studies of several monocot, dicot, and non-angiosperm plants (such as pine, fern, and moss) were also conducted. These analyses determined that MF enriches for genes in all plants tested, although to different levels, and that it can be an effective approach to selectively clone and sequence genes from some large plant genomes, where the majority of the DNA is composed of methylated repetitive elements.
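The McrBC recognition rule described above (two purine plus 5-methylcytosine half-sites separated by 40–3000 bp) can be illustrated with a minimal sketch. This is not part of the study's methods: the function names, the single-strand simplification, and the use of 0-based positions for methylated cytosines are all assumptions made for illustration only.

```python
from itertools import combinations

PURINES = {"A", "G"}

def mcrbc_half_sites(seq, methylated_positions):
    """Return sorted 0-based positions of methylated cytosines that are
    immediately preceded by a purine (an R-mC half-site).
    Simplification (assumption): only one strand is considered."""
    seq = seq.upper()
    sites = []
    for pos in sorted(methylated_positions):
        if 0 < pos < len(seq) and seq[pos] == "C" and seq[pos - 1] in PURINES:
            sites.append(pos)
    return sites

def is_mcrbc_sensitive(seq, methylated_positions, min_gap=40, max_gap=3000):
    """True if any two half-sites lie min_gap..max_gap bp apart,
    i.e. the fragment would be expected to be cut in an mcrBC+ host."""
    sites = mcrbc_half_sites(seq, methylated_positions)
    return any(min_gap <= b - a <= max_gap for a, b in combinations(sites, 2))

# Hypothetical clone insert: A-mC at position 1 and G-mC at position 53.
insert = "AC" + "T" * 50 + "GC" + "T" * 10
print(is_mcrbc_sensitive(insert, {1, 53}))  # half-sites 52 bp apart
print(is_mcrbc_sensitive(insert, {1}))      # a single half-site is not enough
```

Under this simplified model, a clone carrying two suitably spaced methylated half-sites would be counter-selected in the MF library, while a clone with no or only one such site would survive.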
In this study we performed an MF analysis of the lycophyte Selaginella moellendorffii (family Selaginellaceae), representing a clade not included in previous MF studies. The lycophyte clade diverged from the fern/seed-plant lineage about 400 million years ago.
The S. moellendorffii sporophyte is diploid and consists of dichotomously branching shoot and root systems. The shoot frequently terminates in arrested buds or bulbils that dehisce and allow clonal propagation. The reproductive structures are the strobili, which form toward the tip of the shoot, each one with either one micro- or megasporangium that produces micro- or megaspores, which in turn germinate and divide mitotically to form either the male or female gametophytes, respectively. The gametophyte produces either motile sperm or egg-forming archegonia. After fertilization of the egg, the new sporophyte remains dependent upon the female gametophyte for a short period of time. S. moellendorffii is an excellent model system to study some developmental processes, such as sporogenesis and gametophyte development, which are difficult to study in angiosperms because their spores and gametophytes are dependent upon and surrounded by sporophytic tissues. Seedless plants provide an excellent opportunity to study the epigenetics of these processes, but little is known about DNA methylation and other epigenetic marks in early vascular plants, except for the presence of heterochromatic bands identified by cytological staining. Ferns have been used in attempts to address the methylation of the haploid and diploid generations, but their genomes are usually large and only specific sequences were analyzed. The extremely small genome of S. moellendorffii (90–130 Mbp) and its available 8× coverage, high-quality draft genome assembly generated by the Joint Genome Institute of the U.S. Department of Energy (JGI-DOE), will facilitate the study of S. moellendorffii's epigenome and its involvement in the alternation of generations. Due to its small genome size, several transposon families, which are common targets of epigenetic modifications, may be low copy in S. moellendorffii and their sequence and epigenetics can be studied without the complications of high copy numbers, allowing the unequivocal identification of individual transposon loci.
Sequences from this study have been deposited in NCBI GenBank under the accession numbers [ET218553–ET221769].
Results and Discussion
Sequence data and chloroplast content
The chloroplast reads identified in this way were not analyzed further and, therefore, a total of 1,379 MF and 1,471 WGS non-chloroplast reads were used in the following analyses.
Table 1. C+G content in different sequence classes (rows: % C+G in repeats; % C+G in low-copy DNA; % C+G in genes and EST hits).
Approximately 13% of the sequences could not be aligned to the reference genome assembly at the stringency used in this study. This discrepancy may be due to the exclusion of sequence assemblies shorter than 1 kbp from the reference genome sequence.
Sequence composition analysis of the repetitive sequences showed that MF repeats are richer in C+G than those in the WGS set, probably due to the abundance of conserved, non-methylated ribosomal RNA sequences among the MF repeats (Table 1).
Gene sequences, expressed sequences and gene enrichment
To estimate the level of gene enrichment achieved with MF in S. moellendorffii in comparison with previous studies done in other plants, all sequences that were not identified as repeats or chloroplast were compared to the same curated database of known gene sequences used in those studies. In this way, 12.8% of the WGS sequences and 21.5% of the MF sequences had a match in the known gene database, resulting in a gene enrichment factor (GEF) of 1.7 (Figure 3).
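As a quick check of the arithmetic, the enrichment value can be reproduced from the two percentages reported above. The sketch assumes that the enrichment factor is simply the ratio of the gene-match fraction in the MF set to that in the WGS set; the function name is hypothetical.

```python
def gene_enrichment_factor(mf_gene_pct, wgs_gene_pct):
    """Ratio of the percentage of MF reads with a known-gene match
    to the percentage of WGS reads with a known-gene match."""
    if wgs_gene_pct <= 0:
        raise ValueError("WGS gene percentage must be positive")
    return mf_gene_pct / wgs_gene_pct

# Percentages reported in the text: 21.5% (MF) and 12.8% (WGS).
gef = gene_enrichment_factor(21.5, 12.8)
print(round(gef, 1))  # 1.7
```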
Our results show that even in the small genome of S. moellendorffii, MF sequences display much lower repeat content than WGS sequences, and that each of the identified MF repeats has fewer than 42 copies in the genome. If the MF repeat sequences are aligned to the reference genome at higher stringency, the number of hits for each repeat decreases, indicating that polymorphisms can be found inside families of repetitive elements (data not shown). Therefore, by sequencing the hypomethylated fraction of the S. moellendorffii genome using MF it would be possible to identify which copies of these repetitive elements are methylated. MF of the S. moellendorffii genome can be used to obtain information on gene methylation as well, as has been shown in Arabidopsis, where a fraction of the genes do contain cytosine methylation (although at a lower level than repeats and pseudogenes) and this methylation is predominant in particular regions of the genes [14–17]. Consequently, a genome-wide DNA methylation profile can be generated by comprehensive MF sequencing of this genome. Furthermore, combining MF with ultra-high throughput next-generation sequencing techniques will facilitate this kind of analysis using the sequenced genome as a reference. As the variety of S. moellendorffii whose genome was sequenced by JGI-DOE has two distinct haplotypes that differ in nucleotide sequence by ~2–5% (J. Banks, unpublished), it will be possible to determine if there is haplotype-specific DNA methylation using MF sequencing. Genome-wide epigenetic studies of early-diverging land plants will provide the foundation to broaden our understanding of the evolution of epigenetic regulation of developmental processes in plant biology.
Total DNA was purified using DNeasy kits (Qiagen, CA) from green tissues of S. moellendorffii plants kept in a growth chamber. The DNA was mechanically sheared using a Hydroshear device (Genomic Solutions, MI) and fragments ranging from 3 to 4 kb were eluted from an agarose gel after electrophoresis, end-repaired, and ligated into a cloning vector. DNA ligation reactions were transformed into E. coli DH5α (mcrBC+) to construct the MF library. The WGS library was constructed by introducing the same ligation reaction into E. coli GC10 (mcrBC-). Recombinant clones were sequenced using Big Dye Terminator chemistry and ABI 3730xl sequencers (Applied Biosystems, CA), and vector and low-quality sequences were electronically trimmed.
Chloroplast sequences were identified by BLASTN alignment to the S. uncinata chloroplast genome (GenBank accession AB197035) at high stringency (E value smaller than 10⁻⁵⁶). The chloroplast sequences were excluded from any further sequence analyses. Protein sequence alignments against the NIAA database were done using BLAT. Alignments with at least 70% similarity and at least 40 amino acids long were recorded as matches.
Alignments to assembled EST sequences were done using BLASTN at high stringency. Matches showing an E value smaller than 10⁻⁵⁶ were recorded.
De novo repeats were identified by aligning MF and WGS reads to the JGI-DOE S. moellendorffii genome assembly using BLASTN and matches covering 50% of the read with 95% identity were recorded.
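The read-to-assembly filter described above can be sketched as follows. This is an illustration and not the pipeline used in the study. It assumes BLASTN results in the standard 12-column tabular format (query ID, subject ID, percent identity, alignment length, mismatches, gap opens, query start, query end, subject start, subject end, E value, bit score), a separate dictionary of read lengths, and that "50% of the read with 95% identity" means minimum thresholds. The function and variable names are hypothetical.

```python
from collections import Counter

def count_genome_hits(blast_lines, read_lengths,
                      min_identity=95.0, min_coverage=0.5):
    """Count, per read, the alignments to the assembly that pass the
    identity and read-coverage thresholds (assumed to be minimums)."""
    hits = Counter()
    for line in blast_lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment headers
        fields = line.rstrip("\n").split("\t")
        read_id = fields[0]
        pident = float(fields[2])
        qstart, qend = int(fields[6]), int(fields[7])
        aligned = abs(qend - qstart) + 1          # aligned span on the read
        coverage = aligned / read_lengths[read_id]
        if pident >= min_identity and coverage >= min_coverage:
            hits[read_id] += 1
    return hits

# Hypothetical example: readA hits two scaffolds, readB fails on identity.
example = [
    "readA\tscaf1\t98.0\t600\t12\t0\t1\t600\t100\t699\t0.0\t1000",
    "readA\tscaf7\t96.5\t450\t16\t0\t50\t499\t900\t1349\t0.0\t700",
    "readB\tscaf2\t90.0\t700\t70\t0\t1\t700\t10\t709\t0.0\t800",
]
lengths = {"readA": 800, "readB": 750}
print(count_genome_hits(example, lengths))
```

Reads with more than one passing hit would then be candidates for repeats, with the hit count serving as a rough copy-number estimate.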
Alignments to the curated database of known genes were done as previously reported, using BLASTX and recording matches with an E value better than 10⁻⁷.
Known repeats were identified using a nucleotide database and a protein database of known repetitive elements described earlier. These databases do not contain simple sequence repeats. Repetitive element proteins were identified using the protein database of repeats. The same criteria were used to identify known genes, while repetitive nucleotide sequences were identified using BLASTN with an E value smaller than 10⁻¹⁰.
DNA digestion with Hpa II was performed following the manufacturer's recommendations. PCR assays were carried out using 50 ng of Hpa II-digested or undigested genomic DNA as template, with an initial denaturation of 3 minutes at 94°C followed by 25 amplification cycles using the following program: 30 seconds at 94°C, 30 seconds at 59°C, and 60 seconds at 72°C. Elongation was allowed for 10 minutes at 72°C after amplification. Target and primer sequences are shown in Additional file 3.
This work was funded by The Institute for Genomic Research (TIGR, now called J. Craig Venter Institute or JCVI, Rockville, MD).
- Gruenbaum Y, Naveh-Many T, Cedar H, Razin A: Sequence specificity of methylation in higher plant DNA. Nature. 1981, 292: 860-862. 10.1038/292860a0.
- Meyer P, Niedenhof I, ten Lohuis M: Evidence for cytosine methylation of non-symmetrical sequences in transgenic Petunia hybrida. Embo J. 1994, 13: 2084-2088.
- Kakutani T, Munakata K, Richards EJ, Hirochika H: Meiotically and mitotically stable inheritance of DNA hypomethylation induced by ddm1 mutation of Arabidopsis thaliana. Genetics. 1999, 151: 831-838.
- Papa CM, Springer NM, Muszynski MG, Meeley R, Kaeppler SM: Maize chromomethylase Zea methyltransferase2 is required for CpNpG methylation. Plant Cell. 2001, 13: 1919-1928. 10.1105/tpc.13.8.1919.
- Bird A: DNA methylation patterns and epigenetic memory. Genes Dev. 2002, 16: 6-21. 10.1101/gad.947102.
- Chandler VL, Walbot V: DNA modification of a maize transposable element correlates with loss of activity. Proc Natl Acad Sci USA. 1986, 83: 1767-1771. 10.1073/pnas.83.6.1767.
- Colot V, Rossignol JL: Eukaryotic DNA methylation as an evolutionary device. Bioessays. 1999, 21: 402-411. 10.1002/(SICI)1521-1878(199905)21:5<402::AID-BIES7>3.0.CO;2-B.
- Martienssen RA, Colot V: DNA methylation and epigenetic inheritance in plants and filamentous fungi. Science. 2001, 293: 1070-1074. 10.1126/science.293.5532.1070.
- Flavell RB: Inactivation of gene expression in plants as a consequence of specific sequence duplication. Proc Natl Acad Sci USA. 1994, 91: 3490-3496. 10.1073/pnas.91.9.3490.
- Martienssen R: Transposons, DNA methylation and gene control. Trends Genet. 1998, 14: 263-264. 10.1016/S0168-9525(98)01518-2.
- Bird AP: Gene number, noise reduction and biological complexity. Trends Genet. 1995, 11: 94-100. 10.1016/S0168-9525(00)89009-5.
- Bennetzen JL, Schrick K, Springer PS, Brown WE, SanMiguel P: Active maize genes are unmodified and flanked by diverse classes of modified, highly repetitive DNA. Genome. 1994, 37: 565-576. 10.1139/g94-081.
- Rabinowicz PD, Palmer LE, May BP, Hemann MT, Lowe SW, McCombie WR, Martienssen RA: Genes and transposons are differentially methylated in plants, but not in mammals. Genome Res. 2003, 13: 2658-2664. 10.1101/gr.1784803.
- Lippman Z, Gendrel AV, Black M, Vaughn MW, Dedhia N, McCombie WR, Lavine K, Mittal V, May B, Kasschau KD, Carrington JC, Doerge RW, Colot V, Martienssen R: Role of transposable elements in heterochromatin and epigenetic control. Nature. 2004, 430: 471-476. 10.1038/nature02651.
- Vaughn MW, Tanurdzic M, Lippman Z, Jiang H, Carrasquillo R, Rabinowicz PD, Dedhia N, McCombie WR, Agier N, Bulski A, Colot V, Doerge RW, Martienssen RA: Epigenetic Natural Variation in Arabidopsis thaliana. PLoS Biol. 2007, 5: e174. 10.1371/journal.pbio.0050174.
- Zilberman D, Gehring M, Tran RK, Ballinger T, Henikoff S: Genome-wide analysis of Arabidopsis thaliana DNA methylation uncovers an interdependence between methylation and transcription. Nat Genet. 2007, 39: 61-69. 10.1038/ng1929.
- Zhang X, Yazaki J, Sundaresan A, Cokus S, Chan SW, Chen H, Henderson IR, Shinn P, Pellegrini M, Jacobsen SE, Ecker JR: Genome-wide high-resolution mapping and functional analysis of DNA methylation in arabidopsis. Cell. 2006, 126: 1189-1201. 10.1016/j.cell.2006.08.003.
- Arumuganathan K, Earle ED: Nuclear DNA content of some important plant species. Plant Mol Biol Rep. 1991, 9: 208-218. 10.1007/BF02672069.
- Rabinowicz PD, Schutz K, Dedhia N, Yordan C, Parnell LD, Stein L, McCombie WR, Martienssen RA: Differential methylation of genes and retrotransposons facilitates shotgun sequencing of the maize genome. Nat Genet. 1999, 23: 305-308. 10.1038/15479.
- Dila D, Sutherland E, Moran L, Slatko B, Raleigh EA: Genetic and sequence organization of the mcrBC locus of Escherichia coli K-12. J Bacteriol. 1990, 172: 4888-4900.
- Raleigh EA, Wilson G: Escherichia coli K-12 restricts DNA containing 5-methylcytosine. Proc Natl Acad Sci USA. 1986, 83: 9070-9074. 10.1073/pnas.83.23.9070.
- Sutherland E, Coe L, Raleigh EA: McrBC: a multisubunit GTP-dependent restriction endonuclease. J Mol Biol. 1992, 225: 327-348. 10.1016/0022-2836(92)90925-A.
- Palmer LE, Rabinowicz PD, O'Shaughnessy AL, Balija VS, Nascimento LU, Dike S, de la Bastide M, Martienssen RA, McCombie WR: Maize genome sequencing by methylation filtration. Science. 2003, 302: 2115-2117. 10.1126/science.1091265.
- Whitelaw CA, Barbazuk WB, Pertea G, Chan AP, Cheung F, Lee Y, Zheng L, van Heeringen S, Karamycheva S, Bennetzen JL, SanMiguel P, Lakey N, Bedell J, Yuan Y, Budiman MA, Resnick A, Van Aken S, Utterback T, Riedmuller S, Williams M, Feldblyum T, Schubert K, Beachy R, Fraser CM, Quackenbush J: Enrichment of gene-coding sequences in maize by genome filtration. Science. 2003, 302: 2118-2120. 10.1126/science.1090047.
- Bedell JA, Budiman MA, Nunberg A, Citek RW, Robbins D, Jones J, Flick E, Rholfing T, Fries J, Bradford K, McMenamy J, Smith M, Holeman H, Roe BA, Wiley G, Korf IF, Rabinowicz PD, Lakey N, McCombie WR, Jeddeloh JA, Martienssen RA: Sorghum genome sequencing by methylation filtration. PLoS Biol. 2005, 3: e13. 10.1371/journal.pbio.0030013.
- Rabinowicz PD, Bennetzen JL: The maize genome as a model for efficient sequence analysis of large plant genomes. Curr Opin Plant Biol. 2006, 9: 149-156. 10.1016/j.pbi.2006.01.015.
- Rabinowicz PD, Citek R, Budiman MA, Nunberg A, Bedell JA, Lakey N, O'Shaughnessy AL, Nascimento LU, McCombie WR, Martienssen RA: Differential methylation of genes and repeats in land plants. Genome Res. 2005, 15: 1431-1440. 10.1101/gr.4100405.
- Kenrick P, Crane PR: The origin and early evolution of plants on land. Nature. 1997, 389: 33-39. 10.1038/37918.
- Marcon AB, Barros IC, Guerra M: Variation in chromosome numbers, CMA bands and 45S rDNA sites in species of Selaginella (Pteridophyta). Ann Bot (Lond). 2005, 95: 271-276.
- McGrath JM, Pichersky E: Methylation of somatic and sperm DNA in the homosporous fern Ceratopteris richardii. Plant Mol Biol. 1997, 35: 1023-1027. 10.1023/A:1005962520544.
- Wang W, Tanurdzic M, Luo M, Sisneros N, Kim HR, Weng JK, Kudrna D, Mueller C, Arumuganathan K, Carlson J, Chapple C, de Pamphilis C, Mandoli D, Tomkins J, Wing RA, Banks JA: Construction of a bacterial artificial chromosome library from the spikemoss Selaginella moellendorffii: a new resource for plant comparative genomics. BMC Plant Biol. 2005, 5: 10. 10.1186/1471-2229-5-10.
- Tsuji S, Ueda K, Nishiyama T, Hasebe M, Yoshikawa S, Konagaya A, Nishiuchi T, Yamaguchi K: The chloroplast genome from a lycophyte (microphyllophyte), Selaginella uncinata, has a unique inversion, transpositions and many gene losses. Journal of plant research. 2007, 120: 281-290. 10.1007/s10265-006-0055-y.
- Singer T, Yordan C, Martienssen RA: Robertson's Mutator transposons in A. thaliana are regulated by the chromatin-remodeling gene Decrease in DNA Methylation (DDM1). Genes Dev. 2001, 15: 591-602. 10.1101/gad.193701.
- Miura A, Yonebayashi S, Watanabe K, Toyama T, Shimada H, Kakutani T: Mobilization of transposons by a mutation abolishing full DNA methylation in Arabidopsis. Nature. 2001, 411: 212-214. 10.1038/35075612.
- Purdue Selaginella Genomics. [http://selaginella.genomics.purdue.edu/data.html]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.