An optimized procedure for the design and evaluation of Ecotilling assays
© Coassin et al; licensee BioMed Central Ltd. 2008
Received: 13 June 2008
Accepted: 30 October 2008
Published: 30 October 2008
Single nucleotide polymorphisms (SNPs) are the most common form of genetic variability in the human genome and play a prominent role in the heritability of phenotypes. Especially rare alleles with frequencies less than 5% may exhibit a particularly strong influence on the development of complex diseases. The detection of rare alleles by standard DNA sequencing is time-consuming and cost-intensive. Here we discuss an alternative approach for a high throughput detection of rare mutations in large population samples using Ecotilling embedded in a collection of bioinformatic analysis tools. Ecotilling originally was introduced as TILLING for the screening for rare chemically induced mutations in plants and later adopted for human samples, showing an outstanding suitability for the detection of rare alleles in humans. An actual problem in the use of Ecotilling for large mutation screening projects in humans without bioinformatic support is represented by the lack of solutions to quickly yet comprehensively evaluate each newly found variation and place it into the correct genomic context.
We present an optimized strategy for the design, evaluation and interpretation of Ecotilling results by integrating several mostly freely available bioinformatic tools. A major focus of our investigations was the evaluation and meaningful economical combination of these software tools for the inference of different possible regulatory functions for each newly detected mutation.
Our streamlined procedure significantly facilitates the experimental design and evaluation of Ecotilling assays and strongly improves the decision process on prioritizing the newly found SNPs for further downstream analysis.
Several studies have shown that especially rare alleles with frequencies below 5% (referred to as "rare" alleles) may have a strong impact on quantitative traits and complex diseases [1–6]. Furthermore, recent genome-wide association studies on qualitative and quantitative traits highlighted the importance of genetic variation outside coding regions or even gene regions [7–11] which sheds new light on these in the past often neglected intronic and even intergenic regions, which harbour nearly 90% of all SNPs .
The discovery of rare alleles in large populations is challenging. Standard DNA sequencing of sizeable gene regions in large study populations with the aim to identify the contribution of rare mutation is time-consuming and cost-intensive. Depending on the sequencing quality, heterozygous mutations may be lost in background noise. Thus, Sanger sequencing shows limitations for the discovery of rare and hence mostly heterozygous mutations [13, 14]. In addition, many of the current pre-screening technologies such as single strand conformational polymorphism analysis or gradient gel electrophoresis are laborious, target only relatively small portions of DNA (as it is the case for dHPLC) or are not capable for high throughput. Techniques that allow an inexpensive but high throughput detection of rare mutations with high sensitivity are thus of high interest [13, 15].
A technology that promises a cost-effective screening of 15–20 kb large gene regions for rare mutations with high sensitivity is Ecotilling. Ecotilling was originally developed as TILLING for large scale screenings of chemically induced mutations in plants , and later adopted for human samples . An important advantage of this technology is its capability to detect mutations in assays with up to eightfold pooled DNA samples (which corresponds in case of a heterozygous mutation to the detection of one mutation in out of 16 strands), thus boosting the throughput, lowering the costs and enabling the detection of multiple homozygous mutations . Since up to two runs are possible in a typical working day, the investigation of up to nearly 2.3 Mb (192 samples, 1500 bp each sample) in a single day becomes possible even in a relatively small research laboratory.
Despite its obvious benefits the use of Ecotilling for research projects in human medical research is still limited. A possible reason for this may be the complex process of translating the image information to sequence annotation, the interpretation of the findings and the selection process of newly discovered SNPs for further downstream analysis. For example, a crucial point for the interpretation and evaluation of the gel data is the knowledge about the location of the detected mutations within the gene. Especially when analyzing a SNP-rich organism like homo sapiens, a fast way to localize the signals relative to the analyzed PCR fragment and relative to already known polymorphisms or known functional elements is essential for a fast and efficient examination of Ecotilling gels. While large TILLING facilities apply bioinformatic tools to map and store the mutations found by TILLING (e.g. as done at the Seattle TILLING Project ), no suitable software solution exists for relatively small laboratories.
Unlike DNA sequencing, Ecotilling data have to be manually translated from a "picture form" to a more accessible "DNA sequence form". The efficient realization of this step poses an essential problem in the evaluation of Ecotilling gels. Here we present an optimized strategy for the design, analysis and interpretation of human Ecotilling experiments. Our proposed procedure involves an appropriate usage of mostly freely available software tools and enables a straight-forward post-laboratory analysis of Ecotilling results without the need of sophisticated laboratory information management solutions.
The principles of Ecotilling are described in detail in references [13, 16] and . Briefly, the basic step of TILLING is the amplification of a region of interest of up to 1.5 kb using two primers labeled with different infrared fluorochromes, followed by heat-denaturation and slow cooling to allow the formation of heteroduplexes in the presence of heterozygous mutations. These are digested with Cel-I, a mismatch-specific endonuclease which cuts at the 3' end of mismatches . The digestion products are size-fractionized on a LI-COR slab gel sequencer (DNA Analyzer 4300S; LI-COR Biosciences, Lincoln, NE, USA). The simultaneous detection at both fluorescent dye wavelengths yields two gel-images for every analyzed fragment, with both images showing the undigested full size-product (which is still bearing both 5' dyes). In the case that a digested mutation is present, each detection channel shows an additional signal where the sum of the size of both fragments is equal to the length of the full size product. Since the reaction conditions are supposed to avoid a full digestion of the PCR products (and therefore not every mismatch is recognized in every strand), multiple mutations per fragment and their approximate position (± 10 bp) can be detected. Furthermore, since every real mutation should produce complementary bands in the two detection channels, an immediate quality control of each signal is possible .
Design of Ecotilling experiments
The design of Ecotilling is done by integrating the functions of Vector NTI (Advance 10; Invitrogen Corporation, Carlsbad, CA, USA; http://resources.invitrogen.com/content.cfm?pageid=10373), the Ensembl Database http://www.ensembl.org, the Genomatix Suite (Genomatix Software GmbH, Munich, Germany; http://www.genomatix.de) and some software for primer design such as VisualOMP (DNA Software, Ann Arbor, MI, USA) or Primer3 http://fokker.wi.mit.edu/primer3/input.htm.
Data analysis of Ecotilling experiments
The data analysis procedure includes three software tools: Vector NTI, GelBuddy (; http://www.proweb.org/gelbuddy) and sequencing analysis software solutions such as SeqScape 2.6 (Applied Biosystems, Foster City, CA, USA) or Sequencher 4.8 (GeneCodes, Ann Arbor, MI, USA).
Interpretation of Ecotilling experiments
The interpretation of Ecotilling gels is based on the gene model created in Vector NTI and the findings can further be investigated using different bioinformatic tools, depending on the identity, type and location of each mutation.
Coding SNPs are evaluated using a Vector NTI data set containing the respective cDNA and the homology-based web tools Polyphen http://genetics.bwh.harvard.edu/pph/ and SIFT http://blocks.fhcrc.org/sift/SIFT.html.
The tools for investigation of possible functional effects of SNPs in non-coding regions comprehend the Genomatix Software Suite for investigation of transcription factor binding sites, TARGETSCAN (; http://www.targetscan.org/vert_42), PATROCLES (; http://www.patrocles.org/Patrocles.htm) and polymiRTS (; http://compbio.utmem.edu/miRSNP) for miRNA evaluation, the VISTA Genome Browser http://pipeline.lbl.gov/cgi-bin/gateway2 for long range alignments and the UCSC Browser http://genome.ucsc.edu/cgi-bin/hgGateway for evaluating unknown regulatory regions through analyzing conservational pattern and ESPERR scores .
Results and Discussion
1. Design of Ecotilling experiments
Construction of the Vector NTI design overview
Finally, the construction of a second parallel Vector NTI project using only the cDNA can be highly beneficial for a fast evaluation of the effects of exonic SNPs, since Vector NTI allows easy one-click translation of coding sequences.
The target region is selected in the Vector NTI interface and pasted into any primer design program. The design of the amplification primers yielding overlapping PCR fragments is crucial to fully cover each gene region, since Ecotilling shows a lower resolution in the first approximately 150 bp of each fragment. After design, the primer binding sites can easily be checked in Vector NTI to ensure the optimal position of each primer regarding to any regions of particular interest.
The same procedure is followed for the sequencing primers. By designing and optimizing them immediately together with the PCR fragments, it is possible to set up a high throughput mutation validation routine for the newly detected mutations of each target gene, where any detected mutations can be readily sequenced. This allows sequencing of each detected mutation within 48 hours from the Ecotilling run.
The resulting data set represents the basis for the subsequent primer design. Optimal primer design is essential for the entire experiment as primers are often designed without exact knowledge of the location of potentially interesting regions, resulting sometimes in several PCR redesigns. While the position of coding regions is usually known and taken into account, potential regulatory regions are seldom precisely considered during primer design, although they harbor the majority of (e)QTL loci . This happens mostly because information about their location is hardly to retrieve in a well arranged form. While the use of many of the publicly available databases is often not really intuitive, we found that the Genomatix Database represents a powerful and user-friendly, although proprietary source of information on regulatory elements. Therefore the integration of information about regulatory regions into the Vector NTI seems a strong advantage for mutation screening in epidemiological studies, especially given the recent results outside of coding regions or even gene regions as discussed above [7–11]. Figure 2 and additional file 2 provide an example for a complete Vector NTI data set for Ecotilling screenings.
2. Data analysis of Ecotilling experiments
Ecotilling image analysis
The data analysis starts with the analysis of the Ecotilling gel images using GelBuddy . At the same time, the corresponding Vector NTI project is opened and by blinding out the remaining molecule, only the current PCR product remains displayed. When a differential signal is detected in GelBuddy, its position can easily be mapped in Vector NTI by dragging the mouse from the end of the PCR-product corresponding to the analyzed detection channel across the PCR product and progressively marking the product (Figure 3). Vector NTI automatically counts the marked bases, giving a convenient way to determine the sequence context of each signal. This allows seeing if the signal corresponds to a known SNP or to another structure producing positive Ecotilling signals, such as poly-T regions. At the same time, the region in which the signal is located can be checked for any functionally interesting features and thus, the possible importance of any new signal can rapidly be assessed. The visual determination of the position and context of each new mutation strongly facilitates the choice of sequencing primers for the validation of the mutation. The additional file 3 (movie-2.avi, available online) exemplifies this procedure.
Since the exact position by Ecotilling can only be determined with an accuracy of ± 5–10 bp, prioritized signals should be sequenced to determine their exact identity. Any new mutation is then directly annotated in Vector NTI and considered in any subsequent gel run.
An important point that distinguishes human Ecotilling from Ecotilling or TILLING in all other model organisms is the high number of already known SNPs. This makes the screening for new and rare polymorphisms in humans more challenging and strongly requires an exact knowledge of all already known polymorphisms in the analyzed region.
The described gel evaluation procedure gives a fast and straight-forward way to evaluate any signal on the gel. The rapid determination of the position of a signal and easy inspection of its sequence context for any functional relevance, therefore, simplifies the analysis of Ecotilling images and the choice of the right sequencing primers. This allows creating a high-throughput screening and sequencing procedure.
The definitive identification of previously unknown, possibly heterozygous mutations by sequencing is often difficult, especially in sequence contexts, where the required high quality reads may be hard to obtain. A crucial point when choosing the appropriate sequence analysis software is the ease to compare multiple electropherograms, as offered by SeqScape or Sequencher. Especially the latter offers multiple tools for in-depth analysis of electropherograms and is in our eyes especially appropriate for the confirmation of insertion/deletion polymorphisms, which are otherwise quite difficult to ascertain.
3. Interpretation of Ecotilling experiments
The interpretation of Ecotilling results starts by inspecting the position of new mutations in Vector NTI for already known or in-silico predicted genetic elements. Exonic SNPs are annotated using the cDNA data set and their effect on the protein sequence is determined. The possibility to easily translate DNA sequences into amino acid sequence in Vector NTI enables a fast evaluation of the effects of SNPs on the amino acid composition and their position relative to known protein domains. The effect of a non-synonymous SNP are finally evaluated using the freely available online tools Polyphen and SIFT.
The evaluation of mutations in non-coding regions is more challenging due to the general lack of knowledge about cis-regulatory elements outside the restricted core promoter  and in-depth analyses require strong bioinformatic support.
The effect of sequence variations on putative transcription factor binding sites can be assessed using SNPInspector, MatInspector , ModelInspector  and other tools from the Genomatix' GEMS Launcher.
The potential functional relevance of the region surrounding an interesting SNP can be evaluated by analyzing the evolutionary conservational pattern and potential regulatory functions in the UCSC genome browser. Potential regulatory elements can be found with the ESPERR Regulatory Potential, which is highly accurate in distinguishing regulatory from neutral DNA stretches  and by investigating the nucleotide conservation with the UCSC and VISTA Genome Browsers. A typical example for the application of these methods to investigate the influence of genetic variability on phenotypes of interest can be found in our recent genome-wide association study on HDL cholesterol (reference  including the respective Supplementary Material).
A large amount of information on the functionality of different human genetic regions (regulatory regions, promoters, catalytic domains, etc.) can be obtained through the literature and databases. In principle, all available information has to be taken into consideration for the interpretation and evaluation of newly discovered mutations. Thus, a quick but thorough interpretation of Ecotilling results is the most challenging step in the post-laboratory analysis.
While the evaluation of coding SNPs is usually straightforward, the evaluation of SNPs outside of coding regions is more difficult. We propose here an interpretation procedure dedicated to the analysis of functional effects of SNPs derived from genetic epidemiological studies, which aims to consider as many different regulatory layers as possible.
Although several powerful databases for transcription factor binding sites and mammalian promoters were made available in the last years (such as JASPAR  and TRANSFAC  or MPromDB  and EPD ), these are often too laborious for a quick evaluation and lack the integration of different layers of information.
An optimal tool collection for this purpose is represented by the Genomatix Software Suite, which includes several tools ranging from literature mining tools (Bibliosphere, LitInspector) to an annotated genome browser (Eldorado) and tools for searching and evaluating transcription factor binding sites (MatInspector , SNPInspector). The integration of all tools into one common HTML-interface helps to consider several different layers of functional information, facilitates the prioritization of mutations for downstream analysis and enables the evaluation of potential effects of new SNPs without strong bioinformatic expert knowledge.
As transcription factor binding sites can be located elsewhere than upstream of a gene , we emphasize that all SNPs and their surrounding sequence should routinely be investigated for transcription factor binding sites.
Recently it was shown that SNPs within 3'-UTRs can affect gene expression by affecting miRNA binding sites [22, 34]. It was suggested that up to one third of all human genes may be targeted by miRNAs [35, 36], indicating that SNPs affecting miRNA binding sites may be a common phenomenon. Therefore SNPs in 3' UTR regions should always be evaluated for affecting miRNA binding sites, and intergenic SNPs should be checked for effects on any known miRNAs (see Methods section).
Recent advances in sequencing technologies promise to revolutionize genetics by enabling genome-wide and highly multiplexed mutation discovery studies . These technologies are still challenging and demand both considerable wet lab capacities and extensive bioinformatic support. For the time being, next generation sequencing is not adapted to rapidly screen one region in many different samples. We found that in this case, Ecotilling shows outstanding performance for cost-effective screening of single gene regions in large sample numbers and is therefore especially appropriate for mutation screening studies that may not reach the target region size for next-generation sequencing. Recent technological advances did further improve the performance of Ecotilling by allowing the use of capillary electrophoresis systems . This brings several benefits such as the use of even higher samples pools, reduced background signal and higher throughput due to automated sample loading [38, 39].
Although we did not develop a completely new software solution, we describe in detail an optimized procedure for a thorough evaluation of Ecotilling results, which summarizes know-how on a widespread variety of freely available bioinformatic tools. The basic problem to translate findings from mutation screenings into a more accessible sequence database and evaluate them in a quick yet comprehensive manner, is not restricted to screenings employing Ecotilling. This may apply also to screening projects employing other technologies. Especially with the advent of next-generation sequencing technologies and the resulting huge amount of data that any geneticist will face, our approach may help to keep an overview in every evaluation process involving different layers of information. Thus, we believe that our approach may be interesting also for laboratories using any other mutation screening technology.
(expression) Quantitative Trait Locus
Cap Analysis Gene Expression
denaturing High Performance Liquid Chromatography
Evolutionary and Sequence Pattern Extraction through Reduced Representations
Targeting Induced Local Lesions IN Genomes
This work was supported by grants from the "Genomics of Lipid-associated Disorders – GOLD" of the "Austrian Genome Research Programme GEN-AU" and the Austrian Nationalbank (#12531) to F. Kronenberg as well as by the MFI-Grant Nr. 2007-402 (Medizinische Forschungsförderung Innsbruck) to A. Brandstätter. We appreciate the technical assistance of Anke Gehringer and Markus Haak from the Division of Genetic Epidemiology, Innsbruck Medical University. We are grateful to Dr. Bradley J. Till and Dr. Steven Henikoff, Basic Sciences Division, Fred Hutchinson Cancer Research Center, Seattle, for the technical help during the implementation of Ecotilling. Furthermore, we thank Sylke Winkler and David Drechsel from the Max-Plank Institute for Molecular Cell Biology and Genetics for kindly providing purified Cel-I for the establishment of Ecotilling at our laboratory.
- Cohen JC, Kiss RS, Pertsemlidis A, Marcel YL, McPherson R, Hobbs HH: Multiple rare alleles contribute to low plasma levels of HDL cholesterol. Science. 2004, 305: 869-872. 10.1126/science.1099870.PubMedView ArticleGoogle Scholar
- Fearnhead NS, Wilding JL, Winney B, Tonks S, Bartlett S, Bicknell DC, Tomlinson IP, Mortensen NJ, Bodmer WF: Multiple rare variants in different genes account for multifactorial inherited susceptibility to colorectal adenomas. Proc Natl Acad Sci USA. 2004, 101: 15992-15997. 10.1073/pnas.0407187101.PubMedPubMed CentralView ArticleGoogle Scholar
- Fearnhead NS, Winney B, Bodmer WF: Rare variant hypothesis for multifactorial inheritance: susceptibility to colorectal adenomas as a model. Cell Cycle. 2005, 4: 521-525.PubMedView ArticleGoogle Scholar
- Cohen JC, Pertsemlidis A, Fahmi S, Esmail S, Vega GL, Grundy SM, Hobbs HH: Multiple rare variants in NPC1L1 associated with reduced sterol absorption and plasma low-density lipoprotein levels. Proc Natl Acad Sci USA. 2006, 103: 1810-1815. 10.1073/pnas.0508483103.PubMedPubMed CentralView ArticleGoogle Scholar
- Ahituv N, Kavaslar N, Schackwitz W, Ustaszewska A, Martin J, Hebert S, Doelle H, Ersoy B, Kryukov G, Schmidt S: Medical sequencing at the extremes of human body mass. Am J Hum Genet. 2007, 80: 779-791. 10.1086/513471.PubMedPubMed CentralView ArticleGoogle Scholar
- Romeo S, Pennacchio LA, Fu Y, Boerwinkle E, Tybjaerg-Hansen A, Hobbs HH, Cohen JC: Population-based resequencing of ANGPTL4 uncovers variations that reduce triglycerides and increase HDL. Nat Genet. 2007, 39: 513-516. 10.1038/ng1984.PubMedPubMed CentralView ArticleGoogle Scholar
- Heid IM, Boes E, Müller AM, Kollerits B, Lamina C, Coassin S, Gieger C, Döring A, Klopp N, Frikke-Schmidt R: Genome-wide association analysis of high-density lipoprotein cholesterol in the population-based KORA Study sheds new light on intergenic regions. Circ Cardiovasc Genetics. 2008, 1: 10-20. 10.1161/CIRCGENETICS.108.776708.View ArticleGoogle Scholar
- Scott LJ, Mohlke KL, Bonnycastle LL, Willer CJ, Li Y, Duren WL, Erdos MR, Stringham HM, Chines PS, Jackson AU: A Genome-Wide Association Study of Type 2 Diabetes in Finns Detects Multiple Susceptibility Variants. Science. 2007, 316: 1341-1345. 10.1126/science.1142382.PubMedPubMed CentralView ArticleGoogle Scholar
- Grarup N, Rose CS, Andersson EA, Andersen G, Nielsen AL, Albrechtsen A, Clausen JO, Rasmussen SS, Jorgensen T, Sandbaek A: Studies of Association of Variants Near the HHEX, CDKN2A/B, and IGF2BP2 Genes With Type 2 Diabetes and Impaired Insulin Release in 10,705 Danish Subjects: Validation and Extension of Genome-Wide Association Studies. Diabetes. 2007, 56: 3105-3111. 10.2337/db07-0856.PubMedView ArticleGoogle Scholar
- Samani NJ, Erdmann J, Hall AS, Hengstenberg C, Mangino M, Mayer B, Dixon RJ, Meitinger T, Braund P, Wichmann HE: Genomewide Association Analysis of Coronary Artery Disease. N Engl J Med. 2007, 357: 443-453. 10.1056/NEJMoa072366.PubMedPubMed CentralView ArticleGoogle Scholar
- Kronenberg F: Genome-wide association studies in aging-related processes such as diabetes mellitus, atherosclerosis and cancer. Exp Gerontol. 2008, 43: 39-43. 10.1016/j.exger.2007.09.005.PubMedView ArticleGoogle Scholar
- Ke X, Taylor MS, Cardon LR: Singleton SNPs in the human genome and implications for genome-wide association studies. Eur J Hum Genet. 2008Google Scholar
- Till BJ, Zerr T, Bowers E, Greene EA, Comai L, Henikoff S: High-throughput discovery of rare human nucleotide polymorphisms by Ecotilling. Nucleic Acids Res. 2006, 34: e99-10.1093/nar/gkl479. Erratum in: Nucleic Acids Res. 2006;34:5352.PubMedPubMed CentralView ArticleGoogle Scholar
- Holmila R, Husgafvel-Pursiainen K: Analysis of TP53 gene mutations in human lung cancer: comparison of capillary electrophoresis single strand conformation polymorphism assay with denaturing gradient gel electrophoresis and direct sequencing. Cancer Detect Prev. 2006, 30: 1-6. 10.1016/j.cdp.2005.10.004.PubMedView ArticleGoogle Scholar
- Hestekin CN, Barron AE: The potential of electrophoretic mobility shift assays for clinical mutation detection. Electrophoresis. 2006, 27: 3805-3815. 10.1002/elps.200600421.PubMedView ArticleGoogle Scholar
- Comai L, Young K, Till BJ, Reynolds SH, Greene EA, Codomo CA, Enns LC, Johnson JE, Burtner C, Odden AR: Efficient discovery of DNA polymorphisms in natural populations by Ecotilling. Plant J. 2004, 37: 778-786. 10.1111/j.0960-7412.2003.01999.x.PubMedView ArticleGoogle Scholar
- Till BJ, Reynolds SH, Greene EA, Codomo CA, Enns LC, Johnson JE, Burtner C, Odden AR, Young K, Taylor NE: Large-scale discovery of induced point mutations with high-throughput TILLING. Genome Res. 2003, 13: 524-530. 10.1101/gr.977903.PubMedPubMed CentralView ArticleGoogle Scholar
- Colbert T, Till BJ, Tompa R, Reynolds S, Steine MN, Yeung AT, McCallum CM, Comai L, Henikoff S: High-throughput screening for induced point mutations. Plant Physiol. 2001, 126: 480-484. 10.1104/pp.126.2.480.PubMedPubMed CentralView ArticleGoogle Scholar
- Oleykowski CA, Bronson Mullins CR, Godwin AK, Yeung AT: Mutation detection using a novel plant endonuclease. Nucleic Acids Res. 1998, 26: 4597-4602. 10.1093/nar/26.20.4597.PubMedPubMed CentralView ArticleGoogle Scholar
- Zerr T, Henikoff S: Automated band mapping in electrophoretic gel images using background information. Nucl Acids Res. 2005, 33: 2806-2812. 10.1093/nar/gki580.PubMedPubMed CentralView ArticleGoogle Scholar
- Lewis BP, Shih IH, Jones-Rhoades MW, Bartel DP, Burge CB: Prediction of mammalian microRNA targets. Cell. 2003, 115: 787-798. 10.1016/S0092-8674(03)01018-3.PubMedView ArticleGoogle Scholar
- Georges M, Clop A, Marcq F, Takeda H, Pirottin D, Hiard S, Tordoir X, Caiment F, Meish F, Bibe B: Polymorphic microRNA-target interactions: a novel source of phenotypic variation. Cold Spring Harb Symp Quant Biol. 2006, 71: 343-350. 10.1101/sqb.2006.71.056.PubMedView ArticleGoogle Scholar
- Bao L, Zhou M, Wu L, Lu L, Goldowitz D, Williams RW, Cui Y: PolymiRTS Database: linking polymorphisms in microRNA target sites with complex traits. Nucleic Acids Res. 2007, 35: D51-D54. 10.1093/nar/gkl797.PubMedPubMed CentralView ArticleGoogle Scholar
- Taylor J, Tyekucheva S, King DC, Hardison RC, Miller W, Chiaromonte F: ESPERR: learning strong and weak signals in genomic sequence alignments to identify functional elements. Genome Res. 2006, 16: 1596-1604. 10.1101/gr.4537706.PubMedPubMed CentralView ArticleGoogle Scholar
- Farrall M: Quantitative genetic variation: a post-modern view. Hum Mol Genet. 2004, 13: R1-R7. 10.1093/hmg/ddh084.PubMedView ArticleGoogle Scholar
- Tompa M, Li N, Bailey TL, Church GM, De Moor B, Eskin E, Favorov AV, Frith MC, Fu Y, Kent WJ: Assessing computational tools for the discovery of transcription factor binding sites. Nat Biotech. 2005, 23: 137-144. 10.1038/nbt1053.View ArticleGoogle Scholar
- Cartharius K, Frech K, Grote K, Klocke B, Haltmeier M, Klingenhoff A, Frisch M, Bayerlein M, Werner T: MatInspector and beyond: promoter analysis based on transcription factor binding sites. Bioinformatics. 2005, 21: 2933-2942. 10.1093/bioinformatics/bti473.PubMedView ArticleGoogle Scholar
- Frech K, nescu-Mayer J, Werner T: A novel method to develop highly specific models for regulatory units detects a new LTR in GenBank which contains a functional promoter. J Mol Biol. 1997, 270: 674-687. 10.1006/jmbi.1997.1140.PubMedView ArticleGoogle Scholar
- Bryne JC, Valen E, Tang MH, Marstrand T, Winther O, da PI, Krogh A, Lenhard B, Sandelin A: JASPAR, the open access database of transcription factor-binding profiles: new content and tools in the 2008 update. Nucleic Acids Res. 2008, 36: D102-D106. 10.1093/nar/gkm955.PubMedPubMed CentralView ArticleGoogle Scholar
- Matys V, Kel-Margoulis OV, Fricke E, Liebich I, Land S, Barre-Dirrie A, Reuter I, Chekmenev D, Krull M, Hornischer K: TRANSFAC and its module TRANSCompel: transcriptional gene regulation in eukaryotes. Nucleic Acids Res. 2006, 34: D108-D110. 10.1093/nar/gkj143.PubMedPubMed CentralView ArticleGoogle Scholar
- Sun H, Palaniswamy SK, Pohar TT, Jin VX, Huang TH, Davuluri RV: MPromDb: an integrated resource for annotation and visualization of mammalian gene promoters and ChIP-chip experimental data. Nucleic Acids Res. 2006, 34: D98-103. 10.1093/nar/gkj096.PubMedPubMed CentralView ArticleGoogle Scholar
- Schmid CD, Perier R, Praz V, Bucher P: EPD in its twentieth year: towards complete promoter coverage of selected model organisms. Nucleic Acids Res. 2006, 34: D82-D85. 10.1093/nar/gkj146.PubMedPubMed CentralView ArticleGoogle Scholar
- Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE: Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007, 447: 799-816. 10.1038/nature05874.PubMedView ArticleGoogle Scholar
- Mishra PJ, Mishra PJ, Banerjee D, Bertino JR: MiRSNPs or MiR-polymorphisms, new players in microRNA mediated regulation of the cell: Introducing microRNA pharmacogenomics. Cell Cycle. 2008, 7: 853-858.PubMedView ArticleGoogle Scholar
- Lewis BP, Burge CB, Bartel DP: Conserved seed pairing, often flanked by adenosines, indicates that thousands of human genes are microRNA targets. Cell. 2005, 120: 15-20. 10.1016/j.cell.2004.12.035.PubMedView ArticleGoogle Scholar
- Xie X, Lu J, Kulbokas EJ, Golub TR, Mootha V, Lindblad-Toh K, Lander ES, Kellis M: Systematic discovery of regulatory motifs in human promoters and 3' UTRs by comparison of several mammals. Nature. 2005, 434: 338-345. 10.1038/nature03441.PubMedPubMed CentralView ArticleGoogle Scholar
- Mardis ER: The impact of next-generation sequencing technology on genetics. Trends in Genetics. 2008, 24: 133-141.PubMedView ArticleGoogle Scholar
- Cordeiro G, Eliott FG, Henry RJ: An optimized ecotilling protocol for polyploids or pooled samples using a capillary electrophoresis system. Anal Biochem. 2006, 355: 145-147. 10.1016/j.ab.2006.03.026.PubMedView ArticleGoogle Scholar
- Cross MJ, Waters DL, Lee LS, Henry RJ: Endonucleolytic mutation analysis by internal labeling (EMAIL). Electrophoresis. 2008, 29: 1291-1301. 10.1002/elps.200700452.PubMedView ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.