Touring Ensembl: A practical guide to genome browsing
© Spudich and Fernández-Suárez; licensee BioMed Central Ltd. 2010
Received: 16 November 2009
Accepted: 11 May 2010
Published: 11 May 2010
The number of databases in molecular biological fields has rapidly increased to provide a large-scale resource. Though valuable information is available, data can be difficult to access, compare and integrate due to different formats and presentations of web interfaces. This paper offers a practical guide to the integration of gene, comparative genomic, and functional genomics data using the Ensembl website at http://www.ensembl.org.
The Ensembl genome browser and underlying databases focus on chordate organisms. More species such as plants and microorganisms can be investigated using our sister browser at http://www.ensemblgenomes.org.
In this study, four examples are used that sample many pages and features of the Ensembl browser. We focus on comparative studies across over 50 mostly chordate organisms, variations linked to disease, functional genomics, and access of external information housed in databases outside the Ensembl project. Researchers will learn how to go beyond simply exporting one gene sequence, and explore how a genome browser can integrate data from various sources and databases to build a full and comprehensive biological picture.
The ongoing increase in the number of databases in biological fields provides a large-scale resource. Last year saw the development of nearly 100 new molecular biological databases, bringing the total number of popular databases in this field to over 1,000. However, different formats and presentations of the GUIs (graphical user interfaces) make it difficult to access data. Collecting biological information from various sources and comparing it can be time consuming for the researcher. Genome browsers provide an aid to the researcher by importing biological data from various sources and presenting these data in an integrated way.
Genome browser and annotation (chordates)
Genome browser and annotation (nonchordates)
Genome browser and annotation (fruit fly)
NCBI Map Viewer
Genome browser and annotation (multi-species)
NCBI Sequence Viewer
Genome browser and annotation (multi-species)
Genome browser and annotation (multi-species)
VISTA Enhancer Browser
Non-coding elements (human)
1000 Genomes Browser
Genome browser (human, multiple individuals)
Visualization tool (multiple sources)
Visualization tool (multiple sources)
DAS sources available
Ensembl DAS list
DAS sources available in Ensembl
Protein and Nucleotides RefSeq
Repository of nucleic acid and protein sequences
Repository of protein sequence (manually curated)
Data mining tool for export of tables and sequences
Open source software for molecular biology
Construction and analysis of phylogenetic trees
Regulatory Features CisRed
Database: Regulatory sequences
DNase I Footprint
Database: Transcription factor binding sites (fly)
Database: miRNA targets (multi-species)
We focus on the Ensembl genome browser in this article, though a similar approach can be used with the other genome browsers shown in Table 1. The Ensembl project focuses on the chordate genomes, with the inclusion of additional model organisms that have been extensively studied in biological research and have a reliable, manually annotated gene set (Caenorhabditis elegans, Drosophila melanogaster and Saccharomyces cerevisiae). In addition to providing carefully predicted gene sets based on experimental evidence (sequences from UniProtKB/Swiss-Prot, manually curated sequences from NCBI RefSeq, and sequences from UniProtKB/TrEMBL (Table 1)), Ensembl includes annotation such as sequence variation, comparative associations, mRNA and protein from other databases, predicted features such as CpG islands, and repeats and motifs mapped along the genome. These annotations are graphically depicted along the genomic assembly to allow easier visualisation of a gene neighbourhood or a stretch of sequence.
Ensembl and other browsers provide displays of complex data sets that require time and computing power not generally available to the researcher. Homology relationships based on gene comparisons across all annotated species in Ensembl (53 species in release 55), along with whole-genome alignments, such as alignments of 31 mammalian genomes, can be readily viewed in the browser.
In the following four case studies, we use the Ensembl genome browser to demonstrate how to view and predict functional regions in the genome based on existing evidence. First, we examine known regulatory features for the human IL2 gene and discuss how to display these features in Ensembl. These promoter and enhancer-related elements can be readily exported using the BioMart tool [5–7].
In case study 2, we use human MYO6, a case in which gene regulation is not well understood. Using comparative genomics, we show how the location of functional sequences may be predicted. In case study 3, we demonstrate how the information in Ensembl can be extended through DAS (the Distributed Annotation System) to view data from external sources. Finally, in case study 4, we explore a variation associated with disease phenotypes.
These case studies aim to show how data from different sources can be viewed and compared for a gene or region in Ensembl. For a walk-through of how to use the browser to view comparative genomics, variations, and other Ensembl resources, please see our videos and previous publications [10, 11].
Case Study 1: Regulatory Regions for the IL2 Gene
We investigate IL2, the interleukin 2 gene, in human (ENSG00000109471). Gene regulation has been studied at the 5' end of the IL2 transcript and flanking sequence [12–14]. Within only 200 bp upstream of the translational start site, binding sites for proteins such as NF-κB, AP-1, and NFAT (nuclear factor of activated T-cells), DNase I hypersensitive sites and a TATA box can all be found. These regions have been shown to be involved in the control of the T-cell mediated immune response [15, 16].
Pop-up windows reveal more information for each track if a feature is clicked. In Figure 1, the pop-up window indicates a CTCF binding site in the regulatory features track. CTCF proteins are highly conserved zinc finger proteins associated with transcriptional activation and repression. Mutations in the CTCF gene are associated with invasive breast cancers, prostate cancers and Wilms' tumours [21, 22]. CTCF binding sites have recently been extensively mapped onto the human genome and are included in Ensembl as part of the regulatory build.
Regulatory features can be exported using the BioMart tool, or accessed via the Perl API from the Ensembl functional genomics database. A walk-through of the BioMart web interface is provided by Smedley et al. [5–7]. Based on this, to download regulatory features, choose "Ensembl functional genomics" as the database and the species of interest as the dataset. Filters can be applied to select by region (for example, a chromosome) or by a specific type of regulatory feature (such as DNase I hypersensitive site). Attributes determine the information output about these features (such as chromosomal coordinates or cell type). For more information about feature sources and the Ensembl regulatory build, see the Ensembl documentation.
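The three choices made in the BioMart web interface (dataset, filters, attributes) correspond to the parts of the XML query document that BioMart services accept. The following Python sketch only builds such a document and does not contact any server. The dataset, filter and attribute names used here (`hsapiens_regulatory_feature`, `chromosome_name`, `feature_type_name` and so on) are assumptions for illustration and should be checked against the names offered by the mart actually in use.

```python
import xml.etree.ElementTree as ET


def build_biomart_query(dataset, filters, attributes):
    """Return a BioMart-style XML query as a string.

    dataset    -- dataset name (assumed here; check the mart for real names)
    filters    -- dict of filter name -> value, e.g. a chromosome
    attributes -- list of attribute names to output as columns
    """
    query = ET.Element("Query", {
        "virtualSchemaName": "default",
        "formatter": "TSV",
        "header": "1",
        "uniqueRows": "1",
        "datasetConfigVersion": "0.6",
    })
    ds = ET.SubElement(query, "Dataset",
                       {"name": dataset, "interface": "default"})
    # Filters restrict which features are returned.
    for name, value in filters.items():
        ET.SubElement(ds, "Filter", {"name": name, "value": str(value)})
    # Attributes decide which columns appear in the output table.
    for name in attributes:
        ET.SubElement(ds, "Attribute", {"name": name})
    return ET.tostring(query, encoding="unicode")


# Hypothetical names, chosen only to mirror the steps described in the text.
xml_query = build_biomart_query(
    dataset="hsapiens_regulatory_feature",
    filters={"chromosome_name": "4"},
    attributes=["chromosome_name", "chromosome_start",
                "chromosome_end", "feature_type_name"],
)
print(xml_query)
```

Keeping the query as a small function makes it easy to swap the chromosome filter for a feature-type filter without rewriting the XML by hand.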
The "constrained elements" blocks (Figure 1, label 1) are genomic regions that are highly conserved across the aligned species (33 in this example). Constrained elements result from GERP scoring of each base pair position within a multi-species alignment. High GERP scores represent the most conserved base pairs, and correspond to blocks in the 'conservation' track. The constrained elements in Figure 1 align to the 5' and 3' ends of the Ensembl transcript for IL2 and to regulatory regions, indicating high sequence conservation and thus possible function.
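The relationship between per-base scores and blocks can be illustrated with a toy example in Python. This is not the GERP method, which derives constrained elements through its own statistical procedure; the sketch simply groups consecutive positions whose score reaches a cut-off. The scores, the threshold and the minimum block length are all invented for illustration.

```python
def conserved_blocks(scores, threshold, min_length=1):
    """Group consecutive positions scoring >= threshold into blocks.

    Returns a list of (start, end) tuples, 0-based and end-exclusive.
    """
    blocks = []
    start = None
    for i, score in enumerate(scores):
        if score >= threshold:
            if start is None:
                start = i          # a new block opens here
        elif start is not None:
            if i - start >= min_length:
                blocks.append((start, i))
            start = None           # the block has closed
    # Close a block that runs to the end of the sequence.
    if start is not None and len(scores) - start >= min_length:
        blocks.append((start, len(scores)))
    return blocks


# Invented per-base scores for a nine-base stretch.
toy_scores = [0.1, 2.5, 3.0, 2.8, 0.2, 0.0, 2.9, 3.1, 0.3]
print(conserved_blocks(toy_scores, threshold=2.0, min_length=2))
# prints [(1, 4), (6, 8)]
```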
A third track displays data from 'CisRED', a database of patterns and motifs associated with regulatory regions; 'miRanda', a collection of miRNA targets identified in the genome; and the 'VISTA' enhancer set (Figure 1, label 2). Features in this track align to the regions flanking the IL2 coding sequence, and to the conserved sequence blocks.
The alignment display is highly customisable. Numbering can be turned on or off, and exons highlighted. Pairwise comparisons or multiple alignments can be displayed at the nucleotide level. Alignments can be exported using the export data link at the left of the view.
Case Study 2: Function for a Gene
In case 1, we investigated a gene for which promoter and enhancer elements are already known. Although most human genes in Ensembl are labelled as 'known', signifying a good match to a cDNA or protein in a biological database such as UniProt or NCBI RefSeq, many of these genes have regulatory sequences that have not been investigated. In addition, many proteins have unknown function. How can we predict function for a protein that is not well understood in terms of its role in the cell?
In this example we consider human MYO6 (ENSG00000196586), which has been studied in the mouse model to understand its role in endocytosis and inner-ear development [30, 31]. What is known about this gene? We can first look for mouse homologues of the human MYO6 gene by clicking on the orthologues link at the left of the gene tab for ENSG00000196586. At the time of writing (Ensembl release 55), one mouse orthologue is known for human MYO6: ENSMUSG00000033577.
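Ensembl stable identifiers such as the ones quoted in this case study follow a regular pattern, so a mistyped identifier (for example, one with a missing digit) can be caught before it is used in a search. The Python sketch below assumes the pattern "ENS", an optional species code of three or four letters, one feature letter, and eleven digits. It handles only gene, transcript, protein and exon identifiers, and the function and variable names are hypothetical.

```python
import re

# "ENS" + optional species code + feature letter + 11 digits.
# Assumption: species codes are 3-4 capital letters; human IDs carry none.
STABLE_ID = re.compile(r"^ENS([A-Z]{3,4})?([GTPE])(\d{11})$")

FEATURES = {"G": "gene", "T": "transcript", "P": "protein", "E": "exon"}


def parse_stable_id(identifier):
    """Return (species_code, feature_type), or None if the ID is malformed."""
    match = STABLE_ID.match(identifier)
    if match is None:
        return None
    species, feature, _digits = match.groups()
    return (species or "", FEATURES[feature])


print(parse_stable_id("ENSG00000196586"))     # ('', 'gene')
print(parse_stable_id("ENSMUSG00000033577"))  # ('MUS', 'gene')
print(parse_stable_id("ENSG0000196586"))      # None: only ten digits
```

An empty species code corresponds to human identifiers, while `MUS` marks the mouse identifiers used in this example.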
Many GO terms for the human MYO6 transcript have been projected from mouse homologues (one example is shown in Figure 4A). Clicking on the mouse protein identifier ENSMUSP00000108893, then on the Gene ontology link at the left, shows the GO terms associated with the mouse protein. Protein binding is 'inferred from physical interaction (IPI)' in transcript ENSMUST00000113268.
The same GO term is listed for the human MYO6 gene in Figure 4A, based on homology to the mouse Myo6 gene. The evidence code 'IEA', or 'inferred from electronic annotation', marks a projected GO term. Such projected terms may aid in predicting functions for a protein, based on homology.
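When GO annotations are exported for further analysis, the evidence code is what separates experimentally supported terms from electronically inferred ones. The Python sketch below uses the two records described in the text; the record layout and function name are assumptions made for illustration, and only the two evidence codes mentioned here are covered.

```python
# Evidence codes mentioned in the text, with their meanings.
EVIDENCE_CODES = {
    "IPI": "inferred from physical interaction",
    "IEA": "inferred from electronic annotation",
}

# The two annotations described above, in a hypothetical record layout.
annotations = [
    {"id": "ENSMUST00000113268", "term": "protein binding", "evidence": "IPI"},
    {"id": "ENSG00000196586", "term": "protein binding", "evidence": "IEA"},
]


def split_by_support(records):
    """Separate electronically inferred (IEA) annotations from the rest."""
    electronic = [r for r in records if r["evidence"] == "IEA"]
    other = [r for r in records if r["evidence"] != "IEA"]
    return other, electronic


other, electronic = split_by_support(annotations)
for record in electronic:
    meaning = EVIDENCE_CODES[record["evidence"]]
    print(f"{record['id']}: {record['term']} ({meaning})")
```

Terms in the electronic group are best treated as hypotheses to be checked against the evidence attached to the source annotation.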
Identifying sequences involved in gene regulation is also important in understanding function. In case 1 we looked at the region upstream of the IL2 gene, which is rich with known regulatory regions. For the human MYO6 gene, we can make some predictions using a similar approach to case 1.
The constrained elements track and the CisRED/miRanda/VISTA features are also selected in this example. These indicate regions that may function in gene regulation.
In addition, more elements associated with regulatory regions can be displayed along the genome in this view. For example, other elements associated with promoters, such as CpG islands [4, 40, 41] or those determined with FirstEF or Eponine, can be selected using the configure this page option at the left.
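To make the idea of a CpG island concrete, the Python sketch below computes two quantities commonly used to describe one: GC content and the ratio of observed to expected CpG dinucleotides, taken as (number of CpG × length) / (number of C × number of G). The default thresholds (at least 200 bp, GC content of at least 50%, ratio of at least 0.6) are commonly quoted values and are an assumption here; they are not presented as the criteria behind the Ensembl CpG island track.

```python
def cpg_stats(seq):
    """Return (gc_fraction, observed/expected CpG ratio) for a DNA string."""
    seq = seq.upper()
    n = len(seq)
    c = seq.count("C")
    g = seq.count("G")
    cpg = seq.count("CG")  # "CG" cannot overlap itself, so count() is exact
    gc_fraction = (c + g) / n if n else 0.0
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_fraction, obs_exp


def looks_like_cpg_island(seq, min_length=200, min_gc=0.5, min_obs_exp=0.6):
    """Apply simple length, GC and obs/exp cut-offs (assumed defaults)."""
    gc_fraction, obs_exp = cpg_stats(seq)
    return (len(seq) >= min_length
            and gc_fraction >= min_gc
            and obs_exp >= min_obs_exp)


gc, ratio = cpg_stats("CGCGCGATATCGCG")
print(round(gc, 3), round(ratio, 3))        # 0.714 2.8
print(looks_like_cpg_island("CG" * 100))    # True
print(looks_like_cpg_island("AT" * 100))    # False
```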
Conclusions from this case study can be drawn from the GO term associations and the putative regulatory regions. Proposed functions for the human MYO6 gene and protein include actin filament binding and regulation of secretion. These are based on the known functions of the mouse Myo6 gene, which the gene tree shows to be homologous to human MYO6. Furthermore, the regulatory build indicates signatures of open chromatin such as CTCF binding sites and DNase I hypersensitive sites, along with histone modification sites. Open chromatin and histone modification sites at the 5' end of MYO6 transcripts suggest a potential regulatory region (Figure 5). This sequence could be further investigated for promoter activity.
Case Study 3: Viewing information outside Ensembl databases
The Distributed Annotation System (DAS) allows Ensembl to link out to and display information from external databases in supported formats. DAS transforms Ensembl into a framework where third party annotation can be added and viewed alongside Ensembl annotation. The DAS registry provides a repository of external sources, and makes it easy for users to select these data to be displayed in Ensembl. These data can be viewed in the browser along the genome, or as annotation for a gene or transcript. This powerful system integrates data from databases around the world, and is available for all species.
To add DAS tracks to the Location views (such as Region in Detail, shown in Figure 6), users can click on the configure this page link at the left. A greater selection of DAS tracks is found upon clicking manage your data at the left, and then following the Attach DAS link to access the DAS registry. In addition to viewing 'live' external data with DAS, users may draw their own tracks along the chromosome. User data can be displayed in Location views, such as Region in Detail, a chromosome or karyotype.
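Behind the browser, a DAS client asks a server for the features that fall within a genomic segment. The Python sketch below builds a DAS-style features request of the form `/das/<source>/features?segment=<reference>:<start>,<end>`. The server address and source name are placeholders, not real services, and the coordinates are those of the IL2 Region in Detail view listed in the references. No request is sent.

```python
from urllib.parse import urlencode


def das_features_url(server, source, reference, start, end):
    """Build a DAS-style 'features' request URL for one genomic segment."""
    if start > end:
        raise ValueError("start must not be greater than end")
    segment = f"{reference}:{start},{end}"
    base = f"{server.rstrip('/')}/das/{source}/features"
    # Keep ':' and ',' readable instead of percent-encoding them.
    return f"{base}?{urlencode({'segment': segment}, safe=':,')}"


# Placeholder server and source; chromosome 4 region around human IL2.
url = das_features_url("http://das.example.org", "my_source",
                       "4", 123371793, 123378703)
print(url)
# http://das.example.org/das/my_source/features?segment=4:123371793,123378703
```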
Case Study 4: From phenotype to SNP: exploring variation
Population variation in Ensembl is imported from NCBI dbSNP, among other sources, and is represented in a variety of views. Clicking on a variation identifier within the Ensembl website opens the variation tab and brings the focus to data for one specific variation, such as a single nucleotide polymorphism (SNP) or insertion-deletion (indel) mutation. Associated data such as allele frequencies from genotype studies done by HapMap or Perlegen, or the phenotype information described above can be found in this way.
Results and Discussion
Genome browsers have gone beyond the simple display of genes and transcripts, moving into the integration of biological data. Ensembl pages allow information annotated on a genome to be shown alongside genes in one display. This annotation comes from various sources and includes sequence variation, conserved regions, motifs such as CpG islands and sequences associated with regulatory regions and promoters. DAS allows Ensembl to draw together more information from more databases, displaying data from external sources as an added layer of information. It also allows the biological community to display and publish their data in an integrated framework. Furthermore, Ensembl itself is a DAS server, so other browsers may in turn display Ensembl data as an external source.
As demonstrated in the case studies outlined here, experimentalists targeting potential functional regions for a gene could use a quick display of a variety of sequence features to form a basis for such predictions. The whole genome alignments leading to comparison of sequences across species can indicate important functional regions that are highly conserved. Regulatory features and associated motifs can be compared with these conserved regions to direct researchers towards undiscovered, potentially functional sites.
- Galperin MY, Cochrane GR: Nucleic Acids Research annual Database Issue and the NAR online Molecular Biology Database Collection in 2009. Nucleic Acids Res. 2009, 37 (Database issue): D1-4. 10.1093/nar/gkn942.
- Bairoch A, Apweiler R, Wu CH, Barker WC, Boeckmann B, Ferro S, Gasteiger E, Huang H, Lopez R, Magrane M, Martin MJ, Natale DA, O'Donovan C, Redaschi N, Yeh LS: The Universal Protein Resource (UniProt). Nucleic Acids Res. 2005, 33 (Database issue): D154-9.
- Pruitt KD, Tatusova T, Maglott DR: NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2007, 35 (Database issue): D61-5. 10.1093/nar/gkl842.
- Antequera F, Bird A: CpG islands as genomic footprints of promoters that are associated with replication origins. Curr Biol. 1999, 9: R661-7. 10.1016/S0960-9822(99)80418-7.
- Kasprzyk A, Keefe D, Smedley D, London D, Spooner W, Melsopp C, Hammond M, Rocca-Serra P, Cox T, Birney E: EnsMart: a generic system for fast and flexible access to biological data. Genome Res. 2004, 14 (1): 160-169. 10.1101/gr.1645104.
- Haider S, Ballester B, Smedley D, Zhang J, Rice P, Kasprzyk A: BioMart Central Portal--unified access to biological data. Nucleic Acids Res. 2009, 37 (Web Server issue): W23-7. 10.1093/nar/gkp265.
- Smedley D, Haider S, Ballester B, Holland R, London D, Thorisson G, Kasprzyk A: BioMart--biological queries made easy. BMC Genomics. 2009, 10: 22. 10.1186/1471-2164-10-22.
- Dowell RD, Jokerst RM, Day A, Eddy SR, Stein L: The distributed annotation system. BMC Bioinformatics. 2001, 2 (1): 7. 10.1186/1471-2105-2-7.
- Ensembl Tutorials and Worked Examples. [http://www.ensembl.org/info/website/tutorials/index.html]
- Spudich G, Fernandez-Suarez XM, Birney E: Genome browsing with Ensembl: a practical overview. Brief Funct Genomic Proteomic. 2007, 6 (3): 202-219. 10.1093/bfgp/elm025.
- Fernández-Suárez XM, Schuster MK: Using the Ensembl Genome Server to Browse Genomic Sequence Data. Current Protocols in Bioinformatics. 2010, 1.15.1-1.15.48. Supplement 30
- Fujita T, Takaoka C, Matsui H, Taniguchi T: Structure of the human interleukin 2 gene. Proc Natl Acad Sci USA. 1983, 80 (24): 7437-7441. 10.1073/pnas.80.24.7437.
- Jankevics E, Makarenkova G, Tsimanis A, Grens E: Structure and analysis of the 5' flanking region of the human interleukin-2 gene. Biochim Biophys Acta. 1994, 1217 (2): 235-238.
- Jain J, Loh C, Rao A: Transcriptional regulation of the IL-2 gene. Curr Opin Immunol. 1995, 7 (3): 333-342. 10.1016/0952-7915(95)80107-3.
- Wu Y, Borde M, Heissmeyer V, Feuerer M, Lapan AD, Stroud JC, Bates DL, Guo L, Han A, Ziegler SF, Mathis D, Benoist C, Chen L, Rao A: FOXP3 controls regulatory T cell function through cooperation with NFAT. Cell. 2006, 126 (2): 375-387. 10.1016/j.cell.2006.05.042.
- Ono M, Yaguchi H, Ohkura N, Kitabayashi I, Nagamura Y, Nomura T, Miyachi Y, Tsukada T, Sakaguchi S: Foxp3 controls regulatory T-cell function by interacting with AML1/Runx1. Nature. 2007, 446 (7136): 685-689. 10.1038/nature05673.
- ENCODE Project Consortium: Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007, 447 (7146): 799-816. 10.1038/nature05874.
- Horak CE, Snyder M: ChIP-chip: a genomic approach for identifying transcription factor binding sites. Methods Enzymol. 2002, 350: 469-483.
- Robertson G, Hirst M, Bainbridge M, Bilenky M, Zhao Y, Zeng T, Euskirchen G, Bernier B, Varhol R, Delaney A, Thiessen N, Griffith OL, He A, Marra M, Snyder M, Jones S: Genome-wide profiles of STAT1 DNA association using chromatin immunoprecipitation and massively parallel sequencing. Nat Methods. 2007.
- Barski A, Cuddapah S, Cui K, Roh TY, Schones DE, Wang Z, Wei G, Chepelev I, Zhao K: High-resolution profiling of histone methylations in the human genome. Cell. 2007, 129 (4): 823-837. 10.1016/j.cell.2007.05.009.
- Ohlsson R, Renkawitz R, Lobanenkov V: CTCF is a uniquely versatile transcription regulator linked to epigenetics and disease. Trends Genet. 2001, 17 (9): 520-527. 10.1016/S0168-9525(01)02366-6.
- Hancock AL, Brown KW, Moorwood K, Moon H, Holmgren C, Mardikar SH, Dallosso AR, Klenova E, Loukinov D, Ohlsson R, Lobanenkov VV, Malik K: A CTCF-binding silencer regulates the imprinted genes AWT1 and WT1-AS and exhibits sequential epigenetic defects during Wilms' tumourigenesis. Hum Mol Genet. 2007, 16 (3): 343-354. 10.1093/hmg/ddl478.
- Kim TH, Abdullaev ZK, Smith AD, Ching KA, Loukinov DI, Green RD, Zhang MQ, Lobanenkov VV, Ren B: Analysis of the vertebrate insulator protein CTCF-binding sites in the human genome. Cell. 2007, 128 (6): 1231-1245. 10.1016/j.cell.2006.12.048.
- ensembl.org. [http://www.ensembl.org/biomart/martview]
- ensembl.org regulatory build. [http://www.ensembl.org/info/docs/funcgen/index.html]
- Cooper GM, Stone EA, Asimenos G, NISC Comparative Sequencing Program, Green ED, Batzoglou S, Sidow A: Distribution and intensity of constraint in mammalian genomic sequence. Genome Res. 2005, 15 (7): 901-913. 10.1101/gr.3577405.
- Robertson G, Bilenky M, Lin K, He A, Yuen W, Dagpinar M, Varhol R, Teague K, Griffith OL, Zhang X, Pan Y, Hassel M, Sleumer MC, Pan W, Pleasance ED, Chuang M, Hao H, Li YY, Robertson N, Fjell C, Li B, Montgomery SB, Astakhova T, Zhou J, Sander J, Siddiqui AS, Jones SJ: cisRED: a database system for genome-scale computational discovery of regulatory elements. Nucleic Acids Res. 2006, 34 (Database issue): D68-73. 10.1093/nar/gkj075.
- Enright AJ, John B, Gaul U, Tuschl T, Sander C, Marks DS: MicroRNA targets in Drosophila. Genome Biol. 2003, 5 (1): R1. 10.1186/gb-2003-5-1-r1.
- enhancer. [http://enhancer.lbl.gov/]
- Osterweil E, Wells DG, Mooseker MS: A role for myosin VI in postsynaptic structure and glutamate receptor endocytosis. J Cell Biol. 2005, 168 (2): 329-338. 10.1083/jcb.200410091.
- Self T, Sobe T, Copeland NG, Jenkins NA, Avraham KB, Steel KP: Role of myosin VI in the differentiation of cochlear hair cells. Dev Biol. 1999, 214 (2): 331-341. 10.1006/dbio.1999.9424.
- Gene MYO6. [http://Mar2010.archive.ensembl.org/Homo_sapiens/Gene/Compara_Ortholog?g=ENSG00000196586]
- Vilella AJ, Severin J, Ureta-Vidal A, Heng L, Durbin R, Birney E: EnsemblCompara GeneTrees: Complete, duplication-aware phylogenetic trees in vertebrates. Genome Res. 2009, 19 (2): 327-335. 10.1101/gr.073585.107.
- Waterhouse AM, Procter JB, Martin DM, Clamp M, Barton GJ: Jalview Version 2--a multiple sequence alignment editor and analysis workbench. Bioinformatics. 2009, 25 (9): 1189-1191. 10.1093/bioinformatics/btp033.
- Harris MA, Clark J, Ireland A, Lomax J, Ashburner M, Foulger R, Eilbeck K, Lewis S, Marshall B, Mungall C, Richter J, Rubin GM, Blake JA, Bult C, Dolan M, Drabkin H, Eppig JT, Hill DP, Ni L, Ringwald M, Balakrishnan R, Cherry JM, Christie KR, Costanzo MC, Dwight SS, Engel S, Fisk DG, Hirschman JE, Hong EL, Nash RS, Sethuraman A, Theesfeld CL, Botstein D, Dolinski K, Feierbach B, Berardini T, Mundodi S, Rhee SY, Apweiler R, Barrell D, Camon E, Dimmer E, Lee V, Chisholm R, Gaudet P, Kibbe W, Kishore R, Schwarz EM, Sternberg P, Gwinn M, Hannick L, Wortman J, Berriman M, Wood V, de la Cruz N, Tonellato P, Jaiswal P, Seigfried T, White R, Gene Ontology Consortium: The Gene Ontology (GO) database and informatics resource. Nucleic Acids Res. 2004, 32 (Database issue): D258-61.
- Ashburner M, Ball CA, Blake JA, Botstein D, Butler H, Cherry JM, Davis AP, Dolinski K, Dwight SS, Eppig JT, Harris MA, Hill DP, Issel-Tarver L, Kasarskis A, Lewis S, Matese JC, Richardson JE, Ringwald M, Rubin GM, Sherlock G: Gene ontology: tool for the unification of biology. The Gene Ontology Consortium. Nat Genet. 2000, 25 (1): 25-29. 10.1038/75556.
- geneontology. [http://www.geneontology.org]
- Ensembl 57: Gene Ontology (MYO6). [http://Mar2010.archive.ensembl.org/Homo_sapiens/Transcript/GO?g=ENSG00000196586;t=ENST00000428345]
- Ensembl 57: Gene Ontology (Mouse Myo6). [http://Mar2010.archive.ensembl.org/Mus_musculus/Transcript/GO?db=core;t=ENSMUST00000113268]
- Larsen F, Gundersen G, Lopez R, Prydz H: CpG islands as gene markers in the human genome. Genomics. 1992, 13 (4): 1095-1107. 10.1016/0888-7543(92)90024-M.
- Bird A: CpG-rich islands and the function of DNA methylation. Nature. 1986, 321: 209. 10.1038/321209a0.
- Davuluri RV, Grosse I, Zhang MQ: Computational identification of promoters and first exons in the human genome. Nat Genet. 2001, 29 (4): 412-417. 10.1038/ng780.
- Down TA, Hubbard TJ: Computational detection and location of transcription start sites in mammalian genomic DNA. Genome Res. 2002, 12 (3): 458-461. 10.1101/gr.216102.
- The DAS Registry. [http://www.dasregistry.org/]
- Adams DJ, Biggs PJ, Cox T, Davies R, van der Weyden L, Jonkers J, Smith J, Plumb B, Taylor R, Nishijima I, Yu Y, Rogers J, Bradley A: Mutagenic insertion and chromosome engineering resource (MICER). Nat Genet. 2004, 36 (8): 867-871. 10.1038/ng1388.
- Archive Ensembl. [http://Mar2010.archive.ensembl.org/Mus_musculus/Location/Genome]
- Hindorff LA, Sethupathy P, Junkins HA, Ramos EM, Mehta JP, Collins FS, Manolio TA: Potential etiologic and functional implications of genome-wide association loci for human diseases and traits. Proc Natl Acad Sci USA. 2009, 106 (23): 9362-9367. 10.1073/pnas.0903103106.
- Sayers EW, Barrett T, Benson DA, Bryant SH, Canese K, Chetvernin V, Church DM, DiCuccio M, Edgar R, Federhen S, Feolo M, Geer LY, Helmberg W, Kapustin Y, Landsman D, Lipman DJ, Madden TL, Maglott DR, Miller V, Mizrachi I, Ostell J, Pruitt KD, Schuler GD, Sequeira E, Sherry ST, Shumway M, Sirotkin K, Souvorov A, Starchenko G, Tatusova TA, Wagner L, Yaschenko E, Ye J: Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2009, 37 (Database issue): D5-15. 10.1093/nar/gkn741.
- Ensembl Variation. [http://Mar2010.archive.ensembl.org/info/docs/variation/index.html]
- Chen J, Cunningham F, Rios D, McLaren WM, Smith J, Pritchard B, Spudich GM, Brent S, Kulesha E, Marin-Garcia P, Smedley D, Birney E, Flicek P: Ensembl Variation Resources. BMC Genomics. 2010, 11: 293. 10.1186/1471-2164-11-293.
- International HapMap Consortium, Frazer KA, Ballinger DG, Cox DR, Hinds DA, Stuve LL, Gibbs RA, Belmont JW, Boudreau A, Hardenbol P, Leal SM, Pasternak S, Wheeler DA, Willis TD, Yu F, Yang H, Zeng C, Gao Y, Hu H, Hu W, Li C, Lin W, Liu S, Pan H, Tang X, Wang J, Wang W, Yu J, Zhang B, Zhang Q, Zhao H, Zhao H, Zhou J, Gabriel SB, Barry R, Blumenstiel B, Camargo A, Defelice M, Faggart M, Goyette M, Gupta S, Moore J, Nguyen H, Onofrio RC, Parkin M, Roy J, Stahl E, Winchester E, Ziaugra L, Altshuler D, Shen Y, Yao Z, Huang W, Chu X, He Y, Jin L, Liu Y, Shen Y, Sun W, Wang H, Wang Y, Wang Y, Xiong X, Xu L, Waye MM, Tsui SK, Xue H, Wong JT, Galver LM, Fan JB, Gunderson K, Murray SS, Oliphant AR, Chee MS, Montpetit A, Chagnon F, Ferretti V, Leboeuf M, Olivier JF, Phillips MS, Roumy S, Sallee C, Verner A, Hudson TJ, Kwok PY, Cai D, Koboldt DC, Miller RD, Pawlikowska L, Taillon-Miller P, Xiao M, Tsui LC, Mak W, Song YQ, Tam PK, Nakamura Y, Kawaguchi T, Kitamoto T, Morizono T, Nagashima A, Ohnishi Y, Sekine A, Tanaka T, Tsunoda T, Deloukas P, Bird CP, Delgado M, Dermitzakis ET, Gwilliam R, Hunt S, Morrison J, Powell D, Stranger BE, Whittaker P, Bentley DR, Daly MJ, de Bakker PI, Barrett J, Chretien YR, Maller J, McCarroll S, Patterson N, Pe'er I, Price A, Purcell S, Richter DJ, Sabeti P, Saxena R, Schaffner SF, Sham PC, Varilly P, Altshuler D, Stein LD, Krishnan L, Smith AV, Tello-Ruiz MK, Thorisson GA, Chakravarti A, Chen PE, Cutler DJ, Kashuk CS, Lin S, Abecasis GR, Guan W, Li Y, Munro HM, Qin ZS, Thomas DJ, McVean G, Auton A, Bottolo L, Cardin N, Eyheramendy S, Freeman C, Marchini J, Myers S, Spencer C, Stephens M, Donnelly P, Cardon LR, Clarke G, Evans DM, Morris AP, Weir BS, Tsunoda T, Mullikin JC, Sherry ST, Feolo M, Skol A, Zhang H, Zeng C, Zhao H, Matsuda I, Fukushima Y, Macer DR, Suda E, Rotimi CN, Adebamowo CA, Ajayi I, Aniagwu T, Marshall PA, Nkwodimmah C, Royal CD, Leppert MF, Dixon M, Peiffer A, Qiu R, Kent A, Kato K, Niikawa N, Adewole IF, Knoppers BM, Foster 
MW, Clayton EW, Watkin J, Gibbs RA, Belmont JW, Muzny D, Nazareth L, Sodergren E, Weinstock GM, Wheeler DA, Yakub I, Gabriel SB, Onofrio RC, Richter DJ, Ziaugra L, Birren BW, Daly MJ, Altshuler D, Wilson RK, Fulton LL, Rogers J, Burton J, Carter NP, Clee CM, Griffiths M, Jones MC, McLay K, Plumb RW, Ross MT, Sims SK, Willey DL, Chen Z, Han H, Kang L, Godbout M, Wallenburg JC, L'Archeveque P, Bellemare G, Saeki K, Wang H, An D, Fu H, Li Q, Wang Z, Wang R, Holden AL, Brooks LD, McEwen JE, Guyer MS, Wang VO, Peterson JL, Shi M, Spiegel J, Sung LM, Zacharia LF, Collins FS, Kennedy K, Jamieson R, Stewart J: A second generation human haplotype map of over 3.1 million SNPs. Nature. 2007, 449 (7164): 851-861. 10.1038/nature06258.
- Peacock E, Whiteley P: Perlegen sciences, inc. Pharmacogenomics. 2005, 6 (4): 439-442. 10.1517/14622422.214.171.1249.
- Genome Reference Consortium. [http://www.ncbi.nlm.nih.gov/projects/genome/assembly/grc/index.shtml]
- Ensembl 57: Region in Detail (IL2). [http://Mar2010.archive.ensembl.org/Homo_sapiens/Location/View?r=4:123371793-123378703]
- Ensembl 57: Genomic Alignments (IL2). [http://Mar2010.archive.ensembl.org/Homo_sapiens/Gene/Compara_Alignments?align=457&db=core&g=ENSG00000109471&r=4%3A123372630-123377650&t=ENST00000226730]
- Thomas MC, Chiang CM: The general transcription machinery and general cofactors. Crit Rev Biochem Mol Biol. 2006, 41 (3): 105-178. 10.1080/10409230600648736.
- Shaw JP, Utz PJ, Durand DB, Toole JJ, Emmel EA, Crabtree GR: Identification of a putative regulator of early T cell activation genes. Science. 1988, 241 (4862): 202-205. 10.1126/science.3260404.
- Ensembl 57: Gene Tree (MYO6). [http://Mar2010.archive.ensembl.org/Homo_sapiens/Gene/Compara_Tree?g=ENSG00000196586]
- Gene Ontology. [http://www.geneontology.org/GO.evidence.shtml]
- Ensembl 57: Region in Detail (MYO6). [http://Mar2010.archive.ensembl.org/Homo_sapiens/Location/View?db=core;g=ENSG00000196586;t=ENST00000428345;r=6:76441551-76501760]
- Ensembl 57: Region in Detail (mouse IL2). [http://Mar2010.archive.ensembl.org/Mus_musculus/Location/View?r=3:37018158-37039973]
- Ensembl 57: Phenotype Data (rs2476601). [http://Mar2010.archive.ensembl.org/Homo_sapiens/Variation/Phenotype?r=1:114372569-114382568;v=rs2476601;vdb=variation;vf=13431890]
- Ensembl 57: Region in Detail (PTPN22). [http://Mar2010.archive.ensembl.org/Homo_sapiens/Location/View?r=1:114372569-114382568;v=rs2476601;vdb=variation;vf=13431890]
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.