Genomic sequencing and analysis of a Chinese hamster ovary cell line using Illumina sequencing technology
© Hammond et al; licensee BioMed Central Ltd. 2011
Received: 14 September 2010
Accepted: 26 January 2011
Published: 26 January 2011
Chinese hamster ovary (CHO) cells are among the most widely used hosts for therapeutic protein production. Yet few genomic resources are available to aid in engineering high-producing cell lines.
High-throughput Illumina sequencing was used to generate a 1x genomic coverage of an engineered CHO cell line expressing secreted alkaline phosphatase (SEAP). Reference-guided alignment and assembly produced 3.57 million contigs and CHO-specific sequence information for ~ 18,000 mouse and ~ 19,000 rat orthologous genes. The majority of these genes are involved in metabolic processes, cellular signaling, and transport and represent attractive targets for cell line engineering.
This demonstrates the applicability of next-generation sequencing technology and comparative genomic analysis in the development of CHO genomic resources.
With over half of all recombinant therapeutic proteins produced in mammalian cell lines, Chinese hamster ovary (CHO) cells remain the predominant production system for glycosylated biopharmaceuticals. Although improvements in cell engineering, cell line selection, and culture conditions have increased productivity levels, the genetic basis underlying hyperproductivity remains poorly defined. The further development of genomic resources will facilitate detailed studies of genome structure, gene regulation, and gene expression in high-producing cell lines and aid in the use of sequence-specific molecular tools in cell line development.
A number of resources are required to support the assembly and annotation of the CHO genome, including physical maps, genomic sequences, expressed sequence tag (EST) sequences, and proteomic data. Recent efforts to sequence and characterize bacterial artificial chromosome (BAC) libraries derived from CHO cells provide information for physical mapping of the CHO genome [3, 4]. Transcriptomic and proteomic studies are currently used to examine differential expression in high-producing cell lines and to identify gene candidates for host cell engineering [5–7]. Transcriptomic studies that rely on cross-hybridization to mouse DNA microarrays showed some success [8, 9], but also demonstrated the need for CHO-specific sequence information. Continued EST sequencing of CHO cell lines has generated databases containing more than 60,000 sequences and allowed for the development of CHO-specific DNA microarrays [10, 11]. Furthermore, mapping of CHO EST sequences to a mouse genomic scaffold can potentially reveal structural and regulatory features of the CHO genome. Such studies are limited in that only the subset of genes expressed at sufficiently high levels is captured for sequence analysis, providing little information regarding genome structure or non-transcribed portions of the genome.
At present, there is little genomic sequence data available for CHO cells. This limits the application of high-throughput molecular tools in gene discovery and cell line engineering. CHO cell lines also undergo multiple genomic rearrangements during the generation of high-producing cell lines, necessitating the sequencing of individual cell lines rather than the Chinese hamster [13, 14]. Until recently, EST sequences were obtained by traditional Sanger technology, but current efforts are employing next-generation sequencing technologies including 454 and Illumina [11, 16, 17]. 454 pyrosequencing can generate up to 1 Gb of data in a single run, producing average read lengths of 330 bp with an average error rate of 4%, although a major limitation of this technology is the resolution of homopolymer regions [18, 19]. Illumina sequencing can produce up to 90 Gb of data in a single run, generating reads up to 100 bp in length with an average error rate of 1-1.5% [19, 20]. These technologies have significantly improved sequencing throughput and decreased cost, making mammalian genome sequencing feasible.
In this work, Illumina sequencing technology was used to generate an initial genomic sequence library of a Chinese hamster cell line with the goal of making these data available to the community. Comparative genomic analysis of this library was used to identify and functionally classify assembled sequences that were aligned to mouse and rat genes. An initial ~ 1x coverage of the CHO cell genome provided CHO-specific sequence information for a large number of protein coding genes, including those from functional classes typically underrepresented in EST libraries. This demonstrates that even low coverage genomic sequencing studies of CHO cell lines can increase the amount of sequence information available for this cell line.
Illumina sequencing and reference-guided alignment
[Table: Alignment of CHO short reads to mouse and rat reference genomes. Rows: aligned to unique regions; aligned to repeat regions; aligned to protein-coding genes*.]
Consensus sequence assembly and analysis
[Table: Summary of CHO contigs extracted from consensus sequences. Rows (contig length): < 100 bp; 100 to 500 bp; > 500 bp.]
[Table: Summary of assembly and BLAST analysis of CHO sequences. Rows: contigs with BLAST hits; average % similarity of BLAST hits; contigs that hit known genes; total unique genes hit.]
A comparative genomics approach was used to generate sequence-specific information for a high-producing CHO cell line with the goal of making these data publicly available. The development of CHO genomic resources will benefit not only cell line engineering efforts to enhance biopharmaceutical production but also other areas of research utilizing CHO cells, such as the use of radiation hybrid mapping for comparative genomic analysis. The analysis presented here demonstrates the potential of applying Illumina sequencing in the development of CHO genomic resources. Integration of genomic sequences derived from multiple next-generation technologies, such as 454 and Illumina sequencing, with those derived from Sanger sequencing enhances genomic coverage. The inclusion of long paired-end or mate-paired libraries with varied insert sizes, coupled with the high throughput of next-generation sequencing technologies, should provide not only the sequence but also the structural information required for de novo assembly of the CHO genome.
Neither the short reads nor the reference genomes were repeat-masked prior to alignment. A prevalent feature of mammalian genomes is the high content of repetitive sequences. Approximately 46% of the human genome, 37% of the mouse genome, and 40% of the rat genome are repetitive sequences [27, 28]. Repeat-masking either the short reads or reference genome would discard information about a significant fraction of the genome and would reduce coverage in an uneven manner. Recent work suggests that endogenous repetitive structures on CHO chromosomes may promote gene amplification and increase the stability of the amplified gene. Including repetitive regions in the assembly and analysis may help identify genomic structures associated with hyperproductive CHO cell lines.
Several studies employed a similar approach to successfully generate genomic resources for non-model organisms from low-coverage data [25, 34, 35]. Reference-guided alignment and analysis nevertheless have inherent limitations regarding sequence similarity and genomic structure. MAQ allows up to 2 mismatches within the first 28 bp of each read and does not allow for gaps in the alignment. Short reads derived from regions with less than 94% identity to the reference sequence may not be aligned. This may account for the low percentage of total CHO reads aligned to either the mouse or rat genome and suggests that the CHO contigs presented here represent highly conserved regions between CHO cells and mouse or rat. In an initial genomic sequencing of the turkey using Illumina technology, only one-third of the short 35 bp reads could be directly aligned to the chicken genome, a closely related species, suggesting that a large portion of the short reads may not be expected to align in this type of analysis.
Additionally, during alignment, the sequenced genome is scaffolded onto the reference, so the structure of the final consensus sequence may not be representative of the true genomic architecture. New methodologies are being developed to improve the consensus genomic sequences produced by reference-guided alignment [37, 38]. CHO cell lines commonly used in biopharmaceutical production have a reduced chromosome number compared to primary Chinese hamster cells. These cell lines also undergo genomic rearrangements as a result of amplification procedures used to develop high-producing cell lines [13, 14]. Therefore, the genomic structure of the Chinese hamster may not be representative of the individual cell lines and analysis of specific CHO cell lines may provide a better understanding of the structural changes associated with hyperproductivity.
Of particular interest in CHO cell lines is the relationship between the location of the amplified gene and the productivity of the cell line. BAC libraries were recently used to examine the site and structure of the transgene vector in gene-amplified cell lines [3, 4]. The DHFR amplicon is large, up to several hundreds of thousands of nucleotides, and may contain repeated segments of the endogenous CHO genome [3, 4, 39]. The short length of the CHO contigs makes it unlikely that any contig will span both the DHFR amplicon and the host genome. Additionally, the transgene vector sequence is not present in the reference genome used during alignment. Together, these factors make it difficult to determine the integration site of the DHFR transgene vector in this analysis. Greater coverage of the CHO genome, sufficient to permit de novo assembly of the reads, will facilitate determining the integration site and copy number of the DHFR amplicon in this cell line. Increased coverage and refinement of the CHO genome will also enable detection of other copy number variants, such as insertions and deletions, and accurate SNP identification to assist cell line engineering efforts [40–42].
The complexity of the CHO genome, including the structural rearrangements that occur during gene amplification and cell line derivation, makes assembly of a genomic sequence challenging. Next-generation sequencing technologies allow for the rapid acquisition of genomic sequence from CHO cell lines. This sequencing information can be used to generate a draft genome sequence when coupled with physical maps that can be derived from BAC libraries and a CHO scaffold that can be derived from cross-species comparative analysis. Incorporation of additional sequence data from transcriptomic studies and EST libraries will be necessary for complete annotation. The development of these resources is required to fully utilize sequence-specific tools, such as DNA microarrays and RNA interference, in cell line development and to understand how gene regulation and genome structure are altered in high-producing cell lines.
Genomic library construction and Illumina sequencing
CHO cells engineered to express human secreted alkaline phosphatase (SEAP) were generated from CHO-DUK cells (ATCC 9096) as described previously. CHO-SEAP cells were maintained as adherent cultures in IMDM (Iscove's modified Dulbecco's medium, Invitrogen, Carlsbad, CA) supplemented with 10% dFBS (dialyzed fetal bovine serum, Invitrogen) and 5120 nM methotrexate (Calbiochem, San Diego, CA). Genomic DNA from CHO-SEAP cells was isolated using the Genomic DNA mini kit (Invitrogen). A single-end library was prepared using the DNA sample kit (Illumina, San Diego, CA) according to manufacturer's instructions. The genomic library was sequenced on an Illumina GA system at the Cornell University Life Sciences Core Laboratory Center (Ithaca, NY) by running 36 cycles according to manufacturer's instructions. Approximately 2.72 Gb from 75,583,814 high quality reads passed the Illumina GA Pipeline filter and were used for alignment. FASTQ files containing raw sequences and sequence qualities were deposited at the National Center for Biotechnology Information Sequence Read Archive (NCBI SRA) under the accession SRA012218. While analysis of SRA data sets is computationally challenging, rapid improvements in assembly algorithms and computational power are enabling more researchers to benefit from this type of data set.
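As a consistency check on the yield reported above, the total number of bases follows from the read count and the number of cycles. The short Python sketch below assumes that every read spans all 36 cycles; the function name is illustrative and is not part of the authors' pipeline.

```python
# Consistency check: total bases = number of reads x read length.
READS_PASSING_FILTER = 75_583_814  # reads reported in the text
CYCLES = 36                        # sequencing cycles; assumed equal to the read length in bp


def total_bases(n_reads: int, read_length: int) -> int:
    """Return the total number of sequenced bases."""
    return n_reads * read_length


bases = total_bases(READS_PASSING_FILTER, CYCLES)
print(f"{bases:,} bases = {bases / 1e9:.2f} Gb")
```

Under this assumption the product is 2,721,017,304 bases, which agrees with the approximately 2.72 Gb stated in the text.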
Sequence alignment and assembly
Reference genomes for mouse chromosomes 1-19 and X (M_musculus Build 37) and rat chromosomes 1-20 and X (R_norvegicus Build 3.4) were obtained from the NCBI genomic download site. Reference-guided alignment to both mouse and rat reference genomes and consensus sequence assembly were performed with MAQ 0.7.1 using default settings. MapView was used for visual inspection of alignments. Aligned reads were analyzed to determine if they mapped to unique or repetitive genomic regions, based on mapping qualities, or within protein-coding genes, based on genomic coordinates, using MATLAB (The MathWorks, Inc., Natick, MA). To verify sequencing reliability, the short read data set was aligned to the Chinese hamster mitochondrial genome (NC_007936.1), which showed significant homology.
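The read classification described above can be sketched in Python (the authors used MATLAB, and their code is not reproduced here). The sketch assumes that a mapping quality of 0 marks a read placed in a repetitive region, following the MAQ convention of assigning quality 0 to reads that align equally well to several positions; the threshold actually used in this study is not stated. It also assumes that a read is assigned to a gene when its start coordinate falls within the gene's coordinates. All names and example records are hypothetical.

```python
from bisect import bisect_right
from typing import NamedTuple


class Read(NamedTuple):
    chrom: str
    pos: int   # 1-based start coordinate on the reference
    mapq: int  # mapping quality reported by the aligner


class Gene(NamedTuple):
    chrom: str
    start: int
    end: int
    name: str


def region_class(read, repeat_mapq=0):
    """Label a read 'repeat' if its mapping quality is at or below repeat_mapq (assumed cutoff)."""
    return "repeat" if read.mapq <= repeat_mapq else "unique"


def build_index(genes):
    """Group genes by chromosome, sorted by start, with a parallel list of start positions."""
    by_chrom = {}
    for g in sorted(genes, key=lambda g: g.start):
        by_chrom.setdefault(g.chrom, []).append(g)
    return {c: ([g.start for g in gs], gs) for c, gs in by_chrom.items()}


def genes_containing(read, index):
    """Return names of genes whose [start, end] interval contains the read start."""
    starts, genes = index.get(read.chrom, ([], []))
    hi = bisect_right(starts, read.pos)  # genes[:hi] start at or before the read
    return [g.name for g in genes[:hi] if g.end >= read.pos]


# Hypothetical example records, for illustration only.
genes = [Gene("chr1", 100, 500, "GeneA"), Gene("chr1", 400, 1000, "GeneB")]
reads = [Read("chr1", 150, 37), Read("chr1", 900, 0), Read("chr2", 50, 20)]
index = build_index(genes)
for r in reads:
    print(r, region_class(r), genes_containing(r, index))
```

The candidate scan after the binary search is linear in the number of genes starting before the read, which is adequate for a sketch; an interval tree would be a more scalable choice for whole-genome gene sets.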
Gene coverage of protein-coding gene sets
Known protein-coding gene sets for both mouse and rat were established as follows: genomic coordinates for mouse and rat genes were retrieved from Mouse Genome Informatics (MGI) and the Rat Genome Database (RGD) and filtered to retain only known protein-coding genes. The mouse protein-coding gene set contains 21,691 genes from chromosomes 1-19 and X and the rat protein-coding gene set contains 26,450 genes from chromosomes 1-20 and X. Gene size is defined from the genomic coordinates from MGI and RGD and includes exons, introns, and untranslated regions.
Normalized read counts were calculated for each gene in the protein-coding gene sets to which short reads were mapped. A normalization factor was calculated by dividing the size of each gene by the average gene size in the protein-coding gene sets, with an average gene size of 44,862 bp for mouse and 34,186 bp for rat. Normalized read counts were determined by dividing the raw number of reads aligned to each gene by the normalization factor calculated for that gene.
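The normalization described above can be written out as a brief Python sketch. The average gene sizes are those given in the text; the function names and the example gene are hypothetical and serve only to illustrate the arithmetic.

```python
# Average gene sizes (bp) of the protein-coding gene sets, as stated in the text.
MEAN_GENE_SIZE_BP = {"mouse": 44_862, "rat": 34_186}


def normalization_factor(gene_size_bp: float, mean_gene_size_bp: float) -> float:
    """Gene size divided by the average gene size of the gene set."""
    return gene_size_bp / mean_gene_size_bp


def normalized_read_count(raw_reads: float, gene_size_bp: float, mean_gene_size_bp: float) -> float:
    """Raw read count divided by the gene-specific normalization factor."""
    return raw_reads / normalization_factor(gene_size_bp, mean_gene_size_bp)


# Hypothetical mouse gene twice the average size with 100 aligned reads.
example = normalized_read_count(100, 89_724, MEAN_GENE_SIZE_BP["mouse"])
print(example)
```

A gene twice the average size has a normalization factor of 2, so its raw count is halved; a gene of exactly average size keeps its raw count. This removes the tendency of longer genes to collect more reads simply because they present a larger target.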
Functional analysis of genomic assembly
Gene names and Gene Ontology (GO) terms were assigned to all contigs that shared sequence similarity with known protein-coding mouse and rat genes. Contigs were extracted from consensus sequences using Python and custom scripts. Contigs were aligned to reference genomes using BLAT and viewed using the UCSC Genome Browser. Sequence comparisons were done using standalone BLAST from NCBI. Custom genomic databases were generated from mouse and rat reference chromosomes. Contigs were mapped to these genomic databases using BLASTN with a significance threshold of E < 1e-10. BLAST outputs were parsed using Perl scripts to retrieve the best hit for each contig. Gene names and GO terms were retrieved for each contig that hit a known protein-coding gene. GO IDs for mouse (NCBIM37) and rat (RGSC3.4) genes were retrieved from ENSEMBL (release 56) using BioMart. GO analysis was performed using the CateGOrizer web tool.
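Two of the steps above, contig extraction and best-hit retrieval, can be illustrated with a minimal Python sketch. It is not the authors' code (their best-hit parsing used Perl) and rests on several assumptions: uncovered positions in the consensus are written as N, a contig is a maximal run of called bases, BLAST results are in the standard 12-column tabular format, the best hit is the one with the highest bit score, and the E-value threshold is read as 1e-10. All function names and example records are hypothetical.

```python
import csv
import re


def extract_contigs(consensus, min_length=1):
    """Yield (0-based start, sequence) for each run of non-N bases in a consensus string."""
    for m in re.finditer(r"[^Nn]+", consensus):
        if m.end() - m.start() >= min_length:
            yield m.start(), m.group()


def length_bin(n_bases):
    """Assign a contig length to the size classes used in the contig summary table."""
    if n_bases < 100:
        return "< 100 bp"
    if n_bases <= 500:
        return "100 to 500 bp"
    return "> 500 bp"


def best_hits(tabular_lines, max_evalue=1e-10):
    """Return {query: (subject, evalue, bitscore)} keeping the top-scoring hit below max_evalue.

    Expects 12 tab-separated columns: qseqid, sseqid, pident, length, mismatch,
    gapopen, qstart, qend, sstart, send, evalue, bitscore.
    """
    best = {}
    for row in csv.reader(tabular_lines, delimiter="\t"):
        if not row or row[0].startswith("#"):
            continue  # skip blank lines and comment lines
        query, subject = row[0], row[1]
        evalue, bitscore = float(row[10]), float(row[11])
        if evalue >= max_evalue:
            continue  # hit does not pass the significance threshold
        if query not in best or bitscore > best[query][2]:
            best[query] = (subject, evalue, bitscore)
    return best


# Hypothetical inputs, for illustration only.
contigs = list(extract_contigs("NNACGTNNNNGGNN"))
hits = best_hits([
    "contig1\tchr1\t92.5\t120\t9\t0\t1\t120\t1000\t1119\t1e-40\t150",
    "contig1\tchr5\t88.0\t110\t13\t0\t1\t110\t500\t609\t1e-20\t95",
    "contig2\tchr2\t85.0\t40\t6\t0\t1\t40\t70\t109\t1e-05\t40",
])
print(contigs)
print(hits)
```

In this example the second contig is dropped because its only hit does not pass the E-value threshold, and the first contig keeps the chr1 hit because it has the higher bit score.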
This work was supported by Cornell University and the University of Delaware. The authors thank Peter Schweitzer at the Cornell Biotechnology Resource Center for Illumina sequencing and Manoj Pillay for assistance in processing consensus sequences.
- Wurm FM: Production of recombinant protein therapeutics in cultivated mammalian cells. Nat Biotechnol. 2004, 22: 1393-1398. 10.1038/nbt1026.
- Kuystermans D, Krampe B, Swiderek H, Al-Rubeai M: Using cell engineering and omic tools for the improvement of cell culture processes. Cytotechnology. 2007, 53: 3-22. 10.1007/s10616-007-9055-6.
- Park JY, Takagi Y, Yamatani M, Honda K, Asakawa S, Shimizu N, Omasa T, Ohtake H: Identification and analysis of specific chromosomal region adjacent to exogenous Dhfr-amplified region in Chinese hamster ovary cell genome. J Biosci Bioeng. 2010, 109: 504-511. 10.1016/j.jbiosc.2009.10.019.
- Omasa T, Cao Y, Park JY, Takagi Y, Kimura S, Yano H, Honda K, Asakawa S, Shimizu N, Ohtake H: Bacterial artificial chromosome library for genome-wide analysis of Chinese hamster ovary cells. Biotechnol Bioeng. 2009, 104: 986-994. 10.1002/bit.22463.
- Pascoe DE, Arnott D, Papoutsakis ET, Miller WM, Andersen DC: Proteome analysis of antibody-producing CHO cell lines with different metabolic profiles. Biotechnol Bioeng. 2007, 98: 391-410. 10.1002/bit.21460.
- Nissom PM, Sanny A, Kok YJ, Hiang YT, Chuah SH, Shing TK, Lee YY, Wong KT, Hu WS, Sim MY, Philp R: Transcriptome and proteome profiling to understanding the biology of high productivity CHO cells. Mol Biotechnol. 2006, 34: 125-140. 10.1385/MB:34:2:125.
- Yee JC, Gerdtzen ZP, Hu WS: Comparative transcriptome analysis to unveil genes affecting recombinant protein productivity in mammalian cells. Biotechnol Bioeng. 2009, 102: 246-263. 10.1002/bit.22039.
- Ernst W, Trummer E, Mead J, Bessant C, Strelec H, Katinger H, Hesse F: Evaluation of a genomics platform for cross-species transcriptome analysis of recombinant CHO cells. Biotechnol J. 2006, 1: 639-650. 10.1002/biot.200600010.
- Yee JC, Wlaschin KF, Chuah SH, Nissom PM, Hu WS: Quality assessment of cross-species hybridization of CHO transcriptome on a mouse DNA oligo microarray. Biotechnol Bioeng. 2008, 101: 1359-1365. 10.1002/bit.21984.
- Bahr SM, Borgschulte T, Kayser KJ, Lin N: Using microarray technology to select housekeeping genes in Chinese hamster ovary cells. Biotechnol Bioeng. 2009, 104: 1041-1046. 10.1002/bit.22452.
- Kantardjieff A, Nissom PM, Chuah SH, Yusufi F, Jacob NM, Mulukutla BC, Yap M, Hu WS: Developing genomic platforms for Chinese hamster ovary cells. Biotechnol Adv. 2009, 27: 1028-1035. 10.1016/j.biotechadv.2009.05.023.
- Wlaschin KF, Hu WS: A scaffold for the Chinese hamster genome. Biotechnol Bioeng. 2007, 98: 429-439. 10.1002/bit.21430.
- Derouazi M, Martinet D, Besuchet Schmutz N, Flaction R, Wicht M, Bertschinger M, Hacker DL, Beckmann JS, Wurm FM: Genetic characterization of CHO production host DG44 and derivative recombinant cell lines. Biochem Biophys Res Commun. 2006, 340: 1069-1077. 10.1016/j.bbrc.2005.12.111.
- Ruiz JC, Wahl GM: Chromosomal destabilization during gene amplification. Mol Cell Biol. 1990, 10: 3056-3066.
- Wlaschin KF, Nissom PM, Gatti Mde L, Ong PF, Arleen S, Tan KS, Rink A, Cham B, Wong K, Yap M, Hu WS: EST sequencing for gene discovery in Chinese hamster ovary cells. Biotechnol Bioeng. 2005, 91: 592-606. 10.1002/bit.20511.
- Birzele F, Schaub J, Rust W, Clemens C, Baum P, Kaufmann H, Weith A, Schulz TW, Hildebrandt T: Into the unknown: expression profiling without genome sequence information in CHO by next generation sequencing. Nucleic Acids Res. 2010, 38: 3999-4010. 10.1093/nar/gkq116.
- Jacob NM, Kantardjieff A, Yusufi FN, Retzel EF, Mulukutla BC, Chuah SH, Yap M, Hu WS: Reaching the depth of the Chinese hamster ovary cell transcriptome. Biotechnol Bioeng. 2010, 105: 1002-1009.
- Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen YJ, Chen Z, et al: Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005, 437: 376-380.
- Shendure J, Ji H: Next-generation DNA sequencing. Nat Biotechnol. 2008, 26: 1135-1145. 10.1038/nbt1486.
- Metzker ML: Sequencing technologies - the next generation. Nat Rev Genet. 2010, 11: 31-46. 10.1038/nrg2626.
- Hayduk EJ, Lee KH: Cytochalasin D can improve heterologous protein productivity in adherent Chinese hamster ovary cells. Biotechnol Bioeng. 2005, 90: 354-364. 10.1002/bit.20438.
- Li H, Ruan J, Durbin R: Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res. 2008, 18: 1851-1858. 10.1101/gr.078212.108.
- Bouck J, Miller W, Gorrell JH, Muzny D, Gibbs RA: Analysis of the quality and utility of random shotgun sequencing at low redundancies. Genome Res. 1998, 8: 1074-1084.
- Kirkness EF, Bafna V, Halpern AL, Levy S, Remington K, Rusch DB, Delcher AL, Pop M, Wang W, Fraser CM, Venter JC: The dog genome: survey sequencing and comparative analysis. Science. 2003, 301: 1898-1903. 10.1126/science.1086432.
- Rasmussen DA, Noor MA: What can you do with 0.1x genome coverage? A case study based on a genome survey of the scuttle fly Megaselia scalaris (Phoridae). BMC Genomics. 2009, 10: 382. 10.1186/1471-2164-10-382.
- Bai X, Zhang W, Orantes L, Jun TH, Mittapalli O, Mian MA, Michel AP: Combining next-generation sequencing strategies for rapid molecular resource development from an invasive aphid species, Aphis glycines. PLoS One. 2010, 5: e11370. 10.1371/journal.pone.0011370.
- Waterston RH, Lindblad-Toh K, Birney E, Rogers J, Abril JF, Agarwal P, Agarwala R, Ainscough R, Alexandersson M, An P, et al: Initial sequencing and comparative analysis of the mouse genome. Nature. 2002, 420: 520-562. 10.1038/nature01262.
- Gibbs RA, Weinstock GM, Metzker ML, Muzny DM, Sodergren EJ, Scherer S, Scott G, Steffen D, Worley KC, Burch PE, et al: Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature. 2004, 428: 493-521. 10.1038/nature02426.
- Camacho C, Coulouris G, Avagyan V, Ma N, Papadopoulos J, Bealer K, Madden TL: BLAST+: architecture and applications. BMC Bioinformatics. 2009, 10: 421. 10.1186/1471-2105-10-421.
- Kent WJ: BLAT--the BLAST-like alignment tool. Genome Res. 2002, 12: 656-664.
- The UCSC Genome Browser. [http://genome.ucsc.edu]
- Murphy WJ, Stanyon R, O'Brien SJ: Evolution of mammalian genome organization inferred from comparative gene mapping. Genome Biol. 2001, 2: REVIEWS0005. 10.1186/gb-2001-2-6-reviews0005.
- Quinn NL, Levenkova N, Chow W, Bouffard P, Boroevich KA, Knight JR, Jarvie TP, Lubieniecki KP, Desany BA, Koop BF, et al: Assessing the feasibility of GS FLX Pyrosequencing for sequencing the Atlantic salmon genome. BMC Genomics. 2008, 9: 404. 10.1186/1471-2164-9-404.
- Kerstens HH, Crooijmans RP, Veenendaal A, Dibbits BW, Chin AWTF, den Dunnen JT, Groenen MA: Large scale single nucleotide polymorphism discovery in unsequenced genomes using second generation high throughput sequencing technology: applied to turkey. BMC Genomics. 2009, 10: 479. 10.1186/1471-2164-10-479.
- Wernersson R, Schierup MH, Jorgensen FG, Gorodkin J, Panitz F, Staerfeldt HH, Christensen OF, Mailund T, Hornshoj H, Klein A, et al: Pigs in sequence space: a 0.66X coverage pig genome survey based on shotgun sequencing. BMC Genomics. 2005, 6: 70. 10.1186/1471-2164-6-70.
- Dutilh BE, Huynen MA, Strous M: Increasing the coverage of a metapopulation consensus genome by iterative read mapping and assembly. Bioinformatics. 2009, 25: 2878-2881. 10.1093/bioinformatics/btp377.
- Schneeberger K, Hagmann J, Ossowski S, Warthmann N, Gesing S, Kohlbacher O, Weigel D: Simultaneous alignment of short reads against multiple genomes. Genome Biol. 2009, 10: R98. 10.1186/gb-2009-10-9-r98.
- Gnerre S, Lander ES, Lindblad-Toh K, Jaffe DB: Assisted assembly: how to improve a de novo genome assembly by using related species. Genome Biol. 2009, 10: R88. 10.1186/gb-2009-10-8-r88.
- Wurm FM, Petropoulos CJ: Plasmid integration, amplification and cytogenetics in CHO cells: questions and comments. Biologicals. 1994, 22: 95-102. 10.1006/biol.1994.1015.
- Yoon S, Xuan Z, Makarov V, Ye K, Sebat J: Sensitive and accurate detection of copy number variants using read depth of coverage. Genome Res. 2009, 19: 1586-1592. 10.1101/gr.092981.109.
- Medvedev P, Stanciu M, Brudno M: Computational methods for discovering structural variation with next-generation sequencing. Nat Methods. 2009, 6: S13-20. 10.1038/nmeth.1374.
- Liao PY, Lee KH: From SNPs to functional polymorphism: The insight into biotechnology applications. Biochem Eng J. 2010, 49: 149-158. 10.1016/j.bej.2009.12.021.
- NCBI Genomes FTP site. [ftp://ftp.ncbi.nih.gov/genomes/]
- Bao H, Guo H, Wang J, Zhou R, Lu X, Shi S: MapView: visualization of short reads alignment on a desktop computer. Bioinformatics. 2009, 25: 1554-1555. 10.1093/bioinformatics/btp255.
- Mouse Genome Informatics. [http://www.informatics.jax.org]
- Rat Genome Database. [http://rgd.mcw.edu/]
- BioMart. [http://www.biomart.org]
- Zhi-Liang H, Bao J, Reecy JM: CateGOrizer: A Web-Based Program to Batch Analyze Gene Ontology Classification Categories. Online J Bioinformatics. 2008, 9: 108-112.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.