The full-ORF clone resource of the German cDNA Consortium
- Stephanie Bechtel1,
- Heiko Rosenfelder1,
- Anny Duda1,
- Christian Peter Schmidt1,
- Ute Ernst1,
- Ruth Wellenreuther1,
- Alexander Mehrle1,
- Claudia Schuster1,
- Andre Bahr2,
- Helmut Blöcker3,
- Dagmar Heubner4,
- Andreas Hoerlein5,
- Guenter Michel6,
- Holger Wedler2,
- Karl Köhrer6,
- Birgit Ottenwälder7,
- Annemarie Poustka1,
- Stefan Wiemann1 and
- Ingo Schupp1
© Bechtel et al; licensee BioMed Central Ltd. 2007
Received: 26 January 2007
Accepted: 31 October 2007
Published: 31 October 2007
With the completion of the human genome sequence, the functional analysis and characterization of the encoded proteins has become the next urgent challenge of the post-genome era. The lack of comprehensive ORFeome resources has thus far hampered systematic protein gain-of-function analyses. Gene and ORF coverage with full-length ORF clones thus needs to be extended. In combination with a unique and versatile cloning system, these clones will provide the tools for genome-wide systematic functional analyses and for deeper insight into complex biological processes.
Here we describe the generation of a full-ORF clone resource of human genes applying the Gateway cloning technology (Invitrogen). A pipeline for efficient cloning and sequencing was developed and a sample tracking database was implemented to streamline the clone production process targeting more than 2,200 different ORFs. In addition, a robust cloning strategy was established that permits the simultaneous generation of two clone variants, containing a particular ORF with and without a stop codon, by adding only one working step to the cloning procedure. Up to 92 % of the targeted ORFs were successfully amplified by PCR and more than 93 % of the amplicons were successfully cloned.
The German cDNA Consortium ORFeome resource currently consists of more than 3,800 sequence-verified entry clones representing ORFs, cloned with and without stop codon, for about 1,700 different gene loci. In addition, 177 splice variants representing 121 of these genes were cloned. The entry clones have been used to generate over 5,000 different expression constructs, providing the basis for functional profiling applications. As a member of the recently formed international ORFeome collaboration, we contribute substantially to generating and providing a whole-genome human ORFeome collection in a unique cloning system that is made freely available to the community.
Recent efforts have unravelled the complete human genome sequence [1–6]. Since then, attention has shifted towards a detailed understanding of gene functions in health and disease by analysing the structure, biological activities and dynamics of the encoded proteins. To this end, RNA interference (RNAi) has received much attention as a powerful tool for systematic loss-of-function genetic studies on a large scale [7–9]. However, expression clones are indispensable for many functional genomics and proteomics applications, including studies on protein subcellular localization, protein structures [11, 12], protein functions in cell-based experiments [13, 14], protein-protein interactions [15, 16], and disease-related processes [17, 18]. The clones of cDNA collections [2, 5, 6, 19] are generally not ideal for immediate use in these experiments, as they contain 5' and 3' untranslated regions (UTRs) of varying lengths. These interfere with the expression of the encoded proteins, especially when in-frame fusions with specific tags at either end are to be expressed. The 5' UTRs may contain in-frame stop codons or lead to the inclusion of artificial amino acid sequences. Furthermore, the native stop codon that terminates any ORF impedes the expression of C-terminal protein fusions. Consequently, the generation of clone collections that contain only the protein-coding part of the genes (ORFs) has become a key component for the comprehensive and systematic analysis of protein functions in many different systems. Despite the availability of the human genome sequence, a corresponding full-ORF clone collection is far from complete. This is in part because the structures of many genes are still unclear and thus require considerable manual and individual verification. Furthermore, alternative splicing has not yet received much attention in ORF clone collections.
Here, we report on the production of a full-length ORF clone library of human genes and splice forms, using the recombination-based Gateway cloning system (Invitrogen). We have developed a cloning approach, applied to more than 2,200 different ORFs, that includes (1) optimization and improvement of gene models and of the ORF amplification and cloning processes, (2) development of a cloning strategy to simultaneously generate Gateway entry clones with and without a stop codon, (3) establishment of a pipeline for ORF sequence validation, and (4) programming and implementation of a sample tracking database. The generated entry clone resource currently comprises more than 3,800 sequence-validated Gateway clones for more than 1,850 ORFs; the coding sequences have an average size of greater than 2 kb. As a member of the recently initiated international ORFeome collaboration, we contribute significantly to generating and providing ORF clone resources for all human genes and their splice forms in a unique and flexible cloning system. The Gateway entry clones have since been used to generate over 5,000 different expression constructs that have been successfully exploited in functional profiling applications [13, 14, 23, 24]. All entry clones are available through the international ORFeome collaboration.
Results and Discussion
Gene structures and models
Efficient ORF amplification procedure
- Tagging the ORFs with Gateway sites
- Primer quality and processivity and fidelity of DNA polymerases
[Table: Comparison of primer quality of three different suppliers. Labels: total # of analysed clones; % of clones with frame shift mutations; % of clones with missense mutations; % of positive clones.]
[Table 2: Success rates of ORF amplification. Labels: amplified with alternative template; amplified with alternative 1-step primers.]
However, if the amplification was clone-based and the expected PCR product was not obtained, the template DNA was checked by sequencing. More than 10 % of all clones used either did not contain the expected insert, probably due to picking or annotation errors, or did not contain the complete ORF. If available, an alternative template was used to repeat the amplification, which succeeded for ≥ 78 % of these ORFs (Table 2). Where the amplification failed because of absent priming or mispriming, redesign of the first-step primers yielded a PCR fragment in 81 % of the cases (Table 2).
With our PCR pipeline, which combines these improvements to the amplification steps, up to 92 % of the ORFs were successfully amplified (Fig. 2), and more than 86 % irrespective of ORF size (upper limit tested: 6.5 kb) (Fig. 2; Table 2). In total, we generated amplicons for 1997 different ORFs (Table 2), which were subsequently subjected to BP cloning.
Recombinatorial cloning of target ORFs
[Table: Success rates of ORF cloning depending on the template used. Labels: # +/- stop codon; # + stop codon; # - stop codon.]
Simultaneous generation of ORF clones with and without a stop codon
For colony PCR after E. coli transformation, the nested PCR forward primer was used in combination with a reverse primer designed to anneal to the vector backbone 200 bp downstream of the ORF (Fig. 6b). PCR products were digested with BamHI, and the presence or absence of the stop codon was determined by agarose gel electrophoresis to distinguish the two species of entry clones. Clones with an open configuration displayed an additional band of 200 bp and a corresponding size shift of the ORF band, whereas clones containing a stop codon remained undigested, as shown in Fig. 6c.
In summary, this straightforward cloning protocol yielded entry clones containing specific ORFs with and without a stop codon in parallel, while introducing only one additional working step, namely the BamHI digest of colony PCR products. The success rate was > 90 % when eight individual entry clones were analyzed for every ORF. In a few cases (< 5 %) only one of the two variants was found, or no ORF was present in the clones (< 5 %). Thus, the modification of the ORF flanking region in the 3'-primer did not significantly influence the recombination efficiency of the BP reaction. This strategy lends itself well to automation and can thus be applied in high throughput. It allowed the two clone types to be distinguished before entry clone sequencing, saving the laborious and costly sequencing of randomly selected clones that would otherwise be required to identify ORF clones with and without a stop codon.
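The logic of the BamHI screen can be illustrated with a minimal sketch. It assumes a toy ORF and uses a poly-A stretch as a stand-in for the vector backbone; the function names are hypothetical and only the BamHI recognition sequence (GGATCC, cut after the first G) and the approximate 200 bp downstream distance are taken as given. ORFs that carry an internal BamHI site would need additional handling.

```python
BAMHI_SITE = "GGATCC"
CUT_OFFSET = 1  # BamHI cuts G^GATCC on the top strand


def bamhi_fragments(amplicon):
    """Return top-strand fragment lengths after a complete BamHI digest."""
    cuts = []
    pos = amplicon.find(BAMHI_SITE)
    while pos != -1:
        cuts.append(pos + CUT_OFFSET)
        pos = amplicon.find(BAMHI_SITE, pos + 1)
    bounds = [0] + cuts + [len(amplicon)]
    return [end - start for start, end in zip(bounds, bounds[1:])]


def classify_clone(amplicon):
    """Call a clone 'open' if the colony PCR product is cut by BamHI."""
    return "open" if len(bamhi_fragments(amplicon)) > 1 else "closed"


# Toy colony PCR products (assumed sequences, not real clones).
toy_orf = "ATG" + "GCT" * 100        # 303 bp ORF without a stop codon
backbone = "A" * 188                 # placeholder for vector sequence
open_amplicon = toy_orf + "TGGATCCACCCA" + backbone    # ORF followed by TGG, BamHI site present
closed_amplicon = toy_orf + "TGAATCCACCCA" + backbone  # ORF followed by TGA, no BamHI site

for name, seq in (("open", open_amplicon), ("closed", closed_amplicon)):
    print(name, classify_clone(seq), bamhi_fragments(seq))
```

In this toy case the open variant is split into an ORF-sized fragment and a small fragment of roughly 200 bp, while the closed variant stays intact, mirroring the band pattern described for the gel.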
Sequence validation of entry clones
Four entry clones per ORF that scored positive by colony PCR, two containing and two lacking a stop codon, were subjected to 5' and 3' sequencing using vector primers. The sequences were checked for a match to the target gene and for the integrity of the recombination sites, in order to exclude clones containing primer or recombination errors. If the clones matched the target sequences, the inserts were verified by complete sequencing using ORF-specific primers. Entry clones were scored positive if the assembled sequences were identical to the expected sequences or if they contained only base changes that were silent mutations or confirmed SNPs. Base changes that resulted in amino acid substitutions were evaluated as follows. If an alternative entry clone containing the correct ORF was present, this clone was used further. Where amino acid substitutions were detected at different positions in the clones analysed, further clones were subjected to the sequencing process. If all clones contained the same amino acid substitutions, cloning was repeated using an alternative template. Clones containing either nonsense mutations leading to in-frame stop codons or base changes within the recombination sites that potentially impaired the subcloning efficiency were rejected. Where the ORF was not present or only partially cloned due to internal deletions or mispriming events, or where introns were retained, the cloning was repeated. If the sequencing reaction failed, new primers were designed.
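The acceptance rules described above can be summarised as a small decision function. This is a sketch under stated assumptions, not the consortium's software: the `CloneCall` record, its field names and the returned action strings are all hypothetical, and the handling of mixed cases (for example one substituted clone next to rejected ones) is a simplification.

```python
from dataclasses import dataclass
from typing import Tuple


@dataclass
class CloneCall:
    """Hypothetical summary of the sequencing result for one entry clone."""
    name: str
    sequencing_ok: bool = True
    orf_complete: bool = True        # False: ORF absent, partial, or intron retained
    att_sites_intact: bool = True    # recombination sites free of base changes
    nonsense: bool = False           # in-frame stop codon introduced
    # Amino acid substitutions only; silent changes and confirmed SNPs are not listed.
    substitutions: Tuple[str, ...] = ()


def score_clone(clone):
    """Score a single clone following the rules in the text."""
    if not clone.sequencing_ok:
        return "redesign sequencing primers"
    if not clone.orf_complete:
        return "repeat cloning"
    if clone.nonsense or not clone.att_sites_intact:
        return "reject"
    if clone.substitutions:
        return "substitution"
    return "accept"


def next_step(clones):
    """Decide what to do for one ORF given all of its sequenced clones."""
    calls = [score_clone(c) for c in clones]
    accepted = [c.name for c, call in zip(clones, calls) if call == "accept"]
    if accepted:
        return "use " + ", ".join(accepted)
    variants = {c.substitutions for c, call in zip(clones, calls) if call == "substitution"}
    if len(variants) > 1:
        # Different substitutions in different clones: look at more clones.
        return "sequence further clones"
    if len(variants) == 1:
        # Same substitution everywhere: the template itself is suspect.
        return "repeat cloning with an alternative template"
    return "; ".join(sorted(set(calls)))
```

For example, `next_step([CloneCall("c1", substitutions=("A12T",)), CloneCall("c2")])` selects the clean clone `c2`, whereas two clones sharing the same substitution lead to a repeat with an alternative template.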
[Table: Overview of sequence-validated accepted clones. Labels: additional splice variants of the targeted genes; # initially targeted; sequence-validated clones generated for.]
Database application for sample tracking, standardization and quality control
We have described the ORF cloning pipeline of the German cDNA Consortium, in which human full-length ORFs are manually modelled and annotated, and subsequently amplified and cloned efficiently into Gateway entry vectors. We have improved and streamlined protocols to circumvent possible size bias, to simultaneously generate ORF constructs with and without stop codons, and to automate most of the processes. SOPs describing the ORF cloning processes in detail are available at . The German cDNA Consortium ORFeome resource currently consists of more than 3,800 sequence-verified entry clones for more than 1,850 ORF models, most of them cloned with and without a stop codon. These entry clones represent about 1,700 genes; 177 splice variants representing 121 of these genes were cloned. The entry clones allow for a broad range of subsequent applications to functionally characterize the ORF-encoded proteins in multiple expression systems in parallel [1, 13, 14, 23, 24]. With this resource we contribute significantly to the international ORFeome collaboration, which aims at the generation and provision of a whole-genome ORFeome collection of Gateway entry clones. The sequences are available at the EMBL/GenBank/DDBJ databases, and the clones are distributed via the ORFeome Collaboration and made available through I.M.A.G.E. clone providers.
Gene annotation and modeling of new gene structures
Using the UCSC genome browser for visualization, gene models were built based on mRNA, EST and gene prediction data. The HUSAR software package was employed with its BLAST and ORF-prediction tools, mostly for fine analysis and mapping of the gene structures, and to retrieve data from the RefSeq and EntrezGene databases. The UCSC Table Browser function was used to retrieve relevant sequences, which were then joined to construct full-length ORF models for the different gene loci. The gene features considered most relevant for full-length ORF selection were: EST and mRNA coverage, presence of CpG islands, polyA signals, canonical splice signals, conservation in comparative genome data, exclusion of repetitive elements, and not being a target of nonsense-mediated decay (NMD) [27, 28]. If functional alternative splicing was observed for a gene locus, different ORF models were built. These served as reference sequences for the design of ORF cloning primers and of sequencing primers for entry clone sequence verification. For ORF cloning we selected promising cDNAs or 5'-EST clones from our DKFZ or the MGC clone resources obtained via the RZPD (German Resource Center for Genome Research, Heidelberg). 5'-EST clones were first sequenced completely to assess whether they contained the full ORF. If no cDNA clones were available, suitable RNA sources were employed for RT-PCR to amplify full-length ORFs for subsequent cloning.
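As an illustration of what a full-length ORF model must satisfy at the sequence level, the following minimal sketch checks a candidate coding sequence for a start codon, an intact reading frame, a terminal stop codon and the absence of internal in-frame stops. It is a generic sanity check with a hypothetical function name, not a reimplementation of the HUSAR tools, and it does not cover the other selection criteria listed above (EST coverage, splice signals, NMD).

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def check_orf_model(seq):
    """Return a list of problems found in a candidate ORF model (empty if none)."""
    seq = seq.upper()
    problems = []
    if not seq.startswith("ATG"):
        problems.append("no ATG start codon")
    if len(seq) % 3 != 0:
        problems.append("length is not a multiple of three")
    # Split into complete codons, ignoring any trailing partial codon.
    codons = [seq[i:i + 3] for i in range(0, len(seq) - len(seq) % 3, 3)]
    if not codons or codons[-1] not in STOP_CODONS:
        problems.append("no terminal stop codon")
    if any(codon in STOP_CODONS for codon in codons[:-1]):
        problems.append("internal in-frame stop codon")
    return problems
```

A model such as `ATGGCTTGA` passes with an empty problem list, while `ATGTGAGCTTAA` is flagged for its internal stop codon.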
ORF amplification by PCR
The amplification of ORFs had originally been performed in a single PCR reaction as described previously, and has since been replaced by a 2-step procedure performed in 96-well format. Primers for first-step PCR were designed using the PRIDE program and purchased salt-free from three different suppliers. The standard PCR contained 1x Phusion HF buffer, 10 ng template DNA, 10 pmol of primers, 0.5 mM dNTPs and 0.5 U Phusion DNA polymerase in a total reaction volume of 25 μl. Standard first-step PCR parameters were: 98°C for 30 sec; 12 cycles of 98°C for 10 sec, 55°C for 10 sec and 63°C for 15–30 sec per 1 kb; and a final extension at 63°C for 5 min. The Gateway™ recombination sites were completed in a second PCR using a universal pair of PAGE-purified primers (Eurogentec). Forward primer: GGGGACAAGTTTGTACAAAAAAGCAGGCTCCACCATG; reverse primer: GGGGACCACTTTGTACAAGAAAGCTGGGTG (underlined sequences overlap with primers of first-step PCR). The nested PCR was performed in a 50 μl reaction volume consisting of 1–5 μl of the first PCR reaction, 10 pmol of primers, 1 mM dNTPs, 1x Phusion HF buffer and 1 U Phusion DNA polymerase. The standard cycling conditions were identical to those of the first-step PCRs. For PCR product purification, ethanol precipitation and several other methods, including QIAquick PCR Purification (Qiagen), ChargeSwitch PCR Clean-Up (Invitrogen), QIAquick Gel Extraction (Qiagen) and MinElute Gel Extraction (Qiagen), were compared for best results. Detailed protocols for the two-step ORF amplification process are available at .
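Because the extension time scales with ORF length, the first-step cycling programme can be derived per ORF. The sketch below builds such a programme from the parameters stated above; rounding the ORF length up to the next full kilobase and the function names are assumptions made for illustration.

```python
import math


def extension_seconds(orf_bp, sec_per_kb=30):
    """Extension time in seconds, rounding the ORF length up to full kb (assumption)."""
    if orf_bp <= 0:
        raise ValueError("ORF length must be positive")
    if not 15 <= sec_per_kb <= 30:
        raise ValueError("the protocol states 15-30 sec per kb")
    return math.ceil(orf_bp / 1000) * sec_per_kb


def first_step_program(orf_bp, sec_per_kb=30):
    """Return (step, temperature in deg C, seconds, cycles) tuples for the first-step PCR."""
    ext = extension_seconds(orf_bp, sec_per_kb)
    return [
        ("initial denaturation", 98, 30, 1),
        ("denaturation", 98, 10, 12),
        ("annealing", 55, 10, 12),
        ("extension", 63, ext, 12),
        ("final extension", 63, 300, 1),
    ]


for step in first_step_program(2000):
    print(step)
```

With the default of 30 sec per kb, a 2 kb ORF gets a 60 sec extension step and a 6.5 kb ORF gets 210 sec.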
BP cloning of PCR-products
PCR products were cloned by BP recombination (Invitrogen) into pDONR201 or pDONR221 in 96-well format according to the supplier's instructions, except that only half of the recommended volumes were used. Incubation was at 25°C for 2–20 h. Ca2+-competent DH10B E. coli bacteria were transformed with the BP product using a Multiprobe pipetting robot (Perkin Elmer). Transformants were spread in two Q-trays (22 × 22 cm, Genetix), each subdivided into 48 squares by plastic grids and containing LB agar supplemented with 50 μg/ml kanamycin. Eight colonies per ORF were analysed by colony PCR for the presence of an ORF of the expected size, using the Perkin Elmer Multiprobe robot to set up the reactions. Simultaneously, the colonies were inoculated into a 96-deep-well block (Greiner) and the bacteria were grown for 16 hours.
Generation of ORF clones in open and closed configuration
ORFs both with and without a stop codon were generated simultaneously by introducing the following protocol modifications: six additional base pairs (underlined in the primer sequences below) were added upstream of the ORF-specific sequence in the reverse PCR primer for the first PCR step. One of these base pairs represented a degenerate position (Y = C or T): 5'-TGGGTGGATYCA-ORF-specific sequence-3'. For the nested PCR, two reverse primers were mixed in an equimolar ratio, each containing either a "C" or a "T" at the degenerate base position of the first-step primer. For entry clone analysis by colony PCR, the second-step ORF-PCR forward primer was combined with the following reverse primer: 5'-TCTTGTGCAATGTAACATCAG-3'. Subsequently, the reaction volume was doubled and 2 units of BamHI were added directly into the wells of the 96-well colony PCR plate in order to screen for clones with and without a stop codon. After 2 h incubation at 37°C the samples were analysed on an agarose gel.
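Why a single degenerate base yields both clone types becomes clear when the 5' tag of the reverse primer is read on the sense strand. The sketch below expands the Y position and reverse-complements each variant; it assumes that the ORF-specific part of the primer ends at the last sense codon of the ORF, so that the reverse complement of the tag follows the ORF directly. Function names are hypothetical.

```python
TAG = "TGGGTGGATYCA"  # 5' tag of the first-step reverse primer (Y = C or T)
BAMHI_SITE = "GGATCC"
STOP_CODONS = {"TAA", "TAG", "TGA"}

_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Reverse complement of an unambiguous DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def expand_y(seq):
    """Resolve the degenerate base Y into its two variants, C first, then T."""
    return [seq.replace("Y", base) for base in "CT"]


def describe_variant(primer_tag):
    """Describe the sense-strand sequence that follows the ORF for one primer variant."""
    downstream = reverse_complement(primer_tag)
    codon = downstream[:3]
    return {
        "primer_tag": primer_tag,
        "downstream_sense": downstream,
        "codon_after_orf": codon,
        "has_stop": codon in STOP_CODONS,
        "has_bamhi_site": BAMHI_SITE in downstream,
    }


for tag in expand_y(TAG):
    print(describe_variant(tag))
```

Under this reading, the C variant places TGG (tryptophan) and a BamHI site directly after the ORF, giving the open configuration, while the T variant places a TGA stop codon there and destroys the BamHI site. This is consistent with the digest-based screen, in which only clones without a stop codon are cut.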
Sequence validation of entry clones
Four entry clones of every ORF, two with and two without a stop codon, that had scored positive in the colony PCR (Fig. 6c) were rearrayed using the Multiprobe pipetting robot. Plasmid preparation was done with the Nucleospin Robot-96 plasmid kit (Macherey-Nagel) on the Bio Robot 9600. Entry clones were subsequently checked by BsrGI single digest and BamHI/PvuI double digest. Clones scoring positive were subjected to automated sequencing on 3100 Genetic Analyzers (Applied Biosystems) with BigDye Terminators v3.1 (Applied Biosystems). The entry clones, including the Gateway recombination sites, were completely sequence-verified using a primer walking strategy. The primers were designed with the PRIDE program to anneal every 450 bp along the reference sequence of the ORF model. Sequences were assembled using the Staden package together with the reference ORF model sequence and checked for differences. Entry clone sequences were annotated based on the reference sequences using the BLAST tools of the HUSAR software package. Sequences are continuously submitted to the GenBank database.
The software for cloning process management ("SCISSORS") is an MS .NET application using MS SQL Server as a database backend. The software is a Laboratory Information Management System (LIMS) providing user interfaces for working step management, data acquisition and analysis. It furthermore serves as an administration tool for clone and plate storage and is also used to store and display clone annotation information.
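The primer walking layout can be sketched as follows. The 450 bp spacing is taken from the text; the read length used in the coverage check is a hypothetical parameter, reads are idealised as starting at the primer position, and the actual primer selection by PRIDE (which also considers primer quality) is not modelled.

```python
def walking_primer_starts(orf_length, spacing=450):
    """Start positions (0-based) of sequencing primers placed every `spacing` bp."""
    if orf_length <= 0 or spacing <= 0:
        raise ValueError("lengths must be positive")
    return list(range(0, orf_length, spacing))


def uncovered_positions(orf_length, starts, read_length):
    """Count ORF positions not covered by any read of the given usable length."""
    covered = [False] * orf_length
    for start in starts:
        for i in range(start, min(start + read_length, orf_length)):
            covered[i] = True
    return covered.count(False)


starts = walking_primer_starts(2000)
print(starts)
print(uncovered_positions(2000, starts, read_length=600))
```

As long as the usable read length exceeds the primer spacing, consecutive reads overlap and the ORF is covered without gaps; a shorter read length leaves a gap before each following primer.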
We thank Silke Argo for critical reading of the manuscript. This work was supported by grant 01GR0420 of the National Genome Research Network from the Bundesministerium für Bildung und Forschung (BMBF).
- Wiemann S, Bechtel S, Bannasch D, Pepperkok R, Poustka A: The German cDNA network: cDNAs, functional genomics and proteomics. J Struct Funct Genomics. 2003, 4 (2–3): 87-96. 10.1023/A:1026148428520.
- Wiemann S, Weil B, Wellenreuther R, Gassenhuber J, Glassl S, Ansorge W, Bocher M, Blocker H, Bauersachs S, Blum H, Lauber J, Dusterhoft A, Beyer A, Kohrer K, Strack N, Mewes HW, Ottenwalder B, Obermaier B, Tampe J, Heubner D, Wambutt R, Korn B, Klein M, Poustka A: Toward a catalog of human genes and proteins: sequencing and analysis of 500 novel complete protein coding human cDNAs. Genome Res. 2001, 11 (3): 422-435. 10.1101/gr.GR1547R.
- Adams MD, Dubnick M, Kerlavage AR, Moreno R, Kelley JM, Utterback TR, Nagle JW, Fields C, Venter JC: Sequence identification of 2,375 human brain genes. Nature. 1992, 355 (6361): 632-634. 10.1038/355632a0.
- Nomura N, Miyajima N, Sazuka T, Tanaka A, Kawarabayasi Y, Sato S, Nagase T, Seki N, Ishikawa K, Tabata S: Prediction of the coding sequences of unidentified human genes. I. The coding sequences of 40 new genes (KIAA0001-KIAA0040) deduced by analysis of randomly sampled cDNA clones from human immature myeloid cell line KG-1 (supplement). DNA Res. 1994, 1 (1): 47-56. 10.1093/dnares/1.1.47.
- Strausberg RL, Feingold EA, Grouse LH, Derge JG, Klausner RD, Collins FS, Wagner L, Shenmen CM, Schuler GD, Altschul SF, Zeeberg B, Buetow KH, Schaefer CF, Bhat NK, Hopkins RF, Jordan H, Moore T, Max SI, Wang J, Hsieh F, Diatchenko L, Marusina K, Farmer AA, Rubin GM, Hong L, Stapleton M, Soares MB, Bonaldo MF, Casavant TL, Scheetz TE, et al: Generation and initial analysis of more than 15,000 full-length human and mouse cDNA sequences. Proc Natl Acad Sci USA. 2002, 99 (26): 16899-16903. 10.1073/pnas.242603899.
- Ota T, Suzuki Y, Nishikawa T, Otsuki T, Sugiyama T, Irie R, Wakamatsu A, Hayashi K, Sato H, Nagai K, Kimura K, Makita H, Sekine M, Obayashi M, Nishi T, Shibahara T, Tanaka T, Ishii S, Yamamoto J, Saito K, Kawai Y, Isono Y, Nakamura Y, Nagahari K, Murakami K, Yasuda T, Iwayanagi T, Wagatsuma M, Shiratori A, Sudo H, et al: Complete sequencing and characterization of 21,243 full-length human cDNAs. Nat Genet. 2004, 36 (1): 40-45. 10.1038/ng1285.
- Moffat J, Sabatini DM: Building mammalian signalling pathways with RNAi screens. Nat Rev Mol Cell Biol. 2006, 7 (3): 177-187. 10.1038/nrm1860.
- Hannon GJ, Rossi JJ: Unlocking the potential of the human genome with RNA interference. Nature. 2004, 431 (7006): 371-378. 10.1038/nature02870.
- Brummelkamp TR, Bernards R: New tools for functional mammalian cancer genetics. Nat Rev Cancer. 2003, 3 (10): 781-789. 10.1038/nrc1191.
- Simpson JC, Wellenreuther R, Poustka A, Pepperkok R, Wiemann S: Systematic subcellular localization of novel proteins identified by large-scale cDNA sequencing. EMBO Rep. 2000, 1 (3): 287-292. 10.1093/embo-reports/kvd058.
- Folkers GE, van Buuren BN, Kaptein R: Expression screening, protein purification and NMR analysis of human protein domains for structural genomics. J Struct Funct Genomics. 2004, 5 (1–2): 119-131. 10.1023/B:JSFG.0000029200.66197.0c.
- Yokoyama S: Protein expression systems for structural genomics and proteomics. Curr Opin Chem Biol. 2003, 7 (1): 39-43. 10.1016/S1367-5931(02)00019-4.
- Arlt D, Huber W, Liebel U, Schmidt C, Majety M, Sauermann M, Rosenfelder H, Bechtel S, Mehrle A, Bannasch D, Schupp I, Seiler M, Simpson JC, Hahne F, Moosmayer P, Ruschhaupt M, Guilleaume B, Wellenreuther R, Pepperkok R, Sultmann H, Poustka A, Wiemann S: Functional profiling: from microarrays via cell-based assays to novel tumor relevant modulators of the cell cycle. Cancer Res. 2005, 65 (17): 7733-7742.
- Starkuviene V, Liebel U, Simpson JC, Erfle H, Poustka A, Wiemann S, Pepperkok R: High-content screening microscopy identifies novel proteins with a putative role in secretory membrane traffic. Genome Res. 2004, 14 (10A): 1948-1956. 10.1101/gr.2658304.
- Stelzl U, Worm U, Lalowski M, Haenig C, Brembeck FH, Goehler H, Stroedicke M, Zenkner M, Schoenherr A, Koeppen S, Timm J, Mintzlaff S, Abraham C, Bock N, Kietzmann S, Goedde A, Toksoz E, Droege A, Krobitsch S, Korn B, Birchmeier W, Lehrach H, Wanker EE: A human protein-protein interaction network: a resource for annotating the proteome. Cell. 2005, 122 (6): 957-968. 10.1016/j.cell.2005.08.029.
- Rual JF, Venkatesan K, Hao T, Hirozane-Kishikawa T, Dricot A, Li N, Berriz GF, Gibbons FD, Dreze M, Ayivi-Guedehoussou N, Klitgord N, Simon C, Boxem M, Milstein S, Rosenberg J, Goldberg DS, Zhang LV, Wong SL, Franklin G, Li S, Albala JS, Lim J, Fraughton C, Llamosas E, Cevik S, Bex C, Lamesch P, Sikorski RS, Vandenhaute J, Zoghbi HY, et al: Towards a proteome-scale map of the human protein-protein interaction network. Nature. 2005, 437 (7062): 1173-1178. 10.1038/nature04209.
- Subramanian G, Adams MD, Venter JC, Broder S: Implications of the human genome for understanding human biology and medicine. JAMA. 2001, 286 (18): 2296-2307. 10.1001/jama.286.18.2296.
- Guttmacher AE, Collins FS: Genomic medicine – a primer. N Engl J Med. 2002, 347 (19): 1512-1520. 10.1056/NEJMra012240.
- Wellenreuther R, Schupp I, Poustka A, Wiemann S: SMART amplification combined with cDNA size fractionation in order to obtain large full-length clones. BMC Genomics. 2004, 5 (1): 36. 10.1186/1471-2164-5-36.
- Temple G, Lamesch P, Milstein S, Hill DE, Wagner L, Moore T, Vidal M: From genome to proteome: developing expression clone resources for the human genome. Hum Mol Genet. 2006, 15 Spec No 1: R31-43. 10.1093/hmg/ddl048.
- Kuryshev VY, Vorobyov E, Zink D, Schmitz J, Rozhdestvensky TS, Munstermann E, Ernst U, Wellenreuther R, Moosmayer P, Bechtel S, Schupp I, Horst J, Korn B, Poustka A, Wiemann S: An anthropoid-specific segmental duplication on human chromosome 1q22. Genomics. 2006, 88 (2): 143-151. 10.1016/j.ygeno.2006.02.002.
- Hartley JL, Temple GF, Brasch MA: DNA cloning using in vitro site-specific recombination. Genome Res. 2000, 10 (11): 1788-1795. 10.1101/gr.143000.
- Korf U, Kohl T, van der Zandt H, Zahn R, Schleeger S, Ueberle B, Wandschneider S, Bechtel S, Schnolzer M, Ottleben H, Wiemann S, Poustka A: Large-scale protein expression for proteome research. Proteomics. 2005, 5 (14): 3571-3580. 10.1002/pmic.200401195.
- Wiemann S, Arlt D, Huber W, Wellenreuther R, Schleeger S, Mehrle A, Bechtel S, Sauermann M, Korf U, Pepperkok R, Sultmann H, Poustka A: From ORFeome to biology: a functional genomics pipeline. Genome Res. 2004, 14 (10B): 2136-2144. 10.1101/gr.2576704.
- International ORFeome Collaboration. [http://www.orfeomecollaboration.org]
- Jones SJ: Prediction of genomic functional elements. Annu Rev Genomics Hum Genet. 2006, 7: 315-338. 10.1146/annurev.genom.7.080505.115745.
- Brent MR: Genome annotation past, present, and future: how to define an ORF at each locus. Genome Res. 2005, 15 (12): 1777-1786. 10.1101/gr.3866105.
- Ashurst JL, Collins JE: Gene annotation: prediction and testing. Annu Rev Genomics Hum Genet. 2003, 4: 69-88. 10.1146/annurev.genom.4.070802.110300.
- Harrow J, Denoeud F, Frankish A, Reymond A, Chen CK, Chrast J, Lagarde J, Gilbert JG, Storey R, Swarbreck D, Rossier C, Ucla C, Hubbard T, Antonarakis SE, Guigo R: GENCODE: producing a reference annotation for ENCODE. Genome Biol. 2006, 7 (Suppl 1): 1-9. 10.1186/gb-2006-7-s1-s4.
- Kolb-Kokocinski A, Mehrle A, Bechtel S, Simpson JC, Kioschis P, Wiemann S, Wellenreuther R, Poustka A: The systematic functional characterisation of Xq28 genes prioritises candidate disease genes. BMC Genomics. 2006, 7: 29. 10.1186/1471-2164-7-29.
- Aguiar JC, LaBaer J, Blair PL, Shamailova VY, Koundinya M, Russell JA, Huang F, Mar W, Anthony RM, Witney A, Caruana SR, Brizuela L, Sacci JB, Hoffman SL, Carucci DJ: High-throughput generation of P. falciparum functional molecules by recombinational cloning. Genome Res. 2004, 14 (10B): 2076-2082. 10.1101/gr.2416604.
- Labaer J, Qiu Q, Anumanthan A, Mar W, Zuo D, Murthy TV, Taycher H, Halleck A, Hainsworth E, Lory S, Brizuela L: The Pseudomonas aeruginosa PA01 gene collection. Genome Res. 2004, 14 (10B): 2190-2200. 10.1101/gr.2482804.
- Standardized protocols SMP-Cell. [http://www.smp-cell.org/groups.asp?siteID=49]
- Sun Y, Hegamyer G, Colburn NH: PCR-direct sequencing of a GC-rich region by inclusion of 10% DMSO: application to mouse c-jun. BioTechniques. 1993, 15 (3): 372-374.
- Cheng S, Fockler C, Barnes WM, Higuchi R: Effective amplification of long targets from cloned inserts and human genomic DNA. Proc Natl Acad Sci USA. 1994, 91 (12): 5695-5699. 10.1073/pnas.91.12.5695.
- Lindahl T, Nyberg B: Rate of depurination of native deoxyribonucleic acid. Biochemistry. 1972, 11 (19): 3610-3618. 10.1021/bi00769a018.
- Marsischky G, LaBaer J: Many paths to many clones: a comparative look at high-throughput cloning methods. Genome Res. 2004, 14 (10B): 2020-2028. 10.1101/gr.2528804.
- Michaelson D, Philips M: The use of GFP to localize Rho GTPases in living cells. Methods Enzymol. 2006, 406: 296-315.
- Mehrle A, Rosenfelder H, Schupp I, del Val C, Arlt D, Hahne F, Bechtel S, Simpson J, Hofmann O, Hide W, Glatting KH, Huber W, Pepperkok R, Poustka A, Wiemann S: The LIFEdb database in 2006. Nucleic Acids Res. 2006, 34 (Database issue): D415-418. 10.1093/nar/gkj139.
- Kent WJ, Sugnet CW, Furey TS, Roskin KM, Pringle TH, Zahler AM, Haussler D: The human genome browser at UCSC. Genome Res. 2002, 12 (6): 996-1006. 10.1101/gr.229102. Article published online before print in May 2002.
- HUSAR Bioinformatics Lab. [http://genome.dkfz-heidelberg.de/]
- Pruitt KD, Tatusova T, Maglott DR: NCBI reference sequences (RefSeq): a curated non-redundant sequence database of genomes, transcripts and proteins. Nucleic Acids Res. 2007, 35 (Database issue): D61-65. 10.1093/nar/gkl842.
- Maglott D, Ostell J, Pruitt KD, Tatusova T: Entrez Gene: gene-centered information at NCBI. Nucleic Acids Res. 2007, 35 (Database issue): D26-31. 10.1093/nar/gkl993.
- Karolchik D, Hinrichs AS, Furey TS, Roskin KM, Sugnet CW, Haussler D, Kent WJ: The UCSC Table Browser data retrieval tool. Nucleic Acids Res. 2004, 32 (Database issue): D493-496. 10.1093/nar/gkh103.
- Haas S, Vingron M, Poustka A, Wiemann S: Primer design for large scale sequencing. Nucleic Acids Res. 1998, 26 (12): 3006-3012. 10.1093/nar/26.12.3006.
- Staden R: The Staden sequence analysis package. Mol Biotechnol. 1996, 5 (3): 233-241.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.