- Methodology article
- Open Access
Design and application of a target capture sequencing of exons and conserved non-coding sequences for the rat
© The Author(s). 2016
- Received: 8 April 2016
- Accepted: 28 July 2016
- Published: 9 August 2016
Target capture sequencing is an efficient approach to directly identify the causative mutations of genetic disorders. To apply this strategy to laboratory rats exhibiting various phenotypes, we developed a novel target capture probe set, TargetEC (target capture for exons and conserved non-coding sequences), which can identify mutations not only in exonic regions but also in conserved non-coding sequences and thus can detect regulatory mutations.
TargetEC covers 1,078,129 regions spanning 146.8 Mb of the genome. We applied TargetEC to four inbred rat strains (WTC/Kyo, WTC-swh/Kyo, PVG/Seac, and KFRS4/Kyo) maintained by the National BioResource Project for the Rat in Japan, and successfully identified mutations associated with the phenotypes of the mutant strains, including one mutation located in a conserved non-coding sequence.
The method developed in this study can be used to efficiently identify regulatory mutations, which cannot be detected using conventional exome sequencing, and will help to deepen our understanding of the relationships between regulatory mutations and associated phenotypes.
- Target capture sequencing
- Conserved non-coding sequence
Whole exome sequencing (WES) is used to sequence only the exonic portion of the genome, which comprises 1–2 % of the entire genome. Hence, WES reduces the cost associated with the identification of the causative mutations of a certain disease while maintaining the efficiency of mutation detection in protein-coding regions that might substantially affect the phenotype. Since the use of this method to select an entire set of human exons for sequencing using probe hybridization  and its first application for the identification of a gene involved in a genetic disease were reported , WES has been successfully applied to the identification of responsible genes and their causative mutations in many diseases (for review, see ).
Although WES has mainly been applied in human genetics, it has also been applied to the mouse [4, 5], which is among the most widely used animal models in basic molecular biology and biomedical science. The rat has also been used extensively as a model for various human diseases, and its genome has been sequenced; however, WES analyses of the rat have not yet been reported because no exome capture kit is commercially available. For such species, a customized capture kit can be designed for the desired target sequences. Exome sequencing with custom designs has been described for several species that lack commercial capture kits, including dogs [7, 8], pigs, cows/bison, chipmunks, and non-human primates.
In the conventional WES target capture method, all regions other than exons and their neighboring splice sites are excluded from the targets. Consequently, mutations in regulatory regions, most of which reside outside exons, generally cannot be detected, even though mutations in these regions have been found to be responsible for certain diseases such as preaxial polydactyly and the Pierre Robin sequence (for review, see [15–18]). Such mutations are often identified through genome-wide association studies (GWASs). For example, among the 2304 published studies (16,831 associations) registered in the GWAS Catalog (October 30, 2015), 535 studies (2276 associations) reported significant markers in intergenic regions distant from any annotated genes. These may be candidates for disorder- or risk-associated loci that harbor regulatory mutations, which cannot be detected using conventional WES because this technique targets only the exons of protein-coding genes. Because these regulatory regions are functionally important and have been subjected to selective pressure, they tend to be conserved during the course of evolution. Therefore, to ensure that the target capture procedure covers not only the exons of protein-coding genes but also regions that may contain regulatory mutations, conserved non-coding sequences (CNSs) should be included in the target. Advances in vertebrate genome sequencing and subsequent comparative genomic analyses have made it possible to identify such regions under strong selective constraints. Information about these regions can easily be retrieved from the multiple genome sequence alignments provided by the UCSC Genome Browser.
One key step in a WES data analysis is the filtering of known single nucleotide variants (SNVs) that are commonly observed in different individuals and hence thought not to be associated with a certain phenotype of interest. At this stage, filtering efficiency depends on the number of known SNVs commonly found in wild-type samples. The most popular data repository for these variants is dbSNP . The number of rat SNVs in dbSNP (Build 146; November 24, 2015) is approximately 5.1 million, whereas 150.5 million and 99.5 million variants are registered for the human and mouse, respectively. This discrepancy has been considered a disadvantage with regard to rat exome analysis in comparison to human and mouse analyses. Fortunately, rapid progress in genome sequencing and the identification of sequence variations in several rat strains have recently been made [23–26]. These resources concerning sequence variations in rat will allow an efficient focus on variants that might be responsible for a certain phenotype.
Here, we describe the development of a novel rat target capture probe set, TargetEC (target capture for exons and CNSs). This probe set covers not only the annotated exons for protein-coding genes but also CNSs, and thus allows the detection of variations in genomic regions important for regulatory functions. To evaluate its performance, we applied the TargetEC to two normal rat strains, WTC/Kyo and PVG/Seac. In addition, as a proof of concept, we also applied this probe set to two mutant rat strains to demonstrate its ability to identify causative mutations for the phenotypes: WTC-swh/Kyo, which exhibits a sparse and wavy hair (swh) phenotype , and KFRS4/Kyo, which exhibits multiple phenotypes such as congenital cataract formation  and ear malformation . This study describes a powerful method by which to identify the causative mutations of various phenotypes in rat strains.
Design of target capture regions for sequencing
RefSeq  and Ensembl  were used for gene annotation; these databases contain 18,573 transcripts for 17,168 genes with coverage of 39.0 Mb of the genome, and 30,404 transcripts for 26,689 genes with coverage of 48.2 Mb of the genome, respectively. By merging these two gene annotations, the genomic regions covered by these annotations increased slightly to 50.8 Mb.
Target capture sequencing
Table: Summary statistics for sequencing and mapping. Columns: mapped reads (%); after removing duplicated reads (%); reads on target (%); average target depth; % target with coverage depth of ≥10×.
According to the molecular phylogenetic trees of inbred laboratory rat strains constructed using microsatellite markers  and SNPs , WTC/Kyo, PVG/Seac, and the reference strain BN/SsNHsdMCW are widely separated on the tree, indicating a divergence in genomic sequences among these strains. Indeed, the polymorphic rate, measured by microsatellite markers, between WTC and BN/SsNHsdMCW was close to the highest value . In spite of this divergence, the sequence reads of all strains analyzed in this study mapped well on the reference strain, BN/SsNHsdMCW. This indicates that although the probe sequences were designed based on the BN/SsNHsdMCW reference strain, TargetEC can be applied to a wide range of commonly-used inbred laboratory rat strains for the systematic identification of mutations in regions with functional importance (i.e., exons and CNSs).
Summary statistics for SNV and INDEL
We also measured the proportion of SNVs in each target region class: CDS, UTR, CNS, and others (Fig. 3b and Additional file 1: Figure S3). Although we designed our capture kit to cover only exonic regions and CNSs (phastCons score = 1.0), it also covered some flanking regions because short regions were expanded to a minimum length of 100 bp and hybridization was optimized using the manufacturer’s internal protocols. Accordingly, approximately 60 % of the residues in the target could not be classified as CDS, UTR, or CNS. Fig. 3b presents the results for the WTC/Kyo strain; the results for the other three strains were almost identical (Additional file 1: Figure S3). Our findings clearly indicate that mutations are highly suppressed in CNSs and CDSs, that fewer constraints are placed on UTRs, and that the remaining regions are relatively susceptible to mutations.
To further analyze the relationship between the mutation frequencies and selective pressure acting on each genomic region, we plotted SNV density as a function of the phastCons score (Fig. 3c and Additional file 1: Figure S4). As expected, we observed a clear negative correlation between the SNV density and phastCons score. Although phastCons scores are calculated for genome alignments of vertebrate species, which include not only mammals but also chicken and zebrafish, the existence of a negative correlation between the SNV density and phastCons score indicates a similar trend of selective pressure among different strains of rats.
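The relationship described above can be illustrated with a minimal sketch that bins target bases by phastCons score and computes the SNV density (SNVs per base) in each bin. The function name, the input layout (one score and one SNV flag per base), and the equal-width binning are assumptions made for illustration; the text does not state how the scores were binned.

```python
from typing import Iterable, List, Optional, Tuple


def snv_density_by_score(
    sites: Iterable[Tuple[float, bool]], n_bins: int = 10
) -> List[Optional[float]]:
    """Return SNVs per base for each phastCons score bin (None for empty bins).

    `sites` holds one (phastCons score, has_snv) pair per target base.
    Equal-width bins are an assumption of this sketch.
    """
    bases = [0] * n_bins
    snvs = [0] * n_bins
    for score, has_snv in sites:
        # Scores lie in [0, 1]; a score of exactly 1.0 goes into the last bin.
        idx = min(int(score * n_bins), n_bins - 1)
        bases[idx] += 1
        snvs[idx] += int(has_snv)
    return [s / b if b else None for s, b in zip(snvs, bases)]


# Toy data (hypothetical): weakly conserved bases carry more SNVs.
toy_sites = [
    (0.05, True), (0.05, False), (0.05, True), (0.05, False),
    (0.95, False), (1.0, False), (1.0, True), (0.95, False),
]
densities = snv_density_by_score(toy_sites, n_bins=2)
```

A density that decreases from the low-score bins to the high-score bins corresponds to the negative correlation reported in the text.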
Validation of TargetEC for the identification of known causative mutations
To validate whether our approach could detect the causative mutations of certain phenotypes, we analyzed mutations identified by TargetEC in the WTC-swh/Kyo and KFRS4/Kyo strains. For these strains, we used WTC/Kyo and PVG/Seac, respectively, as control strains.
The KFRS4/Kyo strain also exhibits other phenotypes . One such phenotype is congenital ear malformation, also known as the “dumbo” phenotype . Although the responsible locus was mapped to a 6-Mb interval on chromosome 14, which contains the Hmx1 gene, no coding changes were identified in this region ; eventually, deletion of a conserved regulatory element for Hmx1, located approximately 80-kb downstream of the gene, was identified . Here, we applied the MACS2 program, which is usually used for peak calling in ChIP-seq experiments, to the mapped reads of KFRS4/Kyo and the control strain PVG/Seac in order to systematically identify large deletions as differences in the accumulated reads. We identified 6554 peak regions specific for PVG/Seac, which were hence thought to be KFRS4/Kyo-specific deletions. Of these peaks, 15 were located within the 6.0-Mb interval on chromosome 14; the most significant, according to the ratio of read numbers, was the deletion located approximately 80-kb downstream of Hmx1 (Fig. 5b), which exactly corresponds to the deletion identified in the conserved regulatory element for Hmx1 .
In this paper, we designed a novel target capture probe set, TargetEC, for rat genome analysis. This probe set not only covers exons for all annotated genes but also covers regions that are highly conserved across vertebrate species. Although the span of covered regions (146.8 Mb) is approximately three times larger than that covered by conventional human (58.0 Mb; Agilent Human All Exon V6) and mouse (49.6 Mb; Agilent Mouse All Exon) exome probes, it can potentially identify regulatory mutations that cannot be detected using conventional exome strategies. Another advantage associated with the inclusion of conserved regions is that most genic regions will be covered even without explicitly using gene annotation information. This can be a great advantage, especially for organisms with poorly annotated genes.
Regarding variants identified in coding regions, predictions of functional consequences are rather straightforward because several types of mutations, including synonymous, non-synonymous, in-frame stop, and frameshift mutations, exist, and each has different effects on the protein products. Indeed, several existing programs can classify these types of mutations and predict the severity of the mutant phenotypes [37, 42, 43]. On the other hand, predicting the effects of variations in non-coding regions is not an easy task. This might explain why non-coding regions have not been previously analyzed using conventional target capture methods (i.e., exome analysis), although a certain fraction of disease-associated mutations are known to be located in non-coding regions (for review, see [15–18]). In this study, we included both CNSs and exons as capture targets. These CNSs are plausible targets for regulatory mutations because of their high conservation among vertebrate species, which implies the action of strong negative selection on these sites. Validation of the ability of our target capture kit to detect mutations in the regulatory element of Hmx1 suggests that this kit can be a powerful tool for the detection of regulatory mutations, which are not currently well annotated in genome databases. To evaluate the functional relevance of CNS mutations in the rat, we can incorporate information about epigenomic states [44, 45] such as H3K4me1, the histone modification associated with enhancers in the human and mouse, by transferring this information to corresponding regions in the rat genome. Given the number and the total length of CNSs, a large number of regulatory mutations might be responsible for certain phenotypes, because mutations in non-coding regions are expected to alter only the spatio-temporal expression of target genes while maintaining the function of the protein products. In contrast, critical mutations in coding regions can completely change the functions of protein products, which might lead to severe or often lethal phenotypes.
In conventional exome analysis, information about SNPs helps to narrow down the number of candidate mutations. In humans, the number of SNPs registered in dbSNP has increased rapidly and continues to increase as a result of progress in genome sequencing projects for individuals, such as the 1000 Genomes Project. In contrast, far fewer SNPs have been identified in the rat than in the human and mouse. However, continuous and steady progress has recently been made in the field of rat genome resequencing [23–26]. The genome resequencing of closely related reference strains has yielded massive and valuable information about genetic variations in the rat. Exome sequencing of many sample strains represents another possible approach to overcoming this shortfall of information about genetic variations in the rat. In fact, this strategy was adopted in the pilot phase of the 1000 Genomes Project, where more than half of the samples were subjected to exome sequencing to reveal rare variations in a cost-effective manner. Following this strategy, the application of TargetEC to a large stock of mutant strains, such as that stored by NBRP-Rat in Japan, would be useful for the collection of common and rare variations in functionally important regions and would therefore facilitate the identification of mutations responsible for certain phenotypes in mutant strains. This type of analysis is our future aim. The data obtained during this analysis would be deposited in public repositories, and would thus be of potential value to the rat genetics and genomics research communities.
One drawback of TargetEC is that the total length of the covered regions is approximately three-fold greater than that of a conventional exome kit (e.g., a widely used human exome capture kit). Although approximately three-fold more reads are needed to obtain the same depth of coverage as in conventional exome sequencing, the total length of the target regions is still approximately 20-fold smaller than the entire genome, which supports the feasibility of this strategy relative to whole genome sequencing (WGS) when seeking relevant mutations for certain phenotypes. Moreover, rapid growth in sequencer performance will render the TargetEC strategy an even more affordable option for the detection of variations not only in exonic regions, but also in functionally important CNSs.
To experimentally confirm the involvement of a candidate mutation identified by target capture sequencing, we can use a gene knockout procedure that was recently established for rats. An early N-ethyl-N-nitrosourea (ENU) mutagenesis-based method was devised for knockout rat generation [49, 50]. With the recent emergence of genome editing technology, it is more feasible to implement genomic modifications in rats to analyze the phenotypic effects of genes [51, 52]. In these circumstances, TargetEC can be applied to laboratory rats that exhibit various disease phenotypes; the resulting experimental confirmation of the involvement of the identified mutations would shed light on the genetic cause underlying these mutants and promote mechanistic insights into the disease etiology.
This design strategy (i.e., targeting both exonic regions and CNSs) can readily be applied to human samples for the detection of regulatory mutations associated with certain phenotypes. Moreover, the TargetEC probe set developed in this study might be directly applicable to cross-species capture, for example, in other rodents, because it covers all regions that are highly conserved (phastCons score = 1.0) among vertebrates, although the experimental conditions and performance for this purpose require further evaluation.
The target capture design reported herein can potentially be used to identify additional mutations responsible for various phenotypes in the rat, which is a suitable model organism for comparative studies of diseases and other traits. The potential mutations include not only those in coding regions but also those in CNSs, which might correspond to regulatory mutations. By further analyzing these mutations, we will deepen our understanding of the relationships between the genotypic and phenotypic variations associated with regulatory elements.
WTC/Kyo (NBRP Rat No. 0020), WTC-swh/Kyo (NBRP Rat No. 0287), PVG/Seac (NBRP Rat No. 0080), and KFRS4/Kyo (NBRP Rat No. 0572) rats were provided by the Japanese NBRP-Rat (http://www.anim.med.kyoto-u.ac.jp/nbr/) and maintained at the Institute of Laboratory Animals, Graduate School of Medicine, Kyoto University, under a specific pathogen-free environment. All animal experimentation protocols were approved by the Institutional Animal Care and Use Committees of Kyoto University and were conducted according to the Regulation on Animal Experimentation at Kyoto University.
Genome annotation data used to design the target regions for TargetEC
The rat genome assembly rn5 (RGSC 5.0, March 2012) was used as a reference genome in this study. We used gene annotations from RefSeq  and Ensembl  for the rat (rn5); these were downloaded from the Table Browser function in the UCSC Genome Browser . We also used highly conserved regions (phastCons score = 1.0)  among 13 vertebrate species (multiz alignment of 13 vertebrates), which were also downloaded from the UCSC Genome Browser . We merged all of these regions (i.e., those annotated in RefSeq and/or Ensembl and the conserved regions) using the “merge” option in the BEDTools program . Regions <100 bp in length were expanded in both directions to reach a minimum length of 100 bp. After these expansions, overlapping regions were further unified using the “merge” option in the BEDTools program. Based on the obtained target regions, a probe set for sequence capture (SeqCap EZ Developer Library; Roche NimbleGen, Madison, WI, USA) was designed and synthesized according to the manufacturer’s internal protocols.
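The region-building steps above (merge, expand short regions to 100 bp, merge again) can be sketched in plain Python as follows. This is an illustration, not the authors' BEDTools commands: the function names are hypothetical, the expansion is assumed to be symmetric, and clipping at chromosome ends is omitted because chromosome sizes are not part of the sketch.

```python
from typing import Iterable, List, Tuple

Interval = Tuple[str, int, int]  # (chromosome, start, end), 0-based half-open as in BED


def merge_intervals(intervals: Iterable[Interval]) -> List[Interval]:
    """Merge overlapping or directly adjacent intervals on each chromosome."""
    merged: List[Interval] = []
    for chrom, start, end in sorted(intervals):
        if merged and merged[-1][0] == chrom and start <= merged[-1][2]:
            prev = merged[-1]
            merged[-1] = (chrom, prev[1], max(prev[2], end))
        else:
            merged.append((chrom, start, end))
    return merged


def expand_to_min_length(interval: Interval, min_len: int = 100) -> Interval:
    """Grow a short interval in both directions until it spans min_len bases."""
    chrom, start, end = interval
    shortfall = min_len - (end - start)
    if shortfall <= 0:
        return interval
    left = shortfall // 2
    right = shortfall - left
    new_start = max(0, start - left)
    # If the left side was clipped at position 0, extend further to the right.
    new_end = end + right + (left - (start - new_start))
    return (chrom, new_start, new_end)


def build_targets(sources: Iterable[Iterable[Interval]], min_len: int = 100) -> List[Interval]:
    """Union all annotation sources, expand short regions, then re-merge."""
    merged = merge_intervals(iv for src in sources for iv in src)
    expanded = [expand_to_min_length(iv, min_len) for iv in merged]
    return merge_intervals(expanded)


# Toy coordinates (hypothetical), standing in for RefSeq, Ensembl, and conserved regions.
refseq = [("chr1", 100, 400), ("chr1", 1000, 1020)]
ensembl = [("chr1", 350, 500)]
conserved = [("chr1", 1060, 1070), ("chr2", 10, 30)]
targets = build_targets([refseq, ensembl, conserved])
```

In the toy data, the two short chr1 regions overlap once expanded, which is why a second merge pass is needed after the expansion.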
Sample preparation and sequencing
Target capture was performed using the standard SeqCap EZ System protocol (Roche NimbleGen). DNA sequencing libraries were prepared using the KAPA HyperPlus Library Preparation Kit (KAPA Biosystems, London, UK) according to the manufacturer’s protocol. We applied two samples for one-reaction reagents. Sequencing was performed on an Illumina HiSeq 1500 (Illumina, San Diego, CA, USA) in Rapid Run mode PE250 (v2) (WTC/Kyo and WTC-swh/Kyo) and Rapid Run mode PE100 (v1) (PVG/Seac and KFRS4/Kyo), according to the manufacturer’s protocol.
Quality control of the sequencing reads
The reads generated using the Illumina HiSeq platform were subjected to quality control to exclude low-quality reads. Initially, we used the FastQC program (http://www.bioinformatics.babraham.ac.uk/projects/fastqc/) to evaluate the overall read quality. Next, we trimmed low-quality bases (quality score of <30) from the 3’-end of each read. After this trimming step, reads shorter than 30 bases were discarded from further analysis.
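As an illustration of the trimming and length filter described above, the following sketch removes bases from the 3’-end while their quality score is below 30 and drops reads that end up shorter than 30 bases. The trimming tool and its exact algorithm are not named in the text, so the simple trailing-base rule, the Phred+33 encoding, and the function names are all assumptions.

```python
from typing import Optional, Tuple


def trim_3prime(seq: str, qual: str, min_q: int = 30, offset: int = 33) -> Tuple[str, str]:
    """Strip bases from the 3' end while their Phred score is below min_q.

    Assumes Phred+33 quality encoding (an assumption of this sketch).
    """
    end = len(seq)
    while end > 0 and ord(qual[end - 1]) - offset < min_q:
        end -= 1
    return seq[:end], qual[:end]


def quality_filter(
    seq: str, qual: str, min_q: int = 30, min_len: int = 30
) -> Optional[Tuple[str, str]]:
    """Return the trimmed read, or None if it is shorter than min_len after trimming."""
    trimmed_seq, trimmed_qual = trim_3prime(seq, qual, min_q)
    if len(trimmed_seq) < min_len:
        return None
    return trimmed_seq, trimmed_qual


# 'I' encodes Q40 and '#' encodes Q2 under Phred+33.
kept = quality_filter("A" * 35 + "C" * 5, "I" * 35 + "#" * 5)
dropped = quality_filter("A" * 20 + "C" * 20, "I" * 20 + "#" * 20)
```

Here the first read keeps its 35 high-quality bases, whereas the second is discarded because only 20 bases remain after trimming.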
Read mapping and variant identification
We followed a previously reported basic exome data analysis procedure. In brief, the reads that passed the above quality control were first mapped onto the reference rat genome (rn5; RGSC 5.0, March 2012) using BWA (v0.7.4) with the default parameters. SAMtools (v0.1.12a) was used for file format conversion and indexing. For post-processing of this mapping step, duplicated reads were removed based on the mapping positions of the mates using the MarkDuplicates program in Picard tools (v1.87) (http://broadinstitute.github.io/picard/). Next, the RealignerTargetCreator and IndelRealigner utilities in the Genome Analysis Toolkit (GATK; v3.6) and the FixMateInformation utility in Picard tools were used to realign the reads. To calculate basic statistics, we considered a read to be on target if it overlapped a target region by at least 1 bp. Quality score recalibration was conducted with the BaseRecalibrator utility in the GATK, using known SNVs and small INDELs collected from dbSNP (Build 138; April 25, 2013). The same known variant set was used for filtering to narrow down the number of candidate mutations associated with the phenotypes. Variant calling was performed using the HaplotypeCaller utility in the GATK.
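The on-target rule (overlap of at least 1 bp) and the coverage breadth statistic (% of target with depth of ≥10×) can be expressed as a short sketch. The data layout (half-open coordinates, a per-base depth list) and the function names are assumptions for illustration and do not reproduce the tools used in the study.

```python
from typing import Dict, List, Sequence, Tuple

Targets = Dict[str, List[Tuple[int, int]]]  # chromosome -> [(start, end)], half-open
Read = Tuple[str, int, int]                 # (chromosome, start, end), half-open


def overlap_bp(a_start: int, a_end: int, b_start: int, b_end: int) -> int:
    """Number of bases shared by two half-open intervals."""
    return max(0, min(a_end, b_end) - max(a_start, b_start))


def is_on_target(read: Read, targets: Targets, min_overlap: int = 1) -> bool:
    """A read is on target if it overlaps any target by at least min_overlap bp."""
    chrom, start, end = read
    return any(
        overlap_bp(start, end, t_start, t_end) >= min_overlap
        for t_start, t_end in targets.get(chrom, [])
    )


def on_target_percent(reads: Sequence[Read], targets: Targets) -> float:
    """Percentage of reads that are on target."""
    if not reads:
        return 0.0
    return 100.0 * sum(is_on_target(r, targets) for r in reads) / len(reads)


def percent_covered(depths: Sequence[int], min_depth: int = 10) -> float:
    """Percentage of target bases whose depth is at least min_depth."""
    if not depths:
        return 0.0
    return 100.0 * sum(d >= min_depth for d in depths) / len(depths)


# Toy data (hypothetical coordinates and depths).
toy_targets = {"chr1": [(100, 200)]}
toy_reads = [("chr1", 50, 101), ("chr1", 50, 100), ("chr1", 199, 300), ("chr2", 100, 200)]
pct_on_target = on_target_percent(toy_reads, toy_targets)
pct_10x = percent_covered([0, 9, 10, 25])
```

The linear scan over targets keeps the sketch short; with roughly a million target regions, a sorted index or interval tree would be the practical choice.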
BEDTools (v2.17.0) was used to filter out known variants from those identified in a sample. ANNOVAR (version 22-March-2015) was used for the functional annotation of variants, with RefSeq genes and Ensembl genes used as the rat gene annotations. PolyPhen-2 was used to assess the functional consequences of mutations. The Integrative Genomics Viewer (IGV; v2_2_13) was used to visualize the mapped reads and variants.
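The filtering logic can be sketched as a set difference: variants already present in the known set are removed, and the remainder can additionally be compared against the control strain. This is an illustration only. It matches variants on chromosome, position, and alleles, which is an assumption, whereas the BEDTools-based filtering in the study operates on coordinates; the function names and toy variants are hypothetical.

```python
from typing import Iterable, List, Tuple

Variant = Tuple[str, int, str, str]  # (chromosome, position, ref allele, alt allele)


def remove_known(sample: Iterable[Variant], known: Iterable[Variant]) -> List[Variant]:
    """Keep only sample variants that are absent from the known-variant set."""
    known_set = set(known)
    return [v for v in sample if v not in known_set]


def candidate_variants(
    mutant: Iterable[Variant], control: Iterable[Variant], known: Iterable[Variant]
) -> List[Variant]:
    """Variants in the mutant strain that are neither known nor shared with the control."""
    novel = remove_known(mutant, known)
    return remove_known(novel, control)


# Toy variants (hypothetical).
known_db = [("chr1", 1000, "A", "G")]
control_calls = [("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T")]
mutant_calls = [("chr1", 1000, "A", "G"), ("chr2", 500, "C", "T"), ("chr17", 42, "G", "A")]
candidates = candidate_variants(mutant_calls, control_calls, known_db)
```

Only the variant private to the mutant strain survives both filters in the toy data, which mirrors how comparison against a control strain narrows the candidate list.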
Detection of large deletions
The BAM files obtained from the read mapping were used to detect large deletions. For this procedure, we applied the peak calling program MACS2 to the BAM files for the mutant strain and its control strain. By comparing the peak regions in the mutant and control strains using the “subtract” option in BEDTools, we defined a region as a deletion in the mutant strain if a peak was present only in the control strain. The identified peaks were then visually inspected using IGV (v2_2_13).
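The comparison step can be illustrated with a sketch that keeps control-strain peaks having no overlapping peak in the mutant strain, and treats them as candidate deletions. The exact MACS2 and BEDTools options used in the study are not stated, so dropping a control peak on any overlap is an assumption of this sketch, as are the function name and the toy coordinates.

```python
from collections import defaultdict
from typing import Dict, List, Tuple

Peak = Tuple[str, int, int]  # (chromosome, start, end), half-open


def control_only_peaks(control: List[Peak], mutant: List[Peak]) -> List[Peak]:
    """Return control peaks that do not overlap any mutant peak.

    Such regions have reads in the control strain but not in the mutant
    strain, so they are candidate deletions in the mutant.
    """
    mutant_by_chrom: Dict[str, List[Tuple[int, int]]] = defaultdict(list)
    for chrom, start, end in mutant:
        mutant_by_chrom[chrom].append((start, end))

    candidates: List[Peak] = []
    for chrom, start, end in control:
        overlaps = any(
            start < m_end and m_start < end
            for m_start, m_end in mutant_by_chrom.get(chrom, [])
        )
        if not overlaps:
            candidates.append((chrom, start, end))
    return candidates


# Toy peaks (hypothetical coordinates).
control_peaks = [("chr14", 1000, 1500), ("chr14", 5000, 5600), ("chr2", 10, 90)]
mutant_peaks = [("chr14", 1200, 1400), ("chr2", 90, 120)]
deletion_candidates = control_only_peaks(control_peaks, mutant_peaks)
```

Because capture efficiency varies between regions and runs, candidates from such a comparison still need the visual inspection described in the text.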
CNS, conserved non-coding sequence; GWAS, genome-wide association study; SNV, single nucleotide variant; WES, whole exome sequencing.
We are thankful to the National BioResource Project - Rat (http://www.anim.med.kyoto-u.ac.jp/nbr/) for providing rat strains (WTC/Kyo, WTC-swh/Kyo, PVG/Seac, and KFRS4/Kyo).
This work was supported in part by the Cooperative Research Project Program of the Medical Institute of Bioregulation, Kyushu University, and the Genome Information Upgrading Program of the National BioResource Project, Japan Agency for Medical Research and Development (AMED).
Availability of data and materials
The TargetEC probe set is commercially available from Roche NimbleGen under the design name 140929_RN5_MS_EZ_HX1. The obtained reads have been deposited in a public database (DDBJ Sequence Read Archive accession number: DRA004543).
MY and MS conceived and designed the study. TK and OO conducted the experiments. MY, DS, TS, and MS performed the analyses. MY and MS wrote the manuscript with help from all of the authors. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
All animal experimentation protocols were approved by the Institutional Animal Care and Use Committees of Kyoto University and were conducted according to the Regulation on Animal Experimentation at Kyoto University.
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
- Hodges E, Xuan Z, Balija V, Kramer M, Molla MN, Smith SW, et al. Genome-wide in situ exon capture for selective resequencing. Nat Genet. 2007;39:1522–7.View ArticlePubMedGoogle Scholar
- Ng SB, Buckingham KJ, Lee C, Bigham AW, Tabor HK, Dent KM, et al. Exome sequencing identifies the cause of a mendelian disorder. Nat Genet. 2010;42:30–5.View ArticlePubMedGoogle Scholar
- Rabbani B, Tekin M, Mahdieh N. The promise of whole-exome sequencing in medical genetics. J Hum Genet. 2014;59:5–15.View ArticlePubMedGoogle Scholar
- Fairfield H, Gilbert GJ, Barter M, Corrigan RR, Curtain M, Ding Y, et al. Mutation discovery in mice by whole exome sequencing. Genome Biol. 2011;12:R86.View ArticlePubMedPubMed CentralGoogle Scholar
- Fairfield H, Srivastava A, Ananda G, Liu R, Kircher M, Lakshminarayana A, et al. Exome sequencing reveals pathogenic mutations in 91 strains of mice with Mendelian disorders. Genome Res. 2015;25:948–57.View ArticlePubMedPubMed CentralGoogle Scholar
- Gibbs RA, Weinstock GM, Metzker ML, Muzny DM, Sodergren EJ, Scherer S, et al. Genome sequence of the Brown Norway rat yields insights into mammalian evolution. Nature. 2004;428:493–521.View ArticlePubMedGoogle Scholar
- Broeckx BJG, Coopman F, Verhoeven GEC, Bavegems V, De Keulenaer S, De Meester E, et al. Development and performance of a targeted whole exome sequencing enrichment kit for the dog (Canis Familiaris Build 3.1). Sci Rep. 2014;4:5597.View ArticlePubMedPubMed CentralGoogle Scholar
- Broeckx BJG, Hitte C, Coopman F, Verhoeven GEC, De Keulenaer S, De Meester E, et al. Improved canine exome designs, featuring ncRNAs and increased coverage of protein coding genes. Sci Rep. 2015;5:12810.View ArticlePubMedPubMed CentralGoogle Scholar
- Robert C, Fuentes-Utrilla P, Troup K, Loecherbach J, Turner F, Talbot R, et al. Design and development of exome capture sequencing for the domestic pig (Sus scrofa). BMC Genomics. 2014;15:550.View ArticlePubMedPubMed CentralGoogle Scholar
- Cosart T, Beja-Pereira A, Chen S, Ng SB, Shendure J, Luikart G. Exome-wide DNA capture and next generation sequencing in domestic and wild species. BMC Genomics. 2011;12:347.View ArticlePubMedPubMed CentralGoogle Scholar
- Bi K, Vanderpool D, Singhal S, Linderoth T, Moritz C, Good JM. Transcriptome-based exon capture enables highly cost-effective comparative genomic data collection at moderate evolutionary scales. BMC Genomics. 2012;13:403.View ArticlePubMedPubMed CentralGoogle Scholar
- Jin X, He M, Ferguson B, Meng Y, Ouyang L, Ren J, et al. An effort to use human-based exome capture methods to analyze chimpanzee and macaque exomes. PLoS One. 2012;7, e40637.View ArticlePubMedPubMed CentralGoogle Scholar
- Lettice LA, Heaney SJH, Purdie LA, Li L, de Beer P, Oostra BA, et al. A long-range Shh enhancer regulates expression in the developing limb and fin and is associated with preaxial polydactyly. Hum Mol Genet. 2003;12:1725–35.View ArticlePubMedGoogle Scholar
- Benko S, Fantes JA, Amiel J, Kleinjan D-J, Thomas S, Ramsay J, et al. Highly conserved non-coding elements on either side of SOX9 associated with Pierre Robin sequence. Nat Genet. 2009;41:359–64.View ArticlePubMedGoogle Scholar
- Epstein DJ. Cis-regulatory mutations in human disease. Brief Funct Genomic Proteomic. 2009;8:310–6.View ArticlePubMedPubMed CentralGoogle Scholar
- Ward LD, Kellis M. Interpreting noncoding genetic variation in complex traits and human disease. Nat Biotechnol. 2012;30:1095–106.View ArticlePubMedPubMed CentralGoogle Scholar
- Douglas AT, Hill RD. Variation in vertebrate cis-regulatory elements in evolution and disease. Transcription. 2014;5, e28848.View ArticlePubMedPubMed CentralGoogle Scholar
- Mathelier A, Shi W, Wasserman WW. Identification of altered cis-regulatory elements in human disease. Trends Genet TIG. 2015;31:67–76.View ArticlePubMedGoogle Scholar
- Welter D, Macarthur J, Morales J, Burdett T, Hall P, Junkins H, et al. The NHGRI GWAS Catalog, a curated resource of SNP-trait associations. Nucleic Acids Res. 2014;42:D1001–6.View ArticlePubMedGoogle Scholar
- Sakabe NJ, Savic D, Nobrega MA. Transcriptional enhancers in development and disease. Genome Biol. 2012;13:238.View ArticlePubMedPubMed CentralGoogle Scholar
- Siepel A, Bejerano G, Pedersen JS, Hinrichs AS, Hou M, Rosenbloom K, et al. Evolutionarily conserved elements in vertebrate, insect, worm, and yeast genomes. Genome Res. 2005;15:1034–50.View ArticlePubMedPubMed CentralGoogle Scholar
- Sherry ST, Ward MH, Kholodov M, Baker J, Phan L, Smigielski EM, et al. dbSNP: the NCBI database of genetic variation. Nucleic Acids Res. 2001;29:308–11.View ArticlePubMedPubMed CentralGoogle Scholar
- STAR Consortium, Saar K, Beck A, Bihoreau M-T, Birney E, Brocklebank D, et al. SNP and haplotype mapping for genetic analysis in the rat. Nat Genet. 2008;40:560–6.View ArticleGoogle Scholar
- Atanur SS, Diaz AG, Maratou K, Sarkis A, Rotival M, Game L, et al. Genome sequencing reveals loci under artificial selection that underlie disease phenotypes in the laboratory rat. Cell. 2013;154:691–703.View ArticlePubMedPubMed CentralGoogle Scholar
- Rat Genome Sequencing and Mapping Consortium, Baud A, Hermsen R, Guryev V, Stridh P, Graham D, et al. Combined sequence-based and genetic mapping analysis of complex traits in outbred rats. Nat Genet. 2013;45:767–75.View ArticleGoogle Scholar
- Hermsen R, de Ligt J, Spee W, Blokzijl F, Schäfer S, Adami E, et al. Genomic landscape of rat strain and substrain variation. BMC Genomics. 2015;16:357.View ArticlePubMedPubMed CentralGoogle Scholar
- Kuramoto T, Morimura K, Nomoto T, Namiki C, Hamada S, Fukushima S, et al. Sparse and wavy hair: a new model for hypoplasia of hair follicle and mammary glands on rat chromosome 17. J Hered. 2005;96:339–45.View ArticlePubMedGoogle Scholar
- Watanabe K, Wada K, Ohashi T, Okubo S, Takekuma K, Hashizume R, et al. A 5-bp insertion in Mip causes recessive congenital cataract in KFRS4/Kyo rats. PLoS One. 2012;7, e50737.View ArticlePubMedPubMed CentralGoogle Scholar
- Quina LA, Kuramoto T, Luquetti DV, Cox TC, Serikawa T, Turner EE. Deletion of a conserved regulatory element required for Hmx1 expression in craniofacial mesenchyme in the dumbo rat: a newly identified cause of congenital ear malformation. Dis Model Mech. 2012;5:812–22.
- Pruitt KD, Brown GR, Hiatt SM, Thibaud-Nissen F, Astashyn A, Ermolaeva O, et al. RefSeq: an update on mammalian reference sequences. Nucleic Acids Res. 2014;42:D756–63.
- Cunningham F, Amode MR, Barrell D, Beal K, Billis K, Brent S, et al. Ensembl 2015. Nucleic Acids Res. 2015;43:D662–9.
- Serikawa T, Mashimo T, Takizawa A, Okajima R, Maedomari N, Kumafuji K, et al. National BioResource Project-Rat and related activities. Exp Anim Jpn Assoc Lab Anim Sci. 2009;58:333–41.
- Mashimo T, Voigt B, Tsurumi T, Naoi K, Nakanishi S, Yamasaki K, et al. A set of highly informative rat simple sequence length polymorphism (SSLP) markers and genetically defined rat strains. BMC Genet. 2006;7:19.
- Schaibley VM, Zawistowski M, Wegmann D, Ehm MG, Nelson MR, St Jean PL, et al. The influence of genomic context on mutation patterns in the human genome inferred from rare variants. Genome Res. 2013;23:1974–84.
- Kuramoto T, Yokoe M, Hashimoto R, Hiai H, Serikawa T. A rat model of hypohidrotic ectodermal dysplasia carries a missense mutation in the Edaradd gene. BMC Genet. 2011;12:91.
- Sadier A, Viriot L, Pantalacci S, Laudet V. The ectodysplasin pathway: from diseases to adaptations. Trends Genet TIG. 2014;30:24–31.
- Adzhubei IA, Schmidt S, Peshkin L, Ramensky VE, Gerasimova A, Bork P, et al. A method and server for predicting damaging missense mutations. Nat Methods. 2010;7:248–9.
- Shiels A, Bassnett S. Mutations in the founder of the MIP gene family underlie cataract development in the mouse. Nat Genet. 1996;12:212–5.
- Varadaraj K, Kumari SS, Patil R, Wax MB, Mathias RT. Functional characterization of a human aquaporin 0 mutation that leads to a congenital dominant lens cataract. Exp Eye Res. 2008;87:9–21.
- Kuramoto T, Yokoe M, Yagasaki K, Kawaguchi T, Kumafuji K, Serikawa T. Genetic analyses of fancy rat-derived mutations. Exp Anim Jpn Assoc Lab Anim Sci. 2010;59:147–55.
- Wang K, Li M, Hakonarson H. ANNOVAR: functional annotation of genetic variants from high-throughput sequencing data. Nucleic Acids Res. 2010;38:e164.
- Kumar P, Henikoff S, Ng PC. Predicting the effects of coding non-synonymous variants on protein function using the SIFT algorithm. Nat Protoc. 2009;4:1073–81.
- Buchan DWA, Minneci F, Nugent TCO, Bryson K, Jones DT. Scalable web services for the PSIPRED Protein Analysis Workbench. Nucleic Acids Res. 2013;41:W349–57.
- ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57–74.
- Mouse ENCODE Consortium, Stamatoyannopoulos JA, Snyder M, Hardison R, Ren B, Gingeras T, et al. An encyclopedia of mouse DNA elements (Mouse ENCODE). Genome Biol. 2012;13:418.
- Hinrichs AS, Karolchik D, Baertsch R, Barber GP, Bejerano G, Clawson H, et al. The UCSC Genome Browser Database: update 2006. Nucleic Acids Res. 2006;34:D590–8.
- 1000 Genomes Project Consortium, Auton A, Brooks LD, Durbin RM, Garrison EP, Kang HM, et al. A global reference for human genetic variation. Nature. 2015;526:68–74.
- 1000 Genomes Project Consortium, Abecasis GR, Altshuler D, Auton A, Brooks LD, Durbin RM, et al. A map of human genome variation from population-scale sequencing. Nature. 2010;467:1061–73.
- Zan Y, Haag JD, Chen K-S, Shepel LA, Wigington D, Wang Y-R, et al. Production of knockout rats using ENU mutagenesis and a yeast-based screening assay. Nat Biotechnol. 2003;21:645–51.
- Smits BMG, Mudde JB, van de Belt J, Verheul M, Olivier J, Homberg J, et al. Generation of gene knockouts and mutant models in the laboratory rat by ENU-driven target-selected mutagenesis. Pharmacogenet Genomics. 2006;16:159–69.
- Geurts AM, Cost GJ, Freyvert Y, Zeitler B, Miller JC, Choi VM, et al. Knockout rats via embryo microinjection of zinc-finger nucleases. Science. 2009;325:433.
- Mashimo T, Takizawa A, Voigt B, Yoshimi K, Hiai H, Kuramoto T, et al. Generation of knockout rats with X-linked severe combined immunodeficiency (X-SCID) using zinc-finger nucleases. PLoS One. 2010;5:e8870.
- Paijmans JLA, Fickel J, Courtiol A, Hofreiter M, Förster DW. Impact of enrichment conditions on cross-species capture of fresh and degraded DNA. Mol Ecol Resour. 2016;16:42–55.
- Karolchik D, Hinrichs AS, Furey TS, Roskin KM, Sugnet CW, Haussler D, et al. The UCSC Table Browser data retrieval tool. Nucleic Acids Res. 2004;32:D493–6.
- Rosenbloom KR, Armstrong J, Barber GP, Casper J, Clawson H, Diekhans M, et al. The UCSC Genome Browser database: 2015 update. Nucleic Acids Res. 2015;43:D670–81.
- Quinlan AR, Hall IM. BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010;26:841–2.
- Altmann A, Weber P, Bader D, Preuss M, Binder EB, Müller-Myhsok B. A beginners guide to SNP calling from high-throughput DNA-sequencing data. Hum Genet. 2012;131:1541–54.
- Li H, Durbin R. Fast and accurate short read alignment with Burrows-Wheeler transform. Bioinformatics. 2009;25:1754–60.
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, et al. The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009;25:2078–9.
- McKenna A, Hanna M, Banks E, Sivachenko A, Cibulskis K, Kernytsky A, et al. The Genome Analysis Toolkit: a MapReduce framework for analyzing next-generation DNA sequencing data. Genome Res. 2010;20:1297–303.
- Thorvaldsdóttir H, Robinson JT, Mesirov JP. Integrative Genomics Viewer (IGV): high-performance genomics data visualization and exploration. Brief Bioinform. 2013;14:178–92.
- Feng J, Liu T, Qin B, Zhang Y, Liu XS. Identifying ChIP-seq enrichment using MACS. Nat Protoc. 2012;7:1728–40.