DNA damage repair system in C57BL/6 J mice is evolutionarily stable
BMC Genomics volume 22, Article number: 669 (2021)
The DNA damage repair (DDR) system is vital for maintaining genome stability and survival. It comprises over 160 genes in 7 pathways, each repairing a specific type of DNA damage caused by external or internal factors. The functional importance of the DDR system implies that evolution could play an important role in keeping it functionally intact. Indeed, positive selection has been observed in BRCA1 and BRCA2 (BRCA), key genes in the homologous recombination pathway of the DDR system, in humans and their close relatives, chimpanzees and bonobos. Efforts to determine whether the same selection exists for BRCA in other mammals have found no evidence so far. However, because most studies in non-human mammals analyzed only one or a few individuals per species, the observations may not reflect the true status of the species. Furthermore, few studies have examined evolutionary selection in DDR genes other than BRCA. In the current study, we used the laboratory mouse C57BL/6 J as a model to address evolutionary selection on DDR genes in non-primate mammals by monitoring genetic variation across 30 generations of C57BL/6 J.
Using exome sequencing, we collected the coding sequences of 169 DDR genes from the genomes of 44 C57BL/6 J individuals in 2018. We compared these coding sequences with the mouse reference genome sequences derived from 1998 C57BL/6 J DNA and with the mouse B6Eve reference genome sequences derived from 2003 C57BL/6 J DNA, covering 30 generations of C57BL/6 J from 1998 to 2018. We did not identify meaningful coding variation in Brca1, Brca2, or the 167 other DDR genes across the 30 generations. Over the same period, we did identify 812 coding variants in 116 non-DDR genes, which served as a quality control validating the reliability of our analytic pipeline and of the negative results in DDR genes.
DDR genes in the laboratory mouse strain C57BL/6 J were not under positive selection across the 30-generation period, highlighting the possibility that the DDR system in rodents is evolutionarily stable.
A genome is constantly damaged by internal metabolic factors and external environmental factors. To maintain genome stability, living organisms are equipped with a highly sophisticated DNA damage repair (DDR) system that effectively repairs the damage. The DDR system is composed of multiple pathways, including homologous recombination (HR), non-homologous end joining (NHEJ), the Fanconi anemia pathway (FA), base excision repair (BER), nucleotide excision repair (NER), mismatch repair (MMR), and single-strand annealing (SSA). Each pathway consists of a group of genes that act together to repair a specific type of DNA damage.
Because DNA damage repair is vital for survival, evolutionary selection would be expected to play a role in maintaining a highly functional DNA damage repair machinery for survival and better fitness. BRCA1 and BRCA2 (BRCA) are two important DDR genes that repair DNA double-strand breaks through the homologous recombination (HR) pathway, and mutations in BRCA substantially increase cancer risk [1, 2]. Studies have indeed revealed that BRCA in humans and their close relatives, chimpanzees and bonobos, is under positive selection. However, the same type of selection was not observed in other mammals [4,5,6,7,8,9]. This raises the possibility that the same DNA damage repair genes could be under different evolutionary selection in different species. With a few exceptions, however, nearly all BRCA variation data reported from non-human mammals were derived from a single individual of the tested species. From a population genetics point of view, it is questionable whether an observation made in a single individual can represent the situation in the species as a whole. Further, few DDR genes other than BRCA have been analyzed for evolutionary selection (Table 1). Therefore, the relationship between the DDR system and evolutionary selection, a fundamental question in biology concerning the mechanisms of genome stability maintenance, remains unclear.
Dynamic monitoring of genetic variation is a powerful approach to studying evolutionary selection. This is best exemplified by variation studies in E. coli, which followed its continuous growth over four decades and more than 60,000 generations under laboratory culture conditions, and in the laboratory rat, which followed genetic variation in genes involved in learning, circadian rhythm, and metabolism. C57BL/6 J is one of the most widely used laboratory mouse models in biological and oncological studies. C57BL/6 J is the descendant of a cryopreserved embryo stock with a clear genetic background (Fig. 1). Its DNA extracted in 1998 was used by the Mouse Genome Project to generate the mouse genome reference sequences, and its DNA extracted in 2003 was sequenced again to generate the B6Eve mouse genome reference sequences. From 1998 to 2018, C57BL/6 J passed through 30 generations. We hypothesized that this period is long enough to make C57BL/6 J a suitable model for testing evolutionary selection in the DDR system, and that the information could help in understanding evolutionary selection on the DDR system in rodents as represented by C57BL/6 J.
In the present study, we sequenced the coding regions of the C57BL/6 J genome using DNA collected from 44 C57BL/6 J individuals in 2018. We searched for variants that arose after 1998 by comparison with the mouse genome reference sequences derived from 1998 C57BL/6 J DNA and the B6Eve mouse genome reference sequences derived from 2003 C57BL/6 J DNA. We found no evidence of genetic variation arising in the 169 DDR genes, including Brca1 and Brca2, during this period, whereas we did identify genetic variation in 116 non-DDR genes belonging to other functional categories. From these data, we conclude that the DDR system in C57BL/6 J was evolutionarily stable over the 30-generation period.
Identifying genetic variants
The C57BL/6 J genome was sequenced in 1998 by the Mouse Genome Project to generate the mouse genome reference sequences. Since then, C57BL/6 J mice have been inbred for 30 generations (24 in Jackson Laboratory and 4 in University of Macau Animal Facility) by 2018 ([14], Fig. 1). We collected genomic DNA from 44 C57BL/6 J mice in 2018, performed exome sequencing, and called coding variants. We applied the following procedures to ensure the accuracy of the variants called from the exome sequences: 1) keeping for further analysis only the variants present in at least 50% (22 individuals) of the mice; 2) using both the mm7 and mm10 assemblies of the mouse genome reference sequences as references for variant calling; 3) using the B6Eve variants as a third reference; 4) using Sanger sequencing to validate the called variants. From the exome sequences of the 2018 C57BL/6 J DNA, we identified a total of 3024 variants (Supplementary Table 1), of which 883 (29.2%) were singletons, 1329 (43.9%) were present in 2 to 21 mice, and 812 (26.9%) were present in at least 22 mice and were used for further analysis (Supplementary Table 2). We reasoned that setting this high bar would allow us to address population-level variation rather than individual variation.
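The carrier-count threshold described above can be sketched in a few lines of Python. This is an illustration only, not the authors' pipeline: the function name `classify` and the bin labels are assumptions introduced here, and the only data used are the three counts reported in the text.

```python
N_MICE = 44          # cohort size reported in the text
MIN_CARRIERS = 22    # "present in at least 22 mice" (half of the cohort)


def classify(carriers: int) -> str:
    """Assign a variant to one of the three bins used in the text."""
    if not 1 <= carriers <= N_MICE:
        raise ValueError("carrier count must be between 1 and N_MICE")
    if carriers == 1:
        return "singleton"
    if carriers < MIN_CARRIERS:
        return "2-21"
    return ">=22"


# Counts reported in the text; their sum and shares can be re-derived.
reported = {"singleton": 883, "2-21": 1329, ">=22": 812}
total = sum(reported.values())
shares = {name: round(100 * n / total, 1) for name, n in reported.items()}

print(total)   # 3024
print(shares)  # {'singleton': 29.2, '2-21': 43.9, '>=22': 26.9}
```

Only the last bin (812 variants) passes the filter; the other two bins are reported but not analyzed further.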
Variants in DDR genes
We searched the 812 variants but did not identify any variants in Brca1 or Brca2. We further searched for variants in the remaining 167 DDR genes involved in the 7 DNA damage repair pathways but did not identify any variants in these genes either (Supplementary Table 3A, B).
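The search described above amounts to intersecting the genes that carry retained variants with the DDR gene list. A minimal sketch follows, assuming each variant record carries an annotated gene symbol; the record layout, the coordinates, and the two-gene stand-in for the 169-gene list are hypothetical.

```python
def genes_with_variants(variants, gene_set):
    """Return the genes from gene_set hit by at least one variant.

    Gene symbols are compared case-insensitively.
    """
    wanted = {gene.upper() for gene in gene_set}
    return sorted({v["gene"] for v in variants if v["gene"].upper() in wanted})


# Stand-in for the 169-gene DDR list (only the two genes named in the text).
ddr_genes = {"Brca1", "Brca2"}

# Hypothetical variant records; the coordinates are placeholders, not real calls.
variants = [
    {"gene": "Mroh2a", "chrom": "chr1", "pos": 1000},
    {"gene": "C4b", "chrom": "chr17", "pos": 2000},
]

hits = genes_with_variants(variants, ddr_genes)
print(hits)  # [] : no DDR gene carries a retained variant in this toy input
```

An empty result here corresponds to the negative finding reported for the DDR genes, while a non-empty result for another gene list would name the affected genes.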
Variants in non-DDR genes
We then annotated the 812 variants and identified 116 non-DDR genes carrying them. Of these, Mroh2a, which encodes a HEAT-domain-containing protein of unknown function, had the highest number of variants (85), and C4b, which encodes a component of the complement system, had the second highest (53) (Table 3, Supplementary Table 4). We used Sanger sequencing to validate a set of the variants in the original 2018 DNA samples used for exome sequencing. Of the 15 variants tested, 10 (67%) were validated (Supplementary Table 5). The variants identified in the non-DDR genes provided an internal control, ensuring that the absence of variation in DDR genes was a true biological phenomenon rather than a failure of detection due to technical errors.
The C57BL/6 J genome was sequenced in 1998 to generate the mouse genome reference sequences. After the 20 years from 1998 to 2018, covering 30 generations, we re-sequenced the coding genes of C57BL/6 J in 44 individuals to determine whether variation had arisen in the DDR genes of the C57BL/6 J genome during this period. Our study did not identify new variants in DDR genes, including Brca1 and Brca2, in the C57BL/6 J genome. The presence of new variants in over a hundred non-DDR genes during the same period provided strong assurance of the reliability of the observed lack of selection in DDR genes, and ruled out the possibility that the lack of variation in the full set of DDR genes was due to technical failure. The data from our study indicate the absence of positive selection in DDR genes in C57BL/6 J during the 30-generation period.
The lack of positive selection in DDR genes is unlikely to be due to the short period over which C57BL/6 J was investigated. The 30 generations of C57BL/6 J over 20 years are equivalent to roughly 800 years in humans when one mouse generation is equated with one human generation of about 30 years. Studies have shown that many BRCA variants in humans arose in recent human history. For example, 185delAG in BRCA1, a founder variant in the Ashkenazi Jewish population, arose around 750–1500 years ago; 1499insA in BRCA1, a founder variant in Tuscany, Italy, originated 750 years ago; and BRCA1 c.5266dupC, another founder variant in the Ashkenazi Jewish population, originated 1800 years ago.
It is possible that animals kept in a long-term protected laboratory environment experience relaxed selection pressure, leading to altered genetic variation. If the time period is long enough and the starting genome sequences are available, testing genetic variation in wild mice would determine whether this possibility applies to the observation made in C57BL/6 J in our study.
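The per-gene ranking and the Sanger validation rate reported above are simple tallies. The sketch below shows one way to compute them; the toy gene list (including the placeholder symbol `GeneX`) is hypothetical, and only the validation counts (10 of 15) come from the text.

```python
from collections import Counter


def rank_genes(gene_per_variant):
    """Count variants per gene and return (gene, count) pairs, highest first."""
    return Counter(gene_per_variant).most_common()


# Toy annotation output: one gene symbol per retained variant (hypothetical).
toy_annotations = ["Mroh2a", "Mroh2a", "Mroh2a", "C4b", "C4b", "GeneX"]
ranking = rank_genes(toy_annotations)
print(ranking)  # [('Mroh2a', 3), ('C4b', 2), ('GeneX', 1)]

# Sanger validation rate, using the counts reported in the text.
validated, tested = 10, 15
rate_percent = round(100 * validated / tested)
print(rate_percent)  # 67
```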
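The generation-to-years conversion used above is a single multiplication, and the result depends on the human generation time that is assumed. The short sketch below works through it; the function name is introduced here for illustration, and the 30-year value is the one quoted in the text.

```python
MOUSE_GENERATIONS = 30  # generations of C57BL/6 J between 1998 and 2018


def human_equivalent_years(generations: int, years_per_generation: float) -> float:
    """Convert a number of generations into human years."""
    return generations * years_per_generation


# With a 30-year human generation, 30 generations span 900 years.
print(human_equivalent_years(MOUSE_GENERATIONS, 30))  # 900

# The 800-year figure quoted in the text corresponds to a generation
# time of about 26.7 years.
implied_generation_time = 800 / MOUSE_GENERATIONS
print(round(implied_generation_time, 1))  # 26.7
```

Either value places the period in the same range as the founder-variant ages listed above (750 to 1800 years).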
The reference genome sequences used can affect variant identification. After the mouse genome project was accomplished in 2001, 10 different versions of the C57BL/6 J genome reference sequences were generated, from the first version, mm1, released in 2001, through mm10, released in 2011, before mm39 was released in 2020 (https://genome.ucsc.edu/FAQ/FAQreleases.html). The different versions of the mouse genome reference sequences used essentially the same raw sequence data generated by the mouse genome project, but the variation data differed substantially between versions, which is unlikely to reflect true variation and more likely reflects annotation artifacts. As such, using all of the different versions as references for variant identification could lead to high complexity and data inconsistency, and decrease the reliability of the resulting variation data. On the other hand, using a single version of the reference sequences could miss potential variants not identifiable in that version. To address these concerns, we used two later versions of the mouse genome reference sequences, mm7 and mm10, as references for variant identification; we also used the variation data from the B6Eve genome sequences derived from 2003 C57BL/6 J DNA as another reference; and we further used Sanger sequencing to validate selected variants. The combined use of these approaches ensured the reliability and sensitivity of the variants identified in our study for addressing the issue of evolutionary selection in the DDR system in C57BL/6 J.
The evidence for positive selection in DDR genes comes mainly from BRCA in humans, chimpanzees and bonobos. We propose the following explanation for why positive selection in BRCA exists in humans and their close relatives but not in other mammals, as represented by the laboratory mouse C57BL/6 J. The basic function of BRCA is to repair DNA double-strand breaks in order to maintain genome stability in mammals. Like many genes involved in essential biological functions, BRCA must be kept stable to perform its essential work. During evolution, however, BRCA in humans, chimpanzees and bonobos acquired new functions, such as enhancing intellectual development, regulating gene expression, and supporting reproduction. Positive selection on these functions is beneficial for better fitness, whereas BRCA in other mammals retains only the classical function of DNA damage repair and therefore remains highly stable in order to preserve genome stability. This explanation may also apply to other DDR genes. It will be interesting to look for further supporting evidence in different mouse strains and different species.
DDR genes in the laboratory mouse strain C57BL/6 J were not under positive selection across the 30-generation period, highlighting the possibility that the DDR system in rodents is evolutionarily stable.
The C57BL/6 J mice used in this study were purchased from Jackson Laboratory in 2017 and inter-bred for 4 generations in the University of Macau Animal Facility. Mouse genomic DNA was extracted in 2018 from the tails of 44 C57BL/6 J mice (15 male and 29 female) using the DNeasy Blood & Tissue Kit (Qiagen) following the manufacturer's instructions. The study was approved by the University of Macau Animal Welfare Committee (UMARE-041-2017) and was carried out in accordance with relevant guidelines and regulations.
Exome sequencing, mapping and variant calling
Exome sequencing was performed in paired-end mode (2 × 150) at > 100x coverage on an Illumina HiSeq 2500 through the Novogen customer service (Novogen, Hong Kong). Sequences were aligned to the mouse reference genome sequences mm7 and mm10 using the MEM module of BWA 0.7.17 and sorted using the sort option of Samtools v1.9. Duplicates were removed by Picard in the Genome Analysis Tool Kit (GATK) v22.214.171.124. The IndelRealigner, BaseRecalibrator and ApplyBQSR options in GATK were used for BAM data processing. GenotypeGVCFs in GATK was used to call variants from BAM files, and Annovar was used for annotation. A variant allele frequency of 20% was used as the cutoff for variant calling. CrossMap was used to convert the variants identified against mm7 into mm10 coordinates to generate a single mm10-based set of variants. The B6Eve variants comprise 2652 coding variants identified in the 2003 C57BL/6 J genome that differ from the 1998 C57BL/6 J-based mouse genome reference sequence GRCm38 (Supplementary Table 6). The 3 variants chr11: 3186080 G > A, chr11: 3187266 C > T, and chr11: 3187367 T > C in Sfi1 were eliminated from the mapping analysis because they were determined to be artifacts by the B6Eve study.
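Two of the filtering rules described above, the 20% variant allele frequency cutoff and the exclusion of the three Sfi1 artifact positions, can be sketched as follows. This is an illustration under stated assumptions rather than the pipeline itself: the record layout, the read counts, and the choice of an inclusive cutoff (VAF of at least 20%) are assumptions; only the three Sfi1 coordinates and the 20% value come from the text.

```python
VAF_CUTOFF = 0.20  # 20% cutoff from the text; inclusive comparison is an assumption

# The three Sfi1 positions flagged as artifacts, as listed in the text.
SFI1_ARTIFACTS = {
    ("chr11", 3186080, "G", "A"),
    ("chr11", 3187266, "C", "T"),
    ("chr11", 3187367, "T", "C"),
}


def vaf(ref_reads: int, alt_reads: int) -> float:
    """Variant allele frequency: alternate reads over total informative reads."""
    depth = ref_reads + alt_reads
    return alt_reads / depth if depth else 0.0


def keep(call: dict) -> bool:
    """Keep a call that is not a known artifact and meets the VAF cutoff."""
    key = (call["chrom"], call["pos"], call["ref"], call["alt"])
    if key in SFI1_ARTIFACTS:
        return False
    return vaf(call["ref_reads"], call["alt_reads"]) >= VAF_CUTOFF


# Hypothetical calls (read counts are made up for illustration).
calls = [
    {"chrom": "chr1", "pos": 100, "ref": "A", "alt": "G", "ref_reads": 50, "alt_reads": 50},
    {"chrom": "chr1", "pos": 200, "ref": "C", "alt": "T", "ref_reads": 90, "alt_reads": 10},
    {"chrom": "chr11", "pos": 3186080, "ref": "G", "alt": "A", "ref_reads": 40, "alt_reads": 60},
]
kept = [c for c in calls if keep(c)]
print([c["pos"] for c in kept])  # [100]
```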
Source of DNA damage repair genes
DNA damage repair-related genes were downloaded from the KEGG DNA repair-related pathways (http://software.broadinstitute.org/gsea/msigdb). The gene list consists of 169 genes in 7 pathways: base excision repair (BER), DNA replication (DR), Fanconi anemia (FA), homologous recombination (HR), non-homologous end-joining (NHEJ), mismatch repair (MMR), and nucleotide excision repair (NER) (Table 2).
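Gene sets from MSigDB are commonly distributed as GMT files, in which each tab-separated line holds a set name, a description, and then the member genes. Assuming the pathway lists were obtained in that format (the text does not say), a combined gene list could be assembled as sketched below. The set names and gene symbols in the example are placeholders, not the real pathway contents.

```python
def parse_gmt(text: str) -> dict:
    """Parse GMT-formatted text into {set_name: set_of_genes}."""
    gene_sets = {}
    for line in text.splitlines():
        fields = line.split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        name, _description, *genes = fields
        gene_sets[name] = {gene for gene in genes if gene}
    return gene_sets


# Placeholder content: two toy pathways that share one gene.
example = (
    "PATHWAY_A\tplaceholder description\tGENE1\tGENE2\tGENE3\n"
    "PATHWAY_B\tplaceholder description\tGENE3\tGENE4\n"
)

gene_sets = parse_gmt(example)
all_genes = set().union(*gene_sets.values())
print(sorted(gene_sets))  # ['PATHWAY_A', 'PATHWAY_B']
print(len(all_genes))     # 4 : GENE3 is shared and counted once
```

Taking the union matters because a gene that belongs to more than one pathway should be counted once in the combined list.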
DDR: DNA damage repair
NHEJ: Non-homologous end joining
FA: Fanconi anemia pathway
BER: Base excision repair
NER: Nucleotide excision repair
BRCA: BRCA1 and BRCA2
Venkitaraman AR. Cancer susceptibility and the functions of BRCA1 and BRCA2. Cell. 2002;108(2):171–82.
Lou Z, Minter-Dykhouse K, Chen J. BRCA1 participates in DNA decatenation. Nat Struct Mol Biol. 2005;12(7):589–93.
Huttley GA, Easteal S, Southey MC, Tesoriero A, Giles GG, et al. Adaptive evolution of the tumour suppressor BRCA1 in humans and chimpanzees. Australian breast Cancer family study. Nat Genet. 2000;25(4):410–3.
Lou DI, McBee RM, Le UQ, Stone AC, Wilkerson GK, et al. Rapid evolution of BRCA1 and BRCA2 in humans and other primates. BMC Evol Biol. 2014;14:155.
Pavlicek A, Noskov VN, Kouprina N, Barrett JC, Jurka J, et al. Evolution of the tumor suppressor BRCA1 locus in primates: implications for cancer predisposition. Hum Mol Genet. 2004;13(22):2737–51.
Fleming MA, Potter JD, Ramirez CJ, Ostrander GK, Ostrander EA. Understanding missense mutations in the BRCA1 gene: an evolutionary approach. Proc Natl Acad Sci U S A. 2003;100(3):1151–6.
Burk-Herrick A, Scally M, Amrine-Madsen H, Stanhope MJ, Springer MS. Natural selection and mammalian BRCA1 sequences: elucidating functionally important sites relevant to breast cancer susceptibility in humans. Mamm Genome. 2006;17(3):257–70.
Sawyer SL, Malik HS. Positive selection of yeast nonhomologous end-joining genes and a retrotransposon conflict hypothesis. Proc Natl Acad Sci U S A. 2006;103(47):17614–9.
Demogines A, East AM, Lee JH, Grossman SR, Sabeti PC, et al. Ancient and recent adaptive evolution of primate non-homologous end joining genes. PLoS Genet. 2010;6(10):e1001169.
Clark AG, Glanowski S, Nielsen R, Thomas PD, Kejariwal A, et al. Inferring nonneutral evolution from human-chimp-mouse orthologous gene trios. Science. 2003;302:1960–3.
Ramirez C, Fleming M, Potter J, et al. Marsupial BRCA1: conserved regions in mammals and the potential effect of missense changes. Oncogene. 2004;23:1780–8.
Tavtigian SV, Deffenbaugh AM, Yin L, et al. Comprehensive statistical study of 452 BRCA1 missense substitutions with classification of eight recurrent substitutions as neutral. J Med Genet. 2006;43(4):295–305.
Schmid K, Yang Z. The trouble with sliding windows and the selective pressure in BRCA1. PLoS One. 2008;3(11):e3746.
Pettigrew C, Wayte N, Lovelock PK, et al. Evolutionary conservation analysis increases the colocalization of predicted exonic splicing enhancers in the BRCA1 gene with missense sequence changes and in-frame deletions, but not polymorphisms. Breast Cancer Res. 2005;7(6):R929–39.
Good BH, McDonald MJ, Barrick JE, Lenski RE, Desai MM. The dynamics of molecular evolution over 60,000 generations. Nature. 2017;551(7678):45–50.
Zhang YP. Rapid evolution of genes involved in learning and energy metabolism for domestication of the laboratory rat. Mol Biol Evol. 2017;34(12):3148–53.
Sarsani VK, Raghupathy N, Fiddes IT, Armstrong J, Thibaud-Nissen F, et al. The Genome of C57BL/6J “Eve”, the mother of the laboratory mouse genome reference strain. G3. 2019;9(6):1795–805.
Mouse Genome Sequencing Consortium, Waterston RH, Lindblad-Toh K, Birney E, Rogers J, et al. Initial sequencing and comparative analysis of the mouse genome. Nature. 2002;420(6915):520–62.
Devine D. How long is a generation? Science provide an answer. Ancestry. 2005;23(5):51–3.
Laitman Y, Feng BJ, Zamir IM, Weitzel JN, Duncan P, et al. Haplotype analysis of the 185delAG BRCA1 mutation in ethnically diverse populations. Eur J Hum Genet. 2013;21(2):212–6.
Marroni F, Cipollini G, Peissel B, D'Andrea E, Pensabene M, et al. Reconstructing the genealogy of a BRCA1 founder mutation by phylogenetic analysis. Ann Hum Genet. 2008;72(Pt 3):310–8.
Hamel N, Feng BJ, Foretova L, Stoppa-Lyonnet D, Narod SA, et al. On the origin and diffusion of BRCA1 c.5266dupC (5382insC) in European populations. Eur J Hum Genet. 2011;19(3):300–6.
Johnson MS, Gopalakrishnan S, Goyal J, Dillingham ME, Bakerlee CW, et al. Phenotypic and molecular evolution across 10,000 generations in laboratory budding yeast populations. Elife. 2021;10:e63910. https://doi.org/10.7554/eLife.63910.
Brocchieri L, Conway de Macario E, Macario AJ. hsp70 genes in the human genome: Conservation and differentiation patterns predict a wide array of overlapping and specialized functions. BMC Evol Biol. 2008;8:19.
Pao GM, Zhu Q, Perez-Garcia CG, Chou SJ, Suh H, et al. Role of BRCA1 in brain development. Proc Natl Acad Sci U S A. 2014;111(13):E1240–8.
Lane TF. BRCA1 and transcription. Cancer Biol Ther. 2004;3(6):528–33.
Smith KR, Hanson HA, Mineau GP, Buys SS. Effects of BRCA1 and BRCA2 mutations on female fertility. Proc Biol Sci. 2012;279(1732):1389–95.
We thank Dr. Laura Reinholdt for providing B6Eve variant data. We thank the Information and Communication Technology Office (ICTO) of the University of Macau for providing the High-Performance Computing Cluster (HPCC) facilities used in this study. This work was supported by grants from the Macau Science and Technology Development Fund (085/2017/A2, 0077/2019/AMJ), the University of Macau (SRG2017-00097-FHS, MYRG2019-00018-FHS), and the Faculty of Health Sciences, University of Macau (FHSIG/SW/0007/2020P and a startup fund) to SMW. The funding sources played no role in the design of the study, in the collection, analysis, and interpretation of data, or in writing the manuscript.
The study was approved by University of Macau Animal Welfare Committee (UMARE-041-2017).
The authors declare no competing interests.
Supplementary Table 1. Total variants detected in the 2018 C57BL/6J genome
Supplementary Table 2. Variants detected in the 2018 C57BL/6J genome present in at least 22 (50%) cases
Supplementary Table 3. A. Absence of coding variants in DDR genes by referring to mm7. B. Absence of coding variants in DDR genes by referring to mm10
Supplementary Table 4. Variants in non-DDR genes detected in C57BL/6J from 1998 to 2018
Supplementary Table 5. Sanger-validated non-DDR variants in 2018 C57BL/6J
Supplementary Table 6. Coding variants in B6Eve differing from the 1998 C57BL/6J genome
Wang, X., Wang, S.M. DNA damage repair system in C57BL/6 J mice is evolutionarily stable. BMC Genomics 22, 669 (2021). https://doi.org/10.1186/s12864-021-07983-7
- DNA damage repair
- C57BL/6 J
- Evolution selection