Exome sequencing of senescence-accelerated mice (SAM) reveals deleterious mutations in degenerative disease-causing genes

Background Senescence-accelerated mice (SAM) are a series of mouse strains originally derived from unexpected crosses between AKR/J and unknown mice, from which phenotypically distinct senescence-prone (SAMP) and -resistant (SAMR) inbred strains were subsequently established. Although SAMP strains have been widely used for aging research focusing on their short life spans and various age-related phenotypes, such as immune dysfunction, osteoporosis, and brain atrophy, the responsible gene mutations have not yet been fully elucidated. Results To identify mutations specific to SAMP strains, we performed whole exome sequencing of 6 SAMP and 3 SAMR strains. This analysis revealed 32,019 to 38,925 single-nucleotide variants in the coding region of each SAM strain. We detected Ogg1 p.R304W and Mbd4 p.D129N deleterious mutations in all 6 of the SAMP strains but not in the SAMR or AKR/J strains. Moreover, we extracted 31 SAMP-specific novel deleterious mutations. In all SAMP strains except SAMP8, we detected a p.R473W missense mutation in the Ldb3 gene, which has been associated with myofibrillar myopathy. In 3 SAMP strains (SAMP3, SAMP10, and SAMP11), we identified a p.R167C missense mutation in the Prx gene, in which mutations causing hereditary motor and sensory neuropathy (Dejerine-Sottas syndrome) have been identified. In SAMP6 we detected a p.S540fs frame-shift mutation in the Il4ra gene, a mutation potentially causative of ulcerative colitis and osteoporosis. Conclusions Our data indicate that different combinations of mutations in disease-causing genes may be responsible for the various phenotypes of SAMP strains.


Background
Aging is one of the most complex biological processes and is regulated by both genetic and environmental factors, but its molecular basis remains largely unknown [1]. Senescence-accelerated mice (SAM) are a series of inbred strains developed from the AKR/J strain, consisting of 9 senescence-prone strains (SAMP) and 4 senescence-resistant strains (SAMR) [2,3]. Compared with SAMR strains, SAMP strains exhibit accelerated-senescence phenotypes such as a short life span and early onset of various age-related pathological changes [4]. These SAM strains have therefore been used as a model to elucidate the mechanism of aging.
It has remained unknown why SAM strains exhibit different phenotypes, even though they were derived from a common ancestor [2,3]. Genetic analyses by use of biochemical and immunological markers and endogenous murine leukemia virus (MuLV) proviral markers revealed that each SAM strain constitutes a genetically distinct group. Comparisons of the SAM strains with their parental AKR/J strain indicated significant differences in genetic background between them, corroborating the hypothesis of the involvement of other strains, which underscores the probability of accidental outbreeding of the AKR/J strain in the course of the development of SAM [5,6].
Despite intense characterization of SAM strains, the genes responsible for accelerated senescence and pathologic changes in SAMP strains remain unidentified except for mutations in the Apoa2, Sfrp4, and Fgf1 genes [7][8][9]. Xia et al. performed genotyping for 581 microsatellite markers in 13 established SAM strains, and identified 4 loci that were different between the SAMP and SAMR strains [10], although the responsible genes remain unknown. Furthermore, genetic analysis of crosses between the SAMP1 and SAMR1 strains also suggested that combinations of multiple gene mutations are responsible for the phenotypes [11].
Recent advances in next-generation sequencing technologies have made it possible to rapidly determine the DNA sequence of the whole genome of individual humans [12,13]. As an alternative approach to whole-genome sequencing, whole-exome sequencing is an efficient strategy with regard to reducing the cost and workload [14,15]. Exome sequencing enables us to obtain information on functionally important coding regions. Although this type of sequencing is useful for identification of the cause of Mendelian disorders [16,17], it is difficult to explore genes responsible for complex traits by using this approach. The difficulty in identification of combined effects of various variants in humans is mainly ascribable to the presence of heterozygosity as well as homozygosity in humans [18]. In contrast, inbred mouse strains such as SAM strains are useful models to analyze the combined effects of genes because we can focus on homozygous variations only.
In the present study, we performed whole exome sequencing of 6 SAMP and 3 SAMR strains to identify the single-nucleotide variations (SNVs) in their entire exomes. We hypothesized that the accelerated-senescence phenotypes and short life span observed in SAMP strains are caused by coding-region mutations that are present specifically in SAMP strains but are absent in the SAMR strains. We obtained a full view of the exome signature of SAM strains and report herein several mutations that potentially cause various pathogenic phenotypes. Our data demonstrate that this innovative approach, whole-exome sequencing, is paving the way to the unraveling of the genetic mechanisms of accelerated senescence and pathogenic phenotypes in mouse models.

Whole-exome sequencing revealed exonic profiles of SAM strains
Whole-exome capture and next-generation sequencing were successfully performed on 11 mouse strains, i.e., SAMP1/SkuSlc, SAMP3/SlcIdr, SAMP6/TaSlc, SAMP8/TaSlc, SAMP10/TaSlc, SAMP11/SlcIdr, SAMR1/SlcIdr, SAMR1/TaSlc, SAMR3B/SlcIdr, AKR/J, and C57BL/6J, and generated on average 73 million single-end 50-bp reads per sample (Table 1). About 51 million reads, amounting to 1.7 gigabases of sequence, were aligned per sample, and the generated sequences achieved a mean read depth of 33.1 ± 11.8×. On average, 91.8 ± 0.4% of the target base pairs were covered by at least one read, and 72.9 ± 6.5% of the target base pairs were covered by at least 10 reads. After removal of low-quality reads, duplicates, and reads mapped outside the targeted regions, SNV detection was performed with Avadis NGS ver. 1.3. Whole-exome sequencing identified 85,198 to 112,245 SNVs for SAMP strains, 108,059 to 121,517 for SAMR strains, 100,463 for AKR/J, and 3,484 for C57BL/6J in the targeted regions relative to the reference mouse genome sequence (NCBI37/mm9; Table 2). The number of SNVs in C57BL/6J mice was much lower than in the other strains because the genome sequence of C57BL/6J was used as the reference for the mouse genome. These 3,484 SNVs in the C57BL/6J strain may be attributed to individual variability rather than to sequencing error; indeed, several studies have reported that minor genetic and phenotypic variations can be observed even among individual C57BL/6J mice [19,20]. Although the targeted regions included noncoding regions, we restricted our analysis to exonic SNVs: 32,019 to 35,817 variants for SAMP strains, 36,174 to 38,925 for SAMR strains, 32,816 for AKR/J, and 1,407 for the C57BL/6J strain. Exonic SNVs included 6,507 to 7,843 non-synonymous SNVs for each sample except for C57BL/6J.
Moreover, 230 to 491 novel non-synonymous SNVs were detected after comparison with the public database dbSNP128 and genome sequences of 17 inbred strains of laboratory mice [21]. In the same way, novel multiple nucleotide variants (MNV), frame-shift mutations, and nonsense mutations were detected in each strain (Table 2). We calculated the rates of false positive and false negative by validating 32 known and 61 novel SNVs by using the Sanger method (Additional file 1: Tables S1-S3). Although the false-positive rate was 14% for novel SNVs, the entire false-positive rate was 8.0%, indicating that high-quality calls for homozygous SNVs were gained.

No novel exonic mutations commonly detected among SAMP strains
Surprisingly, we detected no novel mutations that were present in all of the SAMP strains, but absent in the other strains (Additional file 1:  (Table 3). When we examined the effects of SNVs on protein function, Ogg1 p.R304W was predicted to be deleterious by both the SIFT and PolyPhen-2 programs (SIFT score: 0.00, PolyPhen-2 score: 0.999), and Mbd4 p.D129N was predicted to be deleterious only by the PolyPhen-2 program (SIFT score: 0.12, PolyPhen-2 score: 0.996). We performed functional enrichment analysis to determine whether common features could be detected among these 6 genes. GO analysis showed that "base-excision repair" (Ogg1 and Mbd4) was the most significantly overrepresented category (adjusted p-value = 0.0003; Table 4). The Ogg1 and Mbd4 genes were included in all of the top 5 overrepresented GO categories.
Only "response to stress" included Alox5 in addition to Ogg1 and Mbd4 (adjusted p-value = 0.0202). We also checked whether these 6 genes had been reported to be associated with the aging process by referring to the GenAge database [22], but none of them were recorded in the database (data not shown). The Ogg1 p.R304W mutation was previously observed in all of the SAMP strains, but this same mutation was also detected in the NZB/N, NFS/N, SJL/J, and NOD/ShiLtJ strains [23]. The Ogg1 gene encodes the enzyme 8-oxoguanine DNA glycosylase, by which oxidatively modified bases are repaired [24,25]. The methyl-CpG-binding domain protein, encoded by the Mbd4 gene, is also a DNA repair enzyme that is responsible for removing mismatched thymine or uracil within methylated CpG sites [26]. Similar to Ogg1 p.R304W, Mbd4 p.D129N was previously found in normal mouse strains including 129P2/OlaHsd, 129S1/SvImJ, 129S5SvEvBrd, DBA/2J, LP/J, NOD/ShiLtJ, and NZO/HlLtJ [21]. It is interesting that all of the SAMP strains as well as the NOD/ShiLtJ strain share mutations in these genes that are involved in DNA repair, i.e., Ogg1 and Mbd4. NOD/ShiLtJ is a mouse model of type 1 diabetes and has a short life span [27,28]. Nevertheless, we should be cautious about concluding that the combination of these mutations regulates the accelerated-senescence phenotype of SAMP, because the short life span of NOD/ShiLtJ is generally attributed to diabetes caused by insulitis.

Unique deleterious mutations identified in each substrain
We hypothesized that different disease phenotypes among SAMP strains are caused by deleterious SNVs that are unique to each strain or a subgroup of strains. We focused on novel non-synonymous SNVs specific to each strain. We extracted SAMP-specific novel non-synonymous SNVs, after excluding mutations in olfactory-receptor and vomeronasal-receptor superfamily genes or in pseudogenes. In addition to nonsense and frameshift mutations, we focused on dysfunctional mutations predicted to be deleterious by the SIFT or PolyPhen-2 programs. As a result, we detected 44 deleterious mutations. Subsequently, 31 of these deleterious mutations were validated by Sanger sequencing (Tables 5, 6, and 7; Additional file 1: Tables S5-S9). Among these 31 mutations, only 7 were shared by multiple strains (Table 5), whereas the other 24 mutations were each detected in only a single strain. Functional enrichment analysis of the genes carrying these 31 mutations showed that "gap junction channel activity" (Gja1 and Gja3) was the most significantly overrepresented category (p = 0.0043; Additional file 1: Table S10). However, the GO category "aging" and its subcategories were not significantly overrepresented. We also confirmed that none of the genes carrying these 31 mutations were recorded in the GenAge database (data not shown) [22]. We also detected 52 novel deleterious mutations in SAMR strains (Additional file 1: Table S11). These results are not surprising, because it has been reported that SAMR strains exhibit several diseases such as non-thymic lymphoma, histiocytic sarcoma, and ovarian cysts [29], although the SAMR strains have been used as control groups against the SAMP strains. Novel deleterious mutations including Fbxl13 p.S734N, Sh3bp5l p.R217W, Tnrc6a p.A278V, and Zkscan2 p.C232X were detected in all of the SAMP strains in addition to being found in several SAMR strains (Additional file 1: Table S12).
These mutations may be associated with susceptibility to diseases in SAMP strains as well as in SAMR ones.

Prx p.R167C mutation in SAMP3, SAMP10, and SAMP11 strains
Interestingly, we detected deleterious mutations in several genes that had been earlier reported to be associated with severe genetic disorders in both humans and mice. For example, the Prx p.R167C mutation (SIFT score: 0.01, PolyPhen-2 score: 0.998) was detected in 3 SAMP strains (SAMP3/SlcIdr, SAMP10/TaSlc, and SAMP11/SlcIdr; Table 5). The Prx gene encodes periaxin, a protein required for the maintenance of myelin [30]. Because myelin is necessary for the conduction of high-frequency and high-velocity nerve impulses by saltatory conduction, its defects lead to severe neuropathy. In humans, nonsense mutations in periaxin cause an autosomal recessive form of CMT4F (Dejerine-Sottas disease), which is one of the severe hereditary motor and sensory neuropathies [31,32]. It was also reported that periaxin-knockout mice exhibit peripheral demyelination, mechanical allodynia, and thermal hyperalgesia [33]. The Prx p.R167C mutation is located within the nuclear localization signal (NLS), which is necessary for this protein to be imported into the nucleus from the cytoplasm (Figure 1) [34]. Localization of periaxin in the nucleus is observed in murine embryonic Schwann cells only for a limited period of time [35]. Because the arginine at position 167 is highly conserved among mammalian species, this mutation in the NLS might disturb the transfer of periaxin into the nucleus, thereby adversely affecting the normal differentiation of Schwann cells.

Ldb3 p.R473W mutation in all of SAMP strains except for SAMP8
In the Ldb3 gene, encoding LIM domain-binding protein 3, the p.R473W mutation (SIFT score: 0.02, PolyPhen-2 score: 0.968) was detected in all of the SAMP strains except for SAMP8/TaSlc (Table 5). Ldb3 is a component of the sarcomere Z disk protein complex expressed in cardiac and skeletal muscles, and it is connected to calsarcin-1 and α-actinin [36]. Mutations in the Ldb3 gene are responsible for myofibrillar myopathy and dilated cardiomyopathy in humans [37,38]. In addition, LDB3 exon 4 is aberrantly spliced in myotonic dystrophy type 1 [39]. Pathological changes in skeletal and cardiac muscles of SAMP strains, however, have not been fully analyzed.

Gja3 p.S405P mutation in SAMP3, SAMP6, SAMP10, and SAMP11 strains
We detected the Gja3 p.S405P mutation (SIFT score: 0.09, PolyPhen-2 score: 0.917) in 4 SAMP strains (SAMP3/SlcIdr, SAMP6/TaSlc, SAMP10/TaSlc, and SAMP11/SlcIdr; Table 5). Gap junction protein alpha 3, encoded by Gja3, is specifically expressed in the plasma membrane of lens fiber cells to form gap junctions [40]. Gap junctions directly connect the cytoplasm of adjacent cells and allow various molecules and ions to pass freely between cells, thereby maintaining the osmotic and metabolic balance in the avascular lens. A large number of studies have reported the association of mutations of the GJA3 gene with cataract in humans [41,42].
We next analyzed deleterious SNVs unique to a single SAMP strain, i.e., not shared with other SAMP strains. We detected 7 deleterious mutations specific to the SAMP6/TaSlc strain, which has been used as a mouse model of osteoporosis or ulcerative colitis [43,44]. Among these 7 mutations, we focused on the Il4ra p.S540fs frameshift mutation (Figure 2, Table 6). The Il4ra p.S540fs frameshift immediately generates a stop codon at this position. Osteoporosis is caused by bone resorption in excess of bone formation. The differentiation of osteoclasts is promoted by RANKL, a membrane-bound cytokine, as well as by inflammatory cytokines such as TNF-α, IL-1, and IL-6 [45][46][47][48]. These proteins are mainly expressed in type 1 T-helper lymphocytes (Th1 cells). IL-4, a Th2 cytokine, suppresses the formation of Th1 cells to keep the proper balance between Th1 and Th2 cytokines. A functional loss of the Il4ra protein would lead to activation of Th1 cells. Therefore, both osteoporosis and ulcerative colitis in SAMP6 might be explained by activation of Th1 cytokines, which would be induced by the Il4ra p.S540fs frameshift mutation. The Zdhhc12 p.R112C mutation (SIFT score: 0.000, PolyPhen-2 score: 0.999) was also detected uniquely in SAMP6 (Table 6).
Zinc-finger DHHC domain-containing protein 12, encoded by the Zdhhc12 gene, has a predicted DHHC cysteine-rich palmitoyl acyltransferase domain [49]. Several gene mutations in the Zdhhc family have been implicated in human diseases and abnormal phenotypes of mice. Remarkably, Zdhhc13-truncated mutant mice develop alopecia, osteoporosis, and systemic amyloidosis [50], and the osteoporotic phenotype can be explained by the finding that protein palmitoylation regulates osteoblast differentiation through bone morphogenetic protein (BMP)-induced Osterix expression [51]. Thus, we speculate that the Zdhhc12 p.R112C mutation might contribute to the osteoporotic phenotype in SAMP6.

Aifm3 p.K582N mutation specific to SAMP8
Five of the detected deleterious mutations were unique to the SAMP8/TaSlc strain, which shows deficits in learning and memory, emotional disorder, and abnormal circadian rhythm at early ages (Table 7) [52,53]. It is remarkable that the p.K582N mutation in the Aifm3 gene (SIFT score: 0.01, PolyPhen-2 score: 0.879), encoding apoptosis-inducing factor mitochondrion-associated protein 3, was detected in SAMP8/TaSlc. Although the function of Aifm3 has not been fully elucidated, it has been reported that Aifm3 shares 35% homology with Aifm1 and that overexpression of Aifm3 induces apoptosis in HEK 293 cells [54]. Because the lysine at position 582 in Aifm3 is highly conserved among mammalian species, the p.K582N mutation may alter the function of Aifm3, contributing to the mitochondrial dysfunction in SAMP8 mice.

Discussion
Whole-exome sequencing identified new candidate mutations responsible for age-related phenotypes in SAMP strains
In the present study, we identified the entire spectrum of the SNVs in 6 SAMP and 3 SAMR strains by whole-exome sequencing. We summarized the candidate mutations regulating various pathogenic phenotypes in SAMP strains in Figure 3. Our study has clarified that several disease-causing mutations were common among multiple SAMP strains. Two of these mutations, Ogg1 p.R304W and Mbd4 p.D129N, were common among all SAMP strains and would be involved in the susceptibility to diseases via defects in DNA repair. In all SAMP strains except SAMP8/TaSlc, we detected a p.R473W missense mutation in the Ldb3 gene, which has been associated with myofibrillar myopathy. In 3 SAMP strains (SAMP3/SlcIdr, SAMP10/TaSlc, and SAMP11/SlcIdr), we identified a p.R167C missense mutation in the Prx gene, which has been linked to hereditary motor and sensory neuropathy. In 4 SAMP strains (SAMP3/SlcIdr, SAMP6/TaSlc, SAMP10/TaSlc, and SAMP11/SlcIdr), we detected a p.S405P missense mutation in the Gja3 gene, which is a cause of cataract. As candidate gene mutations responsible for strain-specific phenotypes, we detected 24 deleterious mutations specific to a single SAMP strain, including the Il4ra p.S540fs frameshift mutation in SAMP6/TaSlc, which is used as a model for osteoporosis, and the Aifm3 p.K582N mutation in SAMP8/TaSlc mice, which display deficits in learning and memory and mitochondrial dysfunction.
We detected the Ogg1 p.R304W and Mbd4 p.D129N deleterious mutations, which were common to all of the SAMP strains but absent in the SAMR and AKR/J strains, although these 2 mutations were also detected in other mouse strains. Whether a defect in the Ogg1 protein affects the life span of SAMP strains has already been investigated. Mori et al. reported that hybrid mice homozygous for the Ogg1 p.R304W mutation exhibited a complete loss of the glycosylase activity as well as a higher level of 8-oxoguanine in their hepatic nuclear DNA [23]. However, the average life span of the SAMP1×B10.BR hybrid did not differ among mice that were homozygous, heterozygous, or nullizygous (B10.BR allele) for the SAMP1 allele. Moreover, NZB/N, NFS/N, SJL/J, and NOD/ShiLtJ also have the Ogg1 p.R304W mutation. These results suggest that Ogg1 p.R304W alone is not sufficient to cause accelerated senescence and a short life span. We assume that the combination of Ogg1 p.R304W and Mbd4 p.D129N causes accelerated senescence. Both mutations were detected in the NOD/ShiLtJ strain, which is a type 1 diabetes model [27]. Although NOD/ShiLtJ mice may live for only 6 to 8 months due to diabetes under normal food and water conditions [28], we cannot conclude that these mutations are essential for the accelerated-senescence phenotype of SAMP because the cause of death is different between the SAMP and NOD/ShiLtJ strains. Nevertheless, mouse strains that possess the Ogg1 p.R304W mutation are known for their pathologic phenotypes: NZB/N for autoimmune hemolytic anemia, SJL/J for reticulum cell sarcomas, and NOD/ShiLtJ for type 1 diabetes [55,56]. Somatic mutations have been implicated in various diseases, and the accumulation of such mutations is one of the most accepted theories to explain aging. The Ogg1 p.R304W mutation might partly contribute to the phenotypes of these mouse strains as well as to the accelerated-senescence phenotypes of SAMP strains.
In several SAMP strains, missense mutations were detected in the Prx, Ldb3, and Gja3 genes, mutations of which have been found in various human degenerative diseases. The pathogenesis of myofibrillar myopathy and peripheral neurodegeneration has not been fully analyzed in SAMP strains. Age-related muscle atrophy and a decline in peripheral neuronal function are assumed to be common phenomena that probably occur in the course of the senescence process [57,58]. Nevertheless, genetic susceptibility to degeneration of skeletal muscle and peripheral neurons may be different among SAMP strains. The Prx p.R167C and Ldb3 p.R473W mutations possibly contributed to the degenerative phenotypes of 3 of the SAMP strains in the course of the senescence process.
In the present study, the Gja3 p.S405P mutation was detected in 4 SAMP strains (SAMP3, SAMP6, SAMP10, and SAMP11), among which only the SAMP3 strain is reported to develop cataract [59]. As a lack of reports of cataract in SAMP6, SAMP10, and SAMP11 does not indicate the actual lack of cataract, careful ophthalmologic examinations for cataracts in these 3 other SAMP strains may reveal a pathogenic association. Alternatively, because it is suggested that the pathogenic mechanism underlying the development of cataract in SAMP strains is different from that of murine hereditary cataract, which is generally regulated by single-gene mutations [60], the Gja3 mutation alone may not be sufficient to cause cataract. The SAMP3 strain may have additional mutations besides the Gja3 p.S405P mutation that are responsible for cataract. The Il4ra p.S540fs frameshift mutation can explain the osteoporosis observed in the SAMP6 strain from the viewpoint of osteo-immunology. It is known that IL-4 signaling inhibits osteoclast differentiation by suppressing Th1 cytokines such as RANKL, TNF-α, and IL-1. In fact, Il4 gene knockout mice are sensitive to RANKL-induced bone resorption [61]. A defect in Il4ra might thus enhance osteoclast differentiation due to dysregulation of Th1 cytokines. The Il4ra p.S540fs frameshift mutation can also explain the ulcerative colitis found in the SAMP6 strain. Although the true cause of ulcerative colitis remains unknown, abnormalities of the immune system are possibly related to its pathogenesis. Particularly, a high level of TNF-α was proposed to play an important role in disease progression [62]. In SAMP6 mice, it is expected that up-regulation of TNF-α expression in the colon would occur due to activation of Th1 cells. Thus, both of these pathogenic phenotypes, osteoporosis and ulcerative colitis, may be ascribable to the defect in Il4ra in SAMP6.
We also detected the Aifm3 p.K582N mutation in the SAMP8/TaSlc mice, which display deficits in learning and memory. High oxidative stress derived from brain mitochondrial dysfunction is thought to be one of the causes of age-related neurodegeneration in SAMP8 animals. Actually, decreased activities of NADH-cytochrome c reductase are observed even in 4-week-old SAMP8 mice, suggesting crucial defects in maintenance of the respiratory chain [63]. Aifm3 is likely to be related to mitochondrial maintenance, because it induces apoptosis in vitro and has an oxidoreductase domain, as is the case for Aifm1 [54], which plays roles in maintenance of the mitochondrial respiratory chain [64]. However, the actual roles of Aifm3 in apoptosis in the senescence process and the actual substrates of the oxidoreductase remain unknown. Further investigations are necessary to examine whether the Aifm3 p.K582N contributes to deficits in learning and memory via dysfunction of brain mitochondria.
Overall, as far as mutations in the coding regions are concerned, it seems that combinations of different disease-causing mutations specific to each strain cause various degenerative diseases, and that these combinations are a cause of the short life spans of SAMP strains. Indeed, it was reported earlier that the life spans of SAMP strains are susceptible to environmental conditions [65]. These observations may be ascribable to the multifactorial nature of the short life span of SAMP, unlike other progeroid mice whose life span is regulated by a single gene mutation. de Magalhaes et al. also reported that the Gompertz mortality curve of the SAMP was not different from that of the SAMR prior to age 1 year despite the difference in the age at which 50% of mice died, suggesting that the life spans of the SAMP strains may not be related to aging per se [66]. Nevertheless, we think that it is premature to conclude that SAMP strains are degenerative disease models rather than accelerated-senescence models, because in vitro studies have shown that primary-cultured cells from several SAMP strains show accelerated senescence and higher oxidative stress and mitochondrial dysfunction than those from the SAMR1 strain [67][68][69].

Limitation of the present study
Whole-exome sequencing using 50-bp single-end reads on the SOLiD4 platform is able to detect only single- or 2-base nucleotide variations and insertions/deletions. Because accelerated senescence and the various pathogenic phenotypes may not be explained completely by nucleotide substitutions in the coding regions, we cannot ignore the possibility that other types of genetic variation are also involved in the common accelerated-senescence phenotypes of SAMP strains. Fairfield et al. succeeded in identifying causative mutations in several ENU-induced mutants by exome sequencing, but failed to do so in several spontaneous disease models [70]. They suggested that mutations responsible for spontaneous disease models might reside in the non-coding regions. Indeed, it has been reported that most of the non-coding regions have some biochemical functions [71].
Carter et al. reported a 15-bp insertion mutation in the Fgf1 gene in SAMP10 [7], suggesting the involvement of a small structural variation in an exon of this gene. A long-read sequencing platform, which can generate over 200-bp fragments, would be required to detect such variations. It has been suggested that not only small structural variations in exons, but also large genomic structural variations such as copy number variations and gene translocations, contribute to the complex traits of humans [72]. Furthermore, because the complementary RNA probes are designed based on reference genome sequences, we were limited to finding variants relative to the reference sequence.
De novo assembly by whole-genome sequencing or mate-pair library sequencing and comparative genomic hybridization (CGH) array analysis should be performed to detect these sequence variations.
In the present study, we focused only on novel deleterious mutations that could be predicted by SIFT and PolyPhen-2. Although these bioinformatics tools are useful for narrowing down the candidate mutations, a recent study indicated that SIFT and PolyPhen-2 show correct prediction rates of 63% and 79%, respectively [73]. In the future, functional analyses should be conducted to confirm whether the mutations that were predicted to be deleterious in the present study really affect the functions of these genes.

Conclusions
Our study using whole-exome sequencing provides a list of candidate mutations that are potentially linked with various pathogenic phenotypes. As shown in Figure 3, 2 deleterious mutations in DNA-repair genes, i.e., Ogg1 p.R304W and Mbd4 p.D129N, were commonly present among SAMP strains, and these mutations would be expected to be involved in the genetic vulnerability to age-related diseases. Under such genetic backgrounds, deleterious mutations detected in each substrain may cause various pathogenic phenotypes. We revealed that only 7 SAMP-specific non-synonymous mutations were shared among substrains, although the mechanisms and development of accelerated senescence and short life span have been assumed to be the same among all of the SAMP strains. Furthermore, several SAMP strains had deleterious mutations in genes associated with hereditary diseases (e.g., Prx p.R167C, Ldb3 p.R473W, and Gja3 p.S405P), and these mutations have not been previously reported to occur in SAMP strains. These results suggest that comparison of age-related phenotypes among multiple SAMP strains and detailed histopathological reexamination are required. Phenotypic reports of specific SAMP strains have been biased by the researchers' interests. The current exome sequence data will prompt us to scrutinize as yet unnoticed pathological features. In addition to the exome database, construction of a comprehensive genome database of SAMP and SAMR strains will contribute not only to a better understanding of the fundamental aging process occurring in SAM strains but also to elucidation of the mechanisms of age-related diseases in humans as well as to the development of more effective interventions against them.

DNA extraction
Genomic DNA was extracted from the livers of 11 mouse strains, i.e., SAMP1/SkuSlc, SAMP3/SlcIdr, SAMP6/TaSlc, SAMP8/TaSlc, SAMP10/TaSlc, SAMP11/SlcIdr, SAMR1/SlcIdr, SAMR1/TaSlc, SAMR3B/SlcIdr, AKR/J, and C57BL/6J strains. RNase treatment was performed to obtain a high-quality DNA library. All experimental procedures using laboratory animals were approved by the Animal Care and Use Committee of the Tokyo Metropolitan Institute of Gerontology, the Institute for Developmental Research of the Aichi Human Service Center, and by Shinshu University School of Medicine.

Targeted capture and next-generation sequencing
Target enrichment was performed by use of a SureSelect XT Mouse All Exon kit (Agilent Technologies, Santa Clara, California, US) optimized for the ABI SOLiD system and 3 μg of genomic DNA according to the manufacturer's protocol. The kit is designed to enrich for 221,784 exons within 24,306 genes covering a total of 49.6 Mb of genomic sequence. DNA was sheared by acoustic fragmentation (Covaris, Woburn, Massachusetts, US) and purified with an Agencourt AMPure XP kit (Beckman Coulter, Brea, California, US). The quality of the fragmentation and purification was assessed with an Agilent 2100 Bioanalyzer. The fragment ends were repaired and adaptors were ligated to the fragments (Agilent). The modified DNA library was purified by using the Agencourt AMPure XP kit, amplified by PCR, and captured by hybridization to biotinylated RNA library baits (Agilent). Captured DNA was purified with streptavidin-coated magnetic Dynal beads (Life Technologies, Carlsbad, California, US) and amplified with Barcoding Primer. The prepared exome library was pooled, subjected to emulsion PCR, and sequenced on the SOLiD4 (Life Technologies) as single-end 50-bp reads. For each sample, 1 quad of a SOLiD sequencing slide was used.

Read mapping and variant analysis
Sequence reads were mapped to the reference mouse genome (UCSC mm9, NCBI build 37) by using Bioscope software version 1.3 (Life Technologies), which utilizes an iterative mapping approach. After removal of low-quality and duplicate reads, single nucleotide variants (SNVs) were detected with Avadis NGS software version 1.3 (Strand Life Sciences, Bangalore, Karnataka, India). Avadis NGS performs SNV identification via an adapted version of the MAQ algorithm, which calculates the probability that the consensus genotype is incorrect by using a Bayesian statistical model with mapping quality, base quality and ploidy taken into consideration. We set the criterion for SNV detection at a read coverage ≥ 2, and the other parameters were left at their default values. Detected SNVs were annotated by using Avadis NGS with UCSC transcript annotation to extract non-synonymous and homozygous SNVs. Moreover, we extracted novel SNVs by comparison with NCBI dbSNP build 128 and SNV data of 17 inbred strains of laboratory mice obtained by whole-genome sequencing. We compared the filtered SNVs among all strains to explore the mutations that were commonly present among the SAMP strains but absent in the SAMR strains, AKR/J strain, and C57BL/6J strain. The unique mutations of each strain were also selected.
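
The strain comparison described above amounts to set operations on the filtered SNV lists. The following is a minimal sketch, not the pipeline used in this study (which relied on Avadis NGS): the function names, the variant key (chromosome, position, reference allele, alternate allele) and the toy calls are all hypothetical and serve only to illustrate the logic.

```python
from typing import Dict, Set, Tuple

# Hypothetical variant key: (chromosome, position, reference allele, alternate allele).
Variant = Tuple[str, int, str, str]


def shared_specific(calls: Dict[str, Set[Variant]],
                    target: Set[str],
                    control: Set[str]) -> Set[Variant]:
    """Variants present in every target strain and absent from every control strain."""
    in_all_targets = set.intersection(*(calls[s] for s in target))
    in_any_control = set().union(*(calls[s] for s in control))
    return in_all_targets - in_any_control


def unique_to(calls: Dict[str, Set[Variant]], strain: str) -> Set[Variant]:
    """Variants found in one strain and in no other strain."""
    others = set().union(*(v for s, v in calls.items() if s != strain))
    return calls[strain] - others


# Made-up calls for illustration only (not data from this study).
a = ("chr1", 100, "C", "T")
b = ("chr2", 200, "G", "A")
c = ("chr3", 300, "A", "G")
toy_calls = {
    "SAMP1": {a, b, c},
    "SAMP8": {a, b},
    "SAMR1": {b},
    "AKR/J": set(),
}

samp_specific = shared_specific(toy_calls, {"SAMP1", "SAMP8"}, {"SAMR1", "AKR/J"})
samp1_only = unique_to(toy_calls, "SAMP1")
```

In this toy example, only variant `a` is shared by both SAMP strains and absent from the controls, and only variant `c` is unique to SAMP1.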

Interpretation of novel missense SNVs
To predict whether the candidate SNVs would have deleterious effects, we used 2 software programs, i.e., Sorting Intolerant from Tolerant amino acid substitutions (SIFT; J. Craig Venter Institute, San Diego, California, US, http://sift.jcvi.org/) and Polymorphism Phenotyping v2 (PolyPhen-2; Harvard University, Cambridge, Massachusetts, US, http://genetics.bwh.harvard.edu/pph2/). SIFT uses sequence homology to predict amino acid substitutions that will affect protein function and thus may contribute to a disease [74]. SIFT predicts substitutions with a score less than 0.05 as being "deleterious" (range: 0 to 1). PolyPhen-2 takes into account the physicochemical characteristics of the wild-type and mutated amino acid residues and the consequence of the amino acid change for the structural properties of the protein, in addition to evolutionary conservation [75]. PolyPhen-2 reports scores on a different scale, with the corresponding predictions being "probably damaging" for a score larger than 0.85, "possibly damaging" for a score between 0.15 and 0.85, and "benign" for a score less than 0.15. Because PolyPhen-2 considers only human protein sequences, the mouse SNVs were investigated in the context of human protein sequences.
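
The score thresholds given above can be written as two small classifiers. This is an illustrative sketch only: the function names are hypothetical, the label "tolerated" for SIFT scores of 0.05 or more is assumed, and assigning scores of exactly 0.15 or 0.85 to "possibly damaging" is an assumption, because the text does not state how boundary values are handled.

```python
def sift_call(score: float) -> str:
    """SIFT: scores below 0.05 are called deleterious (score range 0 to 1)."""
    # "tolerated" as the complementary label is an assumption of this sketch.
    return "deleterious" if score < 0.05 else "tolerated"


def polyphen2_call(score: float) -> str:
    """PolyPhen-2: >0.85 probably damaging, 0.15 to 0.85 possibly damaging, <0.15 benign."""
    if score > 0.85:
        return "probably damaging"
    if score >= 0.15:  # boundary handling at 0.15 and 0.85 is assumed
        return "possibly damaging"
    return "benign"


# Scores reported in Additional file 1 for two SAMP-specific mutations.
adamts5 = (sift_call(0.00), polyphen2_call(0.810))  # Adamts5 p.A335T
bank1 = (sift_call(0.13), polyphen2_call(0.946))    # Bank1 p.E684K
```

With these thresholds, Adamts5 p.A335T is called deleterious by SIFT and possibly damaging by PolyPhen-2, whereas Bank1 p.E684K is called tolerated by SIFT and probably damaging by PolyPhen-2.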

Mutation validation
Candidate SNVs were validated by standard Sanger sequencing. Primers flanking the candidate SNVs were designed by using Primer 3 version 4.0, and custom DNA oligos were ordered (Life Technologies; Operon Biotechnologies, Tokyo, Japan). Primer sequences are shown in Additional file 1: Tables S1-S2. PCR reactions were carried out in 10-μl reaction mixtures containing a 0.5 μM concentration of each primer, 0.2 mM dNTPs, 0.25 U Ex Taq DNA Polymerase Hot-Start Version, 1.0 μl 10× Ex Taq Buffer (Takara Bio, Shiga, Japan), and 1 μl of extracted DNA. The amplification conditions were 1 cycle of denaturation at 96°C for 5 min; 40 cycles of 94°C for 30 s, annealing at 55-68°C for 45 s according to the Tm value of each primer, and extension at 72°C for 45 s; followed by a final extension at 72°C for 10 min. PCR products were purified for sequencing by using a MultiScreen HTS PCR 96-Well Plate (Millipore, Billerica, Massachusetts, US). DNA templates were subjected to the sequencing reactions by using a BigDye Terminator version 3.1 Cycle Sequencing Kit (Life Technologies). The sequencing reaction solution contained 4 μl BigDye Terminator v3.1, 0.32 μM M13 forward primer, 1.75 μl 5× Sequence Buffer, and 2.0 μl PCR product in a final volume of 10 μl. The cycling conditions were 1 cycle of denaturation at 94°C and 25 cycles of 94°C for 10 s, 50°C for 15 s and 60°C for 3 min, followed by cleaning of the reaction products by ethanol precipitation. Capillary electrophoresis sequencing was performed by using an ABI Prism 3130xl Genetic Analyzer (Life Technologies), and sequence data were analyzed with Sequencher version 4.2.2 (Gene Codes, Ann Arbor, Michigan, US).

Gene Ontology enrichment analysis
Gene Ontology enrichment analysis (GO analysis) was performed by using WebGestalt (http://bioinfo.vanderbilt.edu/webgestalt/) [76]. The obtained p-values were adjusted for multiple testing by the Benjamini-Hochberg method, and the significance level was set at p<0.05.
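
For readers unfamiliar with the Benjamini-Hochberg adjustment, the following is a minimal sketch of the standard step-up procedure. It is not the WebGestalt implementation; the function name and the example p-values are hypothetical.

```python
from typing import List


def benjamini_hochberg(pvalues: List[float]) -> List[float]:
    """Return Benjamini-Hochberg adjusted p-values in the original input order."""
    m = len(pvalues)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: pvalues[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value (rank m) down to the smallest (rank 1),
    # scaling each by m / rank and enforcing monotonicity with a running minimum.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four GO terms.
raw = [0.01, 0.04, 0.03, 0.005]
adj = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adj]
```

A GO term is then called significant when its adjusted p-value is below 0.05, which is true for all four hypothetical terms here.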

Multiple alignment
Multiple sequence alignment was performed by using the Clustal Omega program on the UniProt website (http://www.uniprot.org/) [77].

Data access
Exome data were deposited in DDBJ Sequence Read Archive (BioProject Accession Number: PRJDB37).

Additional file
Additional file 1:
Table S1. Known SNV list validated by Sanger sequencing.
Table S2. Novel SNV list validated by Sanger sequencing.
Table S3. False-positive and false-negative SNV call rate.
Table S4. Novel SNVs detected among one or more of the SAMP strains, but absent in the SAMR and AKR/J strains.
Table S5. List of false-positive SNVs.
Table S6. Novel deleterious mutations specific to SAMP1/SkuSlc. We identified only the Adamts5 p.A335T mutation (SIFT score: 0.00, PolyPhen-2 score: 0.810) uniquely in SAMP1/SkuSlc. SAMP1 exhibits senile amyloidosis, impaired immune response, contracted kidney, and lung hyperinflation [78][79][80][81][82]. A disintegrin and metalloproteinase with thrombospondin motifs 5, encoded by the Adamts5 gene, functions as an aggrecanase to cleave aggrecan, a major proteoglycan of cartilage [83]. Adamts5 is responsible for aggrecan degradation in a murine model of osteoarthritis [84], and Adamts5 knockout mice are protected against cartilage degradation through abrogation of joint fibrosis and promoted deposition of cartilage aggrecan [85]. The hyperinflated lungs in SAMP1 result from increased lung compliance, which is related to age-related change in pulmonary elasticity [80]. Although the functions of Adamts5 in tissues other than cartilage have not been fully elucidated, the expression of Adamts5 was also observed in mouse lung. The Adamts5 p.A335T mutation therefore might play a role in hyperinflation of the lungs by affecting the pulmonary elasticity in SAMP1.
Table S7. Novel deleterious mutations specific to SAMP3/SlcIdr. Three deleterious mutations are specific to SAMP3/SlcIdr, which develops temporomandibular joint disease in an early stage of development [86]. The E684K missense mutation in the Bank1 gene (SIFT score: 0.13, PolyPhen-2 score: 0.946), encoding B-cell scaffold protein with ankyrin repeats, may be associated with temporomandibular joint disease, because polymorphisms in this gene are associated with susceptibility to connective tissue diseases such as rheumatoid arthritis and systemic lupus erythematosus in humans [87,88]. Although the inflammatory status of the temporomandibular condyle in SAMP3 was reported to be similar to that in other SAMP strains [86], we cannot exclude the possibility that the Bank1 p.E684K mutation contributes to the degenerative changes in the temporomandibular joint in SAMP3.
Table S8. Novel deleterious mutations specific to SAMP10/TaSlc. We detected 4 deleterious mutations specific to SAMP10/TaSlc. Although both SAMP10 and SAMP8 exhibit memory impairment, SAMP10 is distinct from SAMP8 in that cerebral atrophy occurs specifically in SAMP10 [89,90]. The Npr1 p.M630I mutation (SIFT score: 0.03, PolyPhen-2 score: 0.939) was detected uniquely in SAMP10. Npr1 encodes natriuretic peptide receptor 1 (NPR-A), which is a membrane-bound guanylate cyclase that serves as the receptor for both atrial natriuretic peptide (ANP) and brain natriuretic peptide (BNP) [91]. The main role of ANP/BNP signaling through NPR-A is decreasing systemic vascular resistance and blood pressure via increasing natriuresis [92,93]. The functions of natriuretic peptides in the nervous system also have been examined. NPR-A is mainly localized in glial cells but not in neuronal cells in several regions of the brain [94,95]. Although the cytokine-mediated neuroprotective glial responses are impaired in SAMP10 [96], further examination will be required to elucidate whether the Npr1 p.M630I mutation contributes to the degenerative brain disorder in SAMP10.
Table S9. Novel deleterious mutations specific to SAMP11/SlcIdr. Four deleterious mutations are specific to SAMP11/SlcIdr, which exhibits senile amyloidosis and contracted kidney, as well as diffuse medial thickening of the aorta [3,97]. We identified the G321R mutation in the Gja1 gene (SIFT score: 0.20, PolyPhen-2 score: 1.000), encoding gap junction protein, alpha 1, which is a component of intercellular channels connecting adjacent cells [98]. Gja1 is the major protein of gap junctions in the heart [99]. Mutations in GJA1 have been shown to cause several heart diseases [100,101]. Gja1 protein is also expressed in vascular smooth muscle and is necessary for vascular formation and maintaining vascular function [102,103]. Although the effect of a defect of Gja1 protein on vascular morphology is not consistent among studies, Liao et al. reported that the carotid arteries in smooth muscle cell-specific Gja1 gene knockout mice thicken after injury more extensively than those of wild-type mice [104], suggesting that disruption of normal gap junctional communication contributes to abnormal vascular phenotypes including diffuse thickening of the aorta in SAMP11.
Table S10. Top 5 overrepresented GO terms within the 31 genes including novel deleterious mutations detected among one or more of the SAMP strains, but absent in the SAMR and AKR/J strains.
Table S11. Novel deleterious mutations detected among one or more of the SAMP and SAMR strains, but absent in the AKR/J strain.
Table S12. Novel deleterious mutations detected among all of the SAMP strains and several SAMR strains, but absent in the AKR/J strain.

Competing interests
The authors declare that they have no competing interests.

Authors' contributions
KT and MT designed the experiments and drafted the manuscript. KT, EM and NF carried out the exome library preparations. KI carried out the emulsion PCR and the next-generation sequencing. KT carried out the series of bioinformatic analyses including read mapping, variant analysis, interpretation of the SNVs, and multiple alignment. KT carried out the mutation validation by Sanger sequencing. YK, ST, and SHI carried out the sample preparations. KO, HI, and YO supported the bioinformatic analyses. AS, MM, MH, and KH provided the SAM mice. YH, SH, IO, MI, SE, AI, NM, and MS critically reviewed and corrected the manuscript. TT, MH and MT conceived and supervised the entire study. All authors read and approved the final manuscript.