Efficient linkage mapping using exome capture and extreme QTL in schistosome parasites
© Chevalier et al.; licensee BioMed Central Ltd. 2014
Received: 10 March 2014
Accepted: 14 July 2014
Published: 21 July 2014
Skip to main content
© Chevalier et al.; licensee BioMed Central Ltd. 2014
Received: 10 March 2014
Accepted: 14 July 2014
Published: 21 July 2014
Identification of parasite genes that underlie traits such as drug resistance and host specificity is challenging using classical linkage mapping approaches. Extreme QTL (X-QTL) methods, originally developed by rodent malaria and yeast researchers, promise to increase the power and simplify logistics of linkage mapping in experimental crosses of schistosomes (or other helminth parasites), because many 1000s of progeny can be analysed, phenotyping is not required, and progeny pools rather than individuals are genotyped. We explored the utility of this method for mapping a drug resistance gene in the human parasitic fluke Schistosoma mansoni.
We staged a genetic cross between oxamniquine sensitive and resistant parasites, then between two F1 progeny, to generate multiple F2 progeny. One group of F2s infecting hamsters was treated with oxamniquine, while a second group was left untreated. We used exome capture to reduce the size of the genome (from 363 Mb to 15 Mb) and exomes from pooled F2 progeny (treated males, untreated males, treated females, untreated females) and the two parent parasites were sequenced to high read depth (mean = 95-366×) and allele frequencies at 14,489 variants compared. We observed dramatic enrichment of alleles from the resistant parent in a small region of chromosome 6 in drug-treated male and female pools (combined analysis: = 11.07, p = 8.74 × 10-29). This region contains Smp_089320 a gene encoding a sulfotransferase recently implicated in oxamniquine resistance using classical linkage mapping methods.
These results (a) demonstrate the utility of exome capture for generating reduced representation libraries in Schistosoma mansoni, and (b) provide proof-of-principle that X-QTL methods can be successfully applied to an important human helminth. The combination of these methods will simplify linkage analysis of biomedically or biologically important traits in this parasite.
Here we validate the X-QTL approach for linkage analysis of a biomedically important trait (oxamniquine (OXA) resistance) in the human blood fluke Schistosoma mansoni for which the causative gene (Smp_089320 on chromosome 6) is now known . Schistosomiasis, the second most important tropical parasitic disease after malaria , affects an estimated 187 million people across Africa, Asia, and South America, killing over 200,000 people per year and is caused by blood flukes of the genus Schistosoma [12–14]. Among the three major schistosome species of medical importance, S. mansoni infects over 67 million people in Africa, Middle East and South America . The life cycle of this parasite includes a snail as intermediate host (Biomphalaria spp.) and humans as definitive host. However, the complete lifecycle can be easily maintained in the laboratory using snail and rodent hosts (Additional file 1: Figure S1) , and there is a rich literature describing heritable variation in a range of biomedically and biologically interesting traits such as drug resistance [17, 18], host specificity , and virulence . The 364.5 Mb genome of S. mansoni has been sequenced , and with the aid of a 5 cM genetic map  80% of the scaffolds have been assigned and ordered on chromosomes . The genome is composed of 7 autosomes and one pair of ZW sex chromosomes and contains 10,852 genes. Forty percent of the genome is composed of repetitive elements [21, 23].
We have recently exploited the genome sequence and genetic map to identify, by classical linkage mapping, a QTL region underlying parasite resistance to oxamniquine, one of the two drugs used in the treatment of S. mansoni infection . This is the first trait mapped in S. mansoni or in any human helminth infection and resulted in direct identification of the gene and mutations responsible for this trait. While classical genetic mapping is clearly feasible in this organism, it is extremely labor intensive. In the oxamniquine genetic cross, 2,856 individual snails were exposed to single larval stages to obtain 388 snails infected with F2 progeny, while measurement of resistance involved daily observation of worm death following drug exposure over a two week period for each of the F2 progeny genotypes. Hence mapping of OXA-resistance QTL required a concerted effort by three researchers over a two year period. We therefore sought to develop more efficient methods for linkage mapping in this parasite.
X-QTL methods require accurate measurement of genome wide allele frequencies within progeny pools. This has been done by pyrosequencing , microarray hybridization , and more recently by examining read depth of SNPs using next generation sequence data (Pool-seq) . The genome of S. mansoni is relatively large (364.5 Mb), and comprises ~40% repetitive sequences, so we sought to use reduced representation sequencing to minimize cost and maximize read depth in F2 progeny pools. Several methods using restriction enzymes have been developed for performing reduced representation sequencing for complex genomes (e.g. restriction-site-associated DNA sequencing (RAD-seq), reduced-representation libraries (RRL)) . We opted to use exome capture rather than restriction based methods for several reasons: (i) obtaining exome sequence data simplifies fine mapping of loci identified following initial QTL location, as most variants involved in phenotypic traits result from changes within coding sequences , (ii) we wanted to avoid sequencing repetitive regions that cannot be unambiguously aligned in a single genomic location (this is a particular concern for S. mansoni given the repeat content of this genome), (iii) polymorphisms within restriction sites have the potential to bias representation of parental alleles within progeny pools when using approaches such as RAD-seq . This can potentially generate spurious enrichment of particular alleles within progeny pools. A second aim of this work was therefore to evaluate the efficiency of exome capture methods for S. mansoni.
We staged a cross between an OXA-resistant (HR) male and an OXA-sensitive (LE) female parasites by infecting hamsters with 500 male and 500 female cercariae larvae of a single genotype. These were derived from snails infected with a single miracidium larva, which reproduce clonally giving rise to 1000s of single sex, single genotype cercariae larvae. Male and female F1 parasites were crossed in the same way and single F2 miracidia were used to infect snails. 130 infected snails, each shedding cercariae of a single recombinant genotype, were used in the pooling experiments. These single genotype cercariae comprised 39 male and 91 female genotypes.
Each of the F2 pools used to infect hamsters contained 754 cercariae, representing the 130 recombinant genotypes (39 male and 91 female) (Additional file 2: Figure S2). We balanced the sex ratio by adding 10 representatives of each male genotype and 4 representatives of each female genotype to the pools used for each infection (390 (39 × 10) male cercariae : 364 (91 × 4) female cercariae per pool). Importantly, the F2 pools used to infect each of the six hamsters were identical in their genetic composition, so the impact of OXA-treatment on genome-wide allele frequencies can be directly examined.
Forty F2 females and 16 F2 male adult worms (56 total) survived oxamniquine treatment in the OXA-treatment group (four hamsters). Higher female survival rate following OXA treatment has been observed previously . This represents a global survival rate of 1.86%, relative to the total number of cercariae used in the infections [56/(4 × 754) × 100 = 1.86%]. In the untreated group (two hamsters), 131 females and 147 males (278 total) were recovered, representing 18.44% of the infecting dose [278/(2 × 754) × 100 = 18.44%]. Hence OXA-treatment results in a ~10-fold reduction in survival relative to controls, consistent with strong drug selection.
A set of 125,046 120 bp baits was used to capture the S. mansoni exome (Additional file 3: File S1) in the parents and the pools of treated and untreated male and female worms. The baits included 124,983 in the nuclear genome and 63 in the mitochondrial genome. They covered 87.52% of the S. mansoni exons (59,801 of 68,326 exons) and accounted for 92.18% of the exome length (14,138,142 of 15,336,803 bp) but also included regions surrounding exons (total length: 14,748,899 bp). No baits were designed for capture exons in gene families that could not be unambiguously mapped to a single location in the reference genome. The sequences covered by baits are referred to as the bait regions. Each captured exon was covered by 2 baits.
Exome capture library statistics
F2 untreated females
F2 treated females
F2 untreated males
F2 treated males
Number of mapped reads
Number of reads mapping to bait regions
Percentage of reads mapping to bait regions
Number of reads mapping to exome
Percentage of reads mapping to exome
Comparison between Z-linked and non Z-linked regions2
Ratio of mean read depth
The statistics for each library indicate efficient exome capture (Table 2). We captured > 99% of the genome covered by the baits in the aligned sequencing data (Table 2). Furthermore, this high capture efficiency was consistent across all six libraries, as well as on each chromosome and on the unassembled scaffold sets.
Number and proportion of bait coordinates recovered in each library and from each chromosome
Number of baits
F2 untreated females
F2 treated females
F2 untreated males
F2 treated males
Mean Total (%)
We identified 14,489 variants in the bait regions of the nuclear genome that were segregating between parents and the F2 progeny. This represents one variant every 1.018 kb on average. 4.10% (600 variants) and 19.91% (2,885 variants) were heterozygous (allele frequency between 0.4 and 0.6) in male parent (HR) and in female parent (LE), respectively, suggesting elevated inbreeding in the male (HR) parent. Among the 14,489 variants, 14,132 were SNPs, 187 were insertions (1-22 bp), and 170 were deletions (1-16 bp).
Comparison of treated and untreated female pools shows 72 significant positions, while for treated and untreated male pools 118 positions were significant. Both comparisons reveal strongly significant enrichment of alleles from the resistant parent at the end of chromosome 6. In the female comparison, there were 40 significant positions between positions 227,149 and 4,189,610 (| | ≥ 4.70, p ≤ 1.27.10-6). Similarly, in the male comparison, there were 81 significant positions (| | ≥ 4.74, p ≤ 1.06.10-6) between positions 204,937 and 10,866,071. The most significant peak observed in both male and female comparisons was position 1,286,622 ( = 8.69, p = 1.86.10-18 in females; = 6.97, p = 1.60.10-12 in males). This peak is 233.343 kb away from the end of the causative gene (Smp_89320, position 1,519,965-1,524,755) involved in oxamniquine resistance, which was recently identified by conventional linkage methods . We also observed a second major region (unassembled scaffold SC_0154) that shows enrichment of alleles in both male and female comparisons. This scaffold accounts for 22 significant polymorphisms (| | ≥ 4.74, p ≤ 1.06.10-6) in females and 13 (| | ≥ 4.72, p ≤ 1.18.10-6) in males.
The remaining significant genome positions (10 in females and 24 in males) were not consistent between the two comparisons. For example, a region on chromosome Z at positions 12,660,165, 15,218,811, 16,201,782, and 23,950,413 (| | ≥ 4.74, p ≤ 1.04.10-6) accounted for 4 significant peaks only in females, while the other remaining 6 significant positions were found on chromosomes 1 and 2, and within unassigned scaffolds. In males only, a region of chromosome 2 (positions 17,450,959-29,107,158, | | ≥ 4.74, p ≤ 1.07.10-6) shows marginally significant values at 11 variants. The other remaining 13 significant positions were found on chromosomes 1, 3, 4, Z, and within unassigned scaffolds.
Plotting the combined Z-scores from the female and male strengthens the signal revealed in both male and female comparisons. Of 166 significant positions genome-wide, chromosome 6 (between positions 55,761 and 10,770,876), accounts for 79 significant positions (| | ≥ 4.74, p ≤ 1.08.10-6), including the most significant (position 1,286,622, = 11.07, p = 8.74.10-29). The unassembled scaffold SC_0154 previously identified in both female and male mapping was confirmed in this combined mapping, with 37 significant positions (| | ≥ 4.72, p ≤ 1.20.10-6). The other remaining 50 significant positions were found on chromosomes 1, 2, 3, 4, Z, and within unassigned scaffolds.
Capture was extremely efficient since more than 99% of the targeted regions were recovered and sequenced to high read depth in the six libraries (95-366×) on two lanes of an Illumina HiSeq. Furthermore, while the exome constitutes just 4% of the genome, between 32% to 77% of the capture sequences mapped to the bait regions. The exome bait sequences also captured adjacent non-coding DNA, with sequence read depth dropping to 50% of mean exonic read depth 250 bp from the edge of bait regions, as commonly found in exome capture experiments with other organisms . This is an added bonus for functional genomics applications as promoter polymorphisms and other regulatory sequences are often located within 250 bp of coding regions . We also obtained low read depth of sequence from genome regions up to 50 kb distant from bait regions.
Genotyping of modest numbers of microsatellite loci (7-12) is widely used for population genetics of schistosomes [34, 35]. In this study we genotyped 14,489 exonic SNPs (one every 1.018 kb on average) segregating in a single genetic cross. Dense genotyping by exome sequencing will provide a powerful approach for population genomic analysis of schistosome populations. A complication of working with schistosome infections of humans is that adults are found in the blood vessels and only eggs can be collected from infected patients. However, exome sequencing of miracidia hatched from eggs should be possible following whole genome amplification .
Analysis of the relative read depth suggests that exome sequencing will be useful for analysis of copy number variation. Males carry two Z chromosomes while females carry only one Z chromosome and a W chromosome . Z and W chromosomes share extensive regions of homology (pseudo-autosomal regions), but there are also Z-linked regions that are present as a single copy in females and two copies in males. The presence of Z-linked regions can be clearly inferred from the read depth plot in the sex chromosomes of males and females (Figure 2). These regions correspond exactly to those identified by Lepesant et al. and carrying W specific sequences , and also correspond to the positions of markers showing Z-linked inheritance patterns in the cross used to construct the genetic map .
The X-QTL analysis also revealed another region linked to oxamniquine resistance. Both female and male comparisons identified multiple positions (22 in females and 13 in males) on an unassigned scaffold (SC_0154) showing significant peaks. We suggest two possible explanations. First, this scaffold could belong to the end of the chromosome 6, adjacent to the major QTL peak identified. Second, this scaffold could be positioned elsewhere in the genome and identify a second QTL region influencing resistance. We believe that the second explanation is very unlikely because the oxamniquine resistance trait shows a simple recessive inheritance  and only a single QTL was identified by classical linkage mapping .
The ability of this X-QTL study to correctly map the genome region underlying oxamniquine resistance is particularly remarkable given that two features of our experimental design were suboptimal. First, QTL is most powerful when large numbers of segregating progeny are examined . However, in this experiment just 16 male worms survived drug treatment and were included in the treated male pool, while 40 treated females comprised the treated female pool. Nevertheless, these small sample sizes were sufficient to reveal multiple highly significant variants within the known QTL region on chromosome 6. Second, we used whole genome amplification to generate sufficient amounts of initial DNA for exome capture and library preparation, rather than directly using genomic DNA. While whole genome amplification has been demonstrated to show high fidelity in S. mansoni , we expected that this procedure would add further experimental noise by reducing the precision of allele frequency estimation. In view of the efficiency of X-QTL to identify QTL regions for a simple Mendelian trait under suboptimal conditions (i.e. using very small numbers of pooled progeny and whole genome amplification), we expect that this approach will be extremely powerful for investigating the basis of more complex traits, when experimental design is optimal (i.e. large numbers of F2 in progeny pools and no whole genome amplification).
We suggest that the small number of weakly significant peaks observed in regions outside the chromosome 6 QTL or scaffold SC_0154 are unlikely to impact OXA resistance. Instead we believe these peaks result from the small size of the population samples from OXA-treated pools. We note that the pooled F2 progeny were constructed using 10 male and 4 female cercariae of each genotype. Hence, while 16 male and 40 female adult worms were recovered from treated hamsters, the actual number of independent genotypes may have been lower than this (as few as 2 male genotypes and 10 female genotypes). Hence, our statistical correction of sampling effects may be insufficiently stringent. We suggest that future applications of X-QTL to schistosomes should focus on increasing the size of pools analyzed to minimize the impact of such sampling effects.
Besides simplifying QTL identification for simple or complex traits, X-QTL may be the only suitable method for mapping some traits, such as host specificity or compatibility. Schistosomes vary in ability to infect populations of host snails, and this variation has been shown to be heritable [19, 38]. While crosses can be made between parasites utilizing different snail hosts, classical QTL mapping is problematic because F2 parasites that fail to develop in a particular host snail cannot be genotyped. This problem can be resolved by X-QTL analysis, because allele frequencies can be compared between emerging parasite pools from the two host snail populations to identify the genome region(s) associated with specific development within each snail.
In summary, this study extends the use of X-QTL methods to a multicellular parasite, the human blood fluke S. mansoni, and demonstrates the utility of this approach for a wide range of multicellular eukaryotes. The approach was validated by remapping a QTL for oxamniquine resistance, which was recently located using classical linkage mapping methods. This method opens the way for future QTL mapping of traits of biomedical or biological significance in schistosomes such as praziquantel resistance, host specificity, and virulence to both snail and vertebrate host. The study also demonstrates that exome capture constitutes a cost effective approach for genotyping in X-QTL studies. For S. mansoni, this approach reduced genome size by ~25-fold, thereby reducing sequencing costs, while maximizing read depth and thereby allowing accurate measurement of allele frequency within pooled samples.
This study was performed in strict accordance with the recommendations in the Guide for the Care and Use of Laboratory Animals of the National Institutes of Health. The protocol was approved by the Institutional Animal Care and Use Committee of the University of Texas Health Science Center at San Antonio (Permit Numbers: 08093X and 11087X).
The two laboratory maintained populations of S. mansoni used in this study are the HR population (oxamniquine resistant) and the LE population (oxamniquine sensitive). Oxamniquine resistance in HR was selected in the laboratory. The HR population was isolated from a Puerto Rican child by Dr. E. Bueding, then transferred to Dr. Donato Cioli at the Laboratory of Cell Biology in Rome and submitted to further drug selection at Dr. Cioli’s laboratory . The LE population was originally isolated from a patient in 1965 in Belo Horizonte, Brazil, and has since been maintained in the laboratory . Laboratory isolates of schistosomes are genetically variable and are generally maintained as outcrossed populations. We therefore used isogenic parasites generated by infecting snails with single miracidia for conducting genetic crosses (see below).
We crossed an LE female to a HR male to generate F1 progeny. Subsequently, a male and female from the F1 were crossed to generate F2 progeny (reared to the adult stage).
For each generation, parasites were obtained as follows: Eggs were collected from the liver and hatched under light to obtain miracidia. Snails (B. glabrata) were then infected with only one miracidium. Because sex is determined in the zygote (which develops into a miracidium) by a ZW chromosomal system, monomiracidial infections allowed us to be certain that we were using single clonal types (that is, single genetic individuals of the same sex) in the crosses. After 28 days, snails were exposed to light to shed cercariae. Within the snail, schistosomes proliferate clonally, so the emerging cercariae from a snail infected with a single miracidium form a genetically homogeneous population of a single sex. Cercariae were sexed following the protocol described in Criscione et al. . Briefly, DNA from cercariae shed by monomiracidial infected snails was extracted using Chelex® and PCR was performed to discriminate between males and females using the W1 primers , which are specific to a repetitive region on the W chromosome in females. As this test depends on the failed amplification in males, we performed a concurrent PCR under the same conditions with the autosomal locus sc18  to ensure that the DNA had successfully been extracted from each sample. Upon identification of gender, snails were shed again to collect cercariae for infections. To establish parental and F1 crosses, hamsters (one for each generation) were exposed to 500 female cercariae of a single genotype and 500 male cercariae of a single genotype. For F2 progeny, 390 male cercariae representing 39 genotypes and 364 female cercariae representing 91 genotypes were used to infect each of 6 hamsters (Additional file 2: Figure S2). After 45 days, hamsters were euthanized and perfused to collect adult worms and eggs from the liver for the next generation .
The parent worms were separated by sex and pooled for DNA extraction.
We identified all snails that were infected with F2 schistosomes, and sexed the cercariae larvae emerging. We then made replicate pools of F2 cercariae for infecting hamsters. These pools (i) had identical composition of cercariae from each infected snail (Additional file 2: Figure S2). Hence, changes in allele frequency in control and treated animals at markers should reflect the action of drug selection.
Following maturation, the F2 adult worms were first selected for drug resistance in vivo. On day 43, 45, and 48 post-infection, the four F2 infected hamsters in the treatment group were treated by gavage with 50 mg.kg-1 oxamniquine solubilized in 2.5% (m/v) Cremophor® EL (Sigma-Aldrich). Two F2 infected hamsters were untreated, receiving only the diluent.
On day 50 of infection, hamsters were euthanized and perfused to collect adult worms . Adult worms were transferred to a petri dish containing DMEM culture media (Dulbecco-Modified Minimum Eagle’s Medium, high glucose (4.5 g.L-1), bicarbonate buffered (3.7 g.L-1)) supplemented with 20% newborn calf serum, 100 U.mL-1 penicillin, 100 μg.mL-1 streptomycin, and 0.5 μg.mL-1 amphotericin B. Worms perfused from the four treated hamsters were further treated in vitro with 500 μg.mL-1 oxamniquine solubilized in DMSO (Sigma-Aldrich) for 45 minutes, followed by 3 washes with drug-free medium and incubated at 37°C in an atmosphere of 5% CO2 for 12 consecutive days. The in vitro treatment was done to ensure that all drug sensitive worms were dead. Adult worms perfused from the two control hamsters were treated in the same way, but exposed only to the diluent. The surviving worms from treated and untreated control group were then separated by sex, counted and pooled for DNA extraction.
We prepared DNA from the two parental genotypes and from four pools of F2 progeny (untreated males, oxamniquine treated males, untreated females, oxamniquine treated females). Male and female worms were separated for three reasons: (i) males are bigger than females and contain more DNA, which biases allele frequency measurement in mixed pools, (ii) males and females carry different sex chromosomes (ZZ for males, ZW for females), so separation of males and females allows analysis of sex-linked traits, and (iii) male and female worms serve as internal replication of the experiment. Adult worms from each pool were placed in 200 μL of 6% Chelex® containing 0.2 mg.mL-1 of proteinase K, incubated at 56°C for 2 h, and boiled at 100°C for 8 minutes. For each pool, four independent whole genome amplifications (WGA) were performed using the GenomiPhi V2 DNA amplification kit (GE Healthcare)  and then pooled to minimize bias from the WGA. DNA was quantified using the Qubit® dsDNA BR assay (Invitrogen).
The S. mansoni exome was captured using the SureSelectXT Target Enrichment System (Agilent) according to the manufacturer’s protocol. Three micrograms of WGA DNA were used for each library preparation. The PCR cycles used for the pre-capture and post-capture amplifications were 4 and 12, respectively. The capture was performed using baits (120 bp RNA molecules) specifically designed by Agilent based on the exon annotation of the S. mansoni reference genome (gene file (v. 4): ftp://ftp.sanger.ac.uk/pub/pathogens/Schistosoma/mansoni/genome/Gene_models/ARCHIVE/v4.19.05.11.chado.filtered.gff; reference genome (v. 5): ftp://ftp.sanger.ac.uk/pub/pathogens/Schistosoma/mansoni/genome/Assembly-v5/sma_v5.0.chr.fa.gz). The bar coded libraries from the exome capture were sequenced using 100 bp pair-end method on two lanes of a HiSeq 2000 sequencer (Illumina). Raw sequence data has been submitted to the NCBI Sequence Read Archive under accession number SRP033214.
The sequencing data were aligned against the S. mansoni reference genome (v. 5) using BWA (v. 0.6.1)  and SAMtools (v. 0.1.18) . Realignment around indels, Q-score recalibration (using SNPs from ), and variant (SNP/indel) calling by the UnifiedGenotyper module were performed using GATK (v. 1.4-37) [45, 46]. Bait representation, capture efficiency and read depth analyses were performed using BEDTools (v. 2.14.3) .
The variant calling set was filtered using a minimum Q-score of 40, and a minimum read depth of 10, respectively, in the parental parasites. To refine the calling set, a minimum read depth of 40 was applied for the variants detected in the F2 related libraries. We retained only F2 variants that were segregating in the treated and/or untreated worm pools of each sex.
where 1 and 2 are the estimated allele frequencies in the treated and untreated pools, respectively; 0 is the allele frequency under the null hypothesis H0: 1 = 2 estimated from the average of 1 and 2; 1 and 2 are the number of worms in the treated and untreated pools, respectively, factor for each reflecting the ploidy state ( =2 except for the haploid Z-linked regions in females where =1); and 1 and 2 are the sequencing depths for the treated and untreated pools, respectively.
where Z f and Z m were -scores from female and male pools, respectively. The p-values were obtained by comparing the negative absolute value of -scores to the standard normal distribution. To determine the significant threshold, Bonferroni correction was calculated with α = 0.05. These analyses and the graphical representation of the results were conducted using R (v. 2.14.2) .
This study was supported by R01 AI097576-01 and R21 AI096277 grants to T.J.C.A. and WHO HQNTD1206356 grant to P.T.L. The molecular work at TBRI was conducted in facilities constructed with support from Research Facilities Improvement Program Grant C06 RR013556 from the National Center for Research Resources, NIH. The AT&T Genomics Computing Center supercomputing facilities were supported by the AT&T Foundation and the National Center for Research Resources Grant No. S10 RR029392. Biomphalaria glabrata snails were supplied by Fred Lewis (Biomedical Research Institute, Rockville, MD) under NIH-NIAID Contract No. HHSN272201000005I. We thank Matt Berriman at the Wellcome Trust Sanger Institute for providing the repeat masked sequence, Guilherme Oliveira for supplying the LE line, Marina McDew-White, Lai Zhao and Dawn Garcia for library preparation, Roy Garcia for Illumina sequencing, Ian Cheeseman for statistical advice and discussions, and Maria-Celeste Ramirez, Holly Barnes, and Yong Yi from Agilent for the bait design and technical advice.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.