Massively-parallel sequencing of genes on a single chromosome: a comparison of solution hybrid selection and flow sorting
- Jamie K Teer1,3 (corresponding author),
- Jennifer J Johnston1,
- Sarah L Anzick2,
- Marbin Pineda2,
- Gary Stone2,
- NISC Comparative Sequencing Program11,
- Paul S Meltzer2,
- James C Mullikin1 and
- Leslie G Biesecker1
© Teer et al.; licensee BioMed Central Ltd. 2013
Received: 5 October 2012
Accepted: 20 March 2013
Published: 15 April 2013
Targeted capture, combined with massively-parallel sequencing, is a powerful technique that allows investigation of specific portions of the genome at lower cost than whole genome sequencing. Several methods have been developed, and improvements have resulted in commercial products targeting the human or mouse exonic regions (the exome). In some cases it is desirable to custom-target other regions of the genome, either to reduce the amount of sequence that is targeted or to capture regions that are not targeted by commercial kits. It is important to understand the advantages, limitations, and complexity of a given capture method before embarking on a targeted sequencing experiment.
We compared two custom targeted capture methods suitable for single chromosome analysis: Solution Hybrid Selection (SHS) and Flow Sorting (FS) of single chromosomes. Both methods can capture targeted material and result in high percentages of genotype identifications across these regions: 59-92% for SHS and 70-79% for FS. FS is amenable to current structural variation detection methods, and variants were detected. Structural variation was also assessed for SHS samples with paired end sequencing, resulting in variant identification.
While both methods can effectively target genomic regions for genotype determination, several considerations make each method appropriate in different circumstances. SHS is well suited for experiments targeting smaller regions in a larger number of samples. FS is well suited when regions of interest cover large regions of a single chromosome. Although whole genome sequencing is becoming less expensive, the sequencing, data storage, and analysis costs make targeted sequencing using SHS or FS a compelling option.
Keywords: Flow sorting, Flow cytometry, Targeted-sequencing, Sequencing, Genomic-capture, Chromosome, Genome
The genome can be interrogated in a random (whole genome shotgun, WGS) or directed (targeted sequencing) manner. Each approach has advantages and disadvantages [1, 2]. Targeted sequencing (or genomic capture) enriches a desired subset of a genome and therefore requires substantially less sequence to generate the needed coverage over the region of interest. As sequencing costs continue to fall, the cost difference for WGS and targeted sequencing also decreases. However, despite declining sequencing costs, the analytic costs (monetary and time) will still be larger for WGS experiments in the foreseeable future. Targeted sequencing therefore allows for reduced data generation and analysis costs, or allows for more samples to be sequenced.
Many of the available targeting methods rely on predesigned regions, which may not be suitable to address the scientific questions of a particular experiment. Exome sequencing (ES), for example, is generally limited to protein coding regions of genomes. While investigation of coding sequences can be powerful, significant evidence supports the critical function of the non-coding (and even non-genic) portions of the genome [3, 4]. Thus, there is a need for methods that can interrogate other customized subsets of the genome in a variety of organisms.
Hybridization capture technologies [5–8] allow for custom probe designs. The cost of the custom capture probes is generally proportional to the total size of the regions of interest. When regions of interest are large, and focused on a single chromosome, it becomes reasonable to capture the entire chromosome instead of using multiple custom hybridization reactions. Although a single chromosome can be a large target, it is a small fraction of the whole genome. For example, chromosomes in the human genome each make up only 2-8% of the total genome size. Chromosomal flow sorting to isolate specific chromosomes, although technically challenging, has undergone many improvements (for review, see [9, 10]). Flow sorting paired with massively-parallel sequencing has been reported in the sequencing of mouse chromosome 17 [11], barley chromosomes 1H [12] and 12 additional arms [13], and wheat chromosomes or arms 1A, 1B, and 1D [14], 5A [15], 7DS [16], 7BS [17], and 4A [18]. This targeting of isolated chromosomes is a powerful approach to understanding complex plant genomes. Flow sorting and sequencing has also been used on human genomes to identify translocation breakpoints in derivative chromosomes [19] and in a method to determine phase across a chromosome [20]. These results suggest FS is a powerful capture method for sequencing single chromosomes.
We present a comparison of Solution Hybrid Selection (SHS) (Agilent SureSelect) and Flow Sorting (FS) capture technologies to target a chromosome of interest for massively-parallel sequencing. We show that FS can be used to target the X chromosome of the human genome for the purpose of identifying genotypes and structural variations. We then compare sequencing efficiency, region of interest coverage, and genotype determination rates of SHS and FS. This comparison will be useful for researchers interested in the targeted sequencing of custom regions of interest, particularly when those regions can be found on a single chromosome.
Results and discussion
[Table labels only; values not recovered: Target sizes (bp); Coverage by demoX (%); Filtered reads on X; Realigned reads on X; Realigned bases on X; Total mean base coverage; ≥10x coverage (%); ≥20x coverage (%); Genotype coverage (%)]
Overlap of genotype determinations
[Table labels only; values not recovered: chrX nonPar CCDS; chrX nonPar UCSC; Covered by both; Unique to FS; Unique to SHS; Missed by both]
Indel and SV determination
[Table labels only; values not recovered: Indels - MPG; Indels - Pindel]
We also compared the overlap of indel determination in SHS and FS in the same sample. Although SHS used shorter, unpaired sequence reads, more MPG indels were determined than in FS (due to higher total mean base coverage for SHS). We therefore compared the FS BreakDancer/Pindel calls to SHS MPG calls (Figure 5B) and observed that, as above within a sample, the majority of SHS MPG indel determinations were also observed by FS BreakDancer/Pindel. This suggests both SHS and FS are sensitive to smaller indels, particularly with increasing sequence read depth and length.
Although FS identified many more SVs overall (Table 5), only two large deletions overlapped with the demoX target, compared to one large deletion for SHS-PE low. With more sequence, SHS-PE identified six large deletions, but no large insertions. To better understand specificity, we compared different size classes of insertion and deletion to the Database of Genomic Variants (DGV, http://projects.tcag.ca/variation/, [25]) as in . Medium deletions (100 bp-1 kb) detected by SHS-PE (2) and SHS-PE low (1) were not observed in DGV, but 22/26 (84.6%) observed in FS overlapped a DGV entry. Large deletions (≥1 kb) were observed in FS and SHS-PE, and half were observed in DGV (FS = 6/12, SHS-PE = 2/4). The overlap of the FS deletions with DGV exceeded that observed in  (67.8% for medium deletions, 43.4% for large), suggesting FS is able to reliably detect SVs. SV detection by SHS was limited by the target size, so the specificity of SV detections from this method is not clear.
When evaluating targeted sequencing methods, it is important to consider the genomic regions of interest. We have examined both FS, which targets an entire chromosome, and SHS, which targets defined regions. Our comparisons focused on evaluating capture method effectiveness. Of the filtered reads generated, FS had a lower percentage aligned to chromosome X. This was due to the large off-target FS capture of chromosomes 7 and 8, which was caused by imperfect separation of chromosomes during flow sorting (Figure 2). The degree of off-target capture depends on the chromosome being investigated, as some chromosomes can be separated more effectively than others. For example, chromosomes 1, 2, 3, and 4 are typically resolved as single peaks whereas chromosomes 9, 10, 11, and 12 are clustered and less easily separated from each other. Recently, the use of increased power settings for the laser in the cell sorter was shown to improve the resolution of the flow karyotypes (even for chromosomes 9–12) and is therefore a more attractive approach for projects involving massively parallel sequencing of flow sorted chromosomes [26]. SHS results in more on-target sequence reads than FS, but it too results in significant amounts of off-target sequence. Sequencing efficiency (the amount of sequence data required to achieve a given coverage across all bases) also contributes to the effectiveness of capture. We evaluated efficiency by examining the distribution of read depths across ROIs. The SHS method was less efficient, as a broader distribution of read depths was observed. In contrast, FS had a tighter distribution of coverage. More importantly, when total mean base coverage was equivalent, FS had a slightly higher genotype determination rate. Although adding more sequence to an SHS experiment increased the genotype determination rate, there were still a number of bases with little or no sequence coverage, likely due to poor capture.
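The link between the spread of read depths and efficiency can be illustrated with a small sketch. The function name `coverage_summary` and the two depth lists are hypothetical and are not data from this study; they simply show how two regions with the same mean depth can differ in the fraction of bases reaching a ≥10x or ≥20x threshold.

```python
from statistics import mean


def coverage_summary(depths, thresholds=(10, 20)):
    """Summarise per-base read depths over a region of interest.

    Returns the mean depth and the percentage of bases at or above
    each depth threshold.
    """
    if not depths:
        raise ValueError("depths must not be empty")
    total = len(depths)
    summary = {"mean_depth": mean(depths)}
    for t in thresholds:
        covered = sum(1 for d in depths if d >= t)
        summary[f"pct_ge_{t}x"] = 100.0 * covered / total
    return summary


# Hypothetical per-base depths for two 8-base regions, both with mean depth 20.
even_depths = [18, 19, 20, 20, 20, 21, 21, 21]    # tight distribution
uneven_depths = [0, 2, 5, 9, 24, 30, 40, 50]      # broad distribution

even = coverage_summary(even_depths)
uneven = coverage_summary(uneven_depths)
print(even)
print(uneven)
```

With equal mean depth, the broad distribution leaves half of the bases below 10x, which is the sense in which a broader distribution requires more sequence to reach a given coverage at all bases.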
Genotypes could be determined at a majority of bases by both methods, but many bases (up to 19.5%) were covered by one method alone, suggesting a combined approach could be used to increase sensitivity. Both methods were amenable to indel and larger SV determination, and similar numbers were observed within the region of interest. While FS targeting may be less efficient than SHS, and SHS sequence efficiency may be less than FS, the two methods are effective for determining genotypes (with FS being slightly more sensitive). SHS has a design advantage in being able to target regions smaller than a single chromosome. It is therefore important to consider both capture method effectiveness and the target design when planning a targeted sequencing experiment.
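The four overlap categories used in this comparison (covered by both, unique to FS, unique to SHS, missed by both) can be computed with simple set operations. The following is a minimal sketch, not the custom Perl scripts used in this study; the function name and the toy positions are hypothetical.

```python
def overlap_categories(target, fs_called, shs_called):
    """Classify target positions by which method produced a genotype.

    All arguments are iterables of positions; calls outside the target
    are ignored. Returns {category: (count, percent_of_target)}.
    """
    target = set(target)
    if not target:
        raise ValueError("target must not be empty")
    fs = set(fs_called) & target
    shs = set(shs_called) & target
    counts = {
        "covered_by_both": len(fs & shs),
        "unique_to_fs": len(fs - shs),
        "unique_to_shs": len(shs - fs),
        "missed_by_both": len(target - fs - shs),
    }
    return {name: (n, 100.0 * n / len(target)) for name, n in counts.items()}


# Hypothetical 10-base target; position 11 lies outside it and is ignored.
result = overlap_categories(
    target=range(10),
    fs_called=[0, 1, 2, 3, 4, 5, 6, 7],
    shs_called=[2, 3, 4, 5, 6, 7, 8, 11],
)
print(result)
```

The four counts always sum to the target size, so the percentages can be read directly as a partition of the region of interest.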
Experimental cost and ease of use are also important when choosing a sequencing method. In this case, the cost of custom SHS probes for a ~3 Mb target region is similar to that of whole exome SHS probes. In order to cover a whole chromosome, multiple larger probe designs would be required. For example, while list prices (at the time of writing) for hybridization capture reagents range from $450-$1,250 per sample for 3 Mb, these prices rise to $4,500-$7,000 per sample to cover chromosome X (~155 Mb). Both methods require standard sequencer-specific library preparation. Standard library preparations allow for indexing, which can be used to combine multiple samples for sequencing in order to take advantage of newer high-output sequencing instruments. If we assume the need for 100x total mean base coverage for sensitive genotype determination, this would require at least 155 million 100 base pair reads for chromosome X. As of this writing, a current, widely used sequencer (Illumina HiSeq2000) can generate up to 375 million paired-end reads per lane, making the ability to pool samples essential. The SHS capture method was straightforward, and although some steps required long incubations, hands-on time was relatively low. The FS experiments required access to a flow sorting instrument, as well as the technical expertise to properly perform the chromosomal separations. In addition to this cost, sorting experiments are time consuming and require a large number of mitotic cells, which may be a barrier to high-throughput use of this method.
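The read requirement quoted above follows from a simple calculation: target size multiplied by desired mean coverage, divided by read length. The sketch below reproduces it, assuming a chromosome X size of about 155 Mb (the size implied by the 155 million read figure) and, as a lower bound, that every read is on target. The `on_target_fraction` parameter is an addition for illustration and is not a value reported in this study.

```python
import math


def reads_needed(target_bp, mean_coverage, read_length_bp, on_target_fraction=1.0):
    """Minimum number of reads for a given mean coverage over a target.

    on_target_fraction is the assumed share of reads that land on the
    target; 1.0 gives the theoretical lower bound.
    """
    if not 0 < on_target_fraction <= 1:
        raise ValueError("on_target_fraction must be in (0, 1]")
    return math.ceil(
        target_bp * mean_coverage / (read_length_bp * on_target_fraction)
    )


CHR_X_BP = 155_000_000           # assumed size of chromosome X
READS_PER_LANE = 375_000_000     # per-lane figure quoted in the text

reads = reads_needed(CHR_X_BP, mean_coverage=100, read_length_bp=100)
samples_per_lane = READS_PER_LANE // reads
print(reads, samples_per_lane)
```

Under these assumptions one lane holds two chromosome X samples at 100x. Lowering `on_target_fraction` to reflect off-target reads raises the requirement accordingly.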
Although both methods are capable of selecting regions of interest for massively-parallel sequencing, one may certainly be more appropriate than the other depending on the experimental goal. If investigators are targeting sub-chromosomal regions, SHS reagents and sequencing will be less costly, and easier to perform. However, if an investigator wants to sequence larger regions of interest on the same chromosome, or wishes to sequence structurally abnormal “marker” chromosomes, FS may be more appealing. The higher sequence efficiency of FS may partially offset the need to sequence a greater amount of captured DNA. Finally, custom SHS kits include reagents for a minimum sample batch size, and FS may offer a cost advantage when only one or two samples are needed. Conversely, SHS is more suitable for larger sample numbers as it is tailored for high-throughput experiments.
The ever-decreasing costs of massively-parallel sequencing are making whole genome sequencing more practical. However, there are still many advantages to targeting smaller subsets of the genome. Experimental cost is, as of this writing, still lower for targeted sequencing, even for a complete chromosome. The amount of data requiring analysis and storage is much lower for targeted sequencing experiments. Therefore, for a given financial and computational budget, more samples can be analyzed with targeting, increasing the power of an experiment. The lower analytical burden can also result in faster return of results. We have shown that SHS and FS are both effective at focusing sequencing efforts on a targeted subset of the genome. Each method fits specific needs, which will allow researchers with a wide variety of experimental designs and resources to take advantage of this powerful new technology.
The subjects were part of a National Institutes of Health IRB approved study (#94-HG-0193), and provided informed consent. A lymphoblastic cell line was cultured to achieve the high number of cells needed for a flow sorting experiment. Cells were grown and chromosomes were prepared as described [27].
Flow sorting and in-situ hybridization
Chromosome preparations were sorted as described [27]. Approximately 1.1 × 10^6 chromosomes were sorted using a dual laser cell sorter (FACS DiVa, Becton-Dickinson). This system allowed a bivariate analysis of both DNA content and base-pair composition.
For sort verification, approximately 500 chromosomes were sorted directly into PCR tubes containing 30 μL of water. The 6MW primer [28] was used in a primary degenerate oligonucleotide primed PCR (DOP-PCR) to amplify the DNA and then in a secondary PCR reaction to label the chromosomal DNA with biotin-dUTP. In situ hybridization and probe detection were carried out following common fluorescence in situ hybridization (FISH) procedures. Briefly, 300–400 ng of biotinylated PCR product was precipitated with 10 μg of human COT-1 (Invitrogen, Grand Island, NY) and then dissolved in 14 μL hybridization buffer. Following hybridization, slides were washed and the biotinylated probe was detected with avidin coupled with fluorescein (Vector Laboratories, Burlingame, CA).
Post-sorting DNA purification
DNA was prepared from 250 μL of flow sorted material by adding 15 μL 0.25 M EDTA/10% sodium lauroyl sarcosine and 2.5 μL proteinase K (20 mg/mL), and incubating overnight at 42°C. Following overnight incubation, 0.17 mM phenylmethylsulfonyl fluoride (PMSF) was added and incubated for 40 minutes at room temperature. Next, the DNA was purified with the QIAamp DNA Micro Kit (Qiagen, Valencia, CA) following the manufacturer’s recommended protocol. Purified DNA was used as template for shearing on the Covaris adaptive focused acoustics (AFA) sonicator (Covaris, Inc., Woburn, MA).
Solution hybrid selection
The SHS technique was performed using the SureSelect Human X Chromosome demonstration kit (Agilent Technologies Inc., Santa Clara, CA) according to the manufacturer’s instructions, with modifications as in .
Libraries were prepared and sequenced on a Genome Analyzer IIx (Illumina Inc., San Diego, CA) according to the manufacturer’s protocols.
Initial analysis was carried out using the standard Illumina software, including alignment of sequence reads with ELAND.
A secondary analysis was performed to recover reads that may not have mapped well due to insertions or deletions. For the paired-end data, reads were placed in bins of approximately 100 kilobases along the genome. If one member of a read pair was unaligned, it was placed in the same bin as its mapped mate. Reads were then realigned to the corresponding subsection of the genome using a gap-aware alignment program, cross_match (http://www.phrap.org/phredphrapconsed.html). Single-end sequencing was performed for the SHS capture, so the above approach was not effective (there were no mate pairs to rescue unaligned reads). We therefore used cross_match to realign all of the unmapped reads against the entire human genome, and then combined the realigned cross_match reads with the reads aligned by ELAND.
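The binning and mate-rescue step for paired-end data can be sketched as follows. This is an illustration of the idea only and not the pipeline used in this study: the read representation (a `(name, chrom, pos)` tuple with `chrom` set to `None` for unaligned reads), the function name, and the exact bin size of 100,000 bp are assumptions.

```python
from collections import defaultdict

BIN_SIZE = 100_000  # assumed; the text gives "approximately 100 kilobases"


def bin_read_pairs(pairs, bin_size=BIN_SIZE):
    """Assign paired-end reads to genomic bins for local realignment.

    pairs: iterable of (read1, read2); each read is (name, chrom, pos),
    with chrom and pos set to None when the read is unaligned.
    An unaligned read is placed in the bin of its mapped mate. Pairs in
    which neither read aligned cannot be rescued and are returned apart.
    """
    bins = defaultdict(list)
    unplaced = []
    for pair in pairs:
        mapped = [read for read in pair if read[1] is not None]
        if not mapped:
            unplaced.extend(read[0] for read in pair)
            continue
        for name, chrom, pos in pair:
            if chrom is None:
                # Borrow the coordinates of the mapped mate.
                _, chrom, pos = mapped[0]
            bins[(chrom, pos // bin_size)].append(name)
    return dict(bins), unplaced


# Hypothetical read pairs.
pairs = [
    (("a/1", "chrX", 150_000), ("a/2", "chrX", 150_200)),  # both aligned
    (("b/1", "chrX", 250_000), ("b/2", None, None)),       # mate rescued
    (("c/1", None, None), ("c/2", None, None)),            # not rescuable
]
bins, unplaced = bin_read_pairs(pairs)
print(bins, unplaced)
```

Each bin would then be realigned against only its own subsection of the genome, which keeps the gap-aware alignment tractable.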
In both cases, cross_match and ELAND outputs were converted to the SAM/BAM format [29], and genotypes were determined using the Most Probable Genotype (MPG) algorithm [22]. Structural variants were detected from paired-end data by first running BreakDancer [23] on the realigned BAM file (described above) using default settings. This output was then used to guide variant detection using Pindel [24] (Illumina-PairEnd mode, median insert size of 146 for FS, 182 for SHS-PE and SHS-PE low). Data analysis and comparison were performed using custom Perl scripts, as well as BED file manipulation programs from the bx-python package (https://bitbucket.org/james_taylor/bx-python/wiki/Home) and BEDTools [30], and VCF file manipulation programs from VCFtools (vcftools.sourceforge.net). Overlap of SVs with DGV was counted when a 50% reciprocal overlap was observed. Area-proportional Venn diagrams were prepared using the web tool 3Venn (https://www.cs.kent.ac.uk/people/staff/pjr/EulerVennCircles/EulerVennApplet.html).
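The 50% reciprocal overlap criterion can be expressed as a short function. This is a minimal sketch rather than the scripts used in this study. It assumes both intervals are on the same chromosome, uses half-open coordinates, and treats the threshold as "at least 50%"; the function name is hypothetical.

```python
def reciprocal_overlap(a_start, a_end, b_start, b_end, min_fraction=0.5):
    """Return True if two intervals on the same chromosome overlap
    reciprocally: the shared span must cover at least min_fraction of
    interval A and at least min_fraction of interval B.

    Coordinates are half-open: [start, end).
    """
    shared = min(a_end, b_end) - max(a_start, b_start)
    if shared <= 0:
        return False
    return (
        shared >= min_fraction * (a_end - a_start)
        and shared >= min_fraction * (b_end - b_start)
    )


# Hypothetical deletion calls compared with hypothetical database entries.
similar = reciprocal_overlap(1000, 2000, 1400, 2400)    # 600 bp shared, 60% of each
edge_only = reciprocal_overlap(1000, 2000, 1900, 5000)  # 100 bp shared
nested = reciprocal_overlap(1000, 2000, 1000, 4000)     # 100% of A, one third of B
print(similar, edge_only, nested)
```

The reciprocal requirement prevents a small call from being counted as a match merely because it falls inside a much larger database entry, as in the third example.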
SHS: Solution hybrid selection
WGS: Whole genome sequencing
ROI: Regions of interest
MPG: Most probable genotype.
This study was supported by the Intramural Research Programs of the National Human Genome Research Institute and the National Cancer Institute. This study is in memory of Gary Stone, not only an excellent flow cytometrist whose skill and dedication made this and many other studies possible, but a friend and colleague who is greatly missed.
- Teer JK, Mullikin JC: Exome sequencing: the sweet spot before whole genomes. Hum Mol Genet. 2010, 19 (R2): R145-151. 10.1093/hmg/ddq333.
- Biesecker LG, Shianna KV, Mullikin JC: Exome sequencing: the expert view. Genome Biol. 2011, 12 (9): 128. 10.1186/gb-2011-12-9-128.
- Birney E, Stamatoyannopoulos JA, Dutta A, Guigo R, Gingeras TR, Margulies EH, Weng Z, Snyder M, Dermitzakis ET, Thurman RE: Identification and analysis of functional elements in 1% of the human genome by the ENCODE pilot project. Nature. 2007, 447 (7146): 799-816. 10.1038/nature05874.
- Dunham I, Kundaje A, Aldred SF, Collins PJ, Davis CA, Doyle F, Epstein CB, Frietze S, Harrow J, Kaul R: An integrated encyclopedia of DNA elements in the human genome. Nature. 2012, 489 (7414): 57-74. 10.1038/nature11247.
- Albert TJ, Molla MN, Muzny DM, Nazareth L, Wheeler D, Song X, Richmond TA, Middle CM, Rodesch MJ, Packard CJ: Direct selection of human genomic loci by microarray hybridization. Nat Methods. 2007, 4 (11): 903-905. 10.1038/nmeth1111.
- Okou DT, Steinberg KM, Middle C, Cutler DJ, Albert TJ, Zwick ME: Microarray-based genomic selection for high-throughput resequencing. Nat Methods. 2007, 4 (11): 907-909. 10.1038/nmeth1109.
- Gnirke A, Melnikov A, Maguire J, Rogov P, LeProust EM, Brockman W, Fennell T, Giannoukos G, Fisher S, Russ C: Solution hybrid selection with ultra-long oligonucleotides for massively parallel targeted sequencing. Nat Biotechnol. 2009, 27 (2): 182-189. 10.1038/nbt.1523.
- Bainbridge MN, Wang M, Burgess DL, Kovar C, Rodesch MJ, D'Ascenzo M, Kitzman J, Wu YQ, Newsham I, Richmond TA: Whole exome capture in solution with 3Gbp of data. Genome Biol. 2010, 11 (6): R62. 10.1186/gb-2010-11-6-r62.
- Ibrahim SF, van den Engh G: High-speed chromosome sorting. Chromosome Res. 2004, 12 (1): 5-14.
- Dolezel J, Vrana J, Safar J, Bartos J, Kubalakova M, Simkova H: Chromosomes in the flow to simplify genome analysis. Funct Integr Genomics. 2012, 12 (3): 397-416. 10.1007/s10142-012-0293-0.
- Sudbery I, Stalker J, Simpson JT, Keane T, Rust AG, Hurles ME, Walter K, Lynch D, Teboul L, Brown SD: Deep short-read sequencing of chromosome 17 from the mouse strains A/J and CAST/Ei identifies significant germline variation and candidate genes that regulate liver triglyceride levels. Genome Biol. 2009, 10 (10): R112. 10.1186/gb-2009-10-10-r112.
- Mayer KF, Taudien S, Martis M, Simkova H, Suchankova P, Gundlach H, Wicker T, Petzold A, Felder M, Steuernagel B: Gene content and virtual gene order of barley chromosome 1H. Plant Physiol. 2009, 151 (2): 496-505. 10.1104/pp.109.142612.
- Mayer KF, Martis M, Hedley PE, Simkova H, Liu H, Morris JA, Steuernagel B, Taudien S, Roessner S, Gundlach H: Unlocking the barley genome by chromosomal and comparative genomics. Plant Cell. 2011, 23 (4): 1249-1263. 10.1105/tpc.110.082537.
- Wicker T, Mayer KF, Gundlach H, Martis M, Steuernagel B, Scholz U, Simkova H, Kubalakova M, Choulet F, Taudien S: Frequent gene movement and pseudogene evolution is common to the large and complex genomes of wheat, barley, and their relatives. Plant Cell. 2011, 23 (5): 1706-1718. 10.1105/tpc.111.086629.
- Vitulo N, Albiero A, Forcato C, Campagna D, Dal Pero F, Bagnaresi P, Colaiacovo M, Faccioli P, Lamontanara A, Simkova H: First survey of the wheat chromosome 5A composition through a next generation sequencing approach. PLoS One. 2011, 6 (10): e26421. 10.1371/journal.pone.0026421.
- Berkman PJ, Skarshewski A, Lorenc MT, Lai K, Duran C, Ling EY, Stiller J, Smits L, Imelfort M, Manoli S: Sequencing and assembly of low copy and genic regions of isolated Triticum aestivum chromosome arm 7DS. Plant Biotechnol J. 2011, 9 (7): 768-775. 10.1111/j.1467-7652.2010.00587.x.
- Berkman PJ, Skarshewski A, Manoli S, Lorenc MT, Stiller J, Smits L, Lai K, Campbell E, Kubalakova M, Simkova H: Sequencing wheat chromosome arm 7BS delimits the 7BS/4AL translocation and reveals homoeologous gene conservation. Theor Appl Genet. 2012, 124 (3): 423-432. 10.1007/s00122-011-1717-2.
- Hernandez P, Martis M, Dorado G, Pfeifer M, Galvez S, Schaaf S, Jouve N, Simkova H, Valarik M, Dolezel J: Next-generation sequencing and syntenic integration of flow-sorted arms of wheat chromosome 4A exposes the chromosome structure and gene content. Plant J. 2012, 69 (3): 377-386. 10.1111/j.1365-313X.2011.04808.x.
- Chen W, Kalscheuer V, Tzschach A, Menzel C, Ullmann R, Schulz MH, Erdogan F, Li N, Kijas Z, Arkesteijn G: Mapping translocation breakpoints by next-generation sequencing. Genome Res. 2008, 18 (7): 1143-1149. 10.1101/gr.076166.108.
- Yang H, Chen X, Wong WH: Completely phased genome sequencing through chromosome sorting. Proc Natl Acad Sci USA. 2011, 108 (1): 12-17. 10.1073/pnas.1016725108.
- Pruitt KD, Harrow J, Harte RA, Wallin C, Diekhans M, Maglott DR, Searle S, Farrell CM, Loveland JE, Ruef BJ: The consensus coding sequence (CCDS) project: Identifying a common protein-coding gene set for the human and mouse genomes. Genome Res. 2009, 19 (7): 1316-1323. 10.1101/gr.080531.108.
- Teer JK, Bonnycastle LL, Chines PS, Hansen NF, Aoyama N, Swift AJ, Abaan HO, Albert TJ, Margulies EH, Green ED: Systematic comparison of three genomic enrichment methods for massively parallel DNA sequencing. Genome Res. 2010, 20 (10): 1420-1431. 10.1101/gr.106716.110.
- Chen K, Wallis JW, McLellan MD, Larson DE, Kalicki JM, Pohl CS, McGrath SD, Wendl MC, Zhang Q, Locke DP: BreakDancer: an algorithm for high-resolution mapping of genomic structural variation. Nat Methods. 2009, 6 (9): 677-681. 10.1038/nmeth.1363.
- Ye K, Schulz MH, Long Q, Apweiler R, Ning Z: Pindel: a pattern growth approach to detect break points of large deletions and medium sized insertions from paired-end short reads. Bioinformatics. 2009, 25 (21): 2865-2871. 10.1093/bioinformatics/btp394.
- Iafrate AJ, Feuk L, Rivera MN, Listewnik ML, Donahoe PK, Qi Y, Scherer SW, Lee C: Detection of large-scale variation in the human genome. Nat Genet. 2004, 36 (9): 949-951. 10.1038/ng1416.
- Ng BL, Carter NP: Laser excitation power and the flow cytometric resolution of complex karyotypes. Cytometry A. 2010, 77 (6): 585-588.
- Stanyon R, Stone G: Phylogenomic analysis by chromosome sorting and painting. Methods Mol Biol. 2008, 422: 13-29. 10.1007/978-1-59745-581-7_2.
- Telenius H, Pelmear AH, Tunnacliffe A, Carter NP, Behmel A, Ferguson-Smith MA, Nordenskjold M, Pfragner R, Ponder BA: Cytogenetic analysis by chromosome painting using DOP-PCR amplified flow-sorted chromosomes. Genes Chromosom Cancer. 1992, 4 (3): 257-263. 10.1002/gcc.2870040311.
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R: The Sequence Alignment/Map format and SAMtools. Bioinformatics. 2009, 25 (16): 2078-2079. 10.1093/bioinformatics/btp352.
- Quinlan AR, Hall IM: BEDTools: a flexible suite of utilities for comparing genomic features. Bioinformatics. 2010, 26 (6): 841-842. 10.1093/bioinformatics/btq033.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.