When evaluating targeted sequencing methods, it is important to consider the genomic regions of interest. We have examined both FS, which targets an entire chromosome, and SHS, which targets defined regions. Our comparisons focused on evaluating capture method effectiveness. Of the filtered reads generated, FS had a lower percentage aligned to chromosome X. This was due to substantial off-target capture of chromosomes 7 and 8 by FS, caused by imperfect separation of chromosomes during flow sorting (Figure 2). The degree of off-target capture depends on the chromosome being investigated, as some chromosomes can be separated more effectively than others. For example, chromosomes 1, 2, 3, and 4 are typically resolved as single peaks, whereas chromosomes 9, 10, 11, and 12 are clustered and less easily separated from each other. Recently, the use of increased power settings for the laser in the cell sorter was shown to improve the resolution of flow karyotypes (even for chromosomes 9–12) and is therefore a more attractive approach for projects involving massively parallel sequencing of flow-sorted chromosomes. SHS results in more on-target sequence reads than FS, but it too produces significant amounts of off-target sequence. Sequencing efficiency (the amount of sequence data required to achieve a given coverage across all bases) also contributes to the effectiveness of capture. We evaluated efficiency by examining the distribution of read depths across ROIs. The SHS method was less efficient, as a broader distribution of read depths was observed. In contrast, FS had a tighter distribution of coverage. More importantly, when total mean base coverage was equivalent, FS had a slightly higher genotype determination rate. Although adding more sequence to an SHS experiment increased the genotype determination rate, a number of bases still had little or no sequence coverage, likely due to poor capture.
Genotypes could be determined at a majority of bases by both methods, but many bases (up to 19.5%) were covered by one method alone, suggesting that a combined approach could be used to increase sensitivity. Both methods were amenable to indel and larger SV determination, and similar numbers were observed within the region of interest. Although FS targeting may be less efficient than SHS targeting, and SHS sequencing efficiency may be lower than that of FS, both methods are effective for determining genotypes (with FS being slightly more sensitive). SHS has a design advantage in being able to target regions smaller than a single chromosome. It is therefore important to consider both capture method effectiveness and target design when planning a targeted sequencing experiment.
Experimental cost and ease of use are also important when choosing a sequencing method. In this case, the cost of custom SHS probes for a ~3 Mb target region is similar to that of whole exome SHS probes. To cover a whole chromosome, multiple larger probe designs would be required. For example, while list prices (at the time of writing) for hybridization capture reagents range from $450–$1,250 per sample for 3 Mb, these prices rise to $4,500–$7,000 per sample to cover chromosome X (150 Mb). Both methods require standard sequencer-specific library preparation. Standard library preparations allow for indexing, which can be used to combine multiple samples for sequencing in order to take advantage of newer high-output sequencing instruments. If we assume the need for 100x total mean base coverage for sensitive genotype determination, this would require at least 155 million 100 base pair reads for chromosome X. As of this writing, a current, widely used sequencer (Illumina HiSeq2000) can generate up to 375 million paired-end reads per lane, making the ability to pool samples essential. The SHS capture method was straightforward, and although some steps required long incubations, hands-on time was relatively low. The FS experiments required access to a flow sorting instrument, as well as the technical expertise to properly perform the chromosomal separations. In addition to this cost, sorting experiments are time consuming and require a large number of mitotic cells, which may be a barrier to high-throughput use of this method.
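The read-count estimate given above follows from simple arithmetic: reads required = target size × mean coverage / read length. The short Python sketch below reproduces that calculation and extends it to a rough pooling estimate. It is only an illustration under stated assumptions: the function names are hypothetical, a chromosome X length of 155 Mb is assumed (the value implied by the 155 million read figure), each reported read is treated as a single 100 bp read, and every read is assumed to be usable and on target unless an on-target fraction is supplied. Real experiments would need more reads because of off-target capture, duplicates, and uneven coverage.

```python
import math


def reads_required(target_bp: int, mean_coverage: float, read_length_bp: int) -> int:
    """Minimum reads so that reads * read_length / target_bp >= mean_coverage.

    Idealized: assumes every read aligns on target and contributes fully.
    """
    return math.ceil(target_bp * mean_coverage / read_length_bp)


def samples_per_lane(reads_per_lane: int, reads_per_sample: int,
                     on_target_fraction: float = 1.0) -> int:
    """Whole samples that fit in one lane, given the fraction of reads on target."""
    usable_reads = reads_per_lane * on_target_fraction
    return int(usable_reads // reads_per_sample)


# Assumed inputs (illustrative, not measured values from this study).
CHRX_BP = 155_000_000        # assumed chromosome X length
SHS_TARGET_BP = 3_000_000    # ~3 Mb custom target region
LANE_READS = 375_000_000     # per-lane output quoted in the text
COVERAGE = 100               # 100x total mean base coverage
READ_LENGTH = 100            # 100 bp reads

chrx_reads = reads_required(CHRX_BP, COVERAGE, READ_LENGTH)
shs_reads = reads_required(SHS_TARGET_BP, COVERAGE, READ_LENGTH)

print(f"Chromosome X: {chrx_reads:,} reads per sample")
print(f"3 Mb target:  {shs_reads:,} reads per sample")
print(f"Samples per lane (chromosome X): {samples_per_lane(LANE_READS, chrx_reads)}")
print(f"Samples per lane (3 Mb target):  {samples_per_lane(LANE_READS, shs_reads)}")
```

Under these idealized assumptions, a whole-chromosome target leaves room for only a couple of samples per lane, whereas a 3 Mb target allows many more, which is why indexing and pooling matter most for small targets. Passing a measured on-target fraction to `samples_per_lane` gives a more conservative estimate.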
Although both methods are capable of selecting regions of interest for massively parallel sequencing, one may be more appropriate than the other depending on the experimental goal. If investigators are targeting sub-chromosomal regions, SHS reagents and sequencing will be less costly and easier to perform. However, if an investigator wants to sequence larger regions of interest on the same chromosome, or wishes to sequence structurally abnormal “marker” chromosomes, FS may be more appealing. The higher sequence efficiency of FS may partially offset the need to sequence a greater amount of captured DNA. Finally, custom SHS kits include reagents for a minimum sample batch size, and FS may offer a cost advantage when only one or two samples are needed. Conversely, SHS is more suitable for larger sample numbers, as it is tailored for high-throughput experiments.
The ever-decreasing costs of massively parallel sequencing are making whole genome sequencing more practical. However, there are still many advantages to targeting smaller subsets of the genome. Experimental cost is, as of this writing, still lower for targeted sequencing, even for a complete chromosome. The amount of data requiring analysis and storage is much lower for targeted sequencing experiments. Therefore, for a given financial and computational budget, more samples can be analyzed with targeting, increasing the power of an experiment. The lower analytical burden can also result in faster return of results. We have shown that SHS and FS are both effective at focusing sequencing efforts on a targeted subset of the genome. Each method fits specific needs, which will allow researchers with a wide variety of experimental designs and resources to take advantage of this powerful new technology.