
Gene editing in the context of an increasingly complex genome

Abstract

The reporting of the first draft of the human genome in 2000 brought with it much hope for the future in what was felt as a paradigm shift toward improved health outcomes. Indeed, we have now mapped the majority of variation across human populations with landmark projects such as 1000 Genomes; in cancer, we have catalogued mutations across the primary carcinomas; whilst, for other diseases, we have identified the genetic variants with strongest association. Despite this, we are still awaiting the genetic revolution in healthcare to materialise and translate itself into the health benefits for which we had hoped. A major problem we face relates to our underestimation of the complexity of the genome, and that of biological mechanisms, generally. Fixation on DNA sequence alone and a ‘rigid’ mode of thinking about the genome has meant that the folding and structure of the DNA molecule —and how these relate to regulation— have been underappreciated. Projects like ENCODE have additionally taught us that regulation at the level of RNA is just as important as that at the spatiotemporal level of chromatin.

In this review, we chart the course of the major advances in the biomedical sciences before and after the release of the first draft sequence of the human genome, focusing on technology and how its development has influenced these advances. We additionally focus on gene editing via CRISPR/Cas9 as a key technique, in particular its use in the context of complex biological mechanisms. Our aim is to shift the mode of thinking about the genome to one that encompasses a greater appreciation of the folding of the DNA molecule, DNA-RNA/protein interactions, and how these regulate expression and elaborate disease mechanisms.

Through the composition of our work, we recognise that technological improvement is conducive to a greater understanding of biological processes and life within the cell. We believe we now have the technology at our disposal that permits a better understanding of disease mechanisms, achievable through integrative data analyses. Finally, only with greater understanding of disease mechanisms can techniques such as gene editing be faithfully conducted.

Background

Life is more complex than we had previously thought. We have mapped the entire healthy human genome [1, 2] but many unanswered questions and challenges remain in terms of the genome’s relationship with disease [3,4,5]. Indeed, when former President Clinton exited the White House to announce the first draft of the human genome, his words were met with the belief that we had made a paradigm shift toward a better understanding of human disease, with DNA being likened by Clinton to “the language in which God created life” [6]. Almost 20 years have passed since that announcement from the White House in June 2000, and it may now feel as if the fanfare that accompanied the occasion was premature. Perspective is a luxury, though, and although it can feel as if research in the biological and medical sciences (‘biomedical sciences’) since that time has been slower than expected, we have nevertheless made huge progress, even looking far beyond the genome.

Indeed, international landmark projects such as the encyclopaedia of DNA elements in the human genome (ENCODE) [7] and functional annotation of the mammalian genome (FANTOM) [8] have shone much light on life’s complexity through their studies on the transcriptome and epigenome, confirming the earliest conclusions by Lander and colleagues in their summary of the first human genome sequence [2]: “The potential numbers of different proteins and protein–protein interactions are vast, and their actual numbers cannot readily be discerned from the genome sequence. Elucidating such system-level properties presents one of the great challenges for modern biology”. The challenge to which Lander alludes is still very much felt today, and these words are being confirmed as we delve even further into disease mechanisms and pathobiology.

The genome

Projects like ENCODE [7] and FANTOM [8] provide evidence that it’s no longer sufficient to think of DNA as the Holy Grail. Despite this, much focus and attention is still given to the genome and its usage in tackling disease through ‘genomic medicine’ and ‘personalized medicine’ [9,10,11,12]. However, there is doubt [13,14,15], and it has become apparent that simply knowing the sequence of DNA is not enough to fully understand disease and to drive us forward.

To take the focus completely away from the genome is to diminish its importance in disease, and we are not implying that we should ever ignore what the genome may be telling us; yet, it is clear that reading just the genomic sequence is not enough. Further evidence of this comes from projects such as The Cancer Genome Atlas (TCGA) [16] and International Cancer Genome Consortium (ICGC) [17], which, combined, now have the whole genome sequence of thousands of tumour-normal pairs across multiple cancers. Such information allows us to catalogue the main genes implicated in each cancer [18,19,20,21] but leaves us far from completely understanding the underlying mechanisms that are at play. For example, genome-wide association studies (GWAS) have for many years done very well at finding strong associations between SNPs and diseases of all types [22]. However, it is important to realise that the majority (roughly 95%) of statistically significant GWAS SNPs are not found in coding regions and instead lie in regions of regulatory DNA [23], a fact that leaves us to merely hypothesise on what the underlying mechanisms may be (see Table 1 for an example in breast cancer). Regrettably, GWAS findings have also been difficult to replicate [24,25,26], with Colhoun and colleagues specifically pointing to the complexity of disease traits as an issue [27]. Other chief causes relate to poor study design in both the initial and the replication study, including small sample sizes and insufficient power, lack of comparability between cases and controls, and failure to account for underlying population structure [28]. As of writing (March 2017), the National Human Genome Research Institute (NHGRI) [29] lists 35,329 GWAS hits reaching genome-wide significance, spanning > 1700 diseases or phenotypes, ranging from severe acne to world-class endurance athleticism and from variant Creutzfeldt-Jakob Disease (vCJD) to Sjögren’s syndrome. Despite these large efforts, our knowledge of the genetic basis of many traits is still incomplete [5]. Indeed, complete reliance on studies looking at a set of finely mapped SNPs, as in GWAS, ought to be reconsidered for future studies [30, 31].

Table 1 breast cancer CCND1 locus. Status: unsolved

In genomics, many studies have currently shifted focus to rare variants in the belief that these will help us to better understand disease. The Department of Health in England has also launched a company, Genomics England, which is in the process of sequencing the genomes of patients recruited from within the National Health Service (NHS). The emphasis of Genomics England is on the study of rare diseases and the contribution of genomic variants to these (Genomics England, available from: http://www.genomicsengland.co.uk [Accessed March 4, 2017]). With the aim of sequencing 100,000 genomes, this project will undoubtedly add much to our knowledge of rare variants and rare disease but, as per other landmark sequencing projects, it will equally leave us with many questions and not bring us much closer to fully understanding disease mechanisms. The hypothesis that rare variants contribute greatly to disease must itself be brought into question, and it has been [32,33,34,35,36]. Results from recent studies suggest that complex phenotypes and diseases are in fact brought about by a mixture of both common and rare variants, each with different effect sizes [37,38,39,40,41]. Additionally, as monogenic diseases appear to be in the minority, with most phenotypic traits and diseases appearing to be dictated by complex genetics, sequencing projects will never advance our knowledge of these to a great extent without thinking beyond the genome. Equally, we cannot abandon these genome sequencing efforts, because the information they provide is complementary to everything observed elsewhere in the cell.

The transcriptome

Including knowledge of the transcriptome with that of the genome can help to narrow down the list of genomic regions that are likely to be implicated in disease and, as we’ll see, the transcriptome and genome are inextricably connected. Again, in cancer, past studies of gene expression have been very successful in both segregating cancer into subtypes and identifying the key oncogenic drivers of each [42,43,44]; yet, despite this, they still fail to complete our understanding of the underlying biological mechanisms for most findings. In fact, the results from ENCODE [7] indicate that regulation at the level of the transcriptome is just as complex as that at the level of the genome, a finding echoed in an earlier study by Mercer et al. [45]. Indeed, the original estimate of the number of protein coding genes upon the completion of the Human Genome Project (HGP) was 30,000–40,000 [2], which is a reasonable estimate, but it fails to take into account the now almost 200,000 identified transcripts and their splice isoforms, which are either protein coding or have regulatory potential [7]. In fact, we now realise that only a small fraction (up to 2%) of the genome is actually transcribed into messenger RNA (mRNA) and then translated into protein [5]. Surprisingly, a much larger fraction (up to 70%) is transcribed into RNA but not translated into protein: these are the non-coding RNAs (ncRNAs). Although for most of these ncRNAs the function (if any) remains unknown, some have been known for a long time, such as X-inactive specific transcript (XIST), which acts as an effector in female chromosome X inactivation [46]. Others, such as HOX transcript antisense RNA (HOTAIR), are strongly implicated in cancer [47]. In addition, regulation at the level of the transcriptome is intertwined with that of both itself and the genome through ncRNA interactions [48], including micro-RNA (miRNA) [49], antisense RNA [50], and long intergenic non-coding RNA (lincRNA) [51,52,53], and also further afield at the level of chromatin [54] and the proteome.

One could make the argument that the complexity of the transcriptome, in fact, far exceeds that of the genome due to the almost innumerable potential interactions that can occur between RNA and DNA, proteins, and other RNA species, echoing Lander’s earlier words. Transcription at a given locus is also quantifiable, with different levels of a transcript having potentially key roles in determining pathway and cell-type lineages (e.g. Sox2, Oct4, and Nanog) [55], and also functioning as buffers and dictating the transcription of other RNA species, as is seen with antisense RNA [50]. Antisense RNA transcripts are of particular interest because they challenge the long-held belief that transcription only occurs on a particular DNA strand. As transcription factors and other DNA-binding proteins do not know the rules that we believe they follow and merely bind wherever there is an accessible matching motif, be it on the coding or non-coding strand, transcription on both strands can be expected. At certain genomic regions, transcription may even be physically ‘blocked’ when the same gene is being transcribed concurrently on both the sense and antisense strands, as the two RNA polymerases collide [50].
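As a minimal illustration of the point that a motif can be matched on either strand, the sketch below scans a sequence for a motif and for its reverse complement. The function names, the example sequence, and the motif are hypothetical and chosen only for illustration; real transcription factor binding is better described by position weight matrices and by chromatin accessibility, neither of which is modelled here.

```python
# Hypothetical helper names; exact string matching is a deliberate simplification.
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of an uppercase DNA string."""
    return seq.translate(COMPLEMENT)[::-1]


def motif_hits(seq, motif):
    """Return sorted (position, strand) pairs for exact motif matches.

    Positions are 0-based starts on the forward strand. A '-' hit means the
    motif reads 5'->3' on the opposite strand at that location.
    """
    hits = []
    for strand, pattern in (("+", motif), ("-", reverse_complement(motif))):
        start = seq.find(pattern)
        while start != -1:
            hits.append((start, strand))
            start = seq.find(pattern, start + 1)
    return sorted(hits)


# Illustrative sequence carrying the motif once on each strand.
print(motif_hits("AATGACTCAGGTGAGTCATT", "TGACTCA"))
# prints [(2, '+'), (11, '-')]
```

A palindromic motif would be reported on both strands at the same position, which is the expected behaviour for this simple scan.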

Many techniques are available to begin the undoubtedly difficult task of unravelling this transcriptomic complexity. For example, chromatin isolation by RNA purification sequencing (ChIRP-seq) can be used to determine regions of DNA that are bound by an RNA of interest [54], whilst crosslinking, ligation, and sequencing of hybrids (CLASH) [56] is capable of determining RNA-RNA binding. RNA-protein interactions can also be determined through multiple other techniques including RNA immunoprecipitation sequencing (RIP-seq) [57,58,59] (further techniques can be found in Table 2). Nor is the transcriptome static within an organism: it differs across tissues and cells [8], and one could make the argument that each cell has, in fact, a unique profile, with a ‘gradient’ of transcription across the trillions of cells of the entire human organism. The differences between cells are brought about by a combination of the genetic code and epigenetic, intrinsic, and extrinsic environmental interactions, which slightly modify the transcriptional programme from one cell to the next in a gradient-like fashion.

Table 2 A gamut of technological methods to interrogate the genome’s complexity in every possible way

Chromatin structure and folding

The transcriptome and its innumerable potential interactions operate within the spatiotemporal confines of densely packed chromatin, i.e., DNA tightly wound around histones, which is itself ever changing in relation to cell cycle processes [60] and in preparation for and response to transcription [61, 62]. Although research at the level of chromatin is still not a primary interest for many research groups, we are nevertheless now beginning to better appreciate the 3-dimensional structure and folding of the DNA molecule and the role that this plays in regulation and disease mechanisms. DNA ‘accessibility’ is also key, as much of the genome remains inaccessible to the cytosol, thus shielding these regions, including any binding motifs within them, from transcription factors and other proteins.

Mercer and Mattick provide an outstanding review of genomic complexity, highlighting the importance of DNA-protein interactions and ncRNAs in, literally, shaping the genome and regulating gene expression in diverse ways [63]. The ability to capture the 3-dimensional structure of a portion of chromatin can be achieved through chromosome conformation capture (3C) technology [64] - other, more complex, ways of interrogating chromatin and its interactions, including chromosome conformation capture on chip (4C), chromosome conformation capture carbon copy (5C), and high-throughput chromosome conformation capture (Hi-C), are mentioned in Table 2. Achieving this genome-wide to produce a ‘structural reference chromatin’, akin to the feats achieved by the HGP and ENCODE for the genome and transcriptome, respectively, is currently over-ambitious and poses a major challenge [63]. Moreover, based on what we now understand, DNA in its chromatin state is a ‘fluid’ molecule ―not ‘fixed’ and static― that is constantly altering its structure inside the nucleus in relation to protein, ncRNA, and environmental interactions.

The inherent genetic makeup of each individual’s genome (mainly in terms of copy number variation, SNPs, short tandem repeats, retrotransposons, etc.) would additionally translate to subtle variation in chromatin structure. This level of subtlety could only be accurately delineated by entering the realm of quantum chemistry and by shifting the view of DNA from being a sequence of letters to that of a large, complex, deoxyribonucleic molecule, as it was when it was first discovered [65], which interacts with proteins and other nucleic acids in the cytosol via diverse electrochemical and electromagnetic interactions. Such work is being done in the quantum chemical and mechanical sciences [66,67,68], but is not a primary focus of this review. In addition, although trying to model an entire human DNA molecule in this way would be useful, it is currently computationally infeasible.

With a greater appreciation of the importance and complexity of the genome, transcriptome, and epigenome, one can thus begin to imagine a very dynamic environment within the cytosol, a cellular ‘microcosm’ of activity. In this environment, transcription is a pervasive process, with transcription factors binding at numerous loci in the genome and initiating transcription where the electromagnetic potential, i.e. ‘binding strength’, mediated via certain DNA motifs or interactions with other proteins, is sufficiently strong for transcription of downstream targets to ultimately occur; where the binding is not sufficiently strong, transcription of targets may be weak or not occur at all. It is an environment where the ‘pillars’ that give chromatin its shape and form, i.e., histones, respond to environmental stressors [69] in a cell type-specific manner and, in this way, increase or decrease the accessibility of certain DNA regions (‘opening up’ or ‘closing’ loops) to factors in the cytosol, thus modifying expression profiles. Finally, it is an environment where chemical modification of DNA bases, e.g., the addition of methyl groups (‘methylation’), is again brought about via environmental interactions and actively hampers the expression of genes by, in part, reducing the binding of transcription factors [70, 71].

The technology that has driven research

A historical perspective: C.1980s onwards

Much of the challenge of understanding the mechanisms that drive the structure and function of nucleic acids, i.e., DNA and RNA, lies in the limits of available technology. Although we now have numerous ways of interrogating the secrets of the genome (Table 2), the dideoxy-sequencing method of Sanger [72], later implemented in automated sequencers, has been relied upon for DNA sequence information since 1977. The first successful automated sequencing runs utilised the Applied Biosystems (ABI) 370A and sequenced two cDNA clones encoding the muscarinic cholinergic receptor and the β-adrenergic receptor within a rat heart cDNA library [73]; at the time, it was claimed that one sequencer could obtain > 30,000 bases with five overnight sequencing runs. Given that the haploid human genome is approximately 3 billion base pairs, in 1987 sequencing one human genome on 100 of these instruments would have taken roughly 5000 days, or 13.7 years, at a cost of undoubtedly astronomic proportions.
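The 5000-day figure can be reproduced with a few lines of arithmetic. The sketch below is only a back-of-the-envelope check: it assumes that five overnight runs occupy five days of instrument time, that each instrument yields 30,000 bases per five runs, and that the genome is read once with no redundancy. A genome size of 3 billion base pairs is the value that reproduces the 5000-day estimate; the function and parameter names are hypothetical.

```python
def days_to_sequence(genome_bp, bases_per_batch=30_000, days_per_batch=5,
                     instruments=100):
    """Days needed to read genome_bp bases once across all instruments.

    Assumes perfect, non-overlapping reads and no instrument downtime.
    """
    bases_per_day = instruments * bases_per_batch / days_per_batch
    return genome_bp / bases_per_day


days = days_to_sequence(3_000_000_000)
print(f"{days:.0f} days, about {days / 365:.1f} years")
# prints "5000 days, about 13.7 years"
```

Any realistic project would also need several-fold redundancy to assemble the reads, so this single-pass figure is a lower bound under the stated assumptions.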

Thus, whilst sequencing the cellular genome was first discussed as early as 1984 [74] and was a chief goal of the HGP [75], clearly no one intended to sequence an entire human genome with the ABI 370A on a routine basis. However, innovations ensued: detection methods were enhanced with the advent of capillary electrophoresis [76] and, in 2001, with multiple high-throughput DNA sequencers (ABI 3700) running in tandem, the human genome was sequenced in two efforts [1, 2] with roughly 90–95% genomic coverage, and in a relatively short amount of time: 15 months [2] and 9 months [1].

These efforts marked a momentous event in our quest to understand DNA, colloquially referred to as ‘the code of life’, and they provided impetus to sequence and understand DNA at an even quicker pace in the future. That said, the first attempt to move beyond ABI’s automated sequencer was not driven by efforts to sequence the human genome but, rather, “to discover and understand the function and variation of genes” [77]. The term massively parallel signature sequencing (MPSS) was used to describe a sequencing platform that would become the prototype for what was to follow as we entered the twenty-first century [77]. This platform was able to sequence millions of DNA strands at one time in conjunction with in vitro cloning of cDNA on microbeads. The instrument employed an innovative system that utilised a charge-coupled device (CCD) detector followed by image processing of fluorescent signals corresponding to each of the 4 deoxynucleotides. The method harnessed biochemical and enzymatic reactions to deliver short tags that were 16 to 20 bases long, referred to as ‘signature sequences’. A similar tag-based approach, developed as an alternative to the highly variable probe-hybridisation methods of microarray chips [78], preceded MPSS and was known as serial analysis of gene expression (SAGE), which originally relied on short tags of 9 nucleotide bases [79]. Each of these methods (MPSS, SAGE, and the hybridisation method of arrayed cDNA libraries, i.e. microarrays) relied upon previous knowledge of the mRNA sequences of the genes of interest. In a strict sense, these platforms were not, and are not, DNA sequencers as defined today. Thus, it was impractical to expect MPSS to carry out de novo sequencing on the genomes of organisms that had not yet been deciphered.

In 2005 and 2006, after years of academic research into improved biochemical processes, two sequencing platforms emerged: the 454 sequencer [80] and the Illumina/Solexa Genome Analyzer, which both utilised sequencing by synthesis (SBS). This method, outlined in Hyman [81], involves the detection of the base-by-base addition of each of the 4 nucleotide bases facilitated by a biochemically engineered DNA polymerase. The detection method utilised in the 454 sequencer [80] takes advantage of the pyrophosphate (PPi) that is released upon the addition of each base and then becomes the substrate for a coupled enzymatic reaction with luciferase that results in the release of light [82]. Another group at the University of Cambridge developed a platform that involved a novel single molecule approach with a laser detection system [83] that utilised nucleotides adapted with fluorescent and reversible 3′ terminator moieties, which in effect preserved the viability of the growing DNA molecule as it was replicated from the double-stranded template. This sequencing method became the driving force behind the technology spawned by engineers at Solexa, later acquired by Illumina [84]. A similar detection method involving fluorescently labelled nucleotide bases was developed by a group at Columbia University [85, 86]. At the time, several competing technologies were attempting to replace the dideoxy Sanger sequencing method, then considered the gold standard for DNA sequencing [87].

What was driving this profusion of technological innovation? The goal for all of the competing technologies was to introduce a massively parallel sequencing platform that could sequence a genome in a matter of days instead of months. Thus, one could argue that our intense interest in the relationship of DNA sequence to disease is due in part to the fact that the first technological successes were specifically designed to read DNA sequence quickly, reminiscent of the series of technological advances that came from the Apollo Program. Indeed, the concept of the ‘personal genome’, which envisions a world where everyone can have their genome sequenced for as little as $1000 [88], has propelled much of the change and innovation that has occurred during the past 15 years. While the technologies introduced by 454 Life Sciences in 2005 and Illumina/Solexa in 2006 demonstrated a remarkable ability to sequence DNA at a rate that was orders of magnitude faster than the ABI sequencers, they did not deliver the $1000 genome.

Then, in 2008, Baylor College of Medicine reported the sequencing of Dr. James Watson’s complete genome with the 454 sequencing platform to a depth of 7.4-fold [89]; it took 2 months and cost less than US$1 million. Comparative bioinformatics revealed 3.3 million SNPs and structural variation in Dr. Watson’s genome. Also in 2008, in a report outlining the SBS method first developed by Balasubramanian and Klenerman [83] at Cambridge, the genome of a male Yoruba from Nigeria was sequenced to > 30× with the Genome Analyzer (Illumina/Solexa) [84], taking 8 weeks to complete at a cost of US$250,000.

Modern technological advances: C.2010 onward

The utilitarian needs that serve to advance technology often result in unanticipated discoveries that carry research in new directions. Pacific Biosciences (PacBio) developed a platform based on single-molecule real-time (SMRT) sequencing that was able to successfully sequence very long fragments of DNA [90]. In 2010, it was recognised that the SMRT technology would be able to secure read lengths greater than 1 Kbp, which far surpassed the capability of the SBS method at that time, i.e., 100-150 bp (Genome Analyzer) and 330 bp (Roche 454) [87]. Soon thereafter, the SMRT technology was utilised in a de novo sequencing method to demonstrate its ability to sequence the entire genome of a bacterium using only a single, long insert shotgun DNA library [91]. The mean length of the reads for this work was 5777 bp with a mean accuracy of 99.9%. Prior to this research conducted by Chin et al. [91], the SMRT platform was already deemed valuable as a tool for microbial phylogenetic profiling. The platform has inherent advantages over Sanger and Roche 454 for sequencing the 16S ribosomal RNA (rRNA) genes within microbial populations, which require longer reads to give finer resolution [92]. Because the SMRT platform gives reads that are four times longer than those of the 454 platform and does not require a library amplification step, its cost was at that time significantly lower than that of other sequencing technologies.

In addition to the recent proliferation of research in the field of microbial profiling, longer read sequencing technologies have been utilised in attempts to produce haplotype-resolved genome sequences, i.e. haplotype phasing. The need for this type of sequence information becomes apparent when considering hereditary disorders, which are invariably linked to the haplotype and mode of inheritance [93]. In addition to SMRT, Oxford Nanopore Technologies (ONT) also developed a platform that provides haplotype phasing; however, the high error rates seen in both of these platforms proved to be a difficult hurdle to overcome once it was discovered that PCR-chimera formation was not detected by software assembly programs [94]. An alternative to increasing the read length in order to gain long contiguous reads is to manipulate the upfront library preparation with a method that assigns a molecular barcode to very long (> 50 Kbp) DNA fragments, which are then sequenced with a short read NGS platform. This approach ensures that excessive chimera formation will not take place. After sequencing, bioinformatic algorithms assemble the fragments into a haplotype-resolved genomic sequence, e.g., 10x sequencing (10x Genomics, Pleasanton, USA). This method (from c.2015), along with single cell DNA and RNA sequencing, represents the current state of the art in terms of technological advances in sequencing since the HGP in 2000. It involves the attachment of several million synthetic barcodes, each to one DNA fragment within the genome of interest, which can then furnish a de novo assembly of any genome and incidentally provide the haplotype phasing of that genome [95].

Regarding the role of PCR and NGS, it is important to grasp that, for most if not all sequencing methods, DNA amplification is a necessary preliminary step in order to increase the detection signal, whether that signal originates from the excitation of a fluorescently labelled molecule (e.g. SBS), emitted light resulting from an enzymatic reaction (e.g. via PPi release), or the disruption of an electrical current (e.g. ONT). However, PCR-driven amplification will result in artefacts such as chimera formation, mentioned above, as well as random base modification errors [96]. To overcome base errors, NGS methods are designed to sequence at great depths of coverage to ensure that these errors, and indeed basecalling errors due to the sequencing process itself, can be bioinformatically removed from the final data, or at least have their influence reduced. For example, thresholds can be set for a minimum sequencing read depth over each base position during variant calling to ensure that errors retain less influence. On the other hand, PCR-chimera formation cannot be entirely eliminated from any NGS method without specific algorithms designed to target each region of interest within the sequencing data in order to computationally identify the chimeric events. Importantly, however, the length of the PCR amplicon affects the prevalence of chimera formation, with shorter PCR amplicons resulting in lower numbers of chimeric sequences. That said, when NGS is utilised to gain insight into the presence of SNPs without regard to how these variants relate to one another in terms of haplotypes, chimeric artefacts do not pose the same problem as when a definitive haplotype phasing determination is the goal.
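The idea of a minimum read depth threshold can be sketched as follows. This is a toy illustration rather than the behaviour of any particular variant caller: the function name, the default depth of 10 reads, and the minimum allele fraction of 0.2 are all hypothetical values chosen for the example, and base qualities and strand bias are ignored.

```python
from collections import Counter


def call_alleles(pileup, min_depth=10, min_fraction=0.2):
    """Return the alleles supported at one position, or None if depth is too low.

    pileup: string of the bases observed in the reads covering the position.
    An allele is reported only if its share of the reads is at least
    min_fraction, so isolated PCR or basecalling errors are ignored.
    """
    depth = len(pileup)
    if depth < min_depth:
        return None  # not enough evidence to make any call
    counts = Counter(pileup)
    return sorted(base for base, n in counts.items() if n / depth >= min_fraction)


print(call_alleles("A" * 27 + "G" * 3))   # prints ['A']
print(call_alleles("A" * 10 + "G" * 10))  # prints ['A', 'G']
print(call_alleles("AAG"))                # prints None
```

In the first call the three G reads (10% of 30) are treated as noise, whereas in the second an even split supports a heterozygous site; the third position is simply not called because only three reads cover it.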

Cutting edge gene editing technology

As technological advances progressed for probing the genome and far beyond this, and as knowledge contributed by academic settings about disease-associated variants and disease biomarkers accumulated at enormous rates, the desire to actually introduce modifications to the ‘language in which God created life’ became a goal of some research groups, not without controversy [97, 98]. Presently, the leading gene editing system involves CRISPR (clustered regularly interspaced short palindromic repeats)/Cas, which has been demonstrated to cleave the genome at endogenous loci in human and mouse cells [99], and to facilitate chromosomal rearrangements through sequence-specific DNA double-strand breaks (DSBs) [100] (Fig. 1). This type of gene editing often requires that the target sites be located on the same allele (cis), and it is crucial to examine the entire genome for unintended off-target effects, in particular when gene editing is used for clinical applications [101]. While there are well-designed assays to determine off-target effects [102], such methods do not directly sequence the entire genome of cells that have undergone CRISPR gene editing. Thus, modern technology that can produce a haplotype-resolved whole genome has much utility in the realm of gene editing, both pre- and post-experimentation.

Fig. 1 ‘Surgery’ by CRISPR

Main text

Complex genetics, complex disease: Room for gene editing?

The CRISPR/Cas system has provided an unprecedented ability to delve further into the complexity of the genome and is a technique that is being widely discussed across different areas, including disease control in agriculture (see Table 3 for an overview of CRISPR and bees), drug manufacturing, ‘de-extinction’, vector control, food production, and others [103]. The ability to direct the Cas nuclease in a sequence-specific manner by simply altering a 20 nt guide sequence has permitted a cost-effective, high-throughput way to perform genome-wide analysis. Indeed, numerous large scale CRISPR/Cas9 knockout screens have been employed to generate loss-of-function mutations, which allow functional characterisation of all annotated genetic elements [102, 104,105,106,107,108]. These screens have been implemented across a wide range of disciplines and have identified many promising hits, including essential genes for cell viability, genes that confer resistance to current drug therapies, miRNAs involved in cell growth, and potential cancer and anti-viral drug targets [104, 105, 107].
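To make the notion of a 20 nt guide sequence concrete, the sketch below lists candidate target sites in a DNA string. It assumes the commonly used Streptococcus pyogenes Cas9, which requires an NGG protospacer-adjacent motif (PAM) immediately 3′ of the target; the PAM requirement is not discussed in the text above and is stated here as an assumption. The function name is hypothetical, only the forward strand is scanned, and no off-target scoring is attempted.

```python
import re


def find_guides(seq, guide_len=20):
    """Return (start, protospacer, pam) for forward-strand candidate sites.

    A candidate is guide_len bases immediately followed by an NGG PAM.
    A lookahead is used so that overlapping candidates are all reported.
    """
    seq = seq.upper()
    pattern = re.compile(r"(?=([ACGT]{%d})([ACGT]GG))" % guide_len)
    return [(m.start(), m.group(1), m.group(2)) for m in pattern.finditer(seq)]


# Illustrative input: a 20 nt stretch followed by a TGG PAM.
for start, guide, pam in find_guides("A" * 20 + "TGG"):
    print(start, guide, pam)
# prints "0 AAAAAAAAAAAAAAAAAAAA TGG"
```

A practical design tool would additionally scan the reverse strand and rank each candidate by predicted on-target activity and by similarity to other sites in the genome.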

Table 3 Crisis ‘bee’. Status: imminent problem

However, these screens have also highlighted a major issue, with researchers finding little correlation between the results from CRISPR/Cas9-driven screens and those previously carried out using techniques such as RNA interference (RNAi) [109]. A recent CRISPR/Cas9 screen for essential genes involved in tumour growth revealed that the MELK protein, previously thought to be essential in tumour growth, does not in fact drive cell proliferation in cancer cells [110]. As CRISPR/Cas9 and RNAi mediate their effects by different mechanisms, it is not surprising that they can yield different results, although drawing conclusions from contradictory results is problematic. RNAi has a well-documented tendency for off-target effects [111,112,113,114,115]. This underlines the need to validate results by complementary shRNA and CRISPR/Cas9 screening approaches to produce a more comprehensive analysis [105].

The generation of a catalytically inactive, or ‘dead’, Cas9 (dCas9) introduced the possibility of fusing functional proteins to dCas9, allowing targeting in a sequence-specific manner without initiating a double-strand break [116]. This has led to the generation of innovative adaptations of the CRISPR system that have greatly expanded the molecular biology toolkit and advanced both the scope and effectiveness of genome editing. Further, an inventive strategy termed ‘CRISPR-X’ has created a novel and rapid approach to investigate protein function [117]. It involves fusion of dCas9 to activation-induced cytidine deaminase (AID), which mediates somatic cellular hypermutation (SHM). This can be used to rapidly generate a diverse library of mutants with improved or novel functions, which can then be investigated. Another approach utilises the same enzyme to achieve ‘base-editing’ [118]. This provides a novel programmable way to directly change a mutated base at a greater efficiency than can be achieved by introducing point mutations via homology-directed repair. However, as previously described, to gain a full appreciation of complex disease, we need to look beyond the genome level. To facilitate this investigation, researchers have now generated adaptations to the CRISPR system that allow interrogation of both the transcriptome and epigenome.

CRISPR and the transcriptome

Transcriptional regulation provides a powerful approach to further the understanding of gene function and regulatory networks. However, the mechanism of transcriptional regulation in eukaryotic cells is complex and involves the interaction of many different transcription factors at DNA regulatory elements that can span large regions of DNA [119]. Previous techniques such as RNAi have been employed to investigate transcriptional repression but, as mentioned, they are prone to off-target effects that can complicate interpretation. In addition, RNAi is limited to targeting protein-coding transcripts only, whereas CRISPR interference (CRISPRi) involves the fusion of dCas9 to a repressive KRAB effector domain [120], thus allowing transcriptional repression beyond the coding sequence to include miRNAs, lincRNAs, ncRNAs, etc. Alternatively, dCas9 can be fused to transcriptional activation domains such as VP64 to upregulate gene expression, an approach known as CRISPR activation (CRISPRa) [120, 121].

Building on this initial approach, transcriptional activation in a real-life scenario was considered, whereby transcription factors act in synergy with multiple co-factors. This hypothesis resulted in a CRISPR complex termed ‘Synergistic Activation Mediator’ (SAM) [122]. SAM combines VP64 with additional activation domains to achieve higher levels of activation. The capacity to upregulate selected genes offers vast possibilities for reprogramming cellular identity in addition to understanding gene function. Furthermore, whilst wild-type Cas9 can be utilised to implement genome-wide loss-of-function screens, no technology was previously available that allowed large-scale gain-of-function (GOF) screens to be conducted in a reliable and cost-effective way. Indeed, SAM has been utilised for genome-scale transcriptional activation, resulting in the identification of genes that, upon GOF, may confer resistance to a BRAF inhibitor [122].
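Pooled screens of this kind, whether loss- or gain-of-function, are generally read out by comparing guide RNA abundance before and after selection. The sketch below illustrates that comparison under stated assumptions: counts-per-million normalisation and a pseudocount are illustrative choices rather than the method of the cited study, and the guide names and read counts are hypothetical.

```python
import math


def log2_fold_changes(control: dict, treated: dict, pseudocount: float = 1.0) -> dict:
    """Return log2(treated / control) per guide after counts-per-million scaling.

    `control` and `treated` map guide identifiers to raw read counts.
    The pseudocount avoids division by zero for guides that drop out.
    """
    control_total = sum(control.values())
    treated_total = sum(treated.values())
    if control_total == 0 or treated_total == 0:
        raise ValueError("each sample needs at least one read")
    result = {}
    for guide, control_reads in control.items():
        control_cpm = control_reads / control_total * 1e6 + pseudocount
        treated_cpm = treated.get(guide, 0) / treated_total * 1e6 + pseudocount
        result[guide] = math.log2(treated_cpm / control_cpm)
    return result


# Hypothetical counts: guide_A becomes enriched under drug selection.
before = {"guide_A": 100, "guide_B": 100}
after = {"guide_A": 300, "guide_B": 100}
for guide, lfc in log2_fold_changes(before, after).items():
    print(guide, round(lfc, 2))
```

Guides with a positive value are enriched after selection, which in a gain-of-function resistance screen would mark their target genes as candidates for follow-up.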

CRISPR and the epigenome

The epigenome is a complex regulatory layer that acts in concert with the underlying DNA sequence to result in the immense array of variation that exists between cells. The epigenome has well-documented, strong links to disease status, for example, in its role in imprinting disorders and neurological disease [123, 124]. For many diseases, the problems may lie within this additional regulatory layer rather than the genomic sequence itself. Until now, progress in the field of epigenetics has been limited by the availability of appropriate molecular biology techniques to investigate the functional impact of deposition or removal of chromatin modifications [125]. Recent developments utilise dCas9 as a targeting domain fused to chromatin-modifying enzymes such as Dnmt3a, Tet1, Lsd1, or the HAT catalytic domain of p300 [126,127,128]. This introduces an innovative capability to add or remove chromatin modifications in a site-specific manner, providing new insight into the downstream effects on chromatin state and gene expression of specific sequences, and offering a better understanding of the role that epigenetics plays in disease. In addition, dCas9 has now been fused to EGFP or to a combination of fluorescent proteins, the latter in a system called CRISPRainbow [129, 130]. This provides an insightful approach to visualise native chromatin. The spatiotemporal organisation and dynamics of chromatin have a direct role in the functional output of the genome, and the ability to track chromatin in real time in a site-specific manner will add another dimension to our understanding of chromatin structure. Although these advancements introduce a new realm of possibilities for the field of epigenetics, such as advanced cellular reprogramming and functional studies, epigenome editing is still in its very early stages. A stably bound dCas9 may itself affect the chromatin state and chromatin modifications, thus complicating interpretation [125]. Indeed, although much remains to be elucidated about the chromatin modification network, these advances offer promising steps in unravelling the complexity of the genome.

CRISPR in a therapeutic setting

Thus, whilst it is clear that the genome engineering revolution is fast living up to its potential, and that the wild-type CRISPR/Cas system, along with the ever-growing list of adaptations, has massively expanded our ability to investigate the genome to a new depth, two central issues persist: specificity and delivery. For CRISPR/Cas9 to be used in a therapeutic setting, these two issues need to be thoroughly addressed. Off-target cleavage is a known caveat of the CRISPR/Cas system, with many groups reporting indels at off-target sites [131, 132]. However, it is clear that initial guide design is critical in achieving both good on-target cleavage and low levels of off-target cleavage [133,134,135]. Attempts to rationally engineer Cas9 in order to improve specificity have led to the development of high-fidelity Cas9 (HF-Cas9), enhanced Cas9 (eCas9), and the hyper-accurate Cas9 variant (HypaCas9); in all cases, off-target cleavage was greatly reduced [136,137,138].
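One simple way to reason about off-target risk during guide design is to count mismatches between the guide and other genomic sites. The sketch below is a deliberately simplified illustration, not a published scoring scheme: it treats all positions equally, and the function names, the example sequences, and the default threshold of three mismatches are assumptions chosen for demonstration.

```python
def count_mismatches(guide: str, site: str) -> int:
    """Count positions at which two equal-length sequences differ."""
    if len(guide) != len(site):
        raise ValueError("guide and site must have the same length")
    return sum(a != b for a, b in zip(guide.upper(), site.upper()))


def rank_off_targets(guide: str, candidate_sites, max_mismatches: int = 3):
    """Return (site, mismatches) pairs for near-matches, closest first.

    Perfect matches are skipped because they represent the intended target.
    """
    scored = ((site, count_mismatches(guide, site)) for site in candidate_sites)
    near = [(site, n) for site, n in scored if 0 < n <= max_mismatches]
    return sorted(near, key=lambda pair: pair[1])


# Hypothetical 20 nt guide and candidate genomic sites.
guide = "ACGTACGTACGTACGTACGT"
sites = [
    "ACGTACGTACGTACGTACGT",  # on-target (0 mismatches)
    "TTTTACGTACGTACGTACGT",  # 3 mismatches
    "ACGTACGTACGTACGTACGA",  # 1 mismatch
    "TTTTTTTTACGTACGTACGT",  # 6 mismatches, above the threshold
]
print(rank_off_targets(guide, sites))
# [('ACGTACGTACGTACGTACGA', 1), ('TTTTACGTACGTACGTACGT', 3)]
```

Established design tools weight mismatches by position, since those close to the PAM are generally tolerated less well than PAM-distal ones; a uniform count is only a first approximation.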

Furthermore, orthologues of S. pyogenes Cas9 from different species can be considered, which recognise more intricate PAMs (protospacer adjacent motifs) and thus have a reduced number of off-target sites within the genome [139]. Following the emergence of Cas9 for use in mammalian cells, an additional Class 2 nuclease, Cas12a, formerly known as Cpf1, was discovered [140]. Cas12a differs from Cas9 in several respects, such as its use of T-rich PAMs and its generation of staggered-end double-strand breaks with 5′ overhangs. Interestingly, Cas12a has been shown to be more specific than S. pyogenes Cas9, offering a promising alternative [141, 142].
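The link between PAM complexity and the number of potential target sites can be illustrated with a short calculation. The sketch below estimates how often a PAM would occur by chance at any given position, assuming equal frequencies of the four bases (a simplification, since real genomes are not uniform). The motifs used, NGG for S. pyogenes Cas9, NNGRRT for S. aureus Cas9, and TTTV for Cas12a, are commonly reported PAMs and are not taken from the text above.

```python
from fractions import Fraction

# IUPAC nucleotide codes used by the example motifs.
IUPAC = {
    "A": "A", "C": "C", "G": "G", "T": "T",
    "R": "AG", "Y": "CT", "V": "ACG", "N": "ACGT",
}


def pam_probability(pam: str) -> Fraction:
    """Chance that a random position matches `pam`, assuming uniform bases."""
    probability = Fraction(1)
    for symbol in pam.upper():
        probability *= Fraction(len(IUPAC[symbol]), 4)
    return probability


for name, pam in [("SpCas9", "NGG"), ("SaCas9", "NNGRRT"), ("Cas12a", "TTTV")]:
    p = pam_probability(pam)
    print(f"{name} {pam}: expected once every {float(1 / p):.1f} positions per strand")
# SpCas9 NGG: expected once every 16.0 positions per strand
# SaCas9 NNGRRT: expected once every 64.0 positions per strand
# Cas12a TTTV: expected once every 85.3 positions per strand
```

Under these assumptions, a more restrictive PAM reduces the pool of sites at which the nuclease could bind, which is consistent with the reduced off-target potential described above, at the cost of fewer available on-target sites.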

Another hurdle to overcome is the delivery of the CRISPR/Cas system. For productive gene editing, an optimal delivery vehicle should be highly specific and efficient for a particular cell type, should not provoke an immune response, and should exhibit minimal genotoxicity; in addition, in order to minimise off-target effects, expression of the cargo should not persist for an extended period of time. Currently, no vehicle exists that meets all of these requirements. However, the field of gene editing is nascent and the potential delivery options are continually evolving, so it is likely that the current limitations of delivery vehicles will be overcome. Current strategies for delivery of CRISPR/Cas9 components have been extensively reviewed by Glass et al. [143].

Moreover, genome editing can only be implemented in a setting where there exists a high level of understanding of the underlying disease mechanism. We now focus on three major disease areas in which genome editing could be applicable.

Complex genetics: A focus on 3 disease areas

Asthma

Asthma is a heterogeneous syndrome characterised by chronic airway inflammation, airway hyperresponsiveness and intermittent airway obstruction that result in recurrent episodes of breathlessness, wheeze and cough. Asthma is emblematic of a truly complex genetic disease, thought to develop through the interaction of multiple genetic loci and environmental factors, and is estimated to affect approximately 300 million people worldwide [144]. Asthma most often debuts during early childhood and is currently the most common chronic disease in childhood [145]; its heritability is estimated to be up to 70% [146, 147].

The earliest childhood asthma disease-gene mapping approaches, including linkage and candidate gene-based studies, had mixed results, identifying only a handful of reproducible loci. However, the advent of technical and statistical methods for comprehensive GWAS has identified numerous reproducible asthma-susceptibility loci including ORMDL3, IL1RL1, WDR36, PDE4D, DENND1B, RAD50, IL13, IL18R1, SMAD3, HLA-DQB1, GSDMB, IL33, IL2RB, RORA, HLA-DPA1, IL6R, LRRC32, C11orf30, TNIP1 [146, 148,149,150]. More recently, two consortia, one European (GABRIEL) [151] and one North American (EVE) [152], conducted independent large-scale meta-analyses of nearly all available asthma GWAS data, reporting striking overlap in the abovementioned loci. These loci predominantly reside in regulatory regions of the genome and are involved in immune regulation, an integral part of asthma pathogenesis. However, as has been observed in virtually all complex diseases, the asthma loci identified to date explain only a small proportion of the total observed heritability of the disease, suggesting that novel approaches are required to identify the additional risk variants underlying this ‘missing heritability’.

The first childhood asthma GWAS identified common regulatory variants at and near the ORMDL3/GSDMB/ZPBP2 loci on chromosome 17q21 in three populations of European ancestry, a finding that has now been confirmed in various ethnic groups. The 17q21 locus has been shown to increase the risk for an early-onset, non-atopic phenotype through alterations of sphingolipid metabolism, resulting in bronchial hyperresponsiveness [153]. Understanding the underlying biology of how this asthma locus operates will provide an avenue for the development of new asthma drugs in the near future (see Table 4).

Table 4 Childhood asthma and the 17q21 locus. Status: partially solved

More recently, a genome-wide association study identified CDHR3 as a novel susceptibility locus for early childhood asthma with severe exacerbations [154]. The CDHR3 gene is highly expressed in airway epithelium and was, in a subsequent study, shown to be a rhinovirus C receptor of importance for both binding and replication of the virus [155]. Thus, novel therapeutics targeting this specific gene product may alleviate the burden of acute virus-induced exacerbations in children with the risk variant.

Another important field in asthma genetics is pharmacogenomics, which is the study of the role of genetic determinants in the variable, inter-individual response to medications. Pharmacogenomic studies are of particular interest as up to one-half of children with asthma do not respond to treatment with inhaled β2-agonists, leukotriene modifiers, or inhaled corticosteroids. There have been numerous studies and findings, implicating genes including ADRB2 [156] and CRISPLD2, the latter of which has been shown to regulate the anti-inflammatory effects of corticosteroids in airway smooth muscle cells [157].

All of the above findings highlight how genetic studies in asthma have provided important and clinically-applicable knowledge that may be utilised by CRISPR in the future.

Ocular disorders

Ocular genetic disease offers distinct benefits as a test bed in the field of genome engineering. A high proportion of the causative genes in ocular diseases have been elucidated, and many of these diseases are due to a single mutation in a single gene [158, 159]. In addition, the eye offers unique anatomical and physiological qualities that make it amenable to treatment: it is easily accessible, has a small surface area and holds an immune-privileged status, making ocular diseases an ideal system in which to develop CRISPR/Cas9 gene therapy [160].

Gene therapy for recessive retinal diseases, which are caused largely by loss-of-function mutations, is more advanced than that for dominant, gain-of-function diseases. There are several ongoing clinical trials for retinal diseases including choroideremia, Leber congenital amaurosis (LCA), retinitis pigmentosa, Usher syndrome, and Stargardt disease [161,162,163,164,165]. These therapies all employ a gene-replacement strategy in which a functional copy of the gene is introduced to target cells by either adeno-associated virus (AAV) or lentiviral vectors.

Gene replacement is not always a viable approach, as vector carrying capacity restricts the spectrum of disorders that can be treated and, while lentivirus has a larger carrying capacity, its potential to integrate into the genome raises safety concerns. A much more attractive treatment strategy would be to correct the defect itself, utilising the novel CRISPR technology. Editas Medicine have a clinical trial planned for LCA in which CRISPR will be targeted to delete a cryptic splice site and restore normal splicing. They have subsequently announced future plans for a similar trial targeting Usher syndrome.

An innovative allele-specific approach emerged when Courtney et al. [166] identified the potential to utilise a mutation that generates a novel PAM to achieve allele-specificity. Although this work focused on corneal dystrophy, the technique has also been exploited for use in retinal disease by Bakondi et al. [167]. This approach provided a highly specific treatment strategy for certain autosomal dominant disorders. As the CRISPR technology develops at a rapid pace, it is conceivable that an array of therapeutics will soon materialise that will allow safe and efficient correction of a range of genetic defects.
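The logic of exploiting a mutation-generated PAM can be sketched in a few lines: a site qualifies if the PAM is present on the mutant allele but absent at the same position on the wild-type allele. The example below is illustrative only and does not reproduce the cited studies; it assumes an NGG PAM (S. pyogenes Cas9), alleles of equal length that differ by a substitution, and hypothetical sequences and function names.

```python
import re


def pam_sites(seq: str) -> set:
    """Return {(strand, position)} for NGG PAMs on either strand.

    A PAM on the minus strand appears as CCN when read on the plus strand.
    Positions are 0-based offsets of the first base of the 3 nt motif.
    """
    seq = seq.upper()
    sites = {("+", m.start()) for m in re.finditer(r"(?=[ACGT]GG)", seq)}
    sites |= {("-", m.start()) for m in re.finditer(r"(?=CC[ACGT])", seq)}
    return sites


def novel_pams(wild_type: str, mutant: str) -> list:
    """List PAM sites found on the mutant allele but not on the wild type."""
    if len(wild_type) != len(mutant):
        raise ValueError("alleles must be aligned and of equal length")
    return sorted(pam_sites(mutant) - pam_sites(wild_type))


# Hypothetical alleles: an A>G substitution at position 4 creates a CGG PAM.
wild_type = "TTACAGATTC"
mutant = "TTACGGATTC"
print(novel_pams(wild_type, mutant))  # [('+', 3)]
```

A guide placed adjacent to such a site would, in principle, direct cleavage only to the mutant allele, because the wild-type allele lacks the PAM needed for recognition.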

The future for ocular disorders looks bright and, as we begin to understand the integral players and interactions of complex disease, treatment strategies via genome editing technologies will become apparent. The optimisation groundwork already laid using well-characterised diseases as models will allow for a smooth translation to treatment.

Cancer

In the field of cancer, the primary issue in the future will surround tumour heterogeneity and how this will complicate treatment strategies [168]. The revelation that a single tumour biopsy represents, in fact, multiple distinct tumour cell populations [169] was a pivotal moment in the field of cancer research. Since the discovery, a variety of studies have additionally confirmed that metastases from the primary tumour are invariably representative of only one or more sub-populations [170]. The concept of clonal evolution in cancer has been around since 1976 [171] and has been adopted in the field in order to explain these recent findings [172, 173]. This comes as a startling realisation when one considers the implications for personalised medicine: whilst we may be capable of identifying a metastatic clone with a key driver mutation and eradicating this with a specific drug or therapy (if available), in the situation where the primary tumour is highly heterogeneous, by eradicating the initial metastatic clone we may be merely paving the way for a different clone to rise up, which may necessitate an entirely different treatment strategy [168, 172]. Thus, tumour heterogeneity and the driver of this, genomic instability, have been other key focuses of research and will continue to be.

Identification and functional validation of such driver mutations amongst the large number of passenger mutations is thus an ongoing challenge. Genome editing technology such as CRISPR/Cas9 is going some way to address these challenges. It is now possible to reproduce the complex genome states observed in human tumours, such as translocations and inversions, as well as point mutations and deletions, in both cell lines and mouse models. Until recently, cancer mouse models were both laboriously slow and costly to generate, requiring the injection of genetically modified embryonic stem cells into blastocysts. CRISPR has enabled the generation of knockout and knock-in mouse models, carrying either germline or somatic mutations, in as little as four weeks.

Taking breast cancer as just one example, CRISPR has facilitated the discovery of point mutations conferring endocrine therapy resistance and, in doing so, has enabled researchers to understand the mechanism by which this happens [174]. Further, CRISPR-engineered mouse models have been used to identify the secondary mutations that confer resistance to PARP inhibitors in BRCA1 and BRCA2 mutant cancers, which are initially responsive [175]. Others have shown that, in a HER2-positive model, a CRISPR-induced mutation within an amplified HER2 region instead confers a dominant negative effect, resulting in cell growth inhibition via the MAPK/ERK axis, with no effect on HER2 protein levels [176]. That this response is potentiated by PARP inhibition, and is a distinct pathway from current HER2 therapies like Trastuzumab, gives some idea of the potential of CRISPR-mediated engineering in identifying new targets for therapy. However, whilst cancer research has been catapulted forward by the discovery of CRISPR, the reality remains that delivery of Cas9 continues to be a significant obstacle in both the generation of cancer mouse models and the delivery of therapeutic Cas9 guide RNA systems to treat cancer.

Another potential application of CRISPR in cancer could be as a companion technology to ‘blood biopsy’ based methods. The release of circulating free DNA (cfDNA) from tumour cells, i.e., circulating tumour DNA (ctDNA), can be a consequence of different physiological and pathological processes such as apoptosis, necrosis, or active secretion (Fig. 2). In cancer patients, the released DNA may carry specific alterations within the fragment, such as genetic and/or epigenetic modifications, which include methylation, loss of heterozygosity (LOH), and tumour-specific mutations in oncogenes and tumour suppressor genes [177]. In this regard, cfDNA from the blood of cancer patients, and also circulating tumour cells (CTCs), could be exploited not just for diagnosis and prognosis [178, 179] but also to help identify targets for CRISPR-mediated treatment of the primary tumour. After CRISPR therapeutic intervention, cfDNA analysis could equally be used to monitor the effectiveness of the therapy, as it has been documented that, post-surgery, cfDNA and miRNA levels decrease to those found in healthy individuals [180, 181]; however, when the levels of cfDNA do not change, this might indicate that residual tumour cells exist [182].

Fig. 2 Is there utility for CRISPR via circulating tumour DNA detection?

Conclusions

Our desire to achieve a greater understanding of the genome in the past 3 decades has been the main driver of technological development in this area. Now that we have achieved a greater understanding, we are realising that the genome is not the end of the line, in terms of understanding disease. In fact, one could argue that simply understanding DNA has opened a Pandora’s Box and that the real work has only just begun. Thankfully, the technological advances that have allowed us to understand the genome have indirectly given us opportunities to study beyond the genome, specifically at the transcriptome and epigenome (see Table 2 for a list of these), and further beyond these.

One striking revelation from the deluge of data that has already been produced in the biomedical sciences is just how much we don’t yet understand about disease and how much work there is still to be done. Indeed, biological data is complex, having diverse internal structures that scientists have struggled to interpret using traditional methods and approaches [183]. Whereas we are attempting to define how life within the cell functions in a relatively short space of time in order to better understand disease, life itself has had millions of years for various processes to diversify and become ‘fixed’, which has given us the wide diversity of life that we now see. The main players in this diversity are the genome, transcriptome, epigenome, and environment, with the number of possible configurations between these being limitless.

Many diseases are therefore complex because life itself is complex, and we are still waiting to see major improvements in healthcare in the era of ‘big data’ that modern technology has allowed us to produce [184,185,186]. We don’t claim that a complete understanding of life within the cell will help us to eradicate disease; we may understand disease much better, but people will still age and develop illness. In cardiovascular disease, for example, a vast array of methods already exists and we already know how to prevent these diseases from occurring (see Table 5). Would adding knowledge from the genome significantly reduce cardiovascular deaths?

Table 5 Cardiovascular disease and gene editing. Status: gene editing’s clinical utility in the cardiovascular realm

In order to see significant improvement in healthcare utilising genomic, transcriptomic, and epigenomic data, there must be greater interdisciplinary cross-talk between scientists. This includes, but is not limited to, physicians, clinical geneticists, computational biologists, and policy makers. New and recent technology can help to improve treatment, but only in the context of an understanding of disease mechanisms. We must minimise scenarios in which uncertainty enters the healthcare market, particularly in relation to critical techniques such as gene editing. Would it be feasible to excise a ‘disease allele’ if the exact mechanism by which the allele in question functions was misunderstood? There is hope in terms of data science: integrating omics data can assist in fully defining disease mechanisms (see Table 6), which opens the door to ‘safe’ gene editing.

Table 6 T-cell acute lymphoblastic leukaemia. Status: solved

Abbreviations

3C:

Chromosome conformation capture

4C:

Chromosome conformation capture on chip

5C:

Chromosome conformation capture carbon copy

AAV:

Adeno-associated virus

ABI:

Applied Biosystems

ACS:

Acute coronary syndrome

AID:

Activation-induced cytidine deaminase

AMI:

Acute myocardial infarction

ATAC-seq:

Assay for Transposase Accessible Chromatin sequencing

A-to-I:

Adenosine-to-inosine

BNP:

B-type natriuretic peptide

CAD:

Coronary artery disease

Cap-seq:

Cap sequencing

CCD:

Charge-coupled device

CCND1:

Cyclin D1

cfDNA:

Circulating free DNA

CHF:

Congestive heart failure

ChIA-PET:

Chromatin Interaction Analysis by Paired-End Tag sequencing

ChIP:

Chromatin immunoprecipitation

ChIRP-seq:

Chromatin isolation by RNA purification sequencing

CIP-TAP:

Calf Intestinal alkaline Phosphatase Tobacco Acid Pyrophosphatase

CLASH:

Crosslinking, ligation, and sequencing of hybrids

CRISPR:

Clustered regularly interspaced short palindromic repeats

CRISPRa:

CRISPR activation

CRISPRi:

CRISPR interference

csRNAs:

Capped small RNAs

CTCs:

Circulating tumour cells

ctDNA:

Circulating tumour DNA

CVD:

Cardiovascular disease

dCas9:

Dead Cas9

DNase I HS site:

DNase I hypersensitive site

DNase-seq:

DNase I HS site sequencing

DSB:

Double-strand break

eCas9:

Enhanced Cas9

ENCODE:

ENCyclopedia Of DNA Elements in the human genome

FAIRE-seq:

Formaldehyde-Assisted Isolation of Regulatory Elements sequencing

FANTOM:

Functional ANnoTation Of the Mammalian genome

Frag-seq:

Fragmentation sequencing

GGE:

Gradient gel electrophoresis

GOF:

Gain-of-function

GRO-seq:

Global Run-On sequencing

GWAS:

Genome-Wide Association Studies / Study

HF-Cas9:

High-fidelity Cas9

HGP:

Human Genome Project

Hi-C:

High-throughput chromosome conformation capture

HITS-CLIP:

High Throughput Sequencing Crosslinking and Immunoprecipitation

HOTAIR:

HOX transcript antisense RNA

HPLC:

High performance liquid chromatography

HypaCas9:

Hyper-accurate Cas9

ICE:

Inosine Chemical Erasing

ICGC:

International Cancer Genome Consortium

iCLIP:

Individual-nucleotide resolution UV cross-linking and immunoprecipitation

INseq:

Insertion sequencing

LCA:

Leber congenital amaurosis

lincRNA:

Long intergenic non-coding RNA

LOH:

Loss of heterozygosity

M6A:

Methylation of the N6 position of adenosine

MAINE-seq:

MNase-Assisted Isolation of Nucleosomes Sequencing

MeRIP-seq:

Methylated RNA Immunoprecipitation sequencing

miRNA:

micro-RNA

MN:

Micrococcal nuclease

MPSS:

Massively parallel signature sequencing

mRNA:

Messenger RNA

ncRNA:

non-coding RNA

NET-seq:

Native elongating transcript sequencing

NHGRI:

The National Human Genome Research Institute

NHS:

National Health Service

NMR:

Nuclear magnetic resonance

ONT:

Oxford Nanopore Technologies

PacBio:

Pacific Biosciences

PAM:

Protospacer adjacent motifs

PAR-CLIP:

Photoactivatable Ribonucleoside-Enhanced Crosslinking and Immunoprecipitation

PARE-seq:

Parallel Analysis of RNA Ends sequencing

PARS:

Parallel analysis of RNA structure

PCSK9:

Proprotein convertase subtilisin/kexin type 9

PPi:

Pyrophosphate

PRE1 / PRE2:

putative regulatory element 1 / 2

RBP:

RNA binding protein

RC-seq:

Retrotransposon Capture sequencing

Ribo-seq:

Ribosome sequencing

RIP-seq:

RNA immunoprecipitation sequencing

RNAi:

RNA interference

rRNA:

Ribosomal RNA

SAGE:

Serial analysis of gene expression

SAM:

Synergistic Activation Mediator

SBS:

Sequencing by synthesis

SHAPE-seq:

Selective 2’-Hydroxyl Acylation analyzed by Primer Extension sequencing

SHM:

Somatic cellular hypermutation

SMRT:

Single-molecule real-time

SPT:

Serine palmitoyltransferase

T-ALL:

T-cell acute lymphoblastic leukaemia

TCGA:

The Cancer Genome Atlas

TC-seq:

Translocation Capture sequencing

TIF-seq:

Transcript Isoform Sequencing

TN-seq:

Transposon sequencing

TRAP-seq:

Translating Ribosome Affinity Purification sequencing

TSS:

Transcription start site

US NCEP:

US National Cholesterol Education Program

VAP:

Vertical auto profile

vCJD:

Variant Creutzfeldt-Jakob Disease

XIST:

X-Inactive Specific Transcript

References

  1. Venter JC, Adams MD, Myers EW, Li PW, Mural RJ, Sutton GG, Smith HO, Yandell M, Evans CA, Holt RA, et al. The sequence of the human genome. Science. 2001;291(5507):1304–51.

  2. International Human Genome Sequencing Consortium. Initial sequencing and analysis of the human genome. Nature. 2001;409:860.

  3. Gonzaga-Jauregui C, Lupski JR, Gibbs RA. Human genome sequencing in health and disease. Annu Rev Med. 2012;63(1):35–61.

  4. Schatz MC. Biological data sciences in genome research. Genome Res. 2015;25(10):1417–22.

  5. Venter JC, Smith HO, Adams MD. The sequence of the human genome. Clin Chem. 2015;61(9):1207–8.

  6. Clinton WJ. In 'June 2000 White House Event'. The White House Office of the Press Secretary. 2000. https://www.genome.gov/10001356/june-2000-white-house-event/.

  7. The ENCODE Project Consortium. An integrated encyclopedia of DNA elements in the human genome. Nature. 2012;489:57.

  8. Carninci P, Kasukawa T, Katayama S, Gough J, Frith MC, Maeda N, Oyama R, Ravasi T, Lenhard B, Wells C, et al. The transcriptional landscape of the mammalian genome. Science. 2005;309(5740):1559–63.

  9. Guttmacher AE, Collins FS. Genomic Medicine — A Primer. N Engl J Med. 2002;347(19):1512–20.

  10. Varmus H. Getting ready for gene-based medicine. N Engl J Med. 2002;347(19):1526–7.

  11. Chan IS, Ginsburg GS. Personalized medicine: progress and promise. Annu Rev Genomics Hum Genet. 2011;12(1):217–44.

  12. Green ED, Guyer MS, National Human Genome Research Institute. Charting a course for genomic medicine from base pairs to bedside. Nature. 2011;470:204.

  13. Hunter DJ, Khoury MJ, Drazen JM. Letting the genome out of the bottle — will we get our wish? N Engl J Med. 2008;358(2):105–7.

  14. McGuire AL, Burke W. Raiding the medical commons: an unwelcome side effect of direct-to-consumer personal genome testing. JAMA. 2008;300(22):2669–71.

  15. Feero WG, Guttmacher AE, Collins FS. Genomic medicine — an updated primer. N Engl J Med. 2010;362(21):2001–11.

  16. The Cancer Genome Atlas Research Network, Weinstein JN, Collisson EA, Mills GB, Shaw KRM, Ozenberger BA, Ellrott K, Shmulevich I, Sander C, Stuart JM. The Cancer Genome Atlas Pan-Cancer analysis project. Nat Genet. 2013;45:1113.

  17. The International Cancer Genome Consortium. International network of cancer genome projects. Nature. 2010;464:993.

  18. Stratton M. Exploring the genomes of cancer cells: progress and promise. Science. 2011;331(6024):1553–8.

  19. Stephens PJ, Tarpey PS, Davies H, Van Loo P, Greenman C, Wedge DC, Nik-Zainal S, Martin S, Varela I, Bignell GR, et al. The landscape of cancer genes and mutational processes in breast cancer. Nature. 2012;486:400.

  20. Ciriello G, Miller ML, Aksoy BA, Senbabaoglu Y, Schultz N, Sander C. Emerging landscape of oncogenic signatures across human cancers. Nat Genet. 2013;45:1127.

  21. Kandoth C, McLellan MD, Vandin F, Ye K, Niu B, Lu C, Xie M, Zhang Q, McMichael JF, Wyczalkowski MA, et al. Mutational landscape and significance across 12 major cancer types. Nature. 2013;502:333.

  22. Witte JS. Genome-wide association studies and beyond. Annu Rev Public Health. 2010;31(1):9–20.

  23. Maurano MT, Humbert R, Rynes E, Thurman RE, Haugen E, Wang H, Reynolds AP, Sandstrom R, Qu H, Brody J, et al. Systematic localization of common disease-associated variation in regulatory DNA. Science. 2012;337(6099):1190–5.

  24. Hirschhorn JN, Lohmueller K, Byrne E, Hirschhorn K. A comprehensive review of genetic association studies. Genet Med. 2002;4:45.

  25. Lohmueller KE, Pearce CL, Pike M, Lander ES, Hirschhorn JN. Meta-analysis of genetic association studies supports a contribution of common variants to susceptibility to common disease. Nat Genet. 2003;33:177.

  26. Manolio TA, Collins FS. The HapMap and genome-wide association studies in diagnosis and therapy. Annu Rev Med. 2009;60(1):443–56.

  27. Colhoun HM, McKeigue PM, Smith GD. Problems of reporting genetic associations with complex outcomes. Lancet. 2003;361(9360):865–72.

  28. NCI-NHGRI Working Group on Replication in Association Studies. Replicating genotype–phenotype associations. Nature. 2007;447:655.

  29. MacArthur J, Bowler E, Cerezo M, Gil L, Hall P, Hastings E, Junkins H, McMahon A, Milano A, Morales J, et al. The new NHGRI-EBI catalog of published genome-wide association studies (GWAS catalog). Nucleic Acids Res. 2017;45(Database issue):D896–901.

  30. Botstein D, Risch N. Discovering genotypes underlying human phenotypes: past successes for mendelian disease, future approaches for complex disease. Nat Genet. 2003;33:228.

  31. Boyle EA, Li YI, Pritchard JK. An expanded view of complex traits: from polygenic to omnigenic. Cell. 2017;169(7):1177–86.

  32. Pritchard JK, Cox NJ. The allelic architecture of human disease genes: common disease–common variant… or not? Hum Mol Genet. 2002;11(20):2417–23.

  33. Bodmer W, Bonilla C. Common and rare variants in multifactorial susceptibility to common diseases. Nat Genet. 2008;40:695.

  34. Schork NJ, Murray SS, Frazer KA, Topol EJ. Common vs. rare allele hypotheses for complex diseases. Curr Opin Genet Dev. 2009;19(3):212–9.

  35. Cirulli ET, Goldstein DB. Uncovering the roles of rare variants in common disease through whole-genome sequencing. Nat Rev Genet. 2010;11:415.

  36. Gibson G. Rare and common variants: twenty arguments. Nat Rev Genet. 2012;13:135.

  37. Alves MM, Sribudiani Y, Brouwer RWW, Amiel J, Antiñolo G, Borrego S, Ceccherini I, Chakravarti A, Fernández RM, Garcia-Barcelo M-M, et al. Contribution of rare and common variants determine complex diseases—Hirschsprung disease as a model. Dev Biol. 2013;382(1):320–9.

  38. Diogo D, Kurreeman F, Stahl EA, Liao KP, Gupta N, Greenberg JD, Rivas MA, Hickey B, Flannick J, Thomson B, et al. Rare, low-frequency, and common variants in the protein-coding sequence of biological candidate genes from GWASs contribute to risk of rheumatoid arthritis. Am J Hum Genet. 2013;92(1):15–27.

  39. Yang J, Wang S, Yang Z, Hodgkinson CA, Iarikova P, Ma JZ, Payne TJ, Goldman D, Li MD. The contribution of rare and common variants in 30 genes to risk nicotine dependence. Mol Psychiatry. 2014;20:1467.

  40. Fritsche LG, Igl W, Bailey JNC, Grassmann F, Sengupta S, Bragg-Gresham JL, Burdon KP, Hebbring SJ, Wen C, Gorski M, et al. A large genome-wide association study of age-related macular degeneration highlights contributions of rare and common variants. Nat Genet. 2015;48:134.

  41. Gorski MM, Blighe K, Lotta LA, Pappalardo E, Garagiola I, Mancini I, Mancuso ME, Fasulo MR, Santagostino E, Peyvandi F. Whole-exome sequencing to identify genetic risk variants underlying inhibitor development in severe hemophilia A patients. Blood. 2016;127(23):2924–33.

  42. Perou CM, Sørlie T, Eisen MB, van de Rijn M, Jeffrey SS, Rees CA, Pollack JR, Ross DT, Johnsen H, Akslen LA, et al. Molecular portraits of human breast tumours. Nature. 2000;406:747.

  43. Sørlie T, Perou CM, Tibshirani R, Aas T, Geisler S, Johnsen H, Hastie T, Eisen MB, van de Rijn M, Jeffrey SS, et al. Gene expression patterns of breast carcinomas distinguish tumor subclasses with clinical implications. Proc Natl Acad Sci U S A. 2001;98(19):10869–74.

  44. Curtis C, Shah SP, Chin S-F, Turashvili G, Rueda OM, Dunning MJ, Speed D, Lynch AG, Samarajiwa S, Yuan Y, et al. The genomic and transcriptomic architecture of 2,000 breast tumours reveals novel subgroups. Nature. 2012;486:346.

  45. Mercer TR, Gerhardt DJ, Dinger ME, Crawford J, Trapnell C, Jeddeloh JA, Mattick JS, Rinn JL. Targeted RNA sequencing reveals the deep complexity of the human transcriptome. Nat Biotechnol. 2011;30:99.

  46. Kalantry S. Recent advances in X-chromosome inactivation. J Cell Physiol. 2011;226(7):1714–8.

  47. Gutschner T, Diederichs S. The hallmarks of cancer: a long non-coding RNA point of view. RNA Biol. 2012;9(6):703–19.

  48. Mattick JS. RNA regulation: a new genetics? Nat Rev Genet. 2004;5:316.

  49. Lai EC. Micro RNAs are complementary to 3′ UTR sequence motifs that mediate negative post-transcriptional regulation. Nat Genet. 2002;30:363.

  50. Pelechano V, Steinmetz LM. Gene regulation by antisense transcription. Nat Rev Genet. 2013;14:880.

  51. Rinn JL, Chang HY. Genome regulation by long noncoding RNAs. Annu Rev Biochem. 2012;81(1):145–66.

  52. Mercer TR, Dinger ME, Mattick JS. Long non-coding RNAs: insights into functions. Nat Rev Genet. 2009;10:155.

  53. Wang KC, Chang HY. Molecular mechanisms of long noncoding RNAs. Mol Cell. 2011;43(6):904–14.

  54. Chu C, Qu K, Zhong FL, Artandi SE, Chang HY. Genomic maps of long noncoding RNA occupancy reveal principles of RNA-chromatin interactions. Mol Cell. 2011;44(4):667–78.

  55. Kalmar T, Lim C, Hayward P, Muñoz-Descalzo S, Nichols J, Garcia-Ojalvo J, Martinez Arias A. Regulated fluctuations in Nanog expression mediate cell fate decisions in embryonic stem cells. PLoS Biol. 2009;7(7):e1000149.

  56. Kudla G, Granneman S, Hahn D, Beggs JD, Tollervey D. Cross-linking, ligation, and sequencing of hybrids reveals RNA–RNA interactions in yeast. Proc Natl Acad Sci U S A. 2011;108(24):10010–5.

  57. Zhao J, Ohsumi TK, Kung JT, Ogawa Y, Grau DJ, Sarma K, Song JJ, Kingston RE, Borowsky M, Lee JT. Genome-wide identification of Polycomb-associated RNAs by RIP-seq. Mol Cell. 2010;40(6):939–53.

  58. Cloonan N, Forrest ARR, Kolle G, Gardiner BBA, Faulkner GJ, Brown MK, Taylor DF, Steptoe AL, Wani S, Bethel G, et al. Stem cell transcriptome profiling via massive-scale mRNA sequencing. Nat Methods. 2008;5:613.

  59. Penalva LOF, Tenenbaum SA, Keene JD. Gene Expression Analysis of Messenger RNP Complexes. In: Schoenberg DR, editor. mRNA Processing and Metabolism: Methods and Protocols. Totowa, NJ: Humana Press; 2004. p. 125–34.

  60. O'Sullivan RJ, Kubicek S, Schreiber SL, Karlseder J. Reduced histone biosynthesis and chromatin changes arising from a damage signal at telomeres. Nat Struct Mol Biol. 2010;17:1218.

  61. Shebzukhov YV, Horn K, Brazhnik KI, Drutskaya MS, Kuchmiy AA, Kuprash DV, Nedospasov SA. Dynamic changes in chromatin conformation at the TNF transcription start site in T helper lymphocyte subsets. Eur J Immunol. 2014;44(1):251–64.

  62. Eberharter A, Becker PB. Histone acetylation: a switch between repressive and permissive chromatin. Second in review series on chromatin dynamics. EMBO Rep. 2002;3(3):224–9.

  63. Mercer TR, Mattick JS. Understanding the regulatory and transcriptional complexity of the genome through structure. Genome Res. 2013;23(7):1081–8.

  64. de Wit E, de Laat W. A decade of 3C technologies: insights into nuclear organization. Genes Dev. 2012;26(1):11–24.

  65. Watson JD, Crick FHC. Molecular Structure of Nucleic Acids: A Structure for Deoxyribose Nucleic Acid. Nature. 1953;171(4356):737–8.

  66. Šponer J, Šponer JE, Petrov AI, Leontis NB. Quantum chemical studies of nucleic acids: can we construct a bridge to the RNA structural biology and bioinformatics communities? J Phys Chem B. 2010;114(48):15723–41.

  67. Harrison JG, Zheng YB, Beal PA, Tantillo DJ. Computational approaches to predicting the impact of novel bases on RNA structure and stability. ACS Chem Biol. 2013;8(11). https://doi.org/10.1021/cb4006062.

  68. Koch T, Shim I, Lindow M, Ørum H, Bohr HG. Quantum mechanical studies of DNA and LNA. Nucleic Acid Ther. 2014;24(2):139–48.

  69. Fang L, Wuptra K, Chen D, Li H, Huang S-K, Jin C, Yokoyama KK. Environmental-stress-induced chromatin regulation and its heritability. J Carcinog Mutagen. 2014;5(1):22058.

  70. Medvedeva YA, Khamis AM, Kulakovskiy IV, Ba-Alawi W, Bhuyan MSI, Kawaji H, Lassmann T, Harbers M, Forrest ARR, Bajic VB. Effects of cytosine methylation on transcription factor binding sites. BMC Genomics. 2014;15:119.

  71. Hu S, Wan J, Su Y, Song Q, Zeng Y, Nguyen HN, Shin J, Cox E, Rho HS, Woodard C, et al. DNA methylation presents distinct binding sites for human transcription factors. eLife. 2013;2:e00726.

  72. Sanger F, Nicklen S, Coulson AR. DNA sequencing with chain-terminating inhibitors. Proc Natl Acad Sci U S A. 1977;74(12):5463–7.

  73. Gocayne J, Robinson DA, FitzGerald MG, Chung FZ, Kerlavage AR, Lentes KU, Lai J, Wang CD, Fraser CM, Venter JC. Primary structure of rat cardiac beta-adrenergic and muscarinic cholinergic receptors obtained by automated DNA sequence analysis: further evidence for a multigene family. Proc Natl Acad Sci U S A. 1987;84(23):8296–300.

  74. Dulbecco R. A turning point in cancer research: sequencing the human genome. Science. 1986;231(4742):1055–6.

  75. Hood L, Rowen L. The human genome project: big science transforms biology and medicine. Genome Med. 2013;5(9):79.

  76. Luckey JA, Drossman H, Kostichka AJ, Mead DA, D'Cunha J, Norris TB, Smith LM. High speed DNA sequencing by capillary electrophoresis. Nucleic Acids Res. 1990;18(15):4417–21.

  77. Brenner S, Johnson M, Bridgham J, Golda G, Lloyd DH, Johnson D, Luo S, McCurdy S, Foy M, Ewan M, et al. Gene expression analysis by massively parallel signature sequencing (MPSS) on microbead arrays. Nat Biotechnol. 2000;18:630.

  78. Audic S, Claverie J-M. The significance of digital gene expression profiles. Genome Res. 1997;7(10):986–95.

  79. Velculescu VE, Zhang L, Vogelstein B, Kinzler KW. Serial analysis of gene expression. Science. 1995;270(5235):484–7.

  80. Margulies M, Egholm M, Altman WE, Attiya S, Bader JS, Bemben LA, Berka J, Braverman MS, Chen Y-J, Chen Z, et al. Genome sequencing in microfabricated high-density picolitre reactors. Nature. 2005;437:376.

  81. Hyman ED. A new method of sequencing DNA. Anal Biochem. 1988;174(2):423–36.

  82. Ronaghi M, Karamohamed S, Pettersson B, Uhlén M, Nyrén P. Real-time DNA sequencing using detection of pyrophosphate release. Anal Biochem. 1996;242(1):84–9.

  83. Li H, Ren X, Ying L, Balasubramanian S, Klenerman D. Measuring single-molecule nucleic acid dynamics in solution by two-color filtered ratiometric fluorescence correlation spectroscopy. Proc Natl Acad Sci U S A. 2004;101(40):14425–30.

  84. Bentley DR, Balasubramanian S, Swerdlow HP, Smith GP, Milton J, Brown CG, Hall KP, Evers DJ, Barnes CL, Bignell HR, et al. Accurate whole human genome sequencing using reversible terminator chemistry. Nature. 2008;456(7218):53–9.

  85. Ju J, Kim DH, Bi L, Meng Q, Bai X, Li Z, Li X, Marma MS, Shi S, Wu J, et al. Four-color DNA sequencing by synthesis using cleavable fluorescent nucleotide reversible terminators. Proc Natl Acad Sci U S A. 2006;103(52):19635–40.

  86. Guo J, Xu N, Li Z, Zhang S, Wu J, Kim DH, Sano Marma M, Meng Q, Cao H, Li X, et al. Four-color DNA sequencing with 3′-O-modified nucleotide reversible terminators and chemically cleavable fluorescent dideoxynucleotides. Proc Natl Acad Sci U S A. 2008;105(27):9145–50.

  87. Metzker ML. Sequencing technologies — the next generation. Nat Rev Genet. 2009;11:31.

  88. Shendure J, Mitra RD, Varma C, Church GM. Advanced sequencing technologies: methods and goals. Nat Rev Genet. 2004;5:335.

  89. Wheeler DA, Srinivasan M, Egholm M, Shen Y, Chen L, McGuire A, He W, Chen Y-J, Makhijani V, Roth GT, et al. The complete genome of an individual by massively parallel DNA sequencing. Nature. 2008;452:872.

  90. Eid J, Fehr A, Gray J, Luong K, Lyle J, Otto G, Peluso P, Rank D, Baybayan P, Bettman B, et al. Real-time DNA sequencing from single polymerase molecules. Science. 2009;323(5910):133–8.

  91. Chin C-S, Alexander DH, Marks P, Klammer AA, Drake J, Heiner C, Clum A, Copeland A, Huddleston J, Eichler EE, et al. Nonhybrid, finished microbial genome assemblies from long-read SMRT sequencing data. Nat Methods. 2013;10:563.

  92. Fichot EB, Norman RS. Microbial phylogenetic profiling with the Pacific Biosciences sequencing platform. Microbiome. 2013;1(1):10.

  93. Mostovoy Y, Levy-Sakin M, Lam J, Lam ET, Hastie AR, Marks P, Lee J, Chu C, Lin C, Džakula Ž, et al. A hybrid approach for de novo human genome sequence assembly and phasing. Nat Methods. 2016;13:587.

  94. Laver TW, Caswell RC, Moore KA, Poschmann J, Johnson MB, Owens MM, Ellard S, Paszkiewicz KH, Weedon MN. Pitfalls of haplotype phasing from amplicon-based long-read sequencing. Sci Rep. 2016;6:21746.

  95. Weisenfeld NI, Kumar V, Shah P, Church DM, Jaffe DB. Direct determination of diploid genome sequences. Genome Res. 2017;27(5):757–67.

  96. Potapov V, Ong JL. Examining sources of error in PCR by single-molecule sequencing. PLoS One. 2017;12(1):e0169774.

  97. Hildt E. Human Germline interventions–think first. Front Genet. 2016;7:81.

  98. Cribbs AP, Perera SMW. Science and bioethics of CRISPR-Cas9 gene editing: an analysis towards separating facts and fiction. Yale J Biol Med. 2017;90(4):625–34.

  99. Cong L, Ran FA, Cox D, Lin S, Barretto R, Habib N, Hsu PD, Wu X, Jiang W, Marraffini LA, et al. Multiplex genome engineering using CRISPR/Cas systems. Science. 2013;339(6121):819–23.

  100. Blasco RB, Karaca E, Ambrogio C, Cheong T-C, Karayol E, Minero VG, Voena C, Chiarle R. Simple and rapid in vivo generation of chromosomal rearrangements using CRISPR/Cas9 technology. Cell Rep. 2014;9(4):1219–27.

  101. Wiles MV, Qin W, Cheng AW, Wang H. CRISPR–Cas9-mediated genome editing and guide RNA design. Mamm Genome. 2015;26(9):501–10.

  102. Wang H, Yang H, Shivalila CS, Dawlaty MM, Cheng AW, Zhang F, Jaenisch R. One-step generation of mice carrying mutations in multiple genes by CRISPR/Cas-mediated genome engineering. Cell. 2013;153(4):910–8.

  103. Reardon S. The CRISPR zoo. Nature. 2016;531(7593):160–3.

  104. Shalem O, Sanjana NE, Hartenian E, Shi X, Scott DA, Mikkelsen TS, Heckl D, Ebert BL, Root DE, Doench JG, et al. Genome-scale CRISPR-Cas9 knockout screening in human cells. Science. 2014;343(6166):84–7.

  105. Deans RM, Morgens DW, Ökesli A, Pillay S, Horlbeck MA, Kampmann M, Gilbert LA, Li A, Mateo R, Smith M, et al. Parallel shRNA and CRISPR-Cas9 screens enable antiviral drug target identification. Nat Chem Biol. 2016;12:361.

  106. Shi J, Wang E, Milazzo JP, Wang Z, Kinney JB, Vakoc CR. Discovery of cancer drug targets by CRISPR-Cas9 screening of protein domains. Nat Biotechnol. 2015;33:661.

  107. Wallace J, Hu R, Mosbruger TL, Dahlem TJ, Stephens WZ, Rao DS, Round JL, O’Connell RM. Genome-wide CRISPR-Cas9 screen identifies MicroRNAs that regulate myeloid leukemia cell growth. PLoS One. 2016;11(4):e0153689.

  108. Koike-Yusa H, Li Y, Tan EP, Velasco-Herrera MDC, Yusa K. Genome-wide recessive genetic screening in mammalian cells with a lentiviral CRISPR-guide RNA library. Nat Biotechnol. 2013;32:267.

  109. Morgens DW, Deans RM, Li A, Bassik MC. Systematic comparison of CRISPR/Cas9 and RNAi screens for essential genes. Nat Biotechnol. 2016;34:634.

  110. Lin A, Giuliano CJ, Sayles NM, Sheltzer JM. CRISPR/Cas9 mutagenesis invalidates a putative cancer dependency targeted in on-going clinical trials. eLife. 2017;6:e24179.

  111. Castanotto D, Rossi JJ. The promises and pitfalls of RNA-interference-based therapeutics. Nature. 2009;457(7228):426–33.

  112. Tiemann K, Rossi JJ. RNAi-based therapeutics–current status, challenges and prospects. EMBO Mol Med. 2009;1(3):142–51.

  113. Jackson AL, Burchard J, Schelter J, Chau BN, Cleary M, Lim L, Linsley PS. Widespread siRNA “off-target” transcript silencing mediated by seed region sequence complementarity. RNA. 2006;12(7):1179–87.

  114. Sigoillot FD, Lyman S, Huckins JF, Adamson B, Chung E, Quattrochi B, King RW. A bioinformatics method identifies prominent off-targeted transcripts in RNAi screens. Nat Methods. 2012;9:363.

  115. Echeverri CJ, Beachy PA, Baum B, Boutros M, Buchholz F, Chanda SK, Downward J, Ellenberg J, Fraser AG, Hacohen N, et al. Minimizing the risk of reporting false positives in large-scale RNAi screens. Nat Methods. 2006;3:777.

  116. Jinek M, Chylinski K, Fonfara I, Hauer M, Doudna JA, Charpentier E. A programmable dual-RNA–guided DNA endonuclease in adaptive bacterial immunity. Science. 2012;337(6096):816–21.

  117. Hess GT, Frésard L, Han K, Lee CH, Li A, Cimprich KA, Montgomery SB, Bassik MC. Directed evolution using dCas9-targeted somatic hypermutation in mammalian cells. Nat Methods. 2016;13:1036.

  118. Komor AC, Kim YB, Packer MS, Zuris JA, Liu DR. Programmable editing of a target base in genomic DNA without double-stranded DNA cleavage. Nature. 2016;533:420.

  119. Conaway JW. Introduction to theme “chromatin, epigenetics, and transcription”. Annu Rev Biochem. 2012;81(1):61–4.

  120. Gilbert LA, Larson MH, Morsut L, Liu Z, Brar GA, Torres SE, Stern-Ginossar N, Brandman O, Whitehead EH, Doudna JA, et al. CRISPR-mediated modular RNA-guided regulation of transcription in eukaryotes. Cell. 2013;154(2):442–51.

  121. Maeder ML, Linder SJ, Cascio VM, Fu Y, Ho QH, Joung JK. CRISPR RNA–guided activation of endogenous human genes. Nat Methods. 2013;10:977.

  122. Konermann S, Brigham MD, Trevino AE, Joung J, Abudayyeh OO, Barcena C, Hsu PD, Habib N, Gootenberg JS, Nishimasu H, et al. Genome-scale transcriptional activation by an engineered CRISPR-Cas9 complex. Nature. 2014;517:583.

  123. Horsthemke B, Buiting K. Chapter 8: Genomic imprinting and imprinting defects in humans. In: Advances in Genetics, vol. 61. Academic Press; 2008. p. 225–46.

  124. Zovkic IB, Guzman-Karlsson MC, Sweatt JD. Epigenetic regulation of memory formation and maintenance. Learn Mem. 2013;20(2):61–74.

  125. Kungulovski G, Jeltsch A. Epigenome editing: state of the art, concepts, and perspectives. Trends Genet. 2016;32(2):101–13.

  126. Liu XS, Wu H, Ji X, Stelzer Y, Wu X, Czauderna S, Shu J, Dadon D, Young RA, Jaenisch R. Editing DNA methylation in the mammalian genome. Cell. 2016;167(1):233–247.e17.

  127. Kearns NA, Pham H, Tabak B, Genga RM, Silverstein NJ, Garber M, Maehr R. Functional annotation of native enhancers with a Cas9–histone demethylase fusion. Nat Methods. 2015;12:401.

  128. Hilton IB, Gersbach CA. Enabling functional genomics with genome engineering. Genome Res. 2015;25(10):1442–55.

  129. Chen B, Gilbert LA, Cimini BA, Schnitzbauer J, Zhang W, Li G-W, Park J, Blackburn EH, Weissman JS, Qi LS, et al. Dynamic imaging of genomic loci in living human cells by an optimized CRISPR/Cas system. Cell. 2013;155(7):1479–91.

  130. Ma H, Tu L-C, Naseri A, Huisman M, Zhang S, Grunwald D, Pederson T. Multiplexed labeling of genomic loci with dCas9 and engineered sgRNAs using CRISPRainbow. Nat Biotechnol. 2016;34:528.

  131. Hsu PD, Scott DA, Weinstein JA, Ran FA, Konermann S, Agarwala V, Li Y, Fine EJ, Wu X, Shalem O, et al. DNA targeting specificity of RNA-guided Cas9 nucleases. Nat Biotechnol. 2013;31:827.

  132. Fu Y, Foden JA, Khayter C, Maeder ML, Reyon D, Joung JK, Sander JD. High frequency off-target mutagenesis induced by CRISPR-Cas nucleases in human cells. Nat Biotechnol. 2013;31(9):822–6.

  133. Wu X, Scott DA, Kriz AJ, Chiu AC, Hsu PD, Dadon DB, Cheng AW, Trevino AE, Konermann S, Chen S, et al. Genome-wide binding of the CRISPR endonuclease Cas9 in mammalian cells. Nat Biotechnol. 2014;32:670.

  134. Pattanayak V, Lin S, Guilinger JP, Ma E, Doudna JA, Liu DR. High-throughput profiling of off-target DNA cleavage reveals RNA-programmed Cas9 nuclease specificity. Nat Biotechnol. 2013;31:839.

  135. Christie KA, Courtney DG, DeDionisio LA, Shern CC, De Majumdar S, Mairs LC, Nesbit MA, Moore CBT. Towards personalised allele-specific CRISPR gene editing to treat autosomal dominant disorders. Sci Rep. 2017;7(1):16174.

  136. Slaymaker IM, Gao L, Zetsche B, Scott DA, Yan WX, Zhang F. Rationally engineered Cas9 nucleases with improved specificity. Science. 2016;351(6268):84–8.

  137. Kleinstiver BP, Pattanayak V, Prew MS, Tsai SQ, Nguyen NT, Zheng Z, Joung JK. High-fidelity CRISPR–Cas9 nucleases with no detectable genome-wide off-target effects. Nature. 2016;529:490.

  138. Chen JS, Dagdas YS, Kleinstiver BP, Welch MM, Sousa AA, Harrington LB, Sternberg SH, Joung JK, Yildiz A, Doudna JA. Enhanced proofreading governs CRISPR-Cas9 targeting accuracy. Nature. 2017;550(7676):407–10.

  139. Ran FA, Cong L, Yan WX, Scott DA, Gootenberg JS, Kriz AJ, Zetsche B, Shalem O, Wu X, Makarova KS, et al. In vivo genome editing using Staphylococcus aureus Cas9. Nature. 2015;520:186.

  140. Zetsche B, Gootenberg JS, Abudayyeh OO, Slaymaker IM, Makarova KS, Essletzbichler P, Volz SE, Joung J, van der Oost J, Regev A, et al. Cpf1 is a single RNA-guided endonuclease of a class 2 CRISPR-Cas system. Cell. 2015;163(3):759–71.

  141. Kim D, Kim J, Hur JK, Been KW, Yoon S-H, Kim J-S. Genome-wide analysis reveals specificities of Cpf1 endonucleases in human cells. Nat Biotechnol. 2016;34:863.

  142. Kleinstiver BP, Tsai SQ, Prew MS, Nguyen NT, Welch MM, Lopez JM, McCaw ZR, Aryee MJ, Joung JK. Genome-wide specificities of CRISPR-Cas Cpf1 nucleases in human cells. Nat Biotechnol. 2016;34:869.

  143. Glass Z, Lee M, Li Y, Xu Q. Engineering the delivery system for CRISPR-based genome editing. Trends Biotechnol. 2018;36(2):173–85.

  144. Fanta CH. Asthma. N Engl J Med. 2009;360(10):1002–14.

  145. Bisgaard H, Szefler S. Prevalence of asthma-like symptoms in young children. Pediatr Pulmonol. 2007;42(8):723–8.

  146. Moffatt MF. Genes in asthma: new genes and new ways. Curr Opin Allergy Clin Immunol. 2008;8(5):411–7.

  147. Vercelli D. Discovering susceptibility genes for asthma and allergy. Nat Rev Immunol. 2008;8:169.

  148. Li X, Howard TD, Zheng SL, Haselkorn T, Peters SP, Meyers DA, Bleecker ER. Genome-wide association study of asthma identifies RAD50-IL13 and HLA-DR/DQ regions. J Allergy Clin Immunol. 2010;125(2):328–335.e11.

  149. Sleiman PMA, Flory J, Imielinski M, Bradfield JP, Annaiah K, Willis-Owen SAG, Wang K, Rafaels NM, Michel S, Bonnelykke K, et al. Variants of DENND1B associated with asthma in children. N Engl J Med. 2010;362(1):36–44.

  150. Himes BE, Hunninghake GM, Baurley JW, Rafaels NM, Sleiman P, Strachan DP, Wilk JB, Willis-Owen SAG, Klanderman B, Lasky-Su J, et al. Genome-wide association analysis identifies PDE4D as an asthma-susceptibility gene. Am J Hum Genet. 2009;84(5):581–93.

  151. Moffatt MF, Gut IG, Demenais F, Strachan DP, Bouzigon E, Heath S, von Mutius E, Farrall M, Lathrop M, Cookson WOCM. A large-scale, consortium-based genomewide association study of asthma. N Engl J Med. 2010;363(13):1211–21.

  152. Torgerson DG, Ampleford EJ, Chiu GY, Gauderman WJ, Gignoux CR, Graves PE, Himes BE, Levin AM, Mathias RA, Hancock DB, et al. Meta-analysis of genome-wide association studies of asthma in ethnically diverse North American populations. Nat Genet. 2011;43(9):887–92.

  153. Ono JG, Worgall TS, Worgall S. Airway reactivity and sphingolipids—implications for childhood asthma. Molecular and Cellular Pediatrics. 2015;2:13.

  154. Bønnelykke K, Sleiman P, Nielsen K, Kreiner-Møller E, Mercader JM, Belgrave D, den Dekker HT, Husby A, Sevelsted A, Faura-Tellez G, et al. A genome-wide association study identifies CDHR3 as a susceptibility locus for early childhood asthma with severe exacerbations. Nat Genet. 2013;46:51.

  155. Bochkov YA, Watters K, Ashraf S, Griggs TF, Devries MK, Jackson DJ, Palmenberg AC, Gern JE. Cadherin-related family member 3, a childhood asthma susceptibility gene product, mediates rhinovirus C binding and replication. Proc Natl Acad Sci U S A. 2015;112(17):5485–90.

  156. Hawkins GA, Tantisira K, Meyers DA, Ampleford EJ, Moore WC, Klanderman B, Liggett SB, Peters SP, Weiss ST, Bleecker ER. Sequence, haplotype, and association analysis of ADRβ2 in a multiethnic asthma case-control study. Am J Respir Crit Care Med. 2006;174(10):1101–9.

  157. Himes BE, Jiang X, Wagner P, Hu R, Wang Q, Klanderman B, Whitaker RM, Duan Q, Lasky-Su J, Nikolos C, et al. RNA-Seq Transcriptome profiling identifies CRISPLD2 as a glucocorticoid responsive gene that modulates cytokine function in airway smooth muscle cells. PLoS One. 2014;9(6):e99625.

  158. Weiss JS, Møller HU, Lisch W, Kinoshita S, Aldave AJ, Belin MW, Kivelä T, Busin M, Munier FL, Seitz B, et al. The IC3D classification of the corneal dystrophies. Cornea. 2008;27(Suppl 2):S1–83.

  159. Broadgate S, Yu J, Downes SM, Halford S. Unravelling the genetics of inherited retinal dystrophies: past, present and future. Prog Retin Eye Res. 2017;59:53–96.

  160. Moore C, Christie K, Marshall J, Nesbit M. Personalised genome editing – the future for corneal dystrophies. Prog Retin Eye Res. 2018;1

  161. Xue K, Oldani M, Jolly JK, Edwards TL, Groppe M, Downes SM, MacLaren RE. Correlation of optical coherence tomography and autofluorescence in the outer retina and choroid of patients with Choroideremia. Invest Ophthalmol Vis Sci. 2016;57(8):3674–84.

  162. Jacobson SG, Cideciyan AV, Roman AJ, Sumaroka A, Schwartz SB, Heon E, Hauswirth WW. Improvement and decline in vision with gene therapy in childhood blindness. N Engl J Med. 2015;372(20):1920–6.

  163. Ghazi NG, Abboud EB, Nowilaty SR, Alkuraya H, Alhommadi A, Cai H, Hou R, Deng W-T, Boye SL, Almaghamsi A, et al. Treatment of retinitis pigmentosa due to MERTK mutations by ocular subretinal injection of adeno-associated virus gene vector: results of a phase I trial. Hum Genet. 2016;135(3):327–43.

  164. Parker MA, Choi D, Erker LR, Pennesi ME, Yang P, Chegarnov EN, Steinkamp PN, Schlechter CL, Dhaenens C-M, Mohand-Said S, et al. Test–retest variability of functional and structural parameters in patients with Stargardt disease participating in the SAR422459 gene therapy trial. Transl Vis Sci Technol. 2016;5(5):10.

  165. Zallocchi M, Binley K, Lad Y, Ellis S, Widdowson P, Iqball S, Scripps V, Kelleher M, Loader J, Miskin J, et al. EIAV-based retinal gene therapy in the shaker1 mouse model for Usher syndrome type 1B: development of UshStat. PLoS One. 2014;9(4):e94272.

  166. Courtney DG, Moore JE, Atkinson SD, Maurizi E, Allen EHA, Pedrioli DML, McLean WHI, Nesbit MA, Moore CBT. CRISPR/Cas9 DNA cleavage at SNP-derived PAM enables both in vitro and in vivo KRT12 mutation-specific targeting. Gene Ther. 2015;23:108.

  167. Bakondi B, Lv W, Lu B, Jones MK, Tsai Y, Kim KJ, Levy R, Akhtar AA, Breunig JJ, Svendsen CN, et al. In vivo CRISPR/Cas9 gene editing corrects retinal dystrophy in the S334ter-3 rat model of autosomal dominant retinitis pigmentosa. Mol Ther. 2016;24(3):556–63.

  168. Baird RD, Caldas C. Genetic heterogeneity in breast cancer: the road to personalized medicine? BMC Med. 2013;11(1):151.

  169. Gerlinger M, Rowan AJ, Horswell S, Larkin J, Endesfelder D, Gronroos E, Martinez P, Matthews N, Stewart A, Tarpey P, et al. Intratumor heterogeneity and branched evolution revealed by multiregion sequencing. N Engl J Med. 2012;366(10):883–92.

  170. Burrell RA, McGranahan N, Bartek J, Swanton C. The causes and consequences of genetic heterogeneity in cancer evolution. Nature. 2013;501:338.

  171. Nowell P. The clonal evolution of tumor cell populations. Science. 1976;194(4260):23–8.

  172. Greaves M, Maley CC. Clonal evolution in cancer. Nature. 2012;481:306.

  173. Gerlinger M, McGranahan N, Dewhurst SM, Burrell RA, Tomlinson I, Swanton C. Cancer: evolution within a lifetime. Annu Rev Genet. 2014;48(1):215–36.

  174. Harrod A, Fulton J, Nguyen VTM, Periyasamy M, Ramos-Garcia L, Lai CF, Metodieva G, de Giorgio A, Williams RL, Santos DB, et al. Genomic modelling of the ESR1 Y537S mutation for evaluating function and new therapeutic approaches for metastatic breast cancer. Oncogene. 2017;36(16):2286–96.

  175. Dréan A, Williamson CT, Brough R, Brandsma I, Menon M, Konde A, Garcia-Murillas I, Pemberton HN, Frankum J, Rafiq R, et al. Modeling therapy resistance in BRCA1/2-mutant cancers. Mol Cancer Ther. 2017;16(9):2022–34.

  176. Wang H, Sun W. CRISPR-mediated targeting of HER2 inhibits cell proliferation through a dominant negative mutation. Cancer Lett. 2017;385:137–43.

  177. Schwarzenbach H, Hoon DSB, Pantel K. Cell-free nucleic acids as biomarkers in cancer patients. Nat Rev Cancer. 2011;11:426.

  178. Openshaw MR, Page K, Fernandez-Garcia D, Guttery D, Shaw JA. The role of ctDNA detection and the potential of the liquid biopsy for breast cancer monitoring. Expert Rev Mol Diagn. 2016;16(7):751–5.

  179. Shaw JA, Guttery DS, Hills A, Fernandez-Garcia D, Page K, Rosales BM, Goddard KS, Hastings RK, Luo J, Ogle O, et al. Mutation analysis of cell-free DNA and single circulating tumor cells in metastatic breast cancer patients with high circulating tumor cell counts. Clin Cancer Res. 2017;23(1):88–96.

  180. Catarino R, Ferreira MM, Rodrigues H, Coelho A, Nogal A, Sousa A, Medeiros R. Quantification of free circulating tumor DNA as a diagnostic marker for breast cancer. DNA Cell Biol. 2008;27(8):415–21.

  181. Yamamoto Y, Kosaka N, Tanaka M, Koizumi F, Kanai Y, Mizutani T, Murakami Y, Kuroda M, Miyajima A, Kato T, et al. MicroRNA-500 as a potential diagnostic marker for hepatocellular carcinoma. Biomarkers. 2009;14(7):529–38.

  182. Wimberger P, Roth C, Pantel K, Kasimir-Bauer S, Kimmig R, Schwarzenbach H. Impact of platinum-based chemotherapy on circulating nucleic acid levels, protease activities in blood and disseminated tumor cells in bone marrow of ovarian cancer patients. Int J Cancer. 2011;128(11):2572–80.

  183. Boccaletti S, Latora V, Moreno Y, Chavez M, Hwang DU. Complex networks: structure and dynamics. Phys Rep. 2006;424(4):175–308.

  184. Nash DB. Harnessing the power of big data in healthcare. Am Health Drug Benefits. 2014;7(2):69–70.

  185. Belle A, Thiagarajan R, Soroushmehr SMR, Navidi F, Beard DA, Najarian K. Big data analytics in healthcare. Biomed Res Int. 2015;2015:370194.

  186. Kruse CS, Goswamy R, Raval Y, Marawi S. Challenges and opportunities of big data in health care: a systematic review. JMIR Med Inform. 2016;4(4):e38.

  187. Turnbull C, Ahmed S, Morrison J, Pernet D, Renwick A, Maranian M, Seal S, Ghoussaini M, Hines S, Healey CS, et al. Genome-wide association study identifies five new breast cancer susceptibility loci. Nat Genet. 2010;42:504.

  188. French JD, Ghoussaini M, Edwards SL, Meyer KB, Michailidou K, Ahmed S, Khan S, Maranian MJ, O’Reilly M, Hillman KM, et al. Functional variants at the 11q13 risk locus for breast cancer regulate cyclin D1 expression through long-range enhancers. Am J Hum Genet. 2013;92(4):489–503.

  189. Core LJ, Waterfall JJ, Lis JT. Nascent RNA sequencing reveals widespread pausing and divergent initiation at human promoters. Science. 2008;322(5909):1845–8.

  190. Churchman LS, Weissman JS. Nascent transcript sequencing visualizes transcription at nucleotide resolution. Nature. 2011;469:368.

  191. Ingolia NT, Ghaemmaghami S, Newman JRS, Weissman JS. Genome-wide analysis in vivo of translation with nucleotide resolution using ribosome profiling. Science. 2009;324(5924):218–23.

  192. Reynoso MA, Juntawong P, Lancia M, Blanco FA, Bailey-Serres J, Zanetti ME. Translating ribosome affinity purification (TRAP) followed by RNA sequencing technology (TRAP-SEQ) for quantitative assessment of plant translatomes. In: Alonso JM, Stepanova AN, editors. Plant Functional Genomics: Methods and Protocols. New York, NY: Springer New York; 2015. p. 185–207.

  193. Chi SW, Zang JB, Mele A, Darnell RB. Argonaute HITS-CLIP decodes microRNA–mRNA interaction maps. Nature. 2009;460:479.

  194. Hafner M, Landgraf P, Ludwig J, Rice A, Ojo T, Lin C, Holoch D, Lim C, Tuschl T. Identification of microRNAs and other small regulatory RNAs using cDNA library sequencing. Methods. 2008;44(1):3–12.

  195. König J, Zarnack K, Rot G, Curk T, Kayikci M, Zupan B, Turner DJ, Luscombe NM, Ule J. iCLIP reveals the function of hnRNP particles in splicing at individual nucleotide resolution. Nat Struct Mol Biol. 2010;17:909.

  196. German MA, Luo S, Schroth G, Meyers BC, Green PJ. Construction of parallel analysis of RNA ends (PARE) libraries for the study of cleaved miRNA targets and the RNA degradome. Nat Protoc. 2009;4:356.

  197. German MA, Pillay M, Jeong D-H, Hetawal A, Luo S, Janardhanan P, Kannan V, Rymarquis LA, Nobuta K, German R, et al. Global identification of microRNA–target RNA pairs by parallel analysis of RNA ends. Nat Biotechnol. 2008;26:941.

  198. Pelechano V, Wei W, Jakob P, Steinmetz LM. Genome-wide identification of transcript start and end sites by transcript isoform sequencing. Nat Protoc. 2014;9:1740.

  199. Pelechano V, Wei W, Steinmetz LM. Extensive transcriptional heterogeneity revealed by isoform profiling. Nature. 2013;497:127.

  200. Lucks JB, Mortimer SA, Trapnell C, Luo S, Aviran S, Schroth GP, Pachter L, Doudna JA, Arkin AP. Multiplexed RNA structure characterization with selective 2′-hydroxyl acylation analyzed by primer extension sequencing (SHAPE-Seq). Proc Natl Acad Sci. 2011;108(27):11063–8.

  201. Wan Y, Qu K, Ouyang Z, Chang HY. Genome-wide mapping of RNA structure using nuclease digestion and high-throughput sequencing. Nat Protoc. 2013;8:849.

  202. Underwood JG, Uzilov AV, Katzman S, Onodera CS, Mainzer JE, Mathews DH, Lowe TM, Salama SR, Haussler D. FragSeq: transcriptome-wide RNA structure probing using high-throughput sequencing. Nat Methods. 2010;7:995.

  203. Sakurai M, Yano T, Kawabata H, Ueda H, Suzuki T. Inosine cyanoethylation identifies A-to-I RNA editing sites in the human transcriptome. Nat Chem Biol. 2010;6:733.

  204. Meyer KD, Saletore Y, Zumbo P, Elemento O, Mason CE, Jaffrey SR. Comprehensive analysis of mRNA methylation reveals enrichment in 3′ UTRs and near stop codons. Cell. 2012;149(7):1635–46.

  205. Gu W, Lee H-C, Chaves D, Youngman EM, Pazour GJ, Conte D Jr, Mello CC. CapSeq and CIP-TAP identify Pol II start sites and reveal capped small RNAs as C. elegans piRNA precursors. Cell. 2012;151(7):1488–500.

  206. Affymetrix/Cold Spring Harbor Laboratory ENCODE Transcriptome Project. Post-transcriptional processing generates a diversity of 5′-modified long and short RNAs. Nature. 2009;457:1028.

  207. Crawford GE, Holt IE, Whittle J, Webb BD, Tai D, Davis S, Margulies EH, Chen Y, Bernat JA, Ginsburg D, et al. Genome-wide mapping of DNase hypersensitive sites using massively parallel signature sequencing (MPSS). Genome Res. 2006;16(1):123–31.

  208. Gaulton KJ, Nammo T, Pasquali L, Simon JM, Giresi PG, Fogarty MP, Panhuis TM, Mieczkowski P, Secchi A, Bosco D, et al. A map of open chromatin in human pancreatic islets. Nat Genet. 2010;42:255.

  209. Giresi PG, Kim J, McDaniell RM, Iyer VR, Lieb JD. FAIRE (formaldehyde-assisted isolation of regulatory elements) isolates active regulatory elements from human chromatin. Genome Res. 2007;17(6):877–85.

  210. Ponts N, Harris EY, Prudhomme J, Wick I, Eckhardt-Ludka C, Hicks GR, Hardiman G, Lonardi S, Le Roch KG. Nucleosome landscape and control of transcription in the human malaria parasite. Genome Res. 2010;20(2):228–38.

  211. Buenrostro JD, Wu B, Chang HY, Greenleaf WJ. ATAC-seq: a method for assaying chromatin accessibility genome-wide. Curr Protoc Mol Biol. 2015;109(1):21.29.1–21.29.9.

  212. Fullwood MJ, Liu MH, Pan YF, Liu J, Xu H, Mohamed YB, Orlov YL, Velkov S, Ho A, Mei PH, et al. An oestrogen-receptor-α-bound human chromatin interactome. Nature. 2009;462:58.

  213. Duan Z, Andronescu M, Schutz K, Lee C, Shendure J, Fields S, Noble WS, Blau CA. A genome-wide 3C-method for characterizing the three-dimensional architectures of genomes. Methods. 2012;58(3):277–88.

  214. Zhao Z, Tavoosidana G, Sjölinder M, Göndör A, Mariano P, Wang S, Kanduri C, Lezcano M, Sandhu KS, Singh U, et al. Circular chromosome conformation capture (4C) uncovers extensive networks of epigenetically regulated intra- and interchromosomal interactions. Nat Genet. 2006;38:1341.

  215. Dostie J, Dekker J. Mapping networks of physical interactions between genomic elements using 5C technology. Nat Protoc. 2007;2:988.

  216. Belton J-M, McCord RP, Gibcus JH, Naumova N, Zhan Y, Dekker J. Hi–C: a comprehensive technique to capture the conformation of genomes. Methods. 2012;58(3):268–76.

  217. Sanchez-Luque FJ, Richardson SR, Faulkner GJ. Retrotransposon capture sequencing (RC-Seq): a targeted, high-throughput approach to resolve somatic L1 retrotransposition in humans. In: Garcia-Pérez JL, editor. Transposons and Retrotransposons: Methods and Protocols. New York, NY: Springer New York; 2016. p. 47–77.

  218. Baillie JK, Barnett MW, Upton KR, Gerhardt DJ, Richmond TA, De Sapio F, Brennan PM, Rizzu P, Smith S, Fell M, et al. Somatic retrotransposition alters the genetic landscape of the human brain. Nature. 2011;479:534.

  219. van Opijnen T, Bodi KL, Camilli A. Tn-seq: high-throughput parallel sequencing for fitness and genetic interaction studies in microorganisms. Nat Methods. 2009;6:767.

  220. van Opijnen T, Camilli A. Transposon insertion sequencing: a new tool for systems-level analysis of microorganisms. Nat Rev Microbiol. 2013;11(7). https://doi.org/10.1038/nrmicro3033.

  221. Klein IA, Resch W, Jankovic M, Oliveira T, Yamane A, Nakahashi H, Di Virgilio M, Bothmer A, Nussenzweig A, Robbiani DF, et al. Translocation-capture sequencing reveals the extent and nature of chromosomal rearrangements in B lymphocytes. Cell. 2011;147(1):95–106.

  222. Oliveira TY, Resch W, Jankovic M, Casellas R, Nussenzweig MC, Klein IA. Translocation capture sequencing: a method for high throughput mapping of chromosomal rearrangements. J Immunol Methods. 2012;375(1):176–81.

  223. Velthuis HHW, van Doorn A. A century of advances in bumblebee domestication and the economic and environmental aspects of its commercialization for pollination. Apidologie. 2006;37(4):421–51.

  224. Brown MJF, Paxton RJ. The conservation of bees: a global perspective. Apidologie. 2009;40(3):410–6.

  225. Besard L, Mommaerts V, Abdu-Alla G, Smagghe G. Lethal and sublethal side-effect assessment supports a more benign profile of spinetoram compared with spinosad in the bumblebee Bombus terrestris. Pest Manag Sci. 2011;67(5):541–7.

  226. Thomson D. Detecting the effects of introduced species: a case study of competition between Apis and Bombus. Oikos. 2006;114(3):407–18.

  227. Ellis JD, Munn PA. The worldwide health status of honey bees. Bee World. 2005;86(4):88–101.

  228. Cox-Foster DL, Conlan S, Holmes EC, Palacios G, Evans JD, Moran NA, Quan P-L, Briese T, Hornig M, Geiser DM, et al. A metagenomic survey of microbes in honey bee colony collapse disorder. Science. 2007;318(5848):283–7.

  229. Anderson D, East IJ. The latest buzz about colony collapse disorder. Science. 2008;319(5864):724–5.

  230. Horvath P, Barrangou R. CRISPR/Cas, the immune system of Bacteria and Archaea. Science. 2010;327(5962):167–70.

  231. The Honeybee Genome Sequencing Consortium. Insights into social insects from the genome of the honeybee Apis mellifera. Nature. 2006;443(7114):931–49.

  232. Sadd BM, Barribeau SM, Bloch G, de Graaf DC, Dearden P, Elsik CG, Gadau J, Grimmelikhuijzen CJ, Hasselmann M, Lozier JD, et al. The genomes of two key bumblebee species with primitive eusocial organization. Genome Biol. 2015;16(1):76.

  233. Martinez FD, Wright AL, Taussig LM, Holberg CJ, Halonen M, Morgan WJ. Asthma and wheezing in the first six years of life. N Engl J Med. 1995;332(3):133–8.

  234. Anderson GP. Endotyping asthma: new insights into key pathogenic mechanisms in a complex, heterogeneous disease. Lancet. 2008;372(9643):1107–19.

  235. Moffatt MF, Kabesch M, Liang L, Dixon AL, Strachan D, Heath S, Depner M, von Berg A, Bufe A, Rietschel E, et al. Genetic variants regulating ORMDL3 expression contribute to the risk of childhood asthma. Nature. 2007;448:470.

  236. Verlaan DJ, Berlivet S, Hunninghake GM, Madore A-M, Larivière M, Moussette S, Grundberg E, Kwan T, Ouimet M, Ge B, et al. Allele-specific chromatin remodeling in the ZPBP2/GSDMB/ORMDL3 locus associated with the risk of asthma and autoimmune disease. Am J Hum Genet. 2009;85(3):377–93.

  237. Miller M, Tam AB, Cho JY, Doherty TA, Pham A, Khorram N, Rosenthal P, Mueller JL, Hoffman HM, Suzukawa M, et al. ORMDL3 is an inducible lung epithelial gene regulating metalloproteases, chemokines, OAS, and ATF6. Proc Natl Acad Sci. 2012;109(41):16648–53.

  238. Breslow DK, Collins SR, Bodenmiller B, Aebersold R, Simons K, Shevchenko A, Ejsing CS, Weissman JS. Orm family proteins mediate sphingolipid homeostasis. Nature. 2010;463(7284):1048–53.

  239. Breslow DK, Weissman JS. Membranes in balance: mechanisms of sphingolipid homeostasis. Mol Cell. 2010;40(2):267–79.

  240. Worgall TS, Veerappan A, Sung B, Kim BI, Weiner E, Bholah R, Silver RB, Jiang X-C, Worgall S. Impaired sphingolipid synthesis in the respiratory tract induces airway hyperreactivity. Sci Transl Med. 2013;5(186):186ra167.

  241. Miller M, Rosenthal P, Beppu A, Mueller JL, Hoffman HM, Tam AB, Doherty TA, McGeough MD, Pena CA, Suzukawa M, et al. ORMDL3 transgenic mice have increased airway remodeling and airway responsiveness characteristic of asthma. J Immunol. 2014;192(8):3475–87.

  242. Lopez J, Burtis CA, Bruns DE. Tietz fundamentals of clinical chemistry and molecular diagnostics, 7th ed.: Elsevier, Amsterdam, 1075 pp, ISBN 978-1-4557-4165-6. Indian J Clin Biochem. 2015;30(2):243.

  243. Zivkovic AM, Wiest MM, Nguyen UT, Davis R, Watkins SM, German JB. Effects of sample handling and storage on quantitative lipid analysis in human serum. Metabolomics. 2009;5(4):507–16.

  244. Dong J, Guo H, Yang R, Li H, Wang S, Zhang J, Chen W. Serum LDL- and HDL-cholesterol determined by ultracentrifugation and HPLC. J Lipid Res. 2011;52(2):383–8.

  245. Hafiane A, Genest J. High density lipoproteins: measurement techniques and potential biomarkers of cardiovascular risk. BBA Clinical. 2015;3:175–88.

  246. Mora S, Otvos JD, Rifai N, Rosenson RS, Buring JE, Ridker PM. Lipoprotein particle profiles by nuclear magnetic resonance compared with standard lipids and apolipoproteins in predicting incident cardiovascular disease in women. Circulation. 2009;119(7):931–9.

  247. Rosenson RS, Brewer HB, Chapman MJ, Fazio S, Hussain MM, Kontush A, Krauss RM, Otvos JD, Remaley AT, Schaefer EJ. HDL measures, particle heterogeneity, proposed nomenclature, and relation to atherosclerotic cardiovascular events. Clin Chem. 2011;57(3):392–410.

  248. Caulfield MP, Li S, Lee G, Blanche PJ, Salameh WA, Benner WH, Reitz RE, Krauss RM. Direct determination of lipoprotein particle sizes and concentrations by ion mobility analysis. Clin Chem. 2008;54(8):1307–16.

  249. Lavu M, Gundewar S, Lefer DJ. Gene therapy for ischemic heart disease. J Mol Cell Cardiol. 2011;50(5):742–50.

  250. Ding Q, Strong A, Patel KM, Ng S-L, Gosis BS, Regan SN, Cowan CA, Rader DJ, Musunuru K. Permanent alteration of PCSK9 with in vivo CRISPR-Cas9 genome editing. Circ Res. 2014;115(5):488–92.

  251. Musunuru K, Orho-Melander M, Caulfield MP, Li S, Salameh WA, Reitz RE, Berglund G, Hedblad B, Engström G, Williams PT, et al. Ion mobility analysis of lipoprotein subfractions identifies three independent axes of cardiovascular risk. Arterioscler Thromb Vasc Biol. 2009;29(11):1975–80.

  252. Mansour MR, Abraham BJ, Anders L, Berezovskaya A, Gutierrez A, Durbin AD, Etchin J, Lawton L, Sallan SE, Silverman LB, et al. An oncogenic super-enhancer formed through somatic mutation of a noncoding intergenic element. Science. 2014;346(6215):1373–7.

Acknowledgements

Many thanks to John Mattick (Genomics England & Garvan Institute of Medical Research) and David Guttery (University of Leicester) for their advice on shaping the structure of the review.

Author information

Contributions

KB conceived the original idea to compose the review, formed and managed the collaboration, wrote the background, conclusions, and Table 6, provided additional text to link all contributors’ sections together, produced the artworks, and provided final editing across all sections. LDD wrote the section on technology, and Table 2 together with KB. KAC, MAN, and CBTM wrote the section on gene editing and CRISPR, and ocular genetics. SS, VH, LC, and JS wrote the section on cancer and Table 1 together with KB. TK-D wrote Table 3 on CRISPR’s utility in bees. CCS wrote Table 5 on cardiovascular disease. BC, JAL-S, and RSK jointly wrote the section on asthma and Table 4. All authors have reviewed and approved the final version of the review.

Corresponding authors

Correspondence to K. Blighe or C. B. T. Moore.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.

About this article

Cite this article

Blighe, K., DeDionisio, L., Christie, K.A. et al. Gene editing in the context of an increasingly complex genome. BMC Genomics 19, 595 (2018). https://doi.org/10.1186/s12864-018-4963-8

  • DOI: https://doi.org/10.1186/s12864-018-4963-8

Keywords