Visualizing tumor evolution with the fishplot package for R
© The Author(s). 2016
Received: 17 June 2016
Accepted: 22 October 2016
Published: 7 November 2016
Massively parallel sequencing at depth now enables tumor heterogeneity and evolution to be characterized in unprecedented detail. Tracking these changes in clonal architecture often provides insight into therapeutic response and resistance. In complex cases involving multiple timepoints, standard visualizations, such as scatterplots, can be difficult to interpret. Current data visualization methods are also typically manual and laborious, and often only approximate subclonal fractions.
We have developed an R package that accurately and intuitively displays changes in clonal structure over time. It requires simple input data and produces illustrative and easy-to-interpret graphs suitable for diagnosis, presentation, and publication.
The simplicity, power, and flexibility of this tool make it valuable for visualizing tumor evolution, and it has potential utility in both research and clinical settings. The fishplot package is available at https://github.com/chrisamiller/fishplot.
Most cancers are heterogeneous and contain multiple subclonal populations that can be detected via high depth massively parallel sequencing. An increasing number of studies are collecting and sequencing longitudinal samples, allowing the clonal evolution of tumors to be tracked in detail. Though tools have been developed for inferring subclonal architecture [1–4] and for determining tumor phylogeny [5, 6], few offer compelling and intuitive visualizations.
We reported one of the first studies describing tumor evolution defined by whole genome sequencing, in patients with relapsed Acute Myeloid Leukemia (AML), and that publication contained a series of custom figures showing changes in clonal architecture between the primary and relapse presentations [7]. Often called “fish plots” due to their resemblance to tropical fish, these visualizations have become widely adopted, both by our group and others [8–11]. Until now, each has been created in vector-art programs like Adobe Illustrator, which is laborious and makes representing accurate proportions challenging. As the sizes of cohorts have grown, this approach has quickly become untenable.
To enable the creation of these plots in a robust and automatable fashion, we have developed an R package (“fishplot”) that takes estimates of subclonal prevalence at different timepoints, and outputs publication-ready images that accurately represent subclonal relationships and their relative proportions. Fishplot is available at https://github.com/chrisamiller/fishplot.
The fishplot package is implemented in R and has a minimal set of dependencies (the “plotrix”, “png”, and “Hmisc” packages). Required inputs include the clonal fraction of each tumor cell population at each timepoint, a representation of descent in the form of parental relationships, and the timepoints at which the samples were obtained. These data are readily available from existing tools, such as the clonevol package, which already includes code that interfaces with fishplot for visualization (https://github.com/hdng/clonevol), enabling integration into existing genomic analysis pipelines. Figures are output through the standard R graphics libraries, which allow the creation of vector or raster images of any size, suitable for a wide range of applications.
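As a minimal sketch of the three inputs described above, the following R code builds a small fraction table, a parent vector, and a timepoint vector. The numbers are invented for illustration, and `check_nesting` is a hypothetical helper (not part of fishplot) that tests the constraint implied by nested clones: at every timepoint, the children of a clone cannot sum to more than the clone itself. The fishplot calls are guarded so the sketch still runs where the package is not installed.

```r
# Hypothetical example data: 3 clones (rows) measured at 3 timepoints (columns).
# matrix() fills column by column, so each line below is one timepoint.
timepoints <- c(0, 30, 120)
frac.table <- matrix(
  c(100, 40,  0,   # timepoint 0:   clone 1, clone 2, clone 3
     20,  5, 10,   # timepoint 30
     90,  0, 80),  # timepoint 120
  ncol = length(timepoints))
parents <- c(0, 1, 1)  # 0 = no parent; clones 2 and 3 descend from clone 1

# Hypothetical helper (not part of fishplot): TRUE when, at every timepoint,
# the children of each clone sum to no more than the clone itself.
check_nesting <- function(fracs, parents) {
  all(vapply(seq_along(parents), function(clone) {
    kids <- which(parents == clone)
    if (length(kids) == 0) return(TRUE)
    all(colSums(fracs[kids, , drop = FALSE]) <= fracs[clone, ])
  }, logical(1)))
}
stopifnot(check_nesting(frac.table, parents))

# Only attempted when fishplot is installed.
if (requireNamespace("fishplot", quietly = TRUE)) {
  library(fishplot)
  fish <- createFishObject(frac.table, parents, timepoints = timepoints)
  fish <- layoutClones(fish)
}
```

Checking the nesting constraint before building the fish object makes it easier to trace a rejected input back to a specific clone and timepoint.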
We first created a fish plot of an AML patient with a chemotherapy-induced bottleneck that eliminated one subclone, while another survived and drove the relapse (Fig. 1a). This plot uses the default color scheme and the default curve splining method for smoothing, along with timepoint labels representing the number of days since tumor presentation.
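A plot of this kind might be produced along the following lines. This is a sketch under stated assumptions, not the code behind Fig. 1a: the fractions are illustrative values that mimic a bottleneck, and the output file name is arbitrary. It uses the package's default colors, spline smoothing, and vertical lines labeled with the day of each sample.

```r
# Illustrative values only (not the patient data behind Fig. 1a).
timepoints <- c(0, 30, 150)          # days since presentation
frac.table <- matrix(
  c(100, 45,  2,    # day 0:   founder, subclone A, subclone B
      3,  0,  2,    # day 30:  post-chemotherapy bottleneck
     95,  0, 90),   # day 150: relapse driven by subclone B
  ncol = length(timepoints))
parents <- c(0, 1, 1)                # both subclones descend from the founder
vlab <- paste("day", timepoints)     # numeric timepoint labels

# Only attempted when fishplot is installed.
if (requireNamespace("fishplot", quietly = TRUE)) {
  library(fishplot)
  fish <- createFishObject(frac.table, parents, timepoints = timepoints)
  fish <- layoutClones(fish)
  pdf(file.path(tempdir(), "fish_spline.pdf"), width = 8, height = 5)
  fishPlot(fish, shape = "spline", vlines = timepoints, vlab = vlab)
  invisible(dev.off())
}
```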
We next plotted a breast tumor sequenced before and after 4 months of neoadjuvant aromatase inhibitor therapy [12] (Fig. 1b). This therapy did not induce an extreme bottleneck, but nonetheless resulted in substantial clonal remodeling. The resulting figure uses user-defined colors and represents subclones as polygons, without curve smoothing. The standard numeric labels were replaced with categorical labels, but the timepoints remain scaled appropriately.
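The same options can be sketched in R as follows, again with invented fractions rather than the data behind Fig. 1b. The colors and label text are arbitrary choices, and the sketch assumes that `setCol` takes one color per clone and that `fishPlot` accepts `shape = "polygon"`, as in the package's example scripts.

```r
# Illustrative values only (not the tumor data behind Fig. 1b).
timepoints <- c(0, 120)              # days; x-axis spacing stays to scale
frac.table <- matrix(
  c(100, 60, 10,    # pre-treatment:  founder, subclone A, subclone B
    100, 15, 70),   # post-treatment
  ncol = length(timepoints))
parents <- c(0, 1, 1)
clone.cols <- c("#888888", "#EF8A62", "#67A9CF")  # one user-chosen color per clone
labels <- c("Pre-treatment", "Post-treatment")    # categorical labels
stopifnot(length(clone.cols) == nrow(frac.table),
          length(labels) == length(timepoints))

# Only attempted when fishplot is installed.
if (requireNamespace("fishplot", quietly = TRUE)) {
  library(fishplot)
  fish <- createFishObject(frac.table, parents, timepoints = timepoints)
  fish <- layoutClones(fish)
  fish <- setCol(fish, clone.cols)
  pdf(file.path(tempdir(), "fish_polygon.pdf"), width = 8, height = 5)
  fishPlot(fish, shape = "polygon", vlines = timepoints, vlab = labels)
  invisible(dev.off())
}
```

Because the labels are supplied separately from the timepoints, the text shown at each vertical line can be categorical while the horizontal positions still reflect elapsed time.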
Lastly, we created a model of AML31, a patient who was sampled with ultra-deep sequencing at many timepoints, allowing even very rare (<1% variant allele frequency) subclones to be detected [9] (Fig. 1c). Chemotherapy did not completely eliminate the cancer in this case, resulting in detectable levels of tumor until relapse at day 505 (with subsequent clearance of the tumor via salvage chemotherapy). The fishplot shows this progression, including the failure to completely clear the tumor. This patient also had oligoclonal skewing post-chemotherapy, resulting in clonal expansion from a hematopoietic stem cell that was not part of the patient's leukemia (Fig. 1c, blue). The package includes functionality for representing such unrelated clones, which is also useful in the case of “collision tumors” with independent origins.
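One way to encode an unrelated clone, sketched here with invented numbers rather than the AML31 data, is to give it a parent of 0 so that it becomes a second root alongside the founding leukemic clone. Since independent roots share the same sample, the sketch also checks that their fractions never sum to more than 100 at any timepoint.

```r
# Illustrative values only (not the AML31 data behind Fig. 1c).
timepoints <- c(0, 30, 200, 500)
frac.table <- matrix(
  c(95, 30,  0,    # day 0:   leukemic founder, its subclone, unrelated clone
     2,  1,  0,    # day 30:  residual tumor after chemotherapy
     1,  1, 15,    # day 200: unrelated clone expands
    80, 75, 10),   # day 500: relapse
  ncol = length(timepoints))
parents <- c(0, 1, 0)   # clones 1 and 3 both have parent 0: independent origins

roots <- which(parents == 0)
# Independent roots cannot jointly exceed 100% of the sample.
stopifnot(all(colSums(frac.table[roots, , drop = FALSE]) <= 100))

# Only attempted when fishplot is installed.
if (requireNamespace("fishplot", quietly = TRUE)) {
  library(fishplot)
  fish <- createFishObject(frac.table, parents, timepoints = timepoints)
  fish <- layoutClones(fish)
  pdf(file.path(tempdir(), "fish_independent.pdf"), width = 8, height = 5)
  fishPlot(fish, shape = "spline", vlines = timepoints,
           vlab = paste("day", timepoints))
  invisible(dev.off())
}
```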
The code and data used to produce Fig. 1 are available as Additional file 1 and can also be found within the example scripts at https://github.com/chrisamiller/fishplot/blob/master/tests/test.R. Additional file 2 contains an analysis pipeline demonstration that chains together the sciClone, clonevol, and fishplot packages, taking data from raw somatic variant calls through subclonal detection, phylogeny inference, and fishplot creation.
Characterizing subclonal architecture and the ways in which tumors evolve, both over time and in the context of therapeutic intervention, is important for understanding therapy resistance, which contributes to tens of thousands of cancer deaths each year. Fish plots, like those presented here, provide researchers and clinicians with an intuitive and accurate representation of how an individual tumor is changing over time, potentially making analysis and diagnosis easier. Despite being designed for tracking tumor evolution, this tool may also find niches outside of cancer biology, and could easily be used to represent the changing landscapes of microbial populations, for example. Our group has already used images created by the fishplot package in a number of genomic pipelines and pending publications, and we anticipate that it will be adopted widely within the large community of scientists studying tumor evolution.
Availability and requirements
The fishplot package has been tested on R versions > 2.15 and requires the “plotrix”, “Hmisc”, and “png” packages. It is available from https://github.com/chrisamiller/fishplot.
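The requirements above can be checked from an R session before installing. The following is a minimal sketch; the variable names are arbitrary, and the commented `devtools::install_github` call is one common route for installing from GitHub rather than the only one.

```r
# Packages that fishplot depends on, as listed above.
required <- c("plotrix", "Hmisc", "png")

# TRUE for each dependency that can be loaded in this R installation.
installed <- vapply(required, requireNamespace, logical(1), quietly = TRUE)
missing.pkgs <- required[!installed]

# The package has been tested on R versions greater than 2.15.
r.ok <- getRversion() > "2.15"

if (length(missing.pkgs) > 0) {
  message("Missing dependencies: ", paste(missing.pkgs, collapse = ", "))
}
if (!r.ok) {
  message("R version ", getRversion(), " has not been tested with fishplot")
}

# Installation itself needs network access, so it is left commented out:
# install.packages(missing.pkgs)
# devtools::install_github("chrisamiller/fishplot")
```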
AML: Acute Myeloid Leukemia
Research reported in this publication was supported in part by funding provided by grant U54HG003079 from the National Human Genome Research Institute to RKW. The funding body played no role in the design of the study or writing of the manuscript.
Availability of data and material
The software package and the code that generates Fig. 1 are available as Additional file 1 and at https://github.com/chrisamiller/fishplot. Raw sequence data and variant calls for these tumors are available via the referenced manuscripts.
CMiller created the fishplot package. HD contributed code and testing. JM, LD, and TL designed the original fish plots. CMiller wrote the manuscript and created the figures. RW and CMaher provided supervision. RW secured funding. TL, LD, and EM edited the manuscript. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (http://creativecommons.org/licenses/by/4.0/), which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.
1. Miller CA, White BS, Dees ND, Griffith M, Welch JS, Griffith OL, et al. SciClone: inferring clonal architecture and tracking the spatial and temporal patterns of tumor evolution. PLoS Comput Biol. 2014;10:e1003665.
2. Roth A, Khattra J, Yap D, Wan A, Laks E, Biele J, et al. PyClone: statistical inference of clonal population structure in cancer. Nat Methods. 2014;11:396–8.
3. Deshwar AG, Vembu S, Yung CK, Jang GH, Stein L, Morris Q. PhyloWGS: reconstructing subclonal composition and evolution from whole-genome sequencing of tumors. Genome Biol. 2015;16:35.
4. Oesper L, Mahmoody A, Raphael BJ. THetA: inferring intra-tumor heterogeneity from high-throughput DNA sequencing data. Genome Biol. 2013;14:R80.
5. Qiao Y, Quinlan AR, Jazaeri AA, Verhaak RG, Wheeler DA, Marth GT. SubcloneSeeker: a computational framework for reconstructing tumor clone structure for cancer variant interpretation and prioritization. Genome Biol. 2014;15:443.
6. Niknafs N, Beleva-Guthrie V, Naiman DQ, Karchin R. SubClonal Hierarchy Inference from Somatic Mutations: Automatic Reconstruction of Cancer Evolutionary Trees from Multi-region Next Generation Sequencing. PLoS Comput Biol. 2015;11.
7. Ding L, Ley TJ, Larson DE, Miller CA, Koboldt DC, Welch JS, et al. Clonal evolution in relapsed acute myeloid leukaemia revealed by whole-genome sequencing. Nature. 2012;481:506–10.
8. Engle EK, Fisher DA, Miller CA, McLellan MD, Fulton RS, Moore DM, et al. Clonal evolution revealed by whole genome sequencing in a case of primary myelofibrosis transformed to secondary acute myeloid leukemia. Leukemia. 2014. doi:10.1038/leu.2014.289.
9. Griffith M, Miller CA, Griffith OL, Krysiak K, Skidmore ZL, Ramu A, et al. Optimizing cancer genome sequencing and analysis. Cell Syst. 2015;1:210–23.
10. Wang Y, Waters J, Leung ML, Unruh A, Roh W, Shi X, et al. Clonal evolution in breast cancer revealed by single nucleus genome sequencing. Nature. 2014;512:155–60.
11. Lundberg P, Karow A, Nienhold R, Looser R, Hao-Shen H, Nissen I, et al. Clonal evolution and clinical correlates of somatic mutations in myeloproliferative neoplasms. Blood. 2014;123:2220–8.
12. Miller CA, Gindin Y, Lu C, Griffith OL, Griffith M, Shen D, et al. Aromatase inhibition remodels the clonal architecture of estrogen-receptor-positive breast cancers. Nat Commun. 2016;7:12498.