Characteristics of DNA macro-alterations in breast cancer with liver metastasis before treatment
BMC Genomics volume 24, Article number: 391 (2023)
Whole-genome doubling (WGD) has been observed in 30% of cancers, followed by a highly complex rearranged karyotype unfavourable to breast cancer's outcome. However, the macro-alterations that characterise liver metastasis in breast cancer(BC) are poorly understood. Here, we conducted a whole-genome sequencing analysis of liver metastases to explore the status and the time frame model of these macro-alterations in pre-treatment patients with metastatic breast cancer.
Whole-genome sequencing was conducted in 11 paired primary tumours, lymph node metastasis, and liver metastasis fresh samples from four patients with late-stage breast cancer. We also chose five postoperative frozen specimens from patients with early-stage breast cancer before any treatment as control. Surprisingly, all four liver metastasis samples were classified as WGD + . However, the previous study reported that WGD happened in 30% of cancers and 2/5 in our early-stage samples. WGD was not observed in the two separate primary tumours and one lymph node metastasis of one patient with metastatic BC, but her liver metastasis showed an early burst of bi-allelic copy number gain. The phylogenetic tree proves her 4 tumour samples were the polyclonal origin and only one WGD + clone metastasis to the liver. Another 3 metastatic BC patients’ primary tumour and lymph node metastasis experienced WGD as well as liver metastasis, and they all showed similar molecular time-frame of copy number(CN) gain across locations within the same patient. These patients’ tumours were of monoclonal origin, and WGD happened in a founding clone before metastasis, explaining that all samples share the CN-gain time frame.
After WGD, the genomes usually face instability to evolve other macro-alterations. For example, a greater quantity and variety of complex structural variations (SVs) were detected in WGD + samples. The breakpoints were enriched in the chr17: 39 Mb-40 Mb tile, which contained the HER2 gene, resulting in the formation of tyfonas, breakage-fusion-bridge cycles, and double minutes. These complex SVs may be involved in the evolutionary mechanisms of the dramatic increase of HER2 copy number.
Our work revealed that the WGD + clone might be a critical evolution step for liver metastasis and favoured following complex SV of breast cancer.
Metastatic breast cancer (BC) may be biologically different from the original primary tumour because cancer cells continue to evolve stochastically or in response to treatment/microenvironment, leading to treatment failure and death . Evidence indicates that hormone receptors and human epidermal growth factor receptor 2 (HER2) status are discordant between primary and metastatic tumours in 20%-25% of patients . Therefore, it is necessary to understand the genomic profile that depicts metastasis and evolution. Most molecular studies of metastatic BC have been based on single-nucleotide variants (SNVs) and fusion genes by single rearrangements [3,4,5]. Nevertheless, macro-alterations that significantly impact the malignant phenotype and prognosis [6,7,8] are poorly understood.
Macro-alterations include whole-genome doubling (WGD), somatic copy number alterations (CNAs), complex structural variations (SV), and other similar fatal events to survival  that may occur instantaneously during a particular developmental stage in cancer cells. In most cases, these events are presumed to be harmful, and only in some rare cases does it result in an increase in cellular fitness and the generation of viable "hopeful monsters" . Doubling of a complete set of diploid chromosomes has been proposed as a very early event in tumour evolution [11, 12], resulting in tetraploidy, and cancer cells experience missegregation or extensive loss of heterozygosity (LOH) events to evolve a more stable aneuploidy genome . Complex SV refers to multiple (> 2) DNA junctions in distinct topologies within the reference genome that yield one or more copies of complex rearranged alleles . Tyfonas represent an extreme case that harbours significantly elevated junctions . Together with others, these macro-alterations represent an alternative evolutionary pattern different from the Darwinian selection. However, one concern is whether these macro-alterations appear before or after metastasis. Treatment such as chemo-radiation therapy makes it more complicated because there are 2 possibilities. The first is that the treatment selects resistant sub-clones that have already evolved macro-alterations [16, 17]. Another is that these macro-alterations occur following exposure to therapeutics, which act as stressors to promote adaptive, genome-based evolution . Hence, this study used whole-genome sequencing (WGS) data to evaluate these macro-alterations of BC in simultaneous paired fresh primary tumours, liver metastases, and lymph node metastases samples from four patients with late-stage BC. They were all treatment-naive.
We detected WGD events in all four liver metastases. In one of the late-stage patients, WGD occurred only in liver metastases with punctuated copy number (CN) gain at a very early molecular stage of cell development. In the other three patients, WGD occurred across all specimens that showed a similar limited or wide distribution of the CN-gain time frame. In addition, complex SVs breakpoints were mainly present in WGD + samples. Our work implied macro-alterations in our samples were not the result of treatment stress or the liver's microenvironment. They might happen stochastically in the early molecular time of cell development.
Four patients with liver metastatic BC (MBC) at diagnosis and five with early BC (EBC) were enrolled. A CT-guided core needle biopsy was used to acquire primary tumours (pri), lymph node metastasis (ln) (MBC1 and MBC2), and liver metastasis (liv) samples. MBC1 had two separate primary tumours, and we punctured both. MBC3 had more than two primary tumours; we only punctured the largest one because the others were too small to puncture. Five postoperative specimens of early-stage primary tumours stored at -80 °C were obtained from the West China Hospital Breast Cancer Biobank . All samples were treatment-naive. The molecular subtypes, treatments, and outcomes are summarised in Supplemental Table 1. Hormone receptor status of primary tumours was positive in MBC2, MBC3, MBC4, EBC3, EBC6, EBC7, and EBC8 and negative in MBC1 and EBC1. HER2 status of primary tumours was positive in MBC1, MBC2, MBC4, EBC3, EBC6, EBC7, and EBC8 and unknown in MBC3 and EBC1. Overall survival ranged from 125–152 months in EBCs and 11.8–70.1 months in MBCs. Progression-free survival ranged from 0–23.37 months in MBCs.
Mutational loads and signature
All tumour samples underwent WGS (coverage range 24–37'). Blood or normal breast tissues were used as normal controls. BC has a moderate mutational load, which has been reported to be close to one mutation/Mb . In our EBC samples, the mutational loads ranged from 0.22 (EBC8) to 21.09 (EBC7) mutations/Mb. In MBCs, the mutational loads ranged from 0.8 (MBC1pri2) to 3.02 (MBC4liv) mutations/Mb (Fig. 1A, Supplemental Table 2). The most commonly mutated cancer driver genes were TP53 (patient number(P) = 3,sample number(N) = 6), EGFR (P = 2,N = 3) and RECQL4(P = 2,N = 2). Other driver gene mutations did not occur in more than two patients (Fig. 1B, Supplemental Table 3). No cancer driver gene mutations were detected in the 4 samples of MBC1. For MBC2-MBC4, 43.13%-49.22% SNVs of the primary tumour was found to be overlapped with liver metastasis SNV. Nevertheless, only 1.38% SNVs of MBC1pri1 and 3.39% SNVs of MBCpri2 overlapped with liver metastasis SNV(Supplemental Fig. 1).
To understand whether there were differences in the mutational signature between primary tumours and liver metastases, we estimated the mutational signature of each sample based on COSMIC v2 (Fig. 1C, Supplemental Table 4). MBC1pri1, MBC1pri2, MBC1ln, EBC3, and EBC8, which had the lowest mutational load of all samples, were dominated by signature 3 (range 0.29–0.42). All samples of MBC2, MBC3, and MBC4 were dominated by signature 1 (range 0.17–0.27) and signature 5 (range 0.23–0.38). MBC1liv had both signature 3 (0.23), similar to other samples of MBC1, and signature 5 (0.31), similar to the other samples of late-stage patients. The other three patients with early-stage BC (EBC1, EBC6, and EBC7) were dominated by signatures 2 (range 0.15–0.27) and 13 (range 0.20–0.53). Signature 3 is associated with the failure of DNA double-strand break repair by homologous recombination, and signature 5 was previously associated with age (clock-like) in several different cancers . Signature 2 + 13 shows activation of AID/APOBEC cytidine deaminases . EBC7 had our samples' highest mutational load of 21.09 mutations/Mb. It also showed the highest signature of AID/APOBEC cytidine deaminases, which were reported as the most common mutational processes among hypermutated breast cancer .
Interestingly, even though the SNVs/indels of liver and lymph metastases differed almost entirely from the two primary tumours in MBC1, they still showed some similarity in signature, suggesting the tumour samples from MBC1 were from the different clones with the same pathogenic factor.
WGD and LOH
In MBC1liv, 52.29% of the genome was affected by the bi-allele gain (allele CN N: N, N ≥ 2), 24.64% was affected by the mono-allele gain (N:1), 2.73% was affected by gain + loss (N:0), and 9.22% was affected by loss (1:0). MBC1liv was classified as WGD + , as 50% or more of the autosomal tumour genome had a somatic major CN of two or more . However, the other three samples from MBC1 were WGD-, with a gain-affected genome range of 0.72%-9.81%. Again, this implied that the tumours from MBC1 were of poly-clone origin, and only the clone metastases to the liver underwent WGD. In contrast, in MBC2, MBC3, and MBC4, allele-specific CNAs typically involved the same alleles within the same patients, suggesting a common seeding of these samples (Fig. 2A, Supplemental Table 5). All samples from MBC2, MBC3, and MBC4 were classified as having undergone WGD (gain-affected genome range: 78.35%–99.09%) (Fig. 2B, Supplemental Table 5). 2/5 EBCs had WGD events(Supplemental Table 5). The ploidy of WGD + samples ranged from 2.92–5, and that of WGD- samples ranged from 1.78–2.04. All TP53-mutated samples from three patients were WGD + . However, four WGD + samples (EBC1, MBC1liv, MBC3pri, and MBC3liv) did not harbour TP53 mutations, suggesting that TP53 dysfunction is not obligatory for WGD .
Tumour cells were reported to tolerate many large-scale losses after WGD to evolve a more stable sub-tetraploid tumour genome , and WGD buffers the deleterious impact of somatic mutations and somatic CNAs in regions of LOH . Allele loss affected 8.17%-35.98% of the genome of WGD + samples and 1.63%-30.26% of the genome of WGD- samples. In other words, our limited samples did not observe a correlation between WGD and LOH. Recurrent loss of 8p23.3-8p12, 17p, and 17q21.2-17q21.31 resulted in the LOH of TP53 in 14/16 samples and BRCA1 in 13/16 samples.
Sub-clonal composition and evolution
We used phylowgs to infer the sub-clones and build a phylogenetic tree from SNVs and CNAs. It is widely accepted that tumours are monoclonal in origin, arising from a series of mutations in a single cell and its descendants  MBC2, MBC3, and MBC4 originated from a founding clone with cancer driver gene mutations established in the primary tumour seeding all other metastases (Fig. 3). The founding clone then diverged into 2–4 lineages predominantly present in primary tumours, lymph node metastasis, or liver metastasis. Some private clones then evolved from these lineages, such as clone 3, which evolved from clone 2, and were both present in the liver metastasis of MBC3. However, polyclonal lesions, characterised by distinct cell subpopulations expanding within separate domains of the growing tumour, were also reportedly present in various tumours . Clones 1, 2, 4, and 5 of MBC1 were presumed to have polyclonal origins. Clone 1, characterised by vast CNAs, was predominantly present in liver metastases (Supplemental Table 6).
In addition, MBC1 primary tumour was clinically diagnosed as a HER2-enriched subgroup using immunohistochemistry (IHC) and fluorescent in-situ hybridisation (FISH). The two primary tumours' HER2 CNs calculated by WGS were 5 and 18, in line with clinical diagnosis. However, the HER2 CNs in the lymph and liver metastases were 3 and 3, respectively(Supplemental Table 7), suggesting they lost the HER2 amplification. Progression-free survival was only 3.5 months after trastuzumab treatment for brain metastasis, and 1 year later, liver metastasis had progressed. Another patient, MBC4, who was diagnosed as HER2 positive of the primary tumour and HER2 CN were 49(pri) and 51(liv) calculated by WGS data, had a 23.37 months PFS(supplemental Table 1) after treatment with trastuzumab. The dominance of clones without HER2 amplification of MBC1 may be one reason for trastuzumab resistance. In MBC2, MBC3, and MBC4, the HER2 CN was constant between primary and metastatic tumours. The heterogeneity of metastatic BC may underlie its poor responsiveness to therapy and explain why biomarkers of therapy responsiveness measured exclusively in primary tumours provide a restricted view of the biological properties of metastatic BC.
Another DNA macro-alteration is complex SV, which was reported to be accelerated in WGD + cells . In our study, All complex SV types, including tyfonas, breakage-fusion-bridge cycles (BFBs), chromoplexy, chromothripsis, double minutes (DMs), pyrgo, and rigma, were detected in 9/10 WGD + and 3/6 WGD- samples (Table 1). In the three WGD- samples, only 1–2 complex SVs were detected. In nine WGD + samples, more than five complex SVs or tyfonas were detected, which were highly complicated with high junction-CN junctions and fold-back inversions . The breakpoints of the complex SVs enriched in the chr17: 39 Mb-40 Mb tile (p = 5.80E-06, false discovery rate = 0.017) (Fig. 4A, Supplemental Table 8), which recurrently involved the copy number change of HER2.
The amplitude of CN was relatively flat in the samples without complex SVs, whereas steep amplification peaks were observed in the samples with complex SVs. In EBC1 (HER2 CN = 29), MBC4pri (HER2 CN = 49), and MBC4liv (HER2 CN = 51), tyfonas were implicated in HER2 amplification (Fig. 4B). BFBs and DMs were also associated with HER2 amplification (Fig. 4C-D). Hence, the HER2-enriched samples diagnosed using IHC (3 +), FISH ( +), or CN by WGS (≥ 6) could be divided into subgroups with or without complex SVs. The HER2-enriched samples with complex SVs around HER2 showed a higher CN of HER2 (range 18–129) compared with the HER2-enriched samples without complex SVs (range 4–8). Chromoplexy was the most prevalent event type (events = 39, patients n = 9) (Table 1), which affected the genome with low DNA CN changes(Fig. 4E). Chromothripsis (events = 5, n = 4)(Fig. 4E), rigma (events = 4, n = 3), and pyrgo (events = 1, n = 1) were less frequent (Fig. 4F) in our samples.
We verified this result in a published database . There were 257 breast cancer samples in the database, of which 5 were metastatic breast cancer samples. 3 samples underwent WGD (supplementary Fig. 2), and 2/3 of the WGD sample showed a loss of function of TP53. A total of 37 complex SV were detected in these 5 samples, and similar to our results, 36/37 complex SV happened in the 3 WGD samples (Supplemental Fig. 2, Supplemental Table 9).
Timing patterns of CN-gain
We estimated the molecular time of CN-gain using the R package 'mutationtimeR.' In WGD + samples of EBC1 (median gain time 2.84E-21, IQR 5.77E-33 to 1.55E-10), EBC7 (median 2.51E-12, IQR 1.52E-57 to 0.34), MBC1liv (median 6.60E-36, IQR 3.85E-65 to 0.025), MBC3pri (median 0.22, IQR 0.022 to 0.29), MBC3liv (median 0.22, IQR 0.11 to 0.33), MBC4pri (median 0.28, IQR 0.049 to 0.51), and MBC4liv (median 0.18, IQR 0.063–0.25), a substantial fraction of CN-gain occurred early in molecular time. In contrast, CN-gain occurred at mid-molecular times in MBC2pri (median 0.62, IQR 9.99E-28 to 1), MBC2ln (median 0.51, IQR 0.3 to 0.69), and MBC2liv (median 0.57, IQR 0.078 to 0.72) (Fig. 5A). The early punctuated bursts of CN-gain detected in MBC1liv rather than in MBC1pri1, MBC1pri2, and MBC1ln supported a private WGD + clone diverging at a very early stage of tumour evolution and obtaining superior metastatic potential. Similar molecular gain times in MBC2, MBC3, and MBC4 suggested that the primary tumours and metastasis shared the founding clone, which acquired CN-gain at a fixed time before metastasis.
It was reported that WGD + cases had a higher proportion of multiple gain events (more than two) of CN segments compared with diploid cases (24.8% vs 11.4%), and 84% of secondary gain events showed a latency after the first one .In our samples, multiple gain-affected genomes ranged from 3.53% (MBC1liv) to 88.32% (MBC2ln) in WGD + cases and from 0 (MBC1ln) to 1.93% (EBC6) in WGD- cases (Supplemental Table 10). Fifty-six percent of bases showed a relative latency in secondary gain with available timing information (Supplemental Table 11). Interestingly, 910 of the 1058 (86.01%) complex SV breakpoints were located on the chromosome arm, which had large-scale multiple gains (≥ 3 Mb). However, only 28.09% (66/235) of the chromosome arms that underwent large-scale multiple gains had complex SVs (Fig. 5B-C, Supplemental Table 12). Therefore, cumulative local CN-gain and complex chromosome rearrangements probably arise from the same catastrophic events of chromosomal arms.
All four liver metastases were classified as WGD + , which implies that WGD is a prevalent event in BC metastasis, considering that WGD occurs in 19% of all types of BC, 50% of triple-negative BC  and 40% of our early BC samples. However, different time patterns of clonal evolution were observed according to the gain time for each CN across the samples. Generally, WGD can occur in the founding clone before diversification during a limited (MBC3, MBC4) or widely distributed time frame (MBC2). MBC1 displayed diversification of polyclonal seeding, and only the clone dominantly presented in liver metastasis underwent WGD at a very early molecular time. All TP53-mutated samples from three patients underwent WGD + , but not all WGD + samples harboured TP53 mutation, suggesting that TP53 dysfunction is not obligatory for WGD. As reported, some other genes were associated with WGD in TP53 wild-type cancer, including amplification of CCNE1 and loss-of-function mutations in RB1 and BAP1 .
The 4 samples from MBC1 showed similar mutational signatures representing DNA double-strand break repair by homologous recombination, suggesting that the same driver factors were involved in the polyclonal tumours. Additionally, WGD has been consistently linked to metastatic potential . The common consensus of evolution is that mutations and chromosomal aberrations accumulate gradually and sequentially over time , in line with Darwinian evolution. However, an alternative model is punctuated CN evolution, in which many chromosomal rearrangements and CN-gain are acquired in short bursts of genomic instability early in tumour evolution [31, 32]. Our multi-site biopsy data provide clinical evidence supporting the CN-gain of WGD during a limited or widely distributed time frame that may happen before diversification and any treatment or in one of the polyclones.
Nearly 90% of breakpoints that constituted complex SVs were found in the WGD + samples. These breakpoints enriched on chromosome 17q 39 Mb-40 Mb and involved special rearrangement patterns associated with high-junction CN junctions such as tyfonas, BFBs, and DMs, dramatically increasing the CN of HER2 and nearby genes. Tyfonas have been reported to be enriched in both luminal BC and HER2 + BC  and are associated with MDM2 and CDK4 genes. However, two tyfonas events in our early- or late-stage patients affected chr17 q12-21, suggesting a novel mechanism for HER2 amplification. Moreover, HER2-enriched patients with complex SVs nearby HER2 showed a much higher HER2 CN than HER2-enriched patients without complex SVs. Hence, more evidence is needed to explore the predictive efficiency of complex SVs for anti-HER2 therapy . Other novel complexes SVs, such as rigma and pyrgo, were rare in the samples. Interestingly, we found that complex SVs mostly co-occurred with multiple large-scale gains of chromosome arms. There is no strict consensus in the literature; therefore, we classified large-scale secondary gain using ≥ 3 Mb as the threshold for CNAs . We could not exclude the possibility that the chromosome arm had acquired this large-scale secondary CN via complex SVs. However, nearly 70% of arms with large-scale secondary gain did not have complex SVs, implying that they could be independent events. Cumulative local CN-gain and chromosome rearrangements may arise from the same catastrophic events, such as localised cycles of impaired fork progression and breakage .
Our results provide a comprehensive profile of macro-alterations, but more clinical samples, large postoperative specimens, and cell lines are needed for high-depth sequencing and RNA validation. To our knowledge, most primary-metastasis paired studies performed sampling at autopsy or the cancer progression time, meaning that clones were selected by therapy at that time. Our results provided a primitive tumour status and development before therapy and demonstrated that at least some tumour's macro-alteration evolved stochastically before treatment and metastasis. We also stress that the metastatic lesion does not entirely reflect the primary variation feature. Treatment should be targeted to the genetic makeup of the primary and metastatic lesions, respectively. This suggests that we need to combine primary tumour detection and non-invasive techniques for metastatic lesions, like a liquid biopsy, to comprehensively determine the biology of metastatic lesions so that patients can be treated accordingly.
we used paired primary tumours and liver metastases of BC to understand the macro-alterations that may evolve before treatment or even at the very early stage of development. We also stress that primary and metastatic tumours may harbour different macro-alterations; therefore, multi-point sampling is necessary for late-stage patients before treatment, although it is not a routine practice.
Patients and collection of tissue specimens
From Jan 2014 to Jan 2015, four female patients diagnosed with breast invasive ductal carcinoma with incurable liver metastatic solid tumours who agreed to puncture the liver metastasis were included. Hormone receptors and HER2 statuses were confirmed using IHC and IHC/FISH, respectively. The necrotic areas of the lymph nodes and liver metastases were not beyond a quarter of the volume. CT-guided core needle biopsies of the primary tumour and lymph node and liver metastases were stored in liquid nitrogen and sent for pathological diagnoses. Postoperative specimens of the primary tumours from eight patients with EBC were stored at -80 °C. Blood or the normal breast tissue from the same patient was used as control.
This study was approved by the Clinical Test and Biomedical Ethics Committee of West China Hospital Sichuan University (2013; No. 128). All patients provided informed consent before any study procedure and complied with all relevant ethical regulations. Every late-stage patient had an outpatient or telephonic follow-up every 2 months. Clinical outcomes were evaluated using the Response Evaluation Criteria in Solid tumours v.1.1.
DNA extraction and WGS
DNA extraction and WGS
DNA was isolated from 10 mg of tumour tissue according to the supplier's protocols (Qiagen, Hilden, Germany) using a QIAamp DNA Mini kit (Cat No./ID 51304). DNA was quantified by NanoDrop (ThermoFisher, Waltham, MA) and gel electrophoresis, followed by the preparation of a random fragmentation library with a DNA length of 200–500 bp. Barcoded libraries were sequenced as pools on the HiSeq2500 (Illumina, San Diego, CA) platform generating 2 × 150 read pairs using standard settings to a depth of 30.
After trimming and quality filtering using Trim Galore (v.0.6.5, https://github.com/FelixKrueger/TrimGalore), sequencing reads were aligned to the human reference genome (hg38, https://storage.googleapis.com/genomics-public-data/resources/broad/hg38/v0/Homo_sapiens_assembly38.fasta) using the bwa mem algorithm, (v.0.7.17, https://github.com/lh3/bwa), sorted by samtools (v.1.10.2, https://github.com/samtools/samtools) , marked for duplication, and recalibrated using the Genome Analysis Toolkit (GATK) (v.188.8.131.52) according to the GATK best practice workflows (https://gatk.broadinstitute.org/hc/en-us/sections/360007226651-Best-Practices-Workflows).
Mutect2 called SNVs and small insertions/deletions (indels) following the pipeline guide of the GATK (https://gatk.broadinstitute.org/hc/en-us/articles/360037593851-Mutect2) using the tumour-normal mode.
Then, all variants with a frequency > 0.001 in the gnomAD database (http://www.openbioinformatics.org/annovar/download/hg38_gnomad_genome.txt.gz) were removed using the filter functionalities of ANNOVAR, and all passed variants were annotated based on the refGene database (http://www.openbioinformatics.org/annovar/download/hg38_refGene.txt.gz).
Using the R package 'maftools' (v.2.8.0, https://github.com/PoisonAlien/maftools) for R (v.4.1.0; R Foundation for Statistical Computing, Vienna, Austria), an oncoplot was generated to display the top 20 somatic mutations in the samples. 568 mutated cancer driver genes were downloaded from https://www.intogen.org/search . The method for mutational loads is described in the purity, ploidy, and somatic CN alteration section.
The R package 'deconstructSigs' (v.1.8.0, https://github.com/raerose01/deconstructSigs) was used to determine the linear combination of predefined signatures that most accurately reconstructed the mutational profile of a single tumour sample with the COSMIC v2 signature set .
Purity, ploidy, and somatic CN alteration
The PURPLE toolkit (v.2.5.1) was downloaded from https://github.com/hartwigmedical/hmftools/releases/tag/purple-v2.51 to estimate the mutational load, purity, and CNAs. First, we used the paired normal/tumour mode of AMBER (v.3.2, https://github.com/hartwigmedical/hmftools/tree/master/amber#tumour-only mode) to generate a b-allele frequency file for use in PURPLE. Second, COBALT (v.1.1.0 https://github.com/hartwigmedical/hmftools/tree/master/cobalt#tumour-only-mode) was used to determine the read depth ratios of the tumour and normal reference genomes. The b-allele frequency and read depth ratio output from AMBER and COBALT, as well as somatic SNVs/indels from Mutect2, were passed to PURPLE. SV-input VCF files from gridss (v.2.9.4) were used to obtain the CN for the complex SV analysis pipeline. However, the SV input would oversegment the chromosome; therefore, this argument was ignored for the gain-time pipeline. The results of PURPLE included ploidy, purity, total CN, allele-specific CN, tumour mutational loads per megabase, and WGD status according to the major CN. WGD was considered true if more than 10 autosomes had a major allele CN > 1.5. A minor allele CN below one represents a LOH event.
Circos (v.0.69–6, https://github.com/vigsterkr/circos) was used to generate circularly composited renditions that showed somatic variants, tumour purity-adjusted CN changes, minor allele CNs, and SVs within or between the chromosomes.
Subclonal composition and phylogenetic tree
Phyclowgs (https://github.com/morrislab/phylowgs) was used to integrate simple somatic mutations and CNA data to reconstruct tumour clones . First, a phylowgs input parser (https://github.com/morrislab/phylowgs/tree/master/parser) was used to create simple somatic mutations from the Mutect2 output file and CN file from PURPLE. We limited the number of EBC7 variants used for phylogenetic reconstructions to 5000 to limit the run time. Subclone cancer cell fractions and phylogenetic trees were visualised using the R package ‘timescape’ (v.1.16.0, https://github.com/shahcompbio/timescape).
Breakpoint and complex SV calling
Aligned bam files from the same patient were jointly used to call the breakpoints using gridss (v.2.9.4, https://github.com/PapenfussLab/gridss) , and this VCF file was passed to the R script 'gridss_somatic_filter' downloaded from https://github.com/PapenfussLab/gridss/blob/master/scripts/gridss_somatic_filter to remove variants with low confidence, or that could be found in a panel of normals. The R package 'fishhook' (v.0.1, https://github.com/mskilab/fishHook) was used to create contiguous tiles of 1 M and discover the enrichment regions of the breakpoint.
To discover various patterns of SVs, we used JaBbA (https://github.com/mskilab/JaBbA) to infer optimal CNs for DNA segments and bonds between segments to build a genome graph based on junctions and read depth. The first input file 'coverage' was generated from Cobalt, and the second input file 'JUNCTIONS' was generated from the gridss described above. The output of JaBbA was passed to the R package 'gnome' (v.0.1, https://github.com/mskilab/gGnome) to plot and classify subgraphs that satisfied the set of criteria corresponding to a particular class of simple (deletion [del], tandem duplication [dup], inversion [inv], inverted duplication [invdup], translocations [tra]) or complex (rigma, pyrgo, chromoplexy, chromothripsis, BFBs, DMs, and tyfonas) SVs .
Timing of gain
We used the R package 'MutationTimeR' (v.1.00.2, https://github.com/gerstung-lab/MutationTimeR) to calculate CN-gain timing as described . Four input files needed to be prepared: somatic mutations from the Mutect2 pipeline, CN file from PURPLE, complex SVs from the JaBbA-gGnome pipeline, and cluster file from phylowgs.
Availability of data and materials
The information for analysis is in the Supplemental Tables. The raw sequence data reported in this paper have been deposited in the Genome Sequence Archive in National Genomics Data Center, China National Centre for bioinformation /Beijing Institute of Genomics, Chinese Acadamy of Science (HRA002540/PRJCA009821), which are publicly accessible at https://ngdc.cncb.ac.cn/gsa-human/s/zDT4ohn5.
Somatic copy number alterations
Fluorescent in-situ hybridization
Human epidermal growth factor receptor 2
Extensive loss of heterozygosity
Paul MR, Pan TC, Pant DK, Shih NN, Chen Y, Harvey KL, Solomon A, Lieberman D, Morrissette JJ, Soucier-Ernst D, et al. Genomic landscape of metastatic breast cancer identifies preferentially dysregulated pathways and targets. J Clin Invest. 2020;130(8):4252–65.
Maxwell KN, Soucier-Ernst D, Tahirovic E, Troxel AB, Clark C, Feldman M, Colameco C, Kakrecha B, Langer M, Lieberman D, et al. Comparative clinical utility of tumor genomic testing and cell-free DNA in metastatic breast cancer. Breast Cancer Res Treat. 2017;164(3):627–38.
Lefebvre C, Bachelot T, Filleron T, Pedrero M, Campone M, Soria JC, Massard C, Levy C, Arnedos M, Lacroix-Triki M, et al. Mutational Profile of Metastatic Breast Cancers: A Retrospective Analysis. PLoS Med. 2016;13(12):e1002201.
Robinson DR, Wu YM, Lonigro RJ, Vats P, Cobain E, Everett J, Cao X, Rabban E, Kumar-Sinha C, Raymond V, et al. Integrative clinical genomics of metastatic cancer. Nature. 2017;548(7667):297–303.
Angus L, Smid M, Wilting SM, van Riet J, Van Hoeck A, Nguyen L, Nik-Zainal S, Steenbruggen TG, Tjan-Heijnen VCG, Labots M, et al. The genomic landscape of metastatic breast cancer highlights changes in mutation and signature frequencies. Nat Genet. 2019;51(10):1450–8.
Bielski CM, Zehir A, Penson AV, Donoghue MTA, Chatila W, Armenia J, Chang MT, Schram AM, Jonsson P, Bandlamudi C, et al. Genome doubling shapes the evolution and prognosis of advanced cancers. Nat Genet. 2018;50(8):1189–95.
Jamal-Hanjani M, Wilson GA, McGranahan N, Birkbak NJ, Watkins TBK, Veeriah S, Shafi S, Johnson DH, Mitter R, Rosenthal R, et al. Tracking the Evolution of Non-Small-Cell Lung Cancer. N Engl J Med. 2017;376(22):2109–21.
Smith JC, Sheltzer JM. Systematic identification of mutations and copy number alterations associated with cancer patient prognosis. Elife. 2018;7:e39217.
Vendramin R, Litchfield K, Swanton C. Cancer evolution: Darwin and beyond. EMBO J. 2021;40(18):e108389.
Gerlinger M, McGranahan N, Dewhurst SM, Burrell RA, Tomlinson I, Swanton C. Cancer: Evolution Within a Lifetime. Annu Rev Genet. 2014;48:215–36.
Finley JC, Reid BJ, Odze RD, Sanchez CA, Galipeau P, Li XH, Self SG, Gollahon KA, Blount PL, Rabinovitch PS. Chromosomal instability in Barrett’s esophagus is related to telomere shortening. Cancer Epidem Biomar. 2006;15(8):1451–7.
Olaharski AJ, Sotelo R, Solorza-Luna G, Gonsebatt ME, Guzman P, Mohar A, Eastmond DA. Tetraploidy and chromosomal instability are early events during cervical carcinogenesis. Carcinogenesis. 2006;27(2):337–43.
Davoli T, de Lange T. The causes and consequences of polyploidy in normal development and cancer. Annu Rev Cell Dev Biol. 2011;27:585–610.
Maciejowski J, Imielinski M. Modeling cancer rearrangement landscapes. Curr Opin Syst Biol. 2017;1:54–61.
Hadi K, Yao X, Behr JM, Deshpande A, Xanthopoulakis C, Tian H, Kudman S, Rosiene J, Darmofal M, DeRose J, et al. Distinct Classes of Complex Structural Variation Uncovered across Thousands of Cancer Genome Graphs. Cell. 2020;183(1):197-210 e132.
Boisselier B, Dugay F, Belaud-Rotureau MA, Coutolleau A, Garcion E, Menei P, Guardiola P, Rousseau A. Whole genome duplication is an early event leading to aneuploidy in IDH-wild type glioblastoma. Oncotarget. 2018;9(89):36017–28.
Replogle JM, Zhou W, Amaro AE, McFarland JM, Villalobos-Ortiz M, Ryan J, Letai A, Yilmaz O, Sheltzer J, Lippard SJ, et al. Aneuploidy increases resistance to chemotherapeutics by antagonizing cell division. Proc Natl Acad Sci U S A. 2020;117(48):30566–76.
Liu G, Stevens JB, Horne SD, Abdallah BY, Ye KJ, Bremer SW, Ye CJ, Chen DJ, Heng HH. Genome chaos: survival strategy during crisis. Cell Cycle. 2014;13(4):528–37.
Fan Y, Xie G, Wang Z, Wang Y, Wang Y, Zheng H, Zhong X. PTEN promoter methylation predicts 10-year prognosis in hormone receptor-positive early breast cancer patients who received adjuvant tamoxifen endocrine therapy. Breast Cancer Res Treat. 2022;192(1):33–42.
Alexandrov LB, Nik-Zainal S, Wedge DC, Aparicio SA, Behjati S, Biankin AV, Bignell GR, Bolli N, Borg A, Borresen-Dale AL, et al. Signatures of mutational processes in human cancer. Nature. 2013;500(7463):415–21.
Alexandrov LB, Jones PH, Wedge DC, Sale JE, Campbell PJ, Nik-Zainal S, Stratton MR. Clock-like mutational processes in human somatic cells. Nat Genet. 2015;47(12):1402–7.
Barroso-Sousa R, Jain E, Cohen O, Kim D, Buendia-Buendia J, Winer E, Lin N, Tolaney SM, Wagle N. Prevalence and mutational determinants of high tumor mutation burden in breast cancer. Ann Oncol. 2020;31(3):387–94.
Lopez S, Lim EL, Horswell S, Haase K, Huebner A, Dietzen M, Mourikis TP, Watkins TBK, Rowan A, Dewhurst SM, et al. Interplay between whole-genome doubling and the accumulation of deleterious alterations in cancer evolution. Nature Genetics. 2020;52(3):283-+.
Marusyk A, Polyak K. Tumor heterogeneity: causes and consequences. Biochim Biophys Acta. 2010;1805(1):105–17.
Teixeira MR, Pandis N, Bardi G, Andersen JA, Heim S. Karyotypic comparisons of multiple tumorous and macroscopically normal surrounding tissue samples from patients with breast cancer. Can Res. 1996;56(4):855–9.
Dewhurst SM, McGranahan N, Burrell RA, Rowan AJ, Gronroos E, Endesfelder D, Joshi T, Mouradov D, Gibbs P, Ward RL, et al. Tolerance of whole-genome doubling propagates chromosomal instability and accelerates cancer genome evolution. Cancer Discov. 2014;4(2):175–85.
Gerstung M, Jolly C, Leshchiner I, Dentro SC, Gonzalez S, Rosebrock D, Mitchell TJ, Rubanova Y, Anur P, Yu K, et al. The evolutionary history of 2,658 cancers. Nature. 2020;578(7793):122–8.
Wu CH, Hsieh CS, Chang YC, Huang CC, Yeh HT, Hou MF, Chung YC, Tu SH, Chang KJ, Chattopadhyay A, et al. Differential whole-genome doubling and homologous recombination deficiencies across breast cancer subtypes from the Taiwanese population. Commun Biol. 2021;4(1):1052.
Priestley P, Baber J, Lolkema MP, Steeghs N, de Bruijn E, Shale C, Duyvesteyn K, Haidari S, van Hoeck A, Onstenk W, et al. Pan-cancer whole-genome analyses of metastatic solid tumours. Nature. 2019;575(7781):210–6.
Fearon ER, Vogelstein B. A genetic model for colorectal tumorigenesis. Cell. 1990;61(5):759–67.
Gao R, Davis A, McDonald TO, Sei E, Shi X, Wang Y, Tsai PC, Casasent A, Waters J, Zhang H, et al. Punctuated copy number evolution and clonal stasis in triple-negative breast cancer. Nat Genet. 2016;48(10):1119–30.
Baca SC, Prandi D, Lawrence MS, Mosquera JM, Romanel A, Drier Y, Park K, Kitabayashi N, MacDonald TY, Ghandi M, et al. Punctuated evolution of prostate cancer genomes. Cell. 2013;153(3):666–77.
Lee HJ, Seo AN, Kim EJ, Jang MH, Suh KJ, Ryu HS, Kim YJ, Kim JH, Im SA, Gong G, et al. HER2 heterogeneity affects trastuzumab responses and survival in patients with HER2-positive metastatic breast cancer. Am J Clin Pathol. 2014;142(6):755–66.
Pos O, Radvanszky J, Buglyo G, Pos Z, Rusnakova D, Nagy B, Szemes T. DNA copy number variation: Main characteristics, evolutionary significance, and pathological aspects. Biomed J. 2021;44(5):548–59.
Umbreit NT, Zhang CZ, Lynch LD, Blaine LJ, Cheng AM, Tourdot R, Sun L, Almubarak HF, Judge K, Mitchell TJ, et al. Mechanisms generating cancer genome complexity from a single cell division error. Science. 2020;368(6488):eaba0712.
Danecek P, Bonfield JK, Liddle J, Marshall J, Ohan V, Pollard MO, Whitwham A, Keane T, McCarthy SA, Davies RM, et al. Twelve years of SAMtools and BCFtools. Gigascience. 2021;10(2):giab008.
Martinez-Jimenez F, Muinos F, Sentis I, Deu-Pons J, Reyes-Salazar I, Arnedo-Pac C, Mularoni L, Pich O, Bonet J, Kranas H, et al. A compendium of mutational cancer driver genes. Nat Rev Cancer. 2020;20(10):555–72.
Deshwar AG, Vembu S, Yung CK, Jang GH, Stein L, Morris Q. PhyloWGS: reconstructing subclonal composition and evolution from whole-genome sequencing of tumors. Genome Biol. 2015;16:35.
Cameron DL, Baber J, Shale C, Valle-Inclan JE, Besselink N, van Hoeck A, Janssen R, Cuppen E, Priestley P, Papenfuss AT. GRIDSS2: comprehensive characterisation of somatic structural variation using single breakend variants and structural variant phasing. Genome Biol. 2021;22(1):202.
This work was supported by the key program of Science & Technology Department of Sichuan Province (grant number: 2017SZ0005, to H.Z.); Key research and development projects of Science & Technology Department of Sichuan Province (grants number: 2019YFS0342, to Y.F.).
Ethics approval and consent to participate
This study was approved by the Clinical Test and Biomedical Ethics Committee of West China Hospital Sichuan University (2013; No. 128). All patients provided informed consent before any study procedure and complied with all relevant ethical regulations.
Consent for publication
The authors declare no competing interests.
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Characteristics, clinicopathologic features and outcomes of all patients.
Mutation loads for each sample.
SNV and indel.
mutational signature(COSMIC V2).
Allele gain and loss.
Phylowgs subclones cellular prevalence and CCF.
Her2 copy number.
Breakpoints enrichment tile.
Secondary affected genome and proportion.
Secondary gain time latency.
Chromosome arms that with large scale secondary gain and complex SV.
Overlap SNVs of primary tumor, lymph node metastasis, and liver metastasis of the 4 MBC patients. Supplemental Figure 2. The WGD status and complex SV of 5 metastatic breast cancer from a public database (http://mskilab.com/gGraph/).
About this article
Cite this article
Fan, Y., Zou, L., Zhong, X. et al. Characteristics of DNA macro-alterations in breast cancer with liver metastasis before treatment. BMC Genomics 24, 391 (2023). https://doi.org/10.1186/s12864-023-09497-w