Volume 11 Supplement 5
Alternative splicing enriched cDNA libraries identify breast cancer-associated transcripts
© Carraro et al; licensee BioMed Central Ltd. 2010
Published: 22 December 2010
Alternative splicing (AS) is a central mechanism in the generation of genomic complexity and is a major contributor to transcriptome and proteome diversity. Alterations of the splicing process can lead to deregulation of crucial cellular processes and have been associated with a large spectrum of human diseases. Cancer-associated transcripts are potential molecular markers and may contribute to the development of more accurate diagnostic and prognostic methods and also serve as therapeutic targets. Alternative splicing-enriched cDNA libraries have been used to explore the variability generated by alternative splicing. In this study, by combining the use of trapping heteroduplexes and RNA amplification, we developed a powerful approach that enables transcriptome-wide exploration of the AS repertoire for identifying AS variants associated with breast tumor cells modulated by ERBB2 (HER-2/neu) oncogene expression.
The human breast cell line (C5.2) and a pool of 5 ERBB2 over-expressing breast tumor samples were used independently for the construction of two AS-enriched libraries. In total, 2,048 partial cDNA sequences were obtained, revealing 214 alternative splicing sequence-enriched tags (ASSETs). A subset with 79 multiple exon ASSETs was compared to public databases and reported 138 different AS events. A high success rate of RT-PCR validation (94.5%) was obtained, and 2 novel AS events were identified. The influence of ERBB2-mediated expression on AS regulation was evaluated by capillary electrophoresis and probe-ligation approaches in two mammary cell lines (Hb4a and C5.2) expressing different levels of ERBB2. The relative expression balance between AS variants from 3 genes was differentially modulated by ERBB2 in this model system.
In this study, we presented a method for exploring AS from any RNA source in a transcriptome-wide format, which can be easily adapted to next-generation sequencers. We identified AS transcripts that were differentially modulated by ERBB2-mediated expression and that can be tested as molecular markers for breast cancer. Such a methodology will be useful for completely deciphering the cancer cell transcriptome diversity resulting from AS and for finding more precise molecular markers.
More than 30 years ago, Gilbert predicted the existence of protein variants due to the alternative use of exon-intron borders in eukaryotic cells. This prediction has been continually confirmed as a common feature of many species, including humans. Recent estimations, based on high-throughput sequencing, suggest that 90-95% of multiple-exon human genes undergo alternative splicing (AS) [2, 3], producing an average of six distinct transcripts from each gene. This phenomenon enormously impacts the repertoire of proteins, since 80% of AS events occur within the coding region, thus interfering in the functional aspects of the cells.
AS regulates important processes, such as embryonic development, cellular differentiation and apoptosis, by the generation of different protein isoforms among distinct tissues, developmental stages and pathological conditions [6–8]. Alterations of the splicing process, such as the loss of expression balance between variants and aberrant splicing, can lead to the deregulation of crucial cellular processes and are consequently associated with a large spectrum of human diseases, including cancer [10–12].
The development of methodologies to explore transcriptome diversity resulting from AS has been shown to be a potent tool, not only for improving the understanding of the biological basis of cancer but also for searching for more precise molecular markers for diagnostic, prognostic and therapeutic purposes [13, 14]. Different strategies for large-scale AS variant exploration have been used with different goals. Sequence and microarray-based approaches have been used for defining the AS repertoire of human cells. The former includes several computational analyses concerning genomic and transcriptome alignments of human ESTs (expressed sequence tags) and mRNA databases [11, 15–17] and cross-species alignment from closely related organisms [18, 19]; the latter includes genomic and exon-intron junction microarray platforms [20–23]. Both approaches have contributed to the investigation of the expression pattern of AS variants and also facilitated the identification of novel AS variants. Nonetheless, both approaches are limited in detecting low-abundance AS transcripts. In this sense, the AS-enriched cDNA library is one of the most interesting approaches because it combines the convenience of direct cDNA sequencing with the advantage of detecting low-abundance transcript variants. The methodology is based on one enrichment step, consisting of the trapping of heteroduplex molecules formed by the hybridization of two distinct AS variants from the same gene. The heteroduplex can be captured by molecules that recognize the heteroduplex structure [25, 26], revealing a vast number of AS events without previous knowledge of them. In this study, to explore AS variants associated with breast tumor cells, we established a powerful approach that enabled the direct exploration of an AS repertoire by combining heteroduplex trapping and RNA amplification.
To favor the trapping of splicing variants associated with breast tumor cells that over-express the ERBB2 (HER-2/neu) oncogene, a human breast cell line (C5.2) and a pool of 5 ERBB2 over-expressing breast tumor samples were used. Two AS-enriched libraries were constructed, generating a set of 2,048 partial cDNA sequences, named here as alternative splicing sequence-enriched tags (ASSETs), as suggested by Watahiki and collaborators. A subset of 79 ASSETs representing distinct multiple-exon sequences was explored in this analysis and reported 138 different AS events. A high rate of validation by RT-PCR (94.5%) was obtained, and 2 novel AS events were identified. Moreover, the balance in the expression level of the AS variants from 3 genes was influenced by ERBB2-mediated expression.
The approach presented here should contribute to the identification of the AS repertoire of cancer cells, especially as it is potentially applicable to any cell type from any tumor tissue, since only a small amount of total RNA is required and no previous cDNA library construction is needed. Furthermore, it is suitable for use with next-generation sequencers, substantially increasing its potential for deciphering the AS diversity of the cancer cell transcriptome.
Alternative splicing libraries
Clinical characteristics of the ductal carcinoma samples.
Grade I SBR; ER+ / PR+ / p53- / ERBB2+ (3+)
Grade II SBR; ER+ / PR- / p53- / ERBB2+ (3+)
Grade III SBR; ER+ / PR- / p53- / ERBB2+ (2+/3+)
Grade II SBR; ER+ / PR- / p53- / ERBB2+ (3+)
Grade III SBR; ER+ / PR- / p53- / ERBB2+ (3+)
Characterization of libraries Lib_1 and Lib_2
# High Quality Sequences
All consensus sequences were then aligned to the human genome (NCBI build #36.1) using BLAT and Sim4, where only the best hit was considered. Based on criteria for identity (≥ 93%) and coverage (≥ 55%), 214 consensus sequences were aligned to the human genome: 93 represented multiple-exon sequences and 121 represented single-exon sequences (Figure 2B). The consensus sequences were termed ASSETs, as previously proposed [25, 26].
Detection of alternative splicing events
No distinct splicing variants were observed among the sequences belonging to the same consensuses that would be indicative of putative AS events. Therefore, we searched for AS events through comparisons between ASSETs and full-length or partial cDNA sequences available in public databases.
First, ASSETs were clustered with ESTs from dbEST (8,133,299 ESTs), mRNAs (244,284 sequences) and RefSeqs (26,040 sequences) downloaded from UCSC (September 2007) (Figure 2C). This step resulted in 164 clusters, where 142 contained at least one RefSeq sequence. Sixteen clusters contained sequences from both libraries (Lib_1 and Lib_2), revealing an overlap of approximately 10%.
The 79 clusters containing ASSETs with multiple exons were scanned for AS events through pairwise comparisons of exon/intron boundaries between the ASSET and the reference sequences of each cluster. AS events were searched within the region delimited by the two outermost overlapping regions of each ASSET related cluster. For each ASSET, the corresponding gene and the number and type of related alternative splicing events were annotated.
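The pairwise boundary comparison described above can be sketched as follows. This is only an illustrative sketch, not the pipeline used in the study: the function names, the coordinate convention (inclusive exon start and end positions on one strand) and the definition of the shared region as the simple overlap of the two aligned spans are all assumptions.

```python
from typing import List, Set, Tuple

Exon = Tuple[int, int]      # (start, end) genomic coordinates, inclusive
Junction = Tuple[int, int]  # (last base of upstream exon, first base of downstream exon)


def junctions(exons: List[Exon]) -> Set[Junction]:
    """Return the exon/intron boundaries implied by a list of exons."""
    ordered = sorted(exons)
    return {(up[1], down[0]) for up, down in zip(ordered, ordered[1:])}


def differing_junctions(asset: List[Exon], reference: List[Exon]):
    """Compare boundaries inside the region covered by both sequences.

    Returns (junctions only in the ASSET, junctions only in the reference).
    Any non-empty result flags a putative AS event for closer inspection.
    """
    lo = max(min(s for s, _ in asset), min(s for s, _ in reference))
    hi = min(max(e for _, e in asset), max(e for _, e in reference))

    def inside(j: Junction) -> bool:
        return lo <= j[0] and j[1] <= hi

    a = {j for j in junctions(asset) if inside(j)}
    r = {j for j in junctions(reference) if inside(j)}
    return sorted(a - r), sorted(r - a)


# Hypothetical example: the reference contains an exon that the ASSET skips.
asset_exons = [(100, 200), (400, 500)]
reference_exons = [(100, 200), (250, 300), (400, 500)]
only_asset, only_reference = differing_junctions(asset_exons, reference_exons)
```

In this toy case the ASSET carries one junction spanning the skipped exon, while the reference carries the two junctions flanking it. Classifying the event type (exon skipping, intron retention, alternative 5' or 3' splice site) would need additional rules that are not shown here.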
Search for AS variants by comparison with sequences from public databases.
Presence of alternatively spliced transcripts in databases
No alternatively spliced transcripts in databases
Lib_1 & Lib_2
Gene ontology annotation
Functional classification of genes within the statistically significant biological process categories.
Corrected p value
RPL6 RPL21 EEF2 RPL11 RPS4X RPS2 RPS5 RPL28
Intracellular Protein Transport
XPO1 CLTC GABARAP KRT18 YWHAH NUP62 ZFYVE16 KPNA6 RPL11 MRPL45 SEC61G SEC61A1 SRP9
XPO1 MYO1C CLTC GABARAP YWHAH KRT18 NUP62 ZFYVE16 SEC22B KPNA6 RPL11 RANBP1 GNAS MRPL45 SRP9 SEC61G SEC61A1
XPO1 MYO1C VIL2 CLTC GABARAP YWHAH KRT18 NUP62 ZFYVE16 SEC22B KPNA6 GNAS RPL11 RANBP1 MRPL45 SRP9 SEC61G SEC61A1
Establishment of Localization in Cell
XPO1 MYO1C CLTC GABARAP YWHAH KRT18 NUP62 ZFYVE16 SEC22B KPNA6 RPL11 RANBP1 GNAS MRPL45 SRP9 SEC61G SEC61A1
Cellular Macromolecule Metabolic Process
PPP6C XPO1 UQCRC1 CAMK2G PTPLAD1 FARS2 DNAJC10 MAN1B1 RPS2 RPL6 PTPLA RPL11 PSMD6 DNAJA3 GLT25D1 STK25 ROCK2 PAIP1 PTPRA ZDHHC7 AXL MOBKL1A EEF2 RPS4X RPS5 RPL28 IFNAR1 CCNB1 MGAT1 ST13 SENP1 HDAC2 GSPT1 PPIB RPL21 PSMC2 DDB2 GRK6 MRPL45 CTSH
XPO1 ZFYVE16 KPNA6 RPL11 GABARAP SRP9 SEC61G
XPO1 VIL2 CLTC GABARAP YWHAH KRT18 NUP62 ZFYVE16 SEC22B KPNA6 RPL11 GNAS MRPL45 SEC61G SEC61A1 SRP9
GSPT1 RPL6 RPL21 PAIP1 FARS2 EEF2 RPL11 RPS4X RPS2 MRPL45 RPS5 RPL28
XPO1 CLTC GABARAP YWHAH KRT18 NUP62 ZFYVE16 KPNA6 SEC22B RPL11 GNAS MRPL45 SEC61G SEC61A1 SRP9
Establishment of Protein Localization
XPO1 CLTC GABARAP YWHAH KRT18 NUP62 ZFYVE16 KPNA6 SEC22B RPL11 GNAS MRPL45 SEC61G SEC61A1 SRP9
Validation of ASSETs and heteroduplexes
Lib_1 & Lib_2
Novel alternative splicing: characterization of the putative isoforms
The TRIP6 gene (thyroid hormone receptor interactor 6) [RefSeq:NM_003302.2] contains 9 exons. The novel alternatively spliced transcript retains the last intron (Figure 5B). The protein encoded by the TRIP6 gene localizes to focal adhesion sites and along actin stress fibers. The novel AS variant identified also inserts a premature stop codon in the putative encoded protein, without interfering with any functional protein domain.
Evaluation of AS variant regulation by ERBB2-mediated expression
Finally, we investigated the putative influence of ERBB2-mediated expression on the regulation of AS variants for 17 validated ASSETs. Expression levels of the ASSETs in the C5.2 cell line were compared with those in the normal breast cell line Hb4a, its counterpart with basal ERBB2 expression, using GAPDH as a normalization factor. Amplification products were measured by capillary microfluidic electrophoresis (LabChip GX, Caliper Lifesciences), which accurately assesses the size and quantity of each amplification product.
For the 11 validated ASSETs, the relative expression levels showed only a slight influence of ERBB2 over-expression (ratios ranging from -1.9 to 1.4) (Supplemental Table 1).
Gene expression analysis under the influence of ERBB2 over-expression.
To confirm the alteration in the relative expression balance of AS variants mediated by ERBB2 expression, a different approach based on probe-specific ligation and PCR amplification was applied . In this strategy, 2 pairs of probes were designed for each gene, specifically targeting the variants of interest (Figure 6). The expression balance difference was confirmed for all 3 genes (FLNA, SFRS9 and TRIP6) visualized on the acrylamide gel (Figure 6).
The diversity of the human transcriptional repertoire caused by AS has been extensively investigated [2, 3], and it is agreed that its regulation is an important mechanism for physiological and pathological aspects of cells. Moreover, AS is a major contributor to protein diversity, which, in part, explains the high complexity of mammals compared to much simpler organisms containing a similar number of genes.
Different approaches have been used to explore the variability caused by this phenomenon, and one of the most promising strategies is the use of AS enriched cDNA libraries [25, 26]. This strategy does not require previous knowledge of the variants and permits an AS transcriptome-wide analysis.
Deciphering the human transcriptional repertoire related to AS variability would contribute greatly to the understanding of cancer and to the identification of more precise molecular markers.
Here we described an AS enriched cDNA library method by combining the use of trapping heteroduplex and RNA amplification procedures. The methodology was initially proposed by Watahiki and collaborators  and was applied in this study with some modifications to favor its application in clinically-oriented cancer studies, in which the availability of total RNA recovered from tumor tissues is normally restrictive. Moreover, the methodology established in this study is potentially applicable to RNA purified from a homogenous tumor cell population captured from a complex tissue by laser, which produces transcriptional data more correlated with the tumor cell.
Our strategy showed, in general, minimal artifacts in the identification of ASSETs, since our validation rate by RT-PCR was high (94.5%). Moreover, the fact that the great majority of the AS events found in our AS-enriched libraries were present in public databases and that 100% of them harbor conserved splice sites strengthens the assumption that we have established a robust methodology for identifying AS in a transcriptome-wide format.
The fact that we could confirm by RT-PCR novel alternatively spliced transcripts for 2 genes for which no AS variant was present in public databases further supports the expectation that many of the ASSETs with no confirmed AS event correspond to additional, as yet undescribed AS variants that could participate in heteroduplex formation. The absence of amplification of additional AS transcripts for two thirds of the selected genes during validation suggests a significant difference in the expression level of the two variants, with consequent competition for the same pair of primers in the PCR reaction, preventing the amplification of low-abundance AS transcripts.
The relatively high redundancy levels encountered in both libraries (84.78% and 86.84%) were somewhat expected. This number is similar to the redundancy reported by Thill and collaborators . In technical terms, this problem can be bypassed by decreasing the number of PCR cycles in the library construction, which is relatively easy to control.
Another potential problem was that no additional alternatively spliced transcripts were identified in the sequence data provided by the enriched cDNA libraries alone. This may result from the use of non-phosphorylated adaptors. In this situation, only one strand (5’-3’) of these adaptors was ligated to the 5’ end of the Dpn II digested molecules, which contain a phosphate residue; the other strand (3’-5’) was not ligated and, as a consequence, was disconnected from the cDNA molecules at the denaturation step of PCR and could not be cloned and sequenced. Usually this region is re-synthesized by the polymerase in the first cycle of the PCR reaction through annealing of complementary regions of cDNA molecules, a process known as polymerase fill-in, also seen in some cDNA library approaches [36, 37]. However, in our case, where the strands of the cDNA molecules come from distinct alternatively spliced transcripts, the fill-in process is probably inefficient due to imperfect annealing. To avoid this problem, the use of phosphorylated adaptors is a simple solution that would favor the representation of both alternatively spliced transcripts that formed the heteroduplex structure.
ERBB2, or HER-2/neu, is an oncogene that is over-expressed in 20-30% of human breast carcinomas and is associated with poor prognosis, independent of the lymph node status [38, 39]. This marker is also associated with chemoresistance to a range of anticancer drugs and with a positive response to Herceptin [40, 41]. Although this oncogene is among the most extensively investigated in clinical and basic oncology, the ERBB2-mediated mechanism involved in the transformation and progression of breast tumors has not yet been fully elucidated.
In this study, we proposed to identify alternatively spliced transcripts associated with breast tumors that are under ERBB2 influence by constructing 2 AS-enriched cDNA libraries using RNA sources representative of ERBB2 over-expression: the human breast cell line C5.2 that was previously transfected with 4 copies of full-length ERBB2 and a pool of 5 breast carcinoma samples, which demonstrate strong positivity in ERBB2 immunostaining in tumor cell membranes .
For testing if the expression of ASSETs was regulated by ERBB2-mediated expression, we evaluated the ASSETs validated by RT-PCR in both cell lines, HB4a and C5.2, the former with basal levels and the latter with over-expression of ERBB2 mRNA . Both cell lines have been considered a model for the investigation of ERBB2-mediated expression, since the only difference between them is the insertion of 4 copies of full-length ERBB2 in the C5.2 cell line [45, 46]. For the ASSETs in which we could identify an additional AS transcript by RT-PCR, 50% of them (3 out of 6 - TRIP6, FLNA and SFRS9) seemed to be influenced by ERBB2-mediated expression, since differences in the relative expression balance between both cell lines were observed.
Although accurate quantification of the expression of 2 or more AS variants is problematic, the results presented here were confirmed by an alternative methodology, which increased the robustness of the data.
The microfluidic capillary electrophoresis-based strategy relies on amplification of both variants in the same reaction and could introduce bias due to amplification competition between variants. However, this would be expected to equally influence all reactions, independent of the template used. The alternative strategy relies on the specific binding of probes under highly stringent conditions, enabling the evaluation of each variant separately, with high accuracy and is consequently very promising for AS expression assessment. The different expression balance between both cell lines for 3 genes confirmed by 2 different approaches suggests that these genes transcribe AS variants, whose expression is differently influenced by ERBB2.
FLNA is a member of the actin-binding protein family that organizes actin filaments and is involved in numerous cellular processes, especially development. Many studies have reported the involvement of this protein in carcinogenesis. Using melanoma cell lines lacking or expressing FLNA, Fiori and collaborators  have shown that this protein is an important regulator of EGFR members (including ERBB2) that ensure efficient ligand-mediated activation of these receptors and, consequently, intracellular trafficking and degradation.
SFRS9 is an RNA-binding protein from the arginine/serine-rich family that acts as a splicing factor regulating constitutive splicing and also modulating the selection of alternative splice sites. It has been suggested that this protein acts downstream of the ERBB2 pathway, since phosphorylation of SFRS9 was detected in ERBB2 over-expressing breast and ovarian cancer cells and was reduced by treatment with the monoclonal antibody Herceptin. Moreover, a putative role for SFRS9 in cell migration was suggested, since migration was significantly retarded following the depletion of SFRS9 transcripts in ovarian cancer cell lines.
TRIP6 is a thyroid hormone receptor interactor that localizes to focal adhesion sites and along actin stress fibers [49, 50]. This protein enhances lysophosphatidic acid (LPA)-induced cell migration by directly binding to the carboxyl-terminal tail of the LPA2 receptor through its LIM domains. Moreover, TRIP6 might enhance cell migration by binding to the PDZ domain of MAGI-1b/PTEN, destabilizing membrane β-catenin and E-cadherin junctional complexes and promoting cell motility.
The development of strategies to selectively represent the AS transcripts repertoire, requiring small amounts of total RNA, will be important for generating more correlated information between AS transcripts and specific cell types and conditions in a transcriptome-wide format.
Although Sanger sequencing was used in this study, our approach is suitable for use with next-generation sequencers, which would allow the number of PCR cycles, and consequently the redundancy level of the library, to be decreased and multiple barcoded samples to be assayed with high sequence coverage in a single run. Finally, the use of next-generation sequencers would tremendously expand the applicability of our approach toward characterizing cancer cell transcriptome diversity resulting from AS.
In this study we presented a method for exploring AS from any RNA source that generates reliable AS data in a transcriptome-wide format. Additionally, our data identified AS transcript candidates whose expression was influenced by ERBB2-mediated expression and can be tested as molecular markers for breast cancer. The association of such methodology with deep sequencing may be helpful for completely deciphering the cancer cell transcriptome and finding more precise molecular markers.
The human breast cell line C5.2 is derived from normal luminal cells transfected with four copies of the full-length ERBB2 cDNA (HER-2/neu) gene presenting tumor characteristics . Cells were maintained in RPMI medium supplemented with 100 ml/l fetal bovine serum (FBS), 5 µg/ml insulin, 5 µg/ml hydrocortisone and 1 mmol/l L-glutamine in a humidified incubator containing 50 ml/l CO2 at 37°C. The medium was changed every 2-3 days, and after 10 days the total RNA was extracted using a CsCl gradient . The yield of extracted total RNA was determined with a Kontron 810 spectrophotometer GeneQuant pro (GE Healthcare Life Sciences), and the integrity was also verified by electrophoresis through 1% agarose gel upon visualization with ethidium bromide.
RNA samples from 5 ductal breast carcinoma samples used in this study were provided by the biorepository bank from A.C. Camargo Hospital (São Paulo, Brazil). These samples were positive for ERBB2 through immunohistochemistry analysis (Table 6), according to the following criteria: weak to moderate complete membrane staining in > 10% of tumor cells or strong complete membrane staining in > 30% of tumor cells.
Alternative splicing enriched cDNA library construction
RNA amplification and double strand cDNA synthesis
For first strand cDNA synthesis, total RNA was incubated with 0.75 µg oligo dT containing the T7 RNA polymerase site (5’ AAACGACGGCCAGTGAATTGTAATACGACTCACTATAGGCGCT(24) 3’) at 70°C for 10 minutes. The reaction was performed by adding 1X first strand buffer, 0.01 M DTT (dithiothreitol), 40 U of RNasin (Promega Corporation), 1 mM dNTP (GE Healthcare Life Sciences), 1 µg of Template Switch (TS) DNA oligo (5’ AAGCAGTGGTAACAACGCAGAGTACGCGGG 3’) and 400 U of SuperScript II (Invitrogen Life Technologies) in a total volume of 20 µl. The reaction was incubated for 120 min at 42°C. For the second strand synthesis, the Advantage® cDNA PCR Kit (Clontech) was used as follows: 5X cDNA PCR Reaction Buffer, 1 mM dNTP Mix, 5X Advantage cDNA Polymerase Mix and 1.4 U of RNase H (Invitrogen Life Technologies) in a final volume of 100 µl. The reaction was incubated at 37°C for 10 min, 94°C for 3 min, 65°C for 5 min and 75°C for 30 min. A stop solution containing 0.25 M NaOH and 0.5 mM EDTA was added, followed by incubation at 65°C for 10 min. The dscDNA was purified by phenol:chloroform:isoamyl alcohol (25:24:1) pH 8.0 extraction followed by a Microcon YM-100 Centrifugal Filter Unit (Millipore).
Double-strand cDNA was in vitro transcribed into RNA with RiboMAX™ Large Scale RNA Production Systems (Promega Corporation) as follows: 1X buffer, 3 µM rNTP and 2.5 µl Enzyme T7 Mix. The reaction was incubated at 37°C for 6 hours. Amplified RNA (aRNA) was purified by TRIzol® Reagent (Sigma Aldrich Corporation).
After purification, aRNA was used for double-stranded cDNA synthesis as described above using 1 µg of TS-oligo for the first strand synthesis and 0.5 µg oligo dT(24) for the second strand synthesis.
Denaturation and renaturation
Double-stranded cDNA molecules were heated at 96°C for 20 min and incubated at 42°C for 24 hours in a mixture of 0.2% SDS, 0.5 M NaCl, 0.05 M Tris-HCl pH 7.5 and 30% formamide.
Exonuclease VII cleavage
Exonuclease VII (USB Corporation) cleavage was performed in 70 mM Tris-HCl, pH 8.0; 8 mM EDTA, pH 8.0; 10 mM 2-mercaptoethanol; 50 µg/ml BSA and 0.2 U of the enzyme, with incubation at 37°C for 30 min. The enzyme was inactivated at 95°C for 10 min.
Dpn II digestion
Fifteen units of the restriction enzyme Dpn II (New England Biolabs) were used for each 500 ng of cDNA in 1X buffer. The reaction was incubated at 37°C for 3 hours.
Heteroduplex molecule trapping by biotin-streptavidin
The cDNA sample was incubated with 100 pmol of random 25-mer oligonucleotide biotinylated at the 5’ end in 6X SSC and 0.1% SDS at 65°C for 16 hours.
This mixture was incubated with 1 mg streptavidin magnetic particles (F. Hoffmann-La Roche Ltd.) and 300 μl TEN100 binding buffer (10 mM Tris-HCl; 1 mM EDTA; 100 mM NaCl, pH 7.5) for 30 min at room temperature. The tube was applied to a magnetic separator, and the supernatant was removed and incubated with another aliquot of streptavidin magnetic particles for a second round of purification. Both aliquots of magnetic particles coupled to heteroduplex molecules by the biotinylated random oligonucleotide were mixed and washed 3 times with TEN100 washing buffer (10 mM Tris-HCl; 1 mM EDTA; 1 M NaCl, pH 7.5). The cDNA molecules were then eluted from the magnetic particles by adding 6 M guanidine-HCl and purified by a phenol:chloroform:isoamyl alcohol pH 8.0 extraction.
Ligation of XDPN12 and XDPN14 adaptors
The adaptors were commercially synthesized and contained four bases complementary to the cleavage site of the Dpn II enzyme. First, the cDNA molecules were mixed with 1X T4 Ligase Buffer, 400 pmol XDPN12 (5’ GATCTCTCGAGT 3’) and 400 pmol XDPN14 (5’ CTGATCACTCGAGA 3’) and incubated at 55°C for 1 min. Next, the temperature was decreased from 54°C to 28°C at a rate of 2°C every 2 min and from 28°C to 14°C at a rate of 2°C every 4 min to favor perfect annealing of the oligonucleotides. Finally, 2000 units of T4 DNA ligase (Invitrogen Life Technologies) were added, and the reaction was incubated at 14°C for 16 hours. The reaction was purified with a Microcon YM-100 Centrifugal Filter Unit (Millipore).
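The two-stage annealing ramp described above can be written out as an explicit schedule. The sketch below is only illustrative and assumes that each 2°C decrement is held for the stated interval (2 min in the first stage, 4 min in the second); the function name and this reading of the ramp are not taken from the original protocol.

```python
def ramp_steps(start_c: int, end_c: int, step_c: int, minutes_per_step: int):
    """List (temperature in °C, hold in minutes) for each decrement down to end_c."""
    temps = range(start_c - step_c, end_c - 1, -step_c)
    return [(t, minutes_per_step) for t in temps]


# Stage 1: 54°C to 28°C, 2°C every 2 min. Stage 2: 28°C to 14°C, 2°C every 4 min.
schedule = ramp_steps(54, 28, 2, 2) + ramp_steps(28, 14, 2, 4)
total_minutes = sum(minutes for _, minutes in schedule)

for temperature, minutes in schedule:
    print(f"{temperature:>3} °C for {minutes} min")
print(f"Total ramp time: {total_minutes} min")
```

Under this assumption the ramp consists of 20 steps and takes a little under an hour before the ligase is added.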
Polymerase chain reaction
The RT-PCR reaction was carried out in 1X buffer, 0.1 mM dNTP, 1.5 mM MgCl2, 200 pmol XDPN18 oligonucleotide (5’ CTGATCACTCGAGAGATC 3’), 2 units GoTaq® DNA Polymerase (Promega Corporation) and 10 µl of purified cDNA in a total volume of 20 µl. The reaction was incubated at 95°C for 4 min followed by 35 cycles of 95°C for 45 s, 58°C for 1 min and 72°C for 4 min and a final extension at 72°C for 7 min.
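The relationship between the three oligonucleotides given in the text can be checked directly from their sequences. The following sketch is an illustration rather than part of the original protocol; the helper name `revcomp` and the variable names are hypothetical. It verifies that the 3’ ends of XDPN12 and XDPN14 are complementary, that the annealed adaptor leaves a 5’ GATC overhang matching the Dpn II site, and that the XDPN18 primer corresponds to XDPN14 extended across that site.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def revcomp(seq: str) -> str:
    """Reverse complement of a DNA sequence written 5' to 3'."""
    return seq.translate(COMPLEMENT)[::-1]


XDPN12 = "GATCTCTCGAGT"
XDPN14 = "CTGATCACTCGAGA"
XDPN18 = "CTGATCACTCGAGAGATC"

# The last 8 nt of XDPN14 pair with the last 8 nt of XDPN12.
duplex_ok = XDPN14[-8:] == revcomp(XDPN12[-8:])

# The first 4 nt of XDPN12 remain single-stranded: the Dpn II-compatible overhang.
overhang = XDPN12[:4]

# XDPN18 is XDPN14 extended by the complement of the overhang (GATC is palindromic).
primer_ok = XDPN18 == XDPN14 + revcomp(overhang)

print(duplex_ok, overhang, primer_ok)
```

Because XDPN18 ends with the full reverse complement of XDPN12, it can prime on any strand that carries a ligated XDPN12 at its 5’ end.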
Cloning and sequencing
PCR products were inserted into the T/A plasmid vector pTZ57R/T using the InsT/Aclone PCR Product Cloning Kit (Fermentas Life Sciences), following the manufacturer’s recommendations, in a total volume of 10 µl. The ligation was performed at 22°C for 16 hours. The ligation was dialyzed for 20 min on a 0.025 µm nitrocellulose membrane (Millipore), and 3 µl was used for transformation of DH10B E. coli cells by electroporation (2.5 kV, 25 µF, 200 Ω). The clone inserts were sequenced with an ABI Prism 3100 (Applied Biosystems). The sequencing reaction was performed with the M13 reverse primer (5’ GTCATAGCTGTTTCCTG 3’) and the BigDye Terminator v3.1 cycle sequencing kit (Applied Biosystems), following the manufacturer’s recommendations.
The sequences were automatically analyzed, and regions corresponding to vector sequences were trimmed. Quality control was performed in 20 bp windows, where only windows containing at least 15 bp with a Phred quality score ≥ 20 were considered.
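The window-based quality criterion above can be illustrated with a short sketch. It assumes consecutive, non-overlapping 20 bp windows and per-base Phred scores supplied as a list of integers; whether the original analysis used sliding or non-overlapping windows is not stated, and the function name is hypothetical.

```python
from typing import List, Tuple


def passing_windows(quals: List[int], window: int = 20,
                    min_good: int = 15, min_q: int = 20) -> List[Tuple[int, int]]:
    """Return (start, end) offsets of windows with enough high-quality bases.

    A window passes when at least `min_good` of its `window` bases have a
    quality score of `min_q` or higher. Trailing bases that do not fill a
    complete window are ignored.
    """
    kept = []
    for start in range(0, len(quals) - window + 1, window):
        good = sum(q >= min_q for q in quals[start:start + window])
        if good >= min_good:
            kept.append((start, start + window))
    return kept


# Hypothetical read: a good window, a poor window, then a borderline window
# with exactly 15 bases at or above Q20.
example_quals = [30] * 20 + [10] * 20 + [30] * 15 + [10] * 5
kept_windows = passing_windows(example_quals)
```

Here the first and third windows are retained and the second is discarded.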
The sequences of each library were clustered individually using the CAP3 program, allowing estimation of each library’s redundancy. The consensus sequences were first aligned against the human genome (NCBI build #36.1) using BLAT. Second, to improve the quality and specificity of the alignment, the best hit of each sequence in the genome was selected and realigned using Sim4. Third, sequences showing identity ≥ 93% and sequence coverage (percentage of sequence length aligned) ≥ 55% were considered. Lastly, the sequences were clustered with ESTs from dbEST (8,133,299 sequences), mRNAs (244,284 sequences) and RefSeqs (26,040 sequences) downloaded from UCSC (September 2007) (see Galante for more details).
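The identity and coverage thresholds applied in the third step can be expressed as a simple filter. The sketch below is illustrative only: the `Alignment` record and its field names are hypothetical and do not correspond to the actual BLAT or Sim4 output formats, and identity is assumed to be computed over the aligned portion of the sequence.

```python
from dataclasses import dataclass


@dataclass
class Alignment:
    name: str
    query_length: int    # length of the consensus sequence
    aligned_length: int  # number of query bases included in the alignment
    matches: int         # number of aligned bases identical to the genome

    @property
    def identity(self) -> float:
        return 100.0 * self.matches / self.aligned_length

    @property
    def coverage(self) -> float:
        return 100.0 * self.aligned_length / self.query_length


def passes(aln: Alignment, min_identity: float = 93.0,
           min_coverage: float = 55.0) -> bool:
    """Apply the identity (>= 93%) and coverage (>= 55%) criteria from the text."""
    return aln.identity >= min_identity and aln.coverage >= min_coverage


# Hypothetical alignments: one accepted, one with low coverage, one with low identity.
candidates = [
    Alignment("consensus_a", 400, 300, 290),
    Alignment("consensus_b", 400, 200, 199),
    Alignment("consensus_c", 400, 300, 270),
]
accepted = [aln.name for aln in candidates if passes(aln)]
```

Only the first hypothetical consensus meets both criteria.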
The primers for splice variant validation were designed at the extremities of the ASSET sequence. Twenty nanograms of cDNA from both the total RNA from the C5.2 cell line and the pool of breast cancer samples were used to validate the ASSETs from Lib_1 and Lib_2, respectively. The PCR reaction was performed in a total volume of 20 μl by mixing 1X reaction buffer (Invitrogen Life Technologies), 2.5 mM MgCl2 (Invitrogen Life Technologies), 0.2 mM dNTP (Amersham Biosciences), 10 pmol of each primer and 1 unit Taq Platinum (Invitrogen Life Technologies). PCR reactions were performed with 40 cycles at 95°C for 30 sec, 60°C for 30 sec and 72°C for 30 sec, followed by a final extension at 72°C for 7 min. Amplification products were visualized on an 8% acrylamide gel and subsequently sequenced on an ABI3130.
ERBB2 influence on relative expression
For verifying the ERBB2 influence on gene expression, all ASSETs were amplified using the C5.2 cell line and also the Hb4a cell line, which is a human mammary luminal epithelial cell line. The PCR products were quantified through capillary microfluidic electrophoresis (LabChip GX – Caliper Lifesciences). The expression of the GAPDH gene was used as a normalization factor. The expression ratio was determined by the normalized value of C5.2 divided by the normalized value of Hb4a for each ASSET. Genes were considered to be differently expressed between cell lines for ratios ≥|2|. The differently expressed genes were analyzed in a group of tumor and normal breast samples through a strategy based on specific-probe ligation. The left and right probes were targeted against specific exon junctions of each variant of a gene. The left probe contained at its 5’ end a recognition sequence of the forward PCR primer (GGGTAGGCTAAGGGTAGGA) followed by a stuffer sequence of 38 nucleotides (CCGTTGCCAGTCTGCTCAGACCTCCCTCGCGCCATCAG), and the right probe was phosphorylated at its 5’ end and contained a recognition sequence of the reverse PCR primer (TCTAGATTGGATCTTGCTGGCAC) at its 3’ end. A specific RT primer designed downstream of the probe target sequence was used for cDNA synthesis. The probes were hybridized to pre-heated cDNA from Hb4a and C5.2 at 60°C overnight, and only the probes specifically hybridized to their target sequences were connected by T4 DNA ligase, resulting in one unique probe. As a negative control, ligation and hybridization were performed in the absence of any template for all pairs of probes. The unique probes were PCR amplified. Amplification products were analyzed on 8% acrylamide gel. (Additional file 1)
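The expression ratio defined above (GAPDH-normalized C5.2 value divided by GAPDH-normalized Hb4a value) can be sketched as follows. The signed representation, in which ratios below 1 are reported as negative fold changes, is an assumption made here to match the negative values quoted in the results and the |2| threshold; the function names and the input numbers are hypothetical.

```python
def normalized_ratio(c52: float, gapdh_c52: float,
                     hb4a: float, gapdh_hb4a: float) -> float:
    """GAPDH-normalized expression in C5.2 divided by that in Hb4a."""
    return (c52 / gapdh_c52) / (hb4a / gapdh_hb4a)


def signed_fold(ratio: float) -> float:
    """Report ratios below 1 as negative fold changes (assumed convention)."""
    return ratio if ratio >= 1 else -1.0 / ratio


def differentially_expressed(ratio: float, threshold: float = 2.0) -> bool:
    """Apply the |ratio| >= 2 criterion to the signed fold change."""
    return abs(signed_fold(ratio)) >= threshold


# Hypothetical quantities from the electrophoresis readout.
up = normalized_ratio(40.0, 10.0, 10.0, 10.0)
down = normalized_ratio(5.0, 10.0, 10.0, 10.0)
print(signed_fold(up), differentially_expressed(up))
print(signed_fold(down), differentially_expressed(down))
```

With this convention a gene at half the Hb4a level is reported as -2 and meets the threshold, whereas a ratio of 1.4 does not.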
List of abbreviations used
ASSET: alternative splicing sequence-enriched tag
EST: expressed sequence tag
RT-PCR: reverse transcriptase polymerase chain reaction
Template Switch oligo
This work was supported by Fundação de Amparo à Pesquisa do Estado de São Paulo (CEPID/FAPESP 98/14335). ENF is supported by a FAPESP grant (05/56289-2). We are grateful to the Biobank and the Research and Educational Center at A.C. Camargo Hospital. We thank Dr. Ricardo Renzo Brentani for important comments and corrections on the manuscript.
This article has been published as part of BMC Genomics Volume 11 Supplement 5, 2010: Proceedings of the 5th International Conference of the Brazilian Association for Bioinformatics and Computational Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2164/11?issue=S5.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.