BMC Genomics BioMed Central

Putative cold acclimation pathways in Arabidopsis thaliana identified by a combined analysis of mRNA co-expression patterns, promoter motifs and transcription factors

Abstract
Background: With the advent of microarray technology, it has become feasible to identify virtually all genes in an organism that are induced by developmental or environmental changes. However, relying solely on gene expression data may be of limited value if the aim is to infer the underlying genetic networks. Development of computational methods to combine microarray data with other information sources is therefore necessary. Here we describe one such method.


Background
Prostate cancer is the most frequently diagnosed cancer as well as the second leading cause of cancer death among American men [1]. Androgen ablation therapy for patients with advanced prostate cancer inevitably fails as the disease progresses to an androgen-independent stage [2]. Few effective treatment options are available to these patients, and these increase survival by only a matter of months [3,4]. We examined an in vivo human prostate cancer tumour model to identify the underlying molecular events involved in hormonal progression. The LNCaP hollow fibre model differs from xenograft models by growing the LNCaP human prostate cancer cell line within fibres that are implanted subcutaneously in host mice [5]. These fibres prevent host cells from infiltrating, and contaminating, the tumour cell population. Upon castration of the host the LNCaP cells progress to an androgen-independent stage as determined by a rising titre of serum prostate-specific antigen (PSA), mimicking this aspect of clinical disease [5].
Many genes important in the development and progression of cancer have been identified by first detecting their altered expression at different stages of the disease. It has thus become desirable to perform high-throughput gene expression analyses to quickly assay the expression status of large numbers of genes in a given model or treatment condition. A variety of techniques are available for monitoring gene expression profiles, with microarrays and Serial Analysis of Gene Expression (SAGE) being the most widely used. However, microarray experiments are only able to monitor the expression of genes for which prior knowledge of the transcript sequence is available, and they also lack the sensitivity to detect transcripts expressed at very low levels. The SAGE technique is capable of detecting novel transcripts [6][7][8][9], but SAGE is also not optimal for detecting low abundance transcripts. In contrast, suppression subtractive hybridization includes a normalization step that enriches for rare transcripts in a population of RNAs [10,11]. Subtractive hybridization is also able to detect entirely novel transcripts for which no previous annotation exists [12,13]. Thus, subtractive hybridization is a powerful tool to detect less abundant transcripts and the novel transcripts that tend to be expressed at low levels. In support of this concept, a significant proportion of the transcripts identified by subtractive hybridization were shown to be expressed at levels below the detection limit of Affymetrix GeneChip® arrays [14]. Additionally, subtractive hybridization identified a number of novel transcripts which were not represented on these arrays [14].
Gene expression changes occurring with the hormonal progression of prostate cancer have been examined in various systems (see [15][16][17], for example). Our goal was to utilize the LNCaP hollow fibre model to identify genes that had not been previously associated with prostate cancer. The application of subtractive hybridization resulted in the identification of a number of novel expressed sequences in this model. These sequences exhibit low protein-coding potential and low conservation across species, but RT-PCR experiments confirmed their expression in samples of prostate cancer and in a variety of human tissues.

Novel transcripts were represented in the subtracted cDNA libraries
Suppression subtractive hybridization was used to isolate novel transcripts expressed at different stages of hormonal progression in the LNCaP hollow fibre model. This model enables the isolation and molecular analysis of prostate cells (free from contamination by host cells) at multiple stages of hormonal progression. Samples from intact mice, from mice 10 days after castration (at PSA nadir), and mice 45-60 days post-castration were used as the tester and driver samples in independent subtractive hybridization experiments. Each experiment compared different time points, and both forward and reverse subtractions were performed for each comparison.
To identify novel genes associated with prostate cancer we examined the sequences of all 428 of the subtracted clones isolated from our subtractive hybridization experiments. These sequences were filtered to remove poor quality sequences and any clones containing fewer than 25 bp of sequence, more than one insert, or regions of repetitive sequence (Table 1). Of the remaining 340 clones, 103 clones contained inserts that were represented by other clones in the library. BLAST analysis of the Ensembl database (v. 35, Nov 2005) [18] was then used to identify the transcript represented in each of the remaining 237 non-redundant clones. First, the clones were searched against the population of annotated Ensembl cDNAs, identifying 150 clones (63.3%) derived from previously annotated human cDNAs (Fig. 1). The remaining 87 clones were then mapped to the human genome using the Ensembl database, with all mappings verified using the UCSC Human Genome Browser (hg17, May 2004) [19] and GenBank databases. This search identified 57 clones (24.1%) matching expressed sequence tags (ESTs) and mRNAs which had not yet been classified as Ensembl cDNAs. A clone exhibiting any overlap with a transcript was considered to be part of that transcript. Of the 30 remaining clones, 5 (2.1%) did not align uniquely to the genome, but the other 25 (10.5%) did map uniquely to unannotated regions of the human genome and were considered to represent novel transcripts (Fig. 1). These 25 clones mapped to a variety of locations, including within introns of annotated transcripts, near annotated transcripts, and in intergenic regions. This analysis suggested that novel transcripts were represented in our subtractive hybridization libraries.
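The clone counts above can be cross-checked with a few lines of arithmetic. The following is a minimal sketch that uses only the numbers reported in the text; the category labels are our own shorthand and are not part of the original analysis.

```python
# Clone counts as reported in the text; the dictionary keys are illustrative labels.
total_sequenced = 428
after_quality_filter = 340
redundant = 103

nonredundant = after_quality_filter - redundant  # clones carried into BLAST analysis

categories = {
    "annotated Ensembl cDNA": 150,
    "EST or mRNA not in Ensembl": 57,
    "no unique genome alignment": 5,
    "novel (unannotated region)": 25,
}

# The four categories should account for every non-redundant clone.
assert sum(categories.values()) == nonredundant

# Percentage of the non-redundant set in each category, to one decimal place.
percentages = {
    name: round(100 * count / nonredundant, 1) for name, count in categories.items()
}

for name, count in categories.items():
    print(f"{name}: {count} clones ({percentages[name]}%)")
```

The computed percentages agree with those quoted in the text when the 237 non-redundant clones are used as the denominator.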

Characterization of novel clone sequences
We next assessed the sequence characteristics of these novel clones to determine if there were differences between those mapping to unannotated regions of the genome relative to those clones mapping to known transcripts. Five characteristics were examined for each clone: (1) their protein coding potential, (2) their evolutionary conservation, (3) whether they showed evidence of splicing, (4) whether they demonstrated homology to known non-coding RNAs, and (5) whether they contained a poly-adenylation site. For this analysis, the 25 novel clones (Table 2) were compared to 25 clones randomly chosen from the 57 that mapped to previously identified ESTs (Table 3), and another 25 clones randomly chosen from the 150 mapping to annotated transcripts (Table 4). First, to assess the likely coding potential of each clone, the length of the longest open reading frame (ORF) was determined using the ORF Finder tool at NCBI. The clones derived from known transcripts were much more likely to contain an ORF spanning most of the sequence. It should also be noted that the sequences used for this analysis were derived from single-pass sequencing reads, and thus potential sequencing errors may interrupt longer ORFs. Next, we qualitatively assessed the degree of conservation of each of these clones using the Vertebrate and Multiz Alignment and Conservation track at UCSC [20]. This track provides a measure of evolutionary conservation of a genomic region amongst 17 vertebrate species. As predicted, clones originating from known transcripts tended to show high conservation of their respective genomic loci, while the clones representing novel transcripts mapped to regions of the genome that were poorly conserved. This was consistent with the prediction that highly conserved transcripts would be expressed by a larger number of species, thus increasing the likelihood of the transcript having been detected in previous studies.

Figure 1. Distribution of sequence matches in the suppression subtractive hybridization clone set.
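The coding-potential measure described above (longest ORF as a percentage of clone length) can be illustrated with a short script. This is a simplified sketch and not the NCBI ORF Finder algorithm: it assumes an ORF runs from an ATG to the first in-frame stop codon, scans all six reading frames, and ignores ORFs that run off the end of the clone. The function names are hypothetical.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def longest_orf_length(seq: str) -> int:
    """Length (nt, stop codon included) of the longest ATG-to-stop ORF in six frames."""
    seq = seq.upper()
    best = 0
    for strand in (seq, reverse_complement(seq)):
        for frame in range(3):
            start = None  # index of the first ATG of the ORF currently open
            for i in range(frame, len(strand) - 2, 3):
                codon = strand[i:i + 3]
                if start is None and codon == "ATG":
                    start = i
                elif start is not None and codon in STOP_CODONS:
                    best = max(best, i + 3 - start)
                    start = None
    return best


def orf_percent(seq: str) -> float:
    """Longest ORF as a percentage of the total clone length."""
    return 100.0 * longest_orf_length(seq) / len(seq) if seq else 0.0


# Toy 16 nt example: ATG AAA TTT TAG occupies 12 nt, giving 75%.
print(orf_percent("CCATGAAATTTTAGGG"))
```

Because single-pass reads may contain errors that introduce premature stops or frameshifts, a value computed this way is best read as a lower bound on coding potential.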
We were also interested in whether these sequenced clones derived from spliced regions of transcripts. While over half of the annotated clones appeared to be spliced, only one clone matching an EST was spliced, and none of the novel clones showed evidence of splicing. It may be that the clones matching ESTs and unannotated regions were more frequently derived from untranslated regions of transcripts. This would be consistent with the decreased size and number of ORFs in these clones. A number of families of non-coding RNAs have also now been identified, such as ribosomal RNAs, transfer RNAs, and more recently, microRNAs. To determine if the unannotated sequences identified here may be non-coding RNAs, these sequences were queried against the Rfam database of non-coding RNA families [21]. None of the novel sequences displayed homology to any of the known RNA families in this database. Next we examined all these clones for the presence of poly-adenylation sites. Approximately 30% of all clones, regardless of their mapping, were poly-adenylated, further supporting the conclusion that the novel clones were derived from mRNA transcripts. This result also showed that the known and novel transcript fragments were equally likely to contain poly-adenylation sites, suggesting that novel clones were not specifically biased towards the 3' untranslated regions of transcripts.

Analysis of SAGE tags in the novel clones
SAGE is a second technique that has been used to identify large numbers of novel transcripts [6,7]. We sought to determine if the 25 novel clones had been previously detected in SAGE experiments performed with human tissues. Subtracted clones matching unannotated regions of the human genome were searched for potential NlaIII restriction sites and the 17 bp of downstream sequence was considered to represent a LongSAGE tag. Twenty of our unannotated sequences contained uniquely mapping regions which could be potential LongSAGE tags, and many clones contained multiple possible tags. We then compared our predicted LongSAGE tags to a metalibrary containing approximately 11 million uniquely mapping human SAGE tags previously identified at the B.C. Cancer Agency Genome Sciences Centre or in the Cancer Genome Anatomy Project (CGAP) at NCBI. These tags were identified in 79 SAGE libraries representing a variety of human tissues, including embryonic and haematopoietic stem cells, samples of normal and cancerous lung, brain, pancreas, breast and colon, as well as some human cancer cell lines. Only three of the 20 unannotated clones contained LongSAGE tags that were previously detected in a SAGE experiment (Table 2); none of these tags mapped to annotated transcripts. These tags, from clones 1cE10, 2dB01, and 2dD06, occurred once, thrice, and once, respectively, in the entire human LongSAGE metalibrary, providing in silico evidence that these clones represent novel transcripts that are expressed at very low levels.

Table footnotes: * Length of longest open reading frame as a percentage of the total length of each clone. † Degree of conservation of each clone: + = 0-9%; ++ = 10-29%; +++ = 30-59%; ++++ = 60-79%; and +++++ = 80-100% conservation.
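The in silico tag prediction described in this section can be sketched as follows. NlaIII recognizes the sequence CATG, and each candidate LongSAGE tag is taken here as the 17 bp immediately downstream of a site. Scanning both strands is our assumption, made because the orientation of a subtracted clone is not known; the function names are hypothetical, and the uniqueness check against the genome is not reproduced.

```python
NLAIII_SITE = "CATG"   # NlaIII recognition sequence
TAG_LENGTH = 17        # bp of downstream sequence taken as the LongSAGE tag
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of an upper-case DNA sequence."""
    return seq.translate(COMPLEMENT)[::-1]


def candidate_longsage_tags(clone_seq: str, both_strands: bool = True) -> set:
    """Collect every full-length 17 bp sequence found downstream of a CATG site."""
    seq = clone_seq.upper()
    strands = [seq]
    if both_strands:
        strands.append(reverse_complement(seq))

    tags = set()
    for strand in strands:
        pos = strand.find(NLAIII_SITE)
        while pos != -1:
            tag_start = pos + len(NLAIII_SITE)
            tag = strand[tag_start:tag_start + TAG_LENGTH]
            if len(tag) == TAG_LENGTH:  # skip sites too close to the clone end
                tags.add(tag)
            pos = strand.find(NLAIII_SITE, pos + 1)
    return tags


# Toy clone with a single CATG site followed by 17 bp of sequence.
toy_clone = "AA" + "CATG" + "ACGTACGTACGTACGTA" + "TT"
print(candidate_longsage_tags(toy_clone))
```

A clone with several NlaIII sites yields several candidate tags, which is consistent with the observation that many clones contained multiple possible tags.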
A LongSAGE tag and a novel clone may also derive from the same transcript without the two sequences directly overlapping. We therefore looked for LongSAGE tags mapping to the 50 kb of genomic DNA sequence flanking our novel sequences to identify any neighbouring regions predicted to be expressed based on the previous SAGE experiments. The genomic neighbourhoods of all 20 of the analyzed subtracted clones contained LongSAGE tags that were not from annotated transcripts. As stated above, 3 of these clones mapped directly to a LongSAGE tag. Of the remaining 17 clones, 7 mapped within 2.5 kb of a previously detected LongSAGE tag, while the remaining 10 clones were up to 23 kb from their nearest tag. These tags also occurred at low frequency in the human metalibrary, with each tag being detected from one to three times. These regions of the genome do not appear to encode abundantly expressed transcripts, providing a possible explanation why these transcripts were not detected previously. This analysis supported the concept that our novel subtracted clones were likely to represent entirely novel transcripts.
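The neighbourhood search can be pictured as a nearest-position lookup. The sketch below assumes that each clone is described by start and end coordinates on one chromosome, that tags are reduced to single genomic positions on that chromosome, and that the 50 kb window applies on each side of the clone. These representations, and the function name, are illustrative assumptions rather than the pipeline actually used.

```python
from bisect import bisect_left

FLANK_BP = 50_000  # assumed search window on each side of the clone


def nearest_tag_distance(clone_start, clone_end, tag_positions, flank=FLANK_BP):
    """Distance in bp from a clone to its nearest tag.

    Returns 0 if a tag lies within the clone, and None if no tag lies
    within `flank` bp of either clone boundary.
    """
    positions = sorted(tag_positions)
    i = bisect_left(positions, clone_start)

    distances = []
    if i < len(positions):
        # First tag at or to the right of the clone start.
        right = positions[i]
        distances.append(0 if right <= clone_end else right - clone_end)
    if i > 0:
        # Closest tag strictly to the left of the clone start.
        distances.append(clone_start - positions[i - 1])

    in_window = [d for d in distances if d <= flank]
    return min(in_window) if in_window else None


# Toy coordinates: one tag 2.4 kb upstream, one tag about 29.6 kb downstream.
print(nearest_tag_distance(100_000, 100_400, [97_600, 130_000]))
```

Applied to each clone in turn, a lookup of this kind would give the distances summarized above, for example clones lying within 2.5 kb of a tag versus those up to 23 kb away.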

The novel subtracted clones are expressed by a variety of cell types
To confirm that the novel clones detected by subtractive hybridization are naturally expressed in cells we performed a series of RT-PCR experiments. We chose 6 novel clones and examined their expression in LNCaP cells from the hollow fibre model. All six of the chosen clones were clearly expressed in LNCaP cells (Fig. 2A). The absence of bands in the reverse transcriptase (RT)-negative samples confirmed that these products did not derive from contamination with genomic DNA. The tested clones included two clones containing a previously identified LongSAGE tag, three clones for which the nearest tag was between 2 and 8 kb away, and one clone that was approximately 14 kb from the nearest LongSAGE tag. Thus, the novel transcript fragments isolated here were derived from expressed transcripts that had not previously been documented in any database.
To determine the presence or absence of these novel transcripts in various tissues we performed RT-PCR with a panel of RNAs derived from "normal" human tissues. Each of the 6 clones chosen displayed unique tissue expression profiles, with 4 clones being highly specific to the prostate and testes or the testes alone, while the remaining 2 clones were expressed in all the tissues tested (Fig. 2B). Clone 2A03 was almost exclusively expressed in the prostate, with only a faint product observed in the testes. Even the two clones observed in all the tissues tested appeared to be expressed at variable levels in each of the tissues. Thus, while the expression of some of these clones appeared higher in LNCaP cells, the expression of these novel sequences was not unique to this cell line.
Next we examined the expression of these 6 novel transcripts in three samples of prostate cancer with matched samples of "normal" prostate. While the cancer specimens were not microdissected, these samples were all scored by a pathologist to contain 65-80% tumour tissue. Five of the transcripts were expressed in both the normal and cancer specimens (Fig. 2C). Clone 2dB01 showed the most limited expression, being detectable in only one sample of cancer. The expression of this transcript is quite limited as LNCaP cells and the single cancer specimen were the only non-testicular samples in which expression could be demonstrated (Fig. 2B,C). While expression levels cannot be precisely measured via RT-PCR, some of these clones appeared to show differential expression in the normal samples relative to the cancer samples. From these experiments we conclude that the novel clones derived from our subtractive hybridization experiments in fact represent previously unannotated transcripts that are expressed in a variety of human tissues.

Five of the novel clones are part of larger untranslated regions
The novel clones isolated here were cDNA fragments presumably derived from larger transcripts. We wanted to isolate more sequence information from the full-length transcripts to understand the possible functions of these novel transcripts. The above analysis with the human SAGE metalibrary identified a number of tags mapping near the novel clones isolated using subtractive hybridization, suggesting they may represent the same transcript. RT-PCR experiments were then performed utilizing primers spanning clones 1E05 and 2dB01 and their neighbouring LongSAGE tags. These experiments demonstrated that some of these tags were in fact derived from larger transcripts that also contained these novel subtracted clones (Fig. 3). These PCR products were not detected in reverse transcriptase negative samples, confirming they were not derived from genomic DNA. Upon sequencing of these PCR amplicons, clone 1E05 was demonstrated to be part of an additional 6 kb untranslated region (accession nos. DQ668402 and DQ668403) downstream of a previously identified, but uncharacterized, mRNA represented in the GenBank database (AL832227). Interestingly, mRNA AL832227 was originally identified in human testis tissue. This provides validation of the tissue expression profiles we observed as clone 1E05 was only detected in cDNA from testis (Fig. 2B). Finally, clone 2dB01 was part of a 1.6 kb transcript that did not contain any significant ORFs (accession no. DQ668401). This transcribed region is likely an untranslated region of a larger transcript that has yet to be identified.
Rapid Amplification of cDNA Ends (RACE) was then performed in an attempt to isolate full-length transcripts for the remaining four clones. RACE in the 3' direction successfully extended clones 1B09, 2A03, and 1aC02 (accession nos. EH613608-EH613610) to poly-adenylation sites. Unfortunately, RACE in the 5' direction did not yield larger transcript fragments for any of these clones. In all, these experiments have isolated approximately 0.9 kb of transcript 1B09, 1.5 kb of transcript 2A03, and 2.5 kb of transcript 1aC02, indicating they too encode large non-coding regions of novel transcripts.

Discussion
While the precise number of genes in the human genome remains unknown, it is clear that an even greater number of transcripts are produced by a myriad of alternative splicing events. The recognition of non-coding RNAs has also resulted in a greater focus on transcripts not containing open reading frames. Thus, even with the completed sequence of the genome further experiments are required to fully annotate the functional units transcribed from the genome in the different cell types during normal growth and development, as well as in diseased tissues. High-resolution tiling microarrays have been used to generate predictions of likely transcripts, many of which mapped to intergenic and intronic regions of the genome that were not previously annotated [22][23][24]. Similarly, approaches focused on sequencing full-length cDNAs continue to identify a large number of novel transcripts, many of which appear to be non-protein coding [25,26].

Figure 2. The novel subtracted clones were expressed in a variety of human tissues. (A) RT-PCR was performed with primer pairs specific for each of six novel clones using cDNA from the LNCaP hollow fibre model. RT indicates the presence or absence of Reverse Transcriptase. (B) RT-PCR was performed using cDNAs generated from normal human tissues with each of the six novel clones. All clones were expressed in at least one of the normal tissues. (C) RT-PCR was performed using cDNAs generated from three samples of prostate cancer (T1-T3) and their respective matched normal samples (N1-N3). All six of the clones were expressed in at least one of these samples.

Studies now suggest that the majority of the human genome is transcribed, but the function(s) of most of these transcripts has not yet been demonstrated (reviewed in [27]).
The lack of defined function for these transcripts has led some to propose that they arise indirectly through spurious transcription of the genome. However, new functions of non-coding transcripts continue to be identified, indicating that these "spurious" transcripts likely have functions that have yet to be characterized. For example, steroid receptor RNA activator (SRA) was demonstrated to act as an RNA transcript to regulate the transcriptional activity of steroid nuclear receptors, including the androgen receptor [28]. Another non-coding transcript, expressed at low levels in various tissues, was recently demonstrated to regulate the nuclear trafficking of nuclear factor of T cells (NFAT), and has been renamed non-coding RNA repressor of NFAT (NRON) [29]. Such studies confirm that low-abundance, non-coding, RNA transcripts perform diverse functions and regulate multiple biological processes.

Subtractive hybridization has been used to characterize gene expression changes associated with prostate cancer [12,15,[30][31][32][33], and some of these studies have also characterized novel genes that were identified from their subtracted libraries [12,13,30]. However, these reports examined only a few novel genes that were differentially expressed in their experimental systems, and few studies have used subtractive hybridization to examine changes with hormonal progression in an in vivo model [15]. Our study is unique in that we sequenced all the clones arising from the subtractive hybridization experiments performed with in vivo samples from the LNCaP hollow fibre model. This approach identified a large number of transcripts that had not previously been detected in prostate cancer cells and may be of prognostic or therapeutic value.
We identified 25 completely unannotated clones, and an additional 57 clones matching previously sequenced ESTs that were otherwise unannotated. The isolation of these ESTs specifically from prostate cancer cells may prove informative at a later date. Furthermore, we considered any overlap of an EST or annotated gene with one of our clones to signify that that clone derived from a previously identified transcript. However, several of the clones matching ESTs and annotated transcripts displayed only partial overlap with the known sequence, suggesting that the subtracted clone may still represent an unidentified splice variant of the known transcript. Most of these novel transcript fragments exhibited low sequence conservation amongst vertebrate species, though this may indicate that these transcripts are human- or primate-specific. Prostate cancer is only known to spontaneously occur in humans, rats, and some species of dogs even though the prostate organ is present in all mammals. Furthermore, the human prostate exhibits numerous morphological differences. Thus, it is likely that many transcripts required for prostate development and prostate cancer progression would not be extensively conserved throughout evolution. In support of this, the KLK3 gene, encoding prostate-specific antigen (PSA), is only present in primate genomes [34]; PSA is a well known clinical marker used to monitor prostate cancer progression and response to therapy. This demonstrates that evolutionary conservation alone is not predictive of potential clinical utility.

Figure 3. Two of the novel clones were part of transcripts containing a neighbouring SAGE tag.
The functional relevance of these novel transcripts in the hormonal progression of prostate cancer remains to be elucidated. In our experiments, 4 of the 6 clones tested were expressed only in the normal prostate and testes; this limited tissue expression profile suggests these novel transcripts may function specifically in these organs. This also suggests the expression of these transcripts is tightly regulated, as would be expected for a functional transcript. Furthermore, a related publication from our group identified a novel variant of TMEFF2 which encodes a secreted form of the protein [35]. This alternate form of the protein was identified after a novel clone from our subtractive hybridization library (clone 2A06) was mapped to the fourth intron of the TMEFF2 gene. The expression of TMEFF2 has been shown to increase with progression to androgen independence [15,36], consistent with our subtractive hybridization experiments. TMEFF2 is also currently being investigated as a target for antibody-based therapy in the treatment of prostate cancer [37,38], confirming that our approach identified novel transcripts which may be of interest in the study of prostate cancer. Changes in expression of some of these transcripts may also be valuable as a marker for disease progression.
To characterize the function(s) of these novel transcripts it will first be necessary to identify the full-length cDNA sequence. Multiple techniques are available to recover full-length cDNA molecules starting from only cDNA fragments. For example, RACE is widely used to obtain full-length cDNA sequences [39]. RACE was successful in isolating further 3' sequence information for 3 of the 4 clones for which it was attempted. As in the case of clone 1E05, it is possible that all three of these clones actually derive from large 3' untranslated regions of protein coding transcripts. The large size and low expression levels of these transcripts increase the difficulty of identifying their full 5' sequence. Alternatively, we may have already identified the majority of the sequence in these novel, non-coding transcripts. Another technique to identify full-length sequence for these novel transcripts would be to screen existing cDNA libraries [6,13]. However, given the relatively low expression of these transcripts this approach would likely require extensive screening of such libraries. Two recent studies have used Northern blot analysis to detect expression of similar low-abundance, non-coding, novel transcripts [22,26]. These groups found that only 20-30% of the novel transcripts were detectable by Northern blot analysis, even when using large amounts of poly-A+ RNA, indicating that the remaining transcripts fell below the detection limit of this technique.
Margulies et al. [40] recently described a highly parallel sequencing by synthesis approach that demonstrated an increased throughput for sequencing of genomic DNA. Our group has combined sequencing by synthesis and random shotgun analysis to generate ESTs and characterize the transcriptome of LNCaP cells grown in tissue culture [41]. This study isolated approximately 180,000 ESTs, and of these, 1,900 (1.0%) mapped to the human genome in regions not previously annotated in the Ensembl database. One of these ESTs mapped directly to clone 2A03 described here, while 12 more mapped within 1.5 kb of clones 1cD03 and 2dB01. However, the remaining 22 novel clones identified here were still not detected by this alternative approach.
SAGE has also been used to identify potentially novel transcripts [6][7][8][9]. Unfortunately, while SAGE provides sufficient sequence information to accurately map the tags to the human genome, there is often little other information available to aid in the design of experiments to derive more sequence data from these novel transcripts. In contrast, subtractive hybridization provides longer sequence fragments, but it is not possible to determine the orientation of these fragments. Our data also suggest that subtractive hybridization was able to detect transcripts that had not previously been found using SAGE or high throughput sequencing by synthesis. It is possible that subtractive hybridization may be more sensitive to detecting transcripts expressed at low levels. Alternatively, subtractive hybridization may isolate those transcripts that cannot be efficiently detected by SAGE, for instance transcripts lacking an NlaIII restriction site. Common results from multiple techniques give greater confidence in the identification of novel transcripts and underline the value of using complementary techniques to achieve a more thorough analysis of the human transcriptome.

Conclusion
Our subtractive hybridization experiments have identified novel transcripts that are specifically expressed in the prostate and/or the urogenital tissues. It may be of clinical value to further develop these novel transcripts as prognostic or therapeutic markers for prostate cancer and hormonal progression. Additionally, characterizing such novel transcripts and transcript variants may aid in identifying and understanding the processes important in the development of androgen independent disease.

LNCaP hollow fibre model
The LNCaP hollow fibre model has been described in detail previously [5]. Briefly, LNCaP cells (2 × 10⁷) in RPMI 1640 with 20% (v/v) FBS were sealed inside polyvinylidene fluoride (PVDF) fibres (500 kDa molecular weight cutoff; 1 mm internal diameter; Spectrum Medical Co, Houston, TX) and incubated overnight at 37°C. The fibres were then cut into fragments of approximately 2 cm and inserted subcutaneously into anesthetized male 6-8 week old athymic nude mice (BALB/c strain) obtained from Charles River Laboratory (Montreal, Canada). Serum samples were obtained from the dorsal tail vein of mice every 7 days, and PSA levels measured by an immunoenzymatic assay (Abbott IMX, Montreal, Canada). Serum samples were always obtained prior to the performance of any procedure. After one week mice were castrated by ligation of the vas deferens through a small incision in the scrotum. Control (intact) animals were not castrated, but all other procedures were performed on the same schedule. Hollow fibres were removed on the day of castration, 10 days post-castration, and 45-60 days post-castration when serum PSA levels had risen. All fibres were immediately placed on ice, washed three times in sterile phosphate buffered saline (PBS), and wiped with sterile, moistened lab wipes. Any fibre visibly contaminated by mouse tissue was set aside. To harvest cells, 1 mL of ice-cold TRIZOL® Reagent (Invitrogen, Burlington, Canada) was flushed through the fibres and the cells homogenized with a 21-G needle prior to storage at -80°C. All animal procedures were performed according to protocols approved by the Animal Care Committee at the University of British Columbia.

Suppression subtractive hybridization
The SMART PCR cDNA Synthesis Kit (Clontech, Palo Alto, CA) was used to generate full-length cDNA from 1 μg of starting total RNA using oligo-dT primer according to the manufacturer's protocol. Suppression subtractive hybridization was then performed with the PCR-Select™ cDNA Subtraction Kit (Clontech) according to the manufacturer's protocol. Briefly, the cDNA was digested with RsaI restriction endonuclease to generate fragments of approximately equal lengths. The digested cDNA was purified and split into two populations before ligation of Adaptor 1 or Adaptor 2R. An excess of driver cDNA was added to each reaction, the DNA denatured, and hybridization performed for 8 hours at 68°C. The two populations of cDNA were then combined and fresh denatured driver cDNA added before hybridizing for an additional 16 hours. The final cDNA population was subjected to two rounds of PCR to specifically amplify the differentially expressed cDNA transcripts. The efficiency of subtraction was determined by monitoring levels of the housekeeping gene glyceraldehyde-3-phosphate dehydrogenase (GAPDH) in these samples. The subtracted cDNA pools were then ligated into pCR® 2.1-TOPO vector (Invitrogen) and transformed into competent bacteria. Positive colonies were selected and their inserts sequenced unidirectionally with T7 primer on an ABI 3700 automated sequencer. Clones containing novel inserts were further sequenced to obtain the entire insert sequence. All sequences obtained in these experiments have been deposited in dbEST [Accession numbers EC093848-EC094057; EH613608-EH613610] and GenBank [Accession numbers DQ668378-DQ668403].
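As an aside on the digestion step, the effect of RsaI on a cDNA can be simulated in a few lines. RsaI recognizes GTAC and cuts between the T and the A, leaving blunt ends. The sketch below is illustrative only: the function name and toy sequence are hypothetical, and it models complete digestion of a single strand representation.

```python
RSAI_SITE = "GTAC"  # RsaI recognition sequence
CUT_OFFSET = 2      # blunt cut between T and A (GT^AC)


def rsai_fragments(cdna: str) -> list:
    """Return the fragments produced by complete RsaI digestion of a sequence."""
    seq = cdna.upper()
    fragments = []
    start = 0
    pos = seq.find(RSAI_SITE)
    while pos != -1:
        cut = pos + CUT_OFFSET
        fragments.append(seq[start:cut])
        start = cut
        pos = seq.find(RSAI_SITE, pos + 1)
    fragments.append(seq[start:])  # remainder after the last site
    return fragments


# Toy sequence with two RsaI sites.
toy_cdna = "AAGTACCCGTACTT"
print(rsai_fragments(toy_cdna))
```

Because every subtracted clone begins and ends at an RsaI site or a transcript end, a clone recovered from the library represents only one such fragment of its parent transcript.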

RT-PCR
Total RNA was isolated from LNCaP cells maintained in vivo using TRIZOL® Reagent according to the manufacturer's protocol. Total RNA samples from several human tissues were purchased from Stratagene (La Jolla, CA). Samples of total RNA from cases of prostate cancer, and their matched normal samples, were purchased from Genomics Collaborative (Cambridge, MA). Reverse transcription (RT) was performed using MMLV-RT (Invitrogen) with 1 μg of template RNA. Subsequent PCR reactions were performed using 1 μL of the resulting cDNA as template. The primers used to amplify the clones are summarized in Table 5. PCR products of interest were cloned into pCR® 2.1-TOPO vector and sequenced by the NAPS facility at the University of British Columbia.

RACE
Poly-A+ mRNA from LNCaP cells was used for RACE experiments with the Smart RACE cDNA Amplification kit (Clontech) according to the manufacturer's protocol. PCR products were gel-purified, cloned, and sequenced.