TTS Mapping: integrative WEB tool for analysis of triplex formation target DNA Sequences, G-quadruplets and non-protein coding regulatory DNA elements in the human genome
© Jenjaroenpun and Kuznetsov; licensee BioMed Central Ltd. 2009
Published: 3 December 2009
DNA triplexes can occur naturally, co-localize and interact with many other regulatory DNA elements (e.g. G-quadruplex (G4) DNA motifs), specific DNA-binding proteins (e.g. transcription factors (TFs)), and micro-RNA (miRNA) precursors. Specific genome localizations of triplex target DNA sites (TTSs) may cause abnormalities in the double-helix DNA structure and can be directly involved in some human diseases. However, the genome localization of specific TTSs, their interconnection with regulatory DNA elements and their physiological roles in a cell are poorly defined. Therefore, it is important to identify a comprehensive and reliable catalogue of specific potential TTSs (pTTSs) and their co-localization patterns with other regulatory DNA elements in the human genome.
The "TTS mapping" database is a web-based search engine developed here to find and annotate pTTSs within a region of interest of the human genome. The engine provides descriptive statistics of pTTSs in a given region and its sequence context. Different annotation tracks of TTS-overlapping gene region(s), G4 motifs, CpG islands, miRNA precursors, miRNA targets, transcription factor binding sites (TFBSs), single nucleotide polymorphisms (SNPs), small nucleolar RNAs (snoRNAs), and repeat elements are also mapped onto a sequence location, based on data provided by the UCSC genome browser, the G4 database http://www.quadruplex.org and several other datasets. The results pages provide links to UCSC genome browser annotation tracks and related DBs. The BLASTN program was included to check the uniqueness of a given pTTS in the human genome. Recombination- and mutation-prone genes (e.g. EVI-1, MYC) were found to be significantly enriched in TTSs and in their multiple co-occurrences with other regulatory DNA elements. TTS mapping reveals that a highly complementary and evolutionarily conserved polypurine and polypyrimidine DNA sequence pair linked by a non-conserved short DNA sequence can form miR-483, transcribed from intron 2 of the IGF2 gene, and can bind double-stranded nucleic acid TTSs, forming natural triplex structures.
TTS mapping provides comprehensive visual and analytical tools to help users find pTTSs, G-quadruplets and other regulatory DNA elements in various genome regions. TTS Mapping not only provides sequence visualization and statistical information, but also integrates knowledge about the co-localization of TTSs with various DNA elements and facilitates the analysis of these data. In particular, TTS Mapping reveals a complex structural-functional regulatory module of the IGF2 gene, including a TF MZF1 binding site and the ncRNA precursor mir-483, formed by a highly complementary and evolutionarily conserved polypurine- and polypyrimidine-rich DNA pair. Such ncRNAs are capable of forming helical triplex structures with a polypurine strand of a nucleic acid duplex (DNA or RNA) via Hoogsteen or reverse Hoogsteen hydrogen bonds. Our web tool could be used to discover biologically meaningful genome modules and to optimize the experimental design of anti-gene treatment.
Although triplexes have been well characterized in vitro and their role is being studied in clinical trials, their biological significance in living organisms is still under discussion [2, 4]. It was demonstrated that, once formed, these structures can frequently cause down-regulation and sometimes up-regulation of gene expression, revealing their potential role in gene expression control and suggesting their application in gene therapy [2, 3]. These structures are also able to impair DNA polymerization, and can influence DNA recombination and repair. Triplexes might also have a role in chromatin organization of both interphase nuclei and mitotic chromosomes. Recently, it was demonstrated that a triple-stranded pseudoknot is a conserved essential element of telomerase RNA, for instance in humans. A number of proteins able to bind triplexes have been identified. TTS sequences are common in mammalian genes. Most annotated protein-coding genes in human and other mammalian genomes contain at least one unique and high-affinity TTS in their (putative) promoters and/or transcribed regions [3, 9].
RNAs are integral components of chromosomes and contribute to their structural organization and to the functions of various chromosome regions. Naturally occurring small interfering RNAs (siRNAs) and the closely related class of non-coding micro-RNAs (miRNAs) could play an essential regulatory role in eukaryotic gene expression and cell function [10, 11]. It is now becoming apparent that chromatin architecture and epigenetic memory can be regulated by ncRNA-directed processes, although their exact mechanisms are yet to be understood. It was shown that specific ncRNAs are able to form natural triplexes (a purine-purine-pyrimidine triplex structure called the H-DNA-DNA-RNA form) with the major promoter region of the DHFR gene and to switch alternative transcriptional isoforms of genes. miRNA-based anti-gene therapy is now considered a very promising tool for modulation of gene expression and cell phenotype. However, chromosome co-localization and functional relationships between miRNAs and other types of ncRNAs and TTSs have not been systematically studied.
Certain guanine-rich sequences often co-localize with TTSs. Such sequences can fold spontaneously into four-stranded DNA structures known as G4 DNA motifs. The structure of G4, which comprises stacked G-tetrads, has a square planar arrangement of four guanine bases stabilized by Hoogsteen G-G pairing. It is extremely stable under physiological conditions. Recent studies have demonstrated that G4 DNA structures formed in regulatory regions (e.g. promoters) can overlap with pTTSs and regulate gene expression [12, 13]. A well-known example of such regulation is the repressive effect of G4 DNA on the transcription of the human MYC gene. The transcriptional activity of the MYC gene is reduced considerably when a parallel G4 DNA formed in the nuclease hypersensitive element III1 upstream of the P1 promoter is stabilized by the G4 ligand TMPyP4. However, relatively little is known about the detailed molecular mechanism by which G4 DNA influences genome properties. Consecutive guanine structures and guanine content may contribute to the high binding affinity of TFOs to their targets [15, 16]. Therefore, TTSs that include G4 motifs might be prospective genome elements for the development of novel triplex strategies.
It has been demonstrated in vitro and in vivo that TFOs targeting promoter regions and genic regions of oncogenes and other disease-related genes can diminish the expression of genes, inhibit cell proliferation, and induce apoptosis in the cells. These results also suggest that the effects of TFOs on endogenous gene expression may depend on the specific chromatin structure of the targeted gene. In particular, the status of nucleosome occupancy might be considered a factor in the efficiency of triplex formation [16–18].
Two TTS web resources and search tools for pTTSs in the human genome have been proposed so far. For the first one, the TRACTs software, the input is the gene sequence together with the starting and ending positions of each exon and each intron. The input for this tool can be generated by parsing annotated sequences. TRACTs does not report the specificity of the input DNA sequence, nor the annotation of a sequence that lacks an explicit intron and exon description of the annotated gene. The second, the TFO target sites tool reported by Gaddis in 2006, also provides information about TTS sequences for an annotated gene (chosen by one of several annotated IDs). The pTTSs in gene-flanking regions can also be optimized. The two tools use different criteria for the identification of pTTSs. Both tools provide the user with information about TTS sequence length and gene region locations. However, information about TTSs within intergenic and intragenic regions of a given protein-coding gene, and within any freely selected region of a chromosome, might also be important for many reasons, and such opportunities are not available through these two bioinformatics tools. In the context of discovery and analysis of complex genome regulatory elements and genome architectures, it might also be important to study co-localization patterns of natural TTSs, G4 motifs, precursors of non-coding RNAs (ncRNAs), ncRNA targets and other regulatory elements. However, that kind of analysis has not been carried out yet.
In this work we identify and annotate highly specific pTTSs in any region of the human genome and integrate this information with a G4 dataset and several UCSC genome browser annotation tracks of reported genes, gene models, mRNAs and ESTs. We also provide a multi-track view of the sequences together with several key DNA regulatory elements, including CpG islands, transcription factor (TF) binding sites (BSs), single nucleotide polymorphisms (SNPs), nucleosome occupancy profiles, repeat sites, ncRNA precursors and ncRNA gene targets.
TTS mapping tools features
TTS search parameters
TTS length correlates with TTS uniqueness in the human genome
Table 1. A comparison of the frequencies of uniquely mapped pTTSs found in the TTS databases. Columns: number of unique pTTSs in Wu et al. 2007 [A] and in TTS mapping [B]; ratio of [B] and [A].
We also counted the number of uniquely located pTTSs identified by Wu et al. Table 1 shows that the total number of TTSs in our DB amounts to 55% of the TTSs reported by Wu et al. However, this table also shows that short TTSs constitute a major fraction (46%) of the TTS population. The numbers of medium-length (19-24 nt) and long uniquely located TTSs in both databases (DBs) are similar. Thus, our DB exhibits stronger filtration of the short TTSs than the DB of Wu et al. This difference could be explained by differences in pTTS alignment criteria and by additional filtering procedures related to multiple mapping of the pTTSs reported in . More information regarding a comparison of non-unique and unique sequences in the DBs is presented in Additional file 2.
Another limitation of triplex binding is pyrimidine insertion in the TTS. Even a single pyrimidine insertion can greatly decrease triplex stability [15, 23]. Although modified TFOs can overcome this limitation, the insertions still affect triplex stability in these cases. Nevertheless, TTSs with a few pyrimidine insertions still have to be considered. There are several reports in the literature about the use of modified TFOs for efficient inhibition of gene expression. For example, for the inhibition of MYC gene expression it was demonstrated that a 25-nucleotide-long TTS with 3 thymine insertions was required [25, 26]. Therefore, our TTS mapping retains the flexibility to search for pTTSs with pyrimidine insertions.
G content and TTS length
It was shown that TFOs are able to form stable triplex structures when their length is at least seven nucleotides (nt). Further studies indicated that TFOs longer than 7 nt can form more specific bonds with their targets than shorter ones [15, 28, 29]. Most high-affinity binding between a TFO and a TTS requires high G content (>54% of the total nucleotide sequence length) in the TTS sequence [15, 23]. However, the experiment of Vekhoff et al. demonstrates that GU-TFOs can form stable triplexes with their targets at a minimal G composition of 40-50%, depending on TFO or TTS length and sequence context.
The latter three components were used to identify pTTSs, depending on the user's choice. In the TTS mapping tool, 15 nucleotides is used as the default minimal TTS length because, even though a length of 15 nucleotides does not guarantee strong uniqueness, such sequences may still be interesting if they overlap with transcription factor binding sites or other regulatory elements. The default minimal G content is set to 50%. This value was chosen to ensure that a triplex structure could form and be stable.
Summary tables of TTS mapping
List of found pTTSs and integrative annotation information
The descriptive information about pTTSs is listed in a table by TTS ID (Figure 5). A TTS ID consists of 17 characters in the following order: "TTS", number of pyrimidine insertions, %G, chromosome, and start position of the pTTS, for example: "TTS 3 50 08 128815591". Each TTS ID refers to a sequence, chromosome position, G composition percentage, any overlapping tandem repeat, any overlapping known repeat element, overlapping annotation tracks, and a BLAST genome link, which is used to evaluate the uniqueness of the pTTS in the human genome. Controls for sorting and filtering the table are accessible in the column captions. In addition, the user can click on a TTS ID to display its descriptive information and details on each overlapping annotation track.
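The ID layout described above can be illustrated with a short Python sketch. It assumes that the 17 characters split as 3 ("TTS") + 1 (insertions) + 2 (%G) + 2 (chromosome) + 9 (zero-padded start position), which is consistent with the IDs quoted in this article but is not stated explicitly by the tool. The names `TTSId`, `parse_tts_id` and `format_tts_id` are hypothetical helpers, not part of TTS mapping.

```python
from typing import NamedTuple


class TTSId(NamedTuple):
    insertions: int   # number of pyrimidine insertions (1 character)
    g_percent: int    # %G field (2 characters)
    chromosome: str   # chromosome field (2 characters, e.g. "08")
    start: int        # start position of the pTTS (9 characters)


def parse_tts_id(tts_id: str) -> TTSId:
    """Split a TTS ID into its fields (assumed widths: 3+1+2+2+9)."""
    compact = tts_id.replace(" ", "")  # the text shows IDs with and without spaces
    if len(compact) != 17 or not compact.startswith("TTS"):
        raise ValueError(f"not a 17-character TTS ID: {tts_id!r}")
    return TTSId(
        insertions=int(compact[3]),
        g_percent=int(compact[4:6]),
        chromosome=compact[6:8],
        start=int(compact[8:]),
    )


def format_tts_id(insertions: int, g_percent: int, chromosome: str, start: int) -> str:
    """Build an ID in the same assumed layout (zero-padded fields)."""
    return f"TTS{insertions}{g_percent:02d}{chromosome:0>2}{start:09d}"
```

For example, `parse_tts_id("TTS15008128817597")` yields one insertion, 50 %G, chromosome field "08" and start position 128817597. How the tool encodes sex chromosomes or a %G value of 100 in this layout is not described, so the sketch leaves those cases open.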
Descriptive statistics table of pTTSs and the genomic regions overlapping with pTTSs
Detailed descriptive information about each pTTS is given in the table shown in Figure 6. The table contains the pTTS sequence, chromosome position, strand, %G, the number of pyrimidine insertions, tandem repeats, and known overlapping repeat regions. The table also displays overlaps of the pTTS with genes. Detailed information about other overlapping tracks is also displayed when such overlaps are found.
A case study of pTTSs found in the genes of transcription factors
It has been shown that TTSs can be co-localized with TFBSs in the same putative promoter region [15, 23, 30]. For instance, Figure 7 shows several pTTSs found by our TTS mapping software in the region of the MYC gene. In the zoomed region of Figure 7, the start position of the MYC gene includes the uniquely mapped TTS (TTS15008128817597), which overlaps with the TFBS V$E2F_02. The next downstream TTS (TTS15008128817681) is located in the first exon of the MYC gene and also overlaps with the BS V$P300_01. However, pTTS TTS15008128817681 has multiple locations in the human genome. Additionally, several other annotation tracks including G4, TFBS, miRNA target, and CpG islands are also displayed in Figure 7. More detailed information about pTTSs and other putative regulatory sequences is presented in the supplementary materials (additional files 3, 4). BLAST alignment of pTTS TTS15008128817597 against the human genome confirmed that this pTTS has a unique location. The potential importance of TTS15008128817597 for anti-gene therapy has been demonstrated: inhibition of gene expression via this pTTS in vitro was found by several research groups [25, 31, 32].
In another case study (Figure 8), we identified a pTTS, TTS15003170347908, whose sequence is unique in the genome and overlaps with the TFBS V$MZF1_02. MZF1 is a TF belonging to the Krüppel family of zinc finger proteins, expressed in totipotent hemopoietic cells as well as in myeloid progenitors. MZF1 can act as a tumor/growth suppressor in the hemopoietic compartment.
The same pTTS is also located in EVI-1 gene isoforms: in the first exon of NM_001105078 and in the putative 2 Kb promoter regions of EVI-1 isoforms NM_001105077 and NM_005241. Detailed information about the EVI-1 region is presented in the supplementary materials (additional files 5, 6, 7, 8). We found no evidence in the literature of a possible function or therapeutic application of this pTTS. The biological importance and clinical significance of these associations require further investigation. The presented results suggest that simultaneous mapping of the specific regulatory elements (unique pTTS and TFBS) shown in Figure 8 could provide important information regarding new applications in targeting EVI-1 genes.
TTS and G4 co-occurrence
G4 and pTTSs overlapping in the MYC promoter and MYC intragenic regions. pTTSs overlapped with G4s are in bold type.
CCCCACCTTCCCC ACCCTCCCCACCCTCCCC ATAAGCGCCCCTCCCGGGTTCCC
TTSs can co-occur in ncRNA precursors
The output of TTS mapping shows that TTSs frequently occur in the precursors of small ncRNAs located in the introns of protein-coding genes. For instance, Figure 9A shows an overlap between TTSs and the miRNA precursor hsa-mir-483, located in intron 2 of the short isoform of the IGF2 gene. The IGF2 gene encodes a member of the insulin family of polypeptide growth factors that is involved in development and growth. IGF2 can stimulate various cellular responses, acting as a cell survival factor or mitogenic factor, and can also modify metabolism. It is an imprinted gene and is expressed only from the paternally inherited allele. Although IGF2 is normally only transcribed from the paternal allele, maternal imprinting is lost in many tumors, leading to biallelic expression of the gene. Disruption of imprinting and the resulting increase in gene dosage have been implicated in tumor transformation in a variety of human tissues. Loss of IGF2 imprinting leads to an oncogenic diathesis that enhances the risk of neoplastic transformation. There is a read-through transcript, INS-IGF2, which aligns to this gene at the 3' region and to the upstream INS gene at the 5' region.
Four TTSs overlapped with mir-483 precursor
TTS mapping could predict the non-trivial functional links between the genome localization of specific evolutionarily conserved TTSs and essential regulatory protein-coding genes, natural siRNA precursors and shared TF BSs in gene promoter regions. For instance, the TTS-rich region within intron 2 of the IGF2 gene covers the mir-483 precursor region. Figure 9C shows that one of the four pTTSs (TTS15011002112005) includes the binding site (TTCCCCTCTCCC) of TF MZF1. Additionally, TTS mapping displays trimethylation of histone 3 lysine 27 (H3K27me3) and an absence of trimethylation of histone 3 lysine 4 (H3K4me3) in the IGF2 gene region observed in several cancer cell types, which suggests that transcription of both the IGF2 gene and mir-483 is suppressed. We suggest that the TTSs could form triplexes in the TTS-rich regions within intron 2 of the IGF2 gene and, thus, that TTS-TFO complexes might be involved in mechanisms of local epigenetic regulation of repressive heterochromatin, directly and/or through the recruitment of specific proteins (for instance, MZF1 and triplex-binding proteins directly interacting with TTSs). Data on negative nucleosome occupancy of evolutionarily conserved TTSs in the mir-483 precursor region (Figure 9), displayed in the UCSC viewer, support our hypothesis. Thus, we suggest that both the IGF2 gene and the mir-483 precursor are essential genes (in fetal tissue development and cancer growth) which might be controlled by genetic and epigenetic mechanisms driven by natural TTS-TFO complexes.
In this work we provided an automatic cartography of specific pTTSs within the human genome and integrated this information with other identified regulatory DNA motifs and sites. Our analysis shows that TTS mapping can provide a comprehensive and detailed analysis of the integration of TTSs with other genomic regulatory signals essential for understanding genome stability and gene expression.
We developed flexible web-based search tools for finding and annotating G-rich TFO TTSs within the human genome and for integrating these data with G4 motifs and other regulatory elements, including CpG islands, miRNA precursors, miRNA targets, transcription factor binding sites (TFBSs), single nucleotide polymorphisms (SNPs), small nucleolar RNAs (snoRNAs), nucleosome potential and repeat elements. Descriptive information about each genome region is provided, including the sequence context, overlapping annotated DNA sequence regions and gene regions (e.g. introns and exons), and putative promoter and downstream regions. The engine assists the user in finding highly specific TFO TTSs (pure polypurine sequences longer than 14 bp) and "moderately specific" TFO TTSs (TFO purine target sites longer than 14 bp, including 1, 2 or 3 pyrimidine insertions).
Using the well-studied oncogenes MYC and EVI-1, which exhibit strong recombination- and mutation-prone functions (often associated with human malignancies), as examples, we demonstrated that pTTSs, TFBSs, miRNA targets and several other regulatory elements could be synergistically involved in co-modulation of the promoter regions of MYC and EVI-1.
TTS mapping reveals that a highly complementary, evolutionarily conserved polypurine and polypyrimidine motif pair, linked by a non-conserved short sequence, forms the miR-483 precursor. Originally, the miRNA hsa-miR-483 was located on chromosome 11 and was identified by  in human fetal liver. In addition, miR-483 was annotated in mouse and rat by sequence similarity in mirBase [38, 39]. This ncRNA was detected in an automatic scan (composite score 8.6, energy 31.4) and was scored highly as a miRNA by RNAmicro. Hsa-miR-483 was also detected by RNAz but was not in the input set for EvoFold. In addition to human, rat and mouse, predictions of miR-483 were also made for dog, cow, horse and rabbit http://people.csail.mit.edu/akiezun/miRviewer/mir-483_index.html. Thus, the miR-483 DNA precursor is strongly conserved across many mammals and contains a highly complementary polypurine and polypyrimidine motif pair linked by a short, poorly conserved sequence that forms the hairpin loop of the secondary structure of this ncRNA. The miRNA can be co-transcribed by RNA Pol II with the protein-coding gene IGF2, in which the miRNA precursor is embedded. We suggest that the R-Y TTS pair forming a natural DNA-RNA triplex within the mir-483 precursor, transcribed from the second intron of the IGF2 gene, might be an important target for endogenous regulation of the functions of this ncRNA in human cells.
BLAST alignment of pTTSs with the reference human genome revealed unique overlaps of pTTSs with TFBSs (Figure 8 and Figure 9). Inhibition of gene expression via TFOs forming triplexes with such TTSs was demonstrated by several research groups [25, 31, 32]. In particular, it was shown that triplex-forming oligonucleotides can bind with high specificity to the MYC promoter in HeLa cells, thereby reducing MYC mRNA levels in the cells [25, 31, 32].
Figure 8 and Figure 9C show that pTTSs TTS15003170347908 and TTS15011002112005 include a common anti-sense sequence of the BS of TF MZF1 (having TFBS TTCCCCTCTCCC), which could mediate control of transcription of the genes IGF2 and EVI-1 (Figure 8). These examples suggest that transcription of two (or more) essential genes could be specifically directed via a single highly homologous TTS. Other DNA-binding proteins (e.g. a nuclear protein bound to the TTS in the promoter region of the Ki-ras and MYC genes [14, 34]) can induce site-specific modifications of genome DNA structures and, finally, changes of gene expression. Additionally, silencing mechanism(s) of gene expression could be regulated by some proteins directly interacting with pyrimidine-rich motif(s) of RNAs. For instance, CRD-BP/IMP1 is an oncofetal RNA-binding protein which, via short pyrimidine-rich binding sites, specifically recognizes several RNAs, including the leader 3' IGF2 mRNA and MYC mRNAs [42–44]. De novo CRD-BP/IMP1 expression has been detected in human tumors of different origins, and in some of these tumors it characterizes the vast majority of the samples studied (see [42, 43] for references). However, the possibility of mir-483 forming triplex structures (RNA-DNA-DNA or DNA-DNA-DNA) and the possibility of the CRD-BP/IMP1 protein interacting with the mRNA of the IGF2 gene might play an important role in our understanding of the specific control of IGF2 and mir-483 precursor expression.
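The anti-sense relationship between the MZF1 site and the pTTSs noted above can be checked directly: the reverse complement of the pyrimidine-rich site TTCCCCTCTCCC is a pure polypurine stretch, which is why it can sit inside a polypurine pTTS on the opposite strand. The following Python sketch is only an illustration; `reverse_complement` and `is_polypurine` are hypothetical helper names, not functions of the TTS mapping tool.

```python
# Translation table for complementing unambiguous DNA bases.
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.upper().translate(_COMPLEMENT)[::-1]


def is_polypurine(seq: str) -> bool:
    """True if the sequence is non-empty and consists only of A and G."""
    return bool(seq) and set(seq.upper()) <= {"A", "G"}


mzf1_site = "TTCCCCTCTCCC"                # MZF1 binding site quoted in the text
antisense = reverse_complement(mzf1_site)  # "GGGAGAGGGGAA"
g_percent = 100.0 * antisense.count("G") / len(antisense)

print(antisense, is_polypurine(antisense), round(g_percent, 1))
# prints: GGGAGAGGGGAA True 66.7
```

With 8 guanines in 12 bases, the anti-sense stretch also lies above the 50% G-content default used by the tool.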
TTS mapping provides a comprehensive visual and analytical tool to help a user find pTTSs, G4s and other regulatory DNA structures and genomic regions. Our pipeline allows us (i) to discover diverse biologically meaningful complex genome elements and (ii) to identify novel biomarkers of complex diseases and their unique/specific combinations.
TTS Mapping not only provides bioinformatics support, through its sequence visualization tools and statistical tables, but also integrates knowledge about many diverse DNA structures and their annotation tracks. Our results suggest that the recombination-prone and mutation-prone genes EVI-1 and MYC are significantly enriched with co-occurring regulatory DNA sequences, including TTSs and G4s, which could be used to develop novel approaches for gene therapy based on highly specific TFO-TTS interactions.
TTS Mapping predicts the existence of a subset of natural ncRNAs forming hairpin secondary structures, which include highly complementary and evolutionarily conserved polypurine- and polypyrimidine-rich stems and non-conserved polypurine/polypyrimidine hairpin loops with varying purine/pyrimidine content. Such R-Y paired ncRNAs could form siRNAs and miRNAs, which might be involved in the silencing and activation of expression of many dozens of essential genes. The DNA precursors of such R-Y paired ncRNAs might also be considered prospective targets for highly specific anti-gene therapy.
The human genome sequences were prepared for TTS searching as two data sets: the first data set is the human genome sequence hg18 (NCBI Build 36.1); the second data set corresponds to the same genome sequence (hg18) in which DNA repeat regions were masked (RepeatMasker/RepBase update 9.11). The genome data for BLAST were prepared with the makeblastdb software (NCBI BLAST 2.2.19+ package) from the human genome sequence without repeat masking. NCBI BlastN software was used to verify the uniqueness of each TTS. UCSC annotation tracks (refGene, cpgIslandExt, snp129, targetscans, tfbsConsSites, and wgrna) were integrated for mapping pTTSs with annotation tracks. G-quadruplex sequences, predicted by Quadparser, were obtained from http://www.quadruplex.org. The TTS mapping search engine is publicly available at http://ggeda.bii.a-star.edu.sg/~piroonj/TTS_mapping/TTS_mapping.php.
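For readers unfamiliar with how G4 motifs of the kind predicted by Quadparser are usually located, the sketch below uses the widely cited folding rule of four runs of at least three guanines separated by loops of 1 to 7 nucleotides. The exact parameters behind the G4 dataset used here are not stated in this article, so the rule, the loop limits and the helper name `find_g4_motifs` are assumptions made for illustration. The C-rich pattern reports motifs whose guanine runs lie on the complementary strand.

```python
import re

# Assumed folding rule: G3+ N1-7 G3+ N1-7 G3+ N1-7 G3+ (and its C-rich mirror).
_G4_PLUS = re.compile(r"G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}[ACGT]{1,7}G{3,}")
_G4_MINUS = re.compile(r"C{3,}[ACGT]{1,7}C{3,}[ACGT]{1,7}C{3,}[ACGT]{1,7}C{3,}")


def find_g4_motifs(seq: str):
    """Return sorted (start, end, strand) tuples, 0-based and half-open.

    Only non-overlapping matches are reported, as returned by re.finditer.
    """
    seq = seq.upper()
    hits = [(m.start(), m.end(), "+") for m in _G4_PLUS.finditer(seq)]
    hits += [(m.start(), m.end(), "-") for m in _G4_MINUS.finditer(seq)]
    return sorted(hits)


print(find_g4_motifs("TTGGGAGGGTGGGGAGGGTT"))
# prints: [(2, 18, '+')]
```

Intervals produced this way can then be compared with pTTS coordinates to flag TTS and G4 co-occurrence.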
Construction and Implementation of TTSs mapping
The aim of the TTS mapping pipeline is to provide users with a flexible pTTS search tool for polypurine sequences in the human genome and to integrate the pTTS mapping data with available annotation tracks which could be associated with structural and functional properties of the TTSs. Our web tool requests a chromosome position as an initial search parameter. To form a triplex, a given pTTS should match the following specifications [15, 23, 30]: (a) minimal and maximal pTTS sequence length, (b) minimal guanine (G) content percentage, (c) number of non-contiguous pyrimidine insertions, and (d) human genome sequences with or without repeat masking. Detailed information about the optimal selection of these parameters is presented in the next section. TTS mapping provides optimal pTTS search parameters by default and also allows users to choose their own parameters for mapping pTTSs (see section TTS search parameters in Results). Chromosome co-localization of G-quadruplexes, miRNAs, snoRNAs, miRNA gene targets, TFBSs, SNPs and nucleosome occupancy profiles can also be observed and statistically analysed. The data flow schema is shown in Figure 2. The results of the search are reported in a summary table and are used as input data for integrated mapping of pTTSs with available annotation information.
Figure 3 shows the interface page of the TTS mapping tools. Figure 4 shows how search results are combined in summary tables containing genome region-specific information, pTTS search parameters, statistics of pTTSs found in the given region, and references to gene sequence databases (via refGene from UCSC). That information is reported for each gene in the given genome region that overlaps with a pTTS, and for the annotation tracks found in the given region that overlap with pTTSs. In addition, the summary results page provides links to view the pTTS details and to display all the available tracks in the UCSC genome browser. The TTS mapping tool is implemented in Perl.
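The overlap reporting described above reduces to an interval-intersection test between each pTTS and the features of every annotation track. The tool itself is implemented in Perl and its internal data structures are not described here, so the following Python sketch is only a minimal illustration; the `Interval` type, the half-open coordinate convention and the toy coordinates are assumptions.

```python
from typing import Dict, List, NamedTuple


class Interval(NamedTuple):
    chrom: str
    start: int   # 0-based, inclusive (assumed convention)
    end: int     # exclusive
    name: str


def overlaps(a: Interval, b: Interval) -> bool:
    """True if two half-open intervals on the same chromosome share a base."""
    return a.chrom == b.chrom and a.start < b.end and b.start < a.end


def annotate(ptts: Interval, tracks: Dict[str, List[Interval]]) -> Dict[str, List[str]]:
    """Map each track name to the features overlapping the pTTS (empty tracks omitted)."""
    summary = {}
    for track_name, features in tracks.items():
        names = [f.name for f in features if overlaps(ptts, f)]
        if names:
            summary[track_name] = names
    return summary


# Toy coordinates for illustration only; they are not taken from the article.
ptts = Interval("chr1", 100, 120, "pTTS_a")
tracks = {
    "tfbsConsSites": [Interval("chr1", 110, 125, "site_1"),
                      Interval("chr1", 120, 130, "site_2")],
    "cpgIslandExt": [Interval("chr2", 100, 120, "island_1")],
}
print(annotate(ptts, tracks))
# prints: {'tfbsConsSites': ['site_1']}
```

A linear scan per track is enough for a single region of interest; a genome-wide analysis would more likely sort the features or use an interval tree.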
Criteria of identification of pTTS sequences
The TTS mapping tool was used to identify pTTSs in the reference human genome (hg18). The initial parameters for the program search were the following: minimal TTS length ≥15 nt, minimal guanine content ≥50%, and a number of pyrimidine insertions ranging from 0 to 3 (1 insertion by default). Using these criteria, the pTTS sequences were computationally mapped on the human chromosomes. We used the NCBI BLAST program to align these sequences against the human genome and to count the copy number of each pTTS present in the genome. The unique pTTSs, defined as sequences found at only a single position in the genome, were also identified. This set of unique pTTSs was divided into three groups: short TTSs (15-18 nt long), medium-length TTSs (19-24 nt long), and long TTSs (≥25 nt long). The number of pTTSs in each set was computed and compared with the number of unique pTTSs previously described by Wu et al.
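The search criteria above can be sketched in Python as follows. This is not the tool's Perl implementation, whose algorithm is not described in detail; it is a simple greedy illustration under several stated assumptions: only the given strand is scanned (the opposite strand can be handled by scanning the reverse complement), purine runs are joined only across single pyrimidines, G content is computed over the whole tract including insertions, and a 25 nt tract is counted as long. The names `PTTS`, `find_ptts` and `length_class` are hypothetical.

```python
import re
from typing import List, NamedTuple, Optional


class PTTS(NamedTuple):
    start: int        # 0-based start on the scanned strand
    end: int          # exclusive end
    sequence: str
    insertions: int   # single-pyrimidine interruptions inside the tract
    g_percent: float


def find_ptts(seq: str, min_len: int = 15, min_g: float = 50.0,
              max_insertions: int = 1) -> List[PTTS]:
    """Greedy scan for polypurine tracts with isolated pyrimidine insertions."""
    seq = seq.upper()
    runs = [(m.start(), m.end()) for m in re.finditer(r"[AG]+", seq)]
    hits: List[PTTS] = []
    last_end = -1
    for i, (start, end) in enumerate(runs):
        insertions, j = 0, i
        # Join the next purine run when exactly one pyrimidine separates them.
        while (j + 1 < len(runs) and insertions < max_insertions
               and runs[j + 1][0] - end == 1 and seq[end] in "CT"):
            j += 1
            end = runs[j][1]
            insertions += 1
        if end <= last_end:
            continue  # already covered by a previously reported tract
        tract = seq[start:end]
        g_percent = 100.0 * tract.count("G") / len(tract)
        if len(tract) >= min_len and g_percent >= min_g:
            hits.append(PTTS(start, end, tract, insertions, g_percent))
            last_end = end
    return hits


def length_class(length: int) -> Optional[str]:
    """Length groups used for the comparison of unique pTTSs."""
    if 15 <= length <= 18:
        return "short"
    if 19 <= length <= 24:
        return "medium"
    if length >= 25:
        return "long"
    return None
```

For example, `find_ptts("CCGGAGGAGGTGGAGGAGGAGCC")` returns a single 19 nt tract with one thymine insertion, classed as medium, whereas the same call with `max_insertions=0` returns nothing because neither purine run reaches 15 nt. Uniqueness is not assessed by this sketch; in the pipeline that step is performed by BLAST alignment against the genome.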
Other papers from the meeting have been published as part of BMC Bioinformatics Volume 10 Supplement 15, 2009: Eighth International Conference on Bioinformatics (InCoB2009): Bioinformatics, available online at http://www.biomedcentral.com/1471-2105/10?issue=S15.
Thanks to Dr. I. Kurochkin, Dr. A. Batagov and Dr. A. Yarmishyn for reading of the manuscript and their very useful comments. The authors would like to acknowledge Agency for Science, Technology and Research (A*STAR) for funding and Bioinformatics Institute for providing the web server.
This article has been published as part of BMC Genomics Volume 10 Supplement 3, 2009: Eighth International Conference on Bioinformatics (InCoB2009): Computational Biology. The full contents of the supplement are available online at http://www.biomedcentral.com/1471-2164/10?issue=S3.
- Felsenfeld G, Rich A: Studies on the formation of two- and three-stranded polyribonucleotides. Biochim Biophys Acta. 1957, 26 (3): 457-468. 10.1016/0006-3002(57)90091-4.View ArticlePubMedGoogle Scholar
- Duca M, Vekhoff P, Oussedik K, Halby L, Arimondo PB: The triple helix: 50 years later, the outcome. Nucleic Acids Res. 2008, 36 (16): 5123-5138. 10.1093/nar/gkn493.PubMed CentralView ArticlePubMedGoogle Scholar
- Jain A, Wang G, Vasquez KM: DNA triple helices: biological consequences and therapeutic potential. Biochimie. 2008, 90 (8): 1117-1130. 10.1016/j.biochi.2008.02.011.PubMed CentralView ArticlePubMedGoogle Scholar
- Zain R, Sun JS: Do natural DNA triple-helical structures occur and function in vivo?. Cell Mol Life Sci. 2003, 60 (5): 862-870.PubMedGoogle Scholar
- Chin JY, Schleifman EB, Glazer PM: Repair and recombination induced by triple helix DNA. Front Biosci. 2007, 12: 4288-4297. 10.2741/2388.View ArticlePubMedGoogle Scholar
- Wang G, Carbajal S, Vijg J, DiGiovanni J, Vasquez KM: DNA structure-induced genomic instability in vivo. J Natl Cancer Inst. 2008, 100 (24): 1815-1817. 10.1093/jnci/djn385.PubMed CentralView ArticlePubMedGoogle Scholar
- Shefer K, Brown Y, Gorkovoy V, Nussbaum T, Ulyanov NB, Tzfati Y: A triple helix within a pseudoknot is a conserved and essential element of telomerase RNA. Mol Cell Biol. 2007, 27 (6): 2130-2143. 10.1128/MCB.01826-06.PubMed CentralView ArticlePubMedGoogle Scholar
- Wu Y, Rawtani N, Thazhathveetil AK, Kenny MK, Seidman MM, Brosh RM: Human replication protein A melts a DNA triple helix structure in a potent and specific manner. Biochemistry. 2008, 47 (18): 5068-5077. 10.1021/bi702102d.PubMed CentralView ArticlePubMedGoogle Scholar
- Goni JR, Vaquerizas JM, Dopazo J, Orozco M: Exploring the reasons for the large density of triplex-forming oligonucleotide target sequences in the human regulatory regions. BMC Genomics. 2006, 7: 63-10.1186/1471-2164-7-63.PubMed CentralView ArticlePubMedGoogle Scholar
- Amaral PP, Dinger ME, Mercer TR, Mattick JS: The eukaryotic genome as an RNA machine. Science. 2008, 319 (5871): 1787-1789. 10.1126/science.1155472.
- Martianov I, Ramadass A, Serra Barros A, Chow N, Akoulitchev A: Repression of the human dihydrofolate reductase gene by a non-coding interfering transcript. Nature. 2007, 445 (7128): 666-670. 10.1038/nature05519.
- Du Z, Zhao Y, Li N: Genome-wide analysis reveals regulatory role of G4 DNA in gene transcription. Genome Res. 2008, 18 (2): 233-241. 10.1101/gr.6905408.
- Maizels N: Dynamic roles for G4 DNA in the biology of eukaryotic cells. Nat Struct Mol Biol. 2006, 13 (12): 1055-1059. 10.1038/nsmb1171.
- Siddiqui-Jain A, Grand CL, Bearss DJ, Hurley LH: Direct evidence for a G-quadruplex in a promoter region and its targeting with a small molecule to repress c-MYC transcription. Proc Natl Acad Sci USA. 2002, 99 (18): 11593-11598. 10.1073/pnas.182256799.
- Debin A, Laboulais C, Ouali M, Malvy C, Le Bret M, Svinarchuk F: Stability of G, A triple helices. Nucleic Acids Res. 1999, 27 (13): 2699-2707. 10.1093/nar/27.13.2699.
- Shen C, Buck A, Polat B, Schmid-Kotsas A, Matuschek C, Gross HJ, Bachem M, Reske SN: Triplex-forming oligodeoxynucleotides targeting survivin inhibit proliferation and induce apoptosis of human lung carcinoma cells. Cancer Gene Ther. 2003, 10 (5): 403-410. 10.1038/sj.cgt.7700581.
- Brown PM, Fox KR: Nucleosome core particles inhibit DNA triple helix formation. Biochem J. 1996, 319 (Pt 2): 607-611.
- Brown PM, Fox KR: DNA triple-helix formation on nucleosome core particles. Effect of length of the oligopurine tract. Eur J Biochem. 1999, 261 (1): 301-310. 10.1046/j.1432-1327.1999.00279.x.
- Gal M, Katz T, Ovadia A, Yagil G: TRACTS: A program to map oligopurine.oligopyrimidine and other binary DNA tracts. Nucleic Acids Res. 2003, 31 (13): 3682-3685. 10.1093/nar/gkg625.
- Gaddis SS, Wu Q, Thames HD, DiGiovanni J, Walborg EF, MacLeod MC, Vasquez KM: A web-based search engine for triplex-forming oligonucleotide target sequences. Oligonucleotides. 2006, 16 (2): 196-201. 10.1089/oli.2006.16.196.
- Zweig AS, Karolchik D, Kuhn RM, Haussler D, Kent WJ: UCSC genome browser tutorial. Genomics. 2008, 92 (2): 75-84. 10.1016/j.ygeno.2008.02.003.
- Wu Q, Gaddis SS, MacLeod MC, Walborg EF, Thames HD, DiGiovanni J, Vasquez KM: High-affinity triplex-forming oligonucleotide target sequences in mammalian genomes. Mol Carcinog. 2007, 46 (1): 15-23. 10.1002/mc.20261.
- Perkins BD, Wilson JH, Wensel TG, Vasquez KM: Triplex targets in the human rhodopsin gene. Biochemistry. 1998, 37 (32): 11315-11322. 10.1021/bi980525s.
- Campbell MA, Miller PS: Cross-linking to an interrupted polypurine sequence with a platinum-modified triplex-forming oligonucleotide. J Biol Inorg Chem. 2009.
- Kim HG, Reddoch JF, Mayfield C, Ebbinghaus S, Vigneswaran N, Thomas S, Jones DE, Miller DM: Inhibition of transcription of the human c-myc protooncogene by intermolecular triplex. Biochemistry. 1998, 37 (8): 2299-2304. 10.1021/bi9718191.
- Postel EH, Flint SJ, Kessler DJ, Hogan ME: Evidence that a triplex-forming oligodeoxyribonucleotide binds to the c-myc promoter in HeLa cells, thereby reducing c-myc mRNA levels. Proc Natl Acad Sci USA. 1991, 88 (18): 8227-8231. 10.1073/pnas.88.18.8227.
- Mouscadet JF, Ketterle C, Goulaouic H, Carteau S, Subra F, Le Bret M, Auclair C: Triple helix formation with short oligonucleotide-intercalator conjugates matching the HIV-1 U3 LTR end sequence. Biochemistry. 1994, 33 (14): 4187-4196. 10.1021/bi00180a011.
- Plum GE, Pilch DS, Singleton SF, Breslauer KJ: Nucleic acid hybridization: triplex stability and energetics. Annu Rev Biophys Biomol Struct. 1995, 24: 319-350. 10.1146/annurev.bb.24.060195.001535.
- Reither S, Jeltsch A: Specificity of DNA triple helix formation analyzed by a FRET assay. BMC Biochem. 2002, 3: 27. 10.1186/1471-2091-3-27.
- Vekhoff P, Ceccaldi A, Polverari D, Pylouster J, Pisano C, Arimondo PB: Triplex formation on DNA targets: how to choose the oligonucleotide. Biochemistry. 2008, 47 (47): 12277-12289. 10.1021/bi801087g.
- Carbone GM, McGuffie E, Napoli S, Flanagan CE, Dembech C, Negri U, Arcamone F, Capobianco ML, Catapano CV: DNA binding and antigene activity of a daunomycin-conjugated triplex-forming oligonucleotide targeting the P2 promoter of the human c-myc gene. Nucleic Acids Res. 2004, 32 (8): 2396-2410. 10.1093/nar/gkh527.
- McGuffie EM, Catapano CV: Design of a novel triple helix-forming oligodeoxyribonucleotide directed to the major promoter of the c-myc gene. Nucleic Acids Res. 2002, 30 (12): 2701-2709. 10.1093/nar/gkf376.
- Gaboli M, Kotsi PA, Gurrieri C, Cattoretti G, Ronchetti S, Cordon-Cardo C, Broxmeyer HE, Hromas R, Pandolfi PP: Mzf1 controls cell proliferation and tumorigenesis. Genes Dev. 2001, 15 (13): 1625-1630. 10.1101/gad.902301.
- Cogoi S, Quadrifoglio F, Xodo LE: G-rich oligonucleotide inhibits the binding of a nuclear protein to the Ki-ras promoter and strongly reduces cell growth in human carcinoma pancreatic cells. Biochemistry. 2004, 43 (9): 2512-2523. 10.1021/bi035754f.
- Arora A, Dutkiewicz M, Scaria V, Hariharan M, Maiti S, Kurreck J: Inhibition of translation in living eukaryotic cells by an RNA G-quadruplex motif. RNA. 2008, 14 (7): 1290-1296. 10.1261/rna.1001708.
- Steigen SE, Schaeffer DF, West RB, Nielsen TO: Expression of insulin-like growth factor 2 in mesenchymal neoplasms. Mod Pathol. 2009, 22 (7): 914-921. 10.1038/modpathol.2009.48.
- Fu H, Tie Y, Xu C, Zhang Z, Zhu J, Shi Y, Jiang H, Sun Z, Zheng X: Identification of human fetal liver miRNAs by a novel method. FEBS Lett. 2005, 579 (17): 3849-3854. 10.1016/j.febslet.2005.05.064.
- Griffiths-Jones S: The microRNA Registry. Nucleic Acids Res. 2004, 32 (Database issue): D109-111. 10.1093/nar/gkh023.
- Griffiths-Jones S, Grocock RJ, van Dongen S, Bateman A, Enright AJ: miRBase: microRNA sequences, targets and gene nomenclature. Nucleic Acids Res. 2006, 34 (Database issue): D140-144. 10.1093/nar/gkj112.
- Hertel J, Stadler PF: Hairpins in a Haystack: recognizing microRNA precursors in comparative genomics data. Bioinformatics. 2006, 22 (14): e197-202. 10.1093/bioinformatics/btl257.
- Washietl S, Pedersen JS, Korbel JO, Stocsits C, Gruber AR, Hackermuller J, Hertel J, Lindemeyer M, Reiche K, Tanzer A, et al: Structured RNAs in the ENCODE selected regions of the human genome. Genome Res. 2007, 17 (6): 852-864. 10.1101/gr.5650707.
- Ioannidis P, Mahaira LG, Perez SA, Gritzapis AD, Sotiropoulou PA, Kavalakis GJ, Antsaklis AI, Baxevanis CN, Papamichail M: CRD-BP/IMP1 expression characterizes cord blood CD34+ stem cells and affects c-myc and IGF-II expression in MCF-7 cancer cells. J Biol Chem. 2005, 280 (20): 20086-20093. 10.1074/jbc.M410036200.
- Jonson L, Vikesaa J, Krogh A, Nielsen LK, Hansen T, Borup R, Johnsen AH, Christiansen J, Nielsen FC: Molecular composition of IMP1 ribonucleoprotein granules. Mol Cell Proteomics. 2007, 6 (5): 798-811. 10.1074/mcp.M600346-MCP200.
- Sparanese D, Lee CH: CRD-BP shields c-myc and MDR-1 RNA from endonucleolytic attack by a mammalian endoribonuclease. Nucleic Acids Res. 2007, 35 (4): 1209-1221. 10.1093/nar/gkl1148.
- Tarailo-Graovac M, Chen N: Using RepeatMasker to identify repetitive elements in genomic sequences. Curr Protoc Bioinformatics. 2009, Chapter 4: Unit 4.10.
- Zhang H: Alignment of BLAST high-scoring segment pairs based on the longest increasing subsequence algorithm. Bioinformatics. 2003, 19 (11): 1391-1396. 10.1093/bioinformatics/btg168.
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.