Genomic organisation and alternative splicing of mouse and human thioredoxin reductase 1 genes

Background Thioredoxin reductase (TR) is a redox active protein involved in many cellular processes as part of the thioredoxin system. Presently there are three recognised forms of mammalian thioredoxin reductase designated as TR1, TR3 and TGR, that represent the cytosolic, mitochondrial and novel forms respectively. In this study we elucidated the genomic organisation of the mouse (Txnrd1) and human thioredoxin reductase 1 genes (TXNRD1) through library screening, restriction mapping and database mining. Results The human TXNRD1 gene spans 100 kb of genomic DNA organised into 16 exons and the mouse Txnrd1 gene has a similar exon/intron arrangement. We also analysed the alternative splicing patterns displayed by the mouse and human thioredoxin reductase 1 genes and mapped the different mRNA isoforms with respect to genomic organisation. These isoforms differ at the 5' end and encode putative proteins of different molecular mass. Genomic DNA sequences upstream of mouse exon 1 were compared to the human promoter to identify conserved elements. Conclusions The human and mouse thioredoxin reductase 1 gene organisation is highly conserved and both genes exhibit alternative splicing at the 5' end. The mouse and human promoters share some conserved sequences.


Background
The thioredoxin system comprises thioredoxin (Trx) and thioredoxin reductase (TR) and plays an important role in maintaining the redox state of the cell [1]. This system is involved in many cellular functions including synthesis of deoxyribonucleotides [2], redox control of transcription factors [3], protection against oxidative stress [4], cell growth and cancer [5]. The reduced form of Trx is maintained by TR, an enzyme that contains a redox active disulphide group and a FAD molecule that uses the reducing power of NADPH [6,7]. Presently, there are three recognised forms of mammalian thioredoxin reductase, TR1, TR3 and TGR, which share a similar domain organisation and all contain selenocysteine [8][9][10][11][12][13][14]. TR1 was the first thioredoxin reductase to be characterised and is known as the cytosolic form [8]. TR3 is the mitochondrial form and is also known as TrxR2 [9][10][11]. TGR, previously known as the novel form of thioredoxin reductase (or TR2), functions as a Trx and GSSG reductase [11,12]. In this study, we focus upon the cytosolic form of mouse thioredoxin reductase (mTR1) and human thioredoxin reductase (hTR1). The mouse Txnrd1 gene is located on chromosome 10 and yields a cDNA sequence approximately 3200 bp long. This sequence encodes a protein with 499 amino acids and a molecular mass of 54.5 kDa [15]. The human TXNRD1 gene is located on chromosome 12 [16] and the first cDNA sequence reported was 3826 bp and encoded a protein of 495 amino acids with a molecular mass of 54.1 kDa [8]. Recently it was reported that the mouse Txnrd1 gene is alternatively spliced, producing three isoforms of mouse Txnrd1 mRNA that vary in their 5' sequences but have common downstream sequences [17]. The human TXNRD1 gene has also recently been reported to exhibit possible alternative splicing around its first exon [18].
Here we report the genomic organisation of the human and mouse thioredoxin reductase 1 genes, subsequently detailing the alternative splicing pattern of the first exon in both species. We also present a potential mouse Txnrd1 promoter region and compare this genomic DNA sequence with the recently studied TXNRD1 gene promoter [18] to identify conserved sequences.

Mouse Txnrd1 genomic organisation
To determine the genomic structure of the mouse Txnrd1 gene we initially screened a mouse genomic DNA library using a mouse Txnrd1 cDNA fragment as a probe. Screening, restriction mapping and Southern blot analysis revealed two overlapping Txnrd1 genomic clones approximately 15 kb in length. One clone was extensively mapped and sequenced, and the sequence was aligned with the Txnrd1 cDNA sequence [15] to reveal intron/exon junctions. The genomic sequences contained exons 5, 6, 9, 10, 11, 12, 13, 14 and 15 (accession numbers AF394929, AF394930, AF394931, AF394932, AF394933, AF394934, AF394935, AF394936 and AF394937 respectively). The location of exons 7 and 8 was determined by restriction enzyme mapping; however, the precise intron/exon junctions were established using mouse genomic sequences from the National Center for Biotechnology Information (NCBI) Trace Archive [www.ncbi.nlm.nih.gov]. Two sequences numbered 18222987 and 20784325 in the archive were used to confirm exons 7 and 8 respectively. These alignments enabled the intron/exon boundaries to be fully mapped between exons 5 and 15 (Table 1). Exons 5 to 15 span approximately 14 kb, which is consistent with the size of the mouse genomic Txnrd1 clone obtained from the library screen. The intron/exon junctions for exons 1 to 4 and 16 were determined by database mining using genomic Txnrd1 DNA sequences deposited in the Trace Archive of the NCBI. Database searches using cDNA sequences corresponding to exons 1 to 4 and exon 16 [15] produced a number of 1 kb sequences. The mouse Txnrd1 genomic sequences 19029770, 19679815, 18378992, 22011035, 17424013 and 18272097 were used to deduce intron/exon boundaries for exons 1, 2, 3a, 3b, 4 and 16 (respectively), although intron lengths could not be determined. Table 1 shows the length of each exon, ranging from the shortest, exon 5 (73 bp), to the longest, exon 3b (338 bp). The principal ATG start codon is located in exon 4 and the TAA stop codon in exon 16. Another ATG start codon, utilised by one mRNA isoform generated by alternative splicing, is found in exon 3b. Table 1 also shows the intron lengths in the region spanning exons 5 to 15. These sizes range from the shortest, intron 13 (94 bp), to the longest, intron 15 (3.13 kb). All introns display splice signals consistent with the GT/AG rule [19], except intron 4, in which the 5' splice donor is GC instead of GT. The same deviation in the 5' splice donor is found in intron 4 of the human TXNRD1 gene (Table 1).
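The splice signal check described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the procedure used in the study: the function name `splice_signals` and both toy intron sequences are hypothetical and are not taken from Txnrd1 or TXNRD1.

```python
def splice_signals(intron: str) -> tuple[str, str, bool]:
    """Return (donor, acceptor, is_canonical) for an intron sequence.

    The donor is the first two bases of the intron (5' end) and the
    acceptor is the last two bases (3' end). The intron is canonical
    when the donor is GT and the acceptor is AG.
    """
    seq = intron.upper()
    if len(seq) < 4:
        raise ValueError("intron too short to hold both splice signals")
    donor, acceptor = seq[:2], seq[-2:]
    return donor, acceptor, (donor == "GT" and acceptor == "AG")


# Toy introns (hypothetical sequences, for illustration only).
canonical = "GTAAGTCTTTTCTCTTTCAG"
gc_donor = "GCAAGTCTTTTCTCTTTCAG"  # GC donor, the kind of deviation reported for intron 4

for name, intron in [("canonical", canonical), ("GC donor", gc_donor)]:
    print(name, splice_signals(intron))
```

A GC donor is flagged as non-canonical by this check, which is how the intron 4 deviation would surface when scanning a table of intron sequences.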

Human TXNRD1 genomic organisation
To determine the genomic structure of the human TXNRD1 gene, the human genome data from the NCBI database [www.ncbi.nlm.nih.gov] was searched using the TXNRD1 cDNA sequence [8]. Alignments were used to establish intron/exon junctions, and these results were confirmed through the adherence of predicted splice signals to the AG/GT rule [19] and by comparison with the mouse intron/exon junctions described above. The cDNA sequences for mouse and human thioredoxin reductase are approximately 80% homologous, and the human [20] and mouse [21] thioredoxin genes have an identical genomic organisation. Therefore we predicted that the human and mouse thioredoxin reductase genes would also have a similar organisation. After searching approximately 100 kb of human TXNRD1 genomic DNA sequence, all exons were mapped.
The mapping of the human TXNRD1 gene is detailed in Table 1. Exon lengths range from the shortest, exon 1a (42 bp), to the longest, exon 3b (341 bp). The start codon is located in exon 4; however, another ATG is found in exon 1 and may be utilised in at least one mRNA isoform [18]. The stop codon is positioned in exon 16, as is the 3' UT region. Intron sizes vary greatly from the shortest, intron 10 (917 bp), to the longest mapped intron, intron 12 (26.97 kb). Table 1 also lists the intron/exon splice signals, which all conform to the AG/GT rule [19], except those of intron 4. As in intron 4 of the mouse Txnrd1 gene, the human 5' splice donor contains GC instead of GT. The occurrence of this splice signal deviation in both mouse and human thioredoxin reductase intron 4 sequences is interesting since the 5' regions of both genes are involved in alternative splicing, and this could represent a potential regulatory site.

Alternative splicing of mouse and human TXNRD1 genes
Recent reports of alternative splicing in both the mouse [17,22] and human [18] thioredoxin reductase genes led us to investigate the link between splicing events and genomic organisation. This investigation was carried out by comparing mouse and human thioredoxin reductase expressed sequence tags (ESTs) with the genomic organisation described in this report.

Alternative splicing of mouse Txnrd1 gene
The Gladyshev group [17] proposed that the first three exons in the mouse Txnrd1 gene are alternatively spliced, producing three isoforms of mouse Txnrd1 mRNA (Figure 1). These three mRNA isoforms were reported to contain exons 1 and 4, 2 and 4, and 3 and 4 to generate isoforms I, II and III respectively. Transcripts consistent with some of these isoforms were described in another study [22], although they were labelled differently.

• Isoform I
As reported by Sun et al [17], isoform I contains the ATG start codon positioned within exon 4 and translation of the resulting sequence produces the first reported mouse TR1 protein of 54.5 kDa [15].

• Isoform II
With respect to isoform II, we noted that ESTs (for example, accession number AI956288) containing exon 2 always displayed exon 1 immediately upstream, or did not extend far enough to be informative. Thus isoform II appears to contain exons 1, 2, 4, 5, 6, 7 etc (Figure 1). The only ATG start codon in the correct reading frame is located in exon 4, hence translation of this isoform would also yield the 54.5 kDa mouse TR1 protein [15].

• Isoform III
With respect to isoform III, ESTs containing exon 3 (accession numbers AA168412, AI607108) were aligned with mouse Txnrd1 genomic DNA sequences in order to establish the intron/exon junctions. Alignment of a mouse Txnrd1 EST containing exon 3 (accession number AI607108) with mouse genomic Txnrd1 sequence (Trace Archive 22011035) showed that exon 3 actually comprises two exons, designated exons 3a and 3b in this study. The splicing signals around exon 3b are consistent with the AG/GT rule [19]. Exon 3a was aligned with the mouse genomic Txnrd1 DNA sequence (Trace Archive 18378992) to confirm a consensus splice sequence at the 5' splice donor site of the intron between exons 3a and 3b. The 5' end of exon 3a could not be accurately defined since the 5' end of isoform III has not been accurately mapped. All ESTs containing exon 3 sequences contain both exon 3a and 3b and therefore isoform III is predicted to contain exons 3a, 3b, 4, 5, 6, 7 etc (Figure 1). Two ATG start codons in the same reading frame are present in exons 3b and 4. Translation from the start codon encoded in exon 4 would produce the original 54.5 kDa protein; however, use of the start codon in exon 3b yields a protein with a predicted molecular mass of 67 kDa. This 67 kDa protein coincides with the 67 kDa mTR1 protein previously reported [17].

Alternative splicing of human TXNRD1 gene
Alternative splicing of the human TXNRD1 gene was investigated based on that already determined for the mouse Txnrd1 gene and also on recent reports of possible alternative splicing in the human TXNRD1 gene [17,18]. The Arner group [18] reported three isoforms of human TXNRD1 mRNA (I, II and III); however, these isoforms do not align with the three mouse mRNA isoforms previously discussed. In this report the human TXNRD1 isoforms are numbered according to their structural and sequence similarity to the mouse Txnrd1 isoforms where possible. Consequently the isoforms I, II and III reported by the Arner group are denoted here as isoforms V, II and I (respectively). In addition to the isoforms reported by the Arner group we identified two further isoforms (isoforms IV and VI) and propose another (isoform III) that would align with the mouse isoform III.

• Isoform I
Isoform I contains exons 1, 2, 4, 5, 6, 7 etc (accession number BF182740) and represents the human TXNRD1 cDNA sequence first reported [8]. There are two ATG start codons in the correct reading frame, in exons 1 and 4. The ATG in exon 4 directs translation of the original 54.1 kDa hTR1 protein [8]. Translation of this mRNA isoform from the ATG in exon 1 yields a protein with a predicted molecular mass of approximately 60 kDa [18]. This protein coincides with a 60 kDa hTR1 protein previously reported [22].

• Isoform II
Isoform II is comprised of exons 1, 4, 5, 6, 7 etc (accession number AU077310). The ATG start codon is in exon 4 and directs translation of the original 54.1 kDa hTR1 protein.

• Isoform III
Isoform III is predicted to contain exons 3b, 4, 5, 6, 7 etc; however, no human thioredoxin reductase EST was found to contain exon 3b. Therefore isoform III is not denoted in Figure 1. This isoform prediction was based on comparisons of the mouse exon 3a and 3b sequences with human genomic TXNRD1 sequences. A human TXNRD1 genomic region with homology to mouse exon 3b was detected (Figure 2). This alignment was flanked by canonical GT/AG splice signals at positions equivalent to those observed for mouse exon 3b, indicating the possibility of a human TXNRD1 mRNA isoform III. Exon 3a was not identified in the human genome sequences, possibly due to low homology between the mouse and human 5' UT regions. An alignment of the deduced amino acid sequences derived from mouse and human exon 3b (Figure 3) reveals 50% identity between the two species. If conservative amino acid substitutions are taken into consideration, the homology increases to approximately 70%. The high degree of sequence conservation further substantiates the likelihood of exon 3b being utilised in an as yet undescribed human mRNA variant.
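The distinction drawn above between identity and identity plus conservative substitutions can be illustrated with a short Python sketch. The source does not state which substitution scheme was used to call a change conservative, so the similarity groups below are an assumption chosen for illustration, and the function names and toy rows are hypothetical.

```python
# Hypothetical similarity groups (an assumption; not taken from the paper).
SIMILAR_GROUPS = ("ILVM", "FWY", "KRH", "DE", "STNQ", "AG")


def is_conservative(x: str, y: str) -> bool:
    """True if x and y are identical or fall in the same similarity group."""
    return x == y or any(x in g and y in g for g in SIMILAR_GROUPS)


def alignment_stats(row_a: str, row_b: str) -> tuple[float, float]:
    """Return (% identity, % similarity) over all aligned columns.

    Both rows must have the same length; '-' marks a gap, and a column
    containing a gap counts as neither identical nor similar.
    """
    a, b = row_a.upper(), row_b.upper()
    if len(a) != len(b) or not a:
        raise ValueError("rows must be non-empty and of equal length")
    identical = similar = 0
    for x, y in zip(a, b):
        if "-" in (x, y):
            continue
        identical += x == y
        similar += is_conservative(x, y)
    return 100 * identical / len(a), 100 * similar / len(a)


# Toy aligned rows (hypothetical, not the exon 3b sequences).
identity, similarity = alignment_stats("MKVLA-", "MRILAG")
print(f"identity {identity:.1f}%, similarity {similarity:.1f}%")
```

Because every identical column is also counted as similar, the second figure can never be lower than the first, which matches the direction of the increase reported in the text.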
An analysis of the reading frame in exon 3b revealed three in-frame ATG codons that could potentially initiate translation. Translation of this mRNA isoform from the most 5' ATG start codon in exon 3b would produce a protein with a predicted molecular mass of approximately 67 kDa. A human TR1 protein with a mass of 67 kDa has been detected [17], substantiating the possibility that exon 3b is present in some mRNA transcripts. A search of the EST and nucleotide databases failed to identify any expressed sequences that contain exon 3b.

• Isoform IV
Isoform IV was discovered following an EST database search. This isoform contains exons 1a, 4, 5, 6, 7 etc (accession number AU132293). Exon 1a is a product of an internal splice site located within exon 1. This splice site generates a 42 bp fragment that corresponds to the 5' end of exon 1. There is one ATG start codon in the correct reading frame, in exon 4, and translation from this ATG yields the original 54.1 kDa hTR1 protein [8].

• Isoform VI
Isoform VI was discovered via an EST database search. This isoform contains exons 2, 4, 5, 6, 7 etc (accession number BG772375) and the ATG codon is found in exon 4, where it directs translation to potentially produce the original 54.1 kDa TR1 protein [8]. The EST that revealed this isoform extends a further 300 bp immediately upstream of the 5' end of exon 2. The nature of this 300 bp region is unknown as it did not display homology with human TXNRD1 cDNA sequences or any sequence in GenBank. Consequently the 5' end of this isoform may not terminate with exon 2.
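A search for in-frame ATG codons of the kind described above can be sketched as follows. This is a minimal illustration, not the analysis performed in the study: the toy sequence is hypothetical, the function names are invented here, and the mass estimate relies on the common rule of thumb of roughly 110 Da per amino acid residue, which is an assumption and not a figure from the source.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}
AVG_RESIDUE_MASS_DA = 110.0  # rule-of-thumb average residue mass (assumption)


def open_atgs(seq: str, frame: int = 0) -> list[int]:
    """0-based positions of in-frame ATG codons that read through to the
    end of `seq` without meeting an in-frame stop codon."""
    dna = seq.upper()
    starts: list[int] = []
    for i in range(frame, len(dna) - 2, 3):
        codon = dna[i:i + 3]
        if codon in STOP_CODONS:
            starts.clear()  # ATGs upstream of a stop cannot reach the 3' end
        elif codon == "ATG":
            starts.append(i)
    return starts


def approx_mass_kda(n_residues: int) -> float:
    """Rough protein mass estimate from residue count."""
    return n_residues * AVG_RESIDUE_MASS_DA / 1000.0


# Toy sequence (hypothetical): ATG AAA TGA ATG CCC ATG GGG
print(open_atgs("ATGAAATGAATGCCCATGGGG"))
print(approx_mass_kda(499))
```

With this approximation the 499 residue mouse protein comes out at about 54.9 kDa, close to the 54.5 kDa reported in [15], so the estimate is adequate for comparing candidate start codons but not for precise mass prediction.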

Potential thioredoxin reductase regulatory elements
The alternative splicing pattern utilised by both the TXNRD1 and Txnrd1 genes generates mRNA isoforms that are heterogeneous at the 5' end. This alternative splicing has consequences for possible gene regulation and the location of control elements. Indeed Rundlof and co-workers [22] recently reported that the alternatively spliced mouse transcripts are expressed in a tissue specific manner. Currently the only genome information regarding transcriptional control of thioredoxin reductase 1 genes comes from the recent identification and characterisation of the core promoter for the human TXNRD1 gene [18]. This core promoter of approximately 180 bp contains Oct 1, Sp1 and Sp3 binding sites and has an increased GC content, suggesting the human TXNRD1 gene is a housekeeping gene. However this does not explain the response of TXNRD1 to cellular signalling, since thioredoxin reductase protein and mRNA levels are known to increase quickly and significantly in human cells in response to exogenous agents. It is possible that the core promoter for the human TXNRD1 gene is accompanied by other promoter elements that have not yet been identified. Comparisons between mouse and human promoter regions may reveal these potential regulatory elements.

Figure 2
Alignment of mouse thioredoxin reductase 1 (Txnrd1) exon 3b with partial human thioredoxin reductase 1 (TXNRD1) genomic DNA sequence. Splice junctions (AG/GT) are outlined in TXNRD1. The ATG codon that yields the 67 kDa protein form is conserved in both genes and is boxed.

Figure 3
Alignment of the deduced amino acid sequences derived from mouse and human exon 3b (paired rows; species labels not preserved).
M P V D D Y W L C L P A S C A R P F V Q T V R V
M P V D D C W L Y F P A S R G R T F V Q T V W V

V Q S C P H C C W F P G V L P S V P E P L R M P
A P T C P N C C W F P G F L P P V P R P P H V P

A M L P T G S H S A V L P P S H C S T A P P S T
R V L L R G P R G A V L P A S R P S K T L P S S

S Q E P S S S A D P K L C L S P P T S D S R Q E
S Q T P C P T - D P C I C P P P S T P D S R Q E

R N V Q F G L
K N T Q S E L

BMC Genomics 2001, 2:10 http://www.biomedcentral.com/1471-2164/2/10

A promoter region for the mouse Txnrd1 gene has yet to be elucidated. However in this study a genomic sequence of mouse DNA (NCBI Trace Archive 19029770) that extends 450 bp upstream from exon 1 has been examined for possible control elements. This sequence, when aligned with the human TXNRD1 core promoter region [18] produced a nucleotide match of approximately 70% (Figure 4). This potential mouse promoter region, like the corresponding human region, lacks a classical TATA box. Unlike the human region, it also lacks consensus Sp1, Sp3 and Oct 1 binding sites at equivalent positions. However it does contain potential AP-1 binding sites, one of which is conserved in the human promoter. There is also a potential CAAT box present at position 325 in both the mouse and human sequences (Figure 4). Promoter studies with the human promoter suggest this CAAT box is not required as part of the core promoter [18].
The mouse Txnrd1 mRNA isoform III does not contain exon 1; instead exon 3a is the most 5' exon. This implies that a potential promoter region may be present just upstream of exon 3a. This isoform encodes a 67 kDa protein that is expressed at lower levels [17] than the 54.5 kDa mouse TR1 protein first identified [15]. The difference in expression levels of these two forms of mouse Txnrd1 may reflect the use of alternate promoter regions. In the human gene exons 1 and 4 span approximately 28 kb, and similarly an alternative promoter may exist in this region to direct transcription of mRNA isoforms that lack exon 1.
In addition to the transcriptional regulation of the thioredoxin reductase 1 genes previously discussed, it is also important to consider post-transcriptional control. Recent studies [23,24] have investigated AU rich instability elements present in the 3' UT region of the human TXNRD1 gene that facilitate mRNA degradation, leading to rapid mRNA turnover. There are four AU elements present in the human 3' UT region and three of these elements are also present in the 3' UT region of the mouse Txnrd1 gene. Also, a non-AU rich instability element present in the human TXNRD1 3' UT region [24] is 83% conserved in the mouse Txnrd1 3' UT region [15] (Figure 5). Thus, control of thioredoxin reductase 1 mRNA levels from both the human and mouse genomes appears to be directed from elements present in both the 5' and 3' UT regions.
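A simple way to look for candidate AU rich elements in a 3' UT sequence is to scan for the AUUUA pentamer, the classic core motif of such elements. The sketch below assumes that motif definition; the source does not specify how the cited studies [23,24] defined the elements, and the function name and toy sequence are hypothetical.

```python
import re


def find_au_pentamers(utr: str) -> list[int]:
    """0-based start positions of AUUUA pentamers, overlaps included.

    Accepts either RNA (U) or DNA (T) input by converting T to U first.
    """
    rna = utr.upper().replace("T", "U")
    # A zero-width lookahead lets overlapping motifs (AUUUAUUUA) both match.
    return [m.start() for m in re.finditer(r"(?=AUUUA)", rna)]


# Toy 3' UT fragment (hypothetical, not from TXNRD1 or Txnrd1).
print(find_au_pentamers("GGATTTATTTACC"))
```

Comparing the positions returned for aligned human and mouse 3' UT sequences would indicate which candidate elements are shared between the two species.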

Conclusions
In conclusion, we report the genomic organisation of the mouse and human thioredoxin reductase 1 genes. These genes display a conserved genomic organisation, as the coding regions of both genes have an almost identical exon/intron structure. Comparison of mouse and human 5' sequences allows possible regulatory elements to be identified and, in this study, has also enabled alternative splicing events at the 5' end of each gene to be reviewed.

Materials and methods
All chemicals were purchased from Sigma (Castle Hill, NSW, Australia) unless otherwise indicated. The Txnrd1 cDNA fragment used in the probe was generated using reverse transcriptase PCR that utilised mRNA isolated from mouse liver tissue (Oligotex Direct mRNA Mini Kit, Qiagen, Victoria, Australia) as the template with oligonucleotides designed from the mouse Txnrd1 cDNA sequence [15]. These two oligonucleotides (forward primer 5' ACATCTACGCCATTGGTGAC and reverse primer 5' TGGGGCTTAACCTCAGCAGC (Geneworks, Adelaide, Australia)) amplified a region of mouse Txnrd1 cDNA approximately 520 bp in length.
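Basic properties of the two primers quoted above can be checked with a short Python sketch. The primer sequences are those given in the text; the helper names are hypothetical, and the melting temperature uses the Wallace rule of thumb (2 °C per A/T and 4 °C per G/C), an approximation intended for short oligonucleotides that the source itself does not mention.

```python
FORWARD = "ACATCTACGCCATTGGTGAC"
REVERSE = "TGGGGCTTAACCTCAGCAGC"

COMPLEMENT = {"A": "T", "C": "G", "G": "C", "T": "A"}


def reverse_complement(seq: str) -> str:
    """Reverse complement of a DNA sequence (A, C, G, T only)."""
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def gc_percent(seq: str) -> float:
    """Percentage of G and C bases in the sequence."""
    s = seq.upper()
    return 100 * (s.count("G") + s.count("C")) / len(s)


def wallace_tm(seq: str) -> int:
    """Wallace rule estimate: 2 * (A + T) + 4 * (G + C), in degrees C."""
    s = seq.upper()
    at = s.count("A") + s.count("T")
    gc = s.count("G") + s.count("C")
    return 2 * at + 4 * gc


for name, primer in [("forward", FORWARD), ("reverse", REVERSE)]:
    print(name, len(primer), gc_percent(primer), wallace_tm(primer))

# The reverse primer anneals to the sense strand at this sequence:
print(reverse_complement(REVERSE))
```

The reverse complement of the reverse primer is the sequence expected on the sense strand of the cDNA, which is the form to use when locating the 3' end of the amplified region in a cDNA sequence.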

• Mapping of mouse Txnrd1
One of the clones obtained from the mouse genomic library screen was extensively mapped using restriction enzyme digests and Southern blot analysis. The resulting sequences were aligned with the mouse Txnrd1 cDNA sequence [15] using MacVector™ software (Oxford Molecular Group) to reveal the intron/exon junctions for exons 5, 6, 9, 10, 11, 12, 13, 14 and 15. To complete the intron/exon map of the mouse Txnrd1 gene, the mouse Txnrd1 cDNA sequence was used to BLAST-search [25] the NCBI Trace Archive for mouse genomic DNA fragments containing exons 1, 2, 3, 4, 7, 8 and 16. Numerous 1 kb genomic sequences were obtained and aligned with the relevant exon sequences via MacVector™ software to reveal the intron/exon junctions.

Genomic organisation of the human TXNRD1 gene
The human TXNRD1 cDNA sequence [8] was used to BLAST-search the NCBI human genome database for the TXNRD1 genomic DNA sequence. The resulting genomic sequence was aligned with the human TXNRD1 cDNA sequence using MacVector™ software to reveal intron/exon junctions.

Alternative splicing patterns
For the mouse Txnrd1 gene, BLAST-searches [25] of the NCBI EST and Trace Archive databases were used to analyse the alternative splicing patterns.
For the human TXNRD1 gene, BLAST-searches of the NCBI EST and human genome databases were used to analyse the alternative splicing patterns.

Promoter Elements
The human TXNRD1 promoter sequence was used to BLAST-search the NCBI Trace Archive for the potential mouse Txnrd1 promoter region.

[Figure residue: nucleotide sequence alignment in paired rows; caption and sequence labels not preserved]
A G G T T A A G C C C C A G T G T G G A T G C T
A G G T T A A G C C C C A G T G T G G A T G C T

G T T G C C A A G A C T G C A A A C C A C T G G
G T T G C C A A G A C T A C A G A C C A T T G C

C T C G T T T C C G T G C C C A A A T C C A A G
C T T G C T T C C T T G C C C A C G C C C A - G

G C G A A G T T T T
G T G A A G T T C A