All jawed vertebrates have four T cell receptor (TCR) chains: alpha (TRA), beta (TRB), gamma (TRG) and delta (TRD). Marsupials appear unique in having an additional TCR: mu (TRM). The evolutionary origin of TRM and its relationship to the other TCR remain obscure, and are confounded by previous results that support TRM being a hybrid between a TCR and an immunoglobulin locus. The availability of the first marsupial genome sequence allows investigation of these evolutionary relationships.
The organization of the conventional TCR loci, encoding the TRA, TRB, TRG and TRD chains, in the opossum Monodelphis domestica is highly conserved with, and of similar complexity to, that of eutherians (placental mammals). There is a high degree of conserved synteny in the genomic regions encoding the conventional TCR across mammals and birds. In contrast, the chromosomal region containing TRM is not well conserved across mammals. None of the conventional TCR loci contain variable region gene segments with homology to those found in TRM; rather, TRM variable genes are most similar to those of immunoglobulin heavy chain genes.
Complete genomic analyses of the opossum TCR loci continue to support an origin of TRM as a hybrid between a TCR and an immunoglobulin locus. None of the conventional TCR loci contain evidence that such a recombination event occurred; rather, they demonstrate a high degree of stability across distantly related mammals. TRM, therefore, appears to be derived from receptor genes no longer extant in placental mammals. These analyses provide the first genomic-scale structural detail of marsupial TCR genes, a lineage of mammals used as models of early development and human disease.
The hallmarks of the vertebrate adaptive immune system are antigen specific receptors, the T cell receptors (TCR) and immunoglobulins (Ig) encoded by genes that undergo somatic DNA recombination to generate diverse binding specificities. The TCR are expressed by thymus-derived lymphocytes (T cells) that play a major role in regulation and effector functions of immune responses. Each T cell expresses a unique TCR that binds a specific antigen resulting in the activation of an immune response [1, 2]. TCR are heterodimers comprised of either alpha (TRA) and beta (TRB) or gamma (TRG) and delta (TRD) combinations, respectively. These two combinations define the two major lineages of T cells: αβT cells and γδT cells [3, 2]. αβT cells typically recognize peptide antigens presented on major histocompatibility complex (MHC) encoded molecules. In contrast γδT cells have been shown to be either MHC restricted or in some cases, similar to Ig, able to bind free antigen . Ig are expressed by antibody forming cells (B cells), which produce both a membrane bound form of Ig that comprises the B cell receptor (BCR) and a soluble form that is free antibody. Like TCR, Ig are made up of two different chain types, a heavy (IgH) and light (IgL) chain. TCR and Ig chains both contain variable domains that bind the antigen and membrane-proximal constant (C) domains. It is the variable domains that are encoded by gene segments that undergo somatic recombination to generate diversity in binding specificity. The gene segments encoding the variable domains of TRA, TRG and IgL chains are the variable (V) and joining (J) gene segments, while the variable domains of TRB, TRD and IgH chains are encoded by exons assembled from V, diversity (D) and J gene segments . Recombination of these gene segments takes place in the thymus for developing T cells and adult bone-marrow for developing B cells .
Of the immunoglobulin superfamily (IgSF) members, the TCR and Ig are each other's nearest relatives; however, there are dissimilarities in their genetic structure and evolutionary history [6, 7]. For example, all jawed vertebrates appear to contain the same four homologous TCR isotypes: TRA, TRB, TRG, and TRD. In contrast, there is variability in the number and class of Ig isotypes in different vertebrate lineages [9, 10]. In addition, the organization of TCR loci appears to be more conserved than that of Ig loci. For example, in cartilaginous fish most Ig loci are organized as multiple, unlinked clusters of [V-(D)-J-C], limiting the combinatorial usage of their gene segments, whereas in bony fish and tetrapods the majority of Ig loci are organized in the translocon style of Vn-(D)n-Jn-Cn. TCR loci tend to be organized in the translocon style in all lineages. These differences between TCR and Ig genes are likely the result of dissimilar selection forces on the two antigen receptor systems and have made the evolutionary relationship of the TCR and Ig chains to each other difficult to determine.
The relationship between Ig and TCR is further muddled by the recent discoveries in marsupials and sharks of TCR loci that appear to be hybrids between ancestral Ig and TCR loci [13, 14]. In marsupials, a mammalian lineage that diverged from eutherians (placental mammals) 186 to 193 million years ago (MYA), a fifth TCR chain, named TCR mu (TRM), has been identified [14, 15]. Unlike the conventional TCR, TRM has a tandem cluster organization and its origins appear to have involved a recombination between an ancestral TCR locus, most likely TRD, and IgH. TRM appears analogous to an unusual shark TCR called NAR-TCR, which utilizes the C regions of TRD and upstream Ig-like V regions. Both TRM and NAR-TCR are expressed in an atypical TCR isoform that contains double variable domains. Marsupial TRM and shark NAR-TCR, however, are not orthologous but rather the product of convergent evolution generating common features. Nonetheless, the presence of TCR with these features in both marsupials and cartilaginous fish makes it likely that analogous TCR will be found in other vertebrate lineages and illustrates a level of plasticity in TCR evolution heretofore unrealized.
The availability of the first completely sequenced marsupial genome provides the opportunity to investigate the evolutionary origins of TRM and its relationship to the conventional TCR in mammals . Towards this aim we have determined the complete genomic organization, content and evolution of the loci encoding both the conventional TCRs (TRA, TRB, TRG and TRD) and the recently discovered TRM locus in the opossum Monodelphis domestica. In addition, these analyses provide a level of detail for the TCR genes of a marsupial that has only been available for a limited number of eutherians such as human and mouse.
Results and Discussion
Previous physical mapping of the TCR loci in the opossum revealed that the TRA and TRD were co-localized on chromosome 1p . Analysis of the opossum whole genome sequence confirmed that the TRD genes are clustered within the TRA locus resembling the organization of TRA/D observed in eutherians and birds (Figure 1) [18–20]. In opossum, the TRA/D locus spans approximately 1.3 Mb, making it intermediate in size compared to that of human at 1 Mb and mouse at 1.65 Mb [21, 22]. To investigate the degree of genomic conservation in this region we identified the genes syntenic to opossum TRA/D locus and compared these to the available genomic data for mouse, human, cow and chicken using the current ENSEMBL databases for each species . At the 5' end of the opossum TRA/D locus are the methyl-transferase like 3 (METTL3), zinc finger protein (SALL2) and several olfactory receptor (OR) loci that have conserved synteny in human, mouse, and cow, but not chicken (Figure 1 and Additional file 1). However these genes are not immediately flanking the cow TRA/D locus. Also conserved at the 5' end of the TRA/D locus in opossum, human, and mouse are TRA V genes (TRAV) interspersed with the OR loci. In opossum and mouse there is only one TRAV segment interspersed with the OR, whereas in human there are two at this location (Figure 1) . These TRAV gene segments (opossum TRAV1, mouse TRAV1, and human TRAV1.1 and 1.2) appear orthologous in phylogenetic analyses forming their own distinct group (group B) in a tree of TRAV and TRDV sequences (Figure 2). The 3' end of the TRA/D locus appears to be the most conserved across species since many of the loci have conserved synteny in both mammals and birds. The opossum has two copies of the defender against cell death gene 1 (DAD1) also found at the 3' end of the human, mouse, cow and chicken TRA/D loci. 
In those eutherian mammals examined the position of the abhydrolase domain-containing protein 4 gene (ABHD4) is also conserved (Figure 1 and Additional file 1). The opossum TRDV6 gene segment further illustrates the conservation of the TRA/D locus across mammals. This gene segment is in an inverted orientation and located downstream of the TRD C (TRDC). A clear ortholog of TRDV6 is found in both human and mouse with the same location and reading orientation and these gene segments from all three species fall into the same phylogenetic clade (see below, Figure 2). These results all support the overall organization of the mammalian TRA/D loci and their flanking genomic regions being highly conserved over a span of at least 186 My and as long as 300 My in some cases [15, 24].
Overall the organization and complexity of gene segments within the opossum TRA/D locus is similar to that of human and mouse. There are 74 total V segments (TRAV plus TRDV) in opossum, a number intermediate to that of human and mouse (Table 1). In human and mouse the V gene segments are either TRAV or TRDV, in other words used in TRA or TRD chains respectively, or in some cases specific V segments have been found expressed in either TRA or TRD chains and have been designated TRA/DV. The category a particular V segment falls into is historically defined by a number of criteria. One criterion is nucleotide similarity to V segments defined already in other species; by this criterion there are 68 opossum TRAV and six TRDV segments when compared with human and mouse. These two groups, with a single exception, are spatially separated in the TRA/D locus with the TRAV segments at the 5' end and the TRDV at the 3' end of the locus. The exception is a single TRDV segment (TRDV1) interspersed with the TRAV segments (Figure 1). Of the 68 TRAV and 6 TRDV, 12 and 2 respectively appear to be pseudogenes due to absence of a complete open reading frame (ORF) (Figure 1, Table 1). This is a ratio of functional to non-functional gene segments comparable to human and mouse (Table 1).
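The segment counts in this paragraph can be cross-checked with a few lines of arithmetic. The sketch below uses only the numbers stated above; the dictionary layout and the helper name `functional_fraction` are illustrative choices and not part of the original analysis.

```python
# Opossum TRA/D V segment counts as stated in the text.
# The data layout and the function name are illustrative assumptions.
counts = {
    "TRAV": {"total": 68, "pseudogenes": 12},
    "TRDV": {"total": 6, "pseudogenes": 2},
}


def functional_fraction(total, pseudogenes):
    """Return the fraction of segments that are not pseudogenes."""
    return (total - pseudogenes) / total


total_v = sum(c["total"] for c in counts.values())
total_pseudo = sum(c["pseudogenes"] for c in counts.values())

print(f"total V segments: {total_v}")
print(f"functional V segments: {total_v - total_pseudo}")
for name, c in counts.items():
    functional = c["total"] - c["pseudogenes"]
    frac = functional_fraction(c["total"], c["pseudogenes"])
    print(f"{name}: {functional} functional ({frac:.1%})")
```

The tally reproduces the 74 total V segments reported here and leaves four TRDV segments with a complete ORF.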
To determine which of the 74 V gene segments are also used as TRA/DV we performed RT-PCR on 23 day old thymus RNA using combinations of primer pairs specific for each of the V segment subgroups (see below) paired with either TRA or TRD C regions (TRAC and TRDC respectively) (not shown). This age was chosen as an early age where the thymus is fully mature . Nineteen of the TRAV segments and all four of the functional TRDV segments were found expressed with both TRAC and TRDC resulting in a total of 23 apparent TRA/DV segments (Figure 1 and Table 1). The number of opossum TRA/DV may be an underestimate since it is possible some TRAV may be rarely expressed with TRDC or may appear at different times during development than was examined. Alternatively this number could be an overestimate as well, since it is possible that some combinations are negatively selected in the thymus and not used in the periphery. Either way this appears to be a comparatively high number of TRA/DV segments in the opossum relative to human and mouse (Table 1).
V gene segments evolve by gene duplication and deletion, resulting in degrees of relatedness amongst segments. Related segments are placed into subgroups, with segments belonging to the same subgroup when they share 80% or greater nucleotide identity. By this criterion the current 68 TRAV segments can be placed into 41 subgroups, where the nucleotide identity between subgroups ranges from 32.1 to 79.5%. The six TRDV gene segments were sufficiently different, with nucleotide identity ranging from 33.6 to 61.9%, that each belonged to its own distinct subgroup. Phylogenetic analyses using TRAV and TRDV segments were performed to elucidate their evolutionary relatedness. Opossum sequences were compared with sequences from human, mouse, rabbit, cow, sheep, and chicken using the same dataset as used by Su et al. to define the major phylogenetic groups (Figure 2). Eight groups of V gene segments (designated A through H) with bootstrap values greater than 89% emerged from the inclusion of the marsupial sequences (Figure 2). All eight contain opossum sequences (Figure 2). Four of these, groups A through D, were defined previously and only included TRAV or TRAV/D. However, the addition of opossum sequences revealed four V groups (E through H) that were either not previously recognized or required reevaluation. Group E contains TRAV, TRDV and TRA/DV sequences; groups F and G contain only TRDV sequences; and group H contains TRAV and TRA/DV sequences. The addition of the opossum sequences to this analysis substantiates the statement that all these subgroups were present in the common ancestor of amniotes and that some species have lost segments that belong to different subgroups.
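The 80% nucleotide identity rule for subgroup membership can be illustrated with a small sketch. The text does not state how sequences were aligned or how borderline cases were resolved, so the code below makes two assumptions: sequences are already aligned to equal length, and grouping is single-linkage (a segment joins a subgroup if it reaches the threshold with at least one member). The sequences and the names `percent_identity` and `assign_subgroups` are hypothetical.

```python
from itertools import combinations


def percent_identity(a, b):
    """Percent identity of two pre-aligned, equal-length sequences.

    Columns where either sequence has a gap ('-') are skipped.
    """
    if len(a) != len(b):
        raise ValueError("sequences must be aligned to the same length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


def assign_subgroups(aligned, threshold=80.0):
    """Single-linkage grouping (an assumption): two segments end up in the
    same subgroup if a chain of pairs at or above the threshold links them."""
    parent = {name: name for name in aligned}

    def find(n):
        # Follow parent links to the group representative, compressing the path.
        while parent[n] != n:
            parent[n] = parent[parent[n]]
            n = parent[n]
        return n

    for a, b in combinations(aligned, 2):
        if percent_identity(aligned[a], aligned[b]) >= threshold:
            parent[find(a)] = find(b)

    groups = {}
    for name in aligned:
        groups.setdefault(find(name), []).append(name)
    return sorted(sorted(g) for g in groups.values())


# Hypothetical toy alignment, not real opossum sequences.
toy = {
    "segA": "ACGTACGTAC",
    "segB": "ACGTACGTAA",  # 90% identical to segA
    "segC": "TTGTACCTGA",  # below 80% identity to both segA and segB
}
print(assign_subgroups(toy))
```

With real data the pairwise identities would come from an alignment program, and a different linkage rule could give slightly different subgroup boundaries.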
These analyses also allow us to evaluate the evolution of mammalian V segments that can be utilized with either TRA or TRD chains over a larger time-span than has been available previously. TRA/DV sequences clearly belong to different groups rather than forming a monophyletic cluster (Figure 2) and, as observed for human and mouse, TRA/DV regions are dispersed throughout the opossum TRA/D locus, although most are located towards the 3' end of the locus (Figure 1). Considering that αβ and γδT cells recognize potentially very different antigens these results continue to support the high level of plasticity in V gene utilization at this locus.
The number and complexity of D and J gene segments in the opossum TRA/D locus is also comparable to that of human and mouse (Figure 1, Table 1). Encoding TRD chains are at least two D and six J gene segments (TRDD and TRDJ respectively), all of which are located upstream of a single TRDC. There are at least 53 TRAJ segments located between the TRDC and TRAC all of which appear to be functional by the criteria defined above for the V gene segments (Figure 1, Table 1).
The opossum TRAC and TRDC regions are encoded by three exons (Figure 3A, 3B) that, as reported previously, encode residues conserved in other species [28, 29]. For both TRAC and TRDC, exon 1 encodes an IgSF domain, which contains two cysteine residues that form the intra-chain disulfide bond. Exon 2 encodes the connecting peptide (Cp) containing the cysteine residue involved in the inter-chain disulfide bond. Exon 3 encodes the transmembrane (Tm) and a short cytoplasmic (Ct) region (Figure 3A, 3B). In the TM region of both TRA and TRD there are two hydrophilic residues (lysine and arginine) that are conserved in other species and that are important for interaction with other dimers from the TCR complex . There are five potential N-glycosylation sites in the opossum TRAC and two in the TRDC [28, 29].
TRB has been physically mapped to chromosome 8q in the opossum. The opossum TRB locus spans 400 kb, making it smaller than its human and mouse homologues, which are about 650 kb each (Figure 4) [31, 22]. Genes syntenic to the opossum TRB locus are also conserved across human, mouse, cow and chicken. These include the trypsinogen genes (TRY) found at the 5' and 3' ends of the TRB locus and intermixed between the TRBV and TRBC gene segments (Figure 4). The mono-oxygenase DBH-like 2 (DBHL) is found at the 5' end of the locus in the opossum, similar to human, mouse, and cow, but not in chicken. Genes such as kell blood group glycoprotein (Kel) and ephrin type-b receptor 6 precursor (EPHB6) found at the 3' end of the opossum TRB locus have conserved synteny in mammals and chicken (Figure 4). As with TRA/D, the TRB locus organization is highly conserved between opossum and eutherians. This is further illustrated by a TRBV segment (TRBV28 in opossum) located at the 3' end of the locus that is in the reverse orientation relative to the other gene segments. A clear orthologue of this gene segment is present in human (TRBV30) and mouse (TRBV31), making this an ancient arrangement (Figure 4 and group F in Figure 5).
There are 36 opossum TRBV segments (Figure 4, Table 1) and these can be grouped in 28 subgroups based on nucleotide identity with the value between subgroups ranging from 38.5 to 68.5%. Compared with other TCR, the TRB locus in eutherians appears to contain a higher number of V pseudogenes, where 19% and 34% of TRBV are pseudogenes in human and mouse, respectively. This pattern appears to hold for the opossum since nine of the 36 TRBV segments (25%) appear to be pseudogenes (Table 1).
Phylogenetic analyses of the TRBV regions from different mammals and chicken reveal that the opossum TRBV regions are very diverse, with six groups (A to F) identified (Figure 5). Human, mouse and opossum TRBV sequences are found in all six groups, supporting their presence before the divergence of marsupial and eutherian mammals (Figure 5). Three sequences (chickenB1S1, mouseB2 and opossumB20) did not cluster with any other sequence, nor with each other, and were therefore not included within any of the groups.
As in human and mouse, the opossum TRB D, J, and C genes are organized in tandem cassettes. The human and mouse TRB locus contains two of such D-J-C cassettes while the opossum has four (Figures 3C–3F and 4). In the opossum, each cassette contains a single TRBD, four or five TRBJ, and a single TRBC. All four cassettes appear to be functional and TRBJ gene segments from each have been found in TRB transcripts (data not shown).
The four opossum TRBC regions are very similar to each other at the nucleotide level and in intron-exon organization. Each TRBC region is encoded by four exons (Figure 3C–F). Exon 1 encodes the immunoglobulin domain, which contains two conserved cysteine residues important for intra-chain disulfide bond formation. There are three potential N-glycosylation sites in exon 1. Exon 1 of TRBC1, TRBC2 and TRBC3 is 100% identical at the nucleotide level, and TRBC4 differs by only a single non-synonymous nucleotide substitution (K10E) that encodes a lysine (AAA) instead of the glutamate (GAA). Exon 2 encodes the Cp and contains a conserved cysteine residue used for the inter-chain disulfide bond (Figure 3C–F). The Cp regions of the four opossum TRBC share more than 82.6% nucleotide identity. Exon 3 encodes the Tm region and contains a lysine residue involved in the interaction with the CD3 complex. The Tm regions are highly conserved among the four opossum TRBC, sharing greater than 85.1% nucleotide identity. Exon 4 encodes the cytoplasmic region and also includes the 3' untranslated region.
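The single-codon difference reported for exon 1 of TRBC4 can be checked against the standard genetic code. The sketch below builds the standard codon table (NCBI translation table 1) and compares the two codons named in the text; the helper name `compare_codons` is an illustrative choice, and the code makes no claim about which codon is ancestral.

```python
# Standard genetic code (NCBI translation table 1), bases ordered T, C, A, G.
BASES = "TCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    b1 + b2 + b3: AMINO[16 * i + 4 * j + k]
    for i, b1 in enumerate(BASES)
    for j, b2 in enumerate(BASES)
    for k, b3 in enumerate(BASES)
}


def compare_codons(c1, c2):
    """Return (number of differing bases, amino acid of c1, amino acid of c2)."""
    diffs = sum(x != y for x, y in zip(c1, c2))
    return diffs, CODON_TABLE[c1], CODON_TABLE[c2]


diffs, aa1, aa2 = compare_codons("AAA", "GAA")
kind = "synonymous" if aa1 == aa2 else "non-synonymous"
print(f"AAA encodes {aa1}, GAA encodes {aa2}: {diffs} base difference, {kind}")
```

AAA translates to lysine (K) and GAA to glutamate (E), so a single first-position change between them alters the encoded residue.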
Due to the high degree of sequence similarity among the four cassettes described above it is difficult to fully reconstruct the duplication events that led to the current arrangement in the opossum. However, cassettes 2 and 3 share two characteristics indicating they are derived from a relatively recent tandem duplication. First of all, both cassettes are nearly identical in nucleotide sequence over a 12.8 kb region that extends from 4.8 kb upstream of a non-functional copy of cyclin A2 including the gene segments TRBD, TRBJ to 100 bp downstream of TRBC (Additional file 2). Secondly both cassettes 2 and 3 have a cyclin A2 gene 5' of the D-J-C gene segments, which is not present in the other two cassettes (Figure 4). Cyclin A2 is also associated with the cow TRB locus but is not in human, mouse or chicken. This is consistent with the cyclin A2 gene being inserted near the TRB D-J-C cassettes prior to the divergence of marsupials and eutherians, with subsequent loss in some eutherian species such as human and mouse.
TRG has been physically mapped to chromosome 6q in the opossum. As in human and mouse, the TRG locus is the smallest and least complex of the three conventional TCR loci. From the most 5' V to the 3' untranslated region (UTR) of the single C region, the opossum TRG locus spans only approximately 90 kb (Figure 6), smaller than that in human (150 kb) and mouse (205 kb) [32, 33]. The opossum TRG locus has a translocon organization, which is different from that present in human and mouse (Figure 6). The human TRGV segments are upstream of two TRGJ-TRGC cassettes, while in mouse there are four cassettes that contain TRGV, TRGJ and TRGC gene segments [34, 33]. Even though the organization of V, J and C gene segments appears different between human, mouse and opossum, the genes flanking this locus are conserved among these species (Figure 6). Amphiphysin (AMPH) and the related to steroidogenic acute regulatory protein D3-N-terminal like (STARD3NL) are found at the 5' and 3' end of the TRG locus respectively (Figure 6). Both AMPH and STARD3NL are also associated with the TRG locus in cow and chicken, although their locations appear changed when searching in their current assembled genomes. In cows and sheep there are two TRG loci, TRG1 and TRG2 [36, 37]. To determine whether the opossum also has more than one TRG locus, we searched the entire MonDom5.0 assembly with BLASTN using the TRG cDNA and genomic sequences. Only a single TRG locus was identified, and this corresponds to the locus we mapped previously to chromosome 6q (Figure 6).
There are nine TRGV gene segments present in the opossum and these are divided into four subgroups based on nucleotide identity (Figure 7). All TRGV segments appear to be functional, and have been found expressed in the thymus (not shown). The number of TRGV segments is similar to that in mouse where there are seven V segments, all of which are functional. In human there are fourteen V segments, but only six are functional (Table 1).
Previously, phylogenetic analyses of the mammalian and avian TRGV segments revealed the presence of eight ancient groups . Addition of the opossum TRGV sequences to these analyses revealed two additional groups, group I and J. Group I contains human, mouse and opossum TRGV4. Support for this group is low (Figure 7), however it is likely that these three sequences from opossum, human and mouse are derived from the same ancestral gene since they also have similar RSS sequences (data not shown). The new group J contains only the members of the opossum TRGV1 subgroup (Figure 7), which is related to the previously defined group F that contains sequences from human, rabbit, sheep and cow . This relationship is not well supported however, and two groups may not have arisen from a common ancestral gene segment (Figure 7). Opossum TRGV3 segments group in a previously defined group A. The single member of opossum TRGV2 subgroup does not clearly cluster with any existing group (Figure 7).
There is only a single opossum TRGC region (Figure 3G and 6) in contrast to four in mouse (three of which are functional) and two in human (both functional) (Table 1). The opossum TRGC is encoded by three exons. Exon 1 encodes the immunoglobulin domain and contains a single N-glycosylation site. An unusual characteristic described previously in marsupials was the absence of the second cysteine residue required for the formation of intrachain disulfide bond in the TRGC region . Exon 2 encodes the Cp and exon 3 encodes the Tm, Ct and 3'UTR regions (Figure 3G).
The TRM locus has been described so far only in marsupials and is located on chromosome 3q in the opossum . Homologs to TRM have yet to be found in any eutherian mammal examined so far (results not shown). Previous analyses of the TRM locus were consistent with TRM being a hybrid locus generated by recombination between ancestral Ig and TCR genes . To examine this hypothesis further we analyzed the genes flanking the TRM locus to look for any evidence of this recombination. On the immediate centromeric, 5' side of TRM are three zinc finger protein genes of the C2H2 type (ZNF3) (Figure 8 and Additional file 1). Unfortunately these share similarity to human and mouse ZNF3 genes on several chromosomes making orthology difficult to establish (data not shown). On the telomeric, 3' side of TRM are genes encoding Speckle type POZ-like protein (PCIF1-like) and Myelin Oligodendrocyte Glycoprotein (MOG). In both cases PCIF1 and MOG have paralogous copies in the opossum genome and the paralogue syntenic to TRM is the least similar to the eutherian homologue. None of the genes flanking the opossum TRM locus have conserved synteny in human and mouse making it difficult to identify a region of the eutherian genome that is homologous to the region of the opossum genome containing TRM. In other words, and in contrast to the conventional TCR, the chromosomal region containing TRM is not well conserved in mammals.
Previously we reported that the TRM V gene segments appeared to be more similar to Ig V gene segments than that of TCR . This conclusion was drawn from an analysis of a limited number of available marsupial TCR V gene segments at that time. Availability of the complete TCR genomic sequences described above allows us to further test this observation. TRM is organized in tandem clusters where complete clusters contain two classes of V segments: a single non-rearranged V gene segment (TRMV) and an unusual V gene segment which is already joined to D and J genes (TRMVj) in the germline DNA . The opossum has six such complete clusters. The remaining two clusters are partial, lacking the TRMV and TRMD gene segments (Figure 8) .
To investigate the evolutionary history of the individual TRM clusters we compared the sequence and organization of the six complete clusters (clusters 1, 2, 3, 4, 5, and 7) and two partial clusters (clusters 6 and 8). These analyses revealed three classes of clusters based on gene content and nucleotide sequence identity. These three classes likely represent the most ancient duplications of TRM clusters. The lineages generated by these older duplications are represented by clusters 1, 2 and 3, respectively. Clusters 5 and 7 have similar gene content (three TRMD segments), share greatest nucleotide sequence identity with cluster 3, and represent more recent whole cluster duplications that would have followed the duplication of an additional D segment in this lineage. Clusters 4, 6, and 8 contain two TRMD segments and share greatest nucleotide sequence identity with cluster 2, representing another round of more recent duplications. Cluster 1 is a third class unto itself because it does not share significant sequence similarity with the others. These results are consistent with the current complement of TRM clusters in opossum being the result of whole cluster duplications followed by divergence of each cluster. These duplication events have in some cases produced partial clusters, and have produced different numbers of clusters in different marsupial species; bandicoots, for example, appear to have only two TRM clusters [14, 29].
Opossum TRMV and TRMVj each form distinct clades in a phylogenetic analysis and share only 38 to 43.5% nucleotide identity to each other and are, therefore, from distinct subgroups (Figure 9A) . Now having the complete genomic sequence from the opossum, we wished to compare the TRMV genes with V genes from the conventional TCR and Ig loci. When TRMV and TRMVj are compared to all conventional opossum TCR V gene segments the similarity also remains low; nucleotide identity between TRM V genes and TRA, TRB, TRG, and TRD V genes ranges from 27 to 43%. In other words, the TRA/D, TRB, and TRG loci do not contain V segments from which either TRMV or TRMVj appear to have been derived. The greatest similarity for TRM V genes remains with the IgH V gene segments (VH) with nucleotide identity ranging from 34 to 51%. However, the clades of TRMV and TRMVj remain outside those containing VH genes from a variety of species (Figure 9A). In addition we compared the TRM V genes to all the germline VH in opossum and four other marsupial species (Tammar wallaby, Virginia opossum, Brush-tail possum, and Northern Brown bandicoot) and TRMV and IGHV continued to form distinct clades (Figure 9B). These results are consistent with a conclusion that, indeed, TRMV are most related to IGHV, but not sufficiently similar to extant marsupial IGHV to determine from which they might have been derived.
In addition to the six functional TRMV gene segments located within the TRM locus, there is a single TRMV orphon gene (TRMV-OR2) located on opossum chromosome 2 in a region containing a number of flanking sequences resembling long interspersed repeat elements (LINE) (data not shown). TRM is the only one of the TCR loci in the opossum for which orphon V gene segments have been found. TRMV-OR2 appears to be non-functional and has no leader peptide, but it does contain a complete V gene segment and the recombination signal sequence (RSS). TRMV-OR2 is most similar to TRMV2 and TRMV4, sharing 96 and 95% nucleotide identity, respectively. The high degree of identity between these functional gene segments and the orphon is consistent with the latter being the result of a relatively recent duplication event that, based on the flanking LINE elements, may have been due to transposition. TRMV-OR2 provides additional evidence for the role of retroelements in gene translocation in marsupials. A similar translocation was reported previously for opossum MHC class I genes, where two class I loci (UB and UC) had been translocated outside the MHC region. UB and UC are similarly tightly flanked by retroelements [40, 41]. Furthermore, TRMV-OR2 may also provide an additional connection between retroelements and the evolution of TCR genes themselves in marsupials. TRMVj is a variable region gene segment that appears to have been generated by retrotransposition, since it lacks an intron that all V gene segments have and is already joined to the D and J gene segments in the germline. Although highly speculative, it is possible that translocated orphons such as TRMV-OR2 may have contributed to the activation of local retroelement genes to allow their co-expression when the TRM locus is actively transcribed. This would have been a necessary step in the retrotransposition events that generated TRMVj.
Is TRM present in lineages other than marsupials?
The availability of a large number of vertebrate genome sequences, with varying degrees of depth of sequence coverage, provided the opportunity to search for TRM or TRM-like genes in species other than marsupials. A search of the available genomes using the BLAST algorithm and opossum TRM sequences was unable to identify a homologue in any of the eutherian species available. This search included human, mouse, rabbit, dog, cat, cow, horse, hedgehog, elephant and armadillo. The rabbit, cat, hedgehog, elephant and armadillo are low coverage genome sequences ranging from 1.86× to 2×. While it is possible that in any given species the TRM locus was missed in the sequencing, it is unlikely that such a random gap would have been consistently present in all the eutherian genomes. Therefore we conclude that TRM is not likely present in any eutherian lineage. We also searched the current chicken and the anole lizard (Anolis carolinensis) genomes in a similar manner and were unable to detect clear TRM homologues. In the case of chicken and all the eutherian genomes, the TRD homologues were identifiable indicating that our search strategies are able to pick up this conventional TCR locus. Furthermore, we were able to identify a clear TRM homologue in the recently completed platypus genome sequence (Ornithorhynchus_anatinus-5.0) available at GenBank. Further characterization of the platypus TRM locus is ongoing and beyond the scope of this paper. Nonetheless identification of these genes in a monotreme, which are separated from marsupials and eutherians by 217 to 231 MY , further supports that our search strategies should be able to identify TRM homologues in other species. These results are also consistent with TRM being present early in the evolution of mammals and therefore likely lost in the eutherian lineage.
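The argument that a random sequencing gap is unlikely to recur across all low-coverage genomes can be made roughly quantitative. The sketch below is an illustration and not the authors' analysis: it assumes a Poisson (Lander-Waterman) model of random shotgun sequencing, in which the expected fraction of bases with no read at coverage c is e^(-c), and it assumes the five genomes are independent. Per-species coverage values are not given in the text, so both quoted bounds (1.86× and 2×) are applied to all five genomes. The function name `uncovered_fraction` is hypothetical.

```python
import math


def uncovered_fraction(coverage):
    """Expected fraction of bases with zero reads under a Poisson
    (Lander-Waterman) model of random shotgun sequencing."""
    return math.exp(-coverage)


# Five low-coverage genomes are named in the text (rabbit, cat, hedgehog,
# elephant, armadillo). Applying one coverage value to all five is an
# assumption made only for illustration.
n_genomes = 5
for cov in (1.86, 2.0):
    single = uncovered_fraction(cov)
    all_missed = single ** n_genomes
    print(f"{cov}x: one genome {single:.3f}, all {n_genomes} genomes {all_missed:.1e}")
```

These are per-base figures; the chance of missing every base of a multi-gene locus in every genome would be smaller still under the same assumptions, although real assemblies can also have non-random gaps that this model ignores.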
First and foremost, we have described in detail the genomic content and complexity of the T cell receptor loci for the opossum Monodelphis domestica, the first such analysis available for a marsupial. The opossum is arguably the most extensively studied marsupial species and is used as a model of human disease and development. The opossum, for example, is one of the few mammalian model organisms that develop melanoma following exposure to ultraviolet radiation, providing a cancer model. Additionally, opossums are a natural host and a reservoir of the causative agent of Chagas disease, Trypanosoma cruzi. Like all marsupials, opossums give birth to highly altricial young, also providing a model for the early development of the immune system and other anatomical systems. Further characterization of the immune system in the opossum, and T cell immunobiology in particular, is important for better understanding of these disease and developmental models. Complete characterization of the TCR genomics in this species is one step in that direction.
In spite of detailed analyses of the opossum conventional TCR loci, the origins of TRM remain enigmatic. The current evidence supports the following conclusions and model for the origin of TRM: 1) There likely was a recombination or insertion event between an IgH and a TCR locus (Figure 10); 2) The TCR locus involved was most likely TRD or a TRD-like locus, based on sequence similarity. Unfortunately the highly conserved and stable organization of the TRA/D region across birds and mammals does not provide clues as to how TRD might have participated in the origins of TRM; 3) The IgH-TCR hybrid that formed likely underwent a whole or partial duplication event, giving rise to multiple sets of V, D, and J elements, one of which remained unrearranged in the germline while the other became germline joined, either through direct RAG-mediated V(D)J recombination in the germline (left-hand path in Figure 10) or through retrotransposition (right-hand path in Figure 10). Ongoing analyses of the TRM locus in the platypus may yield further insights into these possible scenarios for the origins of this unusual TCR chain.
The opossum whole genome assembly MonDom5 was used in this study and is available at GenBank under the accession number AAFR00000000. The location of all TCR gene segments in MonDom5 is provided in Additional file 3. For comparative purposes the current genome assemblies from human (NCBI 36), mouse (NCBI m37), cow (Btau_3.1), chicken (WASHUC2), rabbit (RABBIT), dog (CanFam2.0), cat (CAT), horse (preEnsembl EquCab2), hedgehog (eriEur1), elephant (BROAD E1), armadillo (ARMA) and the anole lizard (AnoCar1.0) were searched for any evidence of TRM. Genomes were analyzed using the BLAST tools for assembled genomes [23, 35].
Opossum lymphoid tissues were collected and either processed immediately to extract RNA or stored in RNAlater® (Ambion, Austin, TX) at 4°C for 24 hours and then at -80°C for later use. Total RNA extraction was performed using the Trizol RNA extraction protocol (Invitrogen, Carlsbad, CA). Tissue was homogenized in 1 ml of Trizol® Reagent per 100 mg of tissue until completely dispersed. Phase separation was done using 200 μl of chloroform per 1 ml of Trizol. RNA was precipitated with 500 μl of isopropanol per 1 ml of Trizol, washed with 70% ethanol and resuspended in 50 to 100 μl of DEPC water. DNase treatment to remove contaminating DNA was performed using the TURBO DNA-free kit (Ambion, Austin, TX). Each sample was quantified using the NanoDrop ND-1000 Spectrophotometer (NanoDrop Technologies, Wilmington, DE).
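The reagent volumes in this protocol scale with tissue mass. As a small worked example using only the ratios stated above (1 ml Trizol per 100 mg tissue, 200 μl chloroform and 500 μl isopropanol per 1 ml Trizol), the helper below computes the volumes for a given sample. The function name and the 250 mg sample are hypothetical and serve only to illustrate the arithmetic.

```python
def trizol_volumes(tissue_mg: float) -> dict:
    """Scale reagent volumes from tissue mass using the ratios given in the text."""
    if tissue_mg <= 0:
        raise ValueError("tissue mass must be positive")
    trizol_ml = tissue_mg / 100.0  # 1 ml Trizol per 100 mg tissue
    return {
        "trizol_ml": trizol_ml,
        "chloroform_ul": 200.0 * trizol_ml,   # 200 ul per 1 ml Trizol
        "isopropanol_ul": 500.0 * trizol_ml,  # 500 ul per 1 ml Trizol
    }


# Hypothetical 250 mg tissue sample.
print(trizol_volumes(250))
```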
Reverse transcription, PCR and sequencing
Reverse transcription-polymerase chain reactions (RT-PCR) were performed using the GeneAmp RNA PCR Core Kit (Applied Biosystems, Foster City, CA). Amplifications of cDNAs were performed using Advantage™-HF 2 PCR (BD Biosciences, CLONTECH Laboratories, Palo Alto, CA) under the following conditions: an initial step at 94°C for 1 minute, denaturation at 94°C for 30 seconds, annealing/extension at a temperature chosen according to the melting temperature of the primers, and a final extension at 68°C for 5 minutes.
PCR products were cloned using the TOPO TA Cloning® Kit for sequencing (Invitrogen, Carlsbad, CA). Plasmids were sequenced using the BigDye Terminator Cycle Sequencing Kit v3 (Applied Biosystems, Foster City, CA) in 10 μl reactions and analyzed on an ABI Prism 3100 DNA automated sequencer (PerkinElmer Life And Analytical Sciences Inc, Wellesley, MA). Analyses of chromatograms were done using the Sequencher™ 4.6 program (Gene Codes Corporation, Ann Arbor, MI).
Identification of TRV, TRD, TRJ, TRC genes
To determine the location, content and organization of the TCR genes, the whole opossum genome was searched using the BLAST algorithm. TRA, TRB, TRG, TRD and TRM were located using sequences previously isolated [28, 29, 38, 14]. The V and J segments were located by similarity to corresponding segments from other species and by identifying the flanking conserved RSS.
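Locating V and J segments by their flanking RSS can be illustrated with a simple motif scan. The sketch below looks for the widely cited consensus heptamer (CACAGTG) and nonamer (ACAAAAACC) separated by a 12 or 23 bp spacer, tolerating a few mismatches. The mismatch thresholds, function names and toy sequence are assumptions made for illustration and do not describe the search actually used in this study. Natural RSS often deviate from the consensus, so hits from a scan like this are candidates that need confirmation against expressed sequences.

```python
HEPTAMER = "CACAGTG"      # consensus RSS heptamer
NONAMER = "ACAAAAACC"     # consensus RSS nonamer
SPACER_LENGTHS = (12, 23)  # canonical spacer lengths in bp


def count_mismatches(a: str, b: str) -> int:
    """Number of positions at which two equal-length strings differ."""
    return sum(x != y for x, y in zip(a, b))


def find_rss(seq: str, max_heptamer_mm: int = 1, max_nonamer_mm: int = 2):
    """Return (heptamer_start, spacer_length) for heptamer-spacer-nonamer
    candidates on the given strand (0-based coordinates)."""
    seq = seq.upper()
    hits = []
    for i in range(len(seq) - len(HEPTAMER) + 1):
        if count_mismatches(seq[i:i + 7], HEPTAMER) > max_heptamer_mm:
            continue
        for spacer in SPACER_LENGTHS:
            j = i + 7 + spacer
            candidate = seq[j:j + 9]
            if len(candidate) == 9 and count_mismatches(candidate, NONAMER) <= max_nonamer_mm:
                hits.append((i, spacer))
    return hits


# Made-up sequence containing one consensus RSS with a 12 bp spacer.
toy = "GGGTTT" + HEPTAMER + "TCTGATCGATCC" + NONAMER + "GG"
print(find_rss(toy))
```

The scan covers only the strand it is given in the heptamer-spacer-nonamer orientation found 3' of V segments. The nonamer-spacer-heptamer orientation found 5' of J segments, or segments in reverse orientation, can be handled by running the same scan on the reverse complement.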
Rapid amplification of 5' complementary DNA ends (5' RACE) performed on opossum thymus mRNA was used to identify novel, expressed V, D and J segments. In addition to 5' RACE, RT-PCR using primers specific for each V family was used to amplify cDNA containing VDJ recombinations that are underrepresented in the RACE PCR. Primers used to amplify the conventional TCR are complementary to the most conserved sequences of the V regions and were paired with primers located in the C regions (Additional file 4). Sequences obtained by these means were compared with the whole opossum genome using the BLAST algorithm to identify novel TCR gene segments. This approach allowed the identification of gene segments in six possible reading frames, which also helped to find gene segments located in the reverse orientation and D segments that may be used in multiple reading frames. The exon-intron organization of V regions was determined using sequences obtained by 5' RACE.
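Searching in six reading frames means translating a nucleotide sequence in three frames on each strand, which is why it also recovers segments in reverse orientation. The following is a minimal, self-contained sketch of six-frame translation using the standard genetic code. It illustrates the concept only; in the analyses above this step was handled by BLAST, and the function names and toy sequence are assumptions.

```python
from itertools import product

BASES = "TCAG"
# Standard genetic code, ordered with the third base varying fastest.
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq: str) -> str:
    return seq.upper().translate(COMPLEMENT)[::-1]


def translate(seq: str) -> str:
    """Translate complete codons; unknown codons (e.g. containing N) become X."""
    seq = seq.upper()
    return "".join(
        CODON_TABLE.get(seq[i:i + 3], "X") for i in range(0, len(seq) - 2, 3)
    )


def six_frames(seq: str) -> dict:
    """Translations keyed by strand and frame: +1, +2, +3, -1, -2, -3."""
    frames = {}
    for strand, strand_seq in (("+", seq.upper()), ("-", reverse_complement(seq))):
        for offset in range(3):
            frames[f"{strand}{offset + 1}"] = translate(strand_seq[offset:])
    return frames


# Made-up toy sequence.
for frame, protein in six_frames("ATGGCCTAA").items():
    print(frame, protein)
```

Short D segments in particular can be read in more than one frame after rearrangement, so comparing all three forward-frame translations against expressed sequences is informative for them.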
Determination of the exon-intron organization in the C regions was done using BLAST to compare available cDNA sequences that encode complete TCR chains to the opossum genomic sequences. Additionally, transcripts encoding the C-terminal end of each TCR were obtained by 3' RACE performed on thymus cDNA. Clones obtained by 3' RACE PCR were used to identify the CP, TM, CT and 3' untranslated regions (UTR) for each of the TCR. cDNA made using the Oligo dT primer was used to perform 3' RACE PCR. The Oligo dT primer supplies a priming site for the GeneRacer™ 3' PCR primers. Sequences obtained by 3' RACE were aligned with the germline sequences to determine the location, intron-exon boundaries and splice sites of these exons.
Opossum gene segments were named following the IMGT nomenclature established for human and mouse. TRV segments were numbered according to their location, from the 5' to 3' end of the locus. TRAJ segments were numbered from 3' to 5' according to a nomenclature proposed by Koop et al. and followed by IMGT. The TRBD, TRBJ and TRBC segments, found in four cassettes in the opossum, are numbered according to the cassette to which they belong and the position of the cassette from 5' to 3' in the locus.
Nucleotide sequences encoding FR1 through FR3 of the V regions identified from each of the opossum TCR loci were compared to TCR V sequences from other species retrieved from GenBank. The sequences were aligned using the ClustalX and BioEdit programs. Phylogenetic analyses were performed using MEGA version 3.0 for the distance methods (neighbor joining and minimum evolution). Confidence values were obtained from bootstrap analyses using 1000 replications.
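Distance methods such as neighbor joining start from a matrix of pairwise distances, and bootstrap confidence values come from rebuilding the analysis on alignments whose columns have been resampled with replacement. The sketch below shows only those two ingredients, using the simple p-distance (proportion of differing sites). It does not implement tree building, the choice of p-distance is an assumption rather than the distance model used in MEGA for this study, and the three toy sequences are made up.

```python
import random


def p_distance(a: str, b: str) -> float:
    """Proportion of differing sites, ignoring columns with a gap in either sequence."""
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        raise ValueError("no comparable sites")
    return sum(x != y for x, y in pairs) / len(pairs)


def bootstrap_alignments(alignment: dict, replicates: int = 1000, seed: int = 0):
    """Yield alignments built by resampling columns with replacement."""
    lengths = {len(seq) for seq in alignment.values()}
    if len(lengths) != 1:
        raise ValueError("all aligned sequences must have the same length")
    length = lengths.pop()
    rng = random.Random(seed)  # fixed seed so the sketch is reproducible
    for _ in range(replicates):
        columns = [rng.randrange(length) for _ in range(length)]
        yield {name: "".join(seq[c] for c in columns) for name, seq in alignment.items()}


# Made-up toy alignment: how often is seqA closer to seqB than to seqC?
toy = {"seqA": "ACGTACGTAC", "seqB": "ACGTACGTTC", "seqC": "TCGAACCTTG"}
replicates = 1000
support = sum(
    p_distance(rep["seqA"], rep["seqB"]) < p_distance(rep["seqA"], rep["seqC"])
    for rep in bootstrap_alignments(toy, replicates=replicates)
) / replicates
print(f"bootstrap support for (seqA, seqB): {support:.2f}")
```

In a full analysis, a tree would be inferred from each replicate and the support for a clade reported as the fraction of replicate trees that contain it.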
The accession numbers for sequences used in the phylogenetic analyses are:
Gray short-tailed opossum: IGHV1-1: AAC48826; IGHV1-6: AAC48820; IGHV1-11: AAC48816; IGHV1-16: AAC48836; IGHV1-23: AAC48846; IGHV2-1: AAC48849. Remaining germline sequences are available at AAFR00000000. Brush-tail possum: 70: AAL87470; 71: AAL87471; 72: AAL87472; 73: AAL87473; 74: AAL87474; 76: AAL87476; 77: AAL87477; 78: AAL87478; 79: AAL87479; 91: AAD41691; 92: AAD41692; 41: AAT40441; 40: AAT40440; 44: AAT40444; 43: AAT40443; 42: AAT40442. Virginia opossum: P83 is an unpublished sequence kindly provided by Dr. R. Riblet. Tammar Wallaby: sequences are unpublished but available upon request. Northern Brown Bandicoot: Bandicoot58: AY586158. Shark NTCR: AAY98815.
Dot plot analyses
Comparisons of the genomic sequence were performed using the program Spin from the Staden Package. Dot matrix plots were generated to determine degrees of similarity among cassettes for the TRB locus. The sequence analyzed includes the four TRB cassettes with coding and non-coding sequence from 3 kb upstream of the most 5' TRBD (TRBD1) segment to the most 3' TRBC (TRBC4) segment, which comprises 51 kb. Although not shown, similar comparisons were performed for the other TCR loci.
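A dot matrix plot marks every pair of positions at which two sequences (or a sequence and itself) are similar over a short window, so duplicated cassettes appear as diagonal lines offset from the main diagonal. The following is a minimal sketch of that idea with a fixed window and match threshold. It is not the algorithm implemented in Spin, the window and threshold values are arbitrary assumptions, and the naive double loop is far too slow for a 51 kb region.

```python
def dot_plot(seq_x: str, seq_y: str, window: int = 10, min_matches: int = 8):
    """Return (i, j) start positions where windows of seq_x and seq_y
    share at least min_matches identical bases."""
    seq_x, seq_y = seq_x.upper(), seq_y.upper()
    dots = []
    for i in range(len(seq_x) - window + 1):
        window_x = seq_x[i:i + window]
        for j in range(len(seq_y) - window + 1):
            matches = sum(a == b for a, b in zip(window_x, seq_y[j:j + window]))
            if matches >= min_matches:
                dots.append((i, j))
    return dots


# Made-up sequence with one unit duplicated in tandem after a short linker.
unit = "ACGTTGCAAGGCTA"
toy = unit + "TTTTT" + unit
dots = dot_plot(toy, toy, window=8, min_matches=8)
off_diagonal = [(i, j) for i, j in dots if i != j]
print(off_diagonal[:5])
```

When a sequence is compared with itself, the main diagonal is always present, and the off-diagonal runs are the informative signal of internal duplication.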
TCR: T cell receptor. TRA: T cell receptor alpha. TRB: T cell receptor beta. TRG: T cell receptor gamma. TRD: T cell receptor delta. TRM: T cell receptor mu. Ig: immunoglobulin. V: variable gene segment. D: Diversity gene segment. J: Joining gene segment. C: constant region. Tm: transmembrane region. Cp: connecting peptide. Ct: cytoplasmic region. RSS: recombination signal sequence. MHC: major histocompatibility complex.
Kronenberg M, Siu G, Hood LE, Shastri N: The molecular genetics of the T cell antigen receptor and T cell antigen recognition. Ann Rev Immunol. 1986, 4: 529-41. 10.1146/annurev.iy.04.040186.002525.
Marchalonis JJ, Bernstein RM, Shen SX, Schluter SF: Emergence of the immunoglobulin family: conservation in protein sequence and plasticity in gene organization. Glycobiology. 1996, 6: 657-663. 10.1093/glycob/6.7.657.
Criscitiello MF, Saltis M, Flajnik MF: An evolutionarily mobile antigen receptor variable region gene: Doubly rearranging NAR-TcR genes in sharks. Proc Natl Acad Sci USA. 2006, 103: 5036-5041. 10.1073/pnas.0507074103.
Rheede TV, Bastiaans T, Boone DN, Hedges SB, De-Jong WW, Madsen O: The Platypus Is in Its Place: Nuclear Genes and Indels Confirm the Sister Group Relation of Monotremes and Therians. Mol Biol Evol. 2006, 23: 587-597. 10.1093/molbev/msj064.
Deakin JE, Parra ZE, Graves JAM, Miller RD: Physical Mapping of T cell receptor loci (TRA, TRB, TRD and TRG) in the opossum (Monodelphis domestica). Cytogenet Genome Res. 2006, 112: 342K. 10.1159/000089901.
Satyanarayana K, Hata S, Devlin P, Roncarolo MG, De Vries JE, Spits H, Strominger JL, Krangel MS: Genomic organization of the human T-cell antigen-receptor α/δ locus. Proc Natl Acad Sci USA. 1988, 85: 8166-8170. 10.1073/pnas.85.21.8166.
Chien YH, Iwashima M, Kaplan KB, Elliot JF, Davis MM: A new T-cell receptor gene located within the alpha locus and expressed early in T-cell differentiation. Nature. 1987, 327: 677-682. 10.1038/327677a0.
Glusman G, Rowen L, Lee I, Boysen C, Roach JC, Smith AFA, Wang K, Koop BF, Hood L: Comparative genomics of the human and mouse T cell receptor loci. Immunity. 2001, 15: 337-349. 10.1016/S1074-7613(01)00200-X.
Baker ML, Rosenberg GH, Zuccolotto P, Harrison GA, Deane EM, Miller RD: Further characterization of T cell receptor chains of marsupials. Dev Comp Immunol. 2001, 25: 495-507. 10.1016/S0145-305X(01)00016-7.
Call ME, Wucherpfennig KW: The T cell receptor: critical role of the membrane environment in receptor assembly and function. Ann Rev Immunol. 2005, 23: 101-125. 10.1146/annurev.immunol.23.021704.115625.
Lefranc MP, Rabbitts TH: Two tandemly organized human genes encoding the T-cell gamma constant region sequences show multiple rearrangement in different T-cell types. Nature. 1985, 316: 464-466. 10.1038/316464a0.
Miccoli MC, Antonacci R, Vaccarelli G, Lanave C, Massari S, Cribiu EP, Ciccarese S: Evolution of TRG clusters in cattle and sheep genomes as drawn from the structural analysis of the ovine TRG2@ locus. J Mol Evol. 2003, 57: 52-62. 10.1007/s00239-002-2451-9.
Conrad ML, Mawer MA, Lefranc MP, McKinnell L, Whitehead J, Davis SK, Pettman R, Koop BF: The genomic sequence of the bovine T cell receptor gamma TRG loci and localization of the TRGC5 cassette. Vet Immunol Immunopathol. 2007, 115: 346-356. 10.1016/j.vetimm.2006.10.019.
Parra ZE, Arnold T, Nowak MA, Hellman L, Miller RD: TCR gamma chain diversity in the spleen of the duckbill platypus (Ornithorhynchus anatinus). Dev Comp Immunol. 2006, 30: 699-710. 10.1016/j.dci.2005.10.002.
Miska KB, Wright AM, Lundgren R, Sasaki-McClees R, Osterman AK, Gale JM, Miller RD: Analysis of a marsupial MHC region containing two recently duplicated class I loci. Mamm Genome. 2004, 15: 851-854. 10.1007/s00335-004-2224-4.
Teixeira AR, Monteiro PS, Rebelo JM, Arganaraz ER, Vieira D, Lauria-Pires L, Nascimento R, Vexenat CA, Silva AR, Ault SK, Costa JM: Emerging Chagas Disease: Trophic Network and Cycle of Transmission of Trypanosoma cruzi from Palm Trees in the Amazon. Emerg Infect Dis. 2001, 7: 100-112.
This work was supported by funding from the National Institutes of Health Initiative for Maximizing Student Diversity (ZEP, JT) and Institutional Development Award (IDeA) programs (MLB, JH), the Robert C. McNair program (AML) and the National Science Foundation (RDM).
Authors and Affiliations
Center for Evolutionary and Theoretical Immunology and Department of Biology, University of New Mexico, Albuquerque, NM, 87131, USA
Zuly E Parra, Michelle L Baker, Jennifer Hathaway, April M Lopez, Jonathan Trujillo, Alana Sharp & Robert D Miller
ZEP, MLB, and RDM conceived of the study, participated in its design, and drafted the manuscript. ZEP, JH, AML, JT, and AS generated the data and performed the sequencing reactions. ZEP performed the data analyses. All authors have read and approved the final manuscript.
Additional file 1: Opossum TCR syntenic genes and their corresponding location on chromosomes of several species. This table contains the opossum TCR syntenic genes and their corresponding location on chromosomes of several species. (PDF 12 KB)
Additional file 2: Dot plot analyses of opossum TRB cassettes. The dot-matrix analysis corresponds to the comparison of 51 Kb region containing the four TRB D-J-C cassettes aligned to itself. (PDF 14 KB)
Additional file 3: Location of opossum TRA/D, TRB, TRG and TRM gene segments. This file contains four tables (A-D) with the location of opossum TRA/TRD, TRB, TRG and TRM gene segments in the MonDom5 assembly. (PDF 54 KB)
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License, which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.
Parra, Z.E., Baker, M.L., Hathaway, J. et al. Comparative genomic analysis and evolution of the T cell receptor loci in the opossum Monodelphis domestica. BMC Genomics 9, 111 (2008). https://doi.org/10.1186/1471-2164-9-111