Rapid evolution of cancer/testis genes on the X chromosome
BMC Genomics volume 8, Article number: 129 (2007)
Cancer/testis (CT) genes are normally expressed only in germ cells, but can be activated in the cancer state. This unusual property, together with the finding that many CT proteins elicit an antigenic response in cancer patients, has established a role for this class of genes as targets in immunotherapy regimes. Many families of CT genes have been identified in the human genome, but their biological function for the most part remains unclear. While it has been shown that some CT genes are under diversifying selection, this question has not been addressed before for the class as a whole.
To shed more light on this interesting group of genes, we exploited the generation of a draft chimpanzee (Pan troglodytes) genomic sequence to examine CT genes in an organism that is closely related to human, and generated a high-quality, manually curated set of human:chimpanzee CT gene alignments. We find that the chimpanzee genome contains homologues to most of the human CT families, and that the genes are located on the same chromosome and at a similar copy number to those in human. Comparison of putative human:chimpanzee orthologues indicates that CT genes located on chromosome X are diverging faster and are undergoing stronger diversifying selection than those on the autosomes or than a set of control genes on either chromosome X or autosomes.
Given their high level of diversifying selection, we suggest that CT genes are primarily responsible for the observed rapid evolution of protein-coding genes on the X chromosome.
Cancer/testis (CT) genes are a growing family of genes defined by a unique pattern of expression: amongst normal tissues, they are expressed only in cells of the germ line and in embryonic trophoblasts, but their gene products are also found in a significant number of malignant cancers. The first CT genes were discovered because of the immune responses that they elicit in some cancer patients, and can thus be classified as CT antigens [2, 3]; systematic exploration of publicly available gene expression profiles (as documented in EST libraries, SAGE and MPSS data, and microarray experiments) uncovered a significant number of additional CT genes [4, 5], against most of which immune responses have not yet been documented. Nevertheless, all CT genes are in principle attractive targets for cancer immunotherapy, because the gonads are immunoprivileged organs and anti-CT immune responses will therefore target tumours specifically. Vaccination using peptides derived from the NY-ESO-1 (CTAG1B) and MAGEA1 CT genes has already been proven to bring clinical benefits to melanoma patients [6, 7].
CT genes comprise more than 240 members from 70 families, and can be subdivided into two broad categories based on chromosomal localization. CT-X genes are located on the X chromosome, are mostly members of gene families organized into complex direct and inverted repeats, and are expressed primarily during the spermatogonial stage of spermatogenesis. Non-X CT genes are located on autosomes, are mostly single-copy genes, and are expressed primarily during the meiotic and reduction division stages of spermatogenesis. Careful annotation of the sequence of the human X chromosome has revealed that as many as 10% of all genes present on the chromosome are members of known CT families; further analysis of the expression patterns of genes of unknown function located in repeated regions could even increase this estimate. The biological functions of most CT-X genes have not been characterized in any detail. However, evidence is emerging that the best studied of these, the MAGE genes, can act as signal transducing transcriptional modulators. Moreover, MAGE genes appear to be able to mediate proliferative signals [10–12] and a member of the GAGE family has been shown to repress apoptosis, thus directly contributing to the malignant phenotype when aberrantly expressed in cancer. Available data suggest that many CT genes are involved in the re-programming of the transcriptional machinery that occurs during the transition from mitotic to meiotic division during spermatogenesis. It has been suggested that a similar re-programming may be responsible for some of the phenotype of malignant cancer cells [8, 14].
There is mounting evidence that the evolutionary history of the human X chromosome is significantly different from that of autosomes. It contains a disproportionate number of tandem and interspersed segmental duplications, both direct and inverted, containing genes with a testis-specific expression pattern including many CT-X genes. These duplications are unstable in the genome, and subject to copy number polymorphisms, both within the human population and between humans and chimpanzees [15, 16]. While its overall DNA sequence has diverged significantly less than that of autosomes since the speciation of humans and chimpanzees, a significant proportion of protein-coding genes located on the X chromosome are under higher diversifying (positive) selection than those on autosomes. Genes located on the X chromosome are also the most abundant source of functional retrogenes in the primate lineage, and constitute a reservoir of genetic material for the generation of new genes and functions in this lineage, again with a bias toward testis-specific functions [19, 20].
For all of these reasons, it is of interest to trace the evolutionary history of CT genes, and particularly of the CT-X subset, and to measure the selective pressures that act on them. Many of the human CT-X genes do not have easily identifiable orthologues in the mouse, rat or dog genomes, precluding such an analysis among Eutheria using currently available genome data. For example, it has been shown that the large MAGE family of CT-X genes has expanded independently in the primate and rodent lineages. The recent availability of a draft genome for the chimpanzee has made it feasible to study the evolution of the CT genes within the primate lineage. We show here that the CT genes in general and the CT-X genes in particular are under strong diversifying pressure and amongst the fastest-evolving genes in the human genome.
Identification of CT gene families in chimpanzee
To date at least seventy CT gene families, many with multiple members, have been identified in human. We took the opportunity afforded by the publication of the initial sequence of the chimpanzee genome to ask whether CT genes were conserved in man's closest evolutionary neighbour. To this end we assembled a list of human transcript sequences representing all CT gene families, and searched for homologous sequences in the human and chimpanzee genomes. We expected that given the relatively short time elapsed since human-chimpanzee divergence (~6 million years ago) the human sequences would be able to detect CT gene homologues in the chimpanzee genome. Moreover, since the majority of CT genes isolated thus far were detected and characterized using transcript information via cDNA cloning protocols, performing the same search in human allowed us to identify all CT genes present in the current assembly of the human genome. We implemented a two-stage approach in order to accurately define the structure of each CT gene locus. First, we used MegaBlast to search for regions homologous to the CT transcript sequences. Then we applied the SIBsim4 cDNA to genome alignment program (an improved version of sim4) to these regions to establish a gene structure from a locus-specific spliced alignment (see Methods). As can be seen in Table 1, almost all human CT families are found in chimpanzee, and the chromosomal locations of the CT genes in chimpanzee correspond to those in human. In terms of copy number, the biggest family, PRAME, is well represented in chimpanzee (37 genes), as are MAGEA (9 genes), CTAGE (15 genes), XAGE (12 genes) and SSX (8 genes). The number of CT genes in each family is probably underestimated because of the relatively low sequence coverage in the current version of the chimpanzee genome assembly. This is especially true for the X chromosome, where the sequence coverage is only about 2-fold, and where most of the human multi-gene CT families are located. Nevertheless, the current data indicate that some chimpanzee CT families (FTHL17/CT38, TSPY/CT78 and PRAME) may contain more members than in human.
In order to investigate more closely the relatedness of CT genes in these two species, we sought putative human and chimpanzee orthologues for as many CT genes as possible, based on nucleotide sequence identity to the cognate human transcript sequence. Ninety-eight orthologous CT pairs were defined in this way (see Methods and additional file 1). The average identity of the human and chimpanzee orthologues to the human transcript sequences was 99.6% and 97.8%, respectively. Since we were interested in the characteristics of CT genes as a group, we also defined a group of human-chimpanzee orthologous non-CT control genes from chromosome X, where most of the CT genes are located, and from autosomal chromosomes 18 and 19 (see Methods). The reasons for choosing a limited set of control genes were two-fold: first, this allowed us to generate manually curated alignments of the same quality as for the CT genes, and second, it provided test and control groups of similar sizes for statistical analysis. The average identity of the human and chimpanzee control orthologues to the human transcript sequences was 99.6% and 98.7%, respectively. The finding that the chimpanzee and human CT orthologues were on average less closely related than the control orthologues (97.8% versus 98.7%; p < 2.2e-16 by a chi-squared test) suggested a possible difference in the divergence rates between the CT group and the control group. We tested this by analysing the substitution rates between human and chimpanzee ORF sequences (see below). Given the high accuracy of the human genomic sequence, the finding that the average human identity was less than 100% for both CT genes and non-CT control genes presumably reflects polymorphisms and/or sequencing errors in the original transcript sequences.
CT genes on chromosome X are evolving faster than those on other chromosomes
We estimated the divergence rates of the CT genes from pairwise sequence alignments of the human and chimpanzee orthologues using phylogenetic analysis (PAML package). Mutations in a protein-coding gene can either have no effect (synonymous changes) or alter the sequence of the encoded protein (non-synonymous changes). The rate of synonymous changes (dS) indicates the background mutation frequency, while the ratio of the non-synonymous to synonymous mutation rates (dN/dS) indicates the type of evolutionary pressure acting on the gene. A dN/dS ratio value less than 1 suggests negative or purifying selection, a ratio equal to 1 suggests neutral evolution, and a ratio greater than 1 suggests positive or diversifying selection. To test what type of evolutionary pressure might be acting on the CT genes, we aligned the ORFs in the human-chimpanzee orthologue pairs and used the codeml program from the PAML package to estimate the dN/dS ratios. Again, for comparison purposes, the control genes were subjected to an identical procedure. Figure 1 shows the distribution of dN/dS ratios for the CT genes and controls by chromosomal location. In contrast to the control genes, which show the distribution of ratios expected if most genes are under purifying selection, CT genes located on chromosome X have an excess of ratios greater than one. At the level of individual genes, SSX1, PAGE2B, SSX4, MAGEB2, GAGE4 and CPXCR1 have rate ratios greater than 2, indicative of strong evolutionary selective pressure acting on the gene products (Table 2). CT genes located on chromosomes other than chromosome X (CT-nonX) have a distribution of ratios skewed towards lower values, suggesting that this subgroup is evolving slower than the CT-X genes. In contrast, the majority of control genes, irrespective of chromosomal location, have rate ratios less than 0.5, suggestive of purifying selection. In addition, the nonsynonymous substitution rates for CT genes that had no synonymous changes between human and chimpanzee were on average higher than for the controls (see additional file 2).
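The interpretation of dN/dS described above can be written down as a small Python sketch. It is only an illustration of the thresholds stated in the text, not the authors' pipeline: the function names and the `tolerance` band around 1 are assumptions introduced here, and a pairwise ratio above 1 is suggestive rather than a formal test of selection. Pairs with dS = 0 are reported as undefined, which is why such genes have to be compared on dN alone.

```python
from typing import Optional


def dn_ds_ratio(dn: float, ds: float) -> Optional[float]:
    """Return dN/dS, or None when dS is zero and the ratio is undefined."""
    if ds == 0:
        return None
    return dn / ds


def selection_regime(dn: float, ds: float, tolerance: float = 0.05) -> str:
    """Label a gene pair by its dN/dS ratio.

    `tolerance` is a hypothetical band around 1 used to call a ratio
    'neutral'; it is an assumption of this sketch, not a value from the paper.
    """
    omega = dn_ds_ratio(dn, ds)
    if omega is None:
        return "undefined (no synonymous changes)"
    if omega < 1 - tolerance:
        return "purifying"
    if omega > 1 + tolerance:
        return "diversifying"
    return "neutral"


if __name__ == "__main__":
    # Illustrative rates only, not data from the study.
    for dn, ds in [(0.002, 0.01), (0.03, 0.01), (0.01, 0.01), (0.01, 0.0)]:
        print(dn, ds, selection_regime(dn, ds))
```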
The apparent difference between the dN/dS distributions for the CT genes and the controls was assessed for significance using a nonparametric Mann-Whitney test, which indicates whether the medians of the two populations are significantly different. The difference in dN/dS values between all CT genes and all controls is highly significant with a p-value of 1.128e-11 (Table 3). Moreover, the difference between CT genes and the controls is significant whether the CT genes are located on chromosome X (p = 4.686e-10) or not (p = 1.498e-05). The distribution of dN/dS values is also significantly different for CT genes on chromosome X compared to those elsewhere (p = 2.812e-05), suggesting that there is stronger selective pressure on CT genes located on chromosome X. In contrast, there is no significant difference in the distribution of dN/dS ratios between the control genes located on chromosome X or elsewhere (p = 0.4962). Previous work has shown that the protein-coding genes on the hominid X chromosome have a higher average dN/dS value than other chromosomes. Our results suggest that the CT genes contribute strongly to this difference, and thus to the rapid evolution of protein-coding genes on the X chromosome.
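For readers unfamiliar with the Mann-Whitney test, the following Python sketch shows how the statistic is computed from two lists of dN/dS values. It is a plain standard-library illustration using the normal approximation with tie and continuity corrections; the study itself used R, so exact p-values from that software may differ slightly. The function name and the example values are assumptions made here and are not data from the paper.

```python
import math
from typing import Sequence, Tuple


def mann_whitney_u(x: Sequence[float], y: Sequence[float]) -> Tuple[float, float]:
    """Two-sided Mann-Whitney U test (normal approximation).

    Returns (U statistic for sample x, approximate two-sided p-value).
    """
    n1, n2 = len(x), len(y)
    if n1 == 0 or n2 == 0:
        raise ValueError("both samples must be non-empty")
    # Pool the samples, remembering which sample each value came from.
    pooled = sorted([(v, 0) for v in x] + [(v, 1) for v in y])
    n = n1 + n2
    rank_sum_x = 0.0
    tie_term = 0.0
    i = 0
    while i < n:
        j = i
        while j < n and pooled[j][0] == pooled[i][0]:
            j += 1
        t = j - i                      # size of this group of tied values
        avg_rank = (i + 1 + j) / 2.0   # mean of the 1-based ranks i+1 .. j
        rank_sum_x += avg_rank * sum(1 for k in range(i, j) if pooled[k][1] == 0)
        tie_term += t ** 3 - t
        i = j
    u_x = rank_sum_x - n1 * (n1 + 1) / 2.0
    mean_u = n1 * n2 / 2.0
    var_u = n1 * n2 / 12.0 * ((n + 1) - tie_term / (n * (n - 1)))
    if var_u == 0:
        return u_x, 1.0
    # Continuity correction, clipped at zero so p never exceeds 1.
    z = max((abs(u_x - mean_u) - 0.5) / math.sqrt(var_u), 0.0)
    p_value = math.erfc(z / math.sqrt(2.0))
    return u_x, p_value


if __name__ == "__main__":
    # Hypothetical dN/dS values for two small groups of genes.
    ct_x = [1.2, 1.5, 2.1, 0.9]
    controls = [0.1, 0.2, 0.3, 0.15]
    print(mann_whitney_u(ct_x, controls))
```

Because the test works on ranks, a few genes with very large ratios do not dominate the comparison, which suits the skewed dN/dS distributions described here.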
Several recent publications have taken advantage of the chimpanzee draft genome to identify genes that are under diversifying selection in the primate lineage. Their conclusions were concordant, in that they identified the X chromosome as containing a high number of positively selected genes, they found that positively selected genes are predominantly testis-specific, and that their functions are linked to gametogenesis as well as sensory perception and immunity against invading pathogens. Because most of these studies were performed at the whole genome level, they tended to focus on genes for which orthologues could be easily identified and pairwise alignments of coding regions generated automatically. This may explain why they failed to identify CT genes as a dominant group of positively selected genes. A review of recently published literature confirms that only a limited number of CT genes have been recognised as undergoing positive selection (Table 4). Moreover, a large proportion were identified through investigation of individual CT gene families (SPANX and PRAME). In the present study, we have focused on the comparison between human and chimpanzee CT genes, with an emphasis on generating high-quality manually curated data. This was made necessary by the fact that many CT genes are located within segmental duplications and hence have multiple paralogues, and that we tried to be exhaustive in our analysis of all known CT gene families. Because of the large number of gaps that remain in the current assembly of the chimpanzee genome and the relatively high stringency we imposed on the extent of the alignments, we have certainly underestimated the number of CT homologues present in the chimpanzee genome, and some of the human:chimpanzee pairs may not correspond to true orthologues. However, neither of these problems should significantly affect the main conclusions of our study.
Given the close evolutionary kinship between humans and chimpanzees it is not surprising that all known CT gene families are shared between the two species. On the other hand, homologues of many CT antigens have not been found outside the primate lineage so far, and the available genome data are still too sparse to track the appearance of CT gene families during mammalian evolution. Even though the data are still incomplete, it is clear that most CT gene families are undergoing copy number expansions in the primate lineage, presumably driven by non-allelic homologous recombination between segmental duplications. The best-studied CT family in this respect is SPANX, which is present as a single-copy gene in rodents and has duplicated and acquired new sub-families in the primate lineage, including at least one (SPANX-C) found to be specific to humans on the basis of its genomic position. SPANX genes have been shown to have copy number polymorphisms in the human population, potentially linked to susceptibility to prostate cancer, and to undergo very rapid evolution affecting both dN and dS. An elegant study of the PRAME cluster on human chromosome 1 revealed the recent expansion in the human lineage of these genes via two large segmental duplications, and subsequent smaller duplications that may be polymorphic in the human population. The large MAGE family of CT antigens, which also comprises genes that do not show a CT expression pattern, has expanded in both the primate and rodent lineages, but independently. Our data also show that many MAGE genes are under diversifying selection (Table 2).
By definition, CT genes are expressed in testis, and for those for which data exist, expression has been shown to be restricted to cells involved in spermatogenesis. It is believed that many CT genes are also expressed during oogenesis, but data on this process are still very sparse [30, 31]. There is abundant evidence in the literature that many genes expressed predominantly during gametogenesis, as well as those implicated in reproduction in general (e.g. those encoding proteins found in the seminal fluid or expressed predominantly in the prostate), are undergoing positive selection during evolution [32–34]. In this respect, CT genes seem to behave much like other reproductive genes.
However, the CT-X genes are a special case, in that diversifying selective pressure seems more intense on this class. It is probable that the evolutionary pressures driving changes in the encoded protein sequences and those driving the expansion of the CT-X gene families are similar. Strikingly, the X chromosome is enriched in intrachromosomal tandem segmental duplications relative to autosomes. Several hypotheses have been put forward to explain why a subset of genes located on the X chromosome is evolving faster than those on autosomes [34–36]. Our data do not shed new light on this subject. However, it is interesting to note that CT-X genes contribute very significantly to the high average positive selection observed in protein-encoding genes on this chromosome, against a genomic background that is much more highly conserved than on the autosomes. One may speculate that transcriptional controls on recently duplicated genes could be relaxed relative to the parental copies, thereby allowing re-expression in tumours and the partial replication in these tumours of the transcriptional changes accompanying gametogenesis.
Essentially all human CT families have homologues at the same chromosomal locations in the chimpanzee genome. The copy numbers in the multi-gene CT families may differ between the two species, but until a high-quality assembly of the chimpanzee genome is available this cannot be assessed reliably. On average, CT genes are under stronger positive selection than a set of randomly selected control genes. CT-X genes as a group are evolving very rapidly, not only relative to control genes on the X chromosome or on autosomes, but also relative to autosomal CT genes.
CT genes and human/chimpanzee genomic sequences
Human Reference Sequence (RefSeq) or GenBank (where no RefSeq was available) entries were obtained for transcripts representing all documented CT gene families in the CT Gene Database. Transcript sequences were also obtained for additional candidate CT genes described in recent publications, which have not yet been added to the CT Gene Database. In some cases, multiple alternatively spliced transcript sequences from the same gene were selected to maximize sequence representation of the locus. Although PRAME has not been designated a CT gene, due to its trace level of expression in some normal adult tissues other than testis, it does exhibit the other main characteristics of CT genes, i.e. strong expression in the testis and up-regulation in various tumours, and was included in the set of CT genes selected for this study. Non-CT control genes were randomly chosen from lists of genes having a RefSeq identifier on chromosomes X, 18 (low gene density) and 19 (high gene density), generated using BioMart [39, 40]. Control genes were selected from locations distributed uniformly along the lengths of the chromosomes to average out site-specific differences in mutation rates. The human (Homo sapiens) genomic sequence used was NCBI Build Number 36 (version 1, release date 9 March 2006), obtained from the NCBI. The chimpanzee (Pan troglodytes) genomic sequence used was NCBI Build Number 2 (version 1, release date 4 October 2006), also obtained from the NCBI.
Identification of CT gene loci in human and chimpanzee
CT gene loci were identified in both human and chimpanzee based on sequence identity between the human transcript sequences and human or chimpanzee genomic sequences. We used MegaBlast to identify genomic regions homologous to the RefSeq sequences and SIBsim4 (an improved version of sim4) to produce high quality spliced alignments at those sites, from which locus-specific transcript sequences were generated. A gene was considered complete if the alignment contained at least 80% of the cognate transcript length or 80% of the annotated open reading frame (ORF), and had at least 85% identity to the human transcript sequence. Putative orthologues were identified as the sequences in human and chimpanzee genomes having the highest identity (and satisfying the 80% length threshold) to the same human transcript sequence. In many cases the poor quality (gaps, incorrect assembly) of the published chimpanzee genome sequence prevented us from finding a chimpanzee orthologue to the human gene. High quality sequence alignments for putative human/chimpanzee orthologues were obtained for 98 of the initial list of 135 CT genes (73%) and 153 of the 180 control genes (85%) selected randomly from chromosomes 18, 19 and X.
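The completeness and orthologue criteria described above can be sketched in Python as follows. This is a minimal illustration under stated assumptions: the `SplicedAlignment` record, its field names and the locus labels are hypothetical and do not correspond to the output format of MegaBlast or SIBsim4; only the 80% coverage and 85% identity thresholds come from the text.

```python
from dataclasses import dataclass
from typing import Iterable, Optional, Tuple

MIN_COVERAGE = 0.80  # fraction of transcript length or of annotated ORF
MIN_IDENTITY = 0.85  # identity to the human transcript sequence


@dataclass(frozen=True)
class SplicedAlignment:
    """Hypothetical summary of one spliced alignment at one genomic locus."""
    locus: str
    transcript_coverage: float  # aligned fraction of the transcript
    orf_coverage: float         # aligned fraction of the annotated ORF
    identity: float             # identity to the human transcript


def is_complete(aln: SplicedAlignment) -> bool:
    """Apply the coverage-or-ORF and identity thresholds from the Methods."""
    covered = (aln.transcript_coverage >= MIN_COVERAGE
               or aln.orf_coverage >= MIN_COVERAGE)
    return covered and aln.identity >= MIN_IDENTITY


def best_locus(alignments: Iterable[SplicedAlignment]) -> Optional[SplicedAlignment]:
    """Highest-identity complete locus for one transcript, or None."""
    complete = [a for a in alignments if is_complete(a)]
    return max(complete, key=lambda a: a.identity, default=None)


def putative_orthologue_pair(
    human: Iterable[SplicedAlignment],
    chimp: Iterable[SplicedAlignment],
) -> Optional[Tuple[SplicedAlignment, SplicedAlignment]]:
    """Pair the best human and chimpanzee loci for the same human transcript."""
    h, c = best_locus(human), best_locus(chimp)
    if h is None or c is None:
        return None
    return h, c


if __name__ == "__main__":
    # Made-up alignments for a single transcript.
    human = [SplicedAlignment("hsX:1", 0.95, 1.00, 0.996),
             SplicedAlignment("hsX:2", 0.90, 1.00, 0.970)]
    chimp = [SplicedAlignment("ptX:1", 0.85, 0.90, 0.978),
             SplicedAlignment("ptX:2", 0.50, 0.60, 0.990)]
    print(putative_orthologue_pair(human, chimp))
```

In the made-up example, the second chimpanzee locus has the higher identity but fails the length threshold, so the first is retained. This mirrors how gaps in a draft assembly can remove the true orthologue from consideration.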
Divergence of CT genes
The genome-based transcript sequences derived from human and chimpanzee for each putative orthologous pair were aligned using clustalw (version 1.81), with gap extension penalties set to zero to allow gaps in the alignment arising from sequences missing in the chimpanzee assembly. Both sequences in the alignment were then trimmed to the extent of the human ORF based on annotation in the RefSeq or GenBank entry. Each nucleotide alignment was manually curated and revised, if necessary, to reflect the corresponding protein alignment. ORFs containing stop codons were dropped from the analysis. Rates of synonymous (dS; also known as Ks) and non-synonymous (dN; also known as Ka) substitutions between aligned ORFs were estimated using the codeml program from the PAML package with the F3x4 codon frequency model (and runmode = -2 in the codeml control file). Note that incomplete codons in either the human or the chimpanzee sequence are ignored by codeml. The statistical significance of differences in the distributions of human-chimpanzee dN/dS ratios among CT genes and controls was assessed using a Mann-Whitney test (Table 3) or a Welch two-sample t-test (additional file 3) in the R package.
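As an illustration of the codeml settings mentioned above, the Python sketch below writes a minimal control file for a pairwise dN/dS estimate. Only `runmode = -2` and the F3x4 codon frequency model (`CodonFreq = 2`) are taken from the text; the remaining settings, the helper name and the file names are assumptions chosen here as common defaults and may differ from what the authors used.

```python
def codeml_pairwise_ctl(seqfile: str, outfile: str) -> str:
    """Build the text of a codeml control file for pairwise dN/dS estimation."""
    settings = [
        ("seqfile", seqfile),   # codon alignment of the orthologue pair
        ("outfile", outfile),   # main codeml result file
        ("runmode", -2),        # pairwise comparison (from the Methods)
        ("seqtype", 1),         # 1 = codon sequences
        ("CodonFreq", 2),       # 2 = F3x4 (from the Methods)
        ("model", 0),           # assumption: one dN/dS ratio
        ("NSsites", 0),         # assumption: no variation among sites
        ("icode", 0),           # assumption: universal genetic code
        ("fix_kappa", 0),       # assumption: estimate kappa
        ("kappa", 2),           # assumption: initial kappa
        ("fix_omega", 0),       # assumption: estimate omega (dN/dS)
        ("omega", 0.4),         # assumption: initial omega
    ]
    return "\n".join(f"{key:>10} = {value}" for key, value in settings) + "\n"


if __name__ == "__main__":
    # Hypothetical file names for one orthologue pair.
    print(codeml_pairwise_ctl("pair.phy", "pair.out"))
```

Generating one control file per orthologue pair keeps each codeml run independent, which is convenient when alignments are curated by hand and occasionally revised.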
Abbreviations
CT-X: CT genes on chromosome X
dN: nonsynonymous substitution rate
dS: synonymous substitution rate
NCBI: National Center for Biotechnology Information
ORF: open reading frame
PAML: phylogenetic analysis by maximum likelihood
Scanlan MJ, Simpson AJ, Old LJ: The cancer/testis genes: review, standardization, and commentary. Cancer Immun. 2004, 4: 1-
van der Bruggen P, Traversari C, Chomez P, Lurquin C, De Plaen E, Van den Eynde B, Knuth A, Boon T: A gene encoding an antigen recognized by cytolytic T lymphocytes on a human melanoma. Science. 1991, 254 (5038): 1643-1647. 10.1126/science.1840703.
Chen YT, Scanlan MJ, Sahin U, Tureci O, Gure AO, Tsang S, Williamson B, Stockert E, Pfreundschuh M, Old LJ: A testicular antigen aberrantly expressed in human cancers detected by autologous antibody screening. Proceedings of the National Academy of Sciences of the United States of America. 1997, 94 (5): 1914-1918. 10.1073/pnas.94.5.1914.
Chen YT, Scanlan MJ, Venditti CA, Chua R, Theiler G, Stevenson BJ, Iseli C, Gure AO, Vasicek T, Strausberg RL, Jongeneel CV, Old LJ, Simpson AJ: Identification of cancer/testis-antigen genes by massively parallel signature sequencing. Proceedings of the National Academy of Sciences of the United States of America. 2005, 102 (22): 7940-7945. 10.1073/pnas.0502583102.
Chen YT, Iseli C, Venditti CA, Old LJ, Simpson AJ, Jongeneel CV: Identification of a new cancer/testis gene family, CT47, among expressed multicopy genes on the human X chromosome. Genes, chromosomes & cancer. 2006, 45 (4): 392-400. 10.1002/gcc.20298.
Chianese-Bullock KA, Pressley J, Garbee C, Hibbitts S, Murphy C, Yamshchikov G, Petroni GR, Bissonette EA, Neese PY, Grosh WW, Merrill P, Fink R, Woodson EM, Wiernasz CJ, Patterson JW, Slingluff CL: MAGE-A1-, MAGE-A10-, and gp100-derived peptides are immunogenic when combined with granulocyte-macrophage colony-stimulating factor and montanide ISA-51 adjuvant and administered as part of a multipeptide vaccine for melanoma. J Immunol. 2005, 174 (5): 3080-3086.
Jager E, Karbach J, Gnjatic S, Neumann A, Bender A, Valmori D, Ayyoub M, Ritter E, Ritter G, Jager D, Panicali D, Hoffman E, Pan L, Oettgen H, Old LJ, Knuth A: Recombinant vaccinia/fowlpox NY-ESO-1 vaccines induce both humoral and cellular NY-ESO-1-specific immune responses in cancer patients. Proceedings of the National Academy of Sciences of the United States of America. 2006, 103 (39): 14453-14458. 10.1073/pnas.0606512103.
Simpson AJ, Caballero OL, Jungbluth A, Chen YT, Old LJ: Cancer/testis antigens, gametogenesis and cancer. Nat Rev Cancer. 2005, 5 (8): 615-625. 10.1038/nrc1669.
Ross MT, Grafham DV, Coffey AJ, Scherer S, McLay K, Muzny D, Platzer M, Howell GR, Burrows C, Bird CP, Frankish A, Lovell FL, Howe KL, Ashurst JL, Fulton RS, Sudbrak R, Wen G, Jones MC, Hurles ME, Andrews TD, Scott CE, Searle S, Ramser J, Whittaker A, Deadman R, Carter NP, Hunt SE, Chen R, Cree A, Gunaratne P, Havlak P, Hodgson A, Metzker ML, Richards S, Scott G, Steffen D, Sodergren E, Wheeler DA, Worley KC, Ainscough R, Ambrose KD, Ansari-Lari MA, Aradhya S, Ashwell RI, Babbage AK, Bagguley CL, Ballabio A, Banerjee R, Barker GE, Barlow KF, Barrett IP, Bates KN, Beare DM, Beasley H, Beasley O, Beck A, Bethel G, Blechschmidt K, Brady N, Bray-Allen S, Bridgeman AM, Brown AJ, Brown MJ, Bonnin D, Bruford EA, Buhay C, Burch P, Burford D, Burgess J, Burrill W, Burton J, Bye JM, Carder C, Carrel L, Chako J, Chapman JC, Chavez D, Chen E, Chen G, Chen Y, Chen Z, Chinault C, Ciccodicola A, Clark SY, Clarke G, Clee CM, Clegg S, Clerc-Blankenburg K, Clifford K, Cobley V, Cole CG, Conquer JS, Corby N, Connor RE, David R, Davies J, Davis C, Davis J, Delgado O, Deshazo D, Dhami P, Ding Y, Dinh H, Dodsworth S, Draper H, Dugan-Rocha S, Dunham A, Dunn M, Durbin KJ, Dutta I, Eades T, Ellwood M, Emery-Cohen A, Errington H, Evans KL, Faulkner L, Francis F, Frankland J, Fraser AE, Galgoczy P, Gilbert J, Gill R, Glockner G, Gregory SG, Gribble S, Griffiths C, Grocock R, Gu Y, Gwilliam R, Hamilton C, Hart EA, Hawes A, Heath PD, Heitmann K, Hennig S, Hernandez J, Hinzmann B, Ho S, Hoffs M, Howden PJ, Huckle EJ, Hume J, Hunt PJ, Hunt AR, Isherwood J, Jacob L, Johnson D, Jones S, de Jong PJ, Joseph SS, Keenan S, Kelly S, Kershaw JK, Khan Z, Kioschis P, Klages S, Knights AJ, Kosiura A, Kovar-Smith C, Laird GK, Langford C, Lawlor S, Leversha M, Lewis L, Liu W, Lloyd C, Lloyd DM, Loulseged H, Loveland JE, Lovell JD, Lozado R, Lu J, Lyne R, Ma J, Maheshwari M, Matthews LH, McDowall J, McLaren S, McMurray A, Meidl P, Meitinger T, Milne S, Miner G, Mistry SL, Morgan M, Morris S, Muller I, 
Mullikin JC, Nguyen N, Nordsiek G, Nyakatura G, O'Dell CN, Okwuonu G, Palmer S, Pandian R, Parker D, Parrish J, Pasternak S, Patel D, Pearce AV, Pearson DM, Pelan SE, Perez L, Porter KM, Ramsey Y, Reichwald K, Rhodes S, Ridler KA, Schlessinger D, Schueler MG, Sehra HK, Shaw-Smith C, Shen H, Sheridan EM, Shownkeen R, Skuce CD, Smith ML, Sotheran EC, Steingruber HE, Steward CA, Storey R, Swann RM, Swarbreck D, Tabor PE, Taudien S, Taylor T, Teague B, Thomas K, Thorpe A, Timms K, Tracey A, Trevanion S, Tromans AC, d'Urso M, Verduzco D, Villasana D, Waldron L, Wall M, Wang Q, Warren J, Warry GL, Wei X, West A, Whitehead SL, Whiteley MN, Wilkinson JE, Willey DL, Williams G, Williams L, Williamson A, Williamson H, Wilming L, Woodmansey RL, Wray PW, Yen J, Zhang J, Zhou J, Zoghbi H, Zorilla S, Buck D, Reinhardt R, Poustka A, Rosenthal A, Lehrach H, Meindl A, Minx PJ, Hillier LW, Willard HF, Wilson RK, Waterston RH, Rice CM, Vaudin M, Coulson A, Nelson DL, Weinstock G, Sulston JE, Durbin R, Hubbard T, Gibbs RA, Beck S, Rogers J, Bentley DR: The DNA sequence of the human X chromosome. Nature. 2005, 434 (7031): 325-337. 10.1038/nature03440.
Park JH, Kong GH, Lee SW: hMAGE-A1 overexpression reduces TNF-alpha cytotoxicity in ME-180 cells. Mol Cells. 2002, 14 (1): 122-129.
Duan Z, Duan Y, Lamendola DE, Yusuf RZ, Naeem R, Penson RT, Seiden MV: Overexpression of MAGE/GAGE genes in paclitaxel/doxorubicin-resistant human cancer cell lines. Clin Cancer Res. 2003, 9 (7): 2778-2785.
Glynn SA, Gammell P, Heenan M, O'Connor R, Liang Y, Keenan J, Clynes M: A new superinvasive in vitro phenotype induced by selection of human breast carcinoma cells with the chemotherapeutic drugs paclitaxel and doxorubicin. Br J Cancer. 2004, 91 (10): 1800-1807. 10.1038/sj.bjc.6602221.
Cilensek ZM, Yehiely F, Kular RK, Deiss LP: A member of the GAGE family of tumor antigens is an anti-apoptotic gene that confers resistance to Fas/CD95/APO-1, Interferon-gamma, taxol and gamma-irradiation. Cancer Biol Ther. 2002, 1 (4): 380-387.
Yang B, O'Herrin S, Wu J, Reagan-Shaw S, Ma Y, Nihal M, Longley BJ: Select Cancer Testes Antigens of the MAGE-A, -B, and -C Families Are Expressed in Mast Cell Lines and Promote Cell Viability In Vitro and In Vivo. J Invest Dermatol. 2006
Perry GH, Tchinda J, McGrath SD, Zhang J, Picker SR, Caceres AM, Iafrate AJ, Tyler-Smith C, Scherer SW, Eichler EE, Stone AC, Lee C: Hotspots for copy number variation in chimpanzees and humans. Proceedings of the National Academy of Sciences of the United States of America. 2006, 103 (21): 8006-8011. 10.1073/pnas.0602318103.
Bailey JA, Eichler EE: Primate segmental duplications: crucibles of evolution, diversity and disease. Nature reviews. 2006, 7 (7): 552-564. 10.1038/nrg1895.
Patterson N, Richter DJ, Gnerre S, Lander ES, Reich D: Genetic evidence for complex speciation of humans and chimpanzees. Nature. 2006, 441 (7097): 1103-1108. 10.1038/nature04789.
The Chimpanzee Sequencing and Analysis Consortium: Initial sequence of the chimpanzee genome and comparison with the human genome. Nature. 2005, 437 (7055): 69-87. 10.1038/nature04072.
Emerson JJ, Kaessmann H, Betran E, Long M: Extensive gene traffic on the mammalian X chromosome. Science. 2004, 303 (5657): 537-540. 10.1126/science.1090042.
Vinckenbosch N, Dupanloup I, Kaessmann H: Evolutionary fate of retroposed gene copies in the human genome. Proceedings of the National Academy of Sciences of the United States of America. 2006, 103 (9): 3220-3225. 10.1073/pnas.0511307103.
Chomez P, De Backer O, Bertrand M, De Plaen E, Boon T, Lucas S: An overview of the MAGE gene family with the identification of all human members of the family. Cancer research. 2001, 61 (14): 5544-5551.
Zhang Z, Schwartz S, Wagner L, Miller W: A greedy algorithm for aligning DNA sequences. J Comput Biol. 2000, 7 (1-2): 203-214. 10.1089/10665270050081478.
Florea L, Hartzell G, Zhang Z, Rubin GM, Miller W: A computer program for aligning a cDNA sequence with a genomic DNA sequence. Genome research. 1998, 8 (9): 967-974.
Yang Z: PAML: a program package for phylogenetic analysis by maximum likelihood. Comput Appl Biosci. 1997, 13 (5): 555-556.
Yang Z: Inference of selection from multiple species alignments. Current opinion in genetics & development. 2002, 12 (6): 688-694. 10.1016/S0959-437X(02)00348-9.
Sabeti PC, Schaffner SF, Fry B, Lohmueller J, Varilly P, Shamovsky O, Palma A, Mikkelsen TS, Altshuler D, Lander ES: Positive natural selection in the human lineage. Science. 2006, 312 (5780): 1614-1620. 10.1126/science.1124309.
Kouprina N, Mullokandov M, Rogozin IB, Collins NK, Solomon G, Otstot J, Risinger JI, Koonin EV, Barrett JC, Larionov V: The SPANX gene family of cancer/testis-specific antigens: rapid evolution and amplification in African great apes and hominids. Proceedings of the National Academy of Sciences of the United States of America. 2004, 101 (9): 3077-3082. 10.1073/pnas.0308532100.
Birtle Z, Goodstadt L, Ponting C: Duplication and positive selection among hominin-specific PRAME genes. BMC genomics. 2005, 6: 120. 10.1186/1471-2164-6-120.
Kouprina N, Pavlicek A, Noskov VN, Solomon G, Otstot J, Isaacs W, Carpten JD, Trent JM, Schleutker J, Barrett JC, Jurka J, Larionov V: Dynamic structure of the SPANX gene cluster mapped to the prostate cancer susceptibility locus HPCX at Xq27. Genome research. 2005, 15 (11): 1477-1486. 10.1101/gr.4212705.
Gjerstorff MF, Kock K, Nielsen O, Ditzel HJ: MAGE-A1, GAGE and NY-ESO-1 cancer/testis antigen expression during human gonadal development. Hum Reprod. 2007
Nelson PT, Zhang PJ, Spagnoli GC, Tomaszewski JE, Pasha TL, Frosina D, Caballero OL, Simpson AJ, Old LJ, Jungbluth AA: Cancer/testis (CT) antigens are expressed in fetal ovary. Cancer Immun. 2007, 7: 1.
Bustamante CD, Fledel-Alon A, Williamson S, Nielsen R, Hubisz MT, Glanowski S, Tanenbaum DM, White TJ, Sninsky JJ, Hernandez RD, Civello D, Adams MD, Cargill M, Clark AG: Natural selection on protein-coding genes in the human genome. Nature. 2005, 437 (7062): 1153-1157. 10.1038/nature04240.
Khaitovich P, Hellmann I, Enard W, Nowick K, Leinweber M, Franz H, Weiss G, Lachmann M, Paabo S: Parallel patterns of evolution in the genomes and transcriptomes of humans and chimpanzees. Science. 2005, 309 (5742): 1850-1854. 10.1126/science.1108296.
Khaitovich P, Enard W, Lachmann M, Paabo S: Evolution of primate gene expression. Nature Reviews Genetics. 2006, 7 (9): 693-702. 10.1038/nrg1940.
Nielsen R, Bustamante C, Clark AG, Glanowski S, Sackton TB, Hubisz MJ, Fledel-Alon A, Tanenbaum DM, Civello D, White TJ, Sninsky JJ, Adams MD, Cargill M: A scan for positively selected genes in the genomes of humans and chimpanzees. PLoS biology. 2005, 3 (6): e170. 10.1371/journal.pbio.0030170.
Vicoso B, Charlesworth B: Evolution on the X chromosome: unusual patterns and processes. Nature Reviews Genetics. 2006, 7 (8): 645-653. 10.1038/nrg1914.
NCBI Reference Sequence (RefSeq). [http://www.ncbi.nlm.nih.gov/RefSeq]
CT Gene Database. [http://www.cancerimmunity.org/CTdatabase]
Kasprzyk A, Keefe D, Smedley D, London D, Spooner W, Melsopp C, Hammond M, Rocca-Serra P, Cox T, Birney E: EnsMart: a generic system for fast and flexible access to biological data. Genome research. 2004, 14 (1): 160-169. 10.1101/gr.1645104.
BioMart - MartView. [http://www.biomart.org/biomart/martview]
SIBsim4 project. [http://sibsim4.sourceforge.net]
Thompson JD, Higgins DG, Gibson TJ: CLUSTAL W: improving the sensitivity of progressive multiple sequence alignment through sequence weighting, position-specific gap penalties and weight matrix choice. Nucleic acids research. 1994, 22 (22): 4673-4680. 10.1093/nar/22.22.4673.
R Development Core Team: R: A language and environment for statistical computing. 2004, R Foundation for Statistical Computing, Vienna, Austria.
We thank members of SIB Lausanne for discussions, and in particular Asa Wirapati and Frédéric Schutz for advice on statistical analysis of the phylogenetic data. This work was supported by the Ludwig Institute for Cancer Research.
BJS, CI, LJO, AJS and CVJ designed the experiments. BJS wrote the software pipeline to identify human and chimpanzee CT genes and to produce ORF alignments. SP, MZ and WH scanned the literature for citations of positive selection. BJS and CVJ wrote the manuscript, which was read and approved by all authors.
Electronic supplementary material
Additional File 3: Significance of the differences in the distributions of dN/dS ratios between CT and control ORFs, assessed using a parametric t-test. The results are qualitatively similar to those presented in Table 3 and confirm that the distribution of dN/dS values differs between CT genes and controls. (DOC 28 KB)