Structure and evolution of the mouse pregnancy-specific glycoprotein (Psg) gene locus

Background The pregnancy-specific glycoprotein (Psg) genes encode proteins of unknown function, and are members of the carcinoembryonic antigen (Cea) gene family, which is a member of the immunoglobulin gene (Ig) superfamily. In rodents and primates, but not in artiodactyls (even-toed ungulates / hoofed mammals), there have been independent expansions of the Psg gene family, with all members expressed exclusively in placental trophoblast cells. For the mouse Psg genes, we sought to determine the genomic organisation of the locus, the expression profiles of the various family members, and the evolution of exon structure, to attempt to reconstruct the evolutionary history of this locus, and to determine whether expansion of the gene family has been driven by selection for increased gene dosage, or diversification of function. Results We collated the mouse Psg gene sequences currently in the public genome and expressed-sequence tag (EST) databases and used systematic BLAST searches to generate complete sequences for all known mouse Psg genes. We identified a novel family member, Psg31, which is similar to Psg30 but, uniquely amongst mouse Psg genes, has a duplicated N1 domain. We also identified a novel splice variant of Psg16 (bCEA). We show that Psg24 and Psg30 / Psg31 have independently undergone expansion of N-domain number. By mapping BAC, YAC and cosmid clones we described two clusters of Psg genes, which we linked and oriented using fluorescent in situ hybridisation (FISH). Comparison of our Psg locus map with the public mouse genome database indicates good agreement in overall structure and further elucidates gene order. Expression levels of Psg genes in placentas of different developmental stages revealed dramatic differences in the developmental expression profile of individual family members. 
Conclusion We have combined existing information, and provide new information concerning the evolution of mouse Psg exon organization, the mouse Psg genomic locus structure, and the expression patterns of individual Psg genes. This information will facilitate functional studies of this complex gene family.


Background
In mammalian pregnancy the interaction between the maternal uterine tissues and foetal trophoblasts is regulated by a wide variety of cellular and endocrinological mechanisms. These mechanisms underpin trophoblastic invasion and remodelling of maternal tissues, placental angiogenesis, and the modulation of maternal immune responses. Central to these processes is the production by trophoblast of a variety of hormones that are found in abundance in the maternal bloodstream during pregnancy [1].
The pregnancy-specific glycoproteins (PSG) are the most abundant foetal proteins in the maternal bloodstream in late pregnancy [2]. They are synthesised in the syncytiotrophoblast of the human placenta and in the giant cells and spongiotrophoblast of the rodent placenta [3][4][5]. The PSG family of glycoproteins belongs to the carcinoembryonic antigen (CEA) family, which also includes the CEA-related adhesion molecules (CEACAMs). The CEA family is itself part of the immunoglobulin (Ig) superfamily [6]. The Ig domain structure of the human and rodent PSGs differs. Human PSGs contain V-like Ig domains (N), C2-like Ig domains (A and B) and relatively hydrophilic tails (C), in the following arrangements: type I (N-A1-A2-B2-C), type IIa (N-A1-B2-C), type IIb (N-A2-B2-C), type III (N-B2-C) and type IV (A1-B2-C) [7]. In contrast, rodent PSGs typically comprise three, and in a few cases five or seven, N-domains followed by an A-domain [8].
In the primate / rodent ancestor, the initial duplication of the CEACAM / PSG primordial gene has been estimated to have occurred about 90 Myr ago [9], approximately at the time of human-rodent divergence. Based on the organisation of N and A domains, the most probable PSG ancestor in rodents and primates is a CEACAM15-like molecule. CEACAM15 is not classified as a PSG because comparisons of N and A domain sequence identity clearly delineate members of the CEACAM and PSG subfamilies (Roland Zebhauser, WZ, AM, TM, to be published elsewhere). It has been suggested that human and rodent PSG multigene families evolved independently via further gene duplication and exon shuffling events [10].
There are 11 members of the PSG family in humans that are encoded by genes clustered on chromosome 19q13.2 [11,12]. PSG proteins have a similar domain structure to the CEACAMs, but lack a membrane anchor and are therefore secreted. However, a few variants have been described that are retained within the cell. Conversely, a small number of human and mouse CEACAM variants lack a membrane anchor and are secreted. Membrane-anchored CEACAMs are widely expressed during embryonic development and in adult tissues, and are implicated in carcinogenesis, angiogenesis and regulation of immune functions [13,14]. In contrast, PSGs and some CEACAMs are expressed almost exclusively in trophoblasts of the haemochorial placenta of rodents and primates [4,5,15].
The biochemical properties and physiological functions of the PSGs remain to be fully elucidated, although functional experiments and clinical observations are beginning to provide some clues. Low PSG levels in the maternal circulation are associated with threatened abortions, intrauterine growth retardation and foetal hypoxia [16][17][18][19]. The importance of PSGs for the maintenance of pregnancy is also underlined by the observation that the application of anti-PSG antibodies or vaccination with PSG induces abortion in mice and monkeys, respectively, and reduces the fertility of non-pregnant monkeys [20,21]. The majority of PSG functional studies have focussed on determining whether PSGs are able to modulate the maternal immune system to prevent rejection of the allotypic foetus. Early studies with complex PSG mixtures isolated from placenta indicated an inhibitory effect on phytohaemagglutinin or allogeneically stimulated lymphocytes [22,23]. In further experiments it was shown that human monocytes secreted anti-inflammatory cytokines in response to PSG exposure. Moreover, recombinant mouse PSG18 was found to induce the production of interleukin (IL)-10 in the mouse macrophage cell line RAW 264.7 [24]. Human PSG1, PSG6 and PSG11 all induced secretion of IL-10, IL-6 and transforming growth factor (TGF)-β1 [25]. Whilst IL-10 and TGF-β1 are anti-inflammatory [26], IL-6 is usually considered to be a proinflammatory cytokine. However, IL-6 does have some well-described anti-inflammatory properties [27]. Furthermore, IL-6 has been shown to indirectly promote trophoblast growth by upregulation of human chorionic gonadotropin (hCG) release by the trophoblast, and induction of granulocyte-macrophage-colony stimulating factor (GM-CSF) [28,29].
Further evidence implicating PSGs in immune modulation arises from PSG-mediated suppression of T cells in purulent septic complications of abortion [30], and elevated circulating PSG levels are correlated with improved symptoms of rheumatoid arthritis [31]. PSG induction of alternative monocyte activation is of particular importance as it implies a PSG-mediated switching of the immune system from a predominantly TH1 response to a predominantly TH2 response, which is more compatible with a successful pregnancy [32].
The only PSG receptor identified to date is the integrin-associated CD9 receptor, which was found to bind the N1 domain of both PSG17 [33] and PSG19 (unpublished data). Additionally, the presence of the conserved tripeptide motif Arg-Gly-Asp (RGD) on a solvent-exposed loop in the N-terminal Ig domain in the majority of human and some lower-primate PSGs implicates a function that involves integrin-related receptors [34]. Thus it has been speculated that the RGD motif may enable some PSGs to disrupt cell-matrix interactions [35]. However, no rodent PSG isolated to date possesses an RGD motif. Evidence supporting the hypothesis that the RGD motif may be involved in receptor binding was provided by the discovery that a peptide containing the RGD motif, from human PSG9, bound to a receptor on the surface of a promonocytic cell line [36]. In common with integrin interactions, this was dependent on the presence of divalent cations and showed sensitivity to cytoskeletal signalling. However, the expected sizes of the receptor subunits differed from those of known integrins; the identity of the receptor therefore remains elusive.
Much current work has focussed on human PSGs due to their possible relevance to disorders of pregnancy. However, the study of rodent PSGs is important because, the evident differences between primate and rodent PSG protein domain structures notwithstanding, there appears to be considerable conservation in terms of expression in trophoblast, independent gene family expansions in mammalian lineages with haemochorial placentation, and postulated immune functions during pregnancy. Moreover, the application of gene targeting and mutagenesis in the mouse is likely to be informative with respect to elucidating the cellular and physiological functions of PSGs. Such experiments will require an accurate genomic map of the mouse Psg locus, which we undertook to produce in the work described herein. It is also pertinent to ask whether the independent expansions of PSG gene families in different mammalian lineages reflect selection for increased gene dosage or for diversification of function mediated through different protein structures or developmental expression patterns. We therefore undertook to examine and correlate protein domain evolution and expression profiles of the various mouse Psg genes to attempt to address this question. Our results suggest that different family members have very different expression levels at different stages of development, which we consider may be supportive of the hypothesis that mouse Psg genes may have evolved divergent functions in mammalian pregnancy. However, mutagenesis of individual family members will be necessary to rigorously test this hypothesis.

Identification of novel mouse Psg genes
For comparative studies of the human PSG family it is relatively easy to compare coding sequences (CDS) and peptide sequences because complete sequence information is available. However, the data available for mouse PSGs are incomplete, making such analyses difficult. We therefore first collated the currently available public data, and then attempted to identify sequences for PSGs that were not completely resolved in the databases. Full-length cDNA sequences of Psg17, Psg18, Psg19, Psg21, Psg23, Psg28 and Psg30 were identified via basic name searches of the RefSeq RNA database. Because misnaming of genes is commonplace in the databases, their identity was then verified by comparison to cDNA fragment sequences, which were obtained during the course of this work and deposited in GenBank [37]. The cDNA sequence of Psg22 was then identified via BLAST analysis of the mouse RefSeq RNA database using the GenBank partial sequence referenced in Beauchemin et al. [37]. Psg31 was identified by BLAST analysis of the same database using the full-length Psg30 sequence and found to be the XM_355864.1 predicted transcript. However, there was a discrepancy between the predicted transcript and the sequences of EST clones CK032208 and CN694284. Comparison of these EST sequences with genomic contig NT_039395.2, using pairwise BLAST analysis, revealed that there had been a duplication of the N1 domain exon. We refer to the two N1 domains of Psg31 as N1 and N1* hereafter.
The gene and full-length cDNA coding sequences of the remaining mouse genes (Psg20, Psg24, Psg25, Psg26, Psg27 and Psg29) were deduced manually by systematic BLAST analysis of the mouse genome database as described in Methods. None of these predicted cDNAs were observed in the mouse EST database, although all except Psg20 were observed in the Trace Archive EST sequences. A novel splice variant of Psg16 was also found. BLAST analysis of the mouse High Throughput Genomic Sequences (HTGS) database identified contig AC148976.2, which appears to contain the whole Psg16 gene. An alternative exon 1 was discovered upstream of the previously described initiating exon by a pairwise BLAST comparison of this contig with full-length Psg17 coding sequence. The use of this alternative exon 1 produced a transcript that encodes a typical PSG polypeptide complete with a predicted secretory-peptide signal sequence and cleavage site. Multiple hits identified from subsequent BLAST analysis of the mouse EST and Trace Archives EST databases provided evidence that this novel splice variant was placentally expressed in vivo.
In contrast, only one hit was obtained by identical analysis using the coding sequence of the brain-specific transcript described in Chen et al. [38]. This transcript (BC030357) was derived from a retinal cDNA library. The brain-expressed splice variant is generated from an alternative initiation site within exon 2 of the dominant placentally-expressed form of the gene. Alternative promoter usage would explain the brain and placenta-specific expression patterns of these variants of Psg16. Unlike the brain-specific variant, the placentally-expressed variant possesses a predicted secretory signal peptide at the N-terminus, like most other Psg gene family members.
The comparison of the brain derived Psg16 coding sequence with the genomic sequence (AC148976.2) also revealed differences in the encoding of the A-domain. The placental transcript is predicted to be encoded by 5 exons, as are the majority of mouse Psg mRNAs. However, a weak splice donor signal sequence within the fifth exon permits splicing to a strong splice acceptor sequence downstream of the sixth exon, as seen in the brain-expressed transcript. Trace Archive EST data reveals multiple hits to sequences from placental cDNA libraries using the 3' end of the placental Psg16 coding sequence as bait. This confirms the existence of our predicted transcript. Conversely, similar analysis using the brain-expressed variant yielded no hits. The sixth exon is present on a separate randomly ordered gene fragment within the AC148976.2 contig.
Psg-ps1 was previously considered to be a pseudogene, based on a point deletion at nucleotide position 30, downstream from the canonical Psg translational start site [8]. However, despite this frame shift, the open reading frame of this unusual Psg continues 105 bp upstream of the site of the mutation to an alternative ATG. Inspection of the sequence revealed a Kozak consensus, and BLAST analysis of the public EST and Trace Archive EST databases yielded many mRNA clones that contain this region in addition to downstream exons. Hence, this gene is clearly expressed, and we propose to rename Psg-ps1 as Psg32. We note that this mutation and amino-terminal extension abolish the canonical PSG secretory signal and peptide cleavage site. We therefore suggest that if Psg32 is indeed translated, the resulting protein is retained within the cytoplasm. To determine if the deletion observed in BALB/c mice was also present in other murine strains, we amplified and sequenced a 146 bp fragment by PCR using a set of primers specific for the 5'-untranslated region and the leader peptide of Psg32. The deletion observed in the Psg32 cDNA is also present in the genomic DNA of A/J, C57BL6/J, YBR/Ei, and SWR/J inbred mouse strains (data not shown).
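The reasoning applied to Psg32 (an open reading frame that extends upstream to an alternative ATG lying in a favourable Kozak context) can be illustrated with a minimal Python sketch. The toy sequence, the function names and the simplified Kozak test (a purine at position -3 and a G at position +4) are illustrative assumptions; this is not the Psg32 sequence, nor the procedure used in this work.

```python
# Hypothetical sketch: search upstream of a reference codon for in-frame ATGs
# and apply a simplified Kozak check. The sequence below is invented.

STOP_CODONS = {"TAA", "TAG", "TGA"}

def upstream_inframe_atgs(seq, anchor):
    """Return 0-based positions of ATG codons upstream of `anchor` that share
    its reading frame and are not separated from it by a stop codon."""
    seq = seq.upper()
    starts = []
    pos = anchor - 3
    while pos >= 0:
        codon = seq[pos:pos + 3]
        if codon in STOP_CODONS:
            break  # the open reading frame cannot extend past a stop codon
        if codon == "ATG":
            starts.append(pos)
        pos -= 3
    return starts

def has_strong_kozak(seq, atg_pos):
    """Simplified test: purine at -3 and G at +4, counting the A of ATG as +1."""
    seq = seq.upper()
    if atg_pos < 3 or atg_pos + 3 >= len(seq):
        return False
    return seq[atg_pos - 3] in "AG" and seq[atg_pos + 3] == "G"

# Invented toy sequence: GCC ACC ATG GCT GAA CTG ATG GGG TAA
TOY_SEQ = "GCCACCATGGCTGAACTGATGGGGTAA"
TOY_ANCHOR = 18  # position of the downstream ATG in the toy sequence

candidates = upstream_inframe_atgs(TOY_SEQ, TOY_ANCHOR)
kozak_flags = [has_strong_kozak(TOY_SEQ, p) for p in candidates]
```

In this toy case the single upstream candidate at position 6 passes the simplified Kozak test, whereas the downstream ATG at position 18 does not. Real start-site prediction would also weigh EST support, as was done for Psg32.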
The nomenclature (past and current) and accession numbers of nucleotide sequences of all the murine PSGs are documented in Table 1. The genome sequence and predicted CDS and translation products for Psg16, Psg20, Psg26 and Psg31 are listed in Additional file 1. The complete CDS data for all known mouse Psgs (except the brain-specific splice variant of Psg16) are listed in Additional file 2. The complete protein primary sequences for all known mouse Psgs (except the brain-specific splice variant of Psg16) are listed in Additional file 3.

Domain structure of mouse PSG proteins
A schematic representation of the mouse Psg domain structures is shown in Fig. 1. Of the seventeen mouse Psgs, thirteen encode a common structure of three Ig variable (IgV)-like domains (N-domains) and a single Ig constant (IgC)-like domain (A-domain). Psg24, Psg30 and Psg31 have an expanded structure created by the duplication of IgV-like domains. An unrooted phylogenetic tree indicates three main branches of IgV-like domain evolution (Fig. 2). There is a group consisting of N1 domains, a group of N2 domains and N2-derived domains, and a group of N3 domains and N3-derived domains. Therefore, in agreement with the most common structure observed in Fig. 1, the ancestral mouse Psg would be expected to have had an N1-N2-N3-A arrangement of domains. The expansion of Psg24, Psg30 and Psg31 has occurred mostly through duplications of the N2 and N3 IgV-like domains, with the exception of Psg24 N5 and Psg31 N1 domains.
In order to characterise the evolution of the mouse Psgs with expanded domain numbers, Neighbor-Joining (NJ) trees with bootstrap values of 1000 were prepared ( However, using the alignment identities in Fig. 3B(i) it can be seen that, although generally poorly conserved, the best match of 51.2% was obtained by alignment with the N2 domain. Therefore, our evolutionary model assumes that the Psg24 N5 domain arose from an early duplication of the N2 domain. Also, based on agreement of the data in Fig. 3A(i) and 3B(i), the N2 domain duplicated again more recently to yield the N3 domain. This latter duplication explains why the Psg24 N4 domain is N3-like. The order of these events is shown schematically in Fig. 3C(i).
Using a similar analysis we propose a model for the expansion of domains within Psg30 and Psg31 (Fig. 3C(ii)). We suggest that the N4 and N6 domains of Psg30 and Psg31 are derived from a progenitor N2 domain. Similarly, the N5 and N7 domains are derived from a progenitor N3 domain. Expansion is predicted to have occurred in 2 or 3 separate events in a common ancestor of Psg30 and Psg31. In the first instance the progenitor N3-like and N2-like domains were duplicated, either at different points in evolution or at the same time. The final step was a duplication of both of these daughter domains to create Psg30 and the precursor of Psg31. The precursor of Psg31 then underwent another duplication, this time of the N1 domain.
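The assignment of an expanded domain to its most likely progenitor rests on pairwise alignment identities. A minimal Python sketch of that comparison is given below. The identity definition (matches divided by the number of columns in which neither sequence has a gap), the function names and the toy aligned peptides are assumptions for illustration only; ClustalX may report identity with a different denominator, and no real Psg sequences are used.

```python
# Hypothetical sketch: percent identity between pre-aligned sequences and
# selection of the best-matching candidate domain. Toy data only.

def percent_identity(aligned_a, aligned_b, gap="-"):
    """Percent identity over alignment columns where neither sequence has a gap."""
    if len(aligned_a) != len(aligned_b):
        raise ValueError("aligned sequences must have equal length")
    compared = 0
    matches = 0
    for x, y in zip(aligned_a.upper(), aligned_b.upper()):
        if x == gap or y == gap:
            continue  # skip gapped columns
        compared += 1
        if x == y:
            matches += 1
    return 100.0 * matches / compared if compared else 0.0

def best_match(query, candidates):
    """Return (name, identity) for the candidate most similar to the query."""
    scores = {name: percent_identity(query, seq) for name, seq in candidates.items()}
    name = max(scores, key=scores.get)
    return name, scores[name]

# Invented aligned toy peptides standing in for a query domain and N1/N2/N3.
TOY_QUERY = "MKTA-LVG"
TOY_DOMAINS = {
    "N1": "MRSAQLIG",
    "N2": "MKTAQLVG",
    "N3": "MKSA-IVG",
}

toy_best = best_match(TOY_QUERY, TOY_DOMAINS)
```

With these toy inputs the query is assigned to the N2-like candidate. The same comparison on real alignments would only support, not prove, a duplication order, which is why the text combines it with the NJ trees.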

Expression of Psg genes in mouse placenta at different developmental stages
On the basis that all mouse Psg genes originated from a common ancestor, and expanded into a multigene family by duplication and subsequent divergence, the question as to whether the expression patterns have also diversified is relevant to determining the selective forces underlying Psg gene family expansion. As Psg genes are expressed predominantly in the placenta, cDNA was prepared from total RNA extracted from mouse placenta at four stages of development between E10.5 and E17.5. Psg cDNA sequences were then amplified with PCR primers designed to amplify Psg16-Psg29 inclusive. Size fractionation of PCR products on an ethidium bromide-stained agarose gel indicates that mouse Psg genes are predominantly expressed from around E15.5, increasing in expression through to at least E17.5 (Fig. 4). However, after blotting the products onto nylon membranes and hybridising radiolabelled oligonucleotide probes specific for individual Psg genes (Table 2), we observed significant differences in expression profiles of different genes during development. This method is probably semi-quantitative at best but does give some indication of relative expression levels. We observed that Psg16 and Psg26 are weakly expressed at E15.5 but strongly expressed at E17.5. In contrast, Psg17, Psg18, Psg21 and Psg23 are expressed strongly at E15.5, further increasing by E17.5. Psg27 shows a similar expression pattern to these four Psgs, but at a relatively low level. Very weak expression was observed on E17.5 for Psg19, Psg20, Psg24, Psg25 and Psg29, whereas Psg22 and Psg28 were undetectable. Psg30, Psg31 and Psg32 domain structures had not been finalised and therefore their expression was not analysed in this experiment.
To supplement the PCR-based Psg expression studies, we performed 'virtual northern' analysis in silico by screening the public EST database for sequences matching Psg N1 or A-domains and counting the numbers of matches (Fig. 5). There was generally good concordance of the virtual data with the RT-PCR data; notably, Psg21 and Psg23 are highly represented in both datasets. However, disagreements were also evident, e.g. Psg16 expression was low in the RT-PCR data, but high in the virtual data. A random sample of twenty of the large number of Psg16 EST sequences in the database indicated that all were of placental origin, ruling out contamination with brain-derived sequences as an explanation for the disparity between RT-PCR and virtual analysis. There was also generally good agreement with the results from screening the EST database with N1 and A domain sequences, although the numbers of A-domain hits were 4-5 fold lower than the N1-domain hits. The only exception to this observation was that Psg30 and Psg31 sequences were identified in 2-fold greater abundance when screened with the A domain compared with the N1 domain. Despite some discrepancies, therefore, the combined RT-PCR and virtual Northern data demonstrate that developmental onset of expression, and maximum expression levels, vary considerably within the Psg family.
Table 1 (fragment). Psg-ps1: XR_000250, GNOMON prediction in NCBI. Footnotes: (a) Where nucleotide start and end positions are shown in parenthesis after accession numbers, they refer to the start and end positions of the genomic sequence excerpt (encompassing the PSG exons) that is included in Additional file 1. RC indicates that the sequence in Additional file 1 is the reverse complement. (b) Where we have predicted the full CDS of a PSG (based on common structure and splice sites), the numbers shown refer to exon start and end positions within the excerpted sequence included in Additional file 1.
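The virtual northern tally described in this section amounts to counting EST matches per gene and per query domain, then comparing N1-domain and A-domain counts. The Python sketch below shows one way such a tally could be organised. The hit list is invented purely to exercise the code and does not reproduce the counts in Fig. 5; the function names are likewise hypothetical.

```python
# Hypothetical sketch: tally EST hits per (gene, domain) and compute the
# ratio of N1-domain hits to A-domain hits. All counts below are invented.
from collections import Counter

def count_hits(hits):
    """Count EST matches from an iterable of (gene, domain) tuples."""
    return Counter(hits)

def n1_to_a_ratio(counts, gene):
    """Ratio of N1-domain hits to A-domain hits, or None if no A-domain hits."""
    n1_hits = counts[(gene, "N1")]
    a_hits = counts[(gene, "A")]
    return n1_hits / a_hits if a_hits else None

# Invented toy hit list, not real EST data.
TOY_HITS = (
    [("Psg21", "N1")] * 8
    + [("Psg21", "A")] * 2
    + [("Psg30", "N1")] * 1
    + [("Psg30", "A")] * 2
)

toy_counts = count_hits(TOY_HITS)
toy_ratios = {gene: n1_to_a_ratio(toy_counts, gene) for gene in ("Psg21", "Psg30", "Psg22")}
```

A ratio above 1 means the gene was recovered more often with the N1 query, and a ratio below 1 means the A-domain query recovered it more often. Genes with no A-domain hits return None rather than a misleading number.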

Mouse Psg locus genomic organisation
The published mouse Psg gene locus is contained on contig NT_039395. However, the complement of Psg genes is incomplete and the majority of gene sequences within the contig are unordered. We therefore decided to determine the organisation of Psg genes within the locus by screening BAC, YAC and cosmid clones using hybridisation with [...]mal chromosome 7 and are interspersed with other genes, particularly Ceacams, as determined by comparison with the published mouse genomic sequence on contig NT_039395. We did not observe any obvious correlation between the relative positions of the Psg genes at the locus and their domain arrangements or expression patterns.
[Figure: Domain organization of mouse PSGs]
There is a discrepancy with respect to the distance between the two subclusters. The currently poorly resolved data covering this region in the Ensembl assembly imply the presence of a gap between Psg29 and Psg32. However, we determined that the subclusters are fused between Psg32 and Psg30/Psg18. YAC F10104 (which is non-chi[...]
Figure 4 Expression of Psg mRNAs during placental development. Total RNA (1 µg) from day 10.5, 12.5, 15.5 and 17.5 BALB/c placentae was reverse transcribed using an oligo (dT) oligonucleotide (reverse PCR primer). After addition of the degenerate Psg-all oligonucleotide (forward PCR primer), which anneals to the cDNA of all known members of the mouse Psg family, Psg cDNAs were amplified by PCR (see schematic diagram depicting generalised mouse Psg cDNA amplification). Aliquots were size-separated by agarose gel electrophoresis. a, PCR products were visualised by ethidium bromide staining. b-o, the amplification products were blotted onto nylon membranes and individual blots were hybridised with single gene-specific 32P-labelled oligonucleotides from the N1 domain regions (Table 2). The location of the primers used for amplification of the Psg cDNAs and the region from which the sequences of the gene-specific oligonucleotides were derived are shown together with a schematic representation of mouse Psg mRNA.

Discussion
The human PSG genomic data in the public databases are relatively complete. For each PSG gene, there are annotated RefSeq resources comprising information on genomic structure, transcripts and translation products. The nomenclature is also standardised [37]. Further, there are accurate chromosome 19 locus assignments allowing complete visualisation of the PSG locus and surrounding genes. In contrast, a substantial quantity of mouse Psg genomic data in the public domain is fragmented, incomplete and somewhat unreliable. We sought to collate the existing genomic data, to present novel data to fill in gaps, and to provide a coherent resource of mouse Psg genomic data.
To determine whether the existing set of mouse Psg genes was complete we performed systematic BLAST searches of a variety of public DNA sequence databases. This analysis revealed the existence of a novel expressed Psg gene, which we name Psg31 in line with the accepted nomenclature convention [37]. Psg31 apparently evolved from a duplication of the whole of the Psg30 gene followed by a subsequent internal duplication of the N1 domain. We were also able to predict the complete coding sequences of four Psg genes for which previously only partial fragments were described. The gene, CDS and protein sequences of these predictions, coupled with a complete reference of all known mouse Psg CDS and primary protein sequences are provided in three attached Additional Files.
Using the full CDS information obtained for the complete set of mouse Psg gene sequences, domain structures for all family members were predicted. All of the PSG proteins possess previously described arrangements of Ig-like domains. Except for two members, discussed below, all are predicted to encode N-terminal secretory signal sequences. Our predicted novel splice variant of Psg16 has a complete N1 domain and secretory signal peptide sequence. Trace Archive EST database BLAST analysis confirmed that this variant is expressed in the placenta. In  contrast, the brain-expressed variant [38] has only a partial N1 domain and no secretory signal peptide. The previously described Psg-ps1 pseudogene [8] was found to be expressed in the placenta using Trace Archive EST database BLAST analysis and possesses an excellent Kozak sequence at the predicted translational initiation site. This evidence therefore indicates that this gene, which we rename Psg32, is not a pseudogene but a bona fide expressed Psg gene family member. The Psg32 transcript may encode a protein that is retained within the cytoplasm. We note that a precedent in the human exists in the form of a non-secreted splice variant of PSG11 [7].
Psg31 has the unusual N1-N1*-N2-N3-N4-N5-N6-N7-A domain structure. This newly characterised Psg gene has evolved from a duplication of the entire Psg30 gene followed by an internal duplication of the N1 domain. There may be functional significance associated with the N1 domain duplication. The complex nature of Psg gene evolution, including putative gene conversion and recombination events between family members [34], makes it difficult to analyse their evolution. Despite this, the data generated from ClustalX alignments and NJ trees enabled us to generate trees that allow prediction of the order of events of domain duplications in Psg24, Psg30 and Psg31. We note that the apparent route to domain number [...]
[Figure: Virtual Northern analysis of the mouse Psg genes. Axis labels: Psg16 to Psg32 in numerical order.]
[Figure labels, gene order: Psg16, Psg24, Psg29, Psg32, Psg30, Psg18, Psg31, Psg28, Psg22, Psg23, Psg27, Psg26, Psg25, Psg21, Psg20, Psg17, Psg19. BAC GenBank: AC138311.]
Having collated all known mouse Psg gene protein coding sequences and protein domain structures, the mouse Psg genomic locus on chromosome 7 remained to be determined to complete a comprehensive resource for the analysis of Psg function. The NCBI build 32 composite mouse assembly data revealed that only four Psgs had been mapped. Other Psgs on contig NT_039395 are currently unordered. We therefore screened cosmid, YAC and BAC libraries, and orientated Psg-containing clones to identify, where possible, the order of Psg genes within the locus.
We were not able to resolve all ambiguities in gene order on our map; however, where public database information is available, our data are in good agreement. We found no clear relationship between gene location and gene expression level suggesting that, within the Psg locus, each Psg gene is autonomously regulated.

Conclusions
The evolution and physiological functions of the relatively understudied mouse Psg gene family are poorly understood. This is a feature shared with other placentally-expressed, multigene families such as the prolactin and growth hormone genes [1]. In order to provide a comprehensive resource to facilitate functional studies of mouse Psg genes, including the generation of mouse mutants with modified Psg gene expression profiles, we have collated the entire set of mouse Psg genes, their predicted encoded proteins, and their evolutionary histories. The complete CDS data will enable the cloning, overexpression, and gene targeting of individual or multiple mouse Psg genes. This will facilitate the elucidation of their function and, by extrapolation, their human homologues, which may be involved in diseases of pregnancy.
[...] (Table 2) which bind to the N1, but not N2 and N3 exons of all known mouse Psg genes (except Psg32). The product was purified by electrophoresis on a 1.8% agarose gel and subcloned into pUC18 after blunt-ending (SureClone ligation kit: Pharmacia, Freiburg, Germany). The N1 exons from two of the 10 newly identified Psg genes (Psg28, Psg29) were analysed by sequencing recombinant plasmids which did not hybridise with oligonucleotide probes specific for known Psg genes (Table 2).

Mapping of the Psg locus
The presence of the different Psg genes within YAC, BAC and cosmid clones was first determined by PCR followed by hybridisation with oligonucleotides specific for individual Psg genes. DNA from Psg-containing YAC (100 ng) and cosmid clones (10 ng) were used to amplify the N1 domain exons of all known Psg genes in a total volume of 60 µl as described above. The N1 exon of Psg32 was amplified in a separate reaction using Psg32N1-F and PsgN1-R ( . The only exception is the Psg19-specific oligonucleotide which exhibits only 2 mismatches to the Psg22 sequence. However, the stringency of the post-hybridisation washes only allowed binding of oligonucleotides with a maximum of one mismatch. The specificity of the oligonucleotides and the hybridisation conditions was demonstrated on cosmid DNAs containing individual Psg genes. The identity of the Psg genes was verified by sequencing. No cross-hybridisation with other Psg genes was observed. The size of the YACs was determined by pulsed field gel electrophoresis followed by Southern blot hybridisation with the Psg17 cDNA clone pCea2b (see above) essentially as described previously [48].
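The specificity argument above (a probe is retained only if it has at most one mismatch to a target) can be expressed as a simple mismatch count. The Python sketch below is an illustration under stated assumptions: sequences are compared on the same strand for simplicity, the oligonucleotide and target sequences are invented, and the threshold of one mismatch is taken from the washing stringency described in the text rather than from any thermodynamic model.

```python
# Hypothetical sketch: count mismatches between an oligonucleotide and every
# same-length window of a target, then apply a maximum-mismatch threshold.

def min_mismatches(oligo, target):
    """Smallest number of mismatches between the oligo and any window of target."""
    oligo, target = oligo.upper(), target.upper()
    if len(oligo) > len(target):
        raise ValueError("oligo is longer than target")
    best = len(oligo)
    for start in range(len(target) - len(oligo) + 1):
        window = target[start:start + len(oligo)]
        mismatches = sum(a != b for a, b in zip(oligo, window))
        best = min(best, mismatches)
    return best

def would_bind(oligo, target, max_mismatches=1):
    """True if the best window has no more than `max_mismatches` mismatches."""
    return min_mismatches(oligo, target) <= max_mismatches

# Invented toy sequences, not the probes listed in Table 2.
TOY_OLIGO = "ACGTTGCA"
TOY_TARGET_EXACT = "GGACGTTGCATT"      # contains the oligo exactly
TOY_TARGET_TWO_MISMATCH = "GGACCTTGGATT"  # best window differs at two positions

exact_result = would_bind(TOY_OLIGO, TOY_TARGET_EXACT)
two_mismatch_result = would_bind(TOY_OLIGO, TOY_TARGET_TWO_MISMATCH)
```

Under this rule a probe with two mismatches to a non-target gene, as described for the Psg19-specific oligonucleotide and Psg22, is predicted not to be retained after washing.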

Fluorescence in situ hybridisation (FISH) analyses
The chromosomal location and chimerism of YAC clones were determined by FISH analyses, using B1-PCR of YAC DNA for probe preparation essentially as described [49]. Orientation and order relative to the chromosome 7 centromere and to each other of the two Psg gene subclusters was defined by FISH analysis using probes described in Fig. 6. FISH was performed essentially as described [50] on m5S cells [51] and concanavalin A-stimulated lymphocytes [52] from the C57BL/6CrS1c mouse strain.

RNA isolation, RT-PCR and specific detection of Psg cDNAs
BALB/c mice were mated overnight, and the next day plugged females were designated as day 0.5 of gestation. Pregnant females were killed by cervical dislocation and placentae were dissected free of maternal tissue, immediately frozen in liquid nitrogen and stored at -70°C. Total RNA was extracted by the acid phenol method [53]. The expression of individual Psg genes was studied by RT-PCR followed by hybridisation of the products with gene-specific oligonucleotides. Total RNA (1 µg) from placentae of different gestational stages was reverse transcribed in a total volume of 10 µl by avian myeloblastosis virus (AMV) reverse transcriptase (Promega, Mannheim, Germany) in the presence of 6 U/µl RNasin (Promega) using a degenerate oligo (dT)30 oligonucleotide (1 µM) as primer (Table 2). The reaction mix was adjusted to 1x Taq buffer (20 mM Tris-Cl, 16 mM (NH4)2SO4, pH 8.6), 3 mM MgCl2 and 0.4 mM dNTPs in a total volume of 100 µl. Amplification of all known Psg cDNAs (except for the cDNA of Psg32 (Cea6), which at the time of the experiment was presumed to be a pseudogene [8]) was achieved by PCR (denaturation: 94°C, 15 s; annealing: 58°C, 30 s; extension: 72°C, 3 min; 30 cycles) using Taq polymerase after addition of 400 pmoles of Psg-all (Table 2) and 50 pmoles of the oligo (dT) oligonucleotide as 5'- and 3'-primer, respectively. Ten µl aliquots each were size fractionated by electrophoresis on a 1% agarose gel, blotted onto a positively charged nylon membrane (Roche Diagnostics, Mannheim, Germany) and hybridised with individual 32P-