- Research article
- Open Access
Comparative genome analysis of cortactin and HS1: the significance of the F-actin binding repeat domain
BMC Genomics volume 6, Article number: 15 (2005)
In human carcinomas, overexpression of cortactin correlates with poor prognosis. Cortactin is an F-actin-binding protein involved in cytoskeletal rearrangements and cell migration by promoting actin-related protein (Arp)2/3 mediated actin polymerization. It shares high amino acid sequence and structural similarity with hematopoietic lineage cell-specific protein 1 (HS1), although their functions differ considerably. In this manuscript we describe the genomic organization of these two genes in a variety of species by a combination of cloning and database searches. Based on our analysis, we predict the genesis of the actin-binding repeat domain during evolution.
Cortactin homologues exist in sponges, worms, shrimps, insects, urochordates, fishes, amphibians, birds and mammals, whereas HS1 exists in vertebrates only, suggesting that both genes are derived from an ancestral cortactin gene by duplication. In agreement with this, comparative genome analysis revealed very similar exon-intron structures and sequence homologies, especially over the regions that encode the characteristic, highly conserved F-actin-binding repeat domain. Cortactin splice variants affecting this F-actin-binding domain were identified not only in mammals, but also in amphibians, fishes and birds. In mammals, cortactin is ubiquitously expressed except in hematopoietic cells, whereas HS1 is mainly expressed in hematopoietic cells. In accordance with their distinct tissue specificity, the putative promoter region of cortactin differs from that of HS1.
Comparative analysis of the genomic organization and amino acid sequences of cortactin and HS1 provides insight into their origin and evolution. Our analysis shows that both genes originated from a gene duplication event and that HS1 subsequently lost two repeats, whereas cortactin gained one repeat. Our analysis genetically underscores the significance of the F-actin binding domain in cytoskeletal remodeling, which is of importance for the major role of HS1 in apoptosis and for cortactin in cell migration.
Cortactin (also designated EMS1, CTTN, cttn, Amplaxin; see Genecard) was initially identified as one of the most prominent tyrosine phosphorylated proteins in v-Src infected chicken embryo fibroblasts. Cortactin was independently isolated from mouse NIH3T3 cells and human tumor cell lines. Human cortactin is encoded by the EMS1 gene, which is located on chromosome 11q13 [4, 5]. Gene amplification of the 11q13 region and concomitant overexpression of cortactin frequently occur in several human carcinomas [4, 6–8] and correlate with lymph node metastasis and increased mortality [9–11]. Elevated expression of cortactin increases cell motility, invasion [12–14] and metastasis.
The deduced amino acid sequence of cortactin revealed three main distinguishable domains: the N-terminal acidic domain containing a DDW-Arp2/3 binding motif, followed by an F-actin binding domain of six and one-half 37-amino-acid repeats; a central region; and an SH3 domain at the very C-terminus. The DDW-Arp2/3 binding site and the actin-binding domain together regulate F-actin polymerization and dynamics by activating the Arp2/3 complex, and both are necessary for translocation of cortactin to sites of actin polymerization. Recently, we reported the identification of two alternative splice variants of human cortactin lacking either the 6th repeat or the 5th and 6th repeats, present in normal tissues as well as in squamous cell carcinoma cell lines. These splice variants differ significantly in their ability to (i) bind F-actin, (ii) cross-link F-actin, (iii) activate Arp2/3 mediated actin polymerization and (iv) induce cell migration in vitro. This indicates that the number of repeats also determines the affinity for F-actin and the ability to regulate cell migration. Similar cortactin splice variants were also reported in the mouse, rat and frog. The SH3 domain is a conserved protein module found in various signaling proteins and mediates the interaction with various proteins such as N-WASP involved in actin polymerization, dynamin-2 in endocytosis, ZO-1 in cell-cell interactions and SHANK-2 in neuronal growth cones. The central part of the protein between the F-actin repeat domain and the SH3 domain contains an alpha-helix sequence and a proline-rich region with three c-Src tyrosine phosphorylation sites [22, 23] and three serine/threonine phosphorylation sites. Cortactin tyrosine phosphorylation occurs in response to growth factor treatment, integrin cross-linking, bacterial invasion and cell shrinkage. Tyrosine phosphorylation of cortactin reduces its F-actin cross-linking activity and is required for its ability to stimulate cell migration.
Since cortactin operates mainly in cytoskeletal rearrangements, it may link other proteins via its SH3 domain to sites of actin polymerization. Alternatively, serine phosphorylation of cortactin by Erk enhances, whereas Src phosphorylation inhibits, the activation of N-WASP by cortactin and as a result affects actin polymerization. This suggests that cortactin may in the first instance be directed to the site of actin polymerization by other proteins. Thus, changes in protein expression level, phosphorylation state, the relative expression of splice variants and interactions with other proteins can all influence cell migration.
Cortactin shows the highest similarity to the hematopoietic lineage cell-specific protein 1 (HS1). Human HS1 (also designated HCLS1; see Genecard) was originally isolated by its homology to the adenovirus E1A gene. The overall similarity of HS1 to cortactin at the amino acid level is 51%, but it is highest in the SH3 domain (86%) and the 37-amino-acid repeat domain (86%), except that HS1 carries only three and one-half repeats. Despite this high homology, the function of HS1 differs considerably from that of cortactin. First, HS1 is mainly expressed in hematopoietic cells, whereas cortactin is widely expressed in all cell types except most hematopoietic cells. Both genes are expressed together only in platelets and megakaryocytes [29, 30]. Second, in concordance with this tissue distribution, HS1 is tyrosine phosphorylated after receptor cross-linking in B-cells, T-cells, mast cells and erythroid cells, but at different residues compared to the functional phosphorylation residues in cortactin [13, 23]. Third, HS1 is, like cortactin, a cytoplasmic protein, but after tyrosine phosphorylation HS1 translocates to the nucleus, whereas cortactin is never found in the nucleus. This is because HS1, but not cortactin, contains a nuclear localization signal (NLS) [36, 37]. Fourth, HS1 plays an important role in receptor-mediated apoptosis and proliferative responses, as demonstrated by the analysis of HS1-deficient mice and WEHI-231 B lymphoma cells [37, 39]. An HS1 tyrosine mutant that could not translocate to the nucleus also failed to induce apoptosis. Consistent with its role in apoptosis, HS1 is able to bind to the mitochondrial protein HAX-1, a Bcl2-like protein. Finally, the SH3 domain at the C-terminus of HS1 binds to proteins (Ste20-related kinase HPK1 and HS1-BP3) other than those that bind to cortactin, despite the very high amino acid sequence similarity of both SH3 domains (86%). This most probably reflects the different tissue-specific expression patterns.
Cortactin and HS1 also share remarkable similarities. First, HS1 binds directly to Arp2/3 with its DDW motif and is involved in Arp2/3 mediated actin polymerization in vitro, although less efficiently than cortactin. Second, HS1 binds to F-actin with its 37-amino-acid repeat domain; however, in contrast to cortactin, it contains only three and one-half repeats. Third, HS1 splice variants have also been detected, such as a variant lacking the 3rd repeat of the F-actin binding domain in a systemic lupus erythematosus (SLE) patient, resulting in increased apoptosis after B-cell receptor (BCR) stimulation. Fourth, HS1 is sequentially phosphorylated on three tyrosine residues by various Src family tyrosine kinases [31, 45] and on two serine/threonine residues, although at different residues than cortactin. Finally, both cortactin and HS1 can accumulate in podosomes, structures found in osteoclasts and macrophages, but also in RSV transformed cells and carcinoma cells.
Although cortactin and HS1 share high amino acid sequence and structural similarity, their functions differ considerably. In this paper, we compare their genomic organization in order to provide more insight into their evolution, which may form the basis for understanding the specific functions of both genes. We describe the genomic organization and the exon-intron boundaries for human cortactin. The genomic, cDNA and deduced amino acid sequences of human cortactin were compared to cortactin and HS1 genes from other species. Genomic comparisons revealed the evolution and underscore the significance of the conserved F-actin binding repeat domain for HS1 and cortactin, as well as the importance of alternative splicing for cortactin function.
Results and discussion
The genomic organization of cortactin homologues
We have previously described the isolation and sequencing of the EMS1 cDNA [28, 49] (DDBJ/EMBL/GenBank Accession No. M98343) coding for the human cortactin protein. To evaluate the genomic structure, we determined the exon/intron boundaries. Nucleotide sequence comparisons with the human EMS1 cDNA sequence revealed homology with two human genomic clones (DDBJ/EMBL/GenBank Accession No. AP000487 and AP000405) (Table 1). The genomic structure of the EMS1/cortactin gene was determined by performing BLASTn comparisons of the EMS1 cDNA against the genomic clones (Figure 1A). By amplifying the intron sequences (smaller than 2 kb) using primers on adjacent exons, followed by end-sequencing of these products, we confirmed the intron/exon boundaries of the human EMS1/cortactin gene. The EMS1 gene contains 18 exons spanning about 38 kb of genomic DNA. The length of the individual exons ranges from 55 to 178 bp, except for the last exon (1564 bp). The splice donor and acceptor sequences and the sizes of the introns and exons of the human EMS1/cortactin gene are provided in the supplementary materials [see Additional file 1]. The ATG is at position 169, at the first nucleotide of exon 3, indicating that the first two exons encode the 5' untranslated region (UTR). The F-actin-binding repeat domain is encoded by exons 5 to 12, with 5 exons of 111 nucleotides in length (exons 6, 8, 9, 10 and 11) (Figure 1A and [see Additional file 1]). The sequence encoding the DDW Arp2/3 binding site is located within exon 3 and the SH3 domain is encoded by exons 17 and 18. The 3' UTR is 1420 nucleotides in length with a polyadenylation signal AATAAA at position 3225.
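The relation between the 111-nucleotide exons and the 37-amino-acid repeats can be checked with simple arithmetic. The sketch below is only an illustration: the function names are hypothetical, and the 178 bp value is used merely as an exon length mentioned in the text, not as a statement about any particular exon.

```python
CODON_NT = 3  # nucleotides per codon


def residues_encoded(exon_nt: int) -> int:
    """Return the number of whole codons in an exon whose length is a multiple of three."""
    if exon_nt % CODON_NT != 0:
        raise ValueError("exon length is not a whole number of codons")
    return exon_nt // CODON_NT


def skipping_keeps_frame(exon_nt: int) -> bool:
    """True if removing an exon of this length leaves the downstream reading frame unchanged."""
    return exon_nt % CODON_NT == 0


if __name__ == "__main__":
    print(residues_encoded(111))      # 111 nt correspond to 37 codons
    print(skipping_keeps_frame(111))  # True: multiple of three
    print(skipping_keeps_frame(178))  # False: 178 is not a multiple of three
```

Because 111 is a multiple of three, an exon of this length can be skipped without shifting the reading frame of the downstream exons, which is compatible with splice variants that lack a complete repeat.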
Other cortactin homologues have been reported in mouse, rat, chicken, fruit fly (Drosophila melanogaster) and frog (Xenopus laevis). We searched numerous databases for all known cortactin genes in other species (listed in Table 1). The identification is based on overall amino acid sequence and overall structural homology with human cortactin. Cortactin homologues exist in mammals (human, chimpanzee, cattle, pig, mouse, rat), birds (chicken), amphibians (frog), fishes (zebrafish, pufferfish), urochordates (sea squirt), invertebrates (sea urchin), insects (fruit fly, mosquito), shrimps, worms and sponges. To date, there is no evidence for the existence of cortactin in unicellular species, nor in plants. Thus, cortactin seems to be restricted to metazoans.
For several species, both cDNA and genomic sequences (total or partial) are available and therefore we were able to reveal their genomic organization using BLASTn. The exon/intron boundaries were determined and compared to human cortactin [see Additional file 1]. As schematically presented in Figure 1, the genomic organization and the lengths of the exons as well as the locations of the exon/intron boundaries are highly conserved from urochordates to mammals. Pufferfishes have the shortest known genome of all vertebrate species due to much shorter introns; nevertheless, most exon/intron boundaries were conserved and similar to those of mammalian cortactin. Intriguingly, the number of repeats in the actin-binding domain differs between species (Figure 1A–G). The number of exons and the location of the intron/exon borders of insect cortactin (Drosophila and mosquito) differ considerably from those of mammalian cortactin, although the protein sequences are very similar. Drosophila and mosquito carry 4 repeats in the actin-binding domain. In both species, repeats 1 to 3 and repeat 4 are on separate exons; in mosquito, the 4th repeat of the actin-binding domain is encoded by a single exon of 111 bp, exon 2 (Figure 1F,G). Both sponge (the lowest metazoan) and sea squirt (urochordate) cortactin proteins carry 5 repeats. During evolution, after the emergence of sponges and worms, the coelomata divided into insects and urochordates (which later evolved into vertebrates). The genomic organization of ancestors of the coelomata should reveal the roots of cortactin evolution. However, complete cDNA and/or genomic DNA of cortactin homologues in these species are not yet available.
The genomic organization of HS1 homologues
Both nucleotide and amino acid sequence comparisons with cortactin revealed the highest similarity with the hematopoietic lineage cell-specific protein 1 (HS1). So far, HS1 homologues have been reported in human, mouse, rat and chimpanzee (NCBI database), suggesting that HS1 exists in mammals only. We determined the intron/exon boundaries of mammalian HS1 genes by aligning the cDNA with the genomic DNA using BLASTn (Figure 1H and [see Additional file 2]). The number and lengths of the exons and the locations of the exon/intron boundaries were very similar to those of cortactin, especially in the exons that encode the actin-binding domain (compare [see Additional file 1] and [see Additional file 2]). Exons 10–13 of HS1, encoding the central region between the actin-binding domain and the SH3 domain, are longer (633 bp versus 489 bp in cortactin) and more divergent than the corresponding exons of cortactin.
In addition to a single cortactin homologue in all other species, nucleotide sequence comparisons using the mammalian HS1 mRNA and genomic DNA sequences revealed (incomplete) genomic sequences in chicken, pufferfish, zebrafish and frog (Table 1 and Figure 1I–M) that were more closely related to the HS1 protein (Figure 3 and [see Additional file 3]). Because no HS1 homologues for these species were present in the mRNA/dbEST database (except for X. laevis HS1), the cDNA (and corresponding protein) sequences were deduced from the genomic DNA with BLASTn or were predicted by the Ensembl program. In these lower species, two cortactin-related proteins exist. To distinguish between cortactin and HS1 variants, only the most conserved N-terminal part of the cortactin and HS1 protein variants, including repeat 3 (corresponding to amino acids 1–190 of human cortactin), was used in BLASTp analysis. In each species, one protein variant turned out to be more homologous to human cortactin and was called cortactin, whereas the other protein variant appeared to be more closely related to HS1 and was called HS1. This analysis unveiled HS1 proteins with more than 3 repeats in chicken and pufferfish Tetraodon nigroviridis (containing 4 1/2 repeats), pufferfish Takifugu rubripes and Xenopus laevis (5 1/2 repeats) and zebrafish (6 1/2 repeats) (Figure 1I–M).
Moreover, alignments of the exon/intron boundaries of these HS1 genes to the mammalian HS1 genes [see Additional file 2] revealed that exon 7 (repeat 3) of HS1 was most similar to exon 10 (repeat 5) of cortactin, suggesting that in mammals exons 8 and 9 (repeats 3 and 4) of HS1 were lost during evolution. This is supported by the presence of at least one sequence of 111 nucleotides in the 5670 bp intron 6 of human HS1 (location 3271–3381) that is predicted by the program HMMER when performing alignments using a consensus sequence of the 37-amino-acid repeats. However, this sequence is not functional, because it does not represent an exon based on the consensus sequence of exon-intron junctions ('gt ... ag' rule of intron sequences) and no human transcripts or ESTs of HS1 including this sequence are present in the NCBI databases. In summary, HS1 is not restricted to mammals, but also exists in fishes, amphibians and birds, and its genomic structure is very similar to that of cortactin.
Different promoter regions explain distinct tissue specificity of cortactin and HS1
Cortactin is widely expressed in most cell types, suggesting that it is important for vital functions, while HS1 expression is restricted to hematopoietic cells, suggesting that it was tailored later in evolution to serve a specific function in these cells. In concordance with their tissue-specific expression patterns, we supposed that their expression might be regulated differently. Therefore, we compared the upstream promoter regions of several cortactin and HS1 genes (Figure 2). The mammalian cortactin gene is very GC rich and contains putative SP-1 transcription factor binding sites that are common to many TATA-less promoters and typical for promoter regions in 'widely-expressed housekeeping genes'. Ets family transcription factors, found in the HS1 promoters, are specific for hematopoietic cells, are involved in controlling the expression of many B cell- and macrophage-specific genes, and are critical for the development of lymphoid and myeloid cell lineages. The promoter regions of Drosophila and mosquito cortactin share putative transcription factor binding sites found in both mammalian cortactin and HS1. Thus, at least in mammals, the nature of the promoters seems to determine the broad distribution of cortactin expression in various tissues except most hematopoietic cells, and the restriction of HS1 expression to hematopoietic cells.
The significance of the actin binding repeat domain in cortactin and HS1
We recently reported the identification of two alternative splice variants of human cortactin: SV1-cortactin, lacking the 6th repeat, and SV2, lacking the 5th and 6th repeats, resulting in different F-actin binding properties and decreased cell migration. As shown in Table 1, cortactin splice variants exist in other mammals as well as in chicken and frog. So far, splice variants in other species have not been identified, suggesting that alternative splicing of cortactin is restricted to higher metazoans. All intron sequences of cortactin bordering the splice site junctions follow the general GT/AG rule except for intron 11 (GC/AG) [see Additional file 1]. As has been shown for other genes, a GT-to-GC transition might be responsible for the generation of an alternative mRNA transcript. However, in frog (Xenopus laevis), the SV1-cortactin variant exists although the splice donor of intron 11 begins with a GT. Thus, across the genomes of these species, alternative splicing of the actin-binding domain of cortactin seems to have been facilitated during evolution by a GT-to-GC transition that modulates the splicing machinery, creating cortactin variants that influence cellular properties. The relative expression of cortactin splice variants by tissue origin suggested that splice variants might have tissue-specific functions, such as fine-tuning the organization of the F-actin cytoskeleton and consequently regulating cell adhesion and migration.
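The GT/AG rule referred to above can be expressed as a check on the first and last two nucleotides of an intron. The following is a minimal sketch, not the procedure used in this study; the function name and the toy intron sequences are hypothetical and serve only to illustrate the distinction between a GT and a GC splice donor.

```python
def classify_intron(intron_seq: str) -> str:
    """Classify an intron by its splice donor (first 2 nt) and acceptor (last 2 nt)."""
    seq = intron_seq.upper()
    if len(seq) < 4:
        raise ValueError("intron sequence too short to hold a donor and an acceptor")
    donor, acceptor = seq[:2], seq[-2:]
    if (donor, acceptor) == ("GT", "AG"):
        return "GT/AG"
    if (donor, acceptor) == ("GC", "AG"):
        return "GC/AG"
    return "other"


if __name__ == "__main__":
    # Toy sequences invented for illustration only.
    print(classify_intron("gtaagtccttctctttcag"))  # GT/AG
    print(classify_intron("gcaagtccttctctttcag"))  # GC/AG
    print(classify_intron("ataagtccttctctttcac"))  # other
```

Applied to the donor and acceptor sequences listed in an additional file, such a check would flag introns like intron 11 of human cortactin, which the text describes as GC/AG.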
Alternative splicing also occurs in human HS1. Recently, a splice variant lacking the 3rd repeat (exon 7) was found in an SLE patient, resulting in enhanced BCR-mediated cell death. This alternative splicing event was due to a germ line mutation. In contrast, the splice donor of HS1 intron 6 begins with a GC [see Additional file 2]. With respect to the similarities between cortactin and HS1, it might be of interest to investigate the occurrence of splicing of HS1 exon 6 and its possible biological consequences. The 3rd repeat and its NLS link HS1 to a role in apoptosis, while such a role has not been described for cortactin, which lacks an NLS. Since the cytoskeleton architecture in hematopoietic lineage cells is very different from that in adherent cells, it is likely that HS1 plays an important role in the construction of tissue-type specific actin networks. Other types of actin cytoskeleton factors, such as the Arp2/3 complex activators of the WASP family, have been reported to have distinct tissue-specific expression profiles as well. Thus, the apparent role of HS1 in apoptosis is likely due to its function in actin remodeling. Additionally, our genomic comparisons revealed that the 3rd repeat of HS1 corresponds to the 5th repeat of cortactin, and therefore it might be of interest to investigate whether the cortactin SV2 variant (lacking the 5th and 6th repeats) might be involved in apoptosis.
The 4th repeat of cortactin has been suggested to be required for F-actin binding. Genomic comparisons revealed that HS1 lacks this 4th repeat. Nonetheless, HS1 does bind to F-actin and activate the Arp2/3 complex, although at a lower efficiency than cortactin. This suggests that not a single repeat but the number of repeats is crucial for the F-actin-binding affinity [14, 18]. In addition, HS1 contains a PIP2 binding site in each of its 3 repeats, whereas cortactin has only one, in the 4th repeat. PIP2 reduces F-actin cross-linking by cortactin, probably due to competition for the same binding site. Due to its higher affinity for PIP2, HS1 restores this cortactin/F-actin cross-linking process by trapping PIP2. This might be of importance in platelets and megakaryocytes, where both cortactin and HS1 are expressed. Taken together, the composition of the repeat domain is also involved in the divergence of the functions of both genes.
An elegant way to study the function of a protein is to perform loss-of-function experiments. So far, cortactin knock-out models have not been generated successfully, because deletion of one allele of cortactin leads to premature differentiation of embryonic stem cells (reported as a personal communication). However, complete loss-of-function mutants of the Drosophila cortactin gene were viable and fertile, except for impaired border cell migration during oogenesis. Down-regulation of cortactin by RNA interference revealed an essential role for cortactin in dendritic spine morphogenesis and in E-cadherin mediated contact formation in epithelial cells. Mice lacking HS1 showed normal development of the lymphoid system; however, the antigen-receptor induced clonal expansion and deletion of B and T lymphocytes were impaired. Thus, loss-of-function studies underscore the divergent functions of HS1 and cortactin in different cell systems.
Cortactin and HS1 are derived from an ancestral vertebrate cortactin-gene by gene duplication
To examine the genesis of the cortactin family, we studied the relationship between the cortactin and HS1 homologues by generating a phylogenetic tree based on a multi-sequence alignment with the ClustalW 1.83 program [see Additional file 3]. We compared the N-terminal regions including repeat 3 (corresponding to amino acids 1 to 190 of human cortactin), because this is the best-conserved region among all homologues (Figure 3). One cluster contains all known HS1 proteins and appeared to be most closely related to a cluster composed of insect (mosquito (Ag), Drosophila (Dm)), urochordate (sea urchin (Sp)) and sponge (Sd) cortactin. This last cluster contains all the species with only one gene (with the highest similarity to cortactin). This suggests that with the appearance of the vertebrates, an ancestral gene became duplicated to create two genes, which later evolved into cortactin and HS1. This hypothesis is supported by the facts that many genes duplicated at this stage in evolution, that the overall amino acid sequences of both genes are very similar and that the introns are located at the same amino acid positions. Furthermore, gene duplication often correlates with a tissue-specific expression pattern of the duplicated genes, which is true for mammalian cortactin and HS1.
Figure 4 displays a hypothetical model for the origin of the cortactin and HS1 genes during evolution. The oldest ancestor is the sponge, which, like sea squirt (urochordate), carries one cortactin protein with 5 1/2 repeats. Insects also have one cortactin gene, which evolved to 4 1/2 repeats. During evolution, after the emergence of the sponge and the worms, the coelomata divided into insects and urochordates (which later evolved into vertebrates). This suggests that during evolution the number of repeats decreased in the insects. Unfortunately, no genomic sequences of ancestors of the coelomata that could reveal the roots of cortactin evolution are available yet to perform a more detailed genomic analysis.
The genome of pufferfish Takifugu rubripes contains two cortactin-related genomic sequences, both including 5 1/2 repeats. Most likely, an ancestral vertebrate cortactin gene underwent gene duplication. From this moment on during evolution, two cortactin/HS1-related genes are present in all higher species. One gene evolved to mammalian HS1, with a specific function in apoptosis in hematopoietic cells. For this function, exons 8 and 9 (encoding repeats 3 and 4) were not useful and were lost during evolution. However, the HS1 protein in pufferfish Takifugu rubripes and frog Xenopus laevis contains 5 1/2 repeats, while chicken and pufferfish Tetraodon nigroviridis HS1 carries 4 1/2 repeats. It might be of interest to investigate the function of these HS1 proteins and their functional differences from mammalian HS1. The other gene evolved to a ubiquitously expressed mammalian cortactin protein with a vital function in the organization of the cytoskeleton and cell migration. The 6th repeat of cortactin most likely originated from a duplication event of the 5th repeat, since the 6th repeat is most similar to the 5th repeat in all species with 6 1/2 repeats. We recently demonstrated that 6 1/2 repeats are necessary for optimal F-actin cross-linking activity and cell migration, while the splice variant lacking both the 5th and 6th repeats (SV2) was less efficient. Thus, the number of repeats in the F-actin binding domain of cortactin fine-tunes its function in cytoskeletal remodeling. For that reason, in higher metazoans, alternative splicing of the F-actin binding domain is most likely facilitated by a GT-to-GC transition in the splice donor. Alternatively, we cannot exclude that gene duplication might have taken place after duplication of the 5th repeat (dotted arrows), since both zebrafish cortactin and HS1 contain 6 1/2 repeats.
We report the genomic organization of cortactin and HS1 genes of several species. These genes display a conserved genomic organization, as the coding regions have almost identical exon/intron structures. Comparison of the 5' sequences reveals possible regulatory elements that account for their specific tissue distribution. Comparative analysis of the genomic organization and amino acid sequences of cortactin and HS1 provides insight into the evolution of the conserved actin-binding repeat domain, which forms the basis for understanding the specific functions of both genes. Most likely, both genes originated from a gene duplication event and HS1 subsequently lost two repeats, whereas cortactin gained one repeat. Our analysis genetically underscores the significance of the F-actin binding domain in cytoskeletal remodeling, which is of importance for the major role of HS1 in apoptosis and for cortactin in cell migration.
The genomic structure of human cortactin
To determine the genomic structure of the human cortactin gene, an algorithm was applied based on the consensus sequence of exon-intron junctions ('gt ... ag' rule of intronic sequence) as well as on the codon usage within the ORF. Nucleotide sequence comparisons with human cortactin sequences (NCBI, GenBank accession no. M98343) using BLASTn revealed homology with two genomic clones (GenBank accession no. AP000487 and AP000405). With these clones, we determined all exon/intron boundaries and the sizes of all introns and exons (Table 2A) of the human cortactin gene by (1) performing BLAST comparisons of the cDNA against the genomic DNA and (2) using the GeneFinder program, which is based on the consensus sequence of exon-intron junctions ('gt ... ag' rule of intronic sequence) as well as on the codon usage within the ORF.
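Once the cDNA has been aligned to the genomic sequence, exon and intron sizes follow directly from the genomic coordinates of the aligned blocks. The sketch below illustrates this bookkeeping under stated assumptions: coordinates are 1-based and inclusive, exons are given in genomic order, and the coordinates in the example are invented and do not correspond to the cortactin gene.

```python
from typing import List, Tuple


def exon_intron_sizes(exons: List[Tuple[int, int]]) -> Tuple[List[int], List[int]]:
    """Compute exon and intron lengths from sorted, 1-based inclusive exon coordinates."""
    for start, end in exons:
        if end < start:
            raise ValueError("exon end precedes exon start")
    exon_sizes = [end - start + 1 for start, end in exons]
    intron_sizes = []
    for (_, prev_end), (next_start, _) in zip(exons, exons[1:]):
        if next_start <= prev_end:
            raise ValueError("exons overlap or are not sorted")
        # The intron spans the positions strictly between two neighbouring exons.
        intron_sizes.append(next_start - prev_end - 1)
    return exon_sizes, intron_sizes


if __name__ == "__main__":
    # Hypothetical coordinates for three exons on a genomic clone.
    toy_exons = [(101, 200), (501, 611), (1001, 1111)]
    print(exon_intron_sizes(toy_exons))
```

For the toy input, the exon sizes are 100, 111 and 111 nt and the intron sizes are 300 and 389 nt.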
To confirm the predicted genomic structure, we determined the intron/exon boundaries using a cloning procedure as described. Genomic DNA of two cosmid clones, COS-7.12 and COS-3.72, covering the cortactin gene as determined by the full-length cDNA, was amplified with randomly selected primers from the cDNA sequence (GenBank accession no. M98343). All PCR products that were larger than the cDNA control sample were considered to contain intron sequences and were compared to the genomic sequences (accession numbers AP000487 and AP000405) using BLASTn. Introns 1, 5, 8, 12 and 13 were too large to obtain a reliable sequence.
Because no overlapping genomic sequences immediately 5' of the first exon were present in the database, we performed sequence analysis of a 2.7-kb HincII-HincII fragment representing the first exon and its 5'-flanking sequences from cosmid COS-7.12 cloned into pUC18 (p5'EMS_3135). In addition, we sequenced a 5-kb PCR product obtained using a 5'-primer in the vector (within the TET gene) and a 3'-primer (p3135p601: 5'-ccgggtcggccctggattcc-3') within exon 1, subcloned in pUC18 (p5'EMS_4911). Nucleotide sequences of both products were compared with the genomic clones representing the cortactin gene present in the NCBI database (Accession number AP000487 (GI 8118774 and GI 6277297) and AP000405 (GI 8118742)) and used to define the 7.4 kb 5'-flanking region. The PROSCAN program from BIMAS was used to define the 316 bp promoter region preceding exon 1. Putative transcription factor binding sites were determined by the TFSEARCH program and are graphically represented in Figure 2. Sequences from human cortactin were submitted to NCBI GenBank as accession No. M98343 (cDNA) and AJ288897 (promoter).
The (deduced) protein and genomic sequences of all cortactin and HS1 genes were retrieved from various websites and their available sequence data are summarized in Table 1. In addition, partial cortactin sequences (ESTs and/or genomic) of various organisms were identified based on amino acid sequence homology with existing cortactin proteins. The genomic organization of the sea squirt and Takifugu rubripes could not be completely elucidated, because cDNA/genomic sequences were only partially available. All data were compiled using BLAST searches of the following databases: National Center for Biotechnology Information (NCBI) (Bethesda, MD, USA); The Wellcome Trust Sanger Institute (Cambridge, UK); EnsEMBL of The Wellcome Trust Sanger Institute (Cambridge, UK); DOE Joint Genome Institute (Walnut Creek, CA, USA); TIGR: The Institute for Genomic Research (Rockville, MD, USA); DNA Data Bank of Japan (Mishima, Shizuoka, Japan); Nematode.net Genome Sequencing Center (St. Louis, MO, USA); Wormbase (NY, USA); European Bioinformatics Institute (EBI) (Cambridge, UK); Genoscope National Sequencing Center (Evry, France); The U.S. Poultry Gene Mapping Project (MI, USA) and UCSC Genome Bioinformatics (Santa Cruz, CA, USA).
To determine the exon/intron boundaries of all cortactin and HS1 genes, available genomic sequences were aligned with each species-specific cDNA sequence using the BLAST program of NCBI. Using the same algorithms as described for human cortactin, the exon/intron boundaries could be predicted. The complete genomic sequences of the 5' flanking region of cortactin of human, chimpanzee, mouse, rat, fruit fly and mosquito were determined using the various accession numbers of genomic DNA in Table 1. Putative transcription factor binding sites in 800 bp of the 5' flanking regions were determined by the TFSEARCH program (Figure 2). The putative exon in intron 6 of HS1 was predicted with the bioinformatics program HMMER. The 6 1/2 repeats of the human cortactin actin-binding domain were aligned, resulting in a consensus sequence (kfGvqkdrvDksAvGfdyqekvekhesqkDysk). With HMMER, this consensus sequence was searched ('tBLASTn') against intron 6 of human HS1. With an acceptable probability (E-value 0.095), the program predicted an exon in this intron 6 (at location 3271–3381).
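Deriving a consensus from aligned repeats can be illustrated with a simple column-wise majority vote. This is a simplified sketch and not the profile model that HMMER builds; the function name and the short toy alignment are hypothetical. The convention of writing invariant positions in upper case and variable positions in lower case is an assumption made here for readability.

```python
from collections import Counter
from typing import List


def majority_consensus(aligned: List[str]) -> str:
    """Column-wise majority consensus of equal-length aligned sequences.

    Positions identical in all sequences are written in upper case,
    all other positions show the most frequent residue in lower case.
    Gap characters are treated like any other symbol.
    """
    if not aligned:
        raise ValueError("no sequences given")
    length = len(aligned[0])
    if any(len(seq) != length for seq in aligned):
        raise ValueError("sequences must have equal length")
    consensus = []
    for column in zip(*aligned):
        residues = [residue.upper() for residue in column]
        residue, count = Counter(residues).most_common(1)[0]
        consensus.append(residue if count == len(residues) else residue.lower())
    return "".join(consensus)


if __name__ == "__main__":
    # Toy alignment invented for illustration; not the real cortactin repeats.
    print(majority_consensus(["KFGV", "KYGV", "RFGV"]))  # kfGV
```

A profile-based search retains position-specific residue frequencies instead of a single residue per column, so a sketch like this only conveys the idea of summarizing the repeats before searching an intron for a related sequence.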
Amino acid sequence comparisons
Sequence alignments were carried out using the BLAST program of NCBI. The multiple sequence alignments of various cortactin proteins were constructed using Basic GeneBee ClustalW 1.83. The genomic, cDNA or protein sequence was completed for all cortactin homologues, and the number of repeats differs across species and between HS1 and cortactin. Only the N-terminal part of the cortactin and HS1 proteins up to and including repeat 3 (corresponding to amino acids 1–190 of human cortactin) was used to generate a phylogenetic tree, because this is the most conserved part. Predicted nuclear localization signal sequences were obtained using the Predict NLS program.
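Restricting an alignment to the region that corresponds to residues 1–190 of a reference protein requires mapping a residue number to an alignment column, because gaps shift the positions. The following Python sketch shows one way this could be done, together with a simple pairwise percent identity over the truncated rows. It is an illustration under stated assumptions, not the ClustalW-based procedure used here: the function names are hypothetical, and the short sequences are toy data rather than cortactin or HS1 sequences.

```python
def columns_through_residue(aligned_ref: str, residue: int) -> int:
    """Return the number of alignment columns needed to include the given
    residue (1-based) of the gapped reference row."""
    seen = 0
    for col, ch in enumerate(aligned_ref, start=1):
        if ch != "-":
            seen += 1
            if seen == residue:
                return col
    raise ValueError("reference has fewer residues than requested")


def percent_identity(a: str, b: str) -> float:
    """Percent identity over columns where neither aligned row has a gap."""
    if len(a) != len(b):
        raise ValueError("aligned rows must have equal length")
    pairs = [(x, y) for x, y in zip(a, b) if x != "-" and y != "-"]
    if not pairs:
        return 0.0
    matches = sum(x == y for x, y in pairs)
    return 100.0 * matches / len(pairs)


# Toy aligned rows (hypothetical); a real analysis would use residue 190
# of the human cortactin row as the cut-off.
ref_row = "MW-KAS"
other_row = "MWQKGS"
cut = columns_through_residue(ref_row, 3)
truncated_identity = percent_identity(ref_row[:cut], other_row[:cut])
full_identity = percent_identity(ref_row, other_row)
```

Cutting by alignment column rather than by raw string index keeps all rows in register, so the truncated rows can be passed unchanged to a tree-building program.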
EST: expressed sequence tag
HS1: hematopoietic lineage cell-specific protein 1
NLS: nuclear localization signal
RT-PCR: reverse transcriptase polymerase chain reaction
Genecard cortactin. 2005, [http://bioinfo.weizmann.ac.il/cards-bin/carddisp?CTTN&search=EMS1&suff=txt]
Wu H, Reynolds AB, Kanner SB, Vines RR, Parsons JT: Identification and characterization of a novel cytoskeleton-associated pp60src substrate. Mol Cell Biol. 1991, 11: 5113-5124.
Zhan X, Hu X, Hampton B, Burgess WH, Friesel R, Maciag T: Murine cortactin is phosphorylated in response to fibroblast growth factor-1 on tyrosine residues late in the G1 phase of the BALB/c 3T3 cell cycle. J Biol Chem. 1993, 268: 24427-24431.
Schuuring E, Verhoeven E, Mooi WJ, Michalides RJ: Identification and cloning of two overexpressed genes, U21B31/PRAD1 and EMS1, within the amplified chromosome 11q13 region in human carcinomas. Oncogene. 1992, 7: 355-361.
Brookes S, Lammie GA, Schuuring E, de Boer C, Michalides R, Dickson C, Peters G: Amplified region of chromosome band 11q13 in breast and squamous cell carcinomas encompasses three CpG islands telomeric of FGF3, including the expressed gene EMS1. Genes Chromosomes Cancer. 1993, 6: 222-231.
Patel AM, Incognito LS, Schechter GL, Wasilenko WJ, Somers KD: Amplification and expression of EMS-1 (cortactin) in head and neck squamous cell carcinoma cell lines. Oncogene. 1996, 12: 31-35.
Rodrigo JP, Garcia LA, Ramos S, Lazo PS, Suarez C: EMS1 gene amplification correlates with poor prognosis in squamous cell carcinomas of the head and neck. Clin Cancer Res. 2000, 6: 3177-3182.
Hui R, Campbell DH, Lee CS, McCaul K, Horsfall DJ, Musgrove EA, Daly RJ, Seshadri R, Sutherland RL: EMS1 amplification can occur independently of CCND1 or INT-2 amplification at 11q13 and may identify different phenotypes in primary breast cancer. Oncogene. 1997, 15: 1617-1623. 10.1038/sj.onc.1201311.
Schuuring E, Verhoeven E, van Tinteren H, Peterse JL, Nunnink B, Thunnissen FB, Devilee P, Cornelisse CJ, van de Vijver MJ, Mooi WJ: Amplification of genes within the chromosome 11q13 region is indicative of poor prognosis in patients with operable breast cancer. Cancer Res. 1992, 52: 5229-5234.
Schuuring E: The involvement of the chromosome 11q13 region in human malignancies: cyclin D1 and EMS1 are two new candidate oncogenes--a review. Gene. 1995, 159: 83-96. 10.1016/0378-1119(94)00562-7.
Hui R, Ball JR, Macmillan RD, Kenny FS, Prall OW, Campbell DH, Cornish AL, McClelland RA, Daly RJ, Forbes JF, Blamey RW, Musgrove EA, Robertson JF, Nicholson RI, Sutherland RL: EMS1 gene expression in primary breast cancer: relationship to cyclin D1 and oestrogen receptor expression and patient survival. Oncogene. 1998, 17: 1053-1059. 10.1038/sj.onc.1202023.
Patel AS, Schechter GL, Wasilenko WJ, Somers KD: Overexpression of EMS1/cortactin in NIH3T3 fibroblasts causes increased cell motility and invasion in vitro. Oncogene. 1998, 16: 3227-3232. 10.1038/sj.onc.1201850.
Huang C, Liu J, Haudenschild CC, Zhan X: The role of tyrosine phosphorylation of cortactin in the locomotion of endothelial cells. J Biol Chem. 1998, 273: 25770-25776. 10.1074/jbc.273.40.25770.
van Rossum AG, De Graaf JH, Schuuring-Scholtes E, Kluin PM, Fan YX, Zhan X, Moolenaar WH, Schuuring E: Alternative splicing of the actin binding domain of human cortactin affects cell migration. J Biol Chem. 2003, 278: 45672-45679. 10.1074/jbc.M306688200.
Li Y, Tondravi M, Liu J, Smith E, Haudenschild CC, Kaczmarek M, Zhan X: Cortactin potentiates bone metastasis of breast cancer cells. Cancer Res. 2001, 61: 6906-6911.
Uruno T, Liu J, Zhang P, Fan Y, Egile C, Li R, Mueller SC, Zhan X: Activation of Arp2/3 complex-mediated actin polymerization by cortactin. Nat Cell Biol. 2001, 3: 259-266. 10.1038/35060051.
Weed SA, Karginov AV, Schafer DA, Weaver AM, Kinley AW, Cooper JA, Parsons JT: Cortactin localization to sites of actin assembly in lamellipodia requires interactions with F-actin and the Arp2/3 complex. J Cell Biol. 2000, 151: 29-40. 10.1083/jcb.151.1.29.
Katsube T, Togashi S, Hashimoto N, Ogiu T, Tsuji H: Filamentous actin binding ability of cortactin isoforms is responsible for their cell-cell junctional localization in epithelial cells. Arch Biochem Biophys. 2004, 427: 79-90. 10.1016/j.abb.2004.04.015.
Ohoka Y, Takai Y: Isolation and characterization of cortactin isoforms and a novel cortactin-binding protein, CBP90. Genes Cells. 1998, 3: 603-612. 10.1046/j.1365-2443.1998.00216.x.
Yamashita A, Katsube T, Hashimoto N, Tomita K, Takahashi M, Ueda R, Togashi S: Identification of Xenopus cortactin: two isoforms of the transcript and multiple forms of the protein. Zoological Science. 2001, 18: 331-336. 10.2108/zsj.18.331.
Daly RJ: Cortactin signalling and dynamic actin networks. Biochem J. 2004, 382: 13-25. 10.1042/BJ20040737.
Huang C, Ni Y, Wang T, Gao Y, Haudenschild CC, Zhan X: Down-regulation of the filamentous actin cross-linking activity of cortactin by Src-mediated tyrosine phosphorylation. J Biol Chem. 1997, 272: 13911-13915. 10.1074/jbc.272.21.13911.
Head JA, Jiang D, Li M, Zorn LJ, Schaefer EM, Parsons JT, Weed SA: Cortactin tyrosine phosphorylation requires Rac1 activity and association with the cortical actin cytoskeleton. Mol Biol Cell. 2003, 14: 3216-3229. 10.1091/mbc.E02-11-0753.
Campbell DH, Sutherland RL, Daly RJ: Signaling pathways and structural domains required for phosphorylation of EMS1/cortactin. Cancer Res. 1999, 59: 5376-5385.
Martinez-Quiles N, Ho HY, Kirschner MW, Ramesh N, Geha RS: Erk/Src phosphorylation of cortactin acts as a switch on-switch off mechanism that controls its ability to activate N-wasp. Mol Cell Biol. 2004, 24: 5269-5280. 10.1128/MCB.24.12.5269-5280.2004.
Genecard HS1. 2005, [http://bioinfo.weizmann.ac.il/cards-bin/carddisp?HCLS1&search=EMS1&suff=txt]
Kitamura D, Kaneko H, Miyagoe Y, Ariyasu T, Watanabe T: Isolation and characterization of a novel human gene expressed specifically in the cells of hematopoietic lineage. Nucleic Acids Res. 1989, 17: 9367-9379.
Schuuring E, van Damme H, Schuuring-Scholtes E, Verhoeven E, Michalides R, Geelen E, de Boer C, Brok H, van BV, Kluin P: Characterization of the EMS1 gene and its product, human Cortactin. Cell Adhes Commun. 1998, 6: 185-209.
Miglarese MR, Mannion-Henderson J, Wu H, Parsons JT, Bender TP: The protein tyrosine kinase substrate cortactin is differentially expressed in murine B lymphoid tumors. Oncogene. 1994, 9: 1989-1997.
Ruzzene M, Brunati AM, Sarno S, Marin O, Donella-Deana A, Pinna LA: Ser/Thr phosphorylation of hematopoietic specific protein 1 (HS1): implication of protein kinase CK2. Eur J Biochem. 2000, 267: 3065-3072.
Yamanashi Y, Okada M, Semba T, Yamori T, Umemori H, Tsunasawa S, Toyoshima K, Kitamura D, Watanabe T, Yamamoto T: Identification of HS1 protein as a major substrate of protein-tyrosine kinase(s) upon B-cell antigen receptor-mediated signaling. Proc Natl Acad Sci U S A. 1993, 90: 3631-3635.
Stone JD, Conroy LA, Byth KF, Hederer RA, Howlett S, Takemoto Y, Holmes N, Alexander DR: Aberrant TCR-mediated signaling in CD45-null thymocytes involves dysfunctional regulation of Lck, Fyn, TCR-zeta, and ZAP-70. J Immunol. 1997, 158: 5773-5782.
Fukamachi H, Yamada N, Miura T, Kato T, Ishikawa M, Gulbins E, Altman A, Kawakami Y, Kawakami T: Identification of a protein, SPY75, with repetitive helix-turn-helix motifs and an SH3 domain as a major substrate for protein tyrosine kinase(s) activated by Fc epsilon RI cross-linking. J Immunol. 1994, 152: 642-652.
Ingley E, Sarna MK, Beaumont JG, Tilbrook PA, Tsai S, Takemoto Y, Williams JH, Klinken SP: HS1 interacts with Lyn and is critical for erythropoietin-induced differentiation of erythroid cells. J Biol Chem. 2000, 275: 7887-7893. 10.1074/jbc.275.11.7887.
Kitamura D, Kaneko H, Taniuchi I, Akagi K, Yamamura K, Watanabe T: Molecular cloning and characterization of mouse HS1. Biochem Biophys Res Commun. 1995, 208: 1137-1146. 10.1006/bbrc.1995.1452.
He H, Watanabe T, Zhan X, Huang C, Schuuring E, Fukami K, Takenawa T, Kumar CC, Simpson RJ, Maruta H: Role of phosphatidylinositol 4,5-bisphosphate in Ras/Rac-induced disruption of the cortactin-actomyosin II complex and malignant transformation. Mol Cell Biol. 1998, 18: 3829-3837.
Yamanashi Y, Fukuda T, Nishizumi H, Inazu T, Higashi K, Kitamura D, Ishida T, Yamamura H, Watanabe T, Yamamoto T: Role of tyrosine phosphorylation of HS1 in B cell antigen receptor-mediated apoptosis. J Exp Med. 1997, 185: 1387-1392. 10.1084/jem.185.7.1387.
Taniuchi I, Kitamura D, Maekawa Y, Fukuda T, Kishi H, Watanabe T: Antigen-receptor induced clonal expansion and deletion of lymphocytes are impaired in mice lacking HS1 protein, a substrate of the antigen-receptor-coupled tyrosine kinases. EMBO J. 1995, 14: 3664-3678.
Fukuda T, Kitamura D, Taniuchi I, Maekawa Y, Benhamou LE, Sarthou P, Watanabe T: Restoration of surface IgM-mediated apoptosis in an anti-IgM-resistant variant of WEHI-231 lymphoma cells by HS1, a protein-tyrosine kinase substrate. Proc Natl Acad Sci U S A. 1995, 92: 7302-7306.
Suzuki Y, Demoliere C, Kitamura D, Takeshita H, Deuschle U, Watanabe T: HAX-1, a novel intracellular protein, localized on mitochondria, directly associates with HS1, a substrate of Src family tyrosine kinases. J Immunol. 1997, 158: 2736-2744.
Nagata Y, Kiefer F, Watanabe T, Todokoro K: Activation of hematopoietic progenitor kinase-1 by erythropoietin. Blood. 1999, 93: 3347-3354.
Takemoto Y, Furuta M, Sato M, Kubo M, Hashimoto Y: Isolation and characterization of a novel HS1 SH3 domain binding protein, HS1BP3. Int Immunol. 1999, 11: 1957-1964. 10.1093/intimm/11.12.1957.
Uruno T, Zhang P, Liu J, Hao JJ, Zhan X: Haematopoietic lineage cell-specific protein 1 (HS1) promotes actin-related protein (Arp) 2/3 complex-mediated actin polymerization. Biochem J. 2003, 371: 485-493. 10.1042/BJ20021791.
Sawabe T, Horiuchi T, Koga R, Tsukamoto H, Kojima T, Harashima S, Kikuchi Y, Otsuka J, Mitoma H, Yoshizawa S, Niho Y, Watanabe T: Aberrant HS1 molecule in a patient with systemic lupus erythematosus. Genes Immun. 2003, 4: 122-131. 10.1038/sj.gene.6363932.
Ruzzene M, Brunati AM, Marin O, Donella-Deana A, Pinna LA: SH2 domains mediate the sequential phosphorylation of HS1 protein by p72syk and Src-related protein tyrosine kinases. Biochemistry. 1996, 35: 5327-5332. 10.1021/bi9528614.
Hiura K, Lim SS, Little SP, Lin S, Sato M: Differentiation dependent expression of tensin and cortactin in chicken osteoclasts. Cell Motil Cytoskeleton. 1995, 30: 272-284. 10.1002/cm.970300405.
Mizutani K, Miki H, He H, Maruta H, Takenawa T: Essential role of neural Wiskott-Aldrich syndrome protein in podosome formation and degradation of extracellular matrix in src-transformed fibroblasts. Cancer Res. 2002, 62: 669-674.
Destaing O, Saltel F, Geminard JC, Jurdic P, Bard F: Podosomes display actin turnover and dynamic self-organization in osteoclasts expressing actin-green fluorescent protein. Mol Biol Cell. 2003, 14: 407-416. 10.1091/mbc.E02-07-0389.
Schuuring E, Verhoeven E, Litvinov S, Michalides RJ: The product of the EMS1 gene, amplified and overexpressed in human carcinomas, is homologous to a v-src substrate and is located in cell-substratum contact sites. Mol Cell Biol. 1993, 13: 2891-2898.
Wu H, Parsons JT: Cortactin, an 80/85-kilodalton pp60src substrate, is a filamentous actin-binding protein enriched in the cell cortex. J Cell Biol. 1993, 120: 1417-1426. 10.1083/jcb.120.6.1417.
Katsube T, Takahisa M, Ueda R, Hashimoto N, Kobayashi M, Togashi S: Cortactin associates with the cell-cell junction protein ZO-1 in both Drosophila and mouse. J Biol Chem. 1998, 273: 29672-29677. 10.1074/jbc.273.45.29672.
Wang X, Crispino JD, Letting DL, Nakazawa M, Poncz M, Blobel GA: Control of megakaryocyte-specific gene expression by GATA-1 and FOG-1: role of Ets transcription factors. EMBO J. 2002, 21: 5225-5234. 10.1093/emboj/cdf527.
Naora H, Deacon NJ: Relationship between the total size of exons and introns in protein-coding genes of higher eukaryotes. Proc Natl Acad Sci U S A. 1982, 79: 6196-6200.
Weil D, Bernard M, Combates N, Wirtz MK, Hollister DW, Steinmann B, Ramirez F: Identification of a mutation that causes exon skipping during collagen pre-mRNA splicing in an Ehlers-Danlos syndrome variant. J Biol Chem. 1988, 263: 8561-8564.
Cheng Y, Leung S, Mangoura D: Transient suppression of cortactin ectopically induces large telencephalic neurons towards a GABAergic phenotype. J Cell Sci. 2000, 113 (Pt 18): 3161-3172.
Somogyi K, Rorth P: Cortactin modulates cell migration and ring canal morphogenesis during Drosophila oogenesis. Mech Dev. 2004, 121: 57-64. 10.1016/j.mod.2003.10.003.
Hering H, Sheng M: Activity-dependent redistribution and essential role of cortactin in dendritic spine morphogenesis. J Neurosci. 2003, 23: 11759-11769.
Helwani FM, Kovacs EM, Paterson AD, Verma S, Ali RG, Fanning AS, Weed SA, Yap AS: Cortactin is necessary for E-cadherin-mediated contact formation and actin reorganization. J Cell Biol. 2004, 164: 899-910. 10.1083/jcb.200309034.
GeneFinder program. [http://www.Bioscience.org/urllists/genefind.htm]
ORF finder program. [http://www.ncbi.nlm.nih.gov/gorf/gorf.html]
Alexander H, Alexander S: Identification of introns by reverse-transcription PCR. Biotechniques. 1996, 20: 778-780.
PROSCAN program. [http://bimas.dcrt.nih.gov/molbio/proscan/]
TFSEARCH program. [http://molsun1.cbrc.aist.go.jp/papia/papia.html]
National Center for Biotechnology Information (NCBI). [http://www.ncbi.nih.gov/Genomes/]
The Sanger Institute. [http://www.sanger.ac.uk/]
EnsEMBL of The Wellcome Trust Sanger Institute. [http://www.ensembl.org/]
DOE Joint Genome Institute. [http://www.jgi.doe.gov/]
TIGR, The Institute for Genomic Research. [http://www.tigr.org/]
DNA Data Bank of Japan. [http://www.ddbj.nig.ac.jp/]
Nematode.net Genome Sequencing Center. [http://www.nematode.net/]
European Bioinformatics Institute (EBI). [http://www.ebi.ac.uk/]
Genoscope National Sequencing Center. [http://www.genoscope.cns.fr/]
The U.S. Poultry Gene Mapping Project. [http://www.genome.iastate.edu/chickmap/]
UCSC Genome Bioinformatics. [http://www.genome.ucsc.edu/]
HMMER program. [http://hmmer.wustl.edu]
GeneBee ClustalW 1.83 program. [http://www.genebee.msu.su/genebee.html]
Predict NLS program. [http://cubic.bioc.columbia.edu/cgi/var/nair/resonline.pl]
This work was supported by grant NKB-RUL 98–1647 of the Dutch Cancer Society. We thank Berend Snel for general discussions and for critical reading of the manuscript.
AGSHvR designed the study on comparative genome analysis, performed database searches, sequence alignments and gene structure prediction and drafted the manuscript. ESS designed, conducted and analyzed the cloning and sequencing of the promoter of human cortactin. VvBvS conducted and analyzed the PCR and sequencing experiments of the exon-intron boundaries of human cortactin and its splice variants. PMK read the manuscript and provided comments. ES helped with writing the paper, provided overall technical guidance and coordination. All authors read and approved the final manuscript.
Cite this article
van Rossum, A.G., Schuuring-Scholtes, E., van Buuren-van Seggelen, V. et al. Comparative genome analysis of cortactin and HS1: the significance of the F-actin binding repeat domain. BMC Genomics 6, 15 (2005). https://doi.org/10.1186/1471-2164-6-15