Synaptotagmin gene content of the sequenced genomes

Background: Synaptotagmins exist as a large gene family in mammals. There is much interest in the function of certain family members which act crucially in the regulated synaptic vesicle exocytosis required for efficient neurotransmission. Knowledge of the functions of other family members is relatively poor, and the presence of Synaptotagmin genes in plants indicates a role for the family as a whole which is wider than neurotransmission. Identification of the Synaptotagmin genes within completely sequenced genomes can provide the entire Synaptotagmin gene complement of each sequenced organism. Defining the detailed structures of all the Synaptotagmin genes and their encoded products can provide a useful resource for functional studies and a deeper understanding of the evolution of the gene family. The current rapid increase in the number of sequenced genomes from different branches of the tree of life, together with the public deposition of evolutionarily diverse transcript sequences, makes such studies worthwhile.

Results: I have compiled a detailed list of the Synaptotagmin genes of Caenorhabditis, Anopheles, Drosophila, Ciona, Danio, Fugu, Mus, Homo, Arabidopsis and Oryza by examining genomic and transcript sequences from public sequence databases together with some transcript sequences obtained by cDNA library screening and RT-PCR. I have compared all of the genes and investigated the relationship between plant Synaptotagmins and their non-Synaptotagmin counterparts.

Conclusions: I have identified and compared 98 Synaptotagmin genes from 10 sequenced genomes. Detailed comparison of transcript sequences reveals abundant and complex variation in Synaptotagmin gene expression and indicates the presence of Synaptotagmin genes in all animals and land plants. Amino acid sequence comparisons indicate patterns of conservation and diversity in function. Phylogenetic analysis shows the origin of Synaptotagmins in multicellular eukaryotes and their great diversification in animals. Synaptotagmins occur in land plants and animals in combinations of 4–16 in different species. The detailed delineation of the Synaptotagmin genes presented here will allow easier identification of Synaptotagmins in future. Since the functional roles of many of these genes are unknown, this gene collection provides a useful resource for future studies.


Background
Synaptotagmin (Syt) 1 was initially found as a protein component of synaptic vesicles [1]. New members of the Syt gene family have subsequently been discovered by DNA sequence similarity [2][3][4][5][6][7][8][9][10][11][12][13][14][15]. Syts encode proteins which share a common structure: an N-terminal transmembrane sequence joined to a variable length linker, followed by two tandemly arranged, distinct C2 domains, C2A and C2B. At present, a great deal more is known about Syt1 than the other Syts because it functions crucially in synaptic vesicle trafficking in the nervous systems of animals [16]. Other Syts are implicated in trafficking events in the nervous system as well as in various other tissues [17,18]. Certain Syts are known to express alternatively spliced transcripts [19][20][21], and RNA editing of Drosophila Syt1 has been described [22]. Little is known, however, about the details of the variations in expression of different Syts.
Public sequence database resources are becoming quite comprehensive, including vast numbers of transcript sequences from a wide variety of organisms as well as a number of relatively complete genome sequences. Systematic identification of Syts by database searching makes it possible to begin to address questions such as: What is the evolutionary extent of this gene family? Where do these genes appear on the tree of life? How many of these genes does an organism need?
Building on my previous effort to extract the Syt content of the sequenced genomes [13], I have now collected information for 98 Syts from organisms with sequenced genomes. Transcript sequences reveal abundant variation in Syt expression and indicate the presence of Syts in all land plants and all animals.

Identification of Syts
Previously [13], I searched the sequence databases with a 44 amino acid sequence probe representing the most highly conserved stretch of all the known Syts, which lies within a single exon in the C2B region. This probe detected all the loci within the available genomes which could harbour Syts. To confirm that these loci did indeed encode Syts, however, it was necessary to ascertain that all the relevant parts were present (N-terminal transmembrane sequence, variable length linker, C2A and C2B). Whilst some regions (C2A and C2B) are well conserved, there is great variation in the sequences of other regions, and it is difficult to predict exons accurately from genomic sequence unless a good degree of sequence similarity is present. Transcript sequences can reveal the true gene structure, but few transcripts were available at that time. Although I could locate the already known Syts in Caenorhabditis, Drosophila and Homo, it was clear that there were more potential Syts in each of these genomes and that Syt relatives might even be present in plants, which would indicate a general function for this gene family, not restricted to the operation of nervous systems.
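The confirmation step described above, checking that a candidate locus encodes all four parts in the expected order, can be sketched in Python. This is only an illustration under stated assumptions: the `Feature` record, the labels "TM", "linker", "C2A" and "C2B", and the function name `has_syt_architecture` are hypothetical and do not come from the original analysis, which was carried out by inspection of genomic and transcript sequences.

```python
from dataclasses import dataclass

# Hypothetical labels for the four parts named in the text, listed in
# N-terminal to C-terminal order. The labels are an assumption of this sketch.
EXPECTED_ORDER = ["TM", "linker", "C2A", "C2B"]


@dataclass(frozen=True)
class Feature:
    name: str   # one of EXPECTED_ORDER
    start: int  # first residue of the feature (1-based, inclusive)
    end: int    # last residue of the feature (1-based, inclusive)


def has_syt_architecture(features):
    """Return True if the features are exactly TM, linker, C2A and C2B,
    in that order along the protein, with no two features overlapping."""
    ordered = sorted(features, key=lambda f: f.start)
    if [f.name for f in ordered] != EXPECTED_ORDER:
        return False
    # Each feature must end before the next one begins.
    return all(a.end < b.start for a, b in zip(ordered, ordered[1:]))


# Illustrative coordinates only; they are not taken from any real Syt.
candidate = [
    Feature("C2B", 270, 400),
    Feature("TM", 55, 80),
    Feature("C2A", 140, 260),
    Feature("linker", 81, 139),
]
print(has_syt_architecture(candidate))
```

A locus that matches the C2B probe but lacks, for example, the transmembrane sequence would fail this check, which mirrors the reason the probe hit alone was not taken as proof of a Syt.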
Recently, more genomes have been sequenced and some very good transcript resources have become available. I have also carried out cDNA library screening and RT-PCR to investigate the Arabidopsis Syts, the novel Homo Syts and the alternative splicing of Rattus Syt1 (accession numbers aj617615-aj617630). I used tblastn and blastn to search sequences at NCBI [23], EBI [24], Ensembl [25] and JGI [26]. I assembled transcript sequences into gap4 databases [27] and used Spin [27] and Align [28] [...]

[...] Animals have a more diverse array: 7 Syts in Caenorhabditis, 5 or more in Anopheles (incomplete genome sequence), 7 in Drosophila, 4 or more in Ciona (a surprisingly small number perhaps, but an incomplete genome sequence), 13-14 in Danio and Fugu (incomplete genome sequences) and 16 in Mus and Homo. Bearing in mind that some of the genome sequences are incomplete, the overall picture appears to reflect both acquisition and loss of different types of Syt, with different animals bearing different arrays of Syts. I have highlighted a motif (G X X X P E L Y) in the linker region of the Syt15 orthologues (fig. 4) which [...]

[Figure: Chromosomal locations of Homo and Mus Syts.]

[...] Arabidopsis Syts are identified with names, following the nomenclature of Fukuda [14].

[...] 3). The functional consequences of this alternative splicing have recently been investigated [30]. In Syt1, the C2B region undergoes alternative splicing in Caenorhabditis and RNA editing in Anopheles and Drosophila. Alternative splicing equivalent to that of Caenorhabditis has just also been described in Aplysia [31].
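The linker motif G X X X P E L Y highlighted above for the Syt15 orthologues can be written as a regular expression, which gives a quick way to scan a protein sequence for it. The following Python sketch assumes that X stands for any single residue; the function name `find_linker_motif` and the example sequences are invented for illustration and are not taken from any real Syt15.

```python
import re

# G X X X P E L Y, with X assumed to mean any one residue.
# The lookahead (?=...) lets re.finditer report overlapping occurrences too.
LINKER_MOTIF = re.compile(r"(?=G[A-Z]{3}PELY)")


def find_linker_motif(protein_seq):
    """Return the 0-based start position of every motif occurrence."""
    return [m.start() for m in LINKER_MOTIF.finditer(protein_seq.upper())]


# Made-up sequences for illustration only.
print(find_linker_motif("MAGSTAPELYKK"))
print(find_linker_motif("MAGSTAPELFKK"))
```

A pattern match of this kind only flags candidate positions; whether a hit lies in the linker region still has to be judged against the gene structure.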
There is no evidence for equivalent alteration of Ciona, Danio, Fugu, Mus or Homo Syt1. It is intriguing to note that this region in the most [...]

[...] Additional File 1 and accession numbers aj617615-aj617619 for alternative splicing in Homo, Mus and Rattus) seems to be particularly complex and is the likely explanation for the described variations [32]. This was not seen in the original 5' mapping work [33], but RNase protection analysis (RPA) in R. norvegicus and R. rattus (fig. 9) confirms the evidence in the sequence databases of complex, species-specific alternative splicing in this region of Syt1. Alternative splicing of this region is also evident in Ciona Syt1, and a functional analysis of this region in the related organism Halocynthia has recently been carried out [34]. Insufficient transcript evidence is currently available from other organisms to establish the universality of Syt1 5'UTS alternative splicing.

[...] The first functional analysis of the yeast members (tricalbins) has just been published [35] but the family is poorly characterized otherwise. Additional File 1 entries 108- [...] fig. 10.

C2A to C-terminal regions of Syts
The advantages of performing an evolutionary analysis of Syts and attempting to understand their origins and diversity include the possibility of exhaustively defining the functions of a minimal set in a model organism (e.g. Arabidopsis, Ciona). Comparative analysis of subgroups of Syts from a range of evolutionary lineages helps to define exactly which sequences are required to maintain function and which are able to diversify (see [36] for a structural evolutionary analysis of the C2 domains of Syts). The patterns of alternative splicing displayed by certain groups of Syts indicate enormous functional diversity that is only beginning to be understood. It will be fascinating to discover what it is about certain animal Syts that distinguishes them as essential players in neurotransmission.

RT-PCR and cDNA library screening
RT-PCR from Rattus brain mRNA was carried out with PfuTurbo polymerase. A Homo brain cDNA library (Clontech) was screened with probes for the 6 novel human loci identified in [13] (accession numbers aj303363-aj303368).
An Arabidopsis whole plant cDNA library (Stratagene) was screened with probes for the loci identified in [13]. The probes were produced by PCR from genomic DNA, which was a gift from Ian Furner (Cambridge University Department of Genetics).

RNase protection analysis
RNase protection analysis (RPA) was carried out as described [20]. Rattus rattus brain was a gift from S. Redrobe at Bristol Zoo. Brain mRNA was prepared from Rattus rattus and from Rattus norvegicus (Sprague-Dawley) by guanidine isothiocyanate extraction followed by polyA selection with oligo-dT cellulose. Regions of the 5' untranslated (5'UTS) portion of sequence accession x52772 (Rattus rattus Syt1) were cloned using RT-PCR with Rattus brain mRNA. RPA probes were produced using the Maxiscript kit (Ambion) from pBSIIKS-clones containing insert sequences aj617620-aj617622.
(C) 5'UTS probe aj617622. Lane 1: Rattus norvegicus brain mRNA. Lane 2: Rattus rattus brain mRNA. The uppermost bands are full-length products from mRNA transcripts which match the input probe across its whole length. Shorter products result from partially matching mRNA transcripts.