Open Access

ESTs from the microsporidian Edhazardia aedis

BMC Genomics20089:296

DOI: 10.1186/1471-2164-9-296

Received: 12 December 2007

Accepted: 20 June 2008

Published: 20 June 2008

Abstract

Background

Microsporidia are a group of parasites related to fungi that infect a wide variety of animals and have gained recognition from the medical community in the past 20 years due to their ability to infect immuno-compromised humans. Microsporidian genomes range in size from 2.3 to 19.5 Mbp, but almost all of our knowledge comes from species that have small genomes (primarily from the human parasite Encephalitozoon cuniculi and the locust parasite Antonospora locustae). We have conducted an EST survey of the mosquito parasite Edhazardia aedis, which has an estimated genome size several times that of more well-studied species. The only other microsporidian EST project is from A. locustae, and serves as a basis for comparison with E. aedis.

Results

The spore transcriptomes of A. locustae and E. aedis were compared and the numbers of unique transcripts that belong to each COG (Clusters of Orthologous Groups of proteins) category differ by at most 5%. The transcripts themselves have widely varying start sites and encode a number of proteins that have not been found in other microsporidia examined to date. However, E. aedis seems to lack the multi-gene transcripts present in A. locustae and E. cuniculi. We also present the first documented case of transcription of a transposable element in microsporidia.

Conclusion

Although E. aedis and A. locustae are distantly related, have very disparate life cycles and contain genomes estimated to be vastly different sizes, their patterns of transcription are similar. The architecture of the ancestral microsporidian genome is unknown, but the presence of genes in E. aedis that have not been found in other microsporidia suggests that extreme genome reduction and compaction is lineage specific and not typical of all microsporidia.

Background

Microsporidia are single-celled eukaryotic intracellular parasites that are related to fungi. Currently, over 1200 species have been identified, infecting animals from nearly every phylum, including commercially important species such as honeybees and fish, as well as humans [1]. Inside host cells, microsporidia proliferate as vegetative stages (meronts, schizonts) which eventually produce spores that are released when the host cell lyses. Spores possess a unique host cell invasion apparatus called the polar filament, which is forcefully everted upon germination to form a tube and can pierce a nearby host cell [1]. The tube then acts as a conduit allowing the contents of the spore to be injected into the host cell's cytoplasm, where the parasite undergoes vegetative replication.

Microsporidia are a diverse group of organisms, and vary greatly in the complexity of their life cycles. For instance, Encephalitozoon cuniculi and Antonospora locustae produce only one type of spore (uninucleate in the former and binucleate in the latter), and complete their entire life cycles inside one host individual, while Amblyospora californica requires two host groups (mosquitoes and microcrustacea) and produces three morphologically and functionally discrete spore types [1].

Microsporida possess some of the smallest primary nuclear genomes known (as tiny as 2.3 Mbp). The only microsporidian whose genome has been completely sequenced is the human parasite, E. cuniculi. At a meager 2.9 Mbp, E. cuniculi's genome is extremely compact, with only 2000 genes [2]. A small genome sequence survey (GSS) project has been conducted on A. locustae, a locust parasite that has been approved as a biological control agent in the United States [3]. A. locustae's genome is roughly 5.4 Mbp in size [4], or about twice the size of E. cuniculi's genome. Despite the genome size difference, both genomes appear to be structured in much the same way. Genes are closely packed (nearly one gene per kilobase), are small in size compared to homologues in animals and fungi, and are intron-poor. There is also a much greater degree of synteny between these two organisms than would be expected given their phylogenetic relationship, which implies that although microsporidian genes are fast-evolving, genomic rearrangements occur only rarely [3] (See Fig. 1).
https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-296/MediaObjects/12864_2007_Article_1489_Fig1_HTML.jpg
Figure 1

The phylogenetic relationships between several microsporidia. Species that house transposable elements belonging to the Ty3/gypsy family are highlighted in blue, while species containing LTR transposons are highlighted in yellow. Genome sizes are indicated to the right of each species. (Adapted from Slamovits et al., 2004.)

However, we have very little information on microsporidian genomes of larger sizes. Edhazardia aedis is a microsporidian that infects Aedes aegypti, the mosquito vector of the Dengue hemorrhagic and yellow fever viruses. E. aedis has been intensively studied as a viable biological control agent for A. aegypti [5] and has a genome estimated to be to be many times larger than that of E. cuniculi. There are several possible explanations for this difference: E. aedis may have more genes that control its complex life cycle. Genes may also be longer, more widely spaced, and contain more introns than E. cuniculi [2].

Morphological studies conducted on E. aedis have revealed at least four different types of spores – two uninucleate and two binucleate [6, 7]. The two types of uninucleate spore types differ morphologically but possess similar pyriform shapes. However, the cell division events from which they arise differ. Spores produced via mitosis are roughly 8.5 μm in length, whereas spores produced via meiosis (meiospores) are about 7.5 μm. Small binucleate spores (~6.5 μm in length) that have short polar filaments are formed first, followed by the production of larger binucleate spores (~9 μm in length) that are ovoid in shape. Meiospore formation is usually abortive and rarely produces normal spores [6].

E. aedis' life cycle is moderately complex and involves two generations of the mosquito host. It begins when a uninucleate spore is ingested by a mosquito larva from the environment. Once in the gut, the spore germinates and begins to multiply in the host tissue. Within 48 hours, small binucleate spores are formed that are responsible for spread to other tissues. Orally infected larvae generally exhibit reduced growth, and may die before reaching maturity if the parasite load is high, thus releasing more spores into the environment. However, if the infection load is sufficiently small, the larva will mature into an adult mosquito and survive to reproduce [6]. If the adult mosquito is female, large binucleate spores will develop in her ovaries and will infect oocytes, thus passing the infection on to the next generation where the majority of mortality occurs in larvae. Little is known about the factors that modulate the transition from one phase in the life cycle to the next, or about the changes in gene expression that occur during these transitions.

It is also possible that the difference in genome size between E. aedis and E. cuniculi or A. locustae may have less to do with the number of genes, and more to do with genome architecture. E. aedis genes could be longer, more widely spaced, and contain more introns than E. cuniculi [2]. In an effort to learn more about E. aedis' genome, a GSS of >200 kbp was conducted [8]. This study concluded that E. aedis' genome structure is very different from those of E. cuniculi and A. locustae. A large portion of the genome is occupied by non-coding DNA and genes are not closely packed together, although the existence of local areas of compaction could not be ruled out.

Previous examinations of ESTs from microsporidia have only been conducted on microsporidia with small genomes. These transcripts possessed unusual features that are atypical in eukaryotes. Examinations of ESTs from A. locustae [9] and transcripts from E. cuniculi revealed numerous multi-gene transcripts. These transcripts are different from prokaryotic operons, as the proteins encoded by the transcript do not have related functions and are often not encoded on the same DNA strand. Many transcripts encode only a portion of one gene, while the other is present in its entirety [9, 10]. The reason for this phenomenon is not known, but it has been suggested that transcriptional control elements have been lost (or moved into adjacent genes) during the process of genome compaction [10].

As E. aedis' genome and life cycle are very different from E. cuniculi and A. locustae, it is reasonable to assume that the transcript structure and number of genes present may differ as well. In this study, we describe the first survey of ESTs from a microsporidian with a much larger genome size and complex life cycle. In sequencing over 1300 transcripts, we have elucidated more of E. aedis' genome content, and have gained a profile of its transcript structure and composition. Surprisingly, the E. aedis uninucleate spore transcriptome is remarkably similar to that of A. locustae.

Results

Overview

Sequences were deposited into the Genbank EST database and have the accession numbers FG063843 to FG065106. From the 1307 clones sequenced, 133 unique genes were found; 55 were represented by a single transcript, while the remaining 78 were represented by two or more. 97 of the 133 unique genes are present in other microsporidia (See Table 1), while 10 are present in other (non-microsporidian) organisms (See Table 2), 18 are putatively E. aedis-specific and 8 have no apparent open reading frames. Coding sequences contained 43% G+C while 5' and 3' untranslated regions possessed 27% and 26%, respectively.

Table 1

Gene Name

Species

Genbank Accession Number

16S rRNA GENE

Brachiola algerae

AM422905

1-ACYL-SN-GLYCEROL-3-PHOSPHATE ACYLTRANSFERASE

Encephalitozoon cuniculi

NP_586146

26S PROTEASOME REGULATORY SUBUNIT 4

Encephalitozoon cuniculi

NP_586091

26S PROTEASOME REGULATORY SUBUNIT 6

Encephalitozoon cuniculi

NP_586128

26S PROTEASOME REGULATORY SUBUNIT 8

Encephalitozoon cuniculi

XP_955738

40S RIBOSOMAL PROTEIN S2

Leishmania infantum

XP_001466537

40S RIBOSOMAL PROTEIN S3

Encephalitozoon cuniculi

XP_955676

40S RIBOSOMAL PROTEIN S4

Mycetophagus quadripustulatus

CAJ17168

40S RIBOSOMAL PROTEIN SA or P40

Encephalitozoon cuniculi

NP_584728

60S RIBOSOMAL PROTEIN L3

Encephalitozoon cuniculi

NP_597630

60S RIBOSOMAL PROTEIN L4

Encephalitozoon cuniculi

NP_597213

60S RIBOSOMAL PROTEIN L5

Encephalitozoon cuniculi

NP_585846

6-PHOSPHOFRUCTOKINASE

Encephalitozoon cuniculi

NP_597579

ABC TRANSPORTER (MITOCHONDRIAL TYPE) #1

Encephalitozoon cuniculi

NP_586426

ABC TRANSPORTER (MITOCHONDRIAL TYPE) #2

Encephalitozoon cuniculi

NP_586426

ACTIN

Blakeslea trispora

AAW32475

ARGININE/SERINE RICH PRE-mRNA SPLICING FACTOR

Encephalitozoon cuniculi

NP_597487

ASSOCIATED WITH RAN (NUCLEAR IMPORT/EXPORT) FUNCTION FAMILY MEMBER

Caenorhabditis elegans

NP_499369

ATP SYNTHASE

Encephalitozoon cuniculi

XP_955732

BELONGS TO THE ABC TRANSPORTER SUPERFAMILY

Encephalitozoon cuniculi

NP_597462

cAMP-DEPENDENT PROTEIN KINASE TYPE 1 REGULATORY CHAIN

Encephalitozoon cuniculi

NP_597223

CASEIN KINASE 1 HOMOLOG (INVOLVED IN DNA REPAIR

Encephalitozoon cuniculi

NP_597600

CATION-TRANSPORTING ATPase

Encephalitozoon cuniculi

NP_586078

CHOLINE PHOSPHATE CYTIDYLYLTRANSFERASE

Encephalitozoon cuniculi

NP_586276

DNA REPLICATION LICENSING FACTOR MCM2

Encephalitozoon cuniculi

NP_584768

DNA REPLICATION LICENSING FACTOR OF THE MCM FAMILY MCM6

Encephalitozoon cuniculi

NP_597420

DNA REPLICATION LICENSING FACTOR OF THE MCM FAMILY MCM7

Encephalitozoon cuniculi

NP_585977

DNAJ PROTEIN HOMOLOG 2

Encephalitozoon cuniculi

NP_586004

DNAK-LIKE PROTEIN

Encephalitozoon cuniculi

NP_586489

EUKARYOTIC TRANSLATION INITIATION FACTOR 4A

Encephalitozoon cuniculi

XP_955671

FIBRILLARIN (34kDa NUCLEOLAR PROTEIN)

Encephalitozoon cuniculi

NP_586197

GENERAL TRANSCRIPTION FACTOR

Encephalitozoon cuniculi

NP_597292

GLUCOSAMINE FRUCTOSE-6-PHOSPHATE AMINOTRANSFERASE

Encephalitozoon cuniculi

NP_586057

GLYCERALDEHYDE-3-PHOSPHATE DEHYDROGENASE

Encephalitozoon cuniculi

NP_586008

GUANINE NUCLEOTIDE BINDING PROTEIN BETA SUBUNIT

Encephalitozoon cuniculi

NP_597241

HEAT SHOCK RELATED 70 kDa PROTEIN

Encephalitozoon cuniculi

NP_597563

HEAT-SHOCK PROTEIN HSP90 HOMOLOG

Encephalitozoon cuniculi

NP_584635

HISTIDYL tRNA SYNTHETASE

Antonospora locustae

AAT12372

HISTONE ACETYLTRANSFERASE TYPE B SUBUNIT 2

Encephalitozoon cuniculi

NP_586003

HISTONE DEACETYLASE 1

Encephalitozoon cuniculi

NP_597645

HISTONE DEACETYLASE

Encephalitozoon cuniculi

XP_955621

HISTONE H3

Mus musculus

JQ1983

HSP 101 RELATED PROTEIN

Encephalitozoon cuniculi

NP_586448

HYPOTHETICAL PROTEIN ECU02_0840

Encephalitozoon cuniculi

NP_584609

HYPOTHETICAL PROTEIN ECU02_0950

Encephalitozoon cuniculi

NP_584620

HYPOTHETICAL PROTEIN ECU06_0450

Encephalitozoon cuniculi

NP_585801

HYPOTHETICAL PROTEIN ECU06_1280

Encephalitozoon cuniculi

NP_585884

HYPOTHETICAL PROTEIN ECU07_0530

Encephalitozoon cuniculi

NP_585981

HYPOTHETICAL PROTEIN ECU08_1500

Encephalitozoon cuniculi

NP_597278

HYPOTHETICAL PROTEIN ECU09_0740

Encephalitozoon cuniculi

XP_955628

HYPOTHETICAL PROTEIN ECU09_1700

Encephalitozoon cuniculi

XP_955723

HYPOTHETICAL PROTEIN ECU11_1720

Encephalitozoon cuniculi

NP_586478

LIM DOMAIN-CONTAINING PROTEIN

Encephalitozoon cuniculi

NP_586340

LONG CHAIN FATTY ACID CoA LIGASE

Encephalitozoon cuniculi

NP_586206

METHIONINE AMINOPEPTIDASE TYPE 2

Encephalitozoon cuniculi

NP_586190

METHIONINE PERMEASE

Encephalitozoon cuniculi

NP_585905

NIFS-LIKE PROTEIN (CYSTEINE DESULFURASE) INVOLVED IN IRON-SULFUR CLUSTER SYNTHESIS

Encephalitozoon cuniculi

NP_586483

NUCLEAR SER/THR PROTEIN PHOSPHATASE PP1-1 GAMMA CATALYTIC SUBUNIT

Encephalitozoon cuniculi

NP_597385

P68-LIKE PROTEIN (DEAD BOX FAMILY OF RNA HELICASES)

Encephalitozoon cuniculi

NP_597238

PEPTIDE CHAIN RELEASE FACTOR SUBUNIT 1

Encephalitozoon cuniculi

NP_597376

PEPTIDE ELONGATION FACTOR 2

Glugea plecoglossi

BAA11470

PHOSPHATIDYLINOSITOL TRANSFER PROTEIN, ALPHA

Danio rerio

NP_957229

PHOSPHOMANNOMUTASE

Encephalitozoon cuniculi

NP_597365

POLYADENYLATE-BINDING PROTEIN 2

Encephalitozoon cuniculi

NP_586226

POLYPROTEIN

Sorghum bicolor

AAD27571

PRE-mRNA SPLICING FACTOR

Encephalitozoon cuniculi

NP_586183

PROTEIN KINASE B-LIKE PROTEIN

Plasmodium falciparum

AAT06260

PROTEIN TRANSPORT PROTEIN SEC23 HOMOLOG (COPII COAT)

Encephalitozoon cuniculi

NP_586385

PUTATIVE HYDROLASE-LIKE PROTEIN

Antonospora locustae

AAU11090

PUTATIVE ZINC FINGER PROTEIN

Encephalitozoon cuniculi

NP_597297

SER/THR PROTEIN PHOSPHATASE 2-A

Encephalitozoon cuniculi

NP_584753

SER/THR PROTEIN PHOSPHATASE PP2-A REGULATORY SUBUNIT B

Encephalitozoon cuniculi

NP_597423

SERINE/THREONINE PROTEIN KINASE (REQUIRED FOR ACTIN RING AND SEPTATION)

Encephalitozoon cuniculi

XP_965898

SIMILAR TO DNAJ-LIKE PROTEIN

Nasonia vitripennis

XP_001602403

SIMILARITY TO 14-3-3 PROTEIN 1

Encephalitozoon cuniculi

NP_597610

SIMILARITY TO ADP/ATP CARRIER PROTEIN

Paranosema grylli

CAI30461

SIMILARITY TO CDC20 (WD-REPEAT PROTEIN)

Encephalitozoon cuniculi

NP_597660

SIMILARITY TO Hsp70-RELATED PROTEIN

Encephalitozoon cuniculi

NP_584537

SIMILARITY TO HYPOTHETICAL INTEGRAL MEMBRANE PROTEIN YQ55_CAEEL

Encephalitozoon cuniculi

NP_597662

SIMILARITY TO HYPOTHETICAL PROTEIN YAAT_BACSU

Encephalitozoon cuniculi

NP_597532

SIMILARITY TO HYPOTHETICAL PROTEIN YB36_METJA

Encephalitozoon cuniculi

NP_597239

SIMILARITY TO PUTATIVE AMINOACID TRANSPORTER YEU9_yeast

Encephalitozoon cuniculi

NP_584803

SIMILARITY TO SKT5 PROTEIN

Encephalitozoon cuniculi

NP_586349

SIMILARITY TO TRANSCRIPTION INITIATION FACTOR TFIIA

Encephalitozoon cuniculi

NP_597616

STE12 TRANSCRIPTION FACTOR

Encephalitozoon cuniculi

NP_586509

STRUCTURE-SPECIFIC RECOGNITION PROTEIN

Encephalitozoon cuniculi

NP_586030

T COMPLEX PROTEIN 1 SUBUNIT BETA

Encephalitozoon cuniculi

XP_955601

THREONYL tRNA SYNTHETASE #1

Encephalitozoon cuniculi

NP_586084

THREONYL tRNA SYNTHETASE #2

Encephalitozoon cuniculi

NP_586084

TRANSLATION ELONGATION FACTOR 1 ALPHA

Glugea plecoglossi

BAA12288

TRIOSE PHOSPHATE ISOMERASE

Encephalitozoon cuniculi

NP_586329

TUBULIN BETA CHAIN

Encephalitozoon cuniculi

NP_597591

U5 ASSOCIATED snRNP

Encephalitozoon cuniculi

NP_586393

UNNAMED PROTEIN PRODUCT (Hsp70)

Candida glabrata

XP_445544

VACUOLAR ATP SYNTHASE CATALYTIC SUBUNIT A

Encephalitozoon cuniculi

NP_586434

VACUOLAR ATP SYNTHASE SUBUNIT B

Encephalitozoon cuniculi

NP_586219

ZINC FINGER PROTEIN

Encephalitozoon cuniculi

NP_584833

The 97 unique E. aedis transcripts which are homologous to genes present in other microsporidia are listed above. Species names and Genbank accession numbers of top BLASTX hits are indicated. Bold text in the "Gene Name" column indicates instances where two different transcripts both had the same top BLASTX hit. Underlining indicates a copy of Hsp70 that is most similar to a protein that remains unnamed in Genbank.

Table 2

Gene Name

Species Name

Genbank Accession Number

60S RIBOSOMAL PROTEIN L2

Babesia bovis

XP_001612300

ADENOSINE KINASE

Homo sapiens

AAA97893

HYPOTHETICAL PROTEIN

Candida albicans

XP_717148

HYPOTHETICAL PROTEIN PY5484

Plasmodium yoelii yoelii

XP_725949

L-ASPARIGINASE

Dirofilaria immitis

Q9U518

LYSINE tRNA LIGASE

Saccharomyces cerevisiae

CAA39699

PUTATIVE VESICULAR TRANSPORT FACTOR USO1P

Candida albicans

XP_710120

SEC63 DOMAIN CONTAINING PROTEIN

Trichomonas vaginalis

XP_001580151

PROTEIN PHOSPHATASE 2B

Cryptosporidium hominis

XP_666159

WD-40 REPEAT FAMILY PROTEIN

Arabidopsis thaliana

NP_201533

The 10 genes which are present in E. aedis but absent from other microsporidia are listed above. The species names and Genbank accession numbers of the top BLASTX hit for each gene are also listed.

Approximately a quarter of the transcripts analyzed coded for Hsp70. Almost all of the Hsp70 sequences were most similar to the "heat shock related 70 kDa protein" found in E. cuniculi (NP_597563). Single nucleotide variation exists between sequences, usually as 3rd position synonymous substitutions. Where non-synonymous substitutions exist, they are always a single nucleotide and there are no indels between sequences. Mitochondrial-type and DNAK-like Hsp70s were also represented.

Genes were assigned to COG categories to allow for comparison with A. locustae. Figure 2 illustrates the percentages of total E. aedis transcripts that are dedicated to each COG category. Total A. locustae transcripts are provided for comparative purposes. As the randomness of the library is uncertain, it is possible that some transcripts are artificially overrepresented. It is therefore more informative to examine unique transcripts (ie. counting multiple transcripts for the same gene only once) rather than total transcripts. Figure 3 displays the percentages of unique E. aedis and A. locustae transcripts dedicated to each category. Surprisingly, the values are similar and sometimes identical (maximum difference between E. aedis and A. locustae categories is 5%).
https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-296/MediaObjects/12864_2007_Article_1489_Fig2_HTML.jpg
Figure 2

Total E. aedis transcripts represented by COG (Clusters of Orthologous Groups of proteins) category with and without Hsp70. Total A. locustae transcripts are provided for comparison. (A. locustae data adapted from Williams et al., 2005.)

https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-296/MediaObjects/12864_2007_Article_1489_Fig3_HTML.jpg
Figure 3

Unique E. aedis transcripts represented by COG category. Unique A. locustae transcripts are provided for comparison. (A. locustae data adapted from Williams et al., 2005.)

Notable transcripts include a retrotransposon that is similar to LTR retrotransposons present in Sorghum bicolor (AAD27571) and Nosema bombycis (ABE26655). All belong to the Ty3/Gypsy family of retrotransposons. E. aedis also possesses a methionine aminopeptidase 2 gene (MetAP-2), which is present in E. cuniculi. There were several transcripts present that appear homologous to proteins found in various eukaryotes, but are absent in other microsporidia examined to date. These include hypothetical or unknown proteins found in Oryza, Danio and Plasmodium, as well as genes encoding proteins with identified functions, such as an adenosine kinase, a lysine-tRNA ligase and an L-asparaginase (See Table 2). In addition, E. aedis encodes a putative hydrolase-like protein that is present in A. locustae, but absent in E. cuniculi.

E. cuniculi and A. locustae both contain a small number of introns in their genomes and consequently, they have retained a minimal set of splicing machinery. These two organisms are not closely related (See Fig. 1), but they do share a few conserved introns [11]. Therefore, there is reason to suspect that some of these introns may also be present in E. aedis. Fortunately, seven transcripts of the gene encoding ribosomal protein L5 (which contains an intron in E. cuniculi) were recovered from the E. aedis library. These sequences were used to design primers to amplify the L5 gene from genomic DNA. It was found that the E. aedis L5 gene does not contain an intron.

Transcript structure

As E. aedis is an intracellular parasite and therefore cannot be easily cultured, RNA was limited and the library could not be constructed in a 5' cap-dependent manner. Therefore, nearly all of the inserts encoding the same gene were of different lengths, and most were 5' truncated. However, some of E. aedis' transcripts appear to have very long 5' untranslated regions (UTRs) of several hundred base pairs. To further assess transcript structure, cap-dependent 5' RACE (rapid amplification of cDNA ends) was conducted on transcripts from a moderately represented gene, glucosamine fructose-6-phosphate aminotransferase. 5' RACE confirmed that transcript lengths for this gene do vary, with 5' UTRs ranging from 255 to 348 bp (See Fig. 4).
https://static-content.springer.com/image/art%3A10.1186%2F1471-2164-9-296/MediaObjects/12864_2007_Article_1489_Fig4_HTML.jpg
Figure 4

5' RACE conducted on a moderately represented transcript in E. aedis reveals multiple transcription start sites. ESTs are depicted in green and RACE products in purple. The predicted translational start codon is indicated by the orange arrow. As indicated, the E. cuniculi homologue of this gene contains 128 amino acids at N-terminus that appear to be absent in E. aedis.

Contrary to the variable start sites of the transcripts, nearly all appear to have identical end sites. The notable exceptions are the heat shock related 70 kDa protein transcripts, which have somewhat variable 3' polyadenylation sites.

There were frequently single nucleotide differences between sequences in contigs, but these differences were usually restricted to silent third position substitutions. In instances where the substitutions are not silent, they are conservative amino acid substitutions. These differences could represent different copies of the same gene or different alleles within the population (UTRs were not available in most cases to determine which).

Discussion

Comparing microsporidian transcriptomes

This is the second microsporidian EST project to be conducted and the first from a microsporidian possessing a large genome, allowing for a meaningful comparison of microsporidian spore transcriptomes. Despite the vast differences in genome size and life cycle complexity between E. aedis and A. locustae, their transcriptomes are highly similar in their compositions. The proportions of unique transcripts encoding proteins devoted to the "protein destination" COG category in both E. aedis and A. locustae are relatively large (19% and 16%, respectively) (See Fig. 3). It is interesting to note that proteomic work correlates with these results, as the number of proteins in E. cuniculi devoted to the "protein destination" COG category form a large percentage of the total proteins present (~28%) that have known functions [12].

When the total number of unique genes found in E. aedis and A. locustae are compared based on COG category classification, the percentages in each category are close to identical (See Fig. 3). The largest differences lie in the categories of cellular organization and biogenesis, cellular communication and signal transduction and cell rescue, defense, cell death and aging. One notable difference between the two spore transcriptomes is that no transposable elements were recovered in the A. locustae ESTs, whereas E. aedis transcribes a retrotransposon of the Ty3/gypsy family. Transposable elements have been previously reported to exist in the genomes of Nosema bombycis [13], Spraguea lophii [14], Brachiola algerae and E. aedis [8] (See below). To the best of our knowledge, this is the first instance of documented transposable element transcription in microsporidia, and could indicate active transposition.

Nearly 8% of the unique transcripts from E. aedis encode genes that are present in various eukaryotes, but are absent from other microsporidia. The existence of these genes has several possible explanations. Sequence data from microsporidia is scarce, and the only completely sequenced genome is that of E. cuniculi. Therefore, it is currently impossible to assert that these genes are absent in any microsporidia other than E. cuniculi. The possibility exists that they were present in the genome of the microsporidian ancestor, and were lost during genome reduction/compaction events in E. cuniculi. These genes could also have arisen from lateral transfer events or they could have come to resemble genes in other organisms by chance or by convergence. Parsimoniously, the first explanation seems most likely, therefore, these data seem to suggest that the ancestor of microsporidia was not, indeed, compact to the extent of E. cuniculi.

The MetAP-2 protein is a target for drug therapy in E. cuniculi [15]. The E. aedis copy of the MetAP-2 gene is very similar to that present in E. cuniculi, and contains the amino acid residues that bind the drug fumagillin as well as those believed to coordinate metals. Like E. cuniculi, E. aedis lacks a polylysine tract at the N-terminus of the MetAP-2 protein that is present in animals, other fungi and plants. This tract plays a role in hindering the phosphorylation of eukaryotic initiation factor 2α (eIF2α), and its absence indicates that the microsporidian proteins likely lack this function [15].

Although our work indicates that the E. aedis L5 gene does not contain an intron like its E. cuniculi homologue (see Results, above), there is reason to believe that there are introns elsewhere in the genome. There are several transcripts encoding proteins that act in pre-mRNA splicing: an arginine/serine rich pre-mRNA splicing factor (NP_597487 in E. cuniculi), a pre-mRNA splicing factor (NP_586183 in E. cuniculi) and a U5 associated snRNP (NP_586393 in E. cuniculi). These genes comprise 2.2% of the total unique genes found.

Hsp70

Roughly 28% of total E. aedis transcripts encoded some form of Hsp70, a heat shock protein that assists in the folding of other proteins. Hsp70 helps prevent proteins from becoming insoluble and also plays a role in various other intracellular processes, such as apoptosis [16]. The action of Hsp70 allows mutant proteins to continue functioning by being refolded instead of being degraded, which necessitates the costly synthesis of more protein. The number of Hsp70 transcripts in the E. aedis ESTs is an order of magnitude higher than was found in A. locustae (2%) [17]. We are cautious in this interpretation as we have not quantitatively assessed the transcription level of Hsp70 in E. aedis, and it is likely that transcripts of this protein are somewhat overrepresented in the library.

Although no E. cuniculi ESTs have been published, Brosson et al. [12] investigated the proteins present in spores. Hsp70 constitutes a moderate amount of all protein present. Brosson and his colleagues classified all proteins based on their COG categories, and found that all "protein destination" proteins together comprise 21% of E. cuniculi's proteome. Intriguingly, Brosson et al.'s [12] experiments indicate that of the four copies of Hsp70 in E. cuniculi, the predominately expressed copy of Hsp70 in E. cuniculi is homologous to the highly represented transcript in E. aedis. In A. locustae, the most highly transcribed copy was most similar to the abundantly transcribed copy in E. aedis as well [17]. Therefore, it is likely that microsporidia employ similar primary mechanisms to ensure proper folding of proteins.

In other parasites and endosymbionts, such as Buchnera aphidicola, Hsp70 is also highly expressed [18] and may constitute up to 10% of the protein contained in the cell at any one time. In species that lead parasitic or endosymbiotic lifestyles, genetic drift and relaxed selection pressure frequently lead to an increased mutation rate. The need for Hsp70 in order for proteins to fold correctly seems to increase with both the size and number of mutations in the protein [16]. Although microsporidian genomes appear to have had little rearrangement, the nucleotide mutation rate seems to be high in this group of organisms [19, 20]. Microsporidia could, therefore, contain elevated levels of Hsp70 in order to allow folding of mutant proteins.

Transposable elements

One of the E. aedis ESTs closely matches the integrase domain of the Ty3/gypsy family of retrotransposons. Several of these elements were identified in a GSS of E. aedis [8] and a few other microsporidian species, but to the best of our knowledge, this is the first instance in which transcripts of any microsporidian retrotransposon have been found. Transcripts could be indicative of active transposition occurring in E. aedis' genome.

Ty3/gypsy retrotransposons exist in many organisms ranging from the microsporidia Spraguea lophii [14], Brachiola algerae [8], and Nosema bombycis [13] to Saccharomyces, Drosophila and Sorghum. Ty3 elements have been well characterized in budding yeast, and exist in 1–4 copies per genome, where they are transcribed by RNA polymerase III. Transcription typically occurs only in haploid cells in the presence of mating pheromones [21]. The N. bombycis genome contains at least 8 different retrotransposons in the Ty3/gypsy family, but unlike yeast, they are not exclusively located upstream of tRNAs [13]. Nearly all N. bombycis retrotransposons encode a polyprotein containing 5 domains, which exist in a defined order: Gag, protease, reverse transcriptase, RnaseH and integrase. As many of the sequences in the E. aedis library appear to be 5' truncated, it is possible that the other domains upstream of the integrase in the polyprotein are also present in genomic DNA. Indeed, the GSS project revealed sequences matching the reverse transcriptase domain [8]. Although the microsporidian Vittaforma corneae is also known to possess at least one transposable element [22], it belongs to a different family than those present in E. aedis - the L1 family present in humans.

The only completely sequenced microsporidian genome, that of E. cuniculi [2], is completely devoid of transposable elements. The existence of similar transposable elements (of the Ty3/gypsy family) in the distantly related S. lophii, N. bombycis, B. algerae and E. aedis (See Fig. 1) implies that this element may have been present in the genome of the ancestor of microsporidia. Therefore, the process of genome compaction that gave rise to the E. cuniculi genome likely involved purging transposable elements.

It has been suggested that transposable elements may act to reorganize genes within the genome. Xu et al. [13] compared regions of synteny between N. bombycis and E. cuniculi chromosomes, as selection appears to be acting to retain gene synteny among microsporidia, even if they are only distantly related [3]. In N. bombycis, transposable elements flank these syntenic regions [13]. If E. aedis' large genome is partially a product of transposable element proliferation, one would expect much less synteny between this species and other microsporidia. Perhaps future research will elucidate other roles that transposable elements have played in shaping microsporidian genomes, especially since the minute genome of E. cuniculi seems to lack them, while they are present in larger genomes.

The functions that these transposable elements perform in a given genome are cryptic at best, but evidence is emerging that they may be more than just simply parasitic DNA. Peaston et al. [23] recently discovered that a class of mouse retrotransposons appears to regulate gene expression in embryos.

Transcript structure

Transcripts in A. locustae typically contain more than one gene. These transcripts do not necessarily contain complete open reading frames for all genes and the genes are frequently in opposite orientations [9]. It is not known how many proteins are made from each transcript or whether this situation is typical for microsporidia, but recent work by Corradi et al. [10] suggests that E. cuniculi also possesses multi-gene transcripts.

Unlike A. locustae and E. cuniculi, E. aedis appears to transcribe very few multi-gene transcripts, if any at all. This is not unexpected, given that E. aedis genes appear to be separated by large intergenic spaces [8]. The E. aedis GSS could not rule out the possibility that local areas of compacted genes might exist [8]. Given the lack of multi-gene transcripts identified, this seems increasingly unlikely.

Also contrary to what is found in A. locustae, nearly all of E. aedis' transcripts encode proteins in a positive frame (<1% are in a negative frame, compared to 17% in A. locustae) [9]. Although antisense transcripts are used in many organisms (possibly also A. locustae) to suppress translation, it appears unlikely that this type of regulation occurs in E. aedis. Conversely, the large number of antisense transcripts in A. locustae may be due to a lack of transcriptional regulation resulting from genome compaction.

E. aedis' transcripts seem to start at multiple locations upstream of the start codon (5' UTR length is 180 bp on average) but terminate at the same position with a relatively short 3' UTR (51 bp on average) (See, for example, Figure 4). This is more in line with transcription in E. cuniculi and contrasts with the situation in A. locustae, where transcripts start directly upstream of the translation initiation site, but often terminate much farther downstream in the adjacent gene [10]. For comparison, the yeast S. cerevisiae contains much shorter 5' UTRs than 3' UTRs (15–75 and ~144 bp, respectively [24, 25]), a common trend seen in other fungi, plants and animals. The reason for this reversal is unknown, since 3' UTRs are ubiquitously used as translation regulators. It is likely that E. aedis lacks some of the translational control mechanisms present in other fungi, plants and animals [26].

Conclusion

This is the first examination of ESTs from a microsporidian containing a large genome. The extent of genome compaction in the microsporidian ancestor is not known, but the presence of genes in E. aedis that have not been found in other microsporidia suggests that extreme reduction and compaction occurred only in specific lineages. Surprisingly, E. aedis has a predicted uninucleate spore transcriptome that is highly similar to that of the distantly related A. locustae, although the two species have diverse life cycles and genome sizes.

Methods

Uninucleate E. aedis spores were grown and harvested from A. aegypti larvae as described previously [27].

E. aedis spores were lysed in Ambion's plant RNA isolation aid and lysis/binding solution from an Ambion RNAqueous kit using a bead beater operating at 2500 rpm for 6 minutes with glass beads. RNA was extracted from the resulting supernatant using the RNAqueous kit. A microquantity cDNA library was constructed by Marligen, using the pExpress-1 vector. 1307 clones with an average insert size of 1.5 kb were uni-directionally sequenced using an automated capillary sequencer. Sequences were manually edited and analyzed using Sequencher 4.2 software. Proteins encoded by the transcripts were identified via BLASTX [28] searches performed on the NCBI website (Genbank). Transcripts were identified as encoding a particular protein when BLASTX hits to Genbank proteins had e-values of 10-4 or lower. Transcripts were scored as "present in other microsporidia" when the best BLASTX hit was a gene present in other microsporidia or when the best hit was a gene that has a microsporidian homologue, and the homologue was identified in other microsporidia by BLASTing the E. aedis transcript against available microsporidian data. Putative E. aedis-specific genes are transcripts that contain open reading frames at least 100 base pairs in size and do not have any BLASTX hits with e-values lower than 10-3. In order to facilitate access to the EST sequences, they were uploaded and annotated by the dbEST website [17].

Declarations

Acknowledgements

EEG's work is supported by an NSERC Postgraduate Scholarship and research in the Fast lab is funded by an NSERC Discovery Grant to NMF.

Authors’ Affiliations

(1)
Department of Botany, University of British Columbia
(2)
Center for Medical, Agricultural and Veterinary Entomology, USDA/ARS

References

  1. Wittner M, Weiss LM: The Microsporidia and Microsporidosis. 1999, Washington, D. C.: ASM Press
  2. Katinka MD, Duprat S, Cornillot E, Metenier G, Thomarat F, Prensier G, Barbe V, Peyretaillade E, Brottier P, Wincker P, Delbac F, El Alaoui H, Peyret P, Saurin W, Gouy M, Weissenbach J, Vivares C: Genome sequence and gene compaction of the eukaryote parasite Encephalitozoon cuniculi. Nature. 2001, 414: 450-453. 10.1038/35106579.PubMedView Article
  3. Slamovits CH, Fast NM, Law JS, Keeling PJ: Genome compaction and stability in microsporidian intracellular parasites. Curr Biol. 2004, 14: 891-896. 10.1016/j.cub.2004.04.041.PubMedView Article
  4. Streett DA: Analysis of Nosema locustae (microsporidia:Nosematidae) chromosomal DNA with pulsed-field gel electrophoresis. J Invertebr Pathol. 1994, 63: 301-303. 10.1006/jipa.1994.1055.View Article
  5. Becnel JJ: Edhazardia aedis (microsporidia: Amblyosporidae) as a biological control agent of Aedes aegypti (diptera: Culicidae). Proc Vth Int Colloq Invertebr Pathol Microb Control: Adelaide, Australia. 1990, 56-60.
  6. Becnel JJ, Sprague V, Fukuda T, Hazard EI: Development of Edhazardia aedis (kudo, 1930 N. G., N. comb. (microsporidia: Amblyosporidae) in the mosquito Aedes aegypti (L.) (DIptera: Culicidae). J Protozool. 1989, 36: 119-130.PubMedView Article
  7. Johnson MA, Becnel JJ, Undeen AH: A new sporulation sequence in Edhazardia aedis (microsporidia: Culicosporidae), a parasite of the mosquito Aedes aegypti (diptera: Culicidae). J Invertebr Pathol. 1997, 70: 69-75. 10.1006/jipa.1997.4678.View Article
  8. Williams BAP, Lee RCH, Becnel JJ, Weiss LM, Fast NM, Keeling PJ: Genome sequence surveys of Brachiola algerae and Edhazardia aedis reveal microsporidia with low gene densities. BMC Genomics.
  9. Williams BA, Slamovits CH, Patron NJ, Fast NM, Keeling PJ: A high frequency of overlapping gene expression in compacted eukaryotic genomes. Proc Natl Acad Sci USA. 2005, 102: 10936-10941. 10.1073/pnas.0501321102.PubMedPubMed CentralView Article
  10. Corradi N, Gangaeva A, Keeling PJ: Comparative profiling of overlapping transcription in the compacted genomes of microsporidia Antonospora locustae and Encephalitozoon cuniculi. Genomics.
  11. Limpright VO, Fast NM: personal communication.
  12. Brosson D, Kuhn L, Delbac F, Garin J, Vivares CP, Texier C: Proteomic analysis of the eukaryotic parasite Encephalitozoon cuniculi (microsporidia): A reference map for proteins expressed in late sporogonial stages. Proteomics. 2006, 6: 3625-3635. 10.1002/pmic.200500796.PubMedView Article
  13. Xu J, Pan G, Fang L, Li J, Tian X, Li T, Zhou Z, Xiang Z: The varying microsporidian genome: Existence of long-terminal repeat retrotransposon in domesticated silkworm parasite Nosema bombycis. Int J Parasitol. 2006, 36:
  14. Hinkle G, Morrison HG, Sogin ML: Genes coding for reverse transcriptase, DNA-directed RNA polymerase, and chitin synthetase from the microsporidian Spraguea lophii. Biol Bull. 1997, 193: 250-251.PubMed
  15. Pandrea I, Mittleider D, Brindley PJ, Didier ES, Robertson DL: Phylogenetic relationships of methionine aminopeptidase 2 among Encephalitozoon species and genotypes of microsporidia. Mol Biochem Parasit. 2005, 140: 141-152. 10.1016/j.molbiopara.2004.12.006.View Article
  16. Mayer MP, Bukau B: Hsp70 chaperones: Cellular functions and molecular mechanism. Cell Mol Life Sci. 2005, 62: 670-684. 10.1007/s00018-004-4464-6.PubMedPubMed CentralView Article
  17. O'Brien E, Koski L, Zhang Y, Yang L, Wang E, Gray MW, Burger G, Lang BF: TBestDB: A taxinomically broad database of expressed sequence tags (ESTs). Nucleic Acids Res. 2007, 35: D445-451. 10.1093/nar/gkl770.PubMedPubMed CentralView Article
  18. Wilcox J, Dunbar HE, Wolfinger RD, Moran NA: Consequences of reductive evolution for gene expression in an obligate endosymbiont. Mol Microbiol. 2003, 48: 1491-1500. 10.1046/j.1365-2958.2003.03522.x.PubMedView Article
  19. Thomarat F, Vivares CP, Gouy M: Phylogenetic analysis of the complete genome sequence of Encephalitozoon cuniculi supports the fungal origin of microsporidia and reveals a high frequency of fast-evolving genes. J Mol Evol. 2004, 59: 780-791. 10.1007/s00239-004-2673-0.PubMedView Article
  20. Peer Van de Y, Ben Ali A, Meyer A: Microsporidia: Accumulating molecular evidence that a group of amitochondriate and suspectedly primitive eukaryotes are just curious fungi. Gene. 2000, 246: 1-8. 10.1016/S0378-1119(00)00063-9.PubMedView Article
  21. Kinsey PT, Sandmeyer SB: Ty3 transposes in mating populations of yeast: A novel transposition assay for Ty3. Genetics. 1995, 139: 81-94.PubMedPubMed Central
  22. Mittleider D, Green LC, Mann VH, Michael SF, Didier ES, Brindley PJ: Sequence survey of the genome of the opportunistic microsporidian pathogen, Vittaforma corneae. J Eukaryot Microbiol. 2002, 49: 393-401. 10.1111/j.1550-7408.2002.tb00218.x.PubMedView Article
  23. Peaston AE, Evsikov AV, Graber JH, de Vries WN, Holbrook AE, Solter D, Knowles BB: Retrotransposons regulate host genes in mouse oocytes and preimplantation embryos. Dev Cell. 2004, 7: 597-606. 10.1016/j.devcel.2004.09.004.PubMedView Article
  24. Zhang Z, Dietrich FS: Mapping of transcription start sites in Saccharomyces cerevisiae using 5' SAGE. Nucleic Acids Res. 2005, 33: 2838-2851. 10.1093/nar/gki583.PubMedPubMed CentralView Article
  25. Graber JH, Cantor CR, Mohr SC, Smith TF: Genomic detection of new yeast pre-mRNA 3'-end-processing signals. Nucleic Acids Res. 1999, 3: 888-894. 10.1093/nar/27.3.888.View Article
  26. Mazumder B, Seshadri V, Fox PL: Translational control by the 3'-UTR: The ends specify the means. Trends Biochem Sci. 2003, 2: 91-98. 10.1016/S0968-0004(03)00002-1.View Article
  27. Becnel JJ, Garcia JJ, Johnson MA: Edhazardia aedis (microspora: Culicosporidae) effects on the reproductive capacity of Aedes aegypti (diptera: Culicidae). J Med Entomol. 1995, 32: 549-553.PubMedView Article
  28. Altschul SF, Madden TL, Schaffer AA, Zhang J, Zhang Z, Miller W, Lipman DJ: Gapped BLAST and PSI-BLAST: A new generation of protein database search programs. Nucleic Acids Res. 1997, 25: 3389-3402. 10.1093/nar/25.17.3389.PubMedPubMed CentralView Article

Copyright

© Gill et al; licensee BioMed Central Ltd. 2008

This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://​creativecommons.​org/​licenses/​by/​2.​0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.

Advertisement