Characterization of Toxoplasma gondii subtelomeric-like regions: identification of a long-range compositional bias that is also associated with gene-poor regions
© Dalmasso et al.; licensee BioMed Central Ltd. 2014
Received: 14 August 2013
Accepted: 2 January 2014
Published: 13 January 2014
Chromosome ends are composed of telomeric repeats and subtelomeric regions, which are patchworks of genes interspersed with repeated elements. Although chromosome ends display similar arrangements in different species, their sequences are highly divergent. In addition, these regions display a particular nucleosomal composition and bind specific factors, therefore producing a special kind of heterochromatin. Using data from currently available draft genomes we have characterized these putative Telomeric Associated Sequences in Toxoplasma gondii.
An all-vs-all pairwise comparison of T. gondii assembled chromosomes revealed the presence of conserved regions of ∼30 Kb located near the ends of 9 of the 14 chromosomes of the genome of the ME49 strain. Sequence similarity among these regions is ∼70%, and they are also highly conserved in the GT1 and VEG strains. However, they are unique to Toxoplasma, with no detectable similarity in other Apicomplexan parasites. The internal structure of these sequences consists of 3 repetitive regions separated by high-complexity sequences without annotated genes, except for a gene from the Toxoplasma Specific Family. ChIP-qPCR experiments showed that nucleosomes associated with these sequences are enriched in histone H4 monomethylated at K20 (H4K20me1) and the histone variant H2A.X, suggesting that they are silenced sequences (heterochromatin). A detailed characterization of the base composition of these sequences led us to identify a strong long-range compositional bias, which was similar to that observed in other silenced genomic fragments, such as those containing centromeric sequences, and was negatively correlated with gene density.
We identified and characterized a region present in most Toxoplasma assembled chromosomes. Based on their location, sequence features, and nucleosomal markers we propose that these might be part of subtelomeric regions of T. gondii. The identified regions display a unique trinucleotide compositional bias, which is shared (despite the lack of any detectable sequence similarity) with other silenced sequences, such as those making up the chromosome centromeres. We also identified other genomic regions with this compositional bias (but no detectable sequence similarity) that might be functionally similar.
Keywords: Toxoplasma gondii; Telomeric Associated Sites (TAS); Subtelomeric heterochromatin; Trinucleotide compositional bias
Toxoplasma gondii is a widespread obligate intracellular protozoan parasite and a member of the phylum Apicomplexa. T. gondii has been recognized as an important pathogen for humans, particularly during pregnancy and for immunocompromised patients. Toxoplasmosis has also been documented as an economically important disease that has considerable impact on the livestock industry [1, 2]. For these reasons, T. gondii was one of the first protozoan parasites chosen for a genome-sequencing project. More recently, the genomes of other strains of T. gondii were sequenced.
T. gondii has a haploid genome of ∼63 Mb, which is organized in 14 chromosomes that are well conserved in length and number among different strains [4, 5]. The genomic DNA sequence and the way it is organized in the nucleus are fundamental for the correct regulation of cell processes. The level of chromatin condensation, its nucleosome composition and positioning, together with the binding of non-histone nuclear proteins, generate the different states of chromatin. Euchromatin is a gene-rich decondensed chromatin, where transcription is facilitated, whereas heterochromatin is a gene-poor condensed chromatin, refractory to transcription. A general and very simplified rule is that euchromatin is active chromatin where nucleosomes are enriched in acetylated histones H3 and H4 (H3ac/H4ac) and in H3 tri-methylated at lysine 4 (H3K4me3), whereas constitutive heterochromatin nucleosomes are enriched in histone H3 di- and tri-methylated at lysine 9 (H3K9me2/3), which binds Heterochromatin Protein 1 (HP1), forming a compact chromatin. These epigenetic marks are conserved from yeast to humans, and T. gondii is not an exception [7–10]. Also, histones H2A and H2B, which are less conserved than H3 and H4, have several variants that contribute differently to the chromatin state. We have previously characterized histones H2A and H2B in T. gondii, and found a histone H2B variant (H2Bv) which is only present in protozoans, recently renamed H2B.Z, and two H2A variants: H2A.X and H2A.Z. We have also observed that H2A.Z and H2B.Z are enriched in active promoters, whereas H2A.X is associated with silenced promoters and heterochromatin. Thus, these modified histones and histone variants can be used in Toxoplasma as epigenetic markers of euchromatin and heterochromatin.
In constitutive heterochromatin, two specialized domains can be readily identified: the centromere and the telomeres. The centromere is a genetic locus where the spindle fibers attach to the chromosomes, forming the kinetochore, and is required for proper chromosome segregation. Centromeric DNA sequences are not conserved among species, and they differ even between chromosomes in the same organism. However, the centromeric chromatin in all organisms studied to date contains the histone H3 variant CENP-A [15, 16]. The telomeres, on the other hand, are located at the chromosome ends and contain the telomeric repeats and the subtelomeric region, sometimes called the Telomeric Associated Sequence (or TAS). Basically, telomeres are composed of a tract of simple repeats (like TTAGGG in humans and trypanosomatids), followed by the subtelomere, which comprises repetitive elements and, in some cases, subtelomeric genes [17, 18]. In general, these subtelomeric genes are associated with different stress responses, and their expression is regulated by the Telomeric Position Effect. Telomeres and subtelomeres replicate late in the S-phase of the cell cycle [19, 20]. The repetitive elements seem to be involved in blocking replication initiation in the subtelomeric regions [20–22], and in favoring their nuclear-periphery localization [20, 23]. Although the features of chromosome ends are highly conserved, the DNA sequences are specific to each species. TASs, as well as centromeres, display a specialized type of chromatin, with a particular nucleosome composition and a set of associated proteins responsible for its highly condensed state [18, 24]. In Toxoplasma, only a few heterochromatin-associated proteins have been described to date, including the histone markers mentioned above [8, 14, 25], and the recently described TgChromo1, which has a peri-centromeric localization. Concerning constitutive heterochromatin, only centromeres have been identified. Centromeric sequences were determined as genomic regions enriched in cenH3 (the Toxoplasma CENP-A homolog). Thus, T. gondii heterochromatin characteristics, composition, regulation, and distribution across the genome are still to be discovered.
In many protozoan pathogens, such as Trypanosoma brucei, Leishmania major and Plasmodium falciparum, and in some yeasts, the subtelomeric genes are contingency genes. These are sets of genes responsible for pathogen diversity and for the clonal phenotype switches often associated with the parasite’s escape from the host immune system. In the Apicomplexan parasite P. falciparum, chromosome ends consist of the telomeric repeat (T(G/A)AAGGG)n followed by telomere-associated repetitive elements (TARE1-6) organized in six blocks and flanked by members of the var, stevor and rif multi-gene families, which are responsible for antigenic variation and cytoadhesion [27–29]. TASs in this parasite also present a specialized chromatin, different from the rest of the genome, rich in nucleosomes containing H3K9me3 and the histone deacetylase PfSir2 [30, 31].
Despite the importance of telomeres and subtelomeres, and the growing evidence of their effect on the expression of surrounding genes and on the modulation of DNA replication timing, telomeres have not yet been analyzed in Toxoplasma. This is probably because clustered contingency genes at chromosome ends have not been described in this apicomplexan parasite, and/or because the chromosome ends are not completely assembled in the currently available genomes. In this study we describe putative subtelomeric sequences in T. gondii assembled chromosomes, together with their nucleosome and nucleotide composition.
Several Toxoplasma chromosome ends display significant sequence similarity
The putative Toxoplasma Telomeric Associated Sequences have a conserved structure
Localization of putative TgTASL regions (table body not recovered; surviving cell fragments: 1–2528 (2527) a; 1–6928 (6927) b; 1–7727 (7726) c)
We also inspected annotations and functional genomics data available at the ToxoDB resource. Except for one gene localized at the end of every Fragment C (see Figure 2B, and Additional file 3A), the lack of transcript expression and protein expression data mapped to these regions suggests that TgTASL are mostly silent and/or gene-free DNA. The single gene located at the end of Fragment C is a hypothetical protein annotated as a member of the ‘Toxoplasma Specific Family’ (TSF). Interestingly, the chromosomal location of TSF family members is restricted to the TgTASL region. Based on this observation, we could identify two extra members of this family: TGME49_098060 as TSF11, and TGME49_100990 as TSF12 (Figure 2B, Additional file 3C). All these findings confirm that TgTAS-like regions have a conserved canonical structure.
All TSF genes are apparently transcribed; they have associated EST and RNA-seq data, as well as H3K9ac and H3K4me3 peaks that indicate the presence of an active promoter. In addition, the transcript levels of these genes seem to vary among parasite strains (type I, II, and III), stages (tachyzoite, bradyzoite, sporozoite), and/or cell cycle (Additional file 3C, expression evidence at ToxoDB v7.3). However, only TgTSF8 has evidence of expression at the protein level (mass spectrometry evidence at ToxoDB v7.3). This behavior is similar to the one described for subtelomeric genes. Further studies should be performed to elucidate the role of TSF members.
Comparing TgTASL chromosome coordinates (Table 1) and their DNA strand location (Figure 2), it is evident that these genomic elements have a defined orientation. TgTARE1 is located at the beginning of the TgTASL, whereas the TSF gene is at the end of the element. The element is in the forward strand when it is located at the beginning of the chromosome, and in the reverse strand when it is located at the end of it. Therefore, TgTARE1 will be closer to the telomere and TSF toward the centromere. This is true for all TgTASL except for TgTASL_IV and TgTASL_IX, which are in an inverted orientation: they are both at the end of the chromosome but in the forward strand. Interestingly, in the current genome assembly (ToxoDB v9.0) TgTASL_IX is now at the end of chromosome XII in the reverse strand, as observed for the rest of the TgTASL sequences (Additional file 3G); hence, TgTASL_IV is the only TgTAS-like region with a different orientation.
Subtelomeres are expected to start next to the tract of telomeric repeats. In Toxoplasma the TgTASL are located at the ends of chromosomal assemblies, but their proximity to the telomeric repeats is hard to determine because the chromosome ends are not completely assembled. In Toxoplasma the telomeric repeat seems to be TTTAGGG. Tracts of this repeat are assembled only at the end of chromosomes Ia, X and XI, and at the beginning of chromosomes III and XI. Only one TgTASL, TgTASL_XI, is next to a telomeric repeat (data not shown). In other cases (TgTASL_Ia, _X, and _XII), where telomeric repeats are not assembled, the TgTASL are the first or last sequence region in the assembled chromosomes (Table 1, Additional file 3F). Except for TgTASL_IV, all TgTAS-like regions are within the final 6% of each chromosome (Additional file 3F). Therefore, in general, these regions are either next to or in close proximity to telomeres. Recently, more repeat tracts of TTTAGGG have been assembled (ToxoDB v9.0). They can be observed at both ends of chromosomes Ib, III, XI, IV and VIIa, at the beginning of chromosomes VI and VIIb, and at the end of chromosomes Ia, IX and XII. They appear in the forward strand when they are the first chromosomal sequence element, and in the reverse strand when they are the last one. In this assembly, there is another TgTASL close to a telomeric repeat, TgTASL_III-a. There are ∼50 Kb between the telomeric repeat and this TgTASL, with no annotated genes (Genome Browser in ToxoDB v9.0). Another curious observation is that the telomeric repeat at the end of chromosome IV is not so close to the end, and it is in the forward strand. Interestingly, it is ∼60 Kb upstream of TgTASL_IV, which is the only TgTASL at the end of a chromosome in the forward strand. In addition, the genomic sequence between the telomeric repeat and this TgTASL does not contain any annotated genes, as occurs with TgTASL_III-a.
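The search for assembled telomeric tracts described above can be illustrated with a minimal Python sketch. It is not the procedure used in this study; it assumes that a tract is at least three tandem copies of TTTAGGG (the `min_copies` threshold is an arbitrary choice), and it labels matches of the unit as written with '+' and matches of its reverse complement with '-'. The function names are hypothetical.

```python
import re

TELOMERE_UNIT = "TTTAGGG"  # repeat unit reported in the text for Toxoplasma


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def find_telomeric_tracts(seq, unit=TELOMERE_UNIT, min_copies=3):
    """Find tandem tracts of `unit` ('+') or of its reverse complement ('-').

    Returns a sorted list of (start, end, strand, copies), using 0-based,
    end-exclusive coordinates. `min_copies` is an illustrative threshold.
    """
    seq = seq.upper()
    tracts = []
    for strand, motif in (("+", unit), ("-", reverse_complement(unit))):
        pattern = re.compile("(?:%s){%d,}" % (re.escape(motif), min_copies))
        for match in pattern.finditer(seq):
            copies = (match.end() - match.start()) // len(motif)
            tracts.append((match.start(), match.end(), strand, copies))
    return sorted(tracts)


# Toy sequence: a '+' tract at the start and a '-' tract at the end.
toy = "TTTAGGG" * 4 + "ACGTACGTAC" + "CCCTAAA" * 3
print(find_telomeric_tracts(toy))
# [(0, 28, '+', 4), (38, 59, '-', 3)]
```

Comparing the reported coordinates with the sequence length shows whether a tract is the first or last element of an assembled chromosome.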
TgTAS-like regions present a special nucleosomal composition, enriched in heterochromatin markers
TgTAS-like regions are conserved in three T. gondii evolutionary lineages
Initial studies showed that the majority of T. gondii strains isolated in Europe and North America belong to three distinct clonal haplotypes, called types I, II, and III. In this section, the analysis performed with the ME49 (type II) genome data was extended to the genomes of the GT1 (type I) and VEG (type III) strains (data available in ToxoDB). Using the same BLAST strategy to detect similarity among chromosomes, all the TgTAS-like elements detected in ME49 were also detected in these two lineages (Table 1). In addition, we compared all identified TgTASL among strains. In all cases, the TgTASL regions were syntenic (Additional file 4), and the pairwise identity between each TgTASL and its syntenic partners in other strains was >96%. Furthermore, a new TgTASL region was found in chromosome VI of the VEG and GT1 strains (Table 1), which was not previously detected in the ME49 chromosomal assemblies. We took advantage of the high identity between TAS-like regions in the three lineages, and used TgTASL_VI of GT1 and VEG to perform BLAST searches in ME49 contigs that were left out of the chromosomal assembly. Sequences corresponding to a putative TgTASL_VI could be identified in the ME49 contigs DS984876, DS984831, DS984825 (Table 1). The presence of additional sequences similar to TgTAS-like regions was examined in contigs and scaffolds of the three strains: ME49, GT1 and VEG. Several sequences were retrieved, including contigs/scaffolds containing a complete TgTASL (Additional file 3E). These could represent assembly artifacts, or bona fide TgTASL regions that could not be accommodated properly in the current assembly. In total, we detected twelve TgTASL regions distributed in 10 chromosomes (Table 1).
Recently, a new Toxoplasma ME49 genome assembly was made available in ToxoDB (version 9.0 release, September 2013). Although most of the TgTASL regions are conserved in the two assemblies, some differences could be observed. In the newest version there are two extra TgTASL regions, one at the opposite end of chromosome II, and one in chromosome X, upstream of TgTASL_X-a (Additional file 3F), in addition to the relocation of TgTASL_IX. Together with the sequences found in scaffolds, this suggests that there may be additional TgTAS-like sequences. The draft nature of the T. gondii genome currently prevents us from knowing if both chromosome ends are covered with these TAS-like sequences.
Finally, we also investigated the conservation of these TgTAS-like regions in other coccidian genomes available in ToxoDB (Neospora caninum and Eimeria tenella). BLAST searches using TgTASL sequences as queries only revealed matching sequences with low similarity in these genomes. Moreover, TgTASL sequences do not present detectable similarity to other Apicomplexan genomes available in EuPathDB. By using the same intra-species all-vs-all chromosomal comparisons used before, we could identify two putative N. caninum TAS-like regions in Nc chromosomes VIIa and VIIb, as well as other related DNA fragments (Additional file 3D, Additional file 5). The presence of a conserved structure of repetitive elements in these NcTASL regions was evaluated by pairwise comparisons using Dotter. Dotplots with representative results are shown in Additional file 5. In this analysis it is clear that NcTASL also present a conserved pattern, consisting of two blocks of repeats separated by non-repetitive DNA (first two dotplots of the first two rows in Additional file 5). However, the similarity between NcTASL and TgTASL is extremely low (three last panels of the first two rows in Additional file 5). Consequently, TgTAS-like regions are composed of sequences that are unique to Toxoplasma.
TgTAS-like show a unique bias in trinucleotide composition
Because TgTAS-like regions are depleted of genes and are localized near chromosome ends, we reasoned that they might present a different base composition from the rest of the genome. We thus investigated a number of ways to identify compositional bias in these sequences. For this, we analyzed the nucleotide composition of these regions using different metrics, which are all independent of linear sequence similarity. The trinucleotide composition of TgTAS-like sequences was used in our case to identify diagnostic biases. Our measure of compositional bias simultaneously considered all 64 trinucleotides across a given sequence using a multivariate statistical technique (Correspondence Analysis, see Methods).
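As an illustration of the input to such an analysis, the following Python sketch builds a table of trinucleotide counts per genomic window. It is a sketch under stated assumptions, not the pipeline used here: trinucleotides are counted in overlapping positions on a single strand, windows do not overlap unless `step` is set, and trinucleotides containing ambiguous bases are skipped. The function names are hypothetical.

```python
from itertools import product

TRINUCS = ["".join(p) for p in product("ACGT", repeat=3)]  # the 64 columns
INDEX = {t: i for i, t in enumerate(TRINUCS)}


def trinucleotide_counts(seq):
    """Count overlapping trinucleotides on one strand; skip any with non-ACGT bases."""
    counts = [0] * 64
    seq = seq.upper()
    for i in range(len(seq) - 2):
        j = INDEX.get(seq[i:i + 3])
        if j is not None:
            counts[j] += 1
    return counts


def window_count_table(seq, window=40000, step=None):
    """Split `seq` into windows; return (starts, rows), one 64-count row per window."""
    step = step or window  # non-overlapping windows by default (an assumption)
    starts, rows = [], []
    for start in range(0, max(len(seq) - window, 0) + 1, step):
        starts.append(start)
        rows.append(trinucleotide_counts(seq[start:start + window]))
    return starts, rows
```

Each row of the resulting table corresponds to one genomic fragment and each column to one trinucleotide, which is the layout that a correspondence analysis takes as input.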
Correspondence Analysis (CA) is a powerful statistical technique to explore high-dimensional categorical data. The trinucleotide composition of a DNA sequence is one such dataset, with each dimension/column measuring the count of a different trinucleotide (there are 64 trinucleotides/dimensions) in different genomic fragments (rows). The aim of a correspondence analysis is to identify trends in the variation of the data and associations among data points. CA has been successfully applied to the study of synonymous codon usage bias associated with high levels of expression [42–45]. Using this technique, it is possible to summarize and explore most of the 64-dimensional information using only a few representative axes. This allows visual representations in 2-dimensional plots (also known as “CA maps”) that capture the most influential trends that define the compositional bias.
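To make the technique concrete, a simple correspondence analysis can be sketched with NumPy as a singular value decomposition of the standardized residuals of the count table. This is the textbook formulation, offered as a minimal sketch; it is not necessarily the software or the exact settings used in this work (see Methods). The function name is hypothetical, and the sketch assumes that every row has a non-zero total and that the table is not perfectly independent.

```python
import numpy as np


def correspondence_analysis(counts):
    """Simple CA of a rows x columns count table.

    Returns (row_coords, col_coords, explained): principal coordinates of
    rows and columns, and the fraction of total inertia carried by each axis.
    """
    N = np.asarray(counts, dtype=float)
    N = N[:, N.sum(axis=0) > 0]            # drop empty columns (avoid division by zero)
    P = N / N.sum()                        # correspondence matrix
    r = P.sum(axis=1)                      # row masses
    c = P.sum(axis=0)                      # column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))   # standardized residuals
    U, sv, Vt = np.linalg.svd(S, full_matrices=False)
    row_coords = (U * sv) / np.sqrt(r)[:, None]
    col_coords = (Vt.T * sv) / np.sqrt(c)[:, None]
    explained = sv ** 2 / np.sum(sv ** 2)
    return row_coords, col_coords, explained
```

Plotting the first two columns of `row_coords` gives a 2-dimensional CA map of the genomic fragments, and `explained[0]` is the share of the variability summarized by the first axis.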
In the CA analysis mentioned above (window size = 40 Kb), the major compositional trend, represented in the 1st principal coordinate (horizontal axis of Figure 4), explains 45% of the information on the variability of trinucleotide frequencies. This means that CA is effectively summarizing high-dimensional data, due to the presence of strong trends (biases). The second trend, represented as the 2nd principal coordinate (vertical axis in Figure 4), which is independent of the 1st one (CA produces orthogonal axes by design), is led by one trinucleotide pair (ATA, TAT). Upon further inspection, we found that this trend is due to the presence of sequences with long TpA dinucleotide repeats, which are not highly represented in TgTAS-like regions. The relative contributions of each trinucleotide to the 1st and 2nd Principal Coordinates are listed in Additional file 3H.
The unique trinucleotide bias of TgTAS-like sequences is shared with other gene-poor genomic regions in the T. gondii genome
Telomeric and subtelomeric heterochromatin differs from standard heterochromatin in several aspects, including its sequence, nucleosome composition, and the nature of its binding factors. Even though these three components vary between species, telomeric function is conserved. Here we have identified and characterized a number of Toxoplasma TAS-like regions, which are localized near several chromosome ends and are specific to T. gondii. They have features similar to those observed in other subtelomeric sequences, such as blocks of repeats, stress-associated genes, and a heterochromatin-like structure, based on the presence of informative histones and histone post-translational modifications.
Even though the T. gondii genome was recently further sequenced and re-assembled, the subtelomeres of this important parasite have not been identified or characterized. Telomeres and subtelomeric sequences are notoriously hard to assemble when using shotgun sequencing. This is because the telomeric repeat is identical in all chromosomes and therefore cannot be used to distinguish one telomere from another. Furthermore, they are not expected to be well represented in the Bacterial Artificial Chromosome (BAC) libraries used to guide the scaffolding and chromosomal assembly of draft genomes, due to the size-selection of recombinant DNA clones and the natural genomic instability of these regions. Finally, they contain large segments of similarity that are rich in repetitive sequences, which are a frequent cause of assembly errors. Using the current chromosomal assemblies, we mapped TgTAS-like sequences to a conserved region of 30 Kb located at the end of most chromosomes. We defined the extent of these regions based on the consensus TgTAS-like structure present in most chromosomes. However, it is reasonable to consider that the actual size of each TgTASL may be larger. In this regard, the fact that the trinucleotide compositional bias peaks at 40–50 Kb provides an independent estimate for the extent of these regions.
In all cases, the identified TgTASL regions displayed a similar structural pattern, consisting of three telomeric associated repeats (TgTARE 1-3), with a single member of the Toxoplasma Specific Family, separated by silenced DNA. These TgTASL regions have a clear chromosomal orientation, with TgTARE1 towards the telomere and the TSF gene towards the centromere, similar to P. falciparum TASs. Nevertheless, several exceptions were detected in the ME49 genome that still remain in the new genome assembly (ToxoDB 9.0). In chromosomes III and X there are two consecutive TgTASL regions, and in chromosome IV, TgTASL_IV is inverted (its TgTARE1 is towards the centromere and the TSF gene towards the telomere). These unusual features, which so far are unique to Toxoplasma genomes, as well as the presence of TAS-like regions at only one end of the chromosomes, could be explained by simple chromosomal rearrangements, probably with functional consequences. However, they could also be explained by artifacts of the genome sequencing and/or assembly. Several observations support this last possibility: changes in the chromosome location and number of TgTAS-like regions between two different genome assemblies (Additional file 3F); the relocation of TgTASL_IX, which is now located at the end of chromosome XII with the expected orientation; and the absence of BAC-end sequences connecting TgTASL regions with the rest of the assembled chromosome, as in the case of TgTASL_IV (ToxoDB Genome Browser; see also Figure 4 in Khan et al.). Therefore, further studies should be performed to shed light on these issues. Future genome assemblies should also pay special attention to the quality of the assemblies near chromosome ends.
From human to yeast, the shelterin protein complex binds DNA telomeric repeats, and subtelomeric sequences are silenced, as evidenced by the presence of telltale heterochromatin epigenetic marks such as histones H3K9me3 and H4K20me3, methylated DNA, and heterochromatin protein 1 (HP1) [46, 47]. Telomeric and subtelomeric sequences are transcribed into a telomeric repeat-containing RNA (TERRA), which regulates telomere function. Some telomeric binding factors have been described in P. falciparum, including PfSir2a, a sirtuin that may be important to maintain DNA as heterochromatin [31, 48]; PfOrc1, which binds to telomeres and TARE-3, and may be involved in forming the T-loop structure that prevents fusion between chromosome ends; and PfSIP2 (an ApiAP2 family member), which binds to TARE2, TARE3 and the upsB promoter of a var gene, and colocalizes with PfHP1, which binds H3K9me3, suggesting that both proteins participate in the assembly of telomeric heterochromatin. With the exception of TgChromo1, none of these proteins have been characterized to date in Toxoplasma, despite the fact that all of them have orthologs. TgChromo1 is the HP1 chromobox homologue. A number of studies focused on this protein provide additional evidence that TgTAS-like regions are embedded in a typical heterochromatin environment. Although TgChromo1 is mainly associated with pericentromeric heterochromatin, IFA-FISH experiments have shown that TgChromo1 is also in close proximity to repeats present at the end of chromosomes Ia and IX at the nuclear periphery. Interestingly, the chromosome Ia probe used by Gissot et al. corresponds to TgTARE1, suggesting that TgTASL elements could be located at the nuclear periphery. In a separate set of ChIP-qPCR experiments, Braun et al. detected an enrichment of H4K20me1 and H3K9me1 heterochromatin marks associated with the Sat350 repetitive element (which would be TgTARE1). We also found an enrichment of H4K20me1 and H2A.X (another silencing-associated histone in T. gondii) in fragment A from three TgTASL, and also in TgIRE. In addition, Braun et al. uncovered the presence of repeat-associated siRNAs in Toxoplasma, which map to Sat350 (TgTARE1) and Sat529a. It remains to be seen if these satellite-associated RNAs are involved in heterochromatin formation and/or in the regulation of longer telomeric noncoding RNA resembling the TERRA RNA from other organisms.
Genes that are close to subtelomeric regions have been implicated in a wide array of stress responses or niche adaptive roles. This is also true for a number of important human pathogens, including T. brucei, T. cruzi, L. major, and P. falciparum, where so-called contingency gene families are embedded within or located just next to the subtelomeric repeats. Toxoplasma contains several distinct, coccidian-specific multicopy gene families throughout its genome, including those that encode the SRS, ROPK, and SUSA proteins [51–53]. Recently, a bioinformatics study showed that 60 out of the 144 ME49 SRS genes (42%) are located in subtelomeric sites. However, none of these genes are near TgTAS-like regions in the current genome assembly. We only detected one Toxoplasma Specific Family (TSF) member in each of these regions. These TSF members were also found in the VEG and GT1 T. gondii strains. Judging by the sequence, length, and number of TM domains, this family presents a high diversity among its members (Additional file 7C). Moreover, such diversity extends to their expression profiles, as the apparent expression of each TSF gene varies among different strains, life cycle stages, and/or throughout the cell cycle (Additional file 3C). Notwithstanding this, there is mass spectrometry evidence for only one TSF member, according to the experimental data available in ToxoDB version 7.3. Although the function of TSF genes is unknown, their expression profiles resemble those of tightly regulated genes associated with parasite adaptation to the environment (differences between species), or virulence (differences between strains). Hence, they could be proposed as Toxoplasma stress and/or contingency response genes. Notably, recent RNAseq experiments revealed some discrepancies between transcripts and annotated genes, and supported the existence of new genes and/or putative non-coding RNAs (see the RNAseq evidence in ToxoDB v9.0). Some of these RNAseq transcripts can be detected within several TgTASL, suggesting there might be more genes present in these regions.
The trinucleotide composition of TgTASL DNA displays a strong bias when analyzing genomic fragments greater than ∼5 Kb. However, this bias is not evident below this size, indicating that this is a long-range bias. The major compositional trend is associated with the relatively high frequency of stop codons in these regions, in correlation with the absence of genes. However, the trinucleotides TAA, TAG and TGA (together with their reverse complements), which are read as stop codons within coding sequences, only contribute 17.4% of this major trend. Other trinucleotides not particularly associated with non-coding sequences, such as the pairs CTC/GAG, AGA/TCT, CGC/GCG, AAT/ATT and GTA/TAC, have larger contributions to this major trend (Additional file 3H). For example, the trinucleotide pairs AAT/ATT (enriched) and AGA/TCT (depleted) together represent 19.6% of the bias in the major trend, and show a clustering pattern of TgTAS-like and centromeric sequences similar to the one observed for the stop codons (Additional file 8).
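Percentages of this kind can be obtained by adding up the contributions of individual columns to a CA axis. In the textbook formulation of CA, the contribution of a column to an axis equals the squared entry of the corresponding right singular vector of the standardized residuals, so contributions sum to 1 over all columns. The sketch below illustrates that calculation under the assumption that this standard definition applies; the function names are hypothetical, and the table is assumed to have no empty rows or columns.

```python
import numpy as np


def column_contributions(counts, axis=0):
    """Fraction of one CA axis's inertia contributed by each column (sums to 1)."""
    N = np.asarray(counts, dtype=float)
    P = N / N.sum()
    r, c = P.sum(axis=1), P.sum(axis=0)                  # row and column masses
    S = (P - np.outer(r, c)) / np.sqrt(np.outer(r, c))   # standardized residuals
    _, _, Vt = np.linalg.svd(S, full_matrices=False)
    return Vt[axis] ** 2    # squared right singular vector


def group_share(contrib, labels, group):
    """Sum the contributions of the columns whose label belongs to `group`."""
    return float(sum(x for x, lab in zip(contrib, labels) if lab in group))
```

With the 64 trinucleotide labels as `labels`, a call such as `group_share(contrib, labels, {"TAA", "TTA", "TAG", "CTA", "TGA", "TCA"})` would give the combined share of the stop codons and their reverse complements on the chosen axis.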
Interestingly, the second most relevant trend at ∼40 Kb is led by a single trinucleotide pair, ATA/TAT. These trinucleotides also represent the major trend (1PC) for shorter fragments (1 Kb). Although ATA/TAT and TTA/TAA are both AT-rich trinucleotide pairs, they define two independent major trends. The trinucleotides ATA/TAT (but not TTA/TAA), which also govern the compositional bias at shorter window sizes, are part of microsatellite-like sequences (TpA dinucleotide repeats). These TpA tracts have the lowest base stacking energy, and therefore the greatest flexibility for unwinding the DNA (as in the TATA box); consequently, they provide interesting functional features to these genomic territories in T. gondii.
All observed trinucleotide frequencies were approximately identical to the frequencies of their reverse-complementary trinucleotides in the T. gondii genome (Figure 4). These correlations between a trinucleotide and its reverse complement were evident even at windows of size 5 Kb, and strongly increase for larger fragments. We constructed these CA maps with trinucleotide counts obtained from a single strand of each chromosome to show an important consequence of this fact: TgTAS-like regions located in the positive strand cluster together with negative-strand TAS (i.e. their trinucleotide compositions are highly similar), even though their sequences are completely different (they are reverse-complements). This forward/reverse-complement symmetry of trinucleotide frequencies is part of a more general property: Chargaff’s second parity rule. This rule states that within a single strand of double-stranded DNA, each oligonucleotide occurs with approximately the same frequency as its reverse-complement. This observation has been verified for “sufficiently long” (>100 Kb) genomic sequences, in a wide range of sequenced genomes of bacteria and eukaryotes. Here we confirm that this is also true within the TgTAS-like regions of ∼40 Kb in T. gondii.
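The parity check described above can be sketched in a few lines of Python: count every trinucleotide on a single strand and pair it with the count of its reverse complement. This is an illustrative sketch with hypothetical function names, not the code used for Figure 4; k-mers containing ambiguous bases are ignored.

```python
from collections import Counter


def reverse_complement(seq):
    """Return the reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def parity_pairs(seq, k=3):
    """Pair single-strand k-mer counts with those of their reverse complements.

    Returns {kmer: (count, count_of_reverse_complement)}, listing each
    observed pair once.
    """
    seq = seq.upper()
    counts = Counter(seq[i:i + k] for i in range(len(seq) - k + 1))
    pairs = {}
    for kmer in sorted(counts):
        if set(kmer) <= set("ACGT"):
            rc = reverse_complement(kmer)
            if rc not in pairs:          # avoid listing the same pair twice
                pairs[kmer] = (counts[kmer], counts.get(rc, 0))
    return pairs
```

Under Chargaff’s second parity rule the two numbers in each pair should be close for a sufficiently long window; they are exactly equal for a sequence that is its own reverse complement, which provides a convenient sanity check.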
Finally, it is also noteworthy that the major trend (1PC) in trinucleotide composition allows the separation of TgTAS-like regions from the rest of the genome. This essentially means that TgTASL represent some of the most peculiar regions of the genome when viewed at the level of trinucleotide composition. Other genomic regions that share the same compositional bias are centromeric fragments, most of the chromosome ends in the current assembly, as well as other internal chromosomal regions (see Additional file 7). We propose that this compositional bias may be associated with a more general property of these regions, such as their preferred chromatin state. However, further studies will be needed to elucidate the functional features shared by all these regions in T. gondii.
In this work we have identified and described a number of chromosomal regions, that present a special long-range compositional bias, are gene-poor, and enriched in repeats and in heterochromatin-associated histones. These genomic domains are usually present in subtelomeric and subtelomeric-like regions. In addition, this long range compositional bias was also detected in other Toxoplasma gene-poor genomic regions that do not share any sequence similarity among them. Future studies are planned to further study the protein(s) associated to these TgTAS-like regions as well as the role of these elements in Toxoplasma biology.
Sequence resource, analysis and representation
The genome sequence, coding sequences, and gene annotations were downloaded from ToxoDB version 7.3 (www.toxodb.org). Some results were compared with the current version, ToxoDB v9.0. The P. falciparum 3D7 genome assembly was downloaded from PlasmoDB v8.2 (www.plasmodb.org). Comparative genomics analyses were performed using BLAST version 2.2.25, obtained from the NCBI, and WebACT. BLAST results were analyzed and visualized with the Artemis Comparison Tool (ACT, http://www.sanger.ac.uk/Software/ACT). Circular chromosome layouts were made with Circos (http://circos.ca/). The size and repeats of TgTASL regions were determined by visual inspection of chromosomal dotplots generated with Dotter (http://sonnhammer.sbc.su.se/Dotter.html). To facilitate the pairwise comparison of all chromosomes, we produced a dotplot from a multi-FASTA file containing all chromosome fragments encompassing TgTASL, whose size was estimated using ACT.
ChIP and qPCR were performed as described in Dalmasso et al. Briefly, 1 × 10^7 tachyzoites were used per reaction. Ten percent of the total lysate was used as input. The antibodies used for immunoprecipitation were: α-H2A.X, α-H2A.Z, and α-H2Bv polyclonal antibodies previously generated in the laboratory [12, 14]; α-H3ac (Upstate 17-615); and α-H4K20me1 (Abcam ab9051). The specific primers used to amplify the TgTAS-like regions are listed in Additional file 3G. They were designed to specifically amplify each TgTASL, after alignment of the TgTASL sequences. The Sag1 promoter was used to represent a transcriptionally active chromatin region. Three independent experiments were performed.
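The quantification method is not detailed in this section. As a purely illustrative sketch, the widely used percent-input calculation is shown below for a ChIP in which 10% of the lysate is kept as input. That this study used this calculation is an assumption, the function name `percent_input` is hypothetical, and 100% amplification efficiency (a doubling per cycle) is assumed.

```python
import math


def percent_input(ct_ip, ct_input, input_fraction=0.10):
    """Percent-input estimate for one ChIP-qPCR measurement.

    ct_ip          -- Ct of the immunoprecipitated sample
    ct_input       -- Ct of the input sample
    input_fraction -- fraction of the lysate kept as input (0.10 = 10%)

    Assumes perfect amplification efficiency (2-fold per cycle).
    """
    # Scale the input Ct to what 100% of the lysate would have given.
    adjusted_input_ct = ct_input - math.log2(1.0 / input_fraction)
    return 100.0 * 2.0 ** (adjusted_input_ct - ct_ip)


if __name__ == "__main__":
    # Hypothetical Ct values, for illustration only.
    print(percent_input(ct_ip=28.0, ct_input=25.0))
```

With these hypothetical Ct values the immunoprecipitated material corresponds to roughly 1.25% of the starting chromatin; comparing such values between TgTAS-like primers and a control region such as the Sag1 promoter gives a relative enrichment.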
Trinucleotide Correspondence Analysis (CA)
T. gondii ME49 chromosomes (ToxoDB version 7.3) were scanned with overlapping windows of length L, moved in steps of 1 Kb (where L ranged from 1 Kb to 80 Kb). For each window length L, a contingency table was built by counting the occurrences of each of the 64 trinucleotides in every genomic fragment of length L. CA is an exploratory multivariate statistical technique used to decompose the variability of a contingency table and extract coordinates that explain the major trends in the inertia of the table. Inertia is defined as the total Pearson’s chi-square statistic of the two-way table divided by the grand total of the table, and can be interpreted as a global measure of the data’s deviation from the values expected under the hypothesis of homogeneity (where all genomic fragments would have similar trinucleotide profiles). Correspondence Analysis was performed using the R package CA. CA plots shown in Additional file 6 are symmetric maps displaying the 2 principal dimensions that capture most of the trinucleotide variability (rows and columns in principal coordinates). To determine the trinucleotide bias associated with TgTAS-like regions, we considered genome fragments containing complete TgTASL regions as well as those fragments in which the annotated TgTASL region covered >80% of the window length. The average deviation of these fragments from the origin (the position representing the expected composition under homogeneity of trinucleotide frequencies) was calculated; this deviation is the TgTAS-like trinucleotide bias, and corresponds to inertia in CA theory. It was then divided by the average deviation of all genomic fragments to derive a measure of TgTASL-specific bias. This was repeated for all window lengths (Figure 5, left axis). The compactness of TgTAS-like fragments was evaluated by measuring dissimilarity in trinucleotide space, calculated as the mean of all inter-fragment distances among TgTAS-like fragments relative to the mean inter-fragment distance among all genomic fragments (Figure 5, right axis).
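The quantities defined above can be illustrated without a full CA decomposition. The sketch below is written in Python rather than R and is not the code used in this study. It rests on one stated assumption: that the deviation of a fragment from the origin is measured in the full CA space, where it equals the chi-square distance between the fragment's trinucleotide profile and the average profile (the published maps show only the first 2 dimensions). The function names are hypothetical.

```python
from collections import Counter
from itertools import product

TRINUCLEOTIDES = ["".join(p) for p in product("ACGT", repeat=3)]


def window_counts(seq, window, step):
    """Single-strand trinucleotide counts for overlapping windows.

    Returns one row of 64 counts per window (the contingency table)."""
    rows = []
    for start in range(0, len(seq) - window + 1, step):
        frag = seq[start:start + window]
        counts = Counter(frag[i:i + 3] for i in range(len(frag) - 2))
        rows.append([counts.get(t, 0) for t in TRINUCLEOTIDES])
    return rows


def squared_distances_from_origin(table):
    """Squared chi-square distance of each row profile from the average
    profile, i.e. from the origin of the full CA space."""
    total = sum(sum(row) for row in table)
    col_mass = [sum(col) / total for col in zip(*table)]
    result = []
    for row in table:
        n = sum(row)
        if n == 0:  # e.g. a window made only of ambiguous bases
            result.append(0.0)
            continue
        result.append(sum((x / n - m) ** 2 / m
                          for x, m in zip(row, col_mass) if m > 0))
    return result


def total_inertia(table):
    """Pearson chi-square of the table divided by its grand total."""
    total = sum(sum(row) for row in table)
    d2 = squared_distances_from_origin(table)
    return sum(sum(row) / total * d for row, d in zip(table, d2))


def relative_bias(table, is_target):
    """Mean distance from the origin of the flagged rows (e.g. windows
    overlapping a TgTAS-like region) divided by the mean over all rows."""
    dist = [d ** 0.5 for d in squared_distances_from_origin(table)]
    target = [d for d, flag in zip(dist, is_target) if flag]
    return (sum(target) / len(target)) / (sum(dist) / len(dist))
```

For one chromosome and one window length, `window_counts(chromosome, 40000, 1000)` would give the table, and `relative_bias` with a list of booleans marking TgTAS-like windows would give a ratio comparable in spirit to the left axis of Figure 5. A value well above 1 means the flagged windows are further from the homogeneous composition than a typical window.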
To explore the relation between trinucleotide bias (at 40 Kb window length) and gene density, we calculated a gene coverage index defined as the proportion of bases covered by gene annotations (CDS, gene annotation from ToxoDB 01/2012). The gene coverage index was plotted against a 40 Kb trinucleotide bias index, which is simply the first principal coordinate of the CA.
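A gene coverage index of this kind can be computed by merging the annotated intervals that overlap a window and dividing the covered length by the window length. The Python sketch below is illustrative only; the function name `gene_coverage_index` and the use of half-open `(start, end)` coordinates are assumptions, not details taken from the study.

```python
def gene_coverage_index(window_start, window_end, genes):
    """Fraction of bases in [window_start, window_end) covered by at
    least one annotated interval.

    genes -- iterable of (start, end) half-open intervals; overlapping
             annotations are merged so that no base is counted twice.
    """
    # Keep only intervals that overlap the window, clipped to its limits.
    clipped = sorted(
        (max(start, window_start), min(end, window_end))
        for start, end in genes
        if start < window_end and end > window_start
    )
    covered = 0
    current_start = current_end = None
    for start, end in clipped:
        if current_end is None or start > current_end:
            # Close the previous merged block and open a new one.
            if current_end is not None:
                covered += current_end - current_start
            current_start, current_end = start, end
        else:
            # Overlapping or adjacent interval: extend the current block.
            current_end = max(current_end, end)
    if current_end is not None:
        covered += current_end - current_start
    return covered / (window_end - window_start)


if __name__ == "__main__":
    # Hypothetical 100-base window with three annotations.
    print(gene_coverage_index(0, 100, [(10, 30), (20, 50), (90, 150)]))
```

In the hypothetical example, the first two annotations merge into a 40-base block and the third contributes 10 bases inside the window, giving an index of 0.5. Computing this index for every 40 Kb window yields the values that are plotted against the first principal coordinate of the CA.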
FA and SOA (Researchers), MCD (Postdoctoral Fellow) and SJC (Fellow) are members of the National Research Council of Argentina (CONICET). This work was supported by an NIH-NIAID grant (1R01AI083162-01 to SOA) and by grants PICT-2011-0623 (to SOA) and PICT-2010-1479 (to FA) from the National Agency for the Promotion of Science and Technology (ANPCyT, Argentina).
- Tenter AM, Heckeroth AR, Weiss LM: Toxoplasma gondii: from animals to humans. Int J Parasitol. 2000, 30 (12–13): 1217-1258.
- Dubey JP, Sundar N, Hill D, Velmurugan GV, Bandini LA, Kwok OCH, Majumdar D, Su C: High prevalence and abundant atypical genotypes of Toxoplasma gondii isolated from lambs destined for human consumption in the USA. Int J Parasitol. 2008, 38 (8–9): 999-1006. 10.1016/j.ijpara.2007.11.012.
- Gajria B, Bahl A, Brestelli J, Dommer J, Fischer S, Gao X, Heiges M, Iodice J, Kissinger JC, Mackey AJ, Pinney DF, Roos DS, Stoeckert JCJ, Wang H, Brunk BP: ToxoDB: an integrated Toxoplasma gondii database resource. Nucleic Acids Res. 2008, 36 (Database issue): 553-556.
- Khan A, Taylor S, Su C, Mackey AJ, Boyle J, Cole R, Glover D, Tang K, Paulsen IT, Berriman M, Boothroyd JC, Pfefferkorn ER, Dubey JP, Ajioka JW, Roos DS, Wootton JC, Sibley LD: Composite genome map and recombination parameters derived from three archetypal lineages of Toxoplasma gondii. Nucleic Acids Res. 2005, 33 (9): 2980-2992. 10.1093/nar/gki604.
- Reid AJ, Vermont SJ, Cotton JA, Harris D, Hill-Cawthorne GA, Konen-Waisman S, Latham SM, Mourier T, Norton R, Quail MA, Sanders M, Shanmugam D, Sohal A, Wasmuth JD, Brunk B, Grigg ME, Howard JC, Parkinson J, Roos DS, Trees AJ, Berriman M, Pain A, Wastling JM: Comparative genomics of the apicomplexan parasites Toxoplasma gondii and Neospora caninum: Coccidia differing in host range and transmission strategy. PLoS Pathog. 2012, 8 (3): 1002567-10.1371/journal.ppat.1002567.
- Strahl BD, Allis CD: The language of covalent histone modifications. Nature. 2000, 403 (6765): 41-45. 10.1038/47412.
- Gissot M, Kelly KA, Ajioka JW, Greally JM, Kim K: Epigenomic modifications predict active promoters and gene structure in Toxoplasma gondii. PLoS Pathog. 2007, 3 (6): 77-10.1371/journal.ppat.0030077.
- Sautel CF, Cannella D, Bastien O, Kieffer S, Aldebert D, Garin J, Tardieux I, Belrhali H, Hakimi MA: SET8-mediated methylations of histone H4 lysine 20 mark silent heterochromatic domains in apicomplexan genomes. Mol Cell Biol. 2007, 27 (16): 5711-5724. 10.1128/MCB.00482-07.
- Bougdour A, Braun L, Cannella D, Hakimi M-A: Chromatin modifications: implications in the regulation of gene expression in Toxoplasma gondii. Cell Microbiol. 2010, 12 (4): 413-423. 10.1111/j.1462-5822.2010.01446.x.
- Gissot M, Walker R, Delhaye S, Huot L, Hot D, Tomavo S: Toxoplasma gondii chromodomain protein 1 binds to heterochromatin and colocalises with centromeres and telomeres at the nuclear periphery. PLoS One. 2012, 7 (3): 32671-10.1371/journal.pone.0032671.
- Malik HS, Henikoff S: Phylogenomics of the nucleosome. Nat Struct Biol. 2003, 10 (11): 882-891. 10.1038/nsb996.
- Dalmasso MC, Echeverria PC, Zappia MP, Hellman U, Dubremetz JF, Angel SO: Toxoplasma gondii has two lineages of histones 2b (H2B) with different expression profiles. Mol Biochem Parasitol. 2006, 148 (1): 103-107. 10.1016/j.molbiopara.2006.03.005.
- Talbert PB, Ahmad K, Almouzni G, Ausio J, Berger F, Bhalla PL, Bonner WM, Cande WZ, Chadwick BP, Chan SW, Cross GA, Cui L, Dimitrov SI, Doenecke D, Eirin-Lopez JM, Gorovsky MA, Hake SB, Hamkalo BA, Holec S, Jacobsen SE, Kamieniarz K, Khochbin S, Ladurner AG, Landsman D, Latham JA, Loppin B, Malik HS, Marzluff WF, Pehrson JR, Postberg J, et al: A unified phylogeny-based nomenclature for histone variants. Epigenet Chromatin. 2012, 5: 7-10.1186/1756-8935-5-7.
- Dalmasso MC, Onyango DO, Naguleswaran A, Sullivan WJ, Angel SO: Toxoplasma H2A variants reveal novel insights into nucleosome composition and functions for this histone family. J Mol Biol. 2009, 392 (1): 33-47. 10.1016/j.jmb.2009.07.017.
- Verdaasdonk JS, Bloom K: Centromeres: unique chromatin structures that drive chromosome segregation. Nat Rev Mol Cell Biol. 2011, 12 (5): 320-332. 10.1038/nrm3107.
- Henikoff S, Dalal Y: Centromeric chromatin: what makes it unique? Curr Opin Genet Dev. 2005, 15 (2): 177-184. 10.1016/j.gde.2005.01.004.
- Ottaviani A, Gilson E, Magdinier F: Telomeric position effect: from the yeast paradigm to human pathologies? Biochimie. 2008, 90 (1): 93-107. 10.1016/j.biochi.2007.07.022.
- Pryde FE, Louis EJ: Limitations of silencing at native yeast telomeres. EMBO J. 1999, 18 (9): 2538-2550. 10.1093/emboj/18.9.2538.
- Raghuraman MK, Winzeler EA, Collingwood D, Hunt S, Wodicka L, Conway A, Lockhart DJ, Davis RW, Brewer BJ, Fangman WL: Replication dynamics of the yeast genome. Science. 2001, 294 (5540): 115-121. 10.1126/science.294.5540.115.
- Arnoult N, Schluth-Bolard C, Letessier A, Drascovic I, Bouarich-Bourimi R, Campisi J, Kim S-H, Boussouar A, Ottaviani A, Magdinier F, Gilson E, Londoño-Vallejo A: Replication timing of human telomeres is chromosome arm-specific, influenced by subtelomeric structures and connected to nuclear localization. PLoS Genet. 2010, 6 (4): 1000920-10.1371/journal.pgen.1000920.
- Zappulla DC, Sternglanz R, Leatherwood J: Control of replication timing by a transcriptional silencer. Curr Biol. 2002, 12 (11): 869-875. 10.1016/S0960-9822(02)00871-0.
- Ofir R, Wong AC, McDermid HE, Skorecki KL, Selig S: Position effect of human telomeric repeats on replication timing. Proc Natl Acad Sci USA. 1999, 96 (20): 11434-11439. 10.1073/pnas.96.20.11434.
- Ottaviani A, Schluth-Bolard C, Rival-Gervier S, Boussouar A, Rondier D, Foerster AM, Morere J, Bauwens S, Gazzo S, Callet-Bauchu E, Gilson E, Magdinier F: Identification of a perinuclear positioning element in human subtelomeres that requires A-type lamins and CTCF. EMBO J. 2009, 28 (16): 2428-2436. 10.1038/emboj.2009.201.
- Barry JD, Ginger ML, Burton P, McCulloch R: Why are parasite contingency genes often associated with telomeres? Int J Parasitol. 2003, 33 (1): 29-45. 10.1016/S0020-7519(02)00247-3.
- Saksouk N, Bhatti MM, Kieffer S, Smith AT, Musset K, Garin J, Sullivan JWJ, Cesbron-Delauw MF, Hakimi MA: Histone-modifying complexes regulate gene expression pertinent to the differentiation of the protozoan parasite Toxoplasma gondii. Mol Cell Biol. 2005, 25 (23): 10301-10314. 10.1128/MCB.25.23.10301-10314.2005.
- Brooks CF, Francia ME, Gissot M, Croken MM, Kim K, Striepen B: Toxoplasma gondii sequesters centromeres to a specific nuclear region throughout the cell cycle. Proc Natl Acad Sci USA. 2011, 108 (9). 10.1073/pnas.1006741108.
- Figueiredo LM, Pirrit LA, Scherf A: Genomic organisation and chromatin structure of Plasmodium falciparum chromosome ends. Mol Biochem Parasitol. 2000, 106 (1): 169-174. 10.1016/S0166-6851(99)00199-1.
- Scherf A, Figueiredo LM, Freitas-Junior LH: Plasmodium telomeres: a pathogen’s perspective. Curr Opin Microbiol. 2001, 4 (4): 409-414. 10.1016/S1369-5274(00)00227-7.
- Hernandez-Rivas R, Perez-Toledo K, Herrera Solorio AM, Delgadillo DM, Vargas M: Telomeric heterochromatin in Plasmodium falciparum. J Biomed Biotechnol. 2010, 2010: 290501.
- Lopez-Rubio JJ, Mancio-Silva L, Scherf A: Genome-wide analysis of heterochromatin associates clonally variant gene regulation with perinuclear repressive centers in malaria parasites. Cell Host Microbe. 2009, 5 (2): 179-190. 10.1016/j.chom.2008.12.012.
- Freitas-Junior LH, Hernandez-Rivas R, Ralph SA, Montiel-Condado D, Ruvalcaba-Salazar OK, Rojas-Meza AP, Mancio-Silva L, Leal-Silvestre RJ, Gontijo AM, Shorte S, Scherf A: Telomeric heterochromatin propagation and histone acetylation control mutually exclusive expression of antigenic variation genes in malaria parasites. Cell. 2005, 121 (1): 25-36. 10.1016/j.cell.2005.01.037.
- Mefford HC, Linardopoulou E, Coil D, van den Engh G, Trask BJ: Comparative sequencing of a multicopy subtelomeric region containing olfactory receptor genes reveals multiple interactions between non-homologous chromosomes. Hum Mol Genet. 2001, 10 (21): 2363-2372. 10.1093/hmg/10.21.2363.
- Mefford HC, Trask BJ: The complex structure and dynamic evolution of human subtelomeres. Nat Rev Genet. 2002, 3 (2): 91-102. 10.1038/nrg727.
- Ossorio PN, Sibley LD, Boothroyd JC: Mitochondrial-like DNA sequences flanked by direct and inverted repeats in the nuclear genome of Toxoplasma gondii. J Mol Biol. 1991, 222 (3): 525-536. 10.1016/0022-2836(91)90494-Q.
- Sonnhammer EL, Durbin R: A dot-matrix program with dynamic threshold control suited for genomic DNA and protein sequence analysis. Gene. 1995, 167 (1-2): 1-10. 10.1016/0378-1119(95)00657-5.
- Benson G: Tandem repeats finder: a program to analyze DNA sequences. Nucleic Acids Res. 1999, 27 (2): 573-580. 10.1093/nar/27.2.573.
- Matrajt M, Angel SO, Pszenny V, Guarnera E, Roos DS, Garberi JC: Arrays of repetitive DNA elements in the largest chromosomes of Toxoplasma gondii. Genome. 1999, 42 (2): 265-269.
- Clemente M, de Miguel N, Lia VV, Matrajt M, Angel SO: Structure analysis of two Toxoplasma gondii and Neospora caninum satellite DNA families and evolution of their common monomeric sequence. J Mol Evol. 2004, 58 (5): 557-567. 10.1007/s00239-003-2578-3.
- Echeverria PC, Rojas PA, Martin V, Guarnera EA, Pszenny V, Angel SO: Characterisation of a novel interspersed Toxoplasma gondii DNA repeat with potential uses for PCR diagnosis and PCR-RFLP analysis. FEMS Microbiol Lett. 2000, 184 (1): 23-27. 10.1111/j.1574-6968.2000.tb08984.x.
- Braun L, Cannella D, Ortet P, Barakat M, Sautel CF, Kieffer S, Garin J, Bastien O, Voinnet O, Hakimi M-A: A complex small RNA repertoire is generated by a plant/fungal-like machinery and effected by a metazoan-like Argonaute in the single-cell human parasite Toxoplasma gondii. PLoS Pathog. 2010, 6 (5): 1000920-10.1371/journal.ppat.1000920.
- Howe DK, Sibley LD: Toxoplasma gondii comprises three clonal lineages: correlation of parasite genotype with human disease. J Infect Dis. 1995, 172 (6): 1561-1566. 10.1093/infdis/172.6.1561.
- Holm L: Codon usage and gene expression. Nucleic Acids Res. 1986, 14 (7): 3075-3087. 10.1093/nar/14.7.3075.
- McInerney JO: GCUA: general codon usage analysis. Bioinformatics. 1998, 14 (4): 372-373. 10.1093/bioinformatics/14.4.372.
- Sharp PM, Bailes E, Grocock RJ, Peden JF, Sockett RE: Variation in the strength of selected codon usage bias among bacteria. Nucleic Acids Res. 2005, 33 (4): 1141-1153. 10.1093/nar/gki242.
- Suzuki H, Brown CJ, Forney LJ, Top EM: Comparison of correspondence analysis methods for synonymous codon usage in bacteria. DNA Res. 2008, 15 (6): 357-365. 10.1093/dnares/dsn028.
- Arnoult N, Van Beneden A, Decottignies A: Telomere length regulates TERRA levels through increased trimethylation of telomeric H3K9 and HP1alpha. Nat Struct Mol Biol. 2012, 19 (9): 948-956. 10.1038/nsmb.2364.
- Bah A, Azzalin CM: The telomeric transcriptome: from fission yeast to mammals. Int J Biochem Cell Biol. 2012, 44 (7): 1055-1059. 10.1016/j.biocel.2012.03.021.
- Mancio-Silva L, Rojas-Meza AP, Vargas M, Scherf A, Hernandez-Rivas R: Differential association of Orc1 and Sir2 proteins to telomeric domains in Plasmodium falciparum. J Cell Sci. 2008, 121 (Pt 12): 2046-2053.
- Flueck C, Bartfai R, Niederwieser I, Witmer K, Alako BTF, Moes S, Bozdech Z, Jenoe P, Stunnenberg HG, Voss TS: A major role for the Plasmodium falciparum ApiAP2 protein PfSIP2 in chromosome end biology. PLoS Pathog. 2010, 6 (2): 1000784-10.1371/journal.ppat.1000784.
- Luke B, Lingner J: TERRA: telomeric repeat-containing RNA. EMBO J. 2009, 28 (17): 2503-2510. 10.1038/emboj.2009.166.
- Peixoto L, Chen F, Harb OS, Davis PH, Beiting DP, Brownback CS, Ouloguem D, Roos DS: Integrative genomic approaches highlight a family of parasite-specific kinases that regulate host responses. Cell Host Microbe. 2010, 8 (2): 208-218. 10.1016/j.chom.2010.07.004.
- Pollard MA, Onatolu KN, Hiller L, Haldar K, Knoll LJ: Highly polymorphic family of glycosylphosphatidylinositol-anchored surface antigens with evidence of developmental regulation in Toxoplasma gondii. Infect Immun. 2008, 76 (1): 103-110. 10.1128/IAI.01170-07.
- Jung C, Lee CY-F, Grigg ME: The SRS superfamily of Toxoplasma surface proteins. Int J Parasitol. 2004, 34 (3): 285-296. 10.1016/j.ijpara.2003.12.004.
- Wasmuth JD, Pszenny V, Haile S, Jansen EM, Gast AT, Sher A, Boyle JP, Boulanger MJ, Parkinson J, Grigg ME: Integrated bioinformatic and targeted deletion analyses of the SRS gene superfamily identify SRS29C as a negative regulator of Toxoplasma virulence. MBio. 2012, 3 (6). 10.1128/mBio.00321-12.
- Hassan MA, Melo MB, Haas B, Jensen KDC, Saeij JPJ: De novo reconstruction of the Toxoplasma gondii transcriptome improves on the current genome annotation and reveals alternatively spliced transcripts and putative long non-coding RNAs. BMC Genomics. 2012, 13: 696-10.1186/1471-2164-13-696.
- Ferdig MT, Su XZ: Microsatellite markers and genetic mapping in Plasmodium falciparum. Parasitol Today. 2000, 16 (7): 307-312. 10.1016/S0169-4758(00)01676-8.
- Rudner R, Karkas JD, Chargaff E: Separation of B. subtilis DNA into complementary strands. 3 Direct analysis. Proc Natl Acad Sci USA. 1968, 60 (3): 921-922. 10.1073/pnas.60.3.921.
- Albrecht-Buehler G: Asymptotically increasing compliance of genomes with Chargaff’s second parity rules through inversions and inverted transpositions. Proc Natl Acad Sci USA. 2006, 103 (47): 17828-17833. 10.1073/pnas.0605553103.
- Camacho C, Madden T, Ma N, Agarwala R, Morgulis A: BLAST Command Line Applications User Manual. In: BLAST Help [Internet]. Bethesda (MD): National Center for Biotechnology Information (US); 2008 Jun 23 [Updated 2013 Mar 25]. Available from: http://www.ncbi.nlm.nih.gov/books/NBK1763/
- Abbott JC, Aanensen DM, Bentley SD: WebACT: an online genome comparison suite. Methods Mol Biol. 2007, 395: 57-74. 10.1007/978-1-59745-514-5_4.
- Carver TJ, Rutherford KM, Berriman M, Rajandream MA, Barrell BG, Parkhill J: ACT: the Artemis Comparison Tool. Bioinformatics. 2005, 21 (16): 3422-3423. 10.1093/bioinformatics/bti553.
- Krzywinski M, Schein J, Birol I, Connors J, Gascoyne R, Horsman D, Jones SJ, Marra MA: Circos: an information aesthetic for comparative genomics. Genome Res. 2009, 19 (9): 1639-1645. 10.1101/gr.092759.109.
- Greenacre MJ: Correspondence Analysis in Practice, 2nd edn. Interdisciplinary Statistics. 2007. 10.1201/9781420011234. http://www.crcnetbase.com/doi/book/10.1201/9781420011234
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.