In higher eukaryotes the telomeric repeat array extends several kilobases from the chromosome ends [24, 25], whereas in protozoans and fungi it is much shorter, averaging 130–350 bp. We estimated the average length of the T. cruzi telomeric repeat array to be ~320 bp (53.3 hexamer repeats) (Table 1), although the length varied widely among telomeric contigs (6 to 142 repeats). In previous work, Freitas-Junior and coworkers experimentally observed great variability in the length of telomeric repeats in the CL Brener clone, ranging from 1 to 10 kb. The shorter arrays identified in silico may result from slippage artifacts during BAC replication in Escherichia coli cells or from the difficulty of assembling short repeat sequences obtained by whole genome shotgun sequencing; both methods were used in the T. cruzi genome project.
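The relationship between repeat number and array length quoted above can be checked with a short Python sketch. It is an illustration only, not the pipeline used in this study: the function names are hypothetical, the repeat unit is assumed to be the canonical trypanosomatid hexamer TTAGGG, and the sequence is assumed to be read on the strand that carries that motif (a contig in the opposite orientation would need to be reverse-complemented first).

```python
import re

# Assumption: canonical trypanosomatid telomeric hexamer, read 5'->3'
# on the G-rich strand. Not taken from the text above.
HEXAMER = "TTAGGG"


def repeats_to_bp(n_repeats, unit_len=len(HEXAMER)):
    """Convert a number of repeat units into array length in base pairs."""
    return n_repeats * unit_len


def longest_tandem_run(seq, unit=HEXAMER):
    """Number of units in the longest uninterrupted tandem run of `unit` in `seq`."""
    runs = re.findall(f"(?:{unit})+", seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)


if __name__ == "__main__":
    # Range reported in the text: 6 to 142 repeats.
    print(repeats_to_bp(6), repeats_to_bp(142))   # 36 852
    # Toy sequence (made up) with three tandem copies of the hexamer.
    print(longest_tandem_run("ACGT" + HEXAMER * 3 + "CC"))  # 3
```

With a 6 bp unit, the reported range of 6 to 142 repeats corresponds to arrays of 36 to 852 bp, well below the 1 to 10 kb observed experimentally.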
Despite the sequence variations, similar telomeric structures have been detected in almost all T. cruzi chromosomes studied to date. The telomeric junction was present in all chromosome ends, confirming it as a signature sequence of T. cruzi telomeres. In previous work we suggested that the events that generated the common T. cruzi telomeric block could be reconstructed from events that occurred at a tandem array of gp85 genes, as follows: first, a deletion brought together a fragment containing the spacer between two gp85 genes and part of a gp85 5’-UTR with the 3’-UTR of the same gene; subsequently, a break in the 3’-UTR generated an end that was healed by telomerase or an alternative telomere repair mechanism; eventually, these two structures were fixed as the T. cruzi telomere. In the present study, the size of the subtelomere varied widely among individual T. cruzi chromosome ends, from 5 kb to 182 kb, and the organization of several subtelomeres, for instance Tel 31 and Tel 32 (Figure 1), suggests that they have undergone truncation and that this could be a general phenomenon in T. cruzi.
We identified 49 chromosome ends harboring telomeric repeats in clone CL Brener, 40 located in chromosome-sized scaffolds and 9 in unassigned contigs. This number is smaller than we had expected. However, about 50% of the T. cruzi genome is composed of multigenic families and repetitive sequences, and because the chromosome ends are enriched in these sequences they are very difficult to assemble. For this reason there are still a number of small unassigned contigs harboring typical subtelomeric genes or hexamer repeats that were not analyzed in this work. It should also be highlighted that the chromosome-sized scaffolds of T. cruzi are useful for sequence analysis and constitute an important tool for defining the linear gene sequence of the parasite. However, in most cases they do not reflect the actual chromosomal lengths and are in fact part of a single chromosome. Our in-depth analysis of telomeric and subtelomeric regions showed that the T. cruzi chromosome end structure varies widely as a result of differences in the abundance and organization of surface protein coding genes (TS and DGF-1) and RHS, retrotransposon, RNA-helicase and N-acetyltransferase genes. All 425 complete genes within the subtelomeric regions were present at more than one chromosome end. For example, RHS sequences were distributed in 47 subtelomeres, TS in 39, retrotransposons and DGF-1 in 29, RNA helicase in 16 and N-acetyltransferase in 11 chromosome ends. It therefore seems that switching mechanisms operated in T. cruzi to generate new variants of these gene families.
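To put the distribution above in proportion, the counts can be expressed as a share of the identified chromosome ends. The following Python sketch is a rough illustration under two assumptions that are ours: that the 49 ends identified here are the appropriate denominator, and that the single figure of 29 reported for retrotransposons and DGF-1 can be kept as one entry. The dictionary and function names are hypothetical.

```python
TOTAL_ENDS = 49  # chromosome ends harboring telomeric repeats (from the text)

# Number of chromosome ends in which each family was found (from the text).
# Retrotransposons and DGF-1 share one entry because the text reports a
# single figure (29) for both.
ENDS_WITH_FAMILY = {
    "RHS": 47,
    "TS": 39,
    "retrotransposons / DGF-1": 29,
    "RNA helicase": 16,
    "N-acetyltransferase": 11,
}


def share_of_ends(counts, total=TOTAL_ENDS):
    """Return {family: percentage of chromosome ends carrying it}, to 1 decimal."""
    return {family: round(100 * n / total, 1) for family, n in counts.items()}


if __name__ == "__main__":
    for family, pct in share_of_ends(ENDS_WITH_FAMILY).items():
        print(f"{family}: {pct}% of {TOTAL_ENDS} ends")
```

Under these assumptions RHS sequences occur at roughly 96% of the identified ends and N-acetyltransferase genes at roughly 22%.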
Comparison of T. cruzi homologous chromosomes showed that synteny breaks down around the subtelomeric region, reinforcing the hypothesis that frequent recombination events occurred between subtelomeric regions of this parasite. Adjacent to the telomeric repeats is a mosaic of surface protein coding sequences and RHS, retrotransposon, RNA-helicase and N-acetyltransferase genes that exhibits a great deal of polymorphism both between the termini of an individual chromosome and between different chromosome ends (see Figure 1). In T. brucei, chromosomal rearrangements have been associated with the presence of RHS genes and retrotransposons. T. cruzi chromosome-sized scaffolds TcChr13-P and TcChr13-S are syntenic up to the beginning of the subtelomeric region, where the synteny is broken by the insertion of a 7 kb region flanked by RHS genes. Apparently, the RHS sequences were duplicated during the insertion, suggesting that homologous recombination had occurred. The mosaicism in subtelomeric regions of T. cruzi chromosomes could be due to some common underlying mechanism. It is reasonable to suggest that there may be a selective advantage to maintaining the chromosome end polymorphism, or a common active mechanism that leads to the accumulation and maintenance of mosaicism. Recently, Souza and coworkers reported extensive variation in genome size and karyotype polymorphism among T. cruzi lineages. They observed that T. cruzi lineages exhibit conservation of chromosome structure and synteny, indicating that the variability found in the subtelomeric regions is typical of these chromosomal regions.
Confirming the findings of previous studies, RHS sequences were found flanking DGF-1 and TS genes. All subtelomeric copies of DGF-1 were flanked by RHS or TS sequences. DGF-1 genes were organized in tandem, with multiple copies flanked by RHS and/or TS sequences. The organization of RHS genes flanking surface protein genes (TS and/or DGF-1) may suggest that these sites have been involved in the generation of new surface protein variants of the parasite. The repetitive sequences present in the RHS genes and pseudogenes might be a target for homologous recombination or microhomology-mediated end joining, allowing the generation of variants by recombination of different chromosome ends.
In addition, we confirmed that RHS, DGF-1, TS, DEAD/H-RNA helicase and N-acetyltransferase sequences are abundant in subtelomeric regions of T. cruzi [9, 14]. For instance, 19%, 12% and 9%, respectively, of the RHS, DGF-1 and TS sequences of the whole genome were found in the subtelomeric regions. Similarly, 34% and 12%, respectively, of N-acetyltransferase and DEAD/H-RNA helicase sequences were located in these regions, indicating that they too could be considered characteristic markers for the subtelomeric regions (Table 2). Despite their great abundance in the T. cruzi genome, mucin and MASP genes are rare in the subtelomeric regions. Helicases are essential molecular motor enzymes involved in processes requiring the separation of nucleic acid strands. They are classified into six different superfamilies according to the presence of conserved motifs. Both RNA-helicase and RecQ helicase belong to superfamily 2, the largest superfamily, which is implicated in diverse cellular processes, including telomere maintenance. In yeast, ATP-dependent DEAD/H RNA helicases are part of complexes involved in mRNA decapping and deadenylation.
Recently, in T. cruzi, ATP-dependent DEAD/H RNA helicases have been found in RNA stress granules that may be involved in RNA metabolism and whose cellular distribution seemed to be developmentally regulated. Considering the polycistronic nature of transcription in Kinetoplastida, fine tuning of gene expression during the cell cycle has to be exerted post-transcriptionally. Therefore, mRNA processing is a critical step in the parasite’s survival, and the machinery involved in this process can be considered an essential mechanism of regulation.
In protozoan parasites, especially T. brucei and P. falciparum, the role of subtelomeric regions in the generation of new variants of surface antigen genes and the control of expression of these genes has been widely demonstrated [11–13]. In P. falciparum, telomeres are followed by a non-coding region called TAS (telomere associated sequence) that consists of six blocks of repetitive sequences, the TAREs (telomere associated repetitive elements). Upstream of the TAS lie members of multigene families that encode virulence factors, such as the var gene family. Each cell has up to 70 different var genes, and differential expression of these allows the parasite to escape the immune system by a mechanism known as antigenic variation [30–32]. In T. brucei, variant surface glycoprotein (VSG) genes were identified near telomeric repeats, and each trypanosome encodes up to a thousand different VSGs [13, 20]. Parasite survival in mammalian hosts results from a sophisticated strategy of antigenic variation that involves switching the glycoprotein coat. It was not possible to identify similar organizational patterns in T. cruzi chromosome ends, and no active transcriptional promoters have been identified to date in this parasite. However, as observed in T. brucei, retrotransposons and RHS genes are commonly located next to subtelomeric surface antigen genes and could have acted as recombination sites.
In the chromosome ends of T. cruzi there are a large number of genes and pseudogenes annotated as trans-sialidases (TS) with no further specification. The TS superfamily is divided into four groups with different biological functions [1, 2, 34]. In the present study, members of all four groups were identified in the chromosome ends, with genes from group II being the most abundant. This group comprises proteins that function as surface-located adhesins involved in host cell invasion [1, 2, 35]. Freitas and coworkers also described the presence of gp85, gp82, gp90 and ASP-2 genes in the subtelomeric regions of T. cruzi. These genes could be a target for recombination, generating genetic variability and reinforcing the hypothesis that subtelomeric regions participate in the generation of new variants of surface antigens. Here, TS genes and pseudogenes flanked on both sides by RHS genes were observed in several chromosome ends. This organization is reminiscent of the repetitive regions adjacent to VSG genes in T. brucei telomeres, where the repetitive sequences are involved in recombination mechanisms responsible for antigenic variation [37–39]. Perhaps a similar mechanism for generating gene diversity existed in T. cruzi and produced the surface antigen variability that we currently observe.
Complete copies of TS (31) and DGF-1 (37) genes, some of them larger than 10 kb, were identified in the subtelomeric regions, indicating that these regions are sites for the generation and storage of variant surface antigens and that they can also act as active transcription sites for these genes. Subtelomeric genes are transcribed towards the telomeric repeats in all the chromosome ends analyzed (Additional file 2). In some of these chromosome ends, an inversion of the direction of transcription was observed at the beginning of the interstitial region. In this work we have described a detailed analysis of the structure and organization of chromosome ends in T. cruzi and have confirmed the abundance of surface protein genes flanked by repetitive sequences in the subtelomeric regions. It is tempting to suggest that these regions acted as a gene reservoir and recombination site responsible for the large number of surface gene variants in T. cruzi and that they play an important role in the parasite’s adaptation and evasion of the host immune system.
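The notion of genes being transcribed towards the telomere, and of an inversion further inward, can be made concrete with a small Python sketch. It is a hypothetical illustration rather than the procedure used to produce Additional file 2: the `Gene` record, the `telomere_side` convention ("left" when the telomere lies at the low-coordinate end of the scaffold, "right" otherwise) and the toy annotations are all assumptions.

```python
from typing import List, NamedTuple, Optional


class Gene(NamedTuple):
    name: str    # hypothetical label
    start: int   # start coordinate on the scaffold
    strand: str  # "+" or "-"


def toward_telomere(gene: Gene, telomere_side: str) -> bool:
    """True if the gene is transcribed towards the telomere.

    A telomere at the low-coordinate ("left") end is approached by
    minus-strand genes; one at the "right" end by plus-strand genes.
    """
    return gene.strand == ("-" if telomere_side == "left" else "+")


def first_inversion(genes: List[Gene], telomere_side: str) -> Optional[Gene]:
    """Walk inward from the telomere and return the first gene that is
    transcribed away from it, or None if no such gene exists."""
    ordered = sorted(genes, key=lambda g: g.start,
                     reverse=(telomere_side == "right"))
    for gene in ordered:
        if not toward_telomere(gene, telomere_side):
            return gene
    return None


if __name__ == "__main__":
    # Toy annotations (made up): two subtelomeric genes, one interstitial gene.
    toy = [Gene("a", 100, "-"), Gene("b", 500, "-"), Gene("c", 900, "+")]
    print(first_inversion(toy, "left"))
```

In this toy example the first two genes point towards a left-hand telomere and the third marks the strand switch.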
Finally, some considerations regarding the state of the assembly of the T. cruzi genome are warranted. The results presented in this work highlight the complexity of the T. cruzi genome and the difficulties involved in carrying out a more in-depth analysis of the chromosome structure of this parasite. We carried out an initial analysis of a set of subtelomeric sequence assemblies that were properly ordered and positioned in relation to their respective telomeres, which allowed the subtelomeric sequence organization of a few separate telomeres to be compared. Although the in silico chromosome assemblies were of great value for the analysis, they should be improved by re-sequencing of selected regions and by analysis using Comparative Genomic Hybridization (CGH). Sequencing of new strains of T. cruzi coupled with the CGH technique can highlight deleted and/or amplified regions along the chromosome. For the subtelomeric region, and possibly also for other repeated regions of the genome, this effort should be complemented by the cloning of genomic fragments in traditional vectors such as BACs, since high-throughput DNA sequencing of the whole T. cruzi genome produced relatively short telomeric contigs.