Centromeres are the chromosomal loci that facilitate chromosome segregation in most eukaryotes. They are the site of assembly of the kinetochore, the nucleoprotein complex that anchors the spindle microtubules which separate sister chromatids and mediate their movement to the daughter nuclei. Most centromeres are "regional" and encompass large sections of DNA, spanning 0.06–5 Mb, in species as diverse as plants, insects and mammals [1–3]. Centromeric DNA is typically composed of arrays of highly repeated sequences, interrupted by transposable elements [4, 5]. The repeats are generally restricted to centromeric regions and are often in the size range 150–180 bp. This length is similar to that of the DNA packaged in a nucleosome, a property that may be of functional significance. Although many features of centromeric DNA are widespread, there is little sequence conservation, even between closely related species, and most evidence suggests that centromeres are determined epigenetically [8, 9].
In human chromosomes, centromeres have a conserved core of α-satellite repeats (~170 bp) stretching over several megabases, which is flanked by extensive regions that contain multiple retrotransposon insertions. In eukaryotic microorganisms, centromeres can also encompass large regions of chromosomal DNA. Those of Schizosaccharomyces pombe, for example, range from 35–110 kb and are organised as chromosome-specific core elements, flanked by inverted arrays of 3–7 kb. These in turn are flanked by more extensive outer repeats. Unusually, in Saccharomyces cerevisiae the regions that specify kinetochore assembly are restricted to single 125 bp elements termed "point" centromeres. Some organisms, such as Caenorhabditis elegans, have holocentric chromosomes that lack specific centromeres. In these instances, microtubules bind along the entire length of the chromosome.
Protozoan parasites of the Trypanosoma brucei species complex are insect-transmitted pathogens that are of major medical and veterinary importance throughout sub-Saharan Africa. They belong to the Excavata, a eukaryotic lineage which includes the other trypanosomatid parasites Trypanosoma cruzi and Leishmania species. Several features of gene organisation and expression in these organisms are unusual. Protein-coding genes lack conventional RNA polymerase II (pol II) promoters and are organised in long co-directional clusters which can stretch for tens to hundreds of kilobases. Transcription is polycistronic, and processing involves a trans-splicing mechanism in which all mRNAs are modified post-transcriptionally by the addition of a 39-nucleotide spliced leader to their 5'-ends. T. brucei has a haploid genome content of 35 Mb, with 11 megabase-sized chromosomes (0.9–5.7 Mb). Unusually, chromosome homologues can vary significantly in size. In addition, this parasite contains two classes of atypical nuclear chromosomes: the intermediate-size chromosomes (300–900 kb), which contain some variant surface glycoprotein (VSG) genes but no housekeeping genes, and the minichromosomes (50–100 kb), which appear to act as a reservoir of VSG sequences.
The T. brucei genome project was completed in 2005. However, sequence elements characteristic of centromeric DNA in other eukaryotes were not described. Furthermore, candidates for the 'core' centromeric proteins and most of the other factors involved in kinetochore assembly could not be identified [14, 16]. This includes the variant histone CenH3, which specifies centromere location in eukaryotes and was thought to be ubiquitous. The first evidence on the nature and location of centromeric DNA in T. brucei came from a biochemical mapping approach based on etoposide-mediated topoisomerase-II cleavage [18, 19]. Topoisomerase-II has a major regulatory role in chromosome segregation and accumulates at centromeres during late metaphase, where it resolves the catenated DNA strands that provide the final structural link between sister kinetochores [20, 21]. This process requires double-stranded DNA cleavage, passage of the uncut duplex through the gap, and re-ligation to repair the break. Etoposide inhibits this re-ligation step, leading to lesions in chromosomal DNA at sites of topoisomerase-II activity. In human chromosomes, etoposide-mediated cleavage sites occur within the α-satellite repeats that constitute centromeric DNA [22, 23]. In both T. cruzi and Plasmodium [25, 26], these sites have been mapped to chromosomal loci that confer mitotic stability. In Toxoplasma gondii, they co-locate with the binding sites of the centromeric histone CenH3.
Using the etoposide mapping method, we identified the location of putative centromeric domains on the eight T. brucei chromosomes that had been fully assembled. These loci, which occur once per chromosome, encompass regions between directional gene clusters that contain transposable elements and an array of AT-rich repeats predicted to extend between 2 and 8 kb. The tandem repeats are arranged in units of ~147 bp and share intra-chromosomal identities ranging from 50% to more than 90%. The units have a complex structure made up of degenerate sub-repeats of ~48 and ~30 bp (for a more detailed description of their make-up, see reference ). We also noted that the repeat arrays were located adjacent to ribosomal RNA genes on five of the chromosomes, although the significance of this is unknown. The intermediate-size chromosomes and minichromosomes did not exhibit site-specific topoisomerase-II activity, suggesting that their segregation might involve a centromere-independent mechanism, a finding consistent with the "lateral-stacking" model.
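As a rough illustration of what these predictions imply for repeat copy number, the following minimal sketch divides the predicted array length by the ~147 bp unit length. The function name and the assumption of a uniform unit length are ours, for illustration only; the repeats are degenerate, so the result is an approximation and not a measured value.

```python
def estimate_copy_number(array_length_bp: float, unit_length_bp: float = 147.0) -> float:
    """Approximate number of tandem repeat units in an array.

    Assumes every unit has the same length (hypothetical simplification;
    the repeat units described in the text are degenerate).
    """
    if unit_length_bp <= 0:
        raise ValueError("unit_length_bp must be positive")
    return array_length_bp / unit_length_bp


# Predicted array sizes quoted in the text: between 2 and 8 kb.
predicted_array_sizes_bp = (2_000, 8_000)

for size_bp in predicted_array_sizes_bp:
    copies = estimate_copy_number(size_bp)
    print(f"{size_bp} bp array: ~{copies:.0f} repeat units")
```

Under these assumptions, the predicted arrays would correspond to roughly 14 to 54 repeat units per chromosome.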
In the initial analysis of the T. brucei centromeric domains, we identified discrepancies between the published sequence data of two chromosomes and our preliminary long-range restriction mapping. We also found evidence of heterogeneity in the extent of these regions between chromosome homologues. However, it was unclear whether the differences arose from an under-estimation of the copy number of the tandem repeats, whether they were due to the gaps in the assembly of the adjacent regions, or whether this under-estimation of size was also the case with other T. brucei chromosomes. Here, we show that the centromeric repeats in T. brucei chromosomes are present at much higher copy number than predicted, with an organisation that is more typical of centromeric domains in higher eukaryotes than previously realised. These data provide a more complete model for T. brucei chromosome structure and an improved basis for investigating the mechanisms of segregation, and will enable more detailed functional mapping of this crucial chromosomal region.