Reconstruction of evolutionary trajectories of chromosomes unraveled independent genomic repatterning between Triticeae and Brachypodium

Background After polyploidization, a genome may experience large-scale repatterning, featuring widespread DNA rearrangement and loss, and often chromosome number reduction. Grasses share a common tetraploidization, after which the doubled chromosome number was reduced to different numbers in different lineages. A telomere-centric model was proposed previously to explain chromosome number reduction. Because Brachypodium is regarded as an intermediate linking major grass lineages and as a model plant for the Pooideae, we asked whether it mediated the evolution from the ancestral grass karyotype to the Triticeae karyotype. Results By inferring the homology among Triticeae, rice, and Brachypodium chromosomes, we reconstructed the evolutionary trajectories of the Triticeae chromosomes. Using comparative genomics analysis with rice as a reference, we reconstructed the evolutionary trajectories of chromosomes in Pooideae plants, including Ae. tauschii (2n = 14, DD), barley (2n = 14), Triticum turgidum (2n = 4x = 28, AABB), and Brachypodium (2n = 10). The extant chromosomes of the Triticeae species and of Brachypodium were produced independently by sequential nested chromosome fusions over the last tens of millions of years, after their split from rice. In Brachypodium, the 'invading' and 'invaded' chromosomes are homoeologs more frequently than would be expected by chance. Homoeologs originate from duplication of a common ancestral chromosome and therefore have more extensive DNA-level correspondence to one another than random chromosomes. Nested chromosome fusion events between homoeologs account for three of seven cases in Brachypodium (P-value ≈ 0.00078). However, this phenomenon was not observed during the formation of the other Pooideae chromosomes. Conclusions Notably, we found that the Brachypodium chromosomes formed through trajectories entirely distinct from those of the other Pooideae plants studied, and were well explained by the telomere-centric model.
Our work will contribute to understanding the structural and functional innovation of chromosomes in different Pooideae lineages and beyond. Electronic supplementary material The online version of this article (10.1186/s12864-019-5566-8) contains supplementary material, which is available to authorized users.


Background
Whole-genome duplication (WGD) occurs recursively and shapes plant genomes. Ploidy changes have been quite common during cereal evolution [1]. The origin of cereals was related to a paleopolyploid event ~100 million years ago (Mya). Cereals are the major food in temperate regions. Their genomes are characterized by a high content of repetitive elements, as in the Triticeae plants barley and wheat.
Wheat is now one of the most widely cultivated crops [2], and was domesticated in the Fertile Crescent more than 10,000 years ago [3,4]. It shows diploid-like inheritance but has a genome of hexaploid origin, resulting from the union of three diploid grasses [5]. Hybridization of the tetraploid durum wheat (Triticum turgidum; AABB; 2n = 4x = 28) with the wild diploid grass Aegilops tauschii (DD; 2n = 2x = 14) resulted in hexaploid wheat (Triticum aestivum; AABBDD; 2n = 6x = 42) [6][7][8]. A 10.1-gigabase assembly of the 14 chromosomes of wild tetraploid wheat was reported in 2017 [3]. The genome of the wild wheat progenitor Triticum dicoccoides was sequenced in 2018 [9]. The complex polyploid nature of the large wheat genome complicates genetic and functional analyses [10]. Studying wheat ancestors provides a viable alternative for overcoming the challenge of complex polyploidy [11]. The wheat diploid progenitor species Triticum urartu (AA) [10,12] and Aegilops tauschii (DD) [7,13,14], together with tetraploid wheat Triticum turgidum (AABB) [3,9], provide convenient resources for studying structural changes during wheat genome evolution. Barley (Hordeum vulgare) is among the earliest domesticated crops. A high-quality reference genome assembly for barley was presented [15], and the repetitive fraction of the 5100 Mb barley genome was analyzed in 2017 [16]. Both genetic research and crop improvement in barley have benefited from genome sequencing [17].
Owing to its small and conservative genome, rice has served as a model for other monocotyledonous species. It was the second plant genome to be sequenced, and was reported to have evolved much more slowly than other grasses and to have preserved the ancestral genome structure after the grass-common whole-genome duplication (cWGD) [18,19]. Brachypodium distachyon (Brachypodium) is a member of the Pooideae subfamily. Its morphological and genomic features make it a model monocot plant for both comparative and functional genomics of its Pooideae relatives [20][21][22][23][24][25]. It has a small and compact genome, self-fertility, a life cycle of less than 4 months, and undemanding growth requirements [25,26]. In addition, it is phylogenetically close to barley and wheat [25]. Owing to the availability of its genome sequence [27] and many tools for functional genomics, Brachypodium was proposed as a model for the genomes of all temperate grasses [28]. Molecular cytogenetic studies advanced greatly with the development of Brachypodium bacterial artificial chromosome (BAC) libraries [29]. These resources, coupled with the sequenced genome of Brachypodium, provided insight into grass karyotype evolution [30]. Brachypodium shares extensive synteny with other grasses, so it was a good structural model for the assembly of large genomes [28]. Brachypodium is also regarded as a good intermediate between wheat and rice [31]. Brachypodium pan-genome sequences revealed about twice as many genes as had been inferred from an individual genome [32].
During the evolution of grasses, there has been continual genome repatterning, especially after whole-genome duplications, which are often followed by genome instability and fractionation. Eukaryotic chromosomes are linear structures possessing centromeres and telomeres, which maintain chromosome integrity and prevent chromosome fusions during nuclear divisions. Centromeric sequences may differ between species, while telomeric sequences are usually highly conserved among plants. Karyotype evolution can be resolved by genome sequencing, comparative genetic mapping, and comparative chromosome painting [33]. The A. thaliana karyotype evolution was inferred based on comparative chromosome painting in 2006 [33]. It was proposed that chromosome number reduction is often the result of reciprocal translocations, which combine two chromosomes into a larger one and a smaller one, with the smaller chromosome then being lost during meiosis [34]. Whole-genome duplication and erroneous DNA double-strand break repair are the main sources of genome structural variation [35].
Paleogenomics aims to reconstruct ancestral genomes from the genomes of modern species [36]. Modern genomes arose through centromeric fusion of protochromosomes, leading to neochromosomes [37]. The genome of the common ancestor of flowering plants was reconstructed in 2017 [38]. A new theory of telomere-centric genome repatterning explains chromosome number reduction of linear chromosomes [19], emphasizing the removal of telomeres during the process. Accordingly, the evolutionary trajectories of genome repatterning and chromosome changes along some major grass lineages during the last ~100 million years were reconstructed [18].
So far, the formation and evolutionary trajectory of the Triticeae chromosomes, shared by wheat, barley, and other close relatives, have not been available. Because Brachypodium is regarded as an intermediate linking major grass lineages and as a model plant for the Pooideae, we asked whether and how it mediated the evolution from the ancestral grass chromosomes to the Triticeae chromosomes. Here, by inferring the homology within each genome and between genomes, we reconstructed the evolutionary trajectories of the Triticeae chromosomes and compared them to those of the Brachypodium chromosomes. The present work will contribute to understanding the structural and functional innovation of chromosomes in different Pooideae lineages.

Inferring collinear homologs
Grasses share extensive gene collinearity, that is, thousands of genes share the same chromosomal order in different plants, indicating descent from a common ancestral chromosomal region. To reveal gene collinearity, each genome was compared against the other genomes, and against itself, using BLASTP. The best five hits meeting an E-value threshold of 1 × 10⁻⁵ were retrieved. The homologous pairs were used as the input for MCscan [39], which grouped the syntenic regions to form multiple alignments. The default scoring scheme is a match score of min(−log10 E, 50) for one gene pair and a gap penalty of 1 for each 10 kb of distance between any two consecutive gene pairs. The resulting syntenic chains were evaluated using a procedure adopted from ColinearScan [40], with the E-value threshold set to 1 × 10⁻¹⁰. We enriched the collinear gene data set by running ColinearScan to detect pairwise chromosome homology, which inferred additional small homologous blocks. In collinearity methods, the maximum gap length (mg) is the most important parameter, determining the length, quality, and extensiveness of the predicted collinearity. The mg was set to 40 intervening genes between neighboring collinear genes on both chromosomes. Gene clusters containing 30 or more genes on a chromosome were removed from the present analysis, because they may algorithmically hamper the inference of gene collinearity, especially when they cluster in a neighboring region [41].
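The scoring scheme described above can be illustrated with a minimal sketch. This is not the MCscan implementation; the function names `match_score` and `chain_score` are hypothetical, the match score is assumed to be the negative log of the E-value capped at 50, and the distance between consecutive gene pairs is assumed to be the larger of the two chromosomal gaps, counted in whole 10 kb units.

```python
import math


def match_score(e_value: float, cap: float = 50.0) -> float:
    """Score one collinear gene pair as min(-log10 E, cap)."""
    if e_value <= 0.0:
        # BLAST reports 0.0 for extremely strong hits; treat them as capped.
        return cap
    return min(-math.log10(e_value), cap)


def chain_score(anchors, gap_unit_bp: int = 10_000, gap_penalty: float = 1.0) -> float:
    """Score a syntenic chain.

    anchors: list of (pos_a, pos_b, e_value) tuples ordered along the chain,
    where pos_a and pos_b are gene positions (bp) on the two chromosomes.
    """
    total = 0.0
    previous = None
    for pos_a, pos_b, e_value in anchors:
        total += match_score(e_value)
        if previous is not None:
            # Assumption: use the larger of the two gaps, in whole 10 kb units.
            distance = max(abs(pos_a - previous[0]), abs(pos_b - previous[1]))
            total -= gap_penalty * (distance // gap_unit_bp)
        previous = (pos_a, pos_b)
    return total


# Hypothetical three-gene chain, for illustration only.
example_anchors = [
    (100_000, 200_000, 1e-30),
    (125_000, 230_000, 1e-80),
    (140_000, 236_000, 0.0),
]
example_score = chain_score(example_anchors)
```

Under these assumptions, a chain gains score from strong gene pairs and loses score as consecutive pairs lie farther apart, so dense runs of well-matched genes score highest.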

Dot-plot generation
We used BLASTN to search for CDS anchors (E-value < 1 × 10⁻⁵) between every possible pair of chromosomes in multiple genomes. The best, second best, and other matches meeting this E-value threshold were displayed in different colors, to help distinguish orthology from paralogy, or layers of paralogy resulting from recursive WGD events. Gene families with > 30 members were removed from the analysis, because gene redundancy may lead to an aberrantly fast evolutionary rate and affect the accuracy of analytical results. Dot-plots were produced using Perl scripts [19].
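The ranking of matches for coloring can be sketched as follows. The original dot-plots were produced with Perl scripts that are not shown here, so this Python sketch is only an illustration; the function name `rank_hits`, the input tuple layout, and the labels are hypothetical, and the cutoff is assumed to be the same E-value threshold used for the search.

```python
from collections import defaultdict


def rank_hits(hits, e_cutoff: float = 1e-5):
    """Label each (query, subject) pair as 'best', 'second', or 'other'.

    hits: iterable of (query_gene, subject_gene, e_value) tuples.
    Hits that do not pass the E-value cutoff are dropped.
    """
    # Keep the lowest E-value for each query/subject pair.
    best_e = {}
    for query, subject, e_value in hits:
        key = (query, subject)
        if e_value < e_cutoff and e_value < best_e.get(key, float("inf")):
            best_e[key] = e_value

    by_query = defaultdict(list)
    for (query, subject), e_value in best_e.items():
        by_query[query].append((e_value, subject))

    labels = {}
    for query, matches in by_query.items():
        matches.sort()  # ascending E-value; ties broken by subject name
        for rank, (_, subject) in enumerate(matches):
            if rank == 0:
                labels[(query, subject)] = "best"
            elif rank == 1:
                labels[(query, subject)] = "second"
            else:
                labels[(query, subject)] = "other"
    return labels


# Hypothetical hits, for illustration only.
example_hits = [
    ("g1", "a", 1e-50),
    ("g1", "b", 1e-20),
    ("g1", "c", 1e-8),
    ("g1", "d", 1e-3),
    ("g2", "a", 1e-6),
]
example_labels = rank_hits(example_hits)
```

Each label could then be mapped to a dot color, so that orthologous blocks (mostly best matches) stand out from paralogous blocks left by older duplications.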

Flash cartoon production
We used Adobe Flash to produce multimedia cartoons. The seven ancestral chromosomes, shown in seven different colors, were related to extant and intermediate chromosomes in different grasses. The color scheme integrates previous color schemes for grasses [19] and was also used in the dot-plots.

Statistical significance of homoeologous chromosome fusion
We estimated the occurrence probability of nested chromosomal fusions (NCFs) between homoeologous chromosomes with combinatorial statistics. For instance, the rice genome (2n = 24) was derived from 14 ancestral chromosomes, that is, seven pairs of ancestral homoeologous chromosomes. If merged chromosomes are still viewed as independent chromosomal segments, the probability of such an event can be estimated. For example, the occurrence probability of one out of two NCFs being between homoeologous chromosomes can be estimated with a combinatorial formula.

Inference of Triticeae karyotype evolution
Parsimony-based phylogenomic analysis can help find and relatively date genomic changes, and therefore helps to clarify karyotype evolution. For example, when comparing two grass genomes sharing the 100-Mya tetraploidy [18], a single chromosomal inversion in their common ancestor would result in incongruity between paralogous chromosomes in both grasses, but no incongruity between the corresponding orthologous chromosomes. In contrast, an inversion in a chromosome of one grass genome would lead to incongruity with its orthologous chromosome in the other grass, and at the same time to incongruity with the outparalogous chromosomes. Similarly, this type of analysis can infer the occurrence of chromosome fission, fusion, and number reduction.
Here, to understand the evolutionary trajectories of Pooideae chromosomes, we analyzed the syntenic conservation and chromosome rearrangements between the genomes of Ae. tauschii, barley, Triticum turgidum, and two sequenced grass relatives, rice and Brachypodium for comparison. By searching homologous genes within a genome or between different genomes, we drew homologous gene dotplots, which showed orthologous correspondence between these genomes and paralogous correspondence in each genome.
In the homologous gene dotplots between wheat and its Pooideae relatives, we found that their 7 chromosomes had nearly perfect orthologous correspondence, showing that they inherited their ancestral karyotype and chromosomes without much change in chromosome constitution. A homologous dotplot between rice and the Pooideae grasses showed the evolutionary changes that led to chromosome number reduction from the 12 chromosomes of an ancestral haploid grass genome, as previously studied [19]. The 12 ancestral chromosomes are well represented by the extant rice chromosomes, with one-to-one correspondence. Therefore, for simplicity, we used rice chromosomes Os1-12 to represent ancestral chromosomes A1-12. Correspondence between orthologous chromosomes or chromosomal segments indicated that Triticeae chromosome 1 (T1) formed by a nested fusion of ancestral chromosome Os10 into chromosome Os5 (Figs. 1a, e and 2). The nested fusion process can occur as follows: Os10 underwent a crossing-over to form a major chromosome and a satellite chromosome; the major chromosome then inserted into the centromeric region of Os5, and the satellite chromosome may have been lost. Spatial proximity would then favor ligation, resulting in NCFs.
Likewise, Ae2 (Hv2) formed by a fusion of Os4 and Os7 (Figs. 1b, f and 2), and Ae7 (Hv7) formed by a fusion of Os6 and Os8 (Figs. 1c, g and 2). Ae3 (Hv3) and Ae6 (Hv6) were simple, corresponding to Os1 and Os2, respectively. The most complex evolutionary process involved Ae4 (Hv4) and Ae5 (Hv5). A nested chromosome fusion (NCF) of Os11 and Os3 formed an intermediate, Os11/3. An end-end joining (EEJ) of Os12 and Os9 formed another intermediate, Os12/9, and produced a satellite chromosome. A reciprocal translocation of arms between the two intermediates then produced the extant chromosomes Ae4 (Hv4) and Ae5 (Hv5) (Figs. 1d, h and 2). The dot-plot between Triticum turgidum and rice showed that the chromosome evolutionary process of Triticum turgidum was the same as that of Ae. tauschii and barley (Additional file 1: Figure S1). The evolutionary process of the Triticeae chromosomes is represented in graphs and a video (Fig. 2; Additional file 2: Video S1).
During the formation of the Triticeae chromosomes, four intra-chromosome telomere-proximal crossing-overs occurred, producing four satellite chromosomes and four free-end intermediate chromosomes (Fig. 2), which fused into the peri-centromeric regions of other chromosomes. One inter-chromosome telomere-proximal crossing-over occurred, producing an end-end merged chromosome and a satellite chromosome. All five satellite chromosomes were lost, which reduced the chromosome number from 12 to 7 in extant Triticeae genomes. In addition, an inter-chromosome in-arm crossing-over occurred, exchanging DNA between two chromosomes.
Fig. 1 Chromosome fusions during the evolution of Hordeum vulgare and Aegilops tauschii. Chromosomes, shown as rectangular blocks, are arranged horizontally and vertically along the dot-plot. The color scheme for the chromosomes of grasses (A1-A7, the seven ancestral chromosomes, shown in seven different colors and related to chromosomes in different grasses) follows that of a previous study [19]. Homologous blocks can be classified as primary, resulting from chromosomal orthology, and secondary, resulting from paralogy from ancestral polyploidy. Hv, Hordeum vulgare; Ae, Aegilops tauschii; Os, Oryza sativa.

A comparison of karyotype evolution in Pooideae
As reported previously, the extant Brachypodium chromosomes (Bd1-5) formed exclusively by recursive NCFs (Brachypodium genome sequencing project), which resulted in the formation of 7 satellite chromosomes [19]. Here we show the evolutionary process by following the telomere-centric model. Bd1 formed by two fusions; a fusion of Os3 and Os7 formed an intermediate (Figs. 3a and 4). Bd2 formed by an NCF of Os5 into Os1. Bd3 also formed by two NCFs, with Os8 and Os10 nested into Os2 sequentially, in this or the reverse order (Figs. 3c and 4). Bd4 also formed by two NCFs, with Os11 and Os12 nested into Os9 (Figs. 3d and 4). Bd5 preserved the structure of Os4 (Fig. 4).
During the formation of the five extant Brachypodium chromosomes, seven nested chromosome fusions occurred, producing seven satellite chromosomes. The loss of these satellite chromosomes resulted in the chromosome number reduction from 12 to 5.
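The chromosome number bookkeeping for the two lineages can be checked with a short sketch. It assumes, following the telomere-centric model described in the text, that every fusion event (NCF or EEJ) yields one satellite chromosome that is subsequently lost. The dictionary layout, the generic chromosome labels, and the function name `summarize` are hypothetical, and only event counts per chromosome are encoded.

```python
from collections import Counter

ANCESTRAL_N = 12  # ancestral haploid number, represented by rice Os1-12

# Fusion events per extant chromosome, as described in the text.
TRITICEAE_FUSIONS = {
    "Chr1": ["NCF"],               # Os10 nested into Os5
    "Chr2": ["NCF"],               # Os4 and Os7
    "Chr3": [],                    # corresponds to Os1
    "Chr4+Chr5": ["NCF", "EEJ"],   # Os11/3 (NCF) and Os12/9 (EEJ), then arm exchange
    "Chr6": [],                    # corresponds to Os2
    "Chr7": ["NCF"],               # Os6 and Os8
}

BRACHYPODIUM_FUSIONS = {
    "Bd1": ["NCF", "NCF"],  # two fusions
    "Bd2": ["NCF"],         # Os5 nested into Os1
    "Bd3": ["NCF", "NCF"],  # Os8 and Os10 nested into Os2
    "Bd4": ["NCF", "NCF"],  # Os11 and Os12 nested into Os9
    "Bd5": [],              # preserves the structure of Os4
}


def summarize(fusions, ancestral_n: int = ANCESTRAL_N):
    """Count events by type and derive the extant chromosome number.

    Assumption: each fusion event loses exactly one satellite chromosome.
    """
    counts = Counter(event for events in fusions.values() for event in events)
    satellites_lost = sum(counts.values())
    return {
        "events": dict(counts),
        "satellites_lost": satellites_lost,
        "extant_n": ancestral_n - satellites_lost,
    }


triticeae_summary = summarize(TRITICEAE_FUSIONS)
brachypodium_summary = summarize(BRACHYPODIUM_FUSIONS)
```

With these inputs, the tally gives 4 NCFs plus 1 EEJ and 7 chromosomes for Triticeae, and 7 NCFs and 5 chromosomes for Brachypodium, matching the counts stated in the text.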

Distinct evolutionary pathways taken by chromosomes of Brachypodium and its Pooideae relatives
Notably, the above inference of the evolutionary trajectories of Pooideae chromosomes showed that the karyotypes of Brachypodium and of its Pooideae relatives under consideration formed totally independently (Figs. 2 and 4). This means that not a single event forming intermediate or extant chromosomes, e.g., a crossing-over or a fusion, was shared by the two lineages.
Besides, in Brachypodium, the 'invading' and 'invaded' chromosomes are homoeologs more frequently than would be expected by chance. Homoeologs originate from duplication of a common ancestral chromosome and therefore have more extensive DNA-level correspondence to one another than random chromosomes. In Brachypodium, three out of seven NCF events occurred between homoeologous chromosomes, and the corresponding probability can be estimated as $P = \binom{7}{1}\binom{6}{1}\binom{5}{1} / [\binom{14}{2}\binom{12}{2}\binom{10}{2}] \approx 0.00078$, where $\binom{n}{m} = n!/[m!(n-m)!]$. However, this phenomenon was not observed during the formation of the other Pooideae chromosomes, for which no homoeologous fusion occurred. These observations suggest that chromosomes in the two lineages evolved along entirely different trajectories.
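The arithmetic behind the reported P-value can be reproduced with a few lines of Python. The sketch simply evaluates the combinatorial expression given above; the function name `homoeolog_fusion_probability` and its parameters are hypothetical, and `math.comb` requires Python 3.8 or later.

```python
from math import comb


def homoeolog_fusion_probability(n_pairs: int = 7, n_events: int = 3) -> float:
    """Evaluate C(7,1)C(6,1)C(5,1) / [C(14,2)C(12,2)C(10,2)] for the defaults.

    Each successive event picks one of the remaining homoeologous pairs
    (numerator) out of all possible pairs of the remaining chromosomes
    (denominator).
    """
    numerator = 1
    denominator = 1
    for i in range(n_events):
        remaining_pairs = n_pairs - i
        numerator *= comb(remaining_pairs, 1)
        denominator *= comb(2 * remaining_pairs, 2)
    return numerator / denominator


p_value = homoeolog_fusion_probability()
```

For the default arguments the expression evaluates to 210/270270, which rounds to the reported value of 0.00078.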

Ancestral genome reconstruction
By checking gene collinearity, we revealed homologous genes within the Triticeae genomes, and between them and other grass relatives. Most of the collinear genes within Triticeae were produced by the grass cWGD [42,43]. Here we used two methods to show the collinearity information. On the one hand, the putative 7 ancient chromosomes were inferred from collinear genes in paralogous regions in a genome, as shown previously [19], and these preserved genes were used to relate them to extant chromosomal regions (Fig. 5). On the other hand, rice genes on its 12 chromosomes were related to other genomes to show the collinearity/orthology between them (Fig. 5). These two representation schemes helped find homologous regions between genomes and display the results of evolutionary repatterning.

A different evolutionary history
Integrated synteny and phylogenomic analyses of grass genomes have revealed ancient polyploidy events and lineage-specific WGD events [44]. WGD events have been of central importance in angiosperm macroevolution and have provided raw material for natural selection [45,46]. Poaceae was profoundly influenced by a WGD event that occurred ~100 Mya [18]. Following the WGD, genomic instability increased, with extensive chromosomal rearrangements and numerous gene losses [42,43,47,48]. These changes eventually led to the formation of a new diploid karyotype [33,36,49]. Factors including gene loss, chromosomal rearrangement events and repeat-rich sequence accumulation may have contributed to the evolutionary history, which has to be left for future exploration.
Fig. 3 Chromosomes, shown as rectangular blocks, are arranged horizontally and vertically along the dot-plot. The color scheme for the chromosomes of grasses (A1-A7, the seven ancestral chromosomes, shown in seven different colors and related to chromosomes in different grasses) follows that of a previous study [19]. Homologous blocks can be classified as primary, resulting from chromosomal orthology, and secondary, resulting from paralogy from ancestral polyploidy. Bd, Brachypodium distachyon; Os, Oryza sativa. a Formation of chromosome Bd1; b formation of chromosome Bd2; c formation of chromosome Bd3; d formation of chromosome Bd4
The evolution of chromosome number in organisms is caused by the rearrangement of centromeres and telomeres [50]. The mechanisms of chromosome number change have been studied in certain eukaryotes, including the fusion of two chromosomes and the insertion of whole chromosomes into other centromeres [51][52][53][54][55][56]. As to chromosome number reduction, we previously proposed a telomere-centric model to explain likely mechanisms, emphasizing the role of telomeres during the process [19]. Telomeres were inferred to be removed either from the same chromosome, forming an intermediate free-end chromosome that would eventually insert into another chromosome, or from two different chromosomes, the major structures of which would fuse to produce a larger chromosome. During the process, a satellite chromosome, formed by the two removed telomeres and some intervening DNA, would be produced. If the satellite chromosome was lost or not counted, that would explain chromosome number reduction. Chromosomes evolved along entirely different trajectories in the two studied lineages, Triticeae and Brachypodium. Because Brachypodium has been taken as a model for Triticeae plants, we would have anticipated that chromosomes in the two lineages shared much of their evolution. Interestingly, we found that the Triticeae chromosomes were produced by the sequential occurrence of 4 NCFs and 1 chromosome end-end merge, likely producing 5 satellite chromosomes, while the Brachypodium chromosomes were produced by 7 NCFs, likely producing 7 satellite chromosomes. The loss of those satellite chromosomes resulted in chromosome number reduction. Notably, the two lineages evolved their extant chromosomes through entirely different trajectories, that is, not a single event forming intermediate or extant chromosomes, e.g., a crossing-over or a fusion, was shared by the two lineages.
In Brachypodium, the 'invading' and 'invaded' chromosomes are homoeologs more frequently than would be expected by chance. Homoeologs originate from duplication of a common ancestral chromosome and therefore have more extensive DNA-level correspondence to one another than random chromosomes. NCF events between homoeologs account for three of seven cases in Brachypodium (P-value ≈ 0.00078). However, this phenomenon was not observed during the formation of the other Pooideae chromosomes. The situations were completely different along the two lineages.

Conclusions
Because Brachypodium is regarded as an intermediate linking major grass lineages and as a model plant for the Pooideae, we asked whether it mediated the evolution from the ancestral grass karyotype to the Triticeae karyotype. Notably, we found that the Brachypodium chromosomes formed through trajectories entirely distinct from those of the other Pooideae plants studied, and were well explained by the telomere-centric model. Our work will contribute to understanding the structural and functional innovation of chromosomes in different Pooideae lineages and beyond.