Flowering plants have extensively and often recursively experienced polyploidization [1–4]. The resulting duplicated regions, especially those produced recently, offer the means to further study the contributions of segmental and/or whole-genome duplication/triplication to the evolution of a lineage, but add to genome complexity. The high abundance of repetitive DNA sequences in some flowering plants adds further to genome complexity. At present, many plant genomes have been or are being sequenced. Draft genome sequences can lack sufficient contiguity in many genomic regions to support cross-species comparison of genome organization and structure, which is crucial to understanding plant evolution and speciation. In concert with sequence assemblies, independent physical maps often facilitate the correct ordering of DNA segments on chromosomes and thus clarify the genome organization changes revealed by multiple species comparisons [5, 6].
Brassica is in the tribe Brassiceae, a well-defined clade in the family Brassicaceae, which also includes Arabidopsis thaliana, the source of the first flowering plant genome to be sequenced. Brassica and Arabidopsis are thought to have shared common ancestry ~14-20 million years ago [7–10]. The genus Brassica has great scientific and economic importance. Crops of the genus Brassica are widely used in the cuisine of many cultures and provide much of the worldwide supply of edible vegetable oil. Six Brassica species are widely cultivated, including three diploids: B. rapa (AA, 2n = 20), B. nigra (BB, 2n = 16) and B. oleracea (CC, 2n = 18), and three amphidiploids (allotetraploids): B. juncea (AABB, 2n = 36), B. napus (AACC, 2n = 38) and B. carinata (BBCC, 2n = 34).
Study of B. oleracea offers particularly great promise of new insights into morphological evolution that complement and extend what is available in Arabidopsis [12–14]. In B. oleracea, morphological divergence has been unusually rapid relative to reproductive isolation, i.e., this single species has a stunning range of morphologies among genotypes that are readily intercrossed. While domestication of most crops resulted in enhancement of a single plant part for use by humans, such as the seeds/grains of cereal crops, the fruits of some trees, or the roots of some vegetable crops, the B. oleracea crops are a striking exception. They include forms that have been selected for enlarged vegetative meristems at the apex (cabbages, B. oleracea subspecies capitata) or in the leaf axils (Brussels sprouts, subsp. gemmifera), forms with proliferation of floral meristems (broccoli, subsp. italica) or even aborted floral meristems (cauliflower, subsp. botrytis), and forms with swollen bulbous stems (kohlrabi, subsp. gongylodes) or ornate leaf patterns (kales, subsp. acephala). These morphologically divergent genotypes ('morphotypes') intercross freely.
The plasticity of B. oleracea makes it a potential model for the study of plant morphological evolution in much the same manner that the dog (Canis spp.) is an attractive model for mammalian evolution. While a few genes, such as the homologs of the Arabidopsis gene identified by the "CAULIFLOWER" mutant, are thought to play roles in some Brassica morphologies [15–17], these morphologies are under complex genetic control [18–21]. Some Brassica QTLs map to locations that correspond to relevant Arabidopsis mutants, suggesting positional candidates, but many do not, suggesting the opportunity to identify functions that are recalcitrant to mutation in Arabidopsis [22, 23] or that escaped detection due to small phenotypic effects.
Due to their close phylogenetic relationship, Brassica-Arabidopsis comparative genomics promises to identify genetic determinants of a much broader spectrum of variation than might be accessible using Arabidopsis alone [12–14]. The close relationship of Brassica to Arabidopsis motivated NSF-funded low-coverage (0.6×) sequencing of B. oleracea (BO) genotype TO1000. However, while the physiology and developmental biology of Arabidopsis and Brassica are similar, the genomes of Brassica species are much more complex than that of A. thaliana [26–28]. The 'diploid' Brassica genomes are 3-5 times larger than that of Arabidopsis, ranging from 0.97 pg/2C (468 Mb/1C) for B. nigra to 1.37 pg/2C (662 Mb/1C) for B. oleracea, partially as a result of multiple rounds of polyploidy during their ancestry [29, 30]. One round of ancient whole-genome triplication (gamma) in an early eudicot ancestor and two whole-genome duplications (beta and alpha) occurred before the Arabidopsis-Brassica split [4, 31, 32]. Additional polyploidization(s) occurred in the Brassica lineage after its divergence from Arabidopsis, reflected by large duplicated segments in the genetic maps of each of the three diploids [B. rapa (syn. campestris), B. nigra and B. oleracea] [27, 33–36]. The corresponding duplicated structure of the B. rapa and B. oleracea maps indicates that the species diverged after a polyploidization that resulted in whole-genome triplication [29, 37–39]. It was estimated that the genome triplication event and the initial diversification of the Brassiceae must have occurred between 7.9 and 14.6 mya, which might be the hypothesized single major evolutionary event that gave rise to the early lineages. According to analysis of the FLOWERING LOCUS C region, it was further estimated that the Brassica triplication occurred 13 to 17 mya, very soon after the Arabidopsis and Brassica divergence at 17-18 mya.
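The paired genome-size figures quoted above (pg/2C and Mb/1C) can be related by a simple conversion. The sketch below assumes a factor of about 965 Mb per picogram of DNA, which is not stated in the text but is consistent with the quoted values; the function name is illustrative only.

```python
# Assumed conversion factor (not given in the text): ~965 Mb per pg of DNA.
MB_PER_PG = 965.0


def haploid_size_mb(pg_per_2c: float, mb_per_pg: float = MB_PER_PG) -> float:
    """Convert a 2C nuclear DNA content (pg) to a 1C genome size (Mb)."""
    # Halve the 2C value to obtain 1C, then scale from mass to base pairs.
    return pg_per_2c / 2.0 * mb_per_pg


for species, pg in [("B. nigra", 0.97), ("B. oleracea", 1.37)]:
    print(f"{species}: {pg} pg/2C ~ {haploid_size_mb(pg):.0f} Mb/1C")
```

Under this assumed factor the two values come out at roughly 468 Mb and 661 Mb, which agrees with the quoted 468 Mb and 662 Mb to within rounding of the underlying measurements.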
Significant progress has been made in developing genomic resources to expedite Brassica research [41–44]. A detailed genetic linkage map of B. rapa has been constructed containing 545 sequence-tagged loci distributed on 10 linkage groups covering 1287 cM, with an average interval of 2.4 cM between markers. Genetic linkage maps were constructed for four B. oleracea populations, with an average length of 863.6 cM; the resulting composite map contained a total of 367 loci with an average interval between loci of 2.35 cM, and revealed at least 19 chromosomal rearrangements differentiating B. oleracea and Arabidopsis. Linkage maps of immortal mapping populations of rapid-cycling, self-compatible lines from B. rapa and B. oleracea were recently developed, which included 224 and 279 markers, respectively. A genome-wide physical map of the B. rapa genome was constructed by high-information-content fingerprinting (HICF), a fluorescence-labeled fingerprinting approach that improves both the throughput and the quality of physical map construction. The map provided 242 anchored contigs on 10 linkage groups to serve as seed points from which to continue bidirectional chromosome extension for genome sequencing. There are also efforts to refine genetic linkage maps. Genome sequencing projects involving the "A" and "C" genomes are ongoing or planned [47, 48]. The Multinational Brassica Genome Project (MBGP) and Brassica rapa Genome Sequencing Project (BrGSP) aim to completely sequence the genome of Brassica rapa inbred line 'Chiifu' (http://www.brassicagenome.org; http://www.brassica-rapa.org).
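The average marker intervals quoted for the linkage maps above can be checked with a one-line calculation. This is a minimal sketch that assumes the interval is simply total map length divided by the number of loci; the cited studies may have used a slightly different definition (for example, dividing by the number of intervals rather than loci), and the function name is hypothetical.

```python
def mean_interval_cm(map_length_cm: float, n_loci: int) -> float:
    """Average spacing between loci, assuming length / number of loci."""
    return map_length_cm / n_loci


# Values taken from the text: (map, total length in cM, number of loci).
maps = [
    ("B. rapa map", 1287.0, 545),
    ("B. oleracea composite map", 863.6, 367),
]
for name, length_cm, n_loci in maps:
    print(f"{name}: {mean_interval_cm(length_cm, n_loci):.2f} cM per locus")
```

Under this assumption both maps have a marker density of about 2.4 cM per locus, in line with the quoted 2.4 cM and 2.35 cM.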
Here we report a physical map of a rapid-cycling strain of B. oleracea (accession TO1434), integrating high-information-content fingerprinting (HICF) of Bacterial Artificial Chromosome (BAC) clones with overgo hybridization data from 2882 probes, including about 600 that have been genetically mapped. By integrating the B. rapa physical map, we explored genome-wide microsynteny between Arabidopsis and Brassica, and found probable (peri)centromere-related contigs. Comparison of the B. oleracea map with Arabidopsis and other available eudicot genomes showed appreciable 'shadowing' produced by more ancient polyploidies, which results in a web of relatedness among contigs and adds to genomic complexity; it also revealed interchromosomal breakpoints that arose during the diversification of these genomes. This physical map is of immediate value for gene isolation, and will serve as a valuable genomic resource for Brassica "C" genome sequencing, assembly of BAC sequences and further comparative genomics between Brassica genomes.