Complexity of genome evolution by segmental rearrangement in Brassica rapa revealed by sequence-level analysis
- Martin Trick†1,
- Soo-Jin Kwon†2,
- Su Ryun Choi3,
- Fiona Fraser1,
- Eleni Soumpourou1,
- Nizar Drou1,
- Zhi Wang3,
- Seo Yeon Lee3,
- Tae-Jin Yang4,
- Jeong-Hwan Mun2,
- Andrew H Paterson5,
- Christopher D Town6,
- J Chris Pires7,
- Yong Pyo Lim3,
- Beom-Seok Park2 and
- Ian Bancroft1
© Trick et al; licensee BioMed Central Ltd. 2009
Received: 31 July 2009
Accepted: 18 November 2009
Published: 18 November 2009
The Brassica species, related to Arabidopsis thaliana, include an important group of crops and represent an excellent system for studying the evolutionary consequences of polyploidy. Previous studies have led to a proposed structure for an ancestral karyotype and models for the evolution of the B. rapa genome by triplication and segmental rearrangement, but these have not been validated at the sequence level.
We developed computational tools to analyse the public collection of B. rapa BAC end sequence, in order to identify candidates for representing collinearity discontinuities between the genomes of B. rapa and A. thaliana. For each putative discontinuity, one of the BACs was sequenced and analysed for collinearity with the genome of A. thaliana. Additional BAC clones were identified and sequenced as part of ongoing efforts to sequence four chromosomes of B. rapa. Strikingly few of the 19 inter-chromosomal rearrangements corresponded to the set of collinearity discontinuities anticipated on the basis of previous studies. Our analyses revealed numerous instances of newly detected collinearity blocks. For B. rapa linkage group A8, we were able to develop a model for the derivation of the chromosome from the ancestral karyotype. We were also able to identify a rearrangement event in the ancestor of B. rapa that was not shared with the ancestor of A. thaliana, and is represented in triplicate in the B. rapa genome. In addition to inter-chromosomal rearrangements, we identified and analysed 32 BACs containing the end points of segmental inversion events.
Our results show that previous studies of segmental collinearity between the A. thaliana, Brassica and ancestral karyotype genomes, although very useful, represent over-simplifications of their true relationships. The presence of numerous cryptic collinear genome segments and the frequent occurrence of segmental inversions mean that inference of the positions of genes in B. rapa based on the locations of orthologues in A. thaliana can be misleading. Our results will be of relevance to a wide range of plants that have polyploid genomes, many of which are being considered according to a paradigm of comprising conserved synteny blocks with respect to sequenced, related genomes.
The cultivated Brassica species, like Arabidopsis thaliana, are members of the Brassicaceae family. Brassica rapa (n = 10) contains the Brassica A genome, which is the smallest, at ca. 500 Mb. A genome sequencing project is underway (http://brassica.bbsrc.ac.uk/). A number of genome analysis studies have shown that the Brassica genomes contain extensive triplication, consistent with their having evolved from a hexaploid ancestor [3–5]. Two sequence-level studies, one in B. oleracea and one in B. rapa, have provided further support for the hypothesis of hexaploid ancestry for the Brassica species. Recent cytogenetic studies have shown that a distinctive feature of the Brassiceae tribe, of which the Brassica species are members but A. thaliana is not, is that they contain extensively triplicated genomes.
An elegant study using sequenced RFLP markers demonstrated that 21 segments of the genome of A. thaliana, representing almost its entirety, could be replicated and rearranged to generate a structure approximating that of the B. napus genome. In a similarly ground-breaking study, an ancestral karyotype (AK) of n = 8 was proposed for the Brassicaceae, which has been related to the A. thaliana genome sequence and the structure of the B. rapa genome derived by linkage mapping. Thus the genome sequence of A. thaliana is being used, either directly via the B. napus comparative analysis or indirectly via the AK inferred genome, to inform studies in the Brassica species. A complication in such comparative studies is that there are typically multiple orthologues in Brassica species for each gene represented in A. thaliana, although interspersed gene loss has reduced the number that might be expected in paleohexaploids such as the Brassica species.
Brassica species have been used to study the early responses of genomes to the induction of polyploidy, via resynthesis of B. napus by hybridization of B. rapa with B. oleracea. Such lines display genome instability, which can persist for many generations. Although this is hypothesised to involve homoeologous non-reciprocal translocations, such evolutionary events have not been studied at the sequence level. Indeed, sequence-level studies in Brassica to date have focussed on regions that show collinearity between the Brassica genome studied and that of A. thaliana. Similarly, in comparative studies in grass genomes, which are considered very much in terms of rearranged collinear blocks, little attention has been paid to the regions of collinearity breakdown.
We aimed to test the veracity of our present understanding of the evolution of the Brassica and Arabidopsis genomes from the AK genome by identifying and sequencing BAC clones containing genomic DNA of B. rapa that represent a sample of collinearity discontinuities (CDs) relative to the A. thaliana genome. This involved the development of bioinformatics tools and accessing data arising from ongoing activities to sequence the first four of the ten chromosomes of B. rapa.
Identification of BAC clones putatively containing collinearity discontinuities
We developed a method by which candidate B. rapa BAC clones spanning CDs with the Arabidopsis genome could be identified and selected for sequencing. Our starting point was the set of BAC end sequence (BES) data available for the combined libraries from the 'Chiifu' cultivar, which is the subject of the multinational genome sequencing project. Using a strategy opposite to that employed for selection of seed BACs for that programme, we analysed the mate-pairs within the BES data primarily for inferred disruptions in short- to medium-range synteny (up to five times the average BAC insert size, i.e. 500 kb). We first conducted a BLASTN similarity search against the Arabidopsis genome sequence with all 200,031 individual BES available from 106,144 B. rapa BAC clones, these sequences comprising 93,887 mate-pairs and 12,257 singletons. For each BES we recorded the pseudochromosome coordinates of the most significant alignment that passed a threshold E-value of 1E-30. Of the clones with both mate-pair BES available, 26,574 (28%) gave mappings at both ends that passed this threshold and were therefore amenable to further analysis.
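The selection step described above (best alignment per BES, then clones with both ends mapped) can be illustrated with a minimal Python sketch. This is not the authors' code. The tuple layout of the HSP records and the function names `best_hits` and `mapped_mate_pairs` are assumptions made for illustration; the `_F`/`_R` suffix convention follows the clone names quoted in Methods (e.g. KBrH088K13_F and KBrH088K13_R).

```python
from collections import defaultdict

E_VALUE_CUTOFF = 1e-30  # threshold stated in the text


def best_hits(hsps, cutoff=E_VALUE_CUTOFF):
    """Keep, for each BES, the most significant alignment that passes the cutoff.

    `hsps` is an iterable of (bes_name, chrom, start, end, strand, evalue)
    tuples; this record layout is hypothetical.
    """
    best = {}
    for name, chrom, start, end, strand, evalue in hsps:
        if evalue > cutoff:
            continue  # alignment is not significant enough
        if name not in best or evalue < best[name][4]:
            best[name] = (chrom, start, end, strand, evalue)
    return best


def mapped_mate_pairs(best):
    """Return {clone: (forward_hit, reverse_hit)} for clones with both ends mapped.

    Assumes BES names of the form <clone>_F and <clone>_R.
    """
    ends = defaultdict(dict)
    for name, hit in best.items():
        clone, _, direction = name.rpartition("_")
        ends[clone][direction] = hit
    return {c: (e["F"], e["R"]) for c, e in ends.items() if "F" in e and "R" in e}
```

Clones for which only one end passes the threshold drop out at the second step, which is consistent with only a fraction of the mate-pairs being amenable to further analysis.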
We loaded these pseudochromosome mappings into our own copy of the ATIDB Arabidopsis genome database to enable a programmatic analysis. A Perl script was developed to interrogate the database and to identify associations between non-contiguous regions of the Arabidopsis genome that are linked by a number of disjoined mate-pair mappings and thus produce a list of cognate B. rapa BAC clones that might contain discontinuities. The algorithm we used is described in more detail in Methods. Our initial approach took into account several factors: filtering out instances of clone duplications and discounting mate-pair mappings whose DNA strand dispositions differed from the majority. We experimented with a threshold number, over the range of 2-5, of independent mate-pair mappings linking any given pair of bins required to signal an association.
BAC clones potentially representing CDs with the A. thaliana genome were thus selected, one from each association identified by three or more BAC clones. These BAC clones were sequenced and annotated, inter alia, for similarity to A. thaliana gene models and to B. rapa BES using BLASTN. Of the 68 sequenced BACs, 38 were found not to contain CDs. In the majority of these (25), the BACs show alignment of multiple gene models from two regions of the A. thaliana genome. These pairs of regions are related to each other, representing paralogous segments. The sequence at one end of each such B. rapa BAC shows the highest similarity to the corresponding gene model from one of the A. thaliana genome segments, whereas the sequence at the other end shows the highest similarity to the corresponding gene model from the other A. thaliana genome segment. We termed this paralogue conflation. In the remaining 13 cases, there appears to be, at one end of the clone, a small stretch of inverted sequence or a single gene (or gene fragment) with similarity elsewhere in the A. thaliana genome. The remaining 30 B. rapa BAC clones contain similarity to two or more collinear runs of multiple A. thaliana gene models.
Results of database interrogation for collinearity discontinuities (table columns: B. rapa BES; B. rapa + B. oleracea BES)
Sequence validation of putative collinearity discontinuities
Characterization of B. rapa BAC clones containing discontinuities in collinearity with the A. thaliana genome (table columns: Collinearity with A. thaliana; No. confirming BACs)
4g03430-4g03630| 4g05460-4g05430| 3g26280-3g26570
4g38560-4g38350| 4g14145-4g14350| 1g30480-1g30400
2g22840-2g20920| 2g41990-2g42005| 5g49760-5g49900
3g52930-3g52770| 1g02080-1g01980| 5g42100-5g42020
2g20440-2g20900| 3g24620-3g25290| 1g62200-1g63390
2g25290-2g26170| 4g00040-4g00080| 2g24690-2g24450
5g28060-5g28150| 5g30510-5g28490| 5g49570-5g49360
2g05540-2g05760| 2g07690-2g05840| 2g11890-2g12480
4g35450-4g35335| 4g23620-4g24120| 4g36130-4g36140
5g59700-5g59650| 5g60110-5g59320| 5g59030-5g59130
4g36760-4g36870| 4g37240-4g36880| 4g37260-4g37410
Fifteen of the 19 inter-chromosomal CDs were genetically mapped in the B. rapa genome, either by direct linkage mapping or by sequence overlap with a BAC mapped by linkage mapping described elsewhere, as summarised in Table 2. These could be related to the positions in the Brassica A genome of CDs previously inferred by linkage mapping, defined relative to A. thaliana chromosomes and subsequently to the AK. We will use the nomenclature At(chromosome number, letter) to refer to the previously described A. thaliana chromosome blocks (e.g. At1A refers to A. thaliana chromosome 1, block A) and AK(letter) to refer to the ancestral karyotype blocks (e.g. AKA refers to ancestral karyotype block A).
The sequences within KBrH010M06 represent the end of collinearity block At3C (AKM) and sequences internal to collinearity block At5F (AKX), but neither had been identified previously on linkage group A3.
The sequences within KBrH034P16 are internal to collinearity blocks At3D and At4B (AKN and AKT). Although At4B (AKT) had been identified previously on B. rapa linkage group A3, At3D (AKN) had not. Therefore the transition previously inferred on this linkage group between collinearity blocks At4B (AKT) and At3A (AKF) [9, 10] may be more complex than anticipated. This is supported by the results of analysis of the structure of the Brassica A genome as represented in B. juncea, in which AKN and AKG-H were identified between AKT and AKF.
The sequences within KBrB055E21 represent the ends of collinearity blocks At3A and At4A (AKF and AKO). This is consistent with the transition inferred on the basis of linkage mapping, but is not consistent with the inferred interpolation of AKP between AKF and AKO that has been proposed.
The sequences within KBrH004I22 correspond to the end of At3A (AKF) and sequences internal to At2A (or at the end of AKK). Although At3A (AKF) had been identified previously on linkage group A3, At2A (AKK) had not. Only two copies of AKK had been identified previously, so our study may have identified the position of the "missing" third block in the B. rapa genome that would be expected from its paleohexaploid ancestry. The linkage mapping study had identified markers on A3 that have similarity to this region of the A. thaliana genome, but there was insufficient evidence to call the block.
The sequences within KBrH004M24 correspond to the end of At2A (AKK) and sequences internal to At5D (or at the end of AKV) and confirm one of the CDs on linkage group A6 previously inferred.
The sequences within KBrH001J23 correspond to the end of collinearity block At3C (AKM) and sequences internal to collinearity block At1B (AKB). Block At1B (AKB) had been positioned on linkage group A6 previously, but no copy of At3C (AKM) had been positioned previously on this linkage group. Thus this BAC represents the position of the third copy of this segment (along with those identified in BACs KBrH010M06 and KBrH109L07).
The sequences in KBrB028F11 represent sequences at the end of collinearity block At2A (AKH) and within At1D (AKD). Although At1D (AKD) had been identified previously on B. rapa linkage group A9, At2A (AKH) had not. Therefore the transitions previously inferred on this linkage group bordering block At1D (AKD) [9, 10] may be more complex than anticipated. This is supported by the results of analysis of the structure of the Brassica A genome as represented in B. juncea, in which AKH was identified as being adjacent to AKD. The linkage mapping study had identified markers on A9 that have similarity to this region of the A. thaliana genome, but there was insufficient evidence to call the block. The transition revealed by the BAC sequence shows an additional small segment in between At2A (AKH) and At1D (AKD), with collinearity to the end of At3A (AKF).
The sequences in KBrB026A12 represent sequences at the end of collinearity block At2A (AKK) and within At5D (AKV). Although At5D (AKV) had been identified previously on linkage group A9, the part of At2A corresponding to AKK had not. Therefore the transitions previously inferred on this linkage group bordering block At5D (AKV) [9, 10] may be more complex than anticipated. Only two copies of AKK had been identified previously, so our study has identified the position of the "missing" third block in the B. rapa genome that would be expected from its paleohexaploid ancestry. The linkage mapping study had identified markers on A9 that have similarity to this region of the A. thaliana genome, but there was insufficient evidence to call the block.
The sequences in KBrH006I08 represent sequences at the end of collinearity block At3D (AKN) and within At2B (AKI). They confirm one of the CDs on linkage group A9 previously inferred [9, 10], but indicate that At2B (AKI), as represented on linkage group A9, may be truncated.
Twenty-four of the 32 CDs representing the end points of intra-chromosomal rearrangements (segmental inversions) were mapped in the B. rapa genome, either by direct linkage mapping or by overlap with a BAC mapped by linkage mapping. They occur genome-wide, as summarised in Table 2. Few such rearrangements had been inferred previously, and these had not been clearly defined. The positions on linkage group A8 of BAC clones KBrB022J01 and KBrH064I20, and on linkage group A1 of BAC clones KBrH001C16, KBrH027B04, KBrB008F10 and KBrB090F01, are consistent with those expected for the inversions noted in At4B segments on these chromosomes.
In addition to the small segmental inversion (relative to the A. thaliana genome) contained within KBrH066L21, we found further examples of secondary rearrangements at the points of CDs in KBrB022J01 and KBrH108B16, and wholly contained inversions within KBrB011D06 and KBrH064I20, as illustrated in Figure 8. We identified one example of a CD apparently representing intra-chromosomal rearrangements that were separated by sequences from elsewhere in the genome. The sequences in KBrH026A01 represent a small segment from the end of At4A (AKO) at one end of a segmental inversion within At2B (AKI), as illustrated in Figure 8.
Analysis of collinearity discontinuity sequences
None of the BAC clones containing the CDs was found to contain B. rapa satellite repeat sequences characteristic of centromeres [16, 17], and none contained tandem tracts of the TTTAGGG repeats that are associated with telomeres. Clone KBrH108B16 did have a cluster of 31 such repeats interspersed over 926 bp, but this lay some 17 kb from the CD.
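A scan for tandem tracts of the telomere-associated TTTAGGG repeat, of the kind reported above, could look like the following Python sketch. The text does not say how the scan was performed or what minimum tract length was used, so the regular-expression approach, the function names and the `min_copies=3` default are all assumptions for illustration. Both strands are covered by also searching for the reverse complement (CCCTAAA).

```python
import re

TELOMERE_UNIT = "TTTAGGG"  # repeat unit named in the text


def revcomp(seq):
    """Reverse complement of an upper-case DNA string."""
    return seq.translate(str.maketrans("ACGT", "TGCA"))[::-1]


def tandem_tracts(seq, unit=TELOMERE_UNIT, min_copies=3):
    """Return sorted (start, end, copies) for perfect tandem tracts of `unit`.

    Coordinates are 0-based, end-exclusive. `min_copies` is an arbitrary
    illustrative threshold, not a value taken from the paper.
    """
    seq = seq.upper()
    tracts = []
    for motif in (unit, revcomp(unit)):
        pattern = re.compile(f"(?:{motif}){{{min_copies},}}")
        for m in pattern.finditer(seq):
            copies = (m.end() - m.start()) // len(unit)
            tracts.append((m.start(), m.end(), copies))
    return sorted(tracts)
```

A pattern like this finds only uninterrupted tracts. Interspersed repeats, such as the cluster described for KBrH108B16, would need a looser search, for example counting single-unit matches within a sliding window.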
Summary of features over identified collinearity discontinuities and all annotated B. rapa sequence (table rows: gene model density/kb; models with EST support/kb; mapped Solexa leaf read density/kb)
Computational methods were used to successfully identify BAC clones representing verified CDs between the genomes of B. rapa and A. thaliana, and relative to an ancestral karyotype. Along with CDs identified during the ongoing chromosome sequencing project, these represent a substantial (but incomplete) sampling of the CDs in the genome of B. rapa. Previous studies had defined a segmental structure for the paleohexaploid Brassica genome based largely on genetic linkage of markers with similarity to sequences in the A. thaliana genome [9, 10]. Whereas the seminal study in this area compared the arrangement of the B. napus genome with that of A. thaliana, and ours compared the arrangement of the B. rapa genome with that of A. thaliana, we anticipate that the results should be directly comparable, as there seems to be little difference in the organization of the A genome in these two Brassica species. Remarkably few of our CDs correspond to those expected from this structure: 3 of the 19 representing inter-chromosomal rearrangements and 6 of the 32 representing intra-chromosomal rearrangements. The relatively high "noise" inherent to comparative genomics studies in Brassica species, which is a consequence of the widespread occurrence of apparently transduplicated fragments of genes, means that multiple instances of collinear alignments are required to correctly identify collinear genome segments. This requirement limits the ability to identify relatively small segments using, for example, comparative linkage mapping based on RFLP markers. Although the paleopolyploid ancestry of Brassica species is now widely accepted, the lack of discernible triplication throughout the genome has not been fully explained. The hypothesised segments that have not been identified had been assumed to have been deleted. However, we have found evidence for the existence of numerous additional copies of genome segments, bringing the count of many of these to (or closer to) the predicted three. In one case (At3C/AKM) we identified and mapped onto the B. rapa genome three copies where none had been identified previously in that species, with only one copy having been identified in the Brassica A genome as represented in B. juncea.
Our results show that previous studies of segmental collinearity between A. thaliana, Brassica and AK genomes, although very useful, represent over-simplifications of the true inter-relationships of the genomes. In addition to the occurrence of individual genes in non-collinear regions of the genomes previously noted, the presence of numerous cryptic collinear genome segments and the frequent occurrence of segmental inversions mean that inference of the positions of genes based on the locations of orthologues in A. thaliana can be misleading. Indeed, excessive reliance on collinearity with the genome of A. thaliana may prove problematic for the ongoing efforts to sequence the B. rapa genome. Polyploidy is common in plants, and there is no reason to conclude that the greater complexity of segmental rearrangement and evolution that we have observed is unusual. Therefore, our results will be of relevance to studies in a wide range of polyploid plant genomes, many of which are being considered as having blocks of conserved synteny with respect to the genomes of model species, and studies relating to evolutionary breakpoints and their relation to genome organisation.
The 200,031 publicly available BAC end sequence reads (BES) from the combined KBrH, KBrB and KBrS Brassica rapa ssp. pekinensis cv. Chiifu libraries, which were provided by the Korea Brassica Genome Resource Bank, were used in a WU-BLASTN search against the TAIR v6 Arabidopsis pseudomolecule sequences, using 1E-30 as the E-value cutoff. Supplementary BLAST parameters used were application of the Dust simple sequence filter and setting hspsepsmax = 1000, appropriate for use against very large subject sequences. Coordinates and scores for individual HSPs from the significant hits were then parsed into GFF format and loaded as features into a local copy of the ATIDB genome database, which is built on the GBrowse platform using a MySQL adaptor. An identical exercise was performed with a set of 85,317 B. oleracea BES obtained from line TO 1434 and these data added incrementally.
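The conversion of HSP coordinates and scores into GFF features can be sketched in Python as follows. The paper does not give the actual column values, so the source label `WU-BLASTN`, the feature type `BES_match` and the group class `BES` are hypothetical placeholders. The nine tab-separated columns follow the general GFF layout (sequence, source, feature, start, end, score, strand, frame, group).

```python
def hsp_to_gff(chrom, start, end, score, strand, bes_name,
               source="WU-BLASTN", feature="BES_match"):
    """Format one HSP as a nine-column, tab-separated GFF line.

    `source`, `feature` and the "BES <name>" group string are illustrative
    assumptions, not values reported in the paper.
    """
    lo, hi = sorted((start, end))  # GFF requires start <= end
    columns = [
        chrom,            # reference sequence (pseudomolecule)
        source,           # program that produced the feature
        feature,          # feature type
        str(lo),
        str(hi),
        f"{score:g}",     # alignment score
        strand,           # "+" or "-"
        ".",              # frame is not applicable to nucleotide matches
        f"BES {bes_name}",
    ]
    return "\t".join(columns)
```

Keeping the BES name in the group column is what would later allow mate-pairs to be recognised by comparing feature names.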
A Perl CGI script was developed to interrogate the ATIDB database using the Bio::DB::GFF applications programming interface and methods. The five Arabidopsis reference chromosome sequences (pseudomolecules) were divided into bins of a selectable size (250 kb - 1 Mb) and the B. rapa (and B. oleracea) BES features mapping within each were extracted and loaded into a hashed array structure, keyed by chromosome and bin. Each bin was then systematically compared with every other bin, with the algorithm exploiting mirror symmetry for efficiency. Text string comparisons of feature object names (e.g. KBrH088K13_F and KBrH088K13_R) were used to identify mate-pairs amongst the BES mappings linking any given pair of bins. The raw mate-pair associations between bins identified from this initial process are inherently noisy and so the algorithm goes on to filter them on a combination of theoretical and empirical criteria. These can be summarised as follows:
(1) any mate-pair mappings between neighbouring bins on the same chromosome were discounted if their physical separation in Arabidopsis pseudomolecule space was less than a set threshold of 500 kb, reflecting our estimate of the conserved microsynteny range;
(2) duplicate instances of the mate-pair mappings, indicating either simple duplications of clones within libraries or multiple cloning events of the same DNA fragment during library construction, were eliminated;
(3) DNA strand dispositions of mate-pair BES mappings (e.g. "Chr3:Bin20:plus" vs. "Chr5:Bin10:minus") were analysed to eliminate minority variants, as it was reasoned that independent physical correlates of CDs should reflect a consistent pattern (unlike chimaeric clones generated by in vitro recombination events);
(4) finally, in a development of the algorithm prompted by analysis of false positives, the raw BES mappings in pseudomolecule space were used to locate the nearest annotated gene models, and mate-pair mappings were eliminated if these conflicted at either end with the results of a direct BLASTN query of the BES against annotated Arabidopsis genes, countering what we termed paralogue conflation.
We imposed an arbitrary threshold for the number of independent mate-pair mappings to annotated gene regions required to trigger a significant association. We varied this threshold between 2 and 5 in order to experiment with the signal to noise ratio in the dataset.
The final output of the script was directed through two routes, a graphical dot-plot style representation of the mate-pair associations using an interface to the GridMap Java applet and also a spreadsheet format summary of the details underlying each significant association (clone identifiers, HSP coordinates and gene models, strand dispositions). The implementation is available from additional file 1 and additional file 2.
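The binning and filtering logic described in this section can be approximated by the Python sketch below. It is a simplified illustration, not the published Perl implementation. It covers criteria (1) to (3) only and omits the gene-model check of criterion (4). Criterion (1) is applied to any same-chromosome pair closer than the separation threshold, and duplicates are recognised as mate-pairs with identical mapped coordinates and strands. The input layout and the function name `find_associations` are assumptions.

```python
from collections import Counter, defaultdict


def find_associations(pairs, bin_size=500_000, min_sep=500_000, min_support=3):
    """Group mate-pair mappings by the pair of genome bins they link.

    `pairs` maps clone name -> ((chrom, pos, strand), (chrom, pos, strand)).
    Returns {((chrom, bin), (chrom, bin)): supporting mate-pair count}.
    """
    votes = defaultdict(Counter)
    seen = set()
    for a, b in pairs.values():
        # Sorting the two ends gives an order-independent key, so each
        # bin pair is considered once (the "mirror symmetry" shortcut).
        a, b = sorted((a, b))
        (chrom_a, pos_a, strand_a), (chrom_b, pos_b, strand_b) = a, b
        # (1) ignore pairs within the conserved microsynteny range
        if chrom_a == chrom_b and abs(pos_a - pos_b) < min_sep:
            continue
        # (2) count duplicate mappings only once
        if (a, b) in seen:
            continue
        seen.add((a, b))
        link = ((chrom_a, pos_a // bin_size), (chrom_b, pos_b // bin_size))
        votes[link][(strand_a, strand_b)] += 1
    associations = {}
    for link, strand_counts in votes.items():
        # (3) keep only the majority strand disposition for this bin pair
        support = strand_counts.most_common(1)[0][1]
        if support >= min_support:
            associations[link] = support
    return associations
```

Raising `min_support` from 2 towards 5 trades sensitivity for specificity, which corresponds to the signal-to-noise experiment described in the text.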
Sequence annotation and analysis
Our automated annotation pipeline was used to analyse the sequences at the CDs. All annotated B. rapa BAC sequences were stored in a GBrowse MySQL database. Minimal regions for each CD were manually selected by identifying the sequence flanked by the Arabidopsis gene models listed in Table 2, supplemented (where informative) either by annotated Brassica gene predictions from SNAP post-processed with PASA using raw EST data or by BLAT alignments of Brassica transcript assemblies. The CD sequences were analysed for the presence of characterised telomeric or centromeric repeats and then scanned for various annotated features with a Perl script using Bio::DB::GFF methods. This was repeated for extracted subsets of the entire annotated sequence defined as genic or intergenic by the gene predictions. Solexa leaf transcriptome reads obtained from B. napus were aligned with MAQ onto the B. rapa BAC sequences.
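The per-kilobase feature densities summarised for CD regions and for the annotated sequence as a whole can be computed along the lines of the following Python sketch. How the authors counted features that straddle a region boundary is not stated, so counting any overlapping feature is an assumption, as is the function name `density_per_kb`.

```python
def density_per_kb(features, region_start, region_end):
    """Number of features overlapping a region, per kilobase of region.

    `features` is an iterable of (start, end) pairs; all coordinates are
    1-based and inclusive. Any overlap counts (an assumption).
    """
    length_bp = region_end - region_start + 1
    count = sum(
        1 for start, end in features
        if start <= region_end and end >= region_start
    )
    return count / (length_bp / 1000)
```

The same function applies to gene models, EST-supported models or mapped read positions, provided each is supplied as a list of intervals on the BAC sequence.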
We would like to thank the Beijing Genomics Institute for BAC sequencing. This work was funded by the UK Biotechnology and Biological Sciences Research Council (BB/E017363) and Rural Development Administration (BioGreen 21 Program 20050301034438, NAAS project No. 2007139062200001502 and 200901FHT020710397), and the Technology Development Program for Agriculture and Forestry, Ministry for Food, Agriculture, Forestry and Fisheries (Project No. 607003-05), Korea. CDT, AHP and JCP were supported by the U.S. National Science Foundation (DBI-0638536).
1. Warwick SI, Black LD: Molecular systematics of Brassica and allied genera (Subtribe Brassicinae, Brassiceae) - Chloroplast genome and cytodeme congruence. Theor Appl Genet. 1991, 82: 81-92. 10.1007/BF00231281.
2. Arumuganthan K, Earle ED: Nuclear DNA content of some important plant species. Plant Mol Biol Report. 1991, 9: 208-218. 10.1007/BF02672069.
3. O'Neill CM, Bancroft I: Comparative physical mapping of segments of the genome of Brassica oleracea var alboglabra that are homoeologous to sequenced regions of the chromosomes 4 and 5 of Arabidopsis thaliana. Plant Journal. 2000, 23: 233-243. 10.1046/j.1365-313x.2000.00781.x.
4. Rana D, Boogaart van den T, O'Neill CM, Hynes L, Bent E, Macpherson L, Park JY, Lim YP, Bancroft I: Conservation of the microstructure of genome segments in Brassica napus and its diploid relatives. Plant J. 2004, 40: 725-733. 10.1111/j.1365-313X.2004.02244.x.
5. Park JY, Koo DH, Hong CP, Lee SJ, Jeon JW, Lee SH, Yun PY, Park BS, Kim HR, Bang JW, Plaha P, Bancroft I, Lim YP: Physical mapping and microsynteny of Brassica rapa ssp. pekinensis genome corresponding to a 222 kb gene-rich region of Arabidopsis chromosome 4 and partially duplicated on chromosome 5. Mol Gen Genomics. 2005, 274: 579-588. 10.1007/s00438-005-0041-4.
6. Town CD, Cheung F, Maiti R, Crabtree J, Haas BJ, Wortman JR, Hine EE, Althoff R, Arbogast TS, Tallon LJ, Vigouroux M, Trick M, Bancroft I: Comparative genomics of Brassica oleracea and Arabidopsis thaliana reveals gene loss, fragmentation and dispersal following polyploidy. Plant Cell. 2006, 18: 1348-1359. 10.1105/tpc.106.041665.
7. Yang TJ, Kim JS, Kwon SJ, Lim KB, Choi BS, Kim JA, Jin M, Park JY, Lim MH, Kim HI, Lee MC, Lim YP, Kang JJ, Hong JH, Kim CB, Bhak J, Bancroft I, Park BS: Sequence-level analysis of the diploidization process in the triplicated FLC region of Brassica rapa. Plant Cell. 2006, 18: 1339-1347. 10.1105/tpc.105.040535.
8. Lysak MA, Koch MA, Pecinka A, Schubert I: Chromosome triplication found across the tribe Brassiceae. Genome Res. 2005, 15: 516-525. 10.1101/gr.3531105.
9. Parkin IAP, Gulden SM, Sharpe AG, Lukens L, Trick M, Osborn TC, Lydiate DJ: Segmental Structure of the Brassica napus Genome Based on Comparative Analysis With Arabidopsis thaliana. Genetics. 2005, 171: 765-781. 10.1534/genetics.105.042093.
10. Schranz ME, Lysak MA, Mitchell-Olds T: The ABC's of comparative genomics in the Brassicaceae: building blocks of crucifer genomes. Trends in Plant Sci. 2006, 11: 535-542. 10.1016/j.tplants.2006.09.002.
11. Song K, Lu P, Tang K, Osborn TC: Rapid genome change in synthetic polyploids of Brassica and its implications for polyploid evolution. Proc Natl Acad Sci USA. 1995, 92: 7719-7723. 10.1073/pnas.92.17.7719.
12. Moore G, et al: Grasses, line up and form a circle. Curr Biol. 1995, 5: 737-739. 10.1016/S0960-9822(95)00148-5.
13. Pan X, Liu H, Clarke J, Jones J, Bevan M, Stein L: ATIDB: Arabidopsis thaliana insertion database. Nucl Acids Res. 2003, 31: 1245-1251. 10.1093/nar/gkg222.
14. Kim H, Choi SR, Bae J, Hong CP, Lee SY, Hossain MJ, Van Nguyen D, Jin M, Park B-S, Bang J-W, Bancroft I, Lim Y-P: Sequenced BAC anchored reference genetic map that reconciles the ten individual chromosomes of Brassica rapa. BMC Genomics. 2009, 10: 432. 10.1186/1471-2164-10-432.
15. Panjabi P, Jagannath A, Bisht NC, Padmaja KL, Sharma S, Gupta V, Pradhan AK, Pental D: Comparative mapping of Brassica juncea and Arabidopsis thaliana using Intron Polymorphism (IP) markers: homoeologous relationships, diversification and evolution of the A, B and C Brassica genomes. BMC Genomics. 2008, 9: 113. 10.1186/1471-2164-9-113.
16. Harrison GE, Heslop-Harrison JS: Centromeric repetitive DNA sequences in the genus Brassica. Theor Appl Genet. 1995, 90: 157-165. 10.1007/BF00222197.
17. Lim KB, Yang TJ, Hwang YJ, Kim JS, Park JY, Kwon SJ, Kim JA, Choi BS, Lim MH, Jin M, Kim HI, de Jong H, Bancroft I, Lim Y, Park BS: Characterization of the centromere and peri-centromere retrotransposons in Brassica rapa and their distribution in related Brassica species. Plant Journal. 2007, 49: 173-183. 10.1111/j.1365-313X.2006.02952.x.
18. Richards EJ, Ausubel FM: Isolation of a higher eukaryotic telomere from Arabidopsis thaliana. Cell. 1988, 53: 127-136. 10.1016/0092-8674(88)90494-1.
19. Cheung F, Trick M, Drou N, Lim Y-P, Park J-Y, Kwon S-J, Kim J-A, Scott R, Pires JC, Paterson AH, Town C, Bancroft I: Comparative analysis between homoeologous genome segments of Brassica napus and its progenitor species reveals extensive sequence-level divergence. Plant Cell. 2009, 21: 1912-1928. 10.1105/tpc.108.060376.
20. Lemaitre C, Zaghloul L, Sagot MF, Gautier C, Arneodo A, Tannier E, Audit B: Analysis of fine-scale mammalian evolutionary breakpoints provides new insight into their relation to genome organization. BMC Genomics. 2009, 10: 335. 10.1186/1471-2164-10-335.
21. Gish W: WU-BLAST. 1996, [http://blast.advbiocomp.com]
22. Stein LD, et al: The generic genome browser: a building block for a model organism system database. Genome Res. 2002, 12: 1599-610. 10.1101/gr.403602.
23. Priestly M, Dickson J, Dicks J: Grid Map. 2002, [http://cbr.jic.ac.uk/dicks/software/Grid_Map/]
24. Trick M, Drou N: Brassica genome annotation pipeline. 2006, [http://brassica.bbsrc.ac.uk]
25. Korf I: Gene finding in novel genomes. BMC Bioinformatics. 2004, 5: 59-68. 10.1186/1471-2105-5-59.
26. Haas BJ, Delcher AL, Mount SM, Wortman JR, Smith RK, Hannick LI, Maiti R, Ronning CM, Rusch DB, Town CD, Salzberg SL, White O: Improving the Arabidopsis genome annotation using maximal transcript alignment assemblies. Nucl Acids Res. 2003, 31: 5654-5666. 10.1093/nar/gkg770.
27. Kent WJ: BLAT - The BLAST-Like Alignment Tool. Genome Res. 2002, 4: 656-664.
28. Trick M, Cheung F, Drou N, Fraser F, Lobenhofer EK, Hurban P, Magusin A, Town CD, Bancroft I: A newly-developed community microarray resource for transcriptome profiling in Brassica species enables the confirmation of Brassica-specific expressed sequences. BMC Plant Biology. 2009, 9: 50-60. 10.1186/1471-2229-9-50.
29. Trick M, Long Y, Meng J, Bancroft I: Single nucleotide polymorphism (SNP) discovery in the polyploid Brassica napus using Solexa transcriptome sequencing. Plant Biotech J. 2009, 7: 334-346. 10.1111/j.1467-7652.2008.00396.x.
30. Li H, Durbin R: Mapping short DNA reads and calling variants using mapping quality scores. Genome Res. 2008, 18: 1851-1858. 10.1101/gr.078212.108.