Use of mRNA-seq to discriminate contributions to the transcriptome from the constituent genomes of the polyploid crop species Brassica napus
© Higgins et al.; licensee BioMed Central Ltd. 2012
Received: 20 October 2011
Accepted: 29 May 2012
Published: 15 June 2012
Polyploidy often results in considerable changes in gene expression, both immediately and over evolutionary time. New phenotypes often arise with polyploid formation and may contribute to the fitness of polyploids in nature or their selection for use in agriculture. Oilseed rape (Brassica napus) is widely used to study the process of polyploidy both in artificially resynthesised and natural forms. mRNA-Seq, a recently developed approach to transcriptome profiling using deep-sequencing technologies, is an alternative to microarrays for the study of gene expression in a polyploid.
Illumina mRNA-Seq is comparable to microarray analysis for transcript quantification but has increased sensitivity and, very importantly, the potential to distinguish between homoeologous genes in polyploids. Using a novel curing process, we adapted a reference sequence that was a consensus derived from ESTs from both Brassica A and C genomes to one containing separate A and C genome versions for each of the 94,558 original unigenes. We aligned reads from B. napus to this cured reference, finding 38% more reads mapping from resynthesised lines and 28% more reads mapping from natural lines. Where the A and C versions differed at single nucleotide positions, termed inter-homoeologue polymorphisms (IHPs), we were able to apportion expression in the polyploid between the A and C genome homoeologues. 43,761 unigenes contained at least one IHP, with a mean frequency of 10.5 per kb unigene sequence. 6,350 of the unigenes with IHPs were differentially expressed between homoeologous gene pairs in resynthesised B. napus. 3,212 unigenes showed a similar pattern of differential expression across a range of natural B. napus crop varieties and, of these, 995 were in common with resynthesised B. napus. Functional classification showed over-representation in gene ontology categories not associated with dosage-sensitivity.
mRNA-Seq is the method of choice for measuring transcript abundance in polyploids due to its ability to measure the contributions of homoeologues to gene expression. The identification of large numbers of differentially expressed genes in both a newly resynthesised polyploid and natural B. napus confirms that there are both immediate and long-term alterations in the expression of homoeologous gene pairs following polyploidy.
Polyploidy or whole genome duplication (WGD) has occurred multiple times throughout the evolutionary history of plants. It has long been recognized as a major force in angiosperm evolution, plant speciation and diversification [1–3]. Polyploidization is both an ancient and an ongoing evolutionary process [4, 5] and has played a role in the adaptation of a wide range of crops to different environments by generating phenotypic variation. Polyploids are generally divided into two categories: autopolyploids, from duplication of the same genome, and allopolyploids, from hybridization of two diverged genomes with subsequent genome duplication. These distinctions are less clear in paleopolyploids. Soybean and maize are considered to be paleopolyploids, having been formed 10–15 Mya. Both show evidence of diploidization, an ongoing process by which a newly formed polyploid becomes stabilized, involving the loss of duplicated genes, thereby returning the genome to a diploid-like form. Both potato and alfalfa are derived through autopolyploidy, while wheat, oat, cotton, coffee and oilseed rape have allopolyploidy in their evolutionary history. Tragopogon is a young allopolyploid species that has formed multiple times over the last 80 years and so offers the opportunity to study a natural allopolyploid which is sympatric with its parental species. The success of newly formed angiosperm polyploids is partly attributable to their highly plastic genome structure. Recent studies have documented rapid and dynamic changes in genomic structure and gene expression in plant polyploids. Much of the functional plasticity in polyploids is correlated with gene expression changes at transcriptional and post-transcriptional levels. Such gene expression changes are controlled largely by epigenetic mechanisms [1, 2, 11].
The Brassica species include an important group of vegetable and oil crops and their genomes have complex evolutionary histories. A major focus for research has been Brassica napus (oilseed rape). This is an allopolyploid species formed by the hybridization of progenitor species Brassica rapa (which contributed the A genome) and Brassica oleracea (which contributed the C genome). The Brassica species in general, and B. napus in particular, provide an excellent system in which to study the impacts of polyploidy and the processes by which genomes subsequently stabilize. B. rapa and B. oleracea are closely related, having diverged around 3.5 Mya. The B. napus types cultivated as crops arose from natural polyploid formation, probably during human cultivation, i.e. less than 10,000 years ago. Genetic mapping studies confirmed that the progenitor A and C genomes are essentially intact in natural lines of B. napus and have not been substantially rearranged. It is also possible to make newly constructed (“resynthesised”) polyploids in the laboratory by crossing B. rapa and B. oleracea accessions and doubling chromosomes (typically by chemical treatment). Song et al. used resynthesised polyploids to study genome evolution in the early generations after polyploidization and demonstrated that polyploid species can generate extensive genetic diversity in a short period of time. Pires et al. were interested in the ability of polyploids to possess novel traits that are not present in their diploid progenitors, which has allowed polyploids to successfully enter new ecological niches. Focussing on flowering time, they showed evidence of chromosomal rearrangements and changes in gene expression, which partially explained the phenotypic variation in B. napus. The mechanisms for chromosome stability and diploidization in polyploids remain largely unknown, but a study of 50 resynthesised lines of B. napus showed that in the first generation (S0) of resynthesised B. napus, genetic changes are rare but cytosine methylation changes are frequent, whereas in later generations (S5) genetic changes are much more frequent, but the S0 methylation remained fixed. The genetic changes observed in resynthesised B. napus are not random and there is evidence that many are the consequence of homoeologous recombination. Recent cytological investigations including a S10:11 generation showed that changes in copy number of individual chromosomes increased with successive generations; they showed gross chromosomal rearrangements and that dosage balance mechanisms enforced chromosome number stability.
There is much interest in how these genetic and epigenetic changes contribute to changes in gene expression. Transcriptional changes are likely to be a critical component of polyploid evolution as they can contribute directly to novel phenotypes. Most studies have compared gene expression in resynthesised polyploid lines to expression in their parents to provide evidence of additive or non-additive gene expression. According to the “additivity hypothesis”, newly-synthesized allopolyploids are supposed to display mid-parental expression patterns. Many exceptions are found in resynthesised allopolyploids, e.g. Arabidopsis, Senecio, Brassica, Triticum, and Gossypium, suggesting that the differential regulation of gene expression is a common feature of plant allopolyploids. Although the phenomenon of non-additive expression in inter-specific hybrids and allopolyploids is now well described, the underlying mechanisms are still poorly understood. Recent studies have used statistical methods to predict the contribution of each parent to gene expression in the polyploid using genome-wide microarrays that are not able to distinguish between expression of homoeologous pairs [17, 22]. The “additivity” hypothesis was confirmed using comparative proteomics on newly resynthesised B. napus. Identification using mass spectrometry and functional categorisation of the differentially regulated proteins did not show that any functional category, metabolic pathway or subcellular localization was over- or under-represented within non-additive polypeptides. Comparing transcript levels in resynthesised B. napus to protein levels showed that differential protein regulation is not explained by transcriptional changes. This is a complex process, so another approach has been to measure transcript levels of homoeologous pairs of genes, but not transcriptome-wide. For example, Dong et al. showed a complex pattern of differential expression in response to abiotic stress in both natural and resynthesised allopolyploid Gossypium hirsutum using SSCP-cDNA gels to distinguish homoeologous pairs of 60 genes. Also, Chaudhary et al. used a mass-spectrometry-based SNP detection technique to measure allele- and homoeologue-specific contributions to the transcriptome of diploid and allopolyploid cotton and showed that 40% of homoeologues were transcriptionally biased in at least one stage of cotton development. Development of a method to measure genome-wide differential expression of homoeologous pairs using transcriptome sequencing in both synthetic and natural polyploids would contribute to our understanding of this complex process.
Next generation sequencing technologies (NGS) have opened exciting opportunities to study genomes and transcriptomes of plant species with and without sequenced genomes. Many crop genome projects are ongoing, including oilseed rape, bread wheat and banana, but many of these polyploid plants have complex genome structures, meaning that producing a draft sequence is challenging [28, 29]. Meanwhile, plant transcriptomics using NGS can yield much information on crops, including gene discovery, transcript quantification, post-transcriptional regulation and linking genotypes to phenotypes. mRNA-Seq is a recently developed approach to transcriptome profiling that uses deep-sequencing technologies. Previous experiments, using mRNA-Seq for SNP detection in B. napus [33, 34], have proved Illumina sequencing to be an efficient method. mRNA-Seq can also be used as a method to estimate transcript abundance. The first step is to map the reads to the genome or transcriptome, and then the number of reads aligning to a specific region of the reference sequence is counted and subjected to relevant normalisation procedures. It is anticipated that mRNA-Seq will revolutionize the manner in which eukaryotic transcriptomes are analysed, as sequencing-based approaches have clear advantages over hybridization-based approaches for quantifying the transcriptome. A range of studies comparing microarray and mRNA-Seq have consistently shown that sequencing has higher sensitivity and dynamic range [36, 37], although reproducibility has been shown to depend on the type of sample studied [36, 38, 39] and a recent study has shown that technical variability is too high to be ignored.
We previously developed a set of 94,558 Brassica unigenes, by assembly of all available Brassica ESTs, and used this set for the design of a Brassica microarray. We used this for the analysis of gene expression in resynthesised B. napus lines (“B. napus 1” and “B. napus 2”). These resynthesised B. napus lines shared the same parental combinations of B. rapa (R-o-18) and B. oleracea (A12DH), but from reciprocal crosses. The Brassica unigenes were assembled using parameters that enabled the separate assembly of transcripts of paralogous genes within each diploid Brassica genome (as these differ by ~15% at the nucleotide level) but the co-assembly of transcripts of homoeologous genes (which differ by only ~3%). The surface-bound oligonucleotide probes of the microarrays were designed to regions that differed most between unigenes (typically 3’ untranslated regions), so they discriminate well between unigenes representing paralogous genes, but they have no capability to discriminate between homoeologous genes.
In the present study, we report the development of methodology for using mRNA-Seq to quantify transcript abundance in polyploids, with estimation of the relative contributions of homoeologous genes. As a proof of concept, we analysed mRNA from reserved aliquots of the same ground leaf samples taken from the resynthesised B. napus plants used in our 2009 microarray-based study. We show, inter alia, that mRNA-Seq can be used successfully for both qualitative and quantitative analyses of gene expression and for the apportioning of transcript abundance to A and C genomes in both resynthesised and natural B. napus.
Results and discussion
Comparison of microarray and mRNA-seq for transcript quantification
Table 1 Summary of read alignments
Columns (naive reference): No. sequence reads; No. sequence reads mapped to reference; Aligned reference sequence; Average depth over aligned reference; No. unigenes with mapped sequence reads; Total sequence mapped to naive reference/Gb
Columns (cured reference): No. sequence reads mapped to A genome; No. A genome unigenes with mapped sequence reads; No. sequence reads mapped to C genome; No. C genome unigenes with mapped sequence reads; Aligned reference sequence; Average depth over aligned reference; Total sequence mapped to cured reference/Gb; Increase sequence mapped to reference
Rows: B. napus 1 rep1 to rep4 (Resynthesized, B. rapa cytoplasm); B. napus 2 rep1 to rep4 (Resynthesized, B. oleracea cytoplasm); Aphid Resistant Rape
Modifying the reference sequence to improve read alignment
The B. rapa-cured and B. oleracea-cured unigene sequences were simply combined to produce the cured reference sequence, which thus comprised the A and C genome variants for each unigene (189,116 unigene sequences in all). Sequences (as intact 80 base reads) from the 4 biological replicates of B. napus 1 and B. napus 2 were re-mapped to the cured reference. The results are shown in Table 1. Starting with an average of 30.5 million raw reads for B. napus 1, an average of 8.7 million reads aligned to the A genome versions of the unigenes (57,078 unigenes) and an average of 8.3 million reads aligned to the C genome versions of the unigenes (57,191 unigenes). In total, an average of 17.1 million reads aligned to the cured reference, with a mean depth of coverage of the aligned reference of 27.7-fold. This is an increase of an average of 4.7 million reads (38.6%) compared with the naive reference. Similar results were obtained for B. napus 2, which showed a 37.8% increase in the number of mapped reads. These results confirmed that the curing process was highly successful in enabling the mapping of a greater proportion of sequence reads from the resynthesised B. napus. The B. rapa-cured and B. oleracea-cured unigene sequences will be of value for exploiting mRNA-Seq for transcript quantification in these species. The large number of IHPs differentiating the unigenes in the cured reference (385,789) also provides the opportunity to estimate the relative contributions to the transcriptome of homoeologues in the A and C genomes.
Estimation of the relative contributions of homoeologues to the transcriptome
The mapping to the cured reference of sequences from the four biological replicates of B. napus 1 and B. napus 2 results in the assignment of each sequence read to the unigene with the fewest mismatches. Where MAQ finds equally probable matches, the read is assigned randomly to one of them. Consequently, the cured reference provides the opportunity to estimate the relative contributions to the transcriptome of homoeologues based on this apportioning of reads, but with an anticipated underestimation of differences between the contributions of homoeologues in the A and C genomes where there are tracts of identical sequence in the corresponding unigenes in the cured reference.
To test whether our method for the apportioning of transcript abundance to genomes is also applicable to natural B. napus, we generated leaf mRNA-Seq data from five cultivars representing differing crop types. The sequence datasets consisted of between 23 and 29 million raw reads; the mean Phred-like quality value at base 80 was Q26.8 and the mean position at which the quality value declined below Q30 was 60.8. The sequence reads were mapped to both the naive reference and the cured reference. As was observed with mRNA-Seq reads from the resynthesised B. napus, more sequences were mapped to the cured reference than to the naive reference, as summarised in Table 1. However, the increase for natural B. napus, 27.8%, was lower than that observed for resynthesised B. napus (~38%). The difference is likely to be due to the curing process having used sequences from the precise progenitors of the resynthesised B. napus, resulting in optimum read mapping, whereas the actual progenitors of the natural B. napus types are unknown. We observed consistent (but not significant) biases in the proportions of reads mapping to the A and C genome versions of unigenes for both resynthesised B. napus (the majority, 51.2%, mapping to the A genome) and natural B. napus (the majority, 51.5%, mapping to the C genome). This may be a consequence of the transcriptome sequence of B. oleracea A12DH being more similar to that of the C genome ancestor of natural B. napus than the transcriptome sequence of B. rapa R-o-18 is to that of the A genome ancestor of natural B. napus, which has been observed previously for some regions of the genome.
Analysis and functional classification of homoeologous unigene pairs showing differential transcript abundance
Our investigations have shown that mRNA-Seq represents an excellent approach to the analysis of both qualitative and quantitative transcript abundance in B. napus. The increased sensitivity and dynamic range compared with a corresponding analysis using microarrays, along with the good (but far from perfect) correlation of transcript abundance quantification is consistent with the results of previous studies conducted using organisms with simpler genomes. As the cost to sequence a sample is now comparable to the cost to conduct a microarray analysis and software tools are available to manage and mine mRNA-Seq data, we conclude that mRNA-Seq will frequently be the method of choice.
Our studies have shown that mRNA-Seq data can be used to overcome a key problem in polyploid species: the presence of homoeologous genes with very similar sequence being transcribed from each genome. By “curing” the reference unigene sequences so they more closely represent the sequences of the two genomes contained in B. napus, we have been able to allocate, on a genome-wide basis, transcript abundance to the individual genomes. Although limited to regions of unigenes in which inter-homoeologue polymorphisms occur, such regions are in the majority and the approach identified many homoeologous transcripts with differential expression from the two genomes. This approach should be generally applicable to polyploid species, so long as diploid representatives of the constituent genomes are available to support the curing process. Moreover, the identification of large numbers of genes with such transcription biases, in replicated experiments, demonstrates that stable homoeo-allelic variation for transcript abundance is common in B. napus.
This method also enabled us to successfully study differential expression of homoeologous genes across a range of natural B. napus crop varieties. For many genes, the ancestral profiles have been partitioned between the homoeologues in a similar pattern, despite many generations of independent evolution. This is consistent with observations in other polyploids, such as cotton (Gossypium) [26, 27] and Spartina, that there are both immediate and long-term alterations in the expression of homoeologous gene pairs following polyploid formation. Functional classification of genes differentially expressed between homoeologous gene pairs showed enrichment for classes not associated with gene dosage effects, consistent with the notion that such gene classes can show allelic variation for transcript abundance that might, itself, represent a class of molecular marker.
Growth of plants and preparation of RNA
The B. napus 1 and B. napus 2 RNA samples were extracted from stored aliquots of the same ground leaf tissue used for the microarray experiment as described by Trick et al. Five varieties of natural B. napus were selected to represent the main crop types: Tapidor, Ningyou 7, Altasweet, Ceska and Aphid Resistant Rape. The plants were grown and the RNA extracted using the same experimental design, growth and harvesting conditions as described for the B. napus 1 and B. napus 2 plants by Trick et al.
The sequencing libraries were prepared using the Illumina mRNA-Seq kit (RS-100-0801, Illumina Inc.) as described by Bancroft et al. Each library was run on a single lane for 80 cycles on the Illumina Genome Analyzer GAIIx. Illumina base calling files were processed using GERALD to produce a sequence file containing 80 base reads for each sample in FASTQ format. The Illumina FASTQ format was converted to Sanger FASTQ format before further processing.
Mapping of reads to the reference sequence and generation of read counts using MAQ
MAQ version 0.7.1 was used to align the 80 base Illumina reads to the 95 k naive reference sequence following the protocols described in the online documentation (http://maq.sourceforge.net) and adopting the default parameter values. MAQ pileup text files were generated from the MAQ binary map files. The Perl script tagcounter.pl (Additional file 1) was used to count the number of reads aligning to each unigene by accessing the pileup files, outputting a count and a calculated RPKM value (reads per kb per million aligned reads) for each unigene.
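The RPKM normalisation described above can be illustrated with a minimal Python sketch. This is not the tagcounter.pl script; the function name `rpkm` and the example numbers are hypothetical and serve only to show the arithmetic.

```python
def rpkm(read_count, unigene_length_bp, total_aligned_reads):
    """Reads per kilobase of unigene per million aligned reads."""
    if unigene_length_bp <= 0 or total_aligned_reads <= 0:
        raise ValueError("length and total aligned reads must be positive")
    kilobases = unigene_length_bp / 1_000
    millions_aligned = total_aligned_reads / 1_000_000
    return read_count / kilobases / millions_aligned


# Hypothetical example: 500 reads on a 2,000 base unigene in a sample
# with 10 million aligned reads gives 500 / 2 / 10 = 25 RPKM.
example_value = rpkm(500, 2_000, 10_000_000)
```

Dividing by length and by sequencing depth makes values comparable between unigenes of different sizes and between lanes with different numbers of aligned reads.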
Comparison of unigene expression measurements obtained using the Agilent microarray with Illumina mRNA-Seq count data
Data for the four biological replicates of each of B. napus 1 and B. napus 2 were obtained from the Agilent 60-mer oligonucleotide microarray experiment already published by Trick et al. and available from the GEO repository, accession number GSE15915. Unigenes were called expressed, marginally expressed or non-expressed using the Agilent microarray. A similar classification of unigenes was carried out using the Illumina count data. Expressed unigenes were defined as having one or more reads aligning to the unigene in all of the 4 replicates; non-expressed unigenes were defined as having zero counts in each of the 4 replicates; the remaining unigenes were not classified. The statistical programming language R was used to compare the lists obtained from both methods using Set Analysis. Above-background quantitative signal values were obtained for the Agilent microarray. These were correlated with the RPKM values (normalised transcript abundance) obtained using Illumina mRNA-Seq for the unigenes which were called expressed using both methods.
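The count-based classification and the comparison of call lists can be sketched as follows. The original comparison was carried out in R; this Python sketch, its function names and the toy unigene identifiers are illustrative assumptions only.

```python
def classify_by_counts(replicate_counts):
    """Classify one unigene from its read counts in the biological replicates."""
    if all(count >= 1 for count in replicate_counts):
        return "expressed"          # at least one read in every replicate
    if all(count == 0 for count in replicate_counts):
        return "non-expressed"      # zero reads in every replicate
    return "unclassified"           # mixed evidence across replicates


def compare_calls(seq_calls, array_calls, label="expressed"):
    """Set comparison of unigenes given the same call by the two platforms."""
    seq = {u for u, call in seq_calls.items() if call == label}
    array = {u for u, call in array_calls.items() if call == label}
    return {"both": seq & array, "seq_only": seq - array, "array_only": array - seq}


# Hypothetical counts for three unigenes across four replicates.
counts = {"uni1": [5, 3, 8, 1], "uni2": [0, 0, 0, 0], "uni3": [2, 0, 1, 0]}
seq_calls = {u: classify_by_counts(c) for u, c in counts.items()}
# Hypothetical microarray calls for the same unigenes.
array_calls = {"uni1": "expressed", "uni2": "expressed", "uni3": "non-expressed"}
overlap = compare_calls(seq_calls, array_calls)
```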
Preparation of the cured reference sequence
Libraries prepared from B. rapa and B. oleracea RNA samples were each run on two lanes of the Illumina Genome Analyzer GAIIx for 80 cycles. The FASTQ files from the two lanes were combined, generating a total of 46,120,559 reads for B. rapa and 49,268,765 reads for B. oleracea. The 80 base reads were split into two files, each containing a set of 40 base reads, using the Perl script illumina_split_read.pl (Additional file 2). The 40 base reads were used separately to cure the naive reference sequence to an A genome version and a C genome version, as follows. Using the Perl script cure_cycle_split.pl (Additional file 3), the 40 base reads were aligned against the naive reference sequence to produce a map file. The map files generated by alignment of the first and second sets of 40 base reads were merged using MAQ mapmerge and a consensus sequence was generated. The Perl script cure_refseqs.pl (Additional file 4) was used to cure the naive reference using the consensus sequence. This process was iterated over six cycles, after which there was no significant gain in alignment efficiency. On each iteration, bases in the reference were replaced where they differed from high quality consensus bases called by MAQ (i.e. contributed by a read depth greater than 3, with quality values greater than 40). This process resulted in the production of an A genome version and a C genome version of the naive reference. These two sequences were compared at each base position using the Perl script compare_sequences.pl (Additional file 5) to give a list of positions within unigenes where the base differed in the two sequences. The cured reference sequence was constructed by combining the two ‘cured’ reference sequences, thus creating a reference sequence containing both the A and C variants of each unigene.
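One curing pass and the subsequent A versus C comparison can be sketched in Python as follows. This is a simplified illustration, not the cure_refseqs.pl or compare_sequences.pl scripts: the function names, the per-base `depth` and `quality` lists and the toy sequences are assumptions, and the sketch shows a single pass rather than the six iterations (with re-alignment between them) used in the study.

```python
def cure_sequence(reference, consensus, depth, quality,
                  min_depth=3, min_quality=40):
    """Replace reference bases with confident, differing consensus bases.

    A base is replaced only where the consensus call differs from the
    reference, the read depth is greater than min_depth and the consensus
    quality is greater than min_quality.
    """
    cured = list(reference)
    for i, (ref_base, cons_base) in enumerate(zip(reference, consensus)):
        if cons_base != ref_base and depth[i] > min_depth and quality[i] > min_quality:
            cured[i] = cons_base
    return "".join(cured)


def list_ihps(a_version, c_version):
    """Return 1-based positions at which the A and C versions differ."""
    return [i + 1 for i, (a, c) in enumerate(zip(a_version, c_version)) if a != c]


# Hypothetical 8 base unigene and consensus calls from each diploid.
naive = "ACGTACGT"
a_cured = cure_sequence(naive, "ACGTTCGT",
                        depth=[6, 6, 6, 6, 5, 6, 6, 6],
                        quality=[60] * 8)
# The C consensus differs at base 3, but depth 2 is too low to accept it.
c_cured = cure_sequence(naive, "ACCTACGT",
                        depth=[6, 6, 2, 6, 6, 6, 6, 6],
                        quality=[60] * 8)
ihp_positions = list_ihps(a_cured, c_cured)
```

In this toy case only position 5 is reported as an inter-homoeologue polymorphism, because the low-depth difference at position 3 is never written into the C version.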
Mapping of reads to the cured reference and the apportioning of reads to the A and C genome
B. napus 1 and B. napus 2 samples were re-aligned against the cured reference using MAQ. When a read maps equally well to multiple positions, MAQ will randomly pick one position, thereby distributing reads evenly between the A and C genome versions of the unigene where the sequence is identical. Using the pileup files, the Perl script ACtagcounter.pl (Additional file 6) was used to generate a count of the number of reads and corresponding RPKM value, separately for the A and C genome version of each unigene.
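The effect of random assignment on reads from identical tracts can be illustrated with a deliberately simplified Python sketch. It is not how MAQ works internally (MAQ scores alignments using base qualities); here a read is assumed to have a known start position that lies wholly within the unigene, candidates are compared by mismatch count alone, and all names and sequences are hypothetical.

```python
import random


def mismatches(read, reference, start):
    """Count mismatching bases between a read and the reference window."""
    window = reference[start:start + len(read)]
    return sum(1 for r, w in zip(read, window) if r != w)


def apportion(read, start, a_version, c_version, rng):
    """Assign a read to the A or C version; ties are broken at random."""
    a_mm = mismatches(read, a_version, start)
    c_mm = mismatches(read, c_version, start)
    if a_mm < c_mm:
        return "A"
    if c_mm < a_mm:
        return "C"
    return rng.choice(["A", "C"])


a_version = "ACGTTCGTAA"   # hypothetical A genome version
c_version = "ACGTACGTAA"   # hypothetical C genome version, IHP at base 5
rng = random.Random(0)

# A read spanning the IHP is assigned unambiguously.
informative = apportion("GTTCG", 2, a_version, c_version, rng)
# Reads from the identical tract are split between the genomes at random.
ties = [apportion("GTAA", 6, a_version, c_version, rng) for _ in range(200)]
```

Because uninformative reads are shared out roughly evenly, counts for the two versions are pulled towards each other, which is the source of the anticipated underestimation of differences between homoeologues.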
RPKM values for the number of reads aligning to the A and C genome versions of each unigene were analysed in R. Following the principle of Occam’s razor, the simplest distribution, Poisson, was fitted to the paired RPKM values across the four replicates of each cross using a Generalised Linear Model (GLM), with the paired structure of the data captured algebraically in the design matrix Chi. However, evident over-dispersion in the data necessitated the use of a quasi-Poisson model. That is, the variance and mean regression functions can be obtained from a Poisson GLM, but the dispersion parameter, Phi, is not fixed at 1 but left unrestricted and subsequently estimated from the data. Hence, we obtain the same estimates of coefficients as for a standard Poisson, but inference becomes effectively over-dispersion-adjusted. The resulting p-values were subjected to Benjamini-Hochberg adjustment for multiple testing, and we regarded adjusted p-values < 0.05 as significant. Using the same statistical analysis, we compared the five B. napus cultivars with the aim of identifying unigenes which were consistently differentially expressed across all five varieties.
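Two pieces of this analysis can be sketched in plain Python: the Pearson-based estimate of the dispersion parameter that a quasi-Poisson fit uses in place of 1, and the Benjamini-Hochberg adjustment. The original analysis was run in R; the function names and the example values below are hypothetical, and the fitted means are assumed to come from a Poisson GLM fitted elsewhere.

```python
def pearson_dispersion(observed, fitted, n_params):
    """Estimate Phi as the Pearson chi-square over the residual degrees of freedom."""
    residual_df = len(observed) - n_params
    if residual_df <= 0:
        raise ValueError("need more observations than fitted parameters")
    chi_square = sum((y - mu) ** 2 / mu for y, mu in zip(observed, fitted))
    return chi_square / residual_df


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Hypothetical raw p-values for four unigenes.
raw = [0.01, 0.04, 0.03, 0.20]
adjusted = benjamini_hochberg(raw)
significant = [p < 0.05 for p in adjusted]  # only the first unigene passes
```

A dispersion estimate well above 1 indicates over-dispersion; in a quasi-Poisson fit the standard errors are scaled by the square root of this estimate, which is why coefficients are unchanged but p-values become more conservative.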
Identifying biological functions of differentially expressed genes
BLASTN hits in Arabidopsis (p ≤ 1.0E-30) were found for 62,383 of the 94,558 Brassica unigenes (90,864 unique sequences). The top Arabidopsis hit corresponding to each Brassica unigene was used for functional analysis. Functional classification was carried out using the agriGO web-based GO analysis toolkit. Arabidopsis genes (corresponding to each unigene) were input into the Singular Enrichment Analysis (SEA) using the 43,761 unigenes with IHPs as a customised reference background (15,173 corresponding Arabidopsis genes). Fisher’s Exact Test with Bonferroni-adjusted p-values was employed in the SEA analysis using the “complete GO” ontology.
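The form of test used for over-representation can be sketched in Python. This is an assumption-laden illustration and not agriGO's implementation: it computes a one-sided Fisher's exact p-value as a hypergeometric upper tail, and the function names and toy numbers are hypothetical.

```python
from math import comb


def fisher_enrichment_p(k, n, K, N):
    """One-sided Fisher's exact p-value for over-representation.

    k: query genes in the GO category     n: query genes in total
    K: background genes in the category   N: background genes in total
    Returns P(X >= k) under the hypergeometric distribution.
    """
    total = comb(N, n)
    upper = min(n, K)
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / total


def bonferroni(p_values):
    """Multiply each p-value by the number of tests, capped at 1."""
    m = len(p_values)
    return [min(1.0, p * m) for p in p_values]


# Toy example: 2 of 3 query genes fall in a category that holds
# 4 of the 10 background genes, so P(X >= 2) = 40 / 120 = 1/3.
p_small = fisher_enrichment_p(2, 3, 4, 10)
corrected = bonferroni([0.01, 0.5])
```

Using the unigenes with IHPs as the background, rather than all genes, matters here because only those unigenes could have been detected as differentially expressed between homoeologues.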
Short read sequence data have been deposited at the Sequence Read Archive (SRA) under accession number ERA063602, except for Tapidor and Ningyou 7, which were deposited previously under accession number ERA036824.
We would like to thank The Genome Analysis Centre for generating Illumina sequence data.
This work was supported by UK Biotechnology and Biological Sciences Research Council (ERAPG08.008).
- Doyle JJ, Flagel LE, Paterson AH, Rapp RA, Soltis DE, Soltis PS, Wendel JF: Evolutionary genetics of genome merger and doubling in plants. Annual Review of Genetics. 2008, 42: 443-461.
- Leitch AR, Leitch IJ: Genomic plasticity and the diversity of polyploid plants. Science. 2008, 320: 481-483. 10.1126/science.1153585.
- Soltis DE, Albert VA, Leebens-Mack J, Bell CD, Paterson AH, Zheng C, Sankoff D, dePamphilis CW, Wall PK, Soltis PS: Polyploidy and angiosperm diversification. Am J Bot. 2009, 96: 336-348. 10.3732/ajb.0800079.
- Soltis PS, Soltis DE: The role of hybridization in plant speciation. Annual Review of Plant Biology. 2009, 60: 561-588. 10.1146/annurev.arplant.043008.092039.
- Wendel JF: Genome evolution in polyploids. Plant Mol Biol. 2000, 42: 225-249. 10.1023/A:1006392424384.
- Schlueter JA, Lin JY, Schlueter SD, Vasylenko-Sanders IF, Deshpande S, Yi J, O'Bleness M, Roe BA, Nelson RT, Scheffler BE: Gene duplication and paleopolyploidy in soybean and the implications for whole genome sequencing. BMC Genomics. 2007, 8.
- Gaut BS, Doebley JF: DNA sequence evidence for the segmental allotetraploid origin of maize. Proc Natl Acad Sci U S A. 1997, 94: 6809-6814. 10.1073/pnas.94.13.6809.
- Thomas BC, Pedersen B, Freeling M: Following tetraploidy in an Arabidopsis ancestor, genes were removed preferentially from one homeolog leaving clusters enriched in dose-sensitive genes. Genome Res. 2006, 16: 934-946. 10.1101/gr.4708406.
- Soltis DE, Soltis PS, Pires JC, Kovarik A, Tate JA, Mavrodiev E: Recent and recurrent polyploidy in Tragopogon (Asteraceae): cytogenetic, genomic and genetic comparisons. Biol J Linn Soc. 2004, 82: 485-501. 10.1111/j.1095-8312.2004.00335.x.
- Buggs RJA, Zhang L, Miles N, Tate JA, Gao L, Wei W, Schnable PS, Barbazuk WB, Soltis PS, Soltis DE: Transcriptomic shock generates evolutionary novelty in a newly formed, natural allopolyploid plant. Curr Biol. 2011, 21: 551-556. 10.1016/j.cub.2011.02.016.
- Jackson S, Chen ZJ: Genomic and expression plasticity of polyploidy. Current Opinion in Plant Biology. 2010, 13: 153-159. 10.1016/j.pbi.2009.11.004.
- Inaba R, Nishio T: Phylogenetic analysis of Brassiceae based on the nucleotide sequences of the S-locus related gene, SLR1. Theor Appl Genet. 2002, 105: 1159-1165. 10.1007/s00122-002-0968-3.
- Parkin IAP, Sharpe AG, Keith DJ, Lydiate DJ: Identification of the A and C genomes of amphidiploid Brassica napus (oilseed rape). Genome. 1995, 38: 1122-1131. 10.1139/g95-149.
- Song KM, Lu P, Tang KL, Osborn TC: Rapid genome change in synthetic polyploids of Brassica and its implications for polyploid evolution. Proc Natl Acad Sci U S A. 1995, 92: 7719-7723. 10.1073/pnas.92.17.7719.
- Pires JC, Zhao JW, Schranz ME, Leon EJ, Quijada PA, Lukens LN, Osborn TC: Flowering time divergence and genomic rearrangements in resynthesized Brassica polyploids (Brassicaceae). Biol J Linn Soc. 2004, 82: 675-688. 10.1111/j.1095-8312.2004.00350.x.
- Gaeta RT, Pires JC, Iniguez-Luy F, Leon E, Osborn TC: Genomic changes in resynthesized Brassica napus and their effect on gene expression and phenotype. Plant Cell. 2007, 19: 3403-3417. 10.1105/tpc.107.054346.
- Gaeta RT, Yoo S-Y, Pires JC, Doerge RW, Chen ZJ, Osborn TC: Analysis of gene expression in resynthesized Brassica napus allopolyploids using Arabidopsis 70mer oligo microarrays. PLoS One. 2009, 4.
- Xiong Z, Gaeta RT, Pires JC: Homoeologous shuffling and chromosome compensation maintain genome balance in resynthesized allopolyploid Brassica napus. Proc Natl Acad Sci U S A. 2011, 108: 7908-7913. 10.1073/pnas.1014138108.PubMed CentralView ArticlePubMedGoogle Scholar
- Wang JL, Tian L, Lee HS, Wei NE, Jiang HM, Watson B, Madlung A, Osborn TC, Doerge RW, Comai L, Chen ZJ: Genomewide nonadditive gene regulation in Arabidopsis allotetraploids. Genetics. 2006, 172: 507-517.PubMed CentralView ArticlePubMedGoogle Scholar
- Hegarty MJ, Jones JM, Wilson ID, Barker GL, Coghill JA, Sanchez-Baracaldo P, Liu GQ, Buggs RJA, Abbott RJ, Edwards KJ, Hiscock SJ: Development of anonymous cDNA microarrays to study changes to the Senecio floral transcriptome during hybrid speciation. Mol Ecol. 2005, 14: 2493-2510. 10.1111/j.1365-294x.2005.02608.x.View ArticlePubMedGoogle Scholar
- He P, Friebe BR, Gill BS, Zhou JM: Allopolyploidy alters gene expression in the highly stable hexaploid wheat. Plant Mol Biol. 2003, 52: 401-414. 10.1023/A:1023965400532.View ArticlePubMedGoogle Scholar
- Rapp RA, Udall JA, Wendel JF: Genomic expression dominance in allopolyploids. BMC Biol. 2009, 7.
- Albertin W, Balliau T, Brabant P, Chevre A-M, Eber F, Malosse C, Thiellement H: Numerous and rapid nonstochastic modifications of gene products in newly synthesized Brassica napus allotetraploids. Genetics. 2006, 173: 1101-1113. 10.1534/genetics.106.057554.PubMed CentralView ArticlePubMedGoogle Scholar
- Albertin W, Alix K, Balliau T, Brabant P, Davanture M, Malosse C, Valot B, Thiellement H: Differential regulation of gene products in newly synthesized Brassica napus allotetraploids is not related to protein function nor subcellular localization. BMC Genomics. 2007, 8.
- Marmagne A, Brabant P, Thiellement H, Alix K: Analysis of gene expression in resynthesized Brassica napus allotetraploids: transcriptional changes do not explain differential protein regulation. New Phytol. 2010, 186: 216-227. 10.1111/j.1469-8137.2009.03139.x.View ArticlePubMedGoogle Scholar
- Dong S, Adams KL: Differential contributions to the transcriptome of duplicated genes in response to abiotic stresses in natural and synthetic polyploids. New Phytol. 2011, 190: 1045-1057. 10.1111/j.1469-8137.2011.03650.x.View ArticlePubMedGoogle Scholar
- Chaudhary B, Flagel L, Stupar RM, Udall JA, Verma N, Springer NM, Wendel JF: Reciprocal silencing, transcriptional bias and functional divergence of homeologs in polyploid cotton (Gossypium). Genetics. 2009, 182: 503-517. 10.1534/genetics.109.102608.PubMed CentralView ArticlePubMedGoogle Scholar
- Feuillet C, Leach JE, Rogers J, Schnable PS, Eversole K: Crop genome sequencing: lessons and rationales. Trends Plant Sci. 2011, 16: 77-88. 10.1016/j.tplants.2010.10.005.View ArticlePubMedGoogle Scholar
- Jackson SA, Iwata A, Lee SH, Schmutz J, Shoemaker R: Sequencing crop genomes: approaches and applications. New Phytol. 2011, 191: 915-925. 10.1111/j.1469-8137.2011.03804.x.View ArticlePubMedGoogle Scholar
- Braeutigam A, Gowik U: What can next generation sequencing do for you? Next generation sequencing as a valuable tool in plant research. Plant Biology. 2010, 12: 831-841. 10.1111/j.1438-8677.2010.00373.x.View ArticleGoogle Scholar
- Marguerat S, Baehler J: RNA-seq: from technology to biology. Cellular and Molecular Life Sciences. 2010, 67: 569-579. 10.1007/s00018-009-0180-6.PubMed CentralView ArticlePubMedGoogle Scholar
- Wang Z, Gerstein M, Snyder M: RNA-Seq: a revolutionary tool for transcriptomics. Nat Rev Genet. 2009, 10: 57-63. 10.1038/nrg2484.PubMed CentralView ArticlePubMedGoogle Scholar
- Bancroft I, Morgan C, Fraser F, Higgins J, Wells R, Clissold L, Baker D, Long Y, Meng J, Wang X: Dissecting the genome of the polyploid crop oilseed rape by transcriptome sequencing. Nat Biotechnol. 2011, 29: 762-766. 10.1038/nbt.1926.View ArticlePubMedGoogle Scholar
- Trick M, Long Y, Meng J, Bancroft I: Single nucleotide polymorphism (SNP) discovery in the polyploid Brassica napus using Solexa transcriptome sequencing. Plant Biotechnology Journal. 2009, 7: 334-346. 10.1111/j.1467-7652.2008.00396.x.View ArticlePubMedGoogle Scholar
- Oshlack A, Robinson MD, Young MD: From RNA-seq reads to differential expression results. Genome Biol. 2010, 11.
- Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y: RNA-seq: An assessment of technical reproducibility and comparison with gene expression arrays. Genome Res. 2008, 18: 1509-1517. 10.1101/gr.079558.108.PubMed CentralView ArticlePubMedGoogle Scholar
- Mortazavi A, Williams BA, McCue K, Schaeffer L, Wold B: Mapping and quantifying mammalian transcriptomes by RNA-Seq. Nature Methods. 2008, 5: 621-628. 10.1038/nmeth.1226.View ArticlePubMedGoogle Scholar
- Bullard JH, Purdom E, Hansen KD, Dudoit S: Evaluation of statistical methods for normalization and differential expression in mRNA-Seq experiments. BMC Bioinforma. 2010, 11.
- Willenbrock H, Salomon J, Sokilde R, Barken KB, Hansen TN, Nielsen FC, Moller S, Litman T: Quantitative miRNA expression analysis: Comparing microarrays with next-generation sequencing. RNA-a Publication of the RNA Society. 2009, 15: 2028-2034. 10.1261/rna.1699809.View ArticleGoogle Scholar
- McIntyre LM, Lopiano KK, Morse AM, Amin V, Oberg AL, Young LJ, Nuzhdin SV: RNA-seq: technical variability and sampling. BMC Genomics. 2011, 12:Google Scholar
- Trick M, Cheung F, Drou N, Fraser F, Lobenhofer EK, Hurban P, Magusin A, Town CD, Bancroft I: A newly-developed community microarray resource for transcriptome profiling in Brassica species enables the confirmation of Brassica-specific expressed sequences. BMC Plant Biology. 2009, 9.
- Li H, Ruan J, Durbin R: Mapping short DNA sequencing reads and calling variants using mapping quality scores. Genome Res. 2008, 18: 1851-1858. 10.1101/gr.078212.108.PubMed CentralView ArticlePubMedGoogle Scholar
- Bottomly D, Walter NAR, Hunter JE, Darakjian P, Kawane S, Buck KJ, Searles RP, Mooney M, McWeeney SK, Hitzemann R: Evaluating gene expression in C57BL/6 J and DBA/2 J mouse striatum using RNA-Seq and microarray. PLoS One. 2011, 6.
- Hoen PAC, Ariyurek Y, Thygesen HH, Vreugdenhil E, Vossen RHAM, de Menezes RX, Boer JM, van Ommen G-JB, den Dunnen JT: Deep sequencing-based expression analysis shows major advances in robustness, resolution and inter-lab portability over five microarray platforms. Nucleic Acids Res. 2008, 36.
- Fu X, Fu N, Guo S, Yan Z, Xu Y, Hu H, Menzel C, Chen W, Li Y, Zeng R, Khaitovich P: Estimating accuracy of RNA-Seq and microarrays with proteomics. BMC Genomics. 2009, 10:Google Scholar
- Cheung F, Trick M, Drou N, Lim YP, Park J-Y, Kwon S-J, Kim J-A, Scott R, Pires JC, Paterson AH: Comparative analysis between homoeologous genome segments of Brassica napus and its progenitor species reveals extensive sequence-level divergence. Plant Cell. 2009, 21: 1912-1928. 10.1105/tpc.108.060376.PubMed CentralView ArticlePubMedGoogle Scholar
- Du Z, Zhou X, Ling Y, Zhang Z, Su Z: agriGO: a GO analysis toolkit for the agricultural community. Nucleic Acids Res. 2010, 38: W64-W70. 10.1093/nar/gkq310.PubMed CentralView ArticlePubMedGoogle Scholar
- Yang X, Ye C-Y, Cheng Z-M, Tschaplinski TJ, Wullschleger SD, Yin W, Xia X, Tuskan GA: Genomic aspects of research involving polyploid plants. Plant Cell Tissue and Organ Culture. 2011, 104: 387-397. 10.1007/s11240-010-9826-1.View ArticleGoogle Scholar
- Birchler JA, Veitia RA: The gene balance hypothesis: implications for gene regulation, quantitative traits and evolution. New Phytol. 2010, 186: 54-62. 10.1111/j.1469-8137.2009.03087.x.PubMed CentralView ArticlePubMedGoogle Scholar
- Chelaifa H, Monnier A, Ainouche M: Transcriptomic changes following recent natural hybridization and allopolyploidy in the salt marsh species Spartina x townsendii and Spartina anglica (Poaceae). New Phytol. 2010, 186: 161-174. 10.1111/j.1469-8137.2010.03179.x.View ArticlePubMedGoogle Scholar
- Ihaka R, Gentleman R: R: A language for data analysis and graphics. J Comput Graph Stat. 1996, 5: 299-314.Google Scholar
- McCullagh P, Nelder J: Generalized Linear Models, Second Edition. 1989, Chapman and Hall, LondonView ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.