“A draft Musa balbisiana genome sequence for molecular genetics in polyploid, inter- and intra-specific Musa hybrids”
© Davey et al.; licensee BioMed Central Ltd. 2013
Received: 14 April 2013
Accepted: 24 September 2013
Published: 5 October 2013
Modern banana cultivars are primarily interspecific triploid hybrids of two species, Musa acuminata and Musa balbisiana, which respectively contribute the A- and B-genomes. The M. balbisiana genome has been associated with improved vigour and tolerance to biotic and abiotic stresses and is thus a target for Musa breeding programs. However, while a reference M. acuminata genome has recently been released (Nature 488:213–217, 2012), little sequence data is available for the corresponding B-genome.
To address these problems we carried out Next Generation gDNA sequencing of the wild diploid M. balbisiana variety ‘Pisang Klutuk Wulung’ (PKW). Our strategy was to align PKW gDNA reads against the published A-genome and to extract the mapped consensus sequences for subsequent rounds of evaluation and gene annotation.
The resulting B-genome is 79% the size of the A-genome, and contains 36,638 predicted functional gene sequences which is nearly identical to the 36,542 of the A-genome. There is substantial sequence divergence from the A-genome at a frequency of 1 homozygous SNP per 23.1 bp, and a high degree of heterozygosity corresponding to one heterozygous SNP per 55.9 bp. Using expressed small RNA data, a similar number of microRNA sequences were predicted in both A- and B-genomes, but additional novel miRNAs were detected, including some that are unique to each genome. The usefulness of this B-genome sequence was evaluated by mapping RNA-seq data from a set of triploid AAA and AAB hybrids simultaneously to both genomes. Results for the plantains demonstrated the expected 2:1 distribution of reads across the A- and B-genomes, but for the AAA genomes, results show they contain regions of significant homology to the B-genome supporting proposals that there has been a history of interspecific recombination between homeologous A and B chromosomes in Musa hybrids.
We have generated and annotated a draft reference Musa B-genome and demonstrate that this can be used for molecular genetic mapping of gene transcripts and small RNA expression data from several allopolyploid banana cultivars. This draft therefore represents a valuable resource to support the study of metabolism in inter- and intraspecific triploid Musa hybrids and to help direct breeding programs.
Bananas and plantains (Musaceae, Zingiberlaes, Musa ssp., ‘bananas’) are giant monocotoyledenous herbs, which originated in Southeast Asia and the western Pacific. They were one of the first crops to be domesticated, and are now widely distributed throughout the subtropics where they constitute a major staple food for millions of people . Indeed, bananas are the World’s most important fruit and rank fourth in the list of global food crops , but with the important caveat that 80% of the production is for local consumption .
There are four genomes present within Musa spp., corresponding to the genetic constitutions of the four wild Eumusa species i.e. Musa acuminata (A-genome, 2n = 2x = 22), Musa balbisiana (B-genome, 2n = 2x = 22), Musa schizocarpa (S genome, 2n = 2x = 22) and the Australimusa species (T genome, 2n = 2x = 20). Within M. acuminata, a number of subspecies are also recognized which have been further classified into four subgroups on the basis of DNA markers i.e. banksii, zebrina, malaccensis and burmannica/burmannicoides [4–6]. Despite evidence of interspecific genetic variation, there has been no subspecies classification within M. balbisiana. Members of the Australimusa sections have a basic chromosome number of 2n = 20, and comprise seven species one of which contains edible parthenocarpic types known as ‘Fe’i’ bananas. Members of this Fe’i group are notable for their upright fruit and in some cases exceptionally high fruit provitamin A carotenoid contents [8, 9].
Cultivated bananas are parthenocarpic, generally seedless and vegetatively propagated hybrids that have arisen primarily as a result of hybridizations between wild diploid M. acuminata and M. balbisiana species [10–12]. Intraspecific hybridizations within M. acuminata, and interspecific hybridizations between M. acuminata and M. balbisiana have naturally resulted in various combinations of these A- and B -genomes which are classified into six groups (AA, AAA, AB, AAB, ABB and ABBB) . The majority of edible cultivars though are allopolyploid triploids with a genome constitution of AAA (dessert banana), AAB (plantains) and ABB (cooking bananas).
It has also become increasingly clear that there is no simple division of parental A- and B- chromosomes during hybridization, and several research groups have provided evidence for pairing and recombination between homeologous A- and B- chromosomes [13, 14]. Similar events have been observed in allopolyploids of other species . It seems likely therefore that current interspecific triploids have arisen from one or more steps of (re)combination and exchange of chromosomal segments between the A- and B-genomes [13, 16, 17]. As a result, most if not all Musa cultivars probably have genomes consisting of different proportions of the A- and B-genome. A similar process of hybridization between subspecies of M. acuminata probably also underlines the evolution of the edible AA and AAA types. A consequence of these recombination events is that the hybrid genomes contain an unbalanced number of A- and B-genome alleles . This clearly complicates genetic studies of trait inheritance, as well as the development and application of molecular marker technologies .
Currently, over 90% of commercial export dessert bananas are produced from a single cultivar, namely ‘Cavendish’ (AAA). Unsurprisingly, this dependency on a single cultivar and the consequent lack of genetic variation in production systems, has resulted in a crop which is potentially highly susceptible to disease pandemics. Pandemics are also not without precedent in banana, and in the 1950’s, ‘Panama Disease’ (‘Fusarium wilt’) caused by the soil-born fungus Fusarium oxysporum was responsible for entirely wiping out commercial banana production, which at that time was dependent on the cultivar ‘Gros Michel’ (AAA). By necessity, ‘Gros Michel’ was replaced by the Fusarium-resistant cultivar ‘Cavendish’ , but unfortunately, in the intervening years new strains of F. oxysporum have evolved, (F. oxysporum f. sp. cubense race 4, ‘Tropical race 4’), that have overcome this natural Cavendish resistance . Currently, the other major disease threat is that posed by Mycosphaerella fijiensis (‘Black Sigatoka’). Chemical control of Fusarium wilt is ineffective, and even though ‘Black Sigatoka’ can be controlled by frequent (up to 50) fungicide applications per year, these are socio-economically and environmentally inappropriate, and require integrated strategies to avoid the development of fungicide resistance in the pathogen [20, 21].
Consequently, there is currently much interest in developing sustainable solutions to these disease threats through the introgression of novel resistance loci and the development of new, disease resistant varieties. However the largely sterile nature of the majority of commercial varieties means that banana breeding is both lengthy and time-consuming. Therefore the application of molecular marker technologies for the identification of trait-marker associations, and high throughput genotyping technologies can greatly accelerate the whole breeding cycle through marker assisted selection (MAS) . In this regard the recent publication of a high quality 523 Mb draft reference genome for the doubled haploid genome of the wild M. acuminata variety ‘Pahang’ (AA), represents a major advance in this field . The variety ‘Pahang’ is member of the subspecies malaccensis that contributed one of the three M. acuminata genomes for ‘Cavendish’ . This invaluable resource therefore provides for the first time a complete catalogue of all predicted genes, transcripts and markers in Musa, and can greatly facilitate and accelerate the search for novel genes, transcripts, allelic variants etc., for important biological processes. It also opens up the prospects for the rapid development of high-throughput molecular tools for Musa improvement. Despite this, there is also an urgent need for the development of similar resources for the Musa B-genome, to be able to identify and exploit M. balbisiana accessions with resistance to various abiotic and biotic stresses including osmotic and cold stresses [23–26] as well as for vigour [27, 28].
The aim of this work therefore was to sequence and assemble a draft genome of the wild M. balbisiana diploid variety ‘Pisang Klutuk Wulung’ (‘PKW’ , B-genome) for use in comparative transcriptomics and genomics studies of interspecific triploid and tetraploid hybrids. PKW is one of the possible ancestral parents of the B-genome present in cultivated triploids, and has for example has been shown to have very strong partial resistance to black leaf streak virus (http://www.proMusa.org/tiki-index.php?page=Pisang+Klutuk+Wulung). Other M. balbisiana cultivars have demonstrated resistance to Xanthomonas , and are considered to be more drought tolerant [24, 26]. The utility of the PKW B-genome sequence generated here was validated by examining the distribution of CDS, EST and RNA-read mappings from a range of genetically diverse interspecific triploid cultivars across the combined A- and B-genomes, by characterising the predicted miRNA sequences encoded in both A- and B-genomes and by prediction of miRNA targets in both genomes.
Improved insights into genome structure, allelic diversity and regulatory elements within M. balbisiana spp. will help in the design and application of breeding strategies for novel banana cultivars with improved stress resistance and quality traits.
Sterile plantlets of the wild diploid M. balbisiana variety ‘Pisang Klutuk Wulang’ (‘PKW’ , BB genome), were obtained from the International Transit Centre, KU Leuven (reference number ITC 1587), and healthy leaf material collected for gDNA isolation from 8-week old, greenhouse grown plants. RNA was extracted from lyophilised fruit pulp samples from the following cultivars; ‘Yangambi-km5’ (AAA), ‘Gros Michel’ (AAA), ‘Batard’ (AAB), ‘Mbouroukou-3’ (AAB), ‘Iholena Lele sub var 'Long Peduncular'’ (AAB), and ‘Karat’ (Fe’i, Australimusa).
Banana root tissues were prepared from clonal tissue-cultured plantlets of M. acuminata cultivar ‘Berangan’ (AAA), when 6-8 cm tall, with three fully-expanded leaves and healthy roots. Root tissues were pooled from 3–4 plantlets for RNA isolation and small RNA sequencing. Banana embryogenic suspension cell samples were prepared from embryogenic callus induced from immature male flowers of ‘Berangan’ as described by . Newly initiated suspension cells were used for RNA isolation and small RNA sequencing.
Genomic DNA (gDNA) was extracted from young leaves essentially as according to Michiels et al. .
Total nucleic acids were extracted from lyophilized fruit pulp samples, from 3 individual fruit per cultivar using the Tris-LiCl procedure of Tattersall et al. , modified as previously described . Equivalent amounts of RNA from each sample were combined per cultivar. Total nucleic acids were isolated from banana root tissues and embryogenic suspension cells using a modified CTAB nucleic acid isolation method .
RNA and DNA quality
Concentrations of purified nucleotides were determined at 260 nm using a NanoDrop 2000 Spectophotometer (Thermo Scientific) and purity assessed at an absorbance ratio of 260/280 nm and 260/230 nm. RNA integrity was confirmed by agarose gel electrophoresis and on an Agilent 2100 Bioanalyzer (Agilent Technologies, Inc.). Only samples with high RNA integrity number (RIN ≥ 8) were used for RNA sequencing. A total of 2 μg of purified gDNA and of each combined RNA/DNA sample was precipitated in ethanol and used for sequencing.
RNA and DNA sequencing
Sequencing of both RNA and DNA samples was carried out at the Genomics Core sequencing facilities of the Katholieke Universiteit Leuven (http://www.uzleuven.be/genomicscore/), using an Illumina HiSeq 2000 II instrument. Small RNA libraries were sequenced using an Illumina HiSeq 2000 II at BGI, Shenzhen.
Illumina paired-end cDNA library construction and sequencing
The cDNA libraries were constructed using the TruSeq™ RNA Sample Preparation Kit (Illumina, Inc.) according to the manufacturer's instructions. Poly-A containing mRNA was purified from 2 μg of total RNA using oligo(dT) magnetic beads and fragmented into 200–500 bp pieces using divalent cations at 94°C for 5 min. The cleaved RNA fragments were copied into first strand cDNA using SuperScript II reverse transcriptase (Life Technologies, Inc.) and random primers. After second strand cDNA synthesis, fragments were end repaired, a-tailed and indexed adapters were ligated. The products were purified and enriched with PCR to create the final cDNA library. The 6 tagged cDNA libraries were pooled in equal ratios and used for 2 × 100 bp paired-end sequencing on a single lane of the Illumina HiSeq2000 II.
Sequence data processing
Sequencing data was provided as fastq files and unless otherwise mentioned, all data processing steps were carried out using the CLC Genomics Workbench software package v 6.01. Raw reads were uploaded to GenBank and are accessible via accession number SAMN02333823.
Mapping PKW gDNA reads to the reference A-genome
The raw data was first trimmed to remove low quality bases and the trimmed PKW gDNA reads then aligned to the publically available M. acuminata ‘Pahang’ doubled haploid A-genome assemblies available from http://banana-genome.cirad.fr/download.php. This consists of 11 chromosomes together with one sequence containing concatenated unassembled contigs, each separated by 100 ‘N’s’. Reads were aligned using the settings; mismatch cost (2), insertion cost (3) deletion cost (3), length fraction (0.9) and similarity fraction (0.9). Reads mapping equally well to two positions were assigned randomly. Following mapping, the consensus sequences were extracted and served as the PKW consensus reference genome for further treatments. For chromosomes 1 – 11, during extraction of the consensus PKW genome sequence, regions of 0 read coverage (e.g. gaps etc.) are removed to produce a single continuous sequence. For the large unmapped chromosome, the consensus PKW sequence was extracted using ‘N’ ambiguity symbols to fill in gap regions, as otherwise unrelated genic sequences could be concatenated together allowing ‘bridging’ of reads across unrelated genic sequences.
Mapping RNA de novo assembled transcripts, CDS and unigenes to gDNA contigs and genome sequences
RNA reads (de novo assembled transcripts, CDS, unigenes), were aligned to the PKW genome or gDNA contig data using the ‘large gap’ mapping function within the CLC Genomics Workbench, using the following settings; Maximum number of hits for a segment = 10, Maximum distance from seed = 50,000, Key Match _Mode = random, Mismatch cost = 2, Insertion cost = 3, Deletion cost = 3, Similarity = 0.8, Length fraction = 0.9. The large gap mapper function aligns reads to a reference sequence, while allowing for large gaps in the mapping. It is therefore able to map reads that span introns without requiring prior transcript annotations or for the detection of large deletions in genomic data. Additional details can be found http://www.clcbio.com/white-paper.
Ab initio gene prediction was carried out using the FGENESH software, available online from http://linux1.softberry.com/all.htm and using the default parameters and the ‘monocot model plant’ parameters. The list of predicted PKW gene models was then blasted against the NCBI nr protein database and gene ontology terms assigned using the Blast2Go software (Conesa et al., ; Götz et al., ). Repeats were annotated by BLAST against the repetitive part of the Musa genome containing 1902 sequences which were retrieved from a published report . Evaluation of the PKW B-genome gene model set took place by large-gap mapping of available CDS, and EST resources within CLC Genomics Workbench. These resources consisted of the ‘Pahang’ consensus CDS set, an in-house Musa unigene set of 22,205 sequences derived from the Syngenta M. acuminata 3’ EST database , transcript sets generated from the de novo assembly of Illumina 100 bp paired end RNA reads from 6 Musa cultivars.
De novo assembly
All the trimmed, PKW gDNA reads were de novo assembled using the default settings in CLC Genomics Workbench with the settings as follows; Word size: 25, Bubble size: 50, Minimum contig length = 200, Mismatch cost = 2, Insertion cost = 3, Deletion cost = 3, Length fraction = 0.5, Similarity fraction = 0.8. In addition the ‘scaffolding’ option was used to take advantage of the paired read data, and reads were mapped back to the contigs generated to validate the sequences. The same parameters were also used for the de novo assembly of PKW gDNA reads that could not be mapped to the ‘Pahang’ reference genome, and for the de novo assembly of all RNA transcript data from the 5 triploid hybrids and the diploid Fe’i variety.
Sequence variant analysis was carried out on both the RNA- and DNA-mappings using the ‘Probabilistic Variant Detection’ plugin in CLC Genomics Workbench, with settings specifying a minimum read coverage of 10 and a variant probability of >90%. The maximum expected number of variants was set as 2 or 3 according to the ploidy level of the samples and variants were calculated using either all the mapped reads, or only using the uniquely mapped reads.
This was performed on both the ‘Pahang’ A-genome and the PKW consensus B-genomes (our data) as well as de novo assembled contigs with Repeat Masker V4.0.3 software tool (http://www.repeatmasker.org),  using RMBLAST 2.2.27 as the engine and using the customized library of M. acuminata repeats (1903 sequences) from Hribova et al. 2010 . SSR detection was carried out using Tandem Repeat Finder (TRF) software  and TRAP  using the default parameters.
microRNA (miRNA) and target prediction
miRNA prediction was performed using miRDeep2 tool  using scripts modified according to the criteria set for plant genomes . A non-redundant query set of small RNA reads was compiled from root and embryogenic cell suspension and included all 235 miRNA sequences reported for the Musa ‘Pahang’ doubled haploid A-genome retrieved from the banana genome database , and publically available small RNA data from M. acuminata ‘Calcutta 4’ leaf, flower and fruit tissues (sequenced within the framework of a NSF project http://smallrna.udel.edu). Small RNA reads were trimmed to remove adapter sequences and low quality reads. Reads were then mapped to the Rfam database  using Bowtie . Matches with tRNA, rRNA, small nucleolar RNA and sequences below 19 or above 24 nucleotides in length were not considered for further analysis. The query miRNA data set was mapped separately to the A-genome  and the draft PKW B-genome developed here. Regions of 300 nt surrounding the matched position of each read were excised from the genome sequences, and then RNAfold software  used to predict sequences able to form stem-loop structures, using default options, whilst ‘Randfold’  was used to calculate p-values for potential miRNA precursors predicted by the ‘miRDeep2’ algorithm. The candidate miRNA precursors selected had the following features; a predicted stem loop structure of 75 nt and a bulge-loop size of less than 6 nt; the mature miRNA was within the stem region of the precursor; less than four mismatches were allowed between the mature miRNA:miRNA* duplex; miRNA and miRNA* were on opposite arms of the precursor forming a duplex with 3’ overhangs; the predicted minimum folding energy (MFE) was between -15 kcal/mol to −47.2 kcal/mol. Known miRNA sequences/homologues were annotated by BLASTn comparison  to mature and stem loop miRNA sequences from miRBase v19 . Predicted miRNA were considered novel if they had no match (allowing for a maximum of 2 mismatches i.e. n/n, (n-1)/n, (n-2)/n nucleotide matches, n = length of mature miRNA) to any entry in miRBase (release 19) and PMRD (accessed, February 2013). Novel Musa miRNA sequences not present in either miRBAse or PMRD databases, were arbitrarily named starting at ‘1’ and using the miRBase species based name format. For miRNA families observed to be present in both A- and B-genomes, paralogous miRNA loci counts in each Musa genome were estimated based on the 300 nt precursor regions predicted by miRDeep2. miRNA targets were predicted with ‘psRNAtarget’ online server (http://plantgrn.noble.org/psRNATarget/)  with default options.
Results and discussion
The majority of edible cultivated banana varieties are inter- and intra-specific triploid hybrids between varieties of two wild diploid M. acuminata and M. balbisiana species. To be able to carry out molecular genetic studies in these cultivars therefore it is necessary to have a reference B-genome. For these reasons we carried out Illumina 100 bp paired end gDNA sequencing of the wild diploid M. balbisiana variety ‘Pisang Klutuk Wulang’ (PKW), which is considered by many in the Musa research community to be one of the possible ancestors of the modern AAB hybrids.
PKW gDNA Read Mapping to the A-genome
overview of mean mapped read depth per chromosome, and the derived B-genome chromosome lengths following alignment to the A-genome
‘Pahang’ Reference length/bp
Mapped reads count
Mean mapped read depth
‘PKW’ mapped read length/bp
% of A -genome coverage
total counts / averages
This difference in size between the A- and B-genomes was largely expected as flow cytometric analyses of five M. acuminata genotypes (including DH ‘Pahang’) and representing 4 different subspecies, and of four M. balbisiana genotypes (including PKW) indicate that the haploid B-genome size is on average only 90% of the size of the A-genome [50–52]. However according to recent work from Cizkova et al.  the PKW haploid genome is actually 93.3% of the size of DH ‘Pahang’ with a sizes of 554 and 594 Mb respectively. This value for DH Pahang is slightly higher than the 523 Mb reported by D’Hont et al. 2012 . The sequenced ‘Pahang’ genome size of 473 Mb, represents an assembly of ~90% of the total M. acuminata genome, of which only 323 Mb is anchored to the 11 chromosomes . If the expected PKW M. balbisiana genome size is 93.3% of ‘Pahang’ then we would expect to generate a consensus length of in the region of 0.93 x 473 or 440 Mb. By analogy therefore, our consensus PKW B-genome size of 341.4 Mb represents ~78% of the expected PKW B-genome size.
De novo assembly of unmapped gDNA reads
A total of 36.8 M gDNA reads (36.3 Gbp), remained unmapped after alignment to the A-genome. These could represent reads from regions that are structurally highly divergent from the A-genome so to test for the presence of unique, genic B-genome regions, the unmapped reads were de novo assembled into 63,245 contigs (Additional file 2: Table S2), and the presence of genic sequences tested for by large gap mapping of Musa unigene and reference CDS sequences, followed by a round of transcript detection. In total, 58,746 reads were used, but only 28 sequences actually mapped to these contigs. We can therefore conclude that the unmapped gDNA reads do not contain any significant gene rich regions, and that essentially all genic regions are retained in the consensus PKW B-genome sequence. An overview of the repeats annotation of these contig sequences is provided in Additional file 3: Table S3.
De novo assembly of gDNA reads
We also carried out de novo assembly of all gDNA reads, independent of a reference sequence. Here, over 96% of the 281 M trimmed reads, representing 27.4 Gbp of nucleotide sequence were assembled into 180,175 contigs with a total length of 339.3 Mb, an N50 of 7,884 bp, and an average contig length of 1,883 bp (Additional file 4: Table S4). The accumulated assembled contig length of 339.3 Mb is very similar to the consensus read mapping length of 341 Mb, but due to its much more fragmented nature (78.2% of these contigs were less than 1 kb in length) this resource is much more difficult to utilize. To evaluate the set of PKW gDNA contig sequences, the Musa reference CDS set was mapped to the PKW contig set as well to as the consensus PKW B-genome. In the case of the consensus PKW B-genome 32,192 (88% of total) Musa CDS were successively mapped, corresponding to 25,565 individual transcripts. In the case of the gDNA contig set, 71% of the CDS could be mapped (25,694 CDS), and a total of 21,272 individual transcripts were identified (data not shown). These data indicate therefore that simply mapping the gDNA reads to the A-genome and extracting the consensus sequence is the most effective way to generate a draft working M. balbisiana genome.
Evaluation/characterisation of the PKW B-genome assembly
While these results demonstrate that there are large regions with a high degree of homology across the M. acuminata and M. balbisiana genomes, it is important to realize that with this read mapping approach we cannot determine whether large scale genomic rearrangements such as insertions, inversions, transversions etc. have taken place. Previous work in banana however, suggests that gene order is likely to have been preserved - at least over short regions. For example, a comparison of 1.4 Mb of orthologous BAC clones from M. acuminata (cv. ‘Madang’), and M. balbisiana (cv. ‘Pisang Klutuk Wulung’), showed a high degree of microsynteny with preservation of gene order, and 96 – 96.3% sequence identity within genic regions [53, 54]. The same authors also reported that predicted gene structure was good for well-conserved homologous genes, but that discrepancies were detected in the gene predictions of those orthologous BACs whose protein products had no match in public databases (i.e. hypothetical protein genes).
Variant detection in the PKW B-genome
In total, 20,657,389 sequence variants were detected in the PKW B-genome relative to the reference doubled haploid ‘Pahang’ A-genome based on only the uniquely mapped reads (Additional file 5: Table S5). Of these 18,868,899 were single nucleotide variants (SNVs or SNPs), 815,805 insertions and 972,588 deletions. From the list of SNVs 8,738,760 were homozygous variants which therefore represent sequence differences from ‘Pahang’. The remaining 10,130,236 were heterozygous variants and therefore represent allelic variation and the degree of heterozygosity present in the PKW B-genome. On the basis of the total consensus PKW genome size of 341,431,243 bp, this heterozygosity corresponds to a SNP frequency of 1 variant every 33.7 bp, or 2.97%. The number and densities of sequence variants present in heterozygotic eukaryotic genomes varies enormously according to the species, whether they are obligate out-crossers or not, the number and genetic diversity of the cultivars assessed, and whether coding or noncoding regions are being considered. For example, in maize Zea mays L.) the SNV density was 1 SNV per 124 bp of coding sequence, and 1 per 31 bp in non-coding regions . Visual inspection of Figure 1 also confirms that SNV density is higher in the intronic regions of the PKW B-genome than in the exons, and that there is also a higher SNV frequency in the non-transcribed regions. Similar conclusions were reached by Boonruangrod et al. in their comparison of the rDNA sequences of M. acuminata type ‘Calcutta4’ and ‘Yangambi KM5’ with the wild type M. balbisiana accession ‘Tani .
By comparison, our data indicate that the PKW gDNA sequence differs from the A-genome at a frequency of 1 (homozygous) SNV every 39.1 bp (2.6%). This is much higher than estimates for interspecies SNP variation in rice, where a comparison of the O. indica and O. japonica genome sequence data found a SNP frequency of around 0.4%, . However differences between M. acuminata and M. balbisiana are comparable to frequencies measured in Eucalyptus species (between one SNP per 16 bp to one per 33 bp were reported ), Quercus crispula (one SNP in 25 bp ) and Populus tremula (one SNP per 60 bp ). Estimates for the degree of interspecific variation in Musa are likely to grow as more cultivars and species are sequenced.
Repeat detection and annotation
Overview and classification of the repeats present in the ‘Pahang’ (Musa A) and ‘Pisang Klutuk Wulung’ (PKW, Musa B) genomes
Overview of the microsatelites present in the ‘Pahang’ (Musa A) and ‘Pisang Klutuk Wulung’ (PKW, Musa B) genomes, and a comparison with the results of Hribova et al. obtained from the analysis of ~100 Mb of the repetitive region of M. acuminata cv ‘Calcutta 4’ genome
Total number of repeats
Total number of repeat bases
genome sized (bp)
SSR bps/Kbp genome
Overview of the dinucleotide microsatellite motifs present in the ‘Pahang’ (Musa A) and ‘Pisang Klutuk Wulung’ (PKW, Musa B) genomes, and a comparison with the results of Hribova et al. obtained from the analysis of ~100 Mb of the repetitive region of M. acuminata cv ‘Calcutta 4’ genome
Predicted novel miRNA in Musa A- and Musa B-genomes
Locus B = genome
Mature_miRNA sequence 5′ – 3′
Genome distribution of miRNA precursors
Targets of novel B-genome miRNA
Using a fairly stringent cut-off expect value of 2.0, 18 predicted novel miRNA families were found in the B-genome. Of these, seven (mba-miR3, mba-miR5, mba-miR8, mba-miR12, mba-miR13, mba-miR15 and mba-miR18) were found to have targets within coding regions of the B-genome sequence. None of these predicted miRNA families were present in the A-genome  and thus are presumed to be B-genome specific in function and to have evolved after the divergence of the M. balbisiana and M. acuminata species ~4.6 Mya . Predicted targets of these B-genome candidate miRNAs correlate with a range of functions across plant development and metabolism and notably, several are proposed to be involved in tolerance or responses to biotic and abiotic stress. These included casein kinase, a predicted target of mba-miR3 and previously reported to affect multiple developmental and stress response pathways in Arabidopsis ; dirigent, a predicted target of mba-miR15, which are a family of proteins associated with lignification, biotic and abiotic stress responses, also recently reported as responding to drought, salt and oxidative stresses in sugarcane  and mba-miR8 targets from the multidrug and toxic compound extrusion (MATE) family, which in plants are associated with tolerance to various xenobiotic compounds  and metals including zinc tolerance in Arabidopsis  and aluminium tolerance in sorghum . An additional predicted target of mba-miR8, Sal-1 phosphatase, has been reported as a negative regulator of drought tolerance in Arabidopsis, acting via ABA-dependent and independent pathways .
Several of the B-genome specific miRNA predicted targets, are proteins with unknown function, and appear to be only present in the B-genome and thus may represent B-genome distinct functional networks. Given the wider stress and disease resistance reported for banana B genomes [23, 76, 77] further functional validation of these miRNA and target genes, and in particular those of unknown function, is of particular interest. A full list of predicted miRNA targets is provided in Additional file 7: Table S7.
Annotating the PKW consensus B-genome
Comparison of the Musa A- and B-genome annotations
RNA read and transcript mapping to the A- and B-genomes
To evaluate the usefulness and validity of the consensus PKW B-genome for expression studies, a total of 256 M paired-end, 100 bp RNA reads from two AAA, and three AAB cultivars, and one diploid Austalimusa Fe’i cultivar were mapped to the combined A- and B-genome sets. In addition we carried out de novo assembly of these reads to generate transcript sets for each cultivar, and these expressed gene transcripts were also mapped to the combined genomes set together with a ‘Grande Naine’ (AAA), Unigene set.
Interestingly, the normalized read coverage across the all chromosomes of both genomes was also found to differ between the cultivars. Of particular note are the differences between the two AAA hybrids (Figure 4, Additional file 9: Table S9), where 40.2% of all ‘Yangambi-km5’ mapped reads localized to the A- and B-chrUn_random’ sequences, compared to only 16.7% of the ‘Gros Michel’ reads. This suggests that there could be substantial differences between the AAA genome sequences of ‘Yangambi km-5’ and ‘Pahang’. However we also see that ‘B-chrUn_random’ is also the sequence with the highest count of TE sequences, so that these differences could simply represent differences in the abundance of these elements. Indeed comparing the genes with the highest expression levels in this chromosome in these 2 cultivars, shows that 250,822 reads specifically map to a single 472 bp intronic region of sequence (KMMuB_chrUn_random_G39317)_in ‘Yangambi km-5’ , while ‘only’ 60,891 map to this same sequence in ‘Gros Michel’. No such phenomenon was observed elsewhere however, where we see that a much higher proportion of ‘Gros Michel’ reads map to A_chr3 (Figure 4), and where for all cultivars a proportionally higher number of reads mapped to A_chr6, which is also the chromosome with the highest gene count (3,794, Table 6). However, even though the homeologous B_chr6 gene also has the highest gene count of the B-genome, more reads from the African plantains (‘Mbouroukou’ and ‘Batard’), mapped to B_chr7. These results indicate that there are different patterns of expression across the two chromosome sets (Figure 4), and suggest different and unique contributions of each genome to banana metabolism.
overview of RNA-transcript read coverage distribution per chromosome following mapping to the combined A- and B-genomes
Normalised mapped transcripts count
Firstly, we see that 99.8% of the reference ‘Pahang’ CDS map to the A-genome, as expected. Similarly, 96.1% of the ‘Yangambi-km5’ (AAA) transcripts, and 94.1% of the ‘unigene’ sequences derived from “Grande Naine” (AAA, Cavendish sub-group) map to the B-genome (Table 6). Somewhat surprisingly though, only 87.3% of the ‘Gros Michel’ (AAA) transcripts map to the A-genome suggesting either that the ‘Ibota’ subgroup is more closely related to ‘Malacenssis’ subgroup to which ‘Pahang’ belongs than to the ‘Gros Michel’ subgroup, or that there is a larger proportion of M. balbisiana-like gene sequences in the ‘Gros Michel’ genome. Importantly the proportion of AAB transcripts mapping to the B-genome approaches the theoretically expected 33%, at 34.2 – 35.4% for the African plantains, but is slightly different than the 31.5% observed for Pacific plantain cultivar ‘Iholena lele’. Despite the fact that ‘Karat’ belongs to a completely different species (Australimusa), diploid ‘Karat’ transcripts map to both reference genomes equally well (51.2:48.8, A:B respectively), indicating that these resources could be useful to study gene expression even in Australimusa cultivars.
Looking at the mean normalized expressed transcript coverage per chromosome, it is clear that the high proportion of 100 bp reads mapping to the ‘chrUn_random’ sequences (Figure 4) is no longer evident. This suggests that the majority of the differences between cultivars involves non-expressed regions. For all cultivars the highest proportion of mapped reads are to be found on chromosomes A_chr6, and A_chr8, and the lowest number of mapped transcripts occurs on the two chrUn_random sequences. The African plantains are interesting in having a proportionally higher number of reads mapping to B_chr7 despite the low gene count of this chromosome. Some care has to be taken in the interpretation of differences in read distributions across the chromosomes as variations in physiological status (e.g. ripening) could lead to shifts in the mean mapped read depths per chromosomes. Nonetheless these results could provide indications as to those chromosomes in which there has been a higher degree of homeologous recombination, or which are preferentially expressed under certain conditions. The difference in the distribution of mapped ‘Iholena lele’ (Pacific plantain) transcripts compared to the African plantains, ‘Mbouroukou-3’ and ‘Batard’ supports their different classifications. i.e. even though all are hybrids between the M. acuminata subsp. banksii and M. balbisiana genomes , ‘Iholena lele’ is classified as a member of the Iholena and Maoli-Popoulu subgroup of plantains while the African plantains are considered to have arisen as a result of somaclonal mutations following a limited number of introductions to the African continent 2,300 – 2,750 years ago.
The majority of modern cultivated banana cultivars are interspecific sterile triploids formed by hybridization between M. acuminata and M. balbisiana diploid progenitor species. Therefore the study of these hybrids ideally requires the presence of reference sequences for both genomes. In particular for the introgression of beneficial resistance traits associated with the M. balbisiana genome. Here, we generated a consensus B-genome sequence, which based on size estimates, represents ~78% of the expected B-genome length. This smaller size is presumably due to differences in the organisation of intergenic repetitive regions between the A- and B-genomes. This B-genome shows a high degree of sequence divergence from the A-genome as well as moderately high levels of heterozygosity. Ab initio gene prediction for the B-genome generates a total of 39,914 gene models, of which 3,540 are annotated as TE’s, so that the number of functional gene sequences is actually 36,638, which is nearly identical to the gene count of ‘Pahang’. We also identified several new B-genome specific miRNAs, some of which have predicted targets suggestive of novel stress related pathways that may have evolved separately in M. Balbisiana. To validate the usefulness of the consensus B-genome, mRNA-reads and de novo assembled transcripts from a series of genetically diverse interspecific triploid cultivars were mapped to the combined A- and B-genomes set. Results suggest significant genetic divergence between subgroups of AAA cultivars, and the presence of regions with high B-genome homology causing up to 12.7% of the transcripts to map preferentially to the B-genome. Comparison of the transcript mappings of the African and Plantain (AAB) varieties suggests structural diversity between these two groups, but importantly validate the quality of the B-genome, as ~33% of all reads map exclusively to these B-genome sequences. For all varieties there is evidence for differences in expression levels across homeologous chromosomes suggesting independent contributions of the two genomes to banana metabolism.
While our results demonstrate the usefulness of the consensus B-genome sequence for molecular genetic studies, it is important to relalise that the structure and organization is heavily dependent on the quality of the A-genome used as a reference. In this respect it should be remembered that only ~70% of the published A-genome sequence has actually been anchored to the 11 chromosomes. As a result, large scale structural differences in the B-genome are difficult to detect with our strategy, and intergenic regions of high diversity, and regions of the B-genome consisting of repeats will be difficult to assemble. In particular the position of the TEs, that represent around half of the Musa genome will be based on their position in the A-genome while in reality their position will in the vast majority of cases be different in the B genome.
There is much interest in the exploitation of the M. balbisiana genome sequence for the introgression of beneficial traits such as biotic and abiotic stress resistance within Musa breeding programs. The PKW B-genome sequence therefore represents a valuable resource for the molecular genetic studies, not only in AAB, and ABB cultivars, but also for AAA dessert bananas. In addition we have shown that it can be used obtain useful information on gene expression levels in members of the more distantly-related Australimusa species.
The raw Illumina PKW gDNA sequence data generated was submitted to SRA with accession no. SRR956987. The assembled PKW consensus chromosome sequences, with annotations, as well as the CDS set and list of gDNA contigs are hosted at the Banana Genome Hub and publically available at http://banana-genome.cirad.fr/content/musa-balbisiana-pisang-klutuk--wulung.
The authors thank Otto van Poeselaere, Sabine Van Leirberghe and Lucas N. Davey for stimulating discussions during the preparation of this manuscript. We acknowledge access to the Syngenta Musa 3'EST database, donated by Syngenta to Bioversity International within the framework of the Global Musa Genomics Consortium. We thank Bioversity International, Dr. Gerard Ngoh-Newilah of CARBAP, Djombe, Cameroon, Dr. Angela Kepler of Pacific-Wide Ecological Consulting, Hawaii, and the late Dr. Lois Engelberger of Pohnpei for providing samples of fruit. We thank the Ministry of Higher Education, Malaysia, for University of Malaya grants RG006-09BIO, PV109/2011A and FRGS grant FP005-2011A to JAH, GR and NZK. We would like to thank Wendy Chin Yi Wen from Plant Biotechnology Research Laboratory, University of Malaya for providing the embryogenic cell suspension. Finally the authors would like to thank Mathieu Rouard from Bioversity International, Montpellier for constructing the website to host the data generated here.
- De LE, Vrydaghs L, De Maret P, Xavier P, Denham T: Why bananas matter: an introduction to the history of banana domestication. Ethnobot Res Appl. 2009, 7: 165-177.Google Scholar
- FAOSTAT, banana consumption.http://faostat.fao.org/site/567/DesktopDefault.aspx?PageID=567#ancor,
- FAOSTAT, banana exports.http://faostat.fao.org/site/535/DesktopDefault.aspx?PageID=535#ancor,
- Raboin L, Carreel F, Noyer J: Diploid ancestors of triploid export banana cultivars: molecular identification of 2n restitution gamete donors and n gamete donors. Mol Breed. 2005, 16: 333-341. 10.1007/s11032-005-2452-7.View ArticleGoogle Scholar
- Boonruangrod R, Fluch ÆS, Burg ÆK: Elucidation of origin of the present day hybrid banana cultivars using the 5’ ETS rDNA sequence information. Mol Breed. 2009, 24: 77-91. 10.1007/s11032-009-9273-z.View ArticleGoogle Scholar
- Boonruangrod R, Desai D, Fluch S: Identification of cytoplasmic ancestor gene-pools of Musa acuminata Colla and Musa balbisiana Colla and their hybrids by chloroplast and mitochondrial haplotyping. TAG Theor Appl Genet. 2008, 118: 43-55. 10.1007/s00122-008-0875-3.View ArticlePubMedGoogle Scholar
- Wang X-L, Chiang T-Y, Roux N, Hao G, Ge X-J: Genetic diversity of wild banana (Musa balbisiana Colla) in China as revealed by AFLP markers. Genet Resour Crop Evol. 2006, 54: 1125-1132.View ArticleGoogle Scholar
- Englberger L, Schierle J, Aalbersberg W, Hofmann P, Humphries J, Huang A, Lorens A, Levendusky AMY, Daniells J, Marks GC, Fitzgerald MH: Carotenoid and vitamin content of Karat and other Micronesian banana cultivars. Int J Food Sci Nutri. 2006, 57: 399-418. 10.1080/09637480600872010.View ArticleGoogle Scholar
- Davey MW, Van den Bergh I, Markham R, Swennen R, Keulemans J: Genetic variability in Musa fruit provitamin A carotenoids, lutein and mineral micronutrient contents. Food Chem. 2009, 115: 806-813. 10.1016/j.foodchem.2008.12.088.View ArticleGoogle Scholar
- McKey D, Elias M, Pujol B, Duputié A: The evolutionary ecology of clonally propagated domesticated plants. New Phytol. 2010, 186: 318-332. 10.1111/j.1469-8137.2010.03210.x.View ArticlePubMedGoogle Scholar
- Simmonds N, Sheppard K: The taxonomy and origins of the cultivated bananas. Bot J Linn Soc. 1955, 55: 302-312. 10.1111/j.1095-8339.1955.tb00015.x.View ArticleGoogle Scholar
- Heslop-Harrison JS, Schwarzacher T: Domestication, genomics and the future for banana. Ann Bot. 2007, 100: 1073-1084. 10.1093/aob/mcm191.PubMed CentralView ArticlePubMedGoogle Scholar
- Jeridi M, Bakry F, Escoute J, Fondi E, Carreel F, Ferchichi A, D’Hont A, Rodier-Goud M: Homoeologous chromosome pairing between the A and B genomes of Musa spp. revealed by genomic in situ hybridization. Ann Bot. 2011, 108: 975-981. 10.1093/aob/mcr207.PubMed CentralView ArticlePubMedGoogle Scholar
- D’Hont A, Denoeud F, Aury J-M, Baurens F-C, Carreel F, Garsmeur O, Noel B, Bocs S, Droc G, Rouard M, Da Silva C, Jabbari K, Cardi C, Poulain J, Souquet M, Labadie K, Jourda C, Lengellé J, Rodier-Goud M, Alberti A, Bernard M, Correa M, Ayyampalayam S, Mckain MR, Leebens-Mack J, Burgess D, Freeling M, Mbéguié-A-Mbéguié D, Chabannes M, Wicker T: The banana (Musa acuminata) genome and the evolution of monocotyledonous plants. Nature. 2012, 488: 213-217. 10.1038/nature11241.View ArticlePubMedGoogle Scholar
- Gaeta RT, Chris Pires J: Homoeologous recombination in allopolyploids: the polyploid ratchet. New Phytol. 2010, 186: 18-28. 10.1111/j.1469-8137.2009.03089.x.View ArticlePubMedGoogle Scholar
- Perrier X: Combining biological approaches to shed light on the evolution of edible bananas. Ethnobotany Res Appl. 2009, 7: 199-216.Google Scholar
- De Langhe E, Hribová E, Carpentier S, Dolezel J, Swennen R: Did backcrossing contribute to the origin of hybrid edible bananas?. Ann Bot. 2010, 106: 849-857. 10.1093/aob/mcq187.PubMed CentralView ArticlePubMedGoogle Scholar
- Henry IM, Carpentier SC, Pampurova S, Van Hoylandt A, Panis B, Swennen R, Remy S: Structure and regulation of the Asr gene family in banana. Planta. 2011, 234: 785-798. 10.1007/s00425-011-1421-0.PubMed CentralView ArticlePubMedGoogle Scholar
- Molina AB, Williams RC, Hermanto C, Suwanda M, Komolong B, Kokoa P: Mitigating the threat of banana Fusarium wilt: understanding the agroecological distribution of pathogenic forms and developing disease management strategies. Australia: ACIAR, GPO Box 1571, Canberra ACT 2601, Volume Report Number FR 2010–12, 978 1 921738 17 3Google Scholar
- Marin D, Romero R, Guzmán M, Sutton T: Black Sigatoka: an increasing threat to banana cultivation. Plant disease. 2003, 87: 208-221. 10.1094/PDIS.2003.87.3.208.View ArticleGoogle Scholar
- De Bellaire LDL, Fouré E, Abadie C, Carlier J: Black leaf streak disease is challenging the banana industry. Fruits. 2010, 65: 327-342. 10.1051/fruits/2010034.View ArticleGoogle Scholar
- Davey MW, Van Den BI, Roux N: Vitamin A biofortification in Musa: status, bottlenecks and prospects. Acta Horticult. 2011, 897: 169-178.View ArticleGoogle Scholar
- Bakry F, Carreel F, Jenny C, Horry J: Genetic improvement of banana. Breeding plantation tree crops: tropical species. Edited by: Priyadarshan P, Jain SM. 2009, New York: Springer, 3-50.View ArticleGoogle Scholar
- Vanhove A-C, Vermaelen W, Panis B, Swennen R, Carpentier SC: Screening the banana biodiversity for drought tolerance: can an in vitro growth model and proteomics be used as a tool to discover tolerant varieties and understand homeostasis. Frontiers in plant science. 2012, 3: 176-PubMed CentralView ArticlePubMedGoogle Scholar
- Baurens F, Bocs S, Rouard M, Matsumoto T, Miller RNG, Rodier-goud M, Mbéguié-a-mbéguié D, Yahiaoui N: Mechanisms of haplotype divergence at the gene locus in wild banana (Musa balbisiana). BMC Plant Biol. 2010, 10: 149-10.1186/1471-2229-10-149. 10.1186/1471-2229-10-149PubMed CentralView ArticlePubMedGoogle Scholar
- Liu H-Y, Dai J-R, Feng D-R, Liu B, Wang H-B, Wang J-F: Characterization of a novel plantain Asr gene, MpAsr, that is regulated in response to infection of Fusarium oxysporum f. sp. cubense and abiotic stresses. J Integr Plant Biol. 2010, 52: 315-323. 10.1111/j.1744-7909.2010.00912.x.View ArticlePubMedGoogle Scholar
- Thomas DS, Turner DW, Eamus D: Independent effects of the environment on the leaf gas exchange of three banana (Musa sp.) cultivars of different genomic constitution. Sci Hortic. 1998, 75: 41-57. 10.1016/S0304-4238(98)00114-9.View ArticleGoogle Scholar
- Tripathi L, Tripathi J: Relative susceptibility of banana cultivars to Xanthomonas campestris pv. musacearum. Afr J Biotechnol. 2009, 8: 5343-5350.Google Scholar
- Tripathi L, Odipio J, Tripathi JN, Tusiime G: A rapid technique for screening banana cultivars for resistance to Xanthomonas wilt. Eur J Plant Pathol. 2007, 121: 9-19.View ArticleGoogle Scholar
- Jalil M, Khalid N, Othman RY: Plant regeneration from embryogenic suspension cultures of Musa acuminata cv. Mas (AA). Plant Cell Tiss Org Cult. 2003, 75: 209-214. 10.1023/A:1025814922547.View ArticleGoogle Scholar
- Michiels A, Van Den EW, Tucker M, Van RL, Van LA: Extraction of high-quality genomic DNA from latex-containing plants. Anal Biochem. 2003, 315: 85-89. 10.1016/S0003-2697(02)00665-6.View ArticlePubMedGoogle Scholar
- Tattersall EAR, Ergul A, Alkayal F, Deluc L, Cushman JC, Cramer GR: Comparison of methods for isolating high-quality RNA from leaves of grapevine. Am J Enol Vitic. 2005, 4: 400-407.Google Scholar
- Davey M, Graham N, Vanholme B, Swennen R, May S, Keulemans J: Heterologous oligonucleotide microarrays for transcriptomics in a non-model species; a proof-of-concept study of drought stress in Musa. BMC Genomics. 2009, 10: 436-10.1186/1471-2164-10-436.PubMed CentralView ArticlePubMedGoogle Scholar
- Kiefer E, Heller W, Ernst D: A simple and efficient protocol for isolation of functional RNA from plant tissues rich in secondary metabolites. Plant Mol Biol Report. 2000, 18: 33-39. 10.1007/BF02825291.View ArticleGoogle Scholar
- Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M: Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics (Oxford, England). 2005, 21: 3674-3676. 10.1093/bioinformatics/bti610.View ArticleGoogle Scholar
- Götz S, García-Gómez JM, Terol J, Williams TD, Nagaraj SH, Nueda MJ, Robles M, Talón M, Dopazo J, Conesa A: High-throughput functional annotation and data mining with the Blast2GO suite. Nucleic acids res. 2008, 36: 3420-3435. 10.1093/nar/gkn176.PubMed CentralView ArticlePubMedGoogle Scholar
- Hribová E, Neumann P, Matsumoto T, Roux N, Macas J, Dolezel J: Repetitive part of the banana (Musa acuminata) genome investigated by low-depth 454 sequencing. BMC plant biol. 2010, 10: 204-10.1186/1471-2229-10-204.PubMed CentralView ArticlePubMedGoogle Scholar
- RepeatMasker Open-3.0. 1996,http://www.repeatmasker.org, –2010,
- Benson G: Tandem repeats finder: a program to analyze DNA sequences. Nucleic acids. 1999, 27: 573-580. 10.1093/nar/27.2.573.View ArticleGoogle Scholar
- Sobreira TJP, Durham AM, Gruber A: TRAP: automated classification, quantification and annotation of tandemly repeated sequences. Bioinformatics (Oxford, England). 2006, 22: 361-362. 10.1093/bioinformatics/bti809.View ArticleGoogle Scholar
- Friedländer MR, Mackowiak SD, Li N, Chen W, Rajewsky N: miRDeep2 accurately identifies known and hundreds of novel microRNA genes in seven animal clades. Nucleic acids research. 2012, 40: 37-52. 10.1093/nar/gkr688.PubMed CentralView ArticlePubMedGoogle Scholar
- Meyers BC, Axtell MJ, Bartel B, Bartel DP, Baulcombe D, Bowman JL, Cao X, Carrington JC, Chen X, Green PJ, Griffiths-Jones S, Jacobsen SE, Mallory AC, Martienssen RA, Poethig RS, Qi Y, Vaucheret H, Voinnet O, Watanabe Y, Weigel D, Zhu J-K: Criteria for annotation of plant MicroRNAs. The Plant cell. 2008, 20: 3186-3190. 10.1105/tpc.108.064311.PubMed CentralView ArticlePubMedGoogle Scholar
- Griffiths-Jones S, Moxon S, Marshall M, Khanna A, Eddy SR, Bateman A: Rfam: annotating non-coding RNAs in complete genomes. Nucleic acids research. 2005, 33: D121-D124.PubMed CentralView ArticlePubMedGoogle Scholar
- Langmead B, Trapnell C, Pop M, Salzberg SL: Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009, 10: R25-10.1186/gb-2009-10-3-r25.PubMed CentralView ArticlePubMedGoogle Scholar
- Hofacker IL: Vienna RNA secondary structure server. Nucleic Acids Res. 2003, 31: 3429-3431. 10.1093/nar/gkg599.PubMed CentralView ArticlePubMedGoogle Scholar
- Bonnet E, Wuyts J, Rouzé P, Van de Peer Y: Evidence that microRNA precursors, unlike other non-coding RNAs, have lower folding free energies than random sequences. Bioinformatics (Oxford, England). 2004, 20: 2911-2917. 10.1093/bioinformatics/bth374.View ArticleGoogle Scholar
- Altschul S, Gish W, Miller W: Basic local alignment search tool. J Mol Biol. 1990, 215: 403-410.View ArticlePubMedGoogle Scholar
- Kozomara A, Griffiths-Jones S: miRBase: integrating microRNA annotation and deep-sequencing data. Nucleic Acids Res. 2011, 39: D152-D157. 10.1093/nar/gkq1027.PubMed CentralView ArticlePubMedGoogle Scholar
- Dai X, Zhao PX: psRNATarget: a plant small RNA target analysis server. Nucleic Acids Res. 2011, 39: W155-W159. 10.1093/nar/gkr319.PubMed CentralView ArticlePubMedGoogle Scholar
- Cížková J, Hřibová E, Humplíková L, Christelová P, Suchánková P, Doležel J: Molecular analysis and genomic organization of major DNA satellites in banana (Musa spp.). PloS one. 2013, 8: e54808-10.1371/journal.pone.0054808.PubMed CentralView ArticlePubMedGoogle Scholar
- Bartoš J, Alkhimova O, Doleželová M, De Langhe E, Doležel J: Nuclear genome size and genomic distribution of ribosomal DNA in Musa and Ensete(Musaceae): taxonomic implications. Cytogenet Genome Res. 2005, 109: 50-57. 10.1159/000082381.View ArticlePubMedGoogle Scholar
- Lysak M, Dolez M, Horry J: Flow cytometric analysis of nuclear DNA content in Musa. Theor Appl Genet. 1999, 98: 1344-1350. 10.1007/s001220051201.View ArticleGoogle Scholar
- Lescot M, Ciampi A, Ruiz M: Fresh insights into the Musa genome and its comparison with rice. Journées Ouvertes 2005. Meeting: Journées Ouvertes Biologie, Informatique et Mathématiques (JOBIM), Lyon (FRA), 06 – 08 July 2005. 2005, Montpellier (FRA): CIRAD, 10-15.Google Scholar
- Lescot M, Piffanelli P, Ciampi A: Insights into the Musa genome: syntenic relationships to rice and between Musa species. BMC genomics. 2008, 20: 58-78.View ArticleGoogle Scholar
- Ching A, Caldwell KS, Jung M, Dolan M, Smith OS, Tingey S, Morgante M, Rafalski AJ: SNP frequency, haplotype structure and linkage disequilibrium in elite maize inbred lines. BMC genetics. 2002, 3: 19-10.1186/1471-2156-3-19.PubMed CentralView ArticlePubMedGoogle Scholar
- Feltus FA, Wan J, Schulze SR, Estill JC, Jiang N, Paterson AH: An SNP resource for rice genetics and breeding based on subspecies indica and japonica genome alignments. Genome Res. 2004, 14: 1812-1819. 10.1101/gr.2479404.PubMed CentralView ArticlePubMedGoogle Scholar
- Mizrachi E, Hefer CA, Ranik M, Joubert F, Myburg AA: De novo assembled expressed gene catalog of a fast-growing Eucalyptus tree produced by Illumina mRNA-Seq. BMC genomics. 2010, 11: 681-10.1186/1471-2164-11-681.PubMed CentralView ArticlePubMedGoogle Scholar
- Quang ND, Ikeda S, Harada K: Nucleotide variation in Quercus crispula Blume. Heredity. 2008, 101: 166-174. 10.1038/hdy.2008.42.View ArticlePubMedGoogle Scholar
- Ingvarsson PK: Nucleotide polymorphism and linkage disequilibrium within and among natural populations of European aspen (Populus tremula L., Salicaceae). Genetics. 2005, 169: 945-953. 10.1534/genetics.104.034959.PubMed CentralView ArticlePubMedGoogle Scholar
- Amorim EP, Silva PH, Ferreira CF, Amorim VBO, Santos VJ, Vilarinhos AD, Santos CMR, Souza Júnior MT, Miller RNG: New microsatellite markers for bananas (Musa spp). Genet Mol Res: GMR. 2012, 11: 1093-1098. 10.4238/2012.April.27.8.View ArticlePubMedGoogle Scholar
- Cheung F, Town CD: A BAC end view of the Musa acuminata genome. BMC Plant Biol. 2007, 7: 1-7. 10.1186/1471-2229-7-1.View ArticleGoogle Scholar
- Hippolyte I, Bakry F, Seguin M, Gardes L, Rivallan R, Risterucci A-M, Jenny C, Perrier X, Carreel F, Argout X, Piffanelli P, Khan IA, Miller RNG, Pappas GJ, Mbéguié-A-Mbéguié D, Matsumoto T, De Bernardinis V, Huttner E, Kilian A, Baurens F-C, D’Hont A, Cote F, Courtois B, Glaszmann J-C: A saturated SSR/DArT linkage map of Musa acuminata addressing genome rearrangements among bananas. BMC plant biology. 2010, 10: 65-10.1186/1471-2229-10-65.PubMed CentralView ArticlePubMedGoogle Scholar
- Arango RE, Togawa RC, Carpentier SC, Roux N, Hekkert BL, Kema GHJ, Souza MT: Genome-wide BAC-end sequencing of Musa acuminata DH Pahang reveals further insights into the genome organization of banana. Tree Genetics & Genomes. 2011, 7: 933-940. 10.1007/s11295-011-0385-3.View ArticleGoogle Scholar
- Victoria FC, Da Maia LC, De Oliveira AC: In silico comparative analysis of SSR markers in plants. BMC plant biology. 2011, 11: 15-10.1186/1471-2229-11-15.PubMed CentralView ArticlePubMedGoogle Scholar
- Yao YY, Guo GG, Ni ZF, Sunkar R, Du JK, Zhu JK, Sun QX: Cloning and characterization of microRNAs from wheat (Triticum aestivum L.). Genome Biol. 2007, 8: R96-10.1186/gb-2007-8-6-r96.PubMed CentralView ArticlePubMedGoogle Scholar
- Unver T, Bakar M, Shearman RC, Budak H: Genome-wide profiling and analysis of Festuca arundinacea miRNAs and transcriptomes in response to foliar glyphosate application. J Mol Genet Genomics. 2010, 283: 397-413. 10.1007/s00438-010-0526-7.View ArticleGoogle Scholar
- Borges F, Pereira PA, Slotkin RK, Martienssen RA, Becker JD: MicroRNA activity in the Arabidopsis male germline. J Exp Bot. 2011, 62: 1611-1620. 10.1093/jxb/erq452.View ArticlePubMedGoogle Scholar
- Zhang Z, Yu J, Li D, Zhang Z, Liu F, Zhou X, Wang T, Ling Y, Su Z: PMRD: plant microRNA database. Nucleic Acids Res. 2010, 38: D806-D813. 10.1093/nar/gkp818.PubMed CentralView ArticlePubMedGoogle Scholar
- Nozawa M, Miura S, Nei M: Origins and evolution of microRNA genes in plant species. Genome Biol Evol. 2012, 4: 230-239. 10.1093/gbe/evs002.PubMed CentralView ArticlePubMedGoogle Scholar
- Mulekar JJ, Bu Q, Chen F, Huq E: Casein kinase II α subunits affect multiple developmental and stress-responsive pathways in Arabidopsis. Plant J. 2012, 69: 343-354. 10.1111/j.1365-313X.2011.04794.x.View ArticlePubMedGoogle Scholar
- Jin-long G, Li-ping X, Jing-ping F, Ya-chun S, Hua-ying F, You-xiong Q, Jing-sheng X: A novel dirigent protein gene with highly stem-specific expression from sugarcane, response to drought, salt and oxidative stresses. J Plant Cell Reports. 2012, 31: 1801-1812. 10.1007/s00299-012-1293-1.View ArticleGoogle Scholar
- Diener AC, Gaxiola RA, Fink GR: Arabidopsis ALF5, a multidrug efflux transporter gene family member, confers resistance to toxins. Plant cell. 2001, 13: 1625-1638.PubMed CentralView ArticlePubMedGoogle Scholar
- Pineau C, Loubet S, Lefoulon C, Chalies C, Fizames C, Lacombe B, Ferrand M, Loudet O, Berthomieu P, Richard O: Natural variation at the FRD3 MATE transporter locus reveals cross-talk between Fe homeostasis and Zn tolerance in Arabidopsis thaliana. PLoS genetics. 2012, 8: e1003120-10.1371/journal.pgen.1003120.PubMed CentralView ArticlePubMedGoogle Scholar
- Magalhaes JV: How a microbial drug transporter became essential for crop cultivation on acid soils: aluminium tolerance conferred by the multidrug and toxic compound extrusion (MATE) family. Ann bot. 2010, 106: 199-203. 10.1093/aob/mcq115.PubMed CentralView ArticlePubMedGoogle Scholar
- Wilson PB, Estavillo GM, Field KJ, Pornsiriwong W, Carroll AJ, Howell KA, Woo NS, Lake JA, Smith SM, Harvey Millar A, Von Caemmerer S, Pogson BJ: The nucleotidase/phosphatase SAL1 is a negative regulator of drought tolerance in Arabidopsis. Plant J: for cell and molecular biology. 2009, 58: 299-317. 10.1111/j.1365-313X.2008.03780.x.View ArticleGoogle Scholar
- Safár J, Noa-Carrazana JC, Vrána J, Bartos J, Alkhimova O, Sabau X, Simková H, Lheureux F, Caruana ML, Dolezel J, Piffanelli P: Creation of a BAC resource to study the structure and evolution of the banana (Musa balbisiana) genome. Genome. 2004, 47: 1182-1191. 10.1139/g04-062.View ArticlePubMedGoogle Scholar
- Tripathi L, Odipio J, Tripathi J, Tusiime G: A rapid technique for screening banana cultivars for resistance to Xanthomonas wilt. Eur J Plant Pathol. 2008, 121: 9-19. 10.1007/s10658-007-9235-4.View ArticleGoogle Scholar
- Perrier X, De Langhe E, Donohue M, Lentfer C, Vrydaghs L, Bakry F, Carreel F, Hippolyte I, Horry J-P, Jenny C, Lebot V, Risterucci A-M, Tomekpe K, Doutrelepont H, Ball T, Manwaring J, De Maret P, Denham T: Multidisciplinary perspectives on banana (Musa spp.) domestication. Proc Natl Acad Sci U S A. 2011, 108: 11311-11318. 10.1073/pnas.1102001108.PubMed CentralView ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an open access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.