The Rhododendron Plant Genome Database (RPGD): a comprehensive online omics database for Rhododendron
BMC Genomics volume 22, Article number: 376 (2021)
The genus Rhododendron L. has been widely cultivated for hundreds of years around the world. Members of this genus are known for great ornamental and medicinal value. Owing to advances in sequencing technology, genomes and transcriptomes of members of the Rhododendron genus have been sequenced and published by various laboratories. With increasing amounts of omics data available, a centralized platform is necessary for effective storage, analysis, and integration of these large-scale datasets to ensure consistency, independence, and maintainability.
Here, we report our development of the Rhododendron Plant Genome Database (RPGD; http://bioinfor.kib.ac.cn/RPGD/), which represents the first comprehensive database of Rhododendron genomics information. It includes large amounts of omics data, including genome sequence assemblies for R. delavayi, R. williamsianum, and R. simsii, gene expression profiles derived from public RNA-Seq data, functional annotations, gene families, transcription factor identification, gene homology, simple sequence repeats, and chloroplast genome. Additionally, many useful tools, including BLAST, JBrowse, Orthologous Groups, Genome Synteny Browser, Flanking Sequence Finder, Expression Heatmap, and Batch Download were integrated into the platform.
RPGD is designed to be a comprehensive and helpful platform for all Rhododendron researchers. We believe that RPGD will be an indispensable hub for Rhododendron studies.
Rhododendron L. is the largest genus in the Ericaceae and the largest genus of woody angiosperms in China. The genus is widely distributed throughout the Northern Hemisphere, from tropical Southeast Asia to northeastern Australia. There are more than 1000 species of Rhododendron worldwide, approximately 600 of which, encompassing nine subgenera, are found in China [3, 4]. Southwestern China and the eastern Himalayas are considered centers of Rhododendron diversification and differentiation. Rhododendrons are considered to have great ornamental and medicinal value [6, 7].
Horticultural interest in Rhododendron can be traced back at least several centuries, owing in part to their bright coloring and elegant posture [8, 9]. In China, its introduction and cultivation were first documented in poetry from the Tang dynasty, and rhododendrons have long been regarded as one of the ten traditional national ornamental flowers. The breeding history began with gardening enthusiasts in Western countries in the late eighteenth century. Currently, there are over 28,000 cultivars of Rhododendron, which are widely cultivated in many regions such as Asia, America, and Europe. Most wild rhododendrons are found in regions with temperate climates, high rainfall, humid atmosphere, and organic acid soils with low nutrient composition. Furthermore, most varieties are derived through crossbreeding by gardening enthusiasts according to their preference for ornamental traits. In general, breeding goals have focused mostly on ornamental characteristics rather than adaptability and resistance, resulting in a disconnect between existing varieties and market demands. Therefore, a challenge for Rhododendron breeding is the development of varieties capable of adapting to environments with cold winters, hot summers, lower rainfall and humidity, and less optimal soils.
Additionally, the genus Rhododendron has a long history in traditional medicine. Phytochemists have demonstrated interest in Rhododendron species due to their abundance of secondary metabolites. Currently, approximately 200 compounds, mostly flavonoids and diterpenoids, have been isolated from Rhododendron. Some of the isolates have demonstrated intriguing bioactivity [14, 15]. For example, diterpenoids isolated from the flowers, roots, and fruits of R. molle exhibit significant anticancer, antiviral, antinociceptive, immunomodulatory, and sodium channel antagonistic activities.
With the rapid development of sequencing and genomic editing technology, molecular design breeding has become a more efficient and accurate plant breeding method. Elucidation of the genetic mechanisms associated with ornamental traits (flower color, flower shape, etc.), adaptability, resistance, secondary metabolism, etc. will be a helpful and necessary foundation for more practical Rhododendron breeding. A great deal of omics data concerning Rhododendron have been accumulated to date and several rhododendron genomes have been sequenced. The R. delavayi genome sequence was released in 2017, R. williamsianum in 2019, and R. simsii in 2020. In addition, relevant transcriptomic data have also been published in recent years [20,21,22,23,24]. Progress in the development of high-throughput sequencing technology has greatly accelerated studies on Rhododendron [17,18,19,20,21,22,23,24]. These large genomic data sets provide a new perspective for understanding biological traits such as ornamentation, adaptability, resistance, and secondary metabolism for breeders and phytochemists alike.
Rhododendron omics data sets are currently distributed in public databases that are easily accessible [25, 26]. However, processing these data is a considerable challenge for research groups with limited bioinformatics experience. To address this problem, we have constructed a comprehensive database for data storage, categorization, online analysis, and visualization of Rhododendron omics data sets.
Here, we present the Rhododendron Plant Genome Database (RPGD; http://bioinfor.kib.ac.cn/RPGD/), a data center for Rhododendron functional genomics researchers. The database integrates the three released genome sequences, expression profiles, functional annotations, gene family ontologies, simple sequence repeats, chloroplast genome assemblies, and gene homology information. We have also incorporated bioinformatics tools such as BLAST, JBrowse, Flanking Sequence Finder, Genome Synteny Browser, Ortholog Gene Finder, Expression Heatmap, and Batch Download into the user interface. The interface is designed to be simple and user-friendly. We suggest that RPGD will be of great convenience as a “one-stop shop” to a wide range of Rhododendron researchers.
Construction and content
Currently, three reference genome sequences of Rhododendron (R. delavayi, R. williamsianum and R. simsii) are hosted in RPGD (Table 1). The genome sizes are 695 Mb, 532 Mb and 529 Mb, and the scaffold N50 values are 637.83 kb, 218.8 kb and 36.3 Mb, respectively [17,18,19]. The genome of R. simsii was sequenced using PacBio long-read sequencing technology, whereas those of R. delavayi and R. williamsianum were based on next-generation sequencing [17, 18]. We downloaded the genome assembly, general feature format (GFF3), coding sequence (CDS), and protein sequence (PEP) files of R. delavayi (http://gigadb.org/dataset/100331) from the GigaScience database [17, 26], and those of R. williamsianum (https://www.ncbi.nlm.nih.gov/assembly/GCA_009746105.1) and R. simsii (https://www.ncbi.nlm.nih.gov/assembly/GCA_014282245.1) from NCBI [18, 19, 25].
We obtained all publicly available Rhododendron RNA-Seq datasets from the NCBI Sequence Read Archive (SRA) database, comprising two projects and 19 samples. One transcriptomics project was related to drought stress (4 samples), while the other was related to flower buds in different dormancy statuses (15 samples) (Table 1). Both projects focused on R. delavayi.
We processed and analyzed the RNA-Seq datasets using a standard pipeline. First, we used the SRA Toolkit to convert the data to FASTQ format, and low-quality reads were removed from the raw reads with Trimmomatic. We then used TopHat2 to map all clean reads onto the R. delavayi reference genome with default parameters; the mapped reads were assembled with Cufflinks (version 2.2.1) using the reference genome as a guide. Combined transcriptome assemblies were generated using Cuffmerge. Based on the alignments, the read counts of each gene were calculated and normalized to fragments per kilobase of transcript per million mapped fragments (FPKM) values in Cuffdiff. Means and standard errors of the FPKM values were derived for the biological replicates.
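The normalization and replicate summary in the last two sentences can be illustrated with a minimal Python sketch. It uses the textbook FPKM definition and the standard error of the mean; it is not Cuffdiff's internal estimation procedure, and the function names and example numbers are hypothetical.

```python
from math import sqrt
from statistics import mean, stdev


def fpkm(fragment_count, transcript_length_bp, total_mapped_fragments):
    """Textbook FPKM: fragments per kilobase of transcript per million mapped fragments."""
    return fragment_count * 1e9 / (transcript_length_bp * total_mapped_fragments)


def mean_and_standard_error(values):
    """Return (mean, standard error of the mean) for replicate measurements."""
    if len(values) < 2:
        raise ValueError("at least two replicates are needed for a standard error")
    return mean(values), stdev(values) / sqrt(len(values))


# Hypothetical gene: 500 fragments on a 2,000 bp transcript, 10 million mapped fragments.
example_fpkm = fpkm(500, 2000, 10_000_000)

# Hypothetical FPKM values from three biological replicates.
replicate_mean, replicate_se = mean_and_standard_error([10.0, 12.0, 14.0])
```

Cuffdiff additionally models fragment assignment uncertainty across isoforms, so values from the real pipeline will generally differ from this simple ratio.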
Gene model and function annotation
A total of 89,496 protein-coding genes were collected from the genomic data described above, including 32,938 from R. delavayi, 23,559 from R. williamsianum, and 32,999 from R. simsii. The protein-coding genes were annotated as follows. First, they were annotated using two software packages, eggNOG-mapper [31, 32] and InterProScan, with default parameters. The results from the two tools were then combined, and redundant annotations were removed using in-house scripts to obtain complete and precise GO annotations. The protein sequences were aligned against the NCBI non-redundant (nr), UniProt (Swiss-Prot and TrEMBL), and Arabidopsis protein (TAIR) databases using the BLASTP command of DIAMOND with an E-value cutoff of 1e−5. The BLASTP results against the UniProt and TAIR databases were then fed to the AHRD program (https://github.com/groupschoof/AHRD) to obtain concise, precise, and informative gene function descriptions. All BLASTP results are shown on the detailed gene page. All of these protein sequences were further compared against the InterPro database using InterProScan to identify functional domains.
As a result, the R. delavayi genes received 805,276 annotations from the GO database and 77,221 from InterPro. The R. williamsianum genes received 687,600 annotations from GO and 60,834 from InterPro. The R. simsii genes received 785,704 annotations from GO and 81,654 from InterPro (Table 1).
These genes were used as a “data hub” to link all data types (Fig. 1), including gene summary information (species, gene ID, location, description, InterPro and gene family) (Fig. 1a), expression profiles (Fig. 1b), JBrowse gene visualization (Fig. 1c), gene exon/CDS information (Fig. 1d), GO annotation (Fig. 1e), genomic synteny blocks (Fig. 1f), homologous genes and BLASTP results against the nr-NCBI, UniProt and TAIR databases (Fig. 1g), and gene/mRNA/CDS/protein sequences (Fig. 1h). All of this information is shown on a single integrated interface so that users can browse it conveniently.
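The step in which GO annotations from eggNOG-mapper and InterProScan are combined and de-duplicated could look roughly like the following Python sketch. The authors' own scripts are not published in the text, so this is only one plausible approach; the input layout (one dictionary of gene ID to GO terms per tool) and the gene IDs are assumptions made for illustration.

```python
from collections import defaultdict


def merge_go_annotations(*sources):
    """Union the GO terms assigned to each gene by several tools, dropping duplicates."""
    merged = defaultdict(set)
    for source in sources:
        for gene_id, go_terms in source.items():
            merged[gene_id].update(go_terms)
    # Sorted lists give a stable, reproducible output order.
    return {gene_id: sorted(terms) for gene_id, terms in merged.items()}


# Hypothetical per-tool results (gene IDs are placeholders, not RPGD identifiers).
eggnog_mapper_go = {"gene1": ["GO:0003677", "GO:0006355"], "gene2": ["GO:0005515"]}
interproscan_go = {"gene1": ["GO:0003677"], "gene3": ["GO:0016020"]}

combined = merge_go_annotations(eggnog_mapper_go, interproscan_go)
```

Using a set per gene removes terms reported by both tools, which is the simplest reading of "redundant annotations were removed"; removing terms implied by more specific descendants in the GO hierarchy would need the ontology itself.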
Transcription factors and transcriptional regulators
The iTAK package was used with default parameters to identify transcription factors (TFs) and transcriptional regulators (TRs) in the three Rhododendron genomes and to classify all candidates into gene families. As a result, 1662 TFs and 442 TRs were identified in R. delavayi, 1261 TFs and 361 TRs in R. williamsianum, and 1740 TFs and 416 TRs in R. simsii (Table 1).
OrthoFinder [36, 37] was employed with default parameters to identify orthologous and paralogous genes among R. delavayi, R. williamsianum, R. simsii, Actinidia chinensis, Camellia sinensis and Arabidopsis thaliana. In total, 18,048 orthologous groups were identified. To ensure that the inference of orthologous genes was sufficiently accurate, we extracted 985 groups of single-copy orthologs to construct the “Orthologous Groups” module (Table 1). We also used OrthoFinder to search for pairwise homologous genes between each of the three Rhododendron genomes and A. thaliana [36, 37]. We considered the genes of each orthologous group as belonging to one gene family and mapped gene family information from A. thaliana to R. delavayi (4168 gene families), R. williamsianum (3546 gene families), and R. simsii (3742 gene families).
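Selecting single-copy orthologous groups, as described above, amounts to keeping only the groups that contain exactly one gene from every species. The Python sketch below shows that filter on a small in-memory structure; the dictionary layout, species labels and gene names are hypothetical and do not reproduce OrthoFinder's output files.

```python
def single_copy_orthogroups(orthogroups, species):
    """Keep orthogroups that contain exactly one gene from each listed species."""
    selected = {}
    for group_id, members in orthogroups.items():
        if all(len(members.get(sp, [])) == 1 for sp in species):
            selected[group_id] = {sp: members[sp][0] for sp in species}
    return selected


# Hypothetical orthogroups: group ID -> species -> list of gene IDs.
example_groups = {
    "OG1": {"sp1": ["a1"], "sp2": ["b1"]},        # single copy in both species
    "OG2": {"sp1": ["a2", "a3"], "sp2": ["b2"]},  # duplicated in sp1
    "OG3": {"sp1": ["a4"]},                       # missing from sp2
}

single_copy = single_copy_orthogroups(example_groups, ["sp1", "sp2"])
```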
Simple sequence repeats
Simple sequence repeats (SSRs) were identified in R. delavayi, R. williamsianum and R. simsii using MISA with default parameters; the total numbers were 361,268, 230,013, and 358,705, respectively (Table 1). We also used Primer3 with default parameters to design primers for the SSRs, and the primers are displayed on the SSR detail page.
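To make the idea of SSR detection concrete, the following Python sketch finds perfect tandem repeats of 1 to 6 bp motifs with regular expressions. It is not MISA and will not reproduce MISA's counts; the minimum repeat thresholds are illustrative assumptions rather than values taken from the paper, and the test sequence is made up.

```python
import re

# Illustrative minimum repeat counts per motif length (assumed, not from the paper).
MIN_REPEATS = {1: 10, 2: 6, 3: 5, 4: 5, 5: 5, 6: 5}


def find_ssrs(sequence, min_repeats=MIN_REPEATS):
    """Return perfect SSRs as dicts with motif, repeat count and 1-based coordinates."""
    sequence = sequence.upper()
    hits = []
    for unit_len, min_rep in sorted(min_repeats.items()):
        # Group 2 is the motif; \2{n,} requires at least n further copies of it.
        pattern = re.compile(r"(([ACGT]{%d})\2{%d,})" % (unit_len, min_rep - 1))
        for match in pattern.finditer(sequence):
            motif = match.group(2)
            # Skip motifs that are repeats of a shorter unit (e.g. "AA" reported as a dimer).
            if any(
                unit_len % k == 0 and motif == motif[:k] * (unit_len // k)
                for k in range(1, unit_len)
            ):
                continue
            hits.append(
                {
                    "motif": motif,
                    "repeats": len(match.group(1)) // unit_len,
                    "start": match.start() + 1,
                    "end": match.end(),
                }
            )
    return hits


# Made-up sequence containing an (AT)7 dimer repeat and an (A)12 monomer repeat.
example_hits = find_ssrs("GGC" + "AT" * 7 + "CCG" + "A" * 12 + "T")
```

Each motif length is scanned independently, so compound or interrupted SSRs are not merged here as a dedicated tool might do.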
We also collected full-length chloroplast genomes of R. delavayi and R. pulchrum from the NCBI database [43,44,45]. RPGD hosts two complete chloroplast genome assemblies of R. delavayi. One of them is 193,798 bp in length, with 123 annotated genes, including 80 protein-coding genes, 35 tRNA genes, and 8 rRNA genes. The other is 202,169 bp in length, with a total of 137 genes, including 88 protein-coding genes, 41 tRNA genes, and 8 rRNA genes. The chloroplast genome of R. pulchrum is 136,249 bp in length and contains 73 genes, comprising 42 protein-coding genes, 29 tRNA genes, and 2 rRNA genes (Table 1).
Syntenic relationships among R. delavayi, R. williamsianum and R. simsii
We identified syntenic blocks and homologous gene pairs in the three Rhododendron genomes. Protein sequences were first aligned against each other (pairwise comparisons) using BLASTP with an E-value cutoff of 1e−5. Based on the BLASTP results and gene positions, syntenic blocks were determined using MCScanX with default parameters. A total of 2913 syntenic blocks and 55,590 homologous genes were identified (Table 1), with details presented in the “Tools/Genome Synteny” module. Users should note that the draft status of the current genome assemblies and annotations might affect the inferred syntenic relationships; we will update the data when new versions become available.
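The E-value cutoff applied to the BLASTP results can be sketched in Python as a filter over tab-separated hits. The sketch assumes the standard 12-column tabular BLAST output, in which the E-value is the 11th column; the source does not state which output format was used, and the example lines are invented.

```python
def filter_blast_hits(lines, max_evalue=1e-5):
    """Keep (query, subject, evalue) for tabular BLAST hits at or below the cutoff."""
    kept = []
    for line in lines:
        if not line.strip() or line.startswith("#"):
            continue  # skip blank lines and comment lines
        fields = line.rstrip("\n").split("\t")
        query, subject, evalue = fields[0], fields[1], float(fields[10])
        if evalue <= max_evalue:
            kept.append((query, subject, evalue))
    return kept


# Invented hits: one strong match and one weak match that should be discarded.
example_lines = [
    "geneA\tgeneB\t95.0\t200\t5\t0\t1\t200\t1\t200\t1e-50\t350",
    "geneA\tgeneC\t40.0\t50\t30\t2\t1\t50\t1\t50\t0.01\t30",
]

strong_hits = filter_blast_hits(example_lines)
```

In practice the cutoff is usually passed to the aligner itself, so a post hoc filter like this is mainly useful when reusing an existing result file with a stricter threshold.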
Utility and discussion
Users can easily browse all data in RPGD on the “Browse” page, including genome statistics, gene models, gene function annotations, SSRs, genome syntenic blocks, gene expression profiles, gene families and transcription factor information for R. delavayi, R. williamsianum and R. simsii. This information is presented in tabular form on the web page using the Bootstrap-table plug-in. A detailed information page for a specific gene can be accessed by clicking the gene ID hyperlink; it displays the gene summary, exons, gene structure (in JBrowse), GO, family, expression, homology, and sequence information.
A series of search tools is available under the “Search” navigation menu, namely “Gene”, “Genome”, “Gene Ontology”, “Gene Family”, “Gene Expression”, “Transcription Factor”, “Chloroplast Genome” and “SSR”, to help users find data of interest more easily.
(i) “Search Gene”: RPGD provides five ways to search for genes: gene ID, AHRD description, InterPro, GO accession, and GO term. The response is a dynamic table that contains all genes associated with the entered search terms, and the list of those genes can be downloaded as a TXT file for further analysis. Additionally, the details of the genes can be viewed by clicking the gene ID hyperlink.
(ii) “Search Genome”: users can search for scaffold/chromosome information by scaffold/chromosome ID. The results are divided into a list, a table, and a chromosome viewer. The list shows basic information about the chromosome, including the species, chromosome ID, and chromosome length. The table displays information about all genes on the chromosome. The chromosome viewer is an embedded JBrowse instance that displays the chromosome profile.
(iii) “Search GO”: users can query the GO information of a gene by gene ID, GO accession, or GO term. The response is a set of genes annotated with the queried functions. Users can download the list of genes and click the gene ID hyperlink to review gene details.
(iv) “Search Family”: users can find genes by gene family name. The response is a list of genes belonging to that gene family. Users can also download the list of genes and click the gene ID hyperlink to view gene details.
(v) “Search Gene Expression”: users can enter gene IDs of interest to retrieve their expression patterns from the currently available transcriptomics results. The output is a line chart of the expression levels, which can be downloaded for further analysis.
(vi) “Search Transcription Factor”: users can search for transcription factor genes by clicking transcription factor names. The response is a list of genes annotated as transcription factors. Users can also download the list of genes and click the gene ID hyperlink to view gene details.
(vii) “Search Chloroplast Genome”: users can search for chloroplast genes by gene or product name. The response is a list of detailed information matching the entered keywords. The returned list also contains hyperlinks that allow users to view the details of each chloroplast gene at NCBI.
(viii) “Search SSR”: users can query SSRs by location, type (monomer to hexamer) and motif to obtain detailed information, including SSR ID, type, motif, size, and location. Users can click the SSR ID hyperlink to view SSR primer information.
On every search page, examples that can be clicked to autofill the search keywords are displayed below each search field.
BLAST is a sequence similarity searching program frequently used for bioinformatics queries. ViroBLAST, a useful and user-friendly tool for online data analysis, was integrated into RPGD (Fig. 2a). Users can input their sequence of interest or upload their sequence files to perform BLASTN, BLASTP, BLASTX, tBLASTN, and tBLASTX against a whole genome, CDS, or peptide library.
Flanking sequence finder
The flanking sequences of genes often contain a wealth of information including regulatory elements and promoters. To aid in research of flanking sequences, we utilized gene annotations and genome data to develop a useful tool - “Flanking Sequence Finder”. Researchers can find and download flanking sequences by inputting gene ID and specifying the length of the desired flanking sequences.
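How flanking sequences can be extracted from gene coordinates is sketched below in Python. This is an illustration of the general technique, not RPGD's implementation: the function names are hypothetical, coordinates are assumed to be 1-based and inclusive as in GFF3, and "upstream" is taken relative to the gene's strand.

```python
_COMPLEMENT = str.maketrans("ACGTNacgtn", "TGCANtgcan")


def reverse_complement(seq):
    """Reverse-complement a DNA string."""
    return seq.translate(_COMPLEMENT)[::-1]


def flanking_sequences(chrom_seq, start, end, strand, flank_len):
    """Return upstream and downstream flanks of a gene, oriented by its strand.

    start and end are 1-based inclusive coordinates on the forward strand.
    Flanks are truncated at the ends of the chromosome or scaffold.
    """
    left = chrom_seq[max(0, start - 1 - flank_len): start - 1]
    right = chrom_seq[end: end + flank_len]
    if strand == "+":
        return {"upstream": left, "downstream": right}
    # On the minus strand the gene reads right to left, so the flanks swap roles.
    return {"upstream": reverse_complement(right), "downstream": reverse_complement(left)}


# Toy scaffold with a 3 bp "gene" at positions 4-6.
toy_scaffold = "AAACCCGGGTTT"
plus_flanks = flanking_sequences(toy_scaffold, 4, 6, "+", 3)
minus_flanks = flanking_sequences(toy_scaffold, 4, 6, "-", 3)
```

Handling the minus strand explicitly matters for promoter analysis, because the promoter of a reverse-strand gene lies to the right of it in forward-strand coordinates.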
Genome syntenic browser
A common task in routine bioinformatics analysis is the identification of homologous genes. Users can input gene IDs to find orthologous groups in R. delavayi, R. williamsianum, and R. simsii, as well as A. chinensis, C. sinensis, and A. thaliana. The details of the homologous genes are presented in a table, which also provides links to the “data hub” page for each gene (Fig. 1).
RPGD not only stores gene expression profiles derived from RNA-Seq datasets but also provides an “Expression Heatmap” module (Fig. 2c). “Expression Heatmap” can be used to retrieve the gene expression patterns of a group of genes from different samples. The output is a heatmap that graphically shows expression levels and can be downloaded locally for further analysis.
GO and KEGG enrichment analysis
Functional enrichment analysis is a powerful method for mining gene data, providing further insight into the biological processes in which a set of genes may be involved. To help users capture the biological information of genes, we constructed GO and KEGG enrichment analysis tools based on the functional annotations described above and the clusterProfiler R package. Users can input a list of genes of interest to perform the enrichment analysis (Fig. 2d). The results list the significantly enriched functional categories.
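Over-representation analysis of this kind is commonly based on a hypergeometric test, which asks how likely it is to see at least the observed number of annotated genes in the input list by chance. The Python sketch below computes that tail probability from first principles; it illustrates the statistic only, omits multiple-testing correction, and the example counts are made up rather than taken from RPGD.

```python
from math import comb


def hypergeom_enrichment_pvalue(k, n, K, N):
    """P(X >= k) for X ~ Hypergeometric.

    N: annotated genes in the background
    K: background genes carrying the term
    n: genes in the user's list
    k: genes in the user's list carrying the term
    """
    upper = min(n, K)
    total = comb(N, n)
    favourable = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return favourable / total


# Made-up example: 10 background genes, 4 with the term, 3 genes drawn, 2 of them with the term.
example_p = hypergeom_enrichment_pvalue(k=2, n=3, K=4, N=10)
```

When many terms are tested at once, the raw p-values would normally be adjusted (for example with the Benjamini-Hochberg procedure) before calling a category significantly enriched.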
Download and batch download
All data in RPGD are available for download in this module, including genome assemblies (FASTA), gene predictions (GFF3), gene function annotations (TXT), complete chloroplast genomes (FASTA), gene family data (CSV), orthologous group data (CSV), simple sequence repeat data (TXT), gene expression data (CSV), and other related data. “Batch Download” is provided for users to export custom datasets or bulk download datasets from RPGD. Users can download multiple types of data (gene, CDS, PEP and flanking sequences, and gene expression profiles) by inputting a list of genes.
RPGD is dedicated to providing a comprehensive database of Rhododendron omics data. The current implementation of RPGD integrates important data including genome sequence assemblies, gene expression profiles, functional annotations, gene families, transcription factors, homologous genes, simple sequence repeats, and chloroplast genome assemblies. It also provides a series of tools for online data analysis and visualization. The integration of these data and tools makes RPGD a valuable database. We intend to continue updating the datasets when new data are released. For instance, our team will release a novel Rhododendron genome (R. irroratum) and its phenotypic datasets, including breeds, genotypes, and phenotypes, in the near future. Additionally, we will continue to develop and integrate tools for functional, evolutionary, and network analysis. We hope that researchers will take advantage of these resources and also provide comments and suggestions for improving RPGD. We believe that RPGD will be an indispensable hub for Rhododendron studies.
Availability of data and materials
RPGD is freely available at http://bioinfor.kib.ac.cn/RPGD/.
RPGD: Rhododendron Plant Genome Database
GFF3: General feature format
SSR: Simple sequence repeat
SRA: Sequence Read Archive
FPKM: Fragments per kilobase of transcript per million mapped fragments
Yan LJ, Liu J, Möller M, Zhang L, Zhang XM, Li DZ, et al. DNA barcoding of Rhododendron (Ericaceae), the largest Chinese plant genus in biodiversity hotspots of the Himalaya-Hengduan Mountains. Mol Ecol Resour. 2015;15(4):932–44. https://doi.org/10.1111/1755-0998.12353.
Chamberlain D, Hyam R, Argent G, Fairweather G, Walter KS. The genus Rhododendron: its classification and synonymy. Edinburgh: Royal Botanic Garden Edinburgh; 1996.
Tian XL, Chang YH, Neilsen J, Wang SH, Ma YP. A new species of Rhododendron (Ericaceae) from northeastern Yunnan, China. Phytotaxa. 2019;395(2):66–70. https://doi.org/10.11646/phytotaxa.395.2.2.
Fang MY, Fang RZ, He MY, Hu LZ, Yang HB, Qin HN, et al. Flora of China. Volume 14: Apiaceae through Ericaceae. Beijing: Science Press; 2005.
Ma YP, Wu ZK, Xue RJ, Tian XL, Gao LM, Sun WB. A new species of Rhododendron (Ericaceae) from the Gaoligong Mountains, Yunnan, China, supported by morphological and DNA barcoding data. Phytotaxa. 2013;114(1):42–50. https://doi.org/10.11646/phytotaxa.114.1.4.
De RJ, De KE, Calsyn E, Eeckhaut T, Van HJ, Kobayashi N. Azalea. In: Van HJ, editor. Ornamental Crops. Cham: Springer; 2018. p. 237–71.
Popescu R, Kopp B. The genus Rhododendron: an ethnopharmacological and toxicological review. J Ethnopharmacol. 2013;147(1):42–62. https://doi.org/10.1016/j.jep.2013.02.022.
Yonghui Z, Weibing J, Mangling W. Meanings of Rhododendron and ways used in gardens. Chin Agric Sci Bull. 2007;09:376–80.
Kron KA, Gawen LM, Chase MW, et al. Evidence for introgression in azaleas (Rhododendron; Ericaceae): Chloroplast DNA and morphological variation in a hybrid swarm on Stone Mountain, Georgia. Am J Bot. 1993;80(9):1095–9. https://doi.org/10.1002/j.1537-2197.1993.tb15335.x.
Leslie A. The international Rhododendron register and checklist. 2nd ed. London: Royal Horticultural Society; 2004.
Cox PA. The larger species of rhododendron. 1st ed. London: Batsford Ltd; 1979.
Perkins S, et al. More weighings: exploring the ploidy of hybrid elepidote rhododendrons. Azalean. 2015;37:28–42.
Qiang Y, Zhou B, Gao K. Chemical constituents of plants from the genus Rhododendron. Chem Biodivers. 2011;8(5):792–815. https://doi.org/10.1002/cbdv.201000046.
Zhu YX, Zhang ZX, Yan HM, Lu D, Zhang HP, Li L, et al. Antinociceptive diterpenoids from the leaves and twigs of Rhododendron decorum. J Nat Prod. 2018;81(5):1183–92. https://doi.org/10.1021/acs.jnatprod.7b00941.
Zhou J, Liu T, Zhang H, Zheng G, Qiu Y, Deng M, et al. Anti-inflammatory grayanane diterpenoids from the leaves of Rhododendron molle. J Nat Prod. 2018;81(1):151–61. https://doi.org/10.1021/acs.jnatprod.7b00799.
Zhu H, Li C, Gao C. Applications of CRISPR–Cas in agriculture and plant biotechnology. Nat Rev Mol Cell Biol. 2020;21(11):661–77. https://doi.org/10.1038/s41580-020-00288-9.
Zhang L, Xu PW, Cai YF, Ma LL, Li SF, Li SF, et al. The draft genome assembly of Rhododendron delavayi Franch. var. delavayi. GigaScience. 2017;6(10):11.
Soza VL, Lindsley D, Waalkes A, Ramage E, Patwardhan RP, Burton JN, et al. The Rhododendron genome and chromosomal organization provide insight into shared whole-genome duplications across the heath family (Ericaceae). Genome Biol Evol. 2019;11(12):3353–71. https://doi.org/10.1093/gbe/evz245.
Yang FS, Nie S, Liu H, Shi TL, Tian XC, Zhou SS, et al. Chromosome-level genome assembly of a parent species of widely cultivated azaleas. Nat Commun. 2020;11(1):5269. https://doi.org/10.1038/s41467-020-18771-4.
Choudhary S, Thakur S, Jaitak V, Bhardwaj P. Gene and metabolite profiling reveals flowering and survival strategies in Himalayan Rhododendron arboreum. Gene. 2019;690:1–10. https://doi.org/10.1016/j.gene.2018.12.035.
Xing W, Liao J, Cai M, Xia Q, Liu Y, Zeng W, et al. De novo assembly of transcriptome from Rhododendron latoucheae Franch. using Illumina sequencing and development of new EST-SSR markers for genetic diversity analysis in Rhododendron. Tree Genet Genomes. 2017;13(3):53.
Choudhary S, Thakur S, Najar RA, Majeed A, Singh A, Bhardwaj P. Transcriptome characterization and screening of molecular markers in ecologically important Himalayan species (Rhododendron arboreum). Genome. 2018;61(6):417–28. https://doi.org/10.1139/gen-2017-0143.
Cai YF, Wang JH, Zhang L, Song J, Peng LC, Zhang SB. Physiological and transcriptomic analysis highlight key metabolic pathways in relation to drought tolerance in Rhododendron delavayi. Physiol Mol Biol Plants. 2019;25(4):991–1008. https://doi.org/10.1007/s12298-019-00685-1.
Jia X, Tang L, Mei X, Liu H, Luo H, Deng Y, et al. Single-molecule long-read sequencing of the full-length transcriptome of Rhododendron lapponicum L. Sci Rep. 2020;10(1):6755. https://doi.org/10.1038/s41598-020-63814-x.
Sayers EW, Beck J, Brister JR, Bolton EE, Canese K, Comeau DC, et al. Database resources of the National Center for Biotechnology Information. Nucleic Acids Res. 2020;48(D1):D9–D16. https://doi.org/10.1093/nar/gkz899.
Sneddon TP, Li P, Edmunds SC. GigaDB: announcing the GigaScience database. GigaScience. 2012. https://doi.org/10.1186/2047-217X-1-11.
Leinonen R, Sugawara H, Shumway M, on behalf of the International Nucleotide Sequence Database Collaboration. The sequence read archive. Nucleic Acids Res. 2011;39(D1):D19–21. https://doi.org/10.1093/nar/gkq1019.
Bolger AM, Lohse M, Usadel B. Trimmomatic: a flexible trimmer for Illumina sequence data. Bioinformatics. 2014;30(15):2114–20. https://doi.org/10.1093/bioinformatics/btu170.
Kim D, Pertea G, Trapnell C, Pimentel H, Kelley R, Salzberg SL. TopHat2: accurate alignment of transcriptomes in the presence of insertions, deletions and gene fusions. Genome Biol. 2013;14(4):13.
Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, et al. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010;28(5):511–5. https://doi.org/10.1038/nbt.1621.
Huerta-Cepas J, Forslund K, Coelho LP, Szklarczyk D, Jensen LJ, von Mering C, et al. Fast genome-wide functional annotation through orthology assignment by eggNOG-mapper. Mol Biol Evol. 2017;34(8):2115–22. https://doi.org/10.1093/molbev/msx148.
Huerta-Cepas J, Szklarczyk D, Heller D, Hernandez-Plaza A, Forslund SK, Cook H, et al. eggNOG 5.0: a hierarchical, functionally and phylogenetically annotated orthology resource based on 5090 organisms and 2502 viruses. Nucleic Acids Res. 2019;47(D1):D309–14. https://doi.org/10.1093/nar/gky1085.
Mitchell AL, Attwood TK, Babbitt PC, Blum M, Bork P, Bridge A, et al. InterPro in 2019: improving coverage, classification and access to protein sequence annotations. Nucleic Acids Res. 2019;47(D1):D351–60. https://doi.org/10.1093/nar/gky1100.
Buchfink B, Xie C, Huson DH. Fast and sensitive protein alignment using DIAMOND. Nat Methods. 2015;12(1):59–60. https://doi.org/10.1038/nmeth.3176.
Zheng Y, Jiao C, Sun HH, Rosli Hernan G, Pombo Marina A, Zhang P, et al. iTAK: a program for genome-wide prediction and classification of plant transcription factors, transcriptional regulators, and protein kinases. Mol Plant. 2016;9(12):1667–70. https://doi.org/10.1016/j.molp.2016.09.014.
Emms DM, Kelly S. OrthoFinder: solving fundamental biases in whole genome comparisons dramatically improves orthogroup inference accuracy. Genome Biol. 2015;16(1):157. https://doi.org/10.1186/s13059-015-0721-2.
Emms DM, Kelly S. OrthoFinder: phylogenetic orthology inference for comparative genomics. Genome Biol. 2019;20(1):238. https://doi.org/10.1186/s13059-019-1832-y.
Huang S, Ding J, Deng D, Tang W, Sun H, Liu D, et al. Draft genome of the kiwifruit Actinidia chinensis. Nat Commun. 2013;4(1):2640. https://doi.org/10.1038/ncomms3640.
Xia EH, Li FD, Tong W, Li PH, Wu Q, Zhao HJ, et al. Tea plant information archive: a comprehensive genomics and bioinformatics platform for tea plant. Plant Biotechnol J. 2019;17(10):1938–53. https://doi.org/10.1111/pbi.13111.
Lamesch P, Berardini TZ, Li DH, Swarbreck D, Wilks C, Sasidharan R, et al. The Arabidopsis information resource (TAIR): improved gene annotation and new tools. Nucleic Acids Res. 2012;40(D1):D1202–10. https://doi.org/10.1093/nar/gkr1090.
Beier S, Thiel T, Munch T, Scholz U, Mascher M. MISA-web: a web server for microsatellite prediction. Bioinformatics. 2017;33(16):2583–5. https://doi.org/10.1093/bioinformatics/btx198.
Untergasser A, Cutcutache I, Koressaar T, Ye J, Faircloth BC, Remm M, et al. Primer3-new capabilities and interfaces. Nucleic Acids Res. 2012;40(15):e115. https://doi.org/10.1093/nar/gks596.
Liu J, Chen T, Zhang YB, Li YK, Gong JY, Yi Y. The complete chloroplast genome of Rhododendron delavayi (Ericaceae). Mitochondrial DNA Part B-Resour. 2020;5(1):37–8. https://doi.org/10.1080/23802359.2019.1689860.
Li HE, Guo QQ, Li Q, Yang L. Long-reads reveal that Rhododendron delavayi plastid genome contains extensive repeat sequences, and recombination exists among plastid genomes of photosynthetic Ericaceae. Peerj. 2020. https://doi.org/10.7717/peerj.9048.
Shen JS, Li XQ, Zhu XT, Huang XL, Jin SH. Complete chloroplast genome of Rhododendron pulchrum, an ornamental medicinal and food tree. Mitochondrial DNA Part B-Resour. 2019;4(2):3527–8. https://doi.org/10.1080/23802359.2019.1676181.
Altschul SF, Gish W, Miller W, Myers EW, Lipman DJ. Basic local alignment search tool. J Mol Biol. 1990;215(3):403–10. https://doi.org/10.1016/S0022-2836(05)80360-2.
Wang YP, Tang HB, DeBarry JD, Tan X, Li JP, Wang XY, et al. MCScanX: a toolkit for detection and evolutionary analysis of gene synteny and collinearity. Nucleic Acids Res. 2012;40(7):e49. https://doi.org/10.1093/nar/gkr1293.
Deng W, Nickle DC, Learn GH, Maust B, Mullins JI. ViroBLAST: a stand-alone BLAST web server for flexible queries of multiple databases and user's datasets. Bioinformatics. 2007;23(17):2334–6. https://doi.org/10.1093/bioinformatics/btm331.
Buels R, Yao E, Diesh CM, Hayes RD, Munoz-Torres M, Helt G, et al. JBrowse: a dynamic web platform for genome visualization and analysis. Genome Biol. 2016;17(1):66. https://doi.org/10.1186/s13059-016-0924-1.
Yu G, Wang LG, Han Y, He QY. clusterProfiler: an R package for comparing biological themes among gene clusters. Omics. 2012;16(5):284–7. https://doi.org/10.1089/omi.2011.0118.
We would like to thank Editage (www.editage.cn) for English language editing.
This study was supported by grants from the National Natural Science Foundation of China (31760231), Construction of International Flower Technology Innovation Center and Industrialization of achievements (2019ZG006), Program of Science and Technology Talents Training in Yunnan province (2016HA005), Youth Program of National Natural Science Foundation of China (32000180), Yunnan Young & Elite Talents Project.
The authors declare that they have no competing interests.
Cite this article
Liu, N., Zhang, L., Zhou, Y. et al. The Rhododendron Plant Genome Database (RPGD): a comprehensive online omics database for Rhododendron. BMC Genomics 22, 376 (2021). https://doi.org/10.1186/s12864-021-07704-0
- Horticulture plant
- Functional genomics