Complete chloroplast genome structural characterization of two Aerides (Orchidaceae) species with a focus on phylogenetic position of Aerides flabellata



The disputed phylogenetic position of Aerides flabellata Rolfe ex Downie, which stems from morphological overlap with related species, was investigated using complete chloroplast (cp) genomes. The structural characteristics of the complete cp genomes of A. flabellata and A. rosea Lodd. ex Lindl. & Paxton were analyzed and compared with those of six related species in the “Vanda-Aerides alliance” to provide genomic information for taxonomy and phylogeny.


The cp genomes of A. flabellata and A. rosea exhibited conserved quadripartite structures, 148,145 bp and 147,925 bp in length, with similar GC content (36.7 ~ 36.8%). Gene annotations revealed 110 unique genes, 18 of which were duplicated in the inverted repeat (IR) regions and ten of which contained introns. Comparative analysis across related species confirmed stable sequence identity and higher variation in single-copy regions. However, there were notable differences in the IR regions between the two Aerides Lour. species and the other six related species. The phylogenetic analysis based on CDS from complete cp genomes indicated that Aerides species except A. flabellata formed a monophyletic clade nested in the subtribe Aeridinae, sister to Renanthera Lour., consistent with previous studies. Meanwhile, A. flabellata and six Vanda R. Br. species formed a separate clade that was sister to Holcoglossum Schltr.


This study is the first report of the complete cp genome of A. flabellata. The results provide insights into plastome evolution and the phylogenetic relationships of Aerides. The phylogenetic analysis based on complete cp genomes showed that A. flabellata should be placed in Vanda rather than in Aerides.


Aerides Lour. (Aeridinae, Vandeae, Epidendroideae, Orchidaceae) consists of about 29 species, which are distributed from India to Papua New Guinea [1,2,3]. There are five species recorded in China, including one endemic species, which occurs in Southern China [4]. The distinct fragrance emitted by Aerides species has made them a valuable source for the production of numerous artificial hybrids and cultivars [5].

Aerides has been a focus of taxonomic disagreement within the subtribe Aeridinae [3, 5,6,7]. Since Aerides was first described, many members previously placed in other genera have been moved into it [7]. Conversely, dozens of species once included in Aerides have been transferred to other related genera [7]. The intrageneric taxonomy of Aerides was questioned because several species were transferred to other genera, such as Ornithochilus (Lindl.) Wall. ex Heynh., Papilionanthe Schltr., and Seidenfadenia Garay [8, 9]. Aerides was characterized by the presence of two cleft pollinia and divided into five groups based predominantly on pollinia morphology [10, 11]. However, two cleft pollinia have also been observed in other related genera, including Brachypeza Garay, Phalaenopsis Bl., Rhynchostylis Bl., Vanda R. Br., and others [7]. Subsequently, the concept of the “Vanda-Aerides alliance”, comprising Aerides, Ascocentrum Schltr., Holcoglossum Schltr., Neofinetia Hu, Papilionanthe, Rhynchostylis and Vanda, was proposed [12], although the intergeneric delimitation has remained controversial based on nuclear DNA data [3]. In particular, the phylogenetic position of Aerides flabellata Rolfe ex Downie has been contentious [13, 14]. It was placed in Aerides based on an analysis of the plastid matK gene [15], but was moved into Vanda in a later treatment supported by an analysis of combined DNA datasets (nrITS, matK, trnL, and trnL-F) [16].

The chloroplast (cp) genome has been increasingly utilized in the taxonomy and phylogeny of Orchidaceae [17,18,19]. The complete cp genomes of six Aerides species (Aerides crassifolia C. S. P. Parish ex Burb., Aerides falcata Lindl. & Paxton, Aerides lawrenceae Rchb.f., Aerides odorata Lour., Aerides quinquevulnera Lindl., and Aerides rosea Lodd. ex Lindl. & Paxton) have been published [20]. The results indicated that Aerides should be a separate clade within Aeridinae, sister to Renanthera Lour. [20]. However, the complete cp genomic data of A. flabellata have not been reported. In this study, the structural and genomic information of the cp genomes of A. flabellata and A. rosea was characterized in detail and compared with that of six related species in the “Vanda-Aerides alliance”. The objectives of this study were: (1) to characterize and compare the complete cp genome structures of A. flabellata and A. rosea in detail, (2) to reconstruct the phylogenetic tree of Aeridinae to verify the position of A. flabellata, and (3) to provide new genomic data for a better understanding of the phylogeny of Aerides.


General data on the chloroplast genome

The depths of the assemblies were 494.99 (Aerides flabellata) and 240.80 (A. rosea) (Fig. S1). The structures of the cp genomes of the two Aerides species were highly similar. The total sizes of the two cp genomes were 148,145 bp (A. flabellata) and 147,925 bp (A. rosea) (Fig. 1, Table 1). As in most angiosperms, the cp genomes displayed a typical quadripartite structure with a large single-copy (LSC) region (84,905 bp, 85,317 bp), a small single-copy (SSC) region (11,636 bp, 11,018 bp), and two inverted repeat (IR) regions (25,802 bp, 25,795 bp each). Both cp genomes were AT-rich, with an overall GC content of 36.7 ~ 36.8%. The GC content in the IR regions (43.1 ~ 43.2%) was higher than in the LSC (34 ~ 34.1%) and SSC regions (28.82%) (Table 1). The GC content at the three codon positions was very similar in the two cp genomes. The third codon position is related to codon bias and mRNA stability. In A. flabellata, the GC content at the third position (36.28%) was lower than at the first (37.18%) and second (36.80%) positions. In A. rosea, by contrast, the GC content at the third position (36.53%) was lower than at the second position (37.18%) but higher than at the first position (36.49%) (Table 2). Both cp genomes contained 128 genes, including 2 (A. flabellata) ~ 3 (A. rosea) pseudogenes, 79 (A. rosea) ~ 80 (A. flabellata) coding sequences (CDS), eight rRNAs, and 38 tRNAs (Table 1). Among these, there were 110 unique genes in each cp genome. The LSC region contained 62 CDS genes and 21 tRNA genes in both cp genomes. The SSC region contained only one tRNA gene in both cp genomes, but eight CDS genes in A. flabellata and seven CDS genes in A. rosea. Six CDS genes (rpl2, rpl23, rps7, rps12, rps19, and ycf2), eight tRNA genes (trnA-UGC, trnH-GUG, trnI-CAU, trnI-GAU, trnL-CAA, trnN-GUU, trnR-ACG, and trnV-GAC), and four rRNA genes (rrn4.5, rrn5, rrn16, and rrn23) were duplicated in the IR regions (Table S1).
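As a quick consistency check on the region lengths reported above, the following minimal Python sketch sums LSC + SSC + 2 × IR for each species and compares the result with the reported total. The dictionary layout and the function name are illustrative assumptions and not part of the original analysis; only the numbers are taken from the text.

```python
# Region lengths (bp) as reported in the text for the two plastomes.
# The data structure and names below are illustrative, not from the study.
REGIONS = {
    "A. flabellata": {"LSC": 84_905, "SSC": 11_636, "IR": 25_802, "total": 148_145},
    "A. rosea": {"LSC": 85_317, "SSC": 11_018, "IR": 25_795, "total": 147_925},
}


def quadripartite_length(lsc: int, ssc: int, ir: int) -> int:
    """Total plastome length: one LSC, one SSC, and two IR copies."""
    return lsc + ssc + 2 * ir


for species, r in REGIONS.items():
    computed = quadripartite_length(r["LSC"], r["SSC"], r["IR"])
    # Report whether the summed regions reproduce the stated genome size.
    print(species, computed, computed == r["total"])
```

Both sums reproduce the reported totals, which also indicates that the listed IR length refers to a single IR copy.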
There were ten genes with introns in the two cp genomes: seven genes with one intron (rps16, rpoC1, rpl2, rpl16, petD, petB, and atpF) and three genes with two introns (clpP, ycf3, rps12) (Table S2). However, the lengths of the ten intron-containing genes differed between the two Aerides species (Table S2). Only one of the ten intron-containing genes was located in the IR regions, while the other genes were spread across the LSC region. In addition, rps12 was a unique trans-spliced gene in which the first exon was located in the LSC region, whereas the second and third exons were in the IR regions. Seven ndh (NAD(P)H dehydrogenase) genes were identified in the cp genomes of A. flabellata (ndh B/C/D/E/I/J/K) and A. rosea (ndh B/C/D/G/I/J/K) (Fig. 1, Table S1).

Fig. 1

The chloroplast genome maps of Aerides flabellata and A. rosea. Genes inside the circle are transcribed clockwise, while genes outside the circle are transcribed counterclockwise. The light and dark gray shading in the inner circle indicates the guanine-cytosine (GC) content of the genome

Table 1 The general genome characteristics of the two Aerides species
Table 2 The GC content of the three positions of the two Aerides species

Repeat sequences analysis

The SSRs were analyzed to elucidate variation among allied species and within species. In total, 57 (Aerides flabellata) and 76 (A. rosea) SSRs were detected in the two cp genomes. Those in A. flabellata consisted of 39 mononucleotides, seven dinucleotides, four trinucleotides, five tetranucleotides, one pentanucleotide and one hexanucleotide, whereas those in A. rosea consisted of 52 mononucleotides, 12 dinucleotides, six trinucleotides, four tetranucleotides and two pentanucleotides (Table 3). Repeat units were composed mainly of A or T, and the mononucleotides were predominantly of the A/T type rather than the G/C type in both cp genomes. Furthermore, the C/G mononucleotide and the AAAT/ATTT tetranucleotide were found only in A. flabellata (Fig. S2).
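To illustrate how SSR counts of this kind can be obtained, here is a minimal regular-expression sketch in Python. It is not MISA itself and only approximates such a search. The minimum repeat numbers follow the thresholds given in Materials and methods (10, 5, 4, 3, and 3 for mono- to pentanucleotides); the threshold of 3 for hexanucleotides is an assumption, because the source lists only five values. All function names are hypothetical.

```python
import re
from collections import Counter

# Minimum number of repeat units per motif length.
# Values for 1-5 follow the Methods; the value for 6 is an ASSUMPTION.
MIN_REPEATS = {1: 10, 2: 5, 3: 4, 4: 3, 5: 3, 6: 3}


def is_primitive(motif: str) -> bool:
    """True if the motif is not itself a repetition of a shorter unit (e.g. 'ATAT')."""
    n = len(motif)
    return not any(n % k == 0 and motif == motif[:k] * (n // k) for k in range(1, n))


def find_ssrs(seq: str):
    """Return sorted (start, motif, repeat_count) tuples for perfect SSRs in seq."""
    seq = seq.upper()
    hits = []
    for unit, min_rep in MIN_REPEATS.items():
        # A motif of `unit` bases followed by at least (min_rep - 1) more copies.
        pattern = re.compile(r"([ACGT]{%d})\1{%d,}" % (unit, min_rep - 1))
        for match in pattern.finditer(seq):
            motif = match.group(1)
            # Skip e.g. 'AA' x 6, which is really a mononucleotide repeat.
            if is_primitive(motif):
                hits.append((match.start(), motif, len(match.group(0)) // unit))
    return sorted(hits)


def count_by_unit_size(hits):
    """Tally SSRs by motif length (1 = mononucleotide, 2 = dinucleotide, ...)."""
    return Counter(len(motif) for _, motif, _ in hits)


# Toy sequence, not real plastome data.
toy = "GC" + "A" * 12 + "CG" + "AT" * 6 + "GGC" + "AAAT" * 3 + "C"
print(find_ssrs(toy))
print(count_by_unit_size(find_ssrs(toy)))
```

The sketch only finds perfect repeats and does not merge compound SSRs, so counts from a dedicated tool may differ.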

Table 3 The number of SSR types distributed in different copy regions of the two Aerides species

Four different types of long repeats were also identified based on the complete genome sequences: complement (C), forward (F), palindromic (P), and reverse (R) (Table S3). Forty-nine long repeats were detected in the two cp genomes. In A. flabellata, almost all the repeats ranged from 20 to 39 bp, with the fewest in the 40 ~ 49 bp range. In A. rosea, however, the number of long repeats above 40 bp in length was similar to the number of repeats from 20 to 39 bp. No complement repeats above 40 bp in length were detected, and they were rare even in the smaller size ranges (Table S3).
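The four repeat types differ only in how the second copy relates to the first. The short Python sketch below makes those relationships explicit for two equal-length segments. It is an illustration of the definitions, not the REPuter algorithm; the function names and the `max_mismatches` parameter are hypothetical, with the latter standing in for the Hamming distance setting mentioned in Materials and methods.

```python
COMPLEMENT = str.maketrans("ACGT", "TGCA")


def partner_sequences(segment: str) -> dict:
    """Sequence that a second copy must match for each repeat type."""
    s = segment.upper()
    return {
        "F": s,                              # forward: identical copy
        "R": s[::-1],                        # reverse: copy read backwards
        "C": s.translate(COMPLEMENT),        # complement: base-wise complement
        "P": s[::-1].translate(COMPLEMENT),  # palindromic: reverse complement
    }


def hamming(a: str, b: str) -> int:
    """Number of mismatching positions between two equal-length strings."""
    if len(a) != len(b):
        raise ValueError("sequences must have equal length")
    return sum(x != y for x, y in zip(a, b))


def classify_repeat(first: str, second: str, max_mismatches: int = 0) -> list:
    """Return the repeat types under which `second` pairs with `first`."""
    second = second.upper()
    return [
        kind
        for kind, seq in partner_sequences(first).items()
        if hamming(seq, second) <= max_mismatches
    ]


print(classify_repeat("AAGCT", "AGCTT"))  # reverse complement of the first segment
print(classify_repeat("AAGCT", "TCGAA"))  # first segment read backwards
```

A segment that equals its own reverse complement, such as ACGT, qualifies as both forward and palindromic when paired with itself.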

Codon usage analysis

Based on coding sequences (CDS), codon usage frequency and relative synonymous codon usage (RSCU) were computed for the cp genomes of the two Aerides species and six other related species from the “Vanda-Aerides alliance” (Aerides falcata, A. lawrenceae, A. odorata, Vanda coerulea Griff. ex Lindl., V. coerulescens Griff., and V. subconcolor Tang & F. T. Wang) downloaded from NCBI (Table S4) [21]. These CDS comprised 48,830 to 49,803 codons and encoded 20 amino acids in the eight cp genomes (Fig. S3, Table S4). The RSCU values of seven cp genomes were similar; the exception was A. odorata, which had a lower RSCU for leucine (Leu) and a higher RSCU for serine (Ser). Leucine (Leu: 9.65 ~ 10.46%) was the most frequently used amino acid, whereas tryptophan (Trp: 1.27 ~ 1.45%) was the least frequently used amino acid in the eight cp genomes (Table S5). According to the RSCU values, the eight cp genomes could be divided into six groups: 28 codons (RSCU > 1) and 33 codons (RSCU < 1) in A. odorata; 29 codons (RSCU > 1) and 31 codons (RSCU < 1) in A. falcata; 30 codons (RSCU > 1) and 32 codons (RSCU < 1) in A. flabellata & V. coerulea; 31 codons (RSCU > 1) and 30 codons (RSCU < 1) in A. lawrenceae & V. subconcolor; 31 codons (RSCU > 1) and 31 codons (RSCU < 1) in V. coerulescens; 32 codons (RSCU > 1) and 30 codons (RSCU < 1) in A. rosea (Table S4). Almost all CDS in the eight species had the standard ATG start codon, but rpl2 started with ATA/TAT. Among the three stop codons, TAA was the most common.
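For readers unfamiliar with RSCU, the value for a codon is its observed count multiplied by the number of synonymous codons for that amino acid, divided by the total count of all codons for that amino acid. A value above 1 means the codon is used more often than expected under equal usage. The following Python sketch computes this for a list of in-frame CDS strings. It is a simplified illustration and not the MEGA11 workflow used in the study; the function names are hypothetical, stop codons are excluded here as a design choice, and the codon table is the standard genetic code.

```python
from collections import Counter
from itertools import product

# Standard genetic code in the conventional TCAG ordering.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)}


def rscu(cds_list):
    """Relative synonymous codon usage for sense codons across in-frame CDS."""
    counts = Counter()
    for cds in cds_list:
        cds = cds.upper().replace("U", "T")
        # Walk complete codons only; codons with ambiguous bases are ignored.
        for i in range(0, len(cds) - len(cds) % 3, 3):
            codon = cds[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1

    # Group sense codons into synonymous families (stop codons excluded).
    families = {}
    for codon, aa in CODON_TABLE.items():
        if aa != "*":
            families.setdefault(aa, []).append(codon)

    values = {}
    for codons in families.values():
        total = sum(counts[c] for c in codons)
        if total == 0:
            continue  # amino acid absent from the input
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


def count_preferred(values):
    """Number of codons with RSCU > 1 and with RSCU < 1."""
    return sum(v > 1 for v in values.values()), sum(v < 1 for v in values.values())


# Toy CDS, not real plastome data: Met, Leu x 4, Trp, stop.
toy = ["ATG" + "CTT" * 3 + "CTC" + "TGG" + "TAA"]
print(rscu(toy)["CTT"], count_preferred(rscu(toy)))
```

In the toy input, leucine has six synonymous codons and four observations, so CTT (three observations) receives 3 × 6 / 4 = 4.5.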

IR expansion and contraction

The cp genomes of the two Aerides species, like those of the six species selected from the “Vanda-Aerides alliance”, were structurally highly conserved. Structural variations were found at the four boundaries (LSC/IRb, IRb/SSC, SSC/IRa, IRa/LSC) (Fig. 2). The rpl22 gene extended from the LSC into the IRb region. The rpl32 gene was located in the SSC region in all eight species. The trnN gene was located in both the IRa and IRb regions in all eight species. Notably, the ycf1 gene extended from the SSC into the IRa region in A. flabellata and the three Vanda species, whereas it was located only in the SSC region in the other four Aerides species. In addition, the ycf1 gene was also present in the IRb region of V. coerulea and V. coerulescens, and it extended from the IRb into the SSC region in V. subconcolor, but this IRb copy was absent in A. flabellata and A. rosea.

Fig. 2

Comparison of the boundaries of the LSC, SSC and IR regions among the chloroplast genomes of the two Aerides species and six species selected from the “Vanda-Aerides alliance”. The arrows indicate the distance (in bp) of genes from a particular region boundary of the cp genome. JLB (LSC/IRb), JSB (IRb/SSC), JSA (SSC/IRa), and JLA (IRa/LSC) denote the junction sites between each pair of adjacent regions of the cp genome

Structural comparison and divergence hotspot identification analysis

Using Aerides flabellata as the reference, the cp genome sequences were compared with mVISTA (Fig. 3). The IR regions were more conserved than the LSC and SSC regions, and the rRNA genes were highly conserved. Meanwhile, the non-coding regions (CNS) were more variable than the coding regions. The exons of the ycf1 and ycf2 genes exhibited the highest polymorphism.

Fig. 3

Sequence alignment of chloroplast genomes of the two Aerides species and six species selected from “Vanda-Aerides alliance” using mVISTA. The vertical scale indicates the percentage of identity, ranging from 50 to 100%. The horizontal axis indicated the coordinates within the cp genome. Genome regions were color coded as exon, intron, and conserved non-coding sequences (CNS) and mRNA

Examination of CDS DNA polymorphism showed that the Pi values of the LSC and SSC regions were greater than those of the IR regions, demonstrating that the single-copy regions were more variable than the IR regions. Three out of 62 CDS possessed the highest Pi values: psbT (0.01753), ycf1 (0.01970) and rps12 (0.03228) (Fig. 4A, Table S5). Two intergenic spacer (IGS) locations had high Pi values (> 0.05): psbB_psbT (0.05291) and psbE_petL (0.08433) (Fig. 4B, Table S6). The Pi values of the IGS locations (0.00 ~ 0.07, average 0.01965) were greater than those of the CDS (0.00 ~ 0.024, average 0.00505) (Fig. 4, Table S5, S6).
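Nucleotide diversity (Pi) is the average number of pairwise differences per site among the aligned sequences. The minimal Python sketch below computes it for one aligned locus and, optionally, along a sliding window. It is an illustration under stated assumptions rather than the software used in the study: columns containing gaps or ambiguous bases are dropped entirely, the function names are hypothetical, and the `window` and `step` values in the example are arbitrary because the source does not state them.

```python
from itertools import combinations


def nucleotide_diversity(alignment):
    """Average pairwise differences per site over gap-free, unambiguous columns."""
    seqs = [s.upper() for s in alignment]
    if len(seqs) < 2 or len({len(s) for s in seqs}) != 1:
        raise ValueError("need at least two aligned sequences of equal length")
    # Complete deletion: keep only columns where every sequence has A, C, G or T.
    columns = [col for col in zip(*seqs) if all(base in "ACGT" for base in col)]
    if not columns:
        return 0.0
    n_pairs = len(seqs) * (len(seqs) - 1) // 2
    diffs = sum(sum(a != b for a, b in combinations(col, 2)) for col in columns)
    return diffs / (n_pairs * len(columns))


def sliding_pi(alignment, window, step):
    """(window midpoint, Pi) pairs along the alignment; window and step in columns."""
    length = len(alignment[0])
    results = []
    for start in range(0, length - window + 1, step):
        chunk = [s[start:start + window] for s in alignment]
        results.append((start + window // 2, nucleotide_diversity(chunk)))
    return results


# Toy alignment of three sequences, not real plastome data.
toy = ["ACGTACGT", "ACGTACGA", "ACGAACGT"]
print(nucleotide_diversity(toy))
print(sliding_pi(toy, window=4, step=4))
```

Applied per locus, a function like this would yield one value per CDS or IGS, which is the kind of quantity compared in the paragraph above.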

Fig. 4

Sliding window analysis of cp genomes of two Aerides species and six species selected from “Vanda-Aerides alliance”. A Comparison of the nucleotide diversity (Pi) among CDS regions. B Comparison of the nucleotide diversity among IGS regions. X-axis: position of the midpoint of a window; Y-axis: nucleotide diversity of each window. Highest variation hotspots for eight cp genomes are annotated on the graph. The colored lines at the bottom delineate these gene locations in different regions

Positive selection analysis

The Bayes Empirical Bayes (BEB) method identified 53 genes under positive selection, of which rpl22, rps4, rps8, rps14, rps16, rps18, rpl32, ycf1, and ycf2 had two or more significant positively selected sites. Each of the other genes had only one significant positively selected site. More positively selected genes were located in the LSC region than in the SSC and IR regions (Table 4, Table S7).

Table 4 The positive selection analysis of two Aerides species and six species selected from “Vanda-Aerides alliance”

Phylogenetic analysis

A Maximum-likelihood (ML) phylogenetic tree was reconstructed based on 62 single-copy CDS sequences of the two Aerides species and 45 representatives from Aeridinae, with six Polystachya species as outgroups, to shed light on the phylogeny of Aerides and the position of A. flabellata (Fig. 5, Table S8). A. flabellata and six Vanda species formed a stable clade with strong support (UFBoot: 100%), which was sister to Holcoglossum in the “Vanda-Aerides alliance”. Within this clade, A. flabellata was sister to V. coerulea with strong support (UFBoot: 98%), indicating that it should be placed in Vanda. Meanwhile, six Aerides species formed a monophyletic clade, with A. rosea as the sister taxon to the other five species. This monophyletic clade of Aerides was also found to be sister to Renanthera. All the branch nodes in the clade of Aerides were strongly supported by the ML analysis.

Fig. 5

Phylogenetic tree of Aeridinae reconstructed using the Maximum-likelihood (ML) method based on 62 single-copy CDS sequences of 47 Aeridinae species, with six Polystachya species as outgroups


In this study, the complete cp genomes of Aerides flabellata and A. rosea were sequenced and compared with those of six other related species within the “Vanda-Aerides alliance” to learn more about the cp genomic information and the molecular phylogeny of Aerides.

The cp genomes of Aerides flabellata and A. rosea were highly similar. Both showed a typical quadripartite circular structure with the LSC and SSC regions separated by the IR regions, similar to other orchids and most angiosperms [19, 22]. Notably, the number of annotated CDS differed from previous research: 79 ~ 80 CDS were annotated in these two cp genomes, as opposed to the 74 CDS reported previously [20]. This difference was caused by the annotation of the ndh CDS. A. flabellata and A. rosea contained seven ndh genes with five ~ six ndh CDS. In contrast, other Aerides species lacked some ndh genes or ndh CDS [20]. Eleven ndh genes in cp genomes encode the NAD(P)H dehydrogenase complex [23]. Previous research delineated Apostasioideae as ndh-complete, Vanilloideae as ndh-deleted, and Cypripedioideae, Orchidoideae, and Epidendroideae as containing both ndh-complete and ndh-deleted members. These findings suggested the presence of a complete functioning set of ndh genes in the common ancestor of orchids [24]. In certain photoautotrophic plants, the NDH complex is deemed unnecessary [24, 25]. Additionally, the GC content of the IR regions was much higher than that of the LSC and SSC regions, a characteristic also observed in Cardamine species [26]. This pattern is attributed to the presence of rRNA and tRNA genes in the IR regions, as in other Orchidaceae cp genomes [18, 19].

Simple sequence repeats (SSRs), also known as microsatellites, are short tandem repeats consisting of 1 ~ 6 bp repeat units that are dispersed widely across the cp genome and can be used for phylogenetic analysis [18, 27,28,29]. A total of 57 SSRs were identified in Aerides flabellata, while 76 were detected in A. rosea. Notably, the number of SSRs in A. flabellata differed from recent research on Aerides, which reported 71 ~ 77 SSRs [20]. Mononucleotide repeats were the most prevalent SSRs in the cp genomes of both A. flabellata and A. rosea. As in six Polystachya species and three Bulbophyllum species, the cp SSRs were composed predominantly of short poly-A or poly-T repeats, and mononucleotide repeats were the most common form [18, 30]. Repeated sequences play a pivotal role in species evolution, as well as in the inheritance and variation of genes within species [31, 32]. These repetitive sequences have been widely used in studies of genetic diversity, population structure, and the identification of closely related species [20, 33, 34]. In this study, 49 long repeats were identified from the two Aerides cp genomes, indicating that the Aerides cp genome retained abundant genetic information. These findings provide a data basis for further studies on population genetics.

The formation of codons is a critical process in translating genetic information from mRNA to protein [35], and it is influenced by codon bias, particularly the usage pattern of the third base [36]. GC composition has been shown empirically to influence codon and amino acid usage, and the GC content of the third codon base (GC3) is considered to reflect codon usage trends most closely [37]. For Aerides species, the GC content observed in this study agrees with previous research [20]. Based on the RSCU analysis, arginine, leucine and serine were each encoded by six codons, whereas methionine and tryptophan were each encoded by a single codon, as also reported in other orchid species [19, 38].

The IR region is the most conserved section of the cp genome. However, its boundaries frequently contract and expand during cp genome evolution, and these shifts are the primary driver of variation in cp genome length [39, 40]. Unlike basal angiosperms and eudicots, most monocots typically harbor trnH-rps19 clusters in each IR region [41]. In this study, the trnH-rps19 clusters were also located in each IR region, consistent with the other five Aerides species [20], Paphiopedilum henryanum Braem [42], Phalaenopsis stobartiana Rchb.f., P. wilsonii Rolfe [19], and Platanthera ussuriensis (Regel) Maxim. [17]. The presence of the trnH-rps19 gene cluster in the IR of most monocots has been suggested as evidence of a duplication event predating the divergence of monocot lineages. Contractions and expansions of the IR borders have also been proposed to reflect taxonomic relationships among angiosperms [27, 41]. Additionally, Aerides crassifolia, A. quinquevulnera, A. lawrenceae, A. odorata, and A. falcata were consistent with A. rosea [20], in that the ycf1 gene was located exclusively in the SSC region. In contrast, the ycf1 gene spanned the SSC and IRa regions in A. flabellata, in line with observations in Vanda subconcolor.

Divergent regions are valuable sources of data for DNA barcoding and phylogenetic research and are frequently employed as molecular markers in phylogenetic reconstruction [43]. In this study, the nucleotide sequences of the non-coding regions were more variable than those of the coding regions, which was generally consistent with other Orchidaceae cp genomes [18, 19]. Furthermore, the analysis of coding sequence regions revealed that the genes rps12, psbT and ycf1 had the highest Pi values. Notably, ycf1, like matK, has been utilized as a DNA marker for phylogenetic studies [43]. In this research, psbB_psbT and psbE_petL also showed a high degree of variability. By comparison, sequences such as trnS_trnG, psaC_ndhE, clpP_psbB, and others exhibited the highest degree of variability in Phalaenopsis [19], while rpl32_trnL, trnE_trnT, and others showed the highest degree of variability in Cymbidium Sw. [44]. These results indicate a diverse array of highly variable sequences in the Orchidaceae cp genome.

The ratio of nonsynonymous to synonymous substitution rates (dN/dS, ω) has been pivotal in discerning adaptive signals among species and inferring evolutionary processes [45, 46]. Additionally, it can suggest that environmental factors have affected the evolution of cp genomes, representing a primary cause of the divergence of numerous genes within the cp genome [47]. In this study, 53 genes were identified as being under significant positive selection. Among them, the atpH, petL, and rps4 genes have also been reported as positively selected in other orchids [19, 48]. Furthermore, these genes could be used for orchid identification and phylogenetic research.

Aerides flabellata (synonym: Vanda flabellata) has been a focus of considerable taxonomic disagreement [6, 49]. Some taxonomists placed it within Aerides on account of features such as a long column foot and motile lip [10], while others assigned it to Vanda, emphasizing the species’ short spur and broad lip [3, 5, 8, 50]. The species Christensonia vietnamica Haager, exhibiting morphological resemblances to both Vanda and Rhynchostylis [13], has been affiliated with A. flabellata, being described as ‘almost a yellow Aerides flabellata’ [13]. Therefore, A. flabellata and C. vietnamica were placed into Vanda based on combined DNA datasets (nrITS and matK, trnL, trnL-F) [3, 6, 15, 51].

The structural features of the cp genome have been utilized in constructing the phylogeny of Orchidaceae [17,18,19], because protein-coding regions and conserved sequences are informative for taxonomy [52]. In this study, CDS data from complete cp genomes showed that Aerides flabellata was embedded within the clade of Vanda, while the other six Aerides species were grouped into a stable monophyletic clade. Therefore, the comparative and phylogenetic analyses supported moving A. flabellata from Aerides into Vanda.


The complete cp genomes of Aerides flabellata and A. rosea were sequenced and analyzed. The investigation covered the general genome structure, codon usage, repeat sequences, boundaries of the inverted repeats, DNA polymorphism, and phylogenetic position. These cp genomic datasets were compared with those of six other related species from the “Vanda-Aerides alliance”. The cp genomic features of the “Vanda-Aerides alliance” were found to be largely congruent and highly conserved, and they can be used to understand the plastome evolution and evolutionary relationships of the alliance. In addition, the cp genomic data supported moving A. flabellata from Aerides into Vanda.

Materials and methods

Ethical statement

No specific permits were required for the collection of specimens for this study. This research was carried out in compliance with the relevant laws of China.

Plant materials and chloroplast genome sequencing

Leaf samples of Aerides flabellata and A. rosea were obtained from plants cultivated at the Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences, Yunnan. The specimen was deposited in the Herbarium of Southwest Forestry University (HSFU, Lilu20180015). Genomic DNA of each sample was extracted from silica gel-dried leaf tissues using the modified CTAB method with the TiangenDNA kit (TIANGEN, China) [53]. Paired-end libraries with an average insert size of approximately 400 bp were prepared using a TruSeq DNA Sample Prep Kit (Illumina, Inc., San Diego, CA, USA) according to the manufacturer’s instructions. The libraries were sequenced on the Illumina HiSeq 2500 platform at Personalbio (2 × 150 bp; Illumina, Shanghai, China). Raw data were filtered using Fastp v0.23.1 to obtain high-quality reads, with the sliding window method used to trim low-quality bases from the head and tail of each read [54].

Chloroplast genome assembly and annotation

The two complete cp genomes were assembled from the clean reads with GetOrganelle [55], and the new sequences were annotated using Geneious Prime version 2020.0.4 [56]. The complete cp genome sequences of Aerides flabellata and A. rosea were submitted to GenBank (accession numbers: PP003956 and PP003955). The circular genome maps were drawn with the OGDRAW program [44].

Sequence analysis and statistics

The repetitive structures, repeat sizes, and locations of forward match (F), reverse match (R), palindromic match (P), and complementary match (C) nucleotide repeat sequences were identified by REPuter v2.74 [57], with maximal repeat size set to 50 bp, minimal repeat size set to 20 bp, and Hamming distance set to 3 [20]. Simple sequence repeats (SSRs), tracts of repetitive DNA with repeat units that typically range from 1 to 6 nucleotides in length, were detected via MISA [58, 59], with the minimum number of repeats set to 10, 5, 4, 3, and 3 for mononucleotide (mono-), dinucleotide (di-), trinucleotide (tri-), tetranucleotide (tetra-), pentanucleotide (penta-), and hexanucleotide (hexa-) repeats, respectively. Codon usage was analyzed with MEGA11 software [60], and the relative synonymous codon usage (RSCU) and amino acid frequencies were calculated with default settings [61]. The RSCU figure was drawn with PhyloSuite version 1.2.2 [62, 63]. In addition, the GC content of the three codon positions was analyzed with CUSP in the EMBOSS program [64].
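The GC content at each codon position can be illustrated with a few lines of Python. The sketch below is a simplified stand-in for this kind of calculation and is not the CUSP program itself; the function name is hypothetical, and it assumes each input string is an in-frame CDS starting at the first codon position.

```python
def gc_by_codon_position(cds_list):
    """Percent GC at codon positions 1, 2 and 3 across in-frame CDS strings."""
    gc = [0, 0, 0]
    total = [0, 0, 0]
    for cds in cds_list:
        cds = cds.upper()
        usable = len(cds) - len(cds) % 3  # ignore a trailing partial codon
        for i in range(usable):
            base = cds[i]
            if base in "ACGT":  # skip ambiguous bases such as N
                position = i % 3
                total[position] += 1
                if base in "GC":
                    gc[position] += 1
    return tuple(100.0 * g / t if t else 0.0 for g, t in zip(gc, total))


# Toy CDS, not real plastome data: codons ATG and GCC.
gc1, gc2, gc3 = gc_by_codon_position(["ATGGCC"])
print(f"GC1={gc1:.2f}% GC2={gc2:.2f}% GC3={gc3:.2f}%")
```

Pooling all CDS of a genome before dividing, as done here, weights long genes more heavily than averaging per-gene percentages would; which convention a given tool uses should be checked against its documentation.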

Sequence divergence and genome comparison

The pairwise alignments and sequence divergence analysis of Aerides flabellata and A. rosea with six other related species from the “Vanda-Aerides alliance” (Table S9) were performed with mVISTA in Shuffle-LAGAN mode [65]. The contraction and expansion of the IR borders between the four major regions (LSC/IRa/SSC/IRb) of the eight cp genome sequences were visualized with the online application CPJSdraw v1.0.0 [66].

Positive selection analysis

The CDS sequences of Aerides flabellata, A. rosea and six other related species from the “Vanda-Aerides alliance” (Table S9) were extracted with PhyloSuite version 1.2.2 [62, 63], and the single-copy CDS sequences were aligned with MAFFT version 7 [67]. The phylogenetic tree based on CDS was constructed in MEGA 11 with the Neighbor-Joining (NJ) method [60]. The non-synonymous (dN) and synonymous (dS) substitution rates were calculated with the CodeML algorithm implemented in EasyCodeML [68], and the M8 model was selected to detect the protein-coding genes under selection in the two Aerides species and six related species.

Phylogenetic analysis

For phylogenetic analysis, the cp genomes of 53 species were selected (Table S9). The ingroup contained the genomes of 47 Aeridinae species, of which 45 were downloaded from the NCBI database. As Polystachyinae is sister to Aeridinae [18], six species from Polystachyinae were selected as outgroups. The single-copy CDS sequences (Table S8) from the cp genomes were used for the phylogenetic analysis. These sequences were extracted with PhyloSuite version 1.2.2 [62, 63], aligned with MAFFT version 7 [67], trimmed with Gblocks [69], and concatenated with plugins in PhyloSuite version 1.2.2 [62, 63]. The Maximum-Likelihood (ML) tree was inferred from the CDS sequences under the GTR + F + R2 model in IQ-TREE 2 with 5000 ultrafast bootstrap (UFBoot) replicates [70,71,72].

Availability of data and materials

The datasets generated or analyzed during the current study are available in the NCBI BioProject (PRJNA994440 and PRJNA995179, SRA: SRR25256624 and SRR25293872).


  1. Chase MW, Cameron KM, Freudenstein JV, Pridgeon AM, Salazar G, van den Berg C, et al. An updated classification of Orchidaceae. Bot J Linn Soc. 2015;177:151–74.

    Article  Google Scholar 

  2. Dressler R. Phylogeny and Classification of Orchid Family. Cambridge: Cambridge University Press; 1993.

    Google Scholar 

  3. Kocyan A, de Vogel EF, Conti E, Gravendeel B. Molecular phylogeny of Aerides (Orchidaceae) based on one nuclear and two plastid markers: A step forward in understanding the evolution of the Aeridinae. Mol Phylogenet Evol. 2008;48:422–43.

    Article  CAS  PubMed  Google Scholar 

  4. Chen XQ, Wood JJ. Aerides Lour. In: Flora of China: Orchidaceae. Vol. 25. Beijing: Science Press; 2009. p. 485–6.

  5. Christenson EA. Nomenclatural Changes in the Orchidaceae Subtribe Sarcanthinae. Selbyana. 1986;9:167–70.

    Google Scholar 

  6. Fan J, Qin H-N, Li D-Z, Jin X-H. Molecular phylogeny and biogeography of Holcoglossum (Orchidaceae: Aeridinae) based on nuclear ITS, and chloroplast trnL-F and matK. Taxon. 2009;58:849–61.

    Article  Google Scholar 

  7. Pridgeon AM, Cribb PJ, Chase MW, Rasmussen FN. Genera Orchidacearum Volume 6: Epidendroideae (Part 3). Oxford, New York: Oxford University Press; 2014. p. 133–7.

    Google Scholar 

  8. Garay LA. On the Systematics of the Monopodial Orchids I. Bot Mus Leafl Harv Univ. 1972;23:149–212.

    Google Scholar 

  9. Garay LA. On the Systematics of the Monopodial Orchids II. Bot Mus Leafl Harv Univ. 1974;23:369–75.

    Google Scholar 

  10. Seidenfaden G. Orchid Genera in Thailand XIV: Fifty-nine Vandoid Genera. Copenhagen: Council for Nordic Publications in Botany; 1988.

    Google Scholar 

  11. Senghas K. 50. Subtribus: Aeridinae (‘Sarcanthinae’). In: Die Orchideen, 3rd edition, Vol. I/B. Berlin: Blackwell; 1996. p. 1131–422.

  12. Christenson EA, Saito K, Tanaka R. In: Proceedings of the 12th World Orchid Conference 1987. In: The taxonomy of Aerides and related genera. 1st ed edition. Tokyo: 12th World Orchid Conference Organizing Committee; 1987. p. 35–40.

    Google Scholar 

  13. Christenson EA. Taxonomy of the Aeridinae with an infrageneric classification of Vanda Jones ex R. Br. In: Proceedings of the 14th World Orchid Conference. Edinburgh: HMSO Publications; 1994. p. 206–16.

  14. Gardiner LM, Kocyan A, Motes M, Roberts DL, Emerson BC. Molecular phylogenetics of Vanda and related genera (Orchidaceae). Bot J Linn Soc. 2013;173:549–72.

  15. Topik H, Yukawa T, Ito M. Molecular phylogenetics of subtribe Aeridinae (Orchidaceae): insights from plastid matK and nuclear ribosomal ITS sequences. J Plant Res. 2005;118:271–84.

  16. Zou L-H, Huang J-X, Zhang G-Q, Liu Z-J, Zhuang X-Y. A molecular phylogeny of Aeridinae (Orchidaceae: Epidendroideae) inferred from multiple nuclear and chloroplast regions. Mol Phylogenet Evol. 2015;85:247–54.

  17. Han C, Ding R, Zong X, Zhang L, Chen X, Qu B. Structural characterization of Platanthera ussuriensis chloroplast genome and comparative analyses with other species of Orchidaceae. BMC Genomics. 2022;23:84.

  18. Jiang H, Tian J, Yang J, Dong X, Zhong Z, Mwachala G, et al. Comparative and phylogenetic analyses of six Kenya Polystachya (Orchidaceae) species based on the complete chloroplast genome sequences. BMC Plant Biol. 2022;22:177.

  19. Tao L, Duan H, Tao K, Luo Y, Li Q, Li L. Complete chloroplast genome structural characterization of two Phalaenopsis (Orchidaceae) species and comparative analysis with their alliance. BMC Genomics. 2023;24:359.

  20. Chen J, Wang F, Zhou C, Ahmad S, Zhou Y, Li M, et al. Comparative Phylogenetic Analysis for Aerides (Aeridinae, Orchidaceae) Based on Six Complete Plastid Genomes. Int J Mol Sci. 2023;24:12473.

  21. National Center for Biotechnology Information (NCBI)[Internet]. Bethesda (MD): National Library of Medicine (US), National Center for Biotechnology Information; [1988] – [cited 2024 Apr 09]. Available from:

  22. Biju VC, P.R. S, Vijayan S, Rajan VS, Sasi A, Janardhanan A, et al. The Complete Chloroplast Genome of Trichopus zeylanicus, And Phylogenetic Analysis with Dioscoreales. The Plant Genome. 2019;12:190032.

  23. Lin C-S, Chen JJW, Chiu C-C, Hsiao HCW, Yang C-J, Jin X-H, et al. Concomitant loss of NDH complex-related genes within chloroplast and nuclear genomes in some orchids. Plant J. 2017;90:994–1006.

  24. Lin C-S, Chen JJW, Huang Y-T, Chan M-T, Daniell H, Chang W-J, et al. The location and translocation of ndh genes of chloroplast origin in the Orchidaceae family. Sci Rep. 2015;5:9040.

  25. Liu D-K, Tu X-D, Zhao Z, Zeng M-Y, Zhang S, Ma L, et al. Plastid phylogenomic data yield new and robust insights into the phylogeny of Cleisostoma-Gastrochilus clades (Orchidaceae, Aeridinae). Mol Phylogenet Evol. 2020;145:106729.

  26. Hu S, Sablok G, Wang B, Qu D, Barbaro E, Viola R, et al. Plastome organization and evolution of chloroplast genes in Cardamine species adapted to contrasting habitats. BMC Genomics. 2015;16:306.

  27. Agrama HA, Tuinstra MR. Phylogenetic diversity and relationships among sorghum accessions using SSRs and RAPDs. Afr J Biotechnol. 2003;2:334–40.

  28. Li X, Zhao Y, Tu X, Li C, Zhu Y, Zhong H, et al. Comparative analysis of plastomes in Oxalidaceae: Phylogenetic relationships and potential molecular markers. Plant Diversity. 2021;43:281–91.

  29. Madhumati B. Potential and application of molecular markers techniques for plant genome analysis. International Journal of Pure & Applied Bioscience. 2014;2:169–88.

  30. Yang J, Zhu Z, Fan Y, Zhu F, Chen Y, Niu Z, et al. Comparative plastomic analysis of three Bulbophyllum medicinal plants and its significance in species identification. Acta Pharmaceutica Sinica. 2020;55:2736–45.

  31. Chen Y, Hu N, Wu H. Analyzing and Characterizing the Chloroplast Genome of Salix wilsonii. Biomed Res Int. 2019;2019:5190425.

  32. Khan A, Asaf S, Khan AL, Al-Harrasi A, Al-Sudairy O, AbdulKareem NM, et al. First complete chloroplast genomics and comparative phylogenetic analysis of Commiphora gileadensis and C. foliacea: Myrrh producing trees. PLOS ONE. 2019;14:e0208511.

  33. Singh RB, Mahenderakar MD, Jugran AK, Singh RK, Srivastava RK. Assessing genetic diversity and population structure of sugarcane cultivars, progenitor species and genera using microsatellite (SSR) markers. Gene. 2020;753:144800.

  34. Yu J, Dossa K, Wang L, Zhang Y, Wei X, Liao B, et al. PMDBase: a database for studying microsatellite DNA and marker development in plants. Nucleic Acids Res. 2017;45:D1046–53.

  35. Qiu S, Zeng K, Slotte T, Wright S, Charlesworth D. Reduced Efficacy of Natural Selection on Codon Usage Bias in Selfing Arabidopsis and Capsella Species. Genome Biol Evol. 2011;3:868–80.

  36. Shang M, Liu F, Hua J, Wang K. Analysis on codon usage of chloroplast genome of Gossypium hirsutum. Scientia Agricultura Sinica. 2011;44:245–53.

  37. Chen L, Liu T, Yang D, Nong X, Xie Y, Fu Y, et al. Analysis of codon usage patterns in Taenia pisiformis through annotated transcriptome data. Biochem Biophys Res Commun. 2013;430:1344–8.

  38. Alzahrani DA, Yaradua SS, Albokhari EJ, Abba A. Complete chloroplast genome sequence of Barleria prionitis, comparative chloroplast genomics and phylogenetic relationships among Acanthoideae. BMC Genomics. 2020;21:393.

  39. Dugas DV, Hernandez D, Koenen EJM, Schwarz E, Straub S, Hughes CE, et al. Mimosoid legume plastome evolution: IR expansion, tandem repeat expansions and accelerated rate of evolution in clpP. Sci Rep. 2015;5:16958.

  40. Raubeson LA, Peery R, Chumley TW, Dziubek C, Fourcade HM, Boore JL, et al. Comparative chloroplast genomics: analyses including new sequences from the angiosperms Nuphar advena and Ranunculus macranthus. BMC Genomics. 2007;8:174.

  41. Wang R-J, Cheng C-L, Chang C-C, Wu C-L, Su T-M, Chaw S-M. Dynamics and evolution of the inverted repeat-large single copy junctions in the chloroplast genomes of monocots. BMC Evol Biol. 2008;8:36.

  42. Liu H, Ye H, Zhang N, Ma J, Wang J, Hu G, et al. Comparative Analyses of Chloroplast Genomes Provide Comprehensive Insights into the Adaptive Evolution of Paphiopedilum (Orchidaceae). Horticulturae. 2022;8:391.

  43. Menezes APA, Resende-Moreira LC, Buzatti RSO, Nazareno AG, Carlsen M, Lobo FP, et al. Chloroplast genomes of Byrsonima species (Malpighiaceae): comparative analysis and screening of high divergence sequences. Sci Rep. 2018;8:2210.

  44. Shaw J, Shafer HL, Leonard OR, Kovach MJ, Schorr M, Morris AB. Chloroplast DNA sequence utility for the lowest phylogenetic and phylogeographic inferences in angiosperms: The tortoise and the hare IV. Am J Bot. 2014;101:1987–2004.

  45. Kryazhimskiy S, Plotkin JB. The population genetics of dN/dS. PLoS Genet. 2008;4:e1000304.

  46. Williams MJ, Zapata L, Werner B, Barnes CP, Sottoriva A, Graham TA. Measuring the distribution of fitness effects in somatic evolution by combining clonal dynamics with dN/dS ratios. eLife. 2020;9:e48714.

  47. Zuo L-H, Shang A-Q, Zhang S, Yu X-Y, Ren Y-C, Yang M-S, et al. The first complete chloroplast genome sequences of Ulmus species by de novo sequencing: Genome comparative and taxonomic position analysis. PLoS ONE. 2017;12:e0171264.

  48. Tang H, Tang L, Shao S, Peng Y, Li L, Luo Y. Chloroplast genomic diversity in Bulbophyllum section Macrocaulia (Orchidaceae, Epidendroideae, Malaxideae): Insights into species divergence and adaptive evolution. Plant Divers. 2021;43:350–61.

  49. Zhang G-Q, Liu K-W, Chen L-J, Xiao X-J, Zhai J-W, Li L-Q, et al. A New Molecular Phylogeny and a New Genus, Pendulorchis, of the Aerides-Vanda Alliance (Orchidaceae: Epidendroideae). PLoS ONE. 2013;8:e60097.

  50. Motes MR. Vandas: their botany, history, and culture. Portland, Or: Timber Press; 1997.

  51. Carlsward BS, Whitten WM, Williams NH, Bytebier B. Molecular phylogenetics of Vandeae (Orchidaceae) and the evolution of leaflessness. Am J Bot. 2006;93:770–86.

  52. Bobik K, Burch-Smith TM. Chloroplast signaling within, between and beyond cells. Front Plant Sci. 2015;6:781.

  53. Healey A, Furtado A, Cooper T, Henry RJ. Protocol: a simple method for extracting next-generation sequencing quality genomic DNA from recalcitrant plant species. Plant Methods. 2014;10:21.

  54. Chen S, Zhou Y, Chen Y, Gu J. fastp: an ultra-fast all-in-one FASTQ preprocessor. Bioinformatics. 2018;34:i884–90.

  55. Jin J-J, Yu W-B, Yang J-B, Song Y, dePamphilis CW, Yi T-S, et al. GetOrganelle: a fast and versatile toolkit for accurate de novo assembly of organelle genomes. Genome Biol. 2020;21:241.

  56. Kearse M, Moir R, Wilson A, Stones-Havas S, Cheung M, Sturrock S, et al. Geneious Basic: an integrated and extendable desktop software platform for the organization and analysis of sequence data. Bioinformatics. 2012;28:1647–9.

  57. Kurtz S, Choudhuri JV, Ohlebusch E, Schleiermacher C, Stoye J, Giegerich R. REPuter: the manifold applications of repeat analysis on a genomic scale. Nucleic Acids Res. 2001;29:4633–42.

  58. Beier S, Thiel T, Münch T, Scholz U, Mascher M. MISA-web: a web server for microsatellite prediction. Bioinformatics. 2017;33:2583–5.

  59. Thiel T, Michalek W, Varshney R, Graner A. Exploiting EST databases for the development and characterization of gene-derived SSR-markers in barley (Hordeum vulgare L.). Theor Appl Genet. 2003;106:411–22.

  60. Kumar S, Nei M, Dudley J, Tamura K. MEGA: A biologist-centric software for evolutionary analysis of DNA and protein sequences. Brief Bioinform. 2008;9:299–306.

  61. Bylaiah S, Shedole S, Suresh KP, Gowda L, Patil SS, Indrabalan UB. Analysis of Codon Usage Bias in Cya, Lef, and Pag Genes Exists in px01 Plasmid of Bacillus Anthracis. In: Fong S, Dey N, Joshi A, editors. ICT Analysis and Applications. Singapore: Springer Nature; 2022. p. 1–9.

  62. Xiang C-Y, Gao F, Jakovlić I, Lei H-P, Hu Y, Zhang H, et al. Using PhyloSuite for molecular phylogeny and tree-based analyses. iMeta. 2023;2:e87.

  63. Zhang D, Gao F, Jakovlić I, Zou H, Zhang J, Li WX, et al. PhyloSuite: An integrated and scalable desktop platform for streamlined molecular sequence data management and evolutionary phylogenetics studies. Mol Ecol Resour. 2020;20:348–55.

  64. Rice P, Longden I, Bleasby A. EMBOSS: the European Molecular Biology Open Software Suite. Trends Genet. 2000;16:276–7.

  65. Brudno M, Malde S, Poliakov A, Do CB, Couronne O, Dubchak I, et al. Glocal alignment: finding rearrangements during alignment. Bioinformatics. 2003;19(Suppl 1):i54–62.

  66. Li H, Guo Q, Xu L, Gao H, Liu L, Zhou X. CPJSdraw: analysis and visualization of junction sites of chloroplast genomes. PeerJ. 2023;11:e15326.

  67. Katoh K, Standley DM. MAFFT Multiple Sequence Alignment Software Version 7: Improvements in Performance and Usability. Mol Biol Evol. 2013;30:772–80.

  68. Gao F, Chen C, Arab DA, Du Z, He Y, Ho SYW. EasyCodeML: A visual tool for analysis of selection using CodeML. Ecol Evol. 2019;9:3891–8.

  69. Talavera G, Castresana J. Improvement of Phylogenies after Removing Divergent and Ambiguously Aligned Blocks from Protein Sequence Alignments. Syst Biol. 2007;56:564–77.

  70. Hoang DT, Chernomor O, von Haeseler A, Minh BQ, Vinh LS. UFBoot2: Improving the Ultrafast Bootstrap Approximation. Mol Biol Evol. 2018;35:518–22.

  71. Kalyaanamoorthy S, Minh BQ, Wong TKF, von Haeseler A, Jermiin LS. ModelFinder: fast model selection for accurate phylogenetic estimates. Nat Methods. 2017;14:587–9.

  72. Minh BQ, Schmidt HA, Chernomor O, Schrempf D, Woodhams MD, von Haeseler A, et al. IQ-TREE 2: New Models and Efficient Methods for Phylogenetic Inference in the Genomic Era. Mol Biol Evol. 2020;37:1530–4.


We thank Dr. Fei Zhao for suggestions and for revising the article and Associate Professor Yuxiao Zhang for providing the computer server.


This study was supported by the National Natural Science Foundation of China (NSFC 32060049).

Author information

Authors and Affiliations



K.T. and L.T. collaborated on the analysis and writing of this manuscript. Y.L. provided the material. J.H. and H.D. collected the material. L.L. undertook the formal identification of the plant material. L.L. and Y.L. contributed to the design and editing of this manuscript. All authors reviewed and approved the final manuscript.

Corresponding authors

Correspondence to Yan Luo or Lu Li.

Ethics declarations

Ethics approval and consent to participate

The study was conducted with plant material that complies with relevant institutional, national, and international guidelines and legislation. Aerides flabellata and A. rosea were cultivated in Xishuangbanna Tropical Botanical Garden, Chinese Academy of Sciences.

Consent for publication

Not applicable.

Competing interests

The authors declare no competing interests.

Additional information

Publisher's Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary Information

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Tao, K., Tao, L., Huang, J. et al. Complete chloroplast genome structural characterization of two Aerides (Orchidaceae) species with a focus on phylogenetic position of Aerides flabellata. BMC Genomics 25, 552 (2024).
