Skip to main content

Comparative genomics and transcriptomics analysis reveals evolution patterns of selection in the Salix phylogeny



Willows are widely distributed in the northern hemisphere and have good adaptability to different living environment. The increasing of genome and transcriptome data provides a chance for comparative analysis to study the evolution patterns with the different origin and geographical distributions in the Salix phylogeny.


Transcript sequences of 10 Salicaceae species were downloaded from public databases. All pairwise of orthologues were identified by comparative analysis in these species, from which we constructed a phylogenetic tree and estimated the rate of diverse. Divergence times were estimated in the 10 Salicaceae using comparative transcriptomic analysis. All of the fast-evolving positive selection sequences were identified, and some cold-, drought-, light-, universal-, and heat- resistance genes were discovered.


The divergence time of subgenus Vetrix and Salix was about 17.6–16.0 Mya during the period of Middle Miocene Climate Transition (21–14 Mya). Subgenus Vetrix diverged to migratory and resident groups when the climate changed to the cool and dry trend by 14 Mya. Cold- and light- stress genes were involved in positive selection among the resident Vetrix, and which would help them to adapt the cooling stage. Universal- stress genes exhibited positive selection among the migratory group and subgenus Salix. These data are useful for comprehending the adaptive evolution and speciation in the Salix lineage.


Willows (genus Salix) are widely distributed in the northern hemisphere, ranging around the North Temperate Zone, and are the most important source of wood in forests [1,2,3]. Salix is a large and complex genus with about 450–520 species [1,2,3,4], which is under the spotlight with the genome projects’ completion of Salix purpurea [5] and Salix suchowensis [6]. Many studies have focused on this genus, particularly with regard to its phylogenetic relationships [7,8,9,10,11,12,13,14,15], the timing of diversification events [13, 15,16,17,18] and environmental stress tolerance [19,20,21,22]. Unfortunately, there is still a lot of controversy over the origin and speciation, divergence time and evolution patterns in the genus Salix.

A widely used classification system was proposed by Skvortsov, which divided the genus Salix into three subgenera Salix, Vetrix and Chamaetia [1]. The evidence of morphological taxonomy suggests that the subgenus Vetrix has passed two stages in its development [1]. When the climate became colder [23], the thermophilic groups either became extinct or moved south (Southern China and Southeast Asia), like Section Eriostachyae, Daltonianae and Denticulatae et al. Thus the hardy ones stayed and drastically expanded their ranges. At the same time, another younger and hardier formation of the subgenus Salix expanded across the northern hemisphere being represented by a number of boreal sections. However, no study explains how willows went through the long-distance migrations and how the resident and migratory groups adapted to the varied environments from high to low latitudes during the long evolutionary history.

Transcriptome sequencing technology can rapidly and economically obtain all RNA information of organisms at one time, which playing an important role in finding molecular markers and function genes for biology research [24, 25]. As more and more species had been completed transcriptome sequencing, comparative transcriptomics has received more attention from researchers [26,27,28,29,30]. Comparative transcriptomics can explain the phylogenetic relationships based on multiple species, and answer the functional differences between orthologous genes after species divergence in different living environment.

In this study, transcript sequences of 9 Salix and one Populus [31] were downloaded from public databases (Table 1). Among them, S. matsudana and S. babylonica belong to subgenus Salix, other 7 willow species belong to subgenus Vetrix. Section Psilostigmatae (Fig. 1) named by Flora of China shares some species (like S. salwinensis) with migratory section Daltonianae, so S. fargesii of section Psilostigmatae is as the possible migratory species of Vetrix. Most species of Vetrix are mainly distributed in North China or further north areas except S. fargesii (Additional file 1: Table S1). Comparative genomics and transcriptomics were subsequently analyzed in 10 Salicaceae species. A number of positive selection genes were found to be related to environmental factors in the Salix phylogeny.

Table 1 Data source of 9 Salix and one populus species. Main geographic distribution of 9 Salix is shown in Additional file 1: Table S1
Fig. 1

Geographical distribution of main species in section Psilostigmatae. The original information was from Flora of China (Additional file 1: Table S2)


Transcript sequences of 10 Salicaceae species

The average number of transcripts was about 40,649 in 10 Salicaceae species (Table 2), and S. matsudana had the largest number of unigenes (70,671), while S. babylonica had the least (3586). There are respectively 36,948, 26,599 and 37,865 annotated genes in the genomes of P. trichocarpa, S. suchowensis and S. purpurea. And these genes made up a total of 37 Mb, 34 Mb and 44 Mb cDNA sequences with a mean size of 1052 bp, 1344 bp and 1208 bp. More than 17,911 (28.2%), 8572 (32.2%) and 10,534 (27.8%) cDNAs has the length of > = 1500 bp in P. trichocarpa, S. suchowensis and S. purpurea (Additional file 2: Figure S1.). By contrast, there are 47,753, 50,429, 36,191, 51,717, 70,617, 3586 and 45,719 unigenes in the transcriptomes of S. sachalinensis, S. dasyclados, S. viminalis, S. eriocephala, S. matsudana, S. babylonica and S. fargesii, which respectively made up a total of 29.1, 30.4, 30.2, 32.6, 54.0, 2.4 and 27.2 Mb sequences with a mean size of 638, 632, 874, 660, 802, 714 and 624 bp. And more than 77, 77, 61, 75, 62, 92 and 78% unigenes had the length of < 1000 bp in the transcriptomes of S. sachalinensis, S. dasyclados, S. viminalis, S. eriocephala, S. matsudana, S. babylonica and S. fargesii.

Table 2 Transcript sequences in 10 Salicaceae species

SSRs identified in 10 Salicaceae species

A total of 3002, 2247, 1818, 1876, 1868, 2181, 3268, 234, 2139 and 2690 distinct SSRs were identified in S. purpurea, S. suchowensis S. sachalinensis, S. dasyclados, S. viminalis, S. eriocephala, S. matsudana, S. babylonica, S. fargesii and P. trichocarpa, and the incidences of different repeat types were determined (Table 3). Among the different classes of SSRs, the tri-nucleotide repeats were the most abundant (83, 81, 80, 81, 80, 80, 83, 48, 80 and 82%) and di-nucleotides were the second type (9, 11, 12, 12, 12, 13, 10, 36, 12 and 9%). Among the di-nucleotide repeats, AG/CT type showed the largest number in 10 Salicaceae species. Among the tri-nucleotide repeats, AGG/CCT type showed the largest number in S. purpurea, S. suchowensis, S. sachalinensis, S. dasyclados, S. eriocephala and S. babylonica, but AGC/CTG in S. viminalis, S. matsudana and S. fargesii.

Table 3 SSRs identified in 10 Salicaceae species

Orthologue identification and functional characterization between 10 Salicaceae species

All of the pairwise orthologues were identified by comparative analysis between the 10 Salicaceae species (Table 4). The results showed that S. purpurea had the maximum average number (6597) of orthologous genes, whereas S. babylonica had the minimum average number (707). The highest number of orthologous genes (9713) was found between S. purpurea and S. suchowensis, while the lowest number (681) was found between S. babylonica and S. fargesii. 238 single copy orthologues were found in all 10 Salicaceae species (Fig. 2). The orthologues were annotated with GO terms (Additional file 1: Table S3). Taking P. trichocarpa as an out-group species, the phylogenetic tree of Salix was constructed based on combined 238 orthologous transcripts using Maximum Likelihood (ML) method (Fig. 4).

Table 4 Number and Ks peaks of orthologous genes in 10 Salicaceae species
Fig. 2

Functional annotation and divergence between orthologues of 9 Salix and one Populus species. The heat map is based on the 238 putatively orthologous transcripts of 10 species (Named orth1-orth238). The orthologues were annotated to different function with Gene Ontology (GO) (Additional file 1: Table S3). The sequences of 238 orthologues are shown in Additional file 3

Phylogenetic analysis and divergence time

The genetic distance of species was related to synonymous mutation rate calculated by orthologous genes, so the synonymous mutation rates of all pairs of orthologues were estimated in 10 Salicaceae species (Table 4). Between different branches (Fig. 3), S. purpurea has the Ks peak (0.02) with S. suchowensis, S. sachalinensis and S. dasyclados, 0.03 Ks peak with S. viminalis, S. eriocephala and S. fargesii, 0.04 Ks peak with S. matsudana, 0.05 Ks peak with S. babylonica, and the maximum Ks peak 0.11 with out-group P. trichocarpa. Between different genera, most of Salix species has the Ks peak 0.11, whereas S. fargesii was found the minimum Ks peak 0.10 with P. trichocarpa (Table 4). It is suggested S. fargesii was a relatively ancient species compared to others.

Fig. 3

Distribution of Ks values of orthologous pairs between S. purpurea and others

Using P. trichocarpa as an out-group species, the phylogenetic tree of Salix was derived with the pairwise Ks values of the orthologous transcripts as a distance metric based on the neighbour-joining (NJ) method (Fig. 4). In the phylogenetic tree, the average Ks value is 0.11 between Genus Salix and Genus Salix (Calculated by Table 4), and which is nearly consistent with the value of 0.12 in previous studies [18]. Based on existing fossil evidence, the divergence time of genera Salix and Populus was about 48 million years old in middle Eocene sediments [16, 17]. With this time as the separation of the two lineages and K = 0.11, the rate of substitution (r) was calculated to about 1.14 * 10− 9 per site and year (T = K/2r), and which is very close to previous value of 1.28 * 10− 9 [18].

Fig. 4

Phylogenetic tree of 9 Salix and one Populus species. a NJ tree. Phylogram derived using pairwise synonymous substitution rates of orthologous transcripts as a distance metric (not from multiple sequence alignments) and the neighbour-joining method. b ML tree. Phylogram derived using 238 single copy genes and the maximum-likelihood method. Original trees were shown in Additional file 4: Figure S2

Using the fossil calibrations (48 Mya) of genera Salix and Populus [16, 17], the divergence times were estimated based on the 238 single copy genes and pairwise Ks distance metric of the orthologous transcripts (Table 4). The divergence of subgenus Vetrix and Salix occurred at about 17.6–16.0 Mya in the Salix phylogeny, and S. fargesii diverged at about 10.9–10.6 Mya with other species of subgenus Vetrix (Fig. 4). There were still some inconsistencies on the divergence time of subgenus Vetrix and Salix based on nuclear and plastome genes in previous studies [13, 15]. The time of 17.6–16.0 Mya between subgenus Vetrix and Salix supports the value of 16.9 Mya estimated by complete plastome genomes.

Evolutionary pattern of Salix spp. genes

Ka/Ks rate of orthologous genes could reflect the evolution pattern of species. Ka/Ks > 1 indicates that the gene has involved in positive selection during evolution.

In the Salix phylogeny (Table 5), stress genes producing Glutathione S-transferase protein were generally found to be involved in positive selection between S. purpurea and S. sachalinensis, S. purpurea and S. dasyclados, S. sachalinensis and S. dasyclados, S. viminalis and S. purpurea, S. viminalis and S. dasyclados, S. eriocephala and S. viminalis, S. eriocephala and S. dasyclados, S. matsudana and S. suchowensis, S. matsudana and S. fargesii. Glutathione S-transferase protein could induce multiple stresses of cold-, drought-, salt- and oxidation- [32,33,34].

Table 5 Number and function annotation of positive selection genes in Genus Salix

In subgenus Vetrix except S. fargesii (Table 5), S. dasyclados was identified 454, 306, 267 and 289 positive selection genes with the spepies of subgenus Vetrix, S. purpurea, S. suchowensis, S. sachalinensis and S. eriocephala. Between them, cold- stress genes were found to be annotated to NP_190879.1, AAM23265.1, NP_849749.1 and AAN77157.1 (Additional file 1: Table S4), which producing the proteins of P-loop NTPases [35], L-asparaginase [36], HOS10 with Myb damain [37] and thylakoid-bound ascorbate peroxidase [38]. 254 positive selection genes were identified between S. sachalinensis and S. viminalis, and one light-stress gene (NP_565524.1) was found by producing the SEP protein [39].

In subgenus Salix and S. fargesii (Table 5), 257 and 36 positive selection genes were identified between S. matsudana and S. fargesii, S. matsudana and S. babylonica. Universal-stress genes (NP_001132550.1 and NP_001132238.1) were wildly found between them by producing universal stress protein. Universal stress protein could induce by many environmental stressors such as nutrient starvation, drought, extreme temperatures, high salinity, and the presence of uncouplers, antibiotics and metals [40,41,42].


Salix phylogeny derived by comparative genomics and transcriptomics

In previous studies, Salix phylogenies were usually derived by several nuclear and plastid markers [7,8,9,10,11,12,13,14,15]. Different markers always obtained different phylogenetic trees. Comparative genomics and transcriptomics could make use of more and more nuclear sequences. Phylograms were derived using two methods in this work. 238 single copy genes were strictly selected to construct the phylogenetic tree by maximum- likelihood method, which used most of the sites. Another method is based on the neighbor-joining method using the pairwise Ks values of the orthologous transcripts as a distance metric, which used most of the orthologous. The divergence times of subgenus Vetrix and Salix estimated by two methods were consistent with the value by complete plastome genomes [15]. It is improved that enough single copy nuclear sequences should obtain similar results with enough plastid sequences in the Salix phylogeny.

Paleoclimate change in the divergence of Salix phylogeny

The divergence time of genera Salix and Populus was about 48 Mya at the period of Paleogene (66-23Mya) [43,44,45]. During the Paleogene, the global climate went against the hot and humid conditions of the late Mesozoic era and began a cooling and drying trend [23]. As the Earth cooled, tropical plants were restricted to equatorial regions and became less numerous. Deciduous plants became more common which could survive through the seasonal climates, during which Salix and Populus diverged.

Miocene (23 - 5Mya) is the main period in the divergence of Salix phylogeny (Fig. 4). During the period, there is evidence of a warm period from 21 Mya to 14 Mya named as the Middle Miocene Climate Transition (MMCT) [23], and the rare pleasant environment might cause the species diversity. The divergence time of subgenus Vetrix and Salix was about 17.6–16.0 Mya corresponding to the period of MMCT. Then global temperatures took a drop and some species were extinct by 14Mya [46,47,48], so the north subgenus Vetrix needed to migrate or adapt in order to survive. One group with S. fargesii diverged from subgenus Vetrix and migrated to south. The resident group of Vetrix had to adapt the cold and drought climate. By 8 Mya, the climate sharply cooled and formed the Quaternary Ice Age (2.6–0.1Mya) [49]. The climate change from MMCT to Quaternary Ice Age should play an important role in the divergence of Salix phylogeny.

Universal- stress genes and migration of S. fargesii

The divergence time of S. fargesii (section Psilostigmatae) was about 10.9–10.6 Mya after the MMCT (21–14 Mya), and in which period the climate changed to be cooling. Wind and animal pollination had been proved to play an important role in the spread of willows [50,51,52]. Section Psilostigmatae are mainly distributed along the Changjiang (Yangtze) [53], Laantsang and Nujiang river of China (Fig. 1). The river provided the feasibility for the migration of the willow catkins by animal or other pollination, which is consistent with the distribution of Section Psilostigmatae. In previous studies, it was shown that the evolutionary history of the salix has involved multiple reticulation events that may mainly be due to hybridization [13]. Migration of S. fargesii maybe provided the possibility for the hybridization of Salix.

Our result shows that universal- stress genes were identified to be involved in positive selection between S. fargesii and subgenus Salix. It is suggested that selective evolution of universal- stress gene should play an import role in migrating to south for S. fargesii. When S. fargesii moved to south, universal- stress gene can help to produce the abiotic resistances to adapt the complex environment of south areas.

Cold-, light- stress genes and the north resident group of sub genus Vetrix

After the MMCT (21–14 Mya), the global climate came back the cooling and drying trend [23]. In previous studies, it was shown that the evolutionary history of the Salix maybe affected by the profound climatic cooling during the Tertiary [13]. The resident group of section Vetrix had to adapt to the cooling stage especially the high-latitude. Cold-, light- stress genes were widely identified to be involved to positive evolution among S. purpurea, S. suchowensis, S. sachalinensis, S. eriocephala and S. dasyclados. It is suggested that the cool and dry climate had played an important role in the speciation of north group of Section Vetrix.


In this study, we completed the comparative analysis based on genomic and transcriptomic sequences of 9 Salix and one populus species. All pairwise of orthologues were identified in these species, from which we constructed a phylogenetic tree and estimated the rate of diverse. The divergence times were estimated by the comparative analysis, and which suggested the speciation of Salix was involved in the period from MMCT (21–14 Mya) to Quaternary Ice Age (2.6–0.1Mya). The warm climate of MMCT might cause the divergence of subgenus Vetrix and Salix. Then global temperatures came back to the cool and dry trend by 14 Mya, so willows needed to migrate or adapt in order to survive. The phylogenetic relationship and geography distribution suggest that section Psilostigmatae might migrate from north to south by the Changjiang, Laantsang and Nujiang river of China. Universal- stress genes were involved in positive evolution and could help them to adapt to the south complex environment. Cold- and light- stress genes were identified to be involved in positive evolution among the resident Vetrix. It is suggested the resident Vetrix had to adapt to the cool and dry environment in order to survive. The study shows that the paleoclimate change and selective evolution had played an important role in the divergence of Salix phylogeny.


Data sources

In order to discover the evolutionary pattern of orthologues, the cDNAs and transcripts of 9 Salix and one Populus (out-group) were downloaded from the public databases (Table 1). The cDNAs of P. trichocarpa (v3.1), Salix purpurea (v1.0) and S. suchowensis were directly derived from the JGI [5] and willow genome project of NJFU ( ). Transcriptome sequencing of S. sachalinensis, S. dasyclados, S. viminalis, S. eriocephala, S. matsudana, S. babylonica and S. fargesii were obtained from the SRA database of NCBI. Geographic distributions of section Psilostigmatae was draw by ArcGIS based on Flora of China ( ) (Additional file 1: Table S2).

Data filtering and de novo assembly

SRA datasets with FASTQ format were filtered to remove raw reads of low quality. Transcriptome assembly was achieved using the short-read assembly program Trinity [54]. The assembled sequences (> = 300 bp) were combined and clustered with CD-HIT (version 4.0) [55, 56]. Sequences with similarity > 95% were divided into one class, and the longest sequence of each class was treated as a unigene during later processing.

Identification of SSRs in 10 Salicaceae species

Putative SSRs in Unigenes and cDNAs were identified by MISA software. The options of Di- to hexa-nucleotide SSRs were set to 6 (for di-), 5 (for tri- and tetra-), and 4 (for penta- and hexa-), and all SSRs were characterized in 10 Salicaceae species.

Identification of orthologues among 10 Salicaceae species

OrthoMCL software [57] was used to cluster the transcribed sequences. Based on the proteins of Salix purpurea as reference, one-to-one sequences of each group were then filtered to use in subsequent analyses. The annotations obtained from Nr were processed through the BLAST2GO program [58] to get the relevant GO terms. Heatmap of orthologues was draw by R language.

Estimation of synonymous substitution and non-synonymous substitution rates

In order to remove the unigenes without open reading frames, pair-wise orthologues were searched against plant protein sequences of GenBank with BLASTX tool. The method has been used in previous studies [59]. Clustalw software [60] was used to align the filtered pair-wise orthologues, and the output files were formatted to NUC format for subsequent analysis. The rates of synonymous substitutions (Ka) and non-synonymous substitutions (Ks) were estimated with PAML software [61].

Phylogenetic analysis

There were still some inconsistencies on phylogenetic relationship in previous studies. Phylograms were derived using two methods in this work. Single copy genes by orthoMCL were aligned by Muscle [62] and formated by Gblock [63], maximum- likelihood method was used to build the phylogenetic tree by MEGA6 [64] (bootstrap is 1000 and Kimura 2-parameter model). Another method is based on the neighbor-joining method of MEGA6 [64] using the pairwise Ks values of the orthologous transcripts as a distance metric (Table 4). Populus trichocarpa was used as an out-group to root trees.



Clusters of orthologous groups; SSR: simple sequence repeat


Gene Ontology


Non-synonymous substitutions per non-synonymous site


Synonymous substitutions per synonymous site


Maximum- likelihood


Middle Miocene Climate Transition


Million years ago




  1. 1.

    Skvortsov AK. Willows of Russia and adjacent countries. Taxonomical and geographical revision. 1999.

  2. 2.

    Argus G. Salix. In: Flora of North America North of Mexico. 2010. doi:citeulike-article-id:13509198.

  3. 3.

    Fang Z, Zhao S, Skvortsov AK. Salicaceae. In: Wu ZY, raven PH, Hong DY, editors. Flora of China. Beijing & St. Louis: Science Press & Missouri Botanical Garden Press; 1999.

    Google Scholar 

  4. 4.

    Ding T. Origin, divergence and geographical distribution of Salicaceae. [Chinese]. Acta Bot Yunnanica. 1995;17:277–90.

    Google Scholar 

  5. 5.

    Nordberg H, Cantor M, Dusheyko S, Hua S, Poliakov A, Shabalov I, et al. The genome portal of the Department of Energy Joint Genome Institute: 2014 updates. Nucleic Acids Res. 2014;42.

  6. 6.

    Dai X, Hu Q, Cai Q, Feng K, Ye N, Tuskan GA, et al. The willow genome and divergent evolution from poplar after the common genome duplication. Cell Res. 2014;24:1274–7.

    CAS  Article  Google Scholar 

  7. 7.

    Abdollahzadeh A, Osaloo SK, Maassoumi A. Molecular phylogeny of the genus Salix (Salicaceae) with an emphasize to its species in Iran. Iran J bot. 2011;17:245–53.

    Google Scholar 

  8. 8.

    Azuma T, Kajita T, Yokoyama J, Ohashi H. Phylogenetic relationships of Salix (Salicaceae) based on rbcL sequence data. Am J Bot. 2000;87:67–75.

    CAS  Article  Google Scholar 

  9. 9.

    Hardig TM, Anttila CK, Brunsfeld SJ. A phylogenetic analysis of Salix (Salicaceae) based on matK and ribosomal DNA sequence data. J Bot. 2010;2010:1–12.

    CAS  Article  Google Scholar 

  10. 10.

    Chen JH, Sun H, Wen J, Yang YP. Molecular phylogeny of Salix L. (Salicaceae) inferred from three chloroplast datasets and its systematic implications. Taxon. 2010;59:29–37.

    Article  Google Scholar 

  11. 11.

    Lauron-Moreau A, Pitre FE, Argus GW, Labrecque M, Brouillet L. Phylogenetic relationships of American willows (Salix L., Salicaceae). PLoS One. 2015;10(4):e0121965.

    Article  Google Scholar 

  12. 12.

    Du SH, Wang ZS, Li YX, Wang DS, Zhang JG. Consistency between molecular phylogeny and morphological classification of the Salix matsudana koidz. Complex (Salicaceae). Genet Mol Res. 2015;14:8663–71.

    CAS  Article  Google Scholar 

  13. 13.

    Wu J, Nyman T, Wang DC, Argus GW, Yang YP, Chen JH. Phylogeny of Salix subgenus Salix s.L. (Salicaceae): delimitation, biogeography, and reticulate evolution neuromuscular disorders and peripheral neurology. BMC Evol Biol. 2015;15(1):31.

    Article  Google Scholar 

  14. 14.

    Huang Y, Wang J, Yang Y, Fan C, Chen J. Phylogenomic analysis and dynamic evolution of chloroplast genomes in salicaceae. Front Plant Sci. 2017;8:1050.

    Article  Google Scholar 

  15. 15.

    Zhang L, Xi Z, Wang M, Guo X, Ma T. Plastome phylogeny and lineage diversification of Salicaceae with focus on poplars and willows. Ecol Evol. 2018; 8(16): 7817–7823.

  16. 16.

    Boucher LD, Manchester SR, Judd WS. An extinct genus of Salicaceae based on twigs with attached flowers, fruits, and foliage from the Eocene Green River formation of Utah and Colorado. USA Am J Bot. 2003;90:1389–99.

    Article  Google Scholar 

  17. 17.

    Manchester SR, Judd WS, Handley B. Foliage and fruits of early poplars (Salicaceae: Populus ) from the Eocene of Utah, Colorado, and Wyoming. Int J Plant Sci. 2006;167:897–908.

    Article  Google Scholar 

  18. 18.

    Berlin S, Lagercrantz U, von Arnold S, Öst T, Rönnberg-Wästljung AC. High-density linkage mapping and evolution of paralogs and orthologs in Salix and Populus. BMC Genomics. 2010;11:129.

    Article  Google Scholar 

  19. 19.

    Brereton NJB, Gonzalez E, Marleau J, Nissim WG, Labrecque M, Joly S, et al. Comparative transcriptomic approaches exploring contamination stress tolerance in Salix sp. reveal the importance for a Metaorganismal de novo assembly approach for nonmodel plants. Plant Physiol. 2016;171:3–24.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  20. 20.

    Rao G, Sui J, Zeng Y, He C, Duan A, Zhang J. De novo transcriptome and small RNA analysis of two Chinese willow cultivars reveals stress response genes in Salix Matsudana. PLoS One. 2014;9:e109122.

    Article  Google Scholar 

  21. 21.

    Song X, Fang J, Han X, He X, Liu M, Hu J, et al. Overexpression of quinone reductase from Salix matsudana Koidz enhances salt tolerance in transgenic Arabidopsis thaliana. Gene. 2016;576:520–7.

    CAS  Article  Google Scholar 

  22. 22.

    Yanitch A, Brereton NJB, Gonzalez E, Labrecque M, Joly S, Pitre FE. Transcriptomic response of purple willow (Salix purpurea) to arsenic stress. Front Plant Sci. 2017;8.

  23. 23.

    Zachos J, Pagani H, Sloan L, Thomas E, Billups K. Trends, rhythms, and aberrations in global climate 65 Ma to present. Science. 2001;292:686–93.

    CAS  Article  Google Scholar 

  24. 24.

    Jia H, Yang H, Sun P, Li J, Zhang J, Guo Y, et al. De novo transcriptome assembly, development of EST-SSR markers and population genetic analyses for the desert biomass willow. Salix psammophila Sci Rep. 2016;6:39591.

    CAS  Article  Google Scholar 

  25. 25.

    Liu J, Yin T, Ye N, Chen Y, Yin T, Liu M, et al. Transcriptome analysis of the differentially expressed genes in the male and female shrub willows (Salix suchowensis). PLoS One. 2013;8:e60181.

    CAS  Article  Google Scholar 

  26. 26.

    Niu SH, Li ZX, Yuan HW, Chen XY, Li Y, Li W. Transcriptome characterisation of Pinus tabuliformis and evolution of genes in the Pinus phylogeny. BMC Genomics. 2013;14:263.

    CAS  Article  Google Scholar 

  27. 27.

    Koenig D, Jimenez-Gomez JM, Kimura S, Fulop D, Chitwood DH, Headland LR, et al. Comparative transcriptomics reveals patterns of selection in domesticated and wild tomato. Proc Natl Acad Sci. 2013;110:E2655–62.

    CAS  Article  PubMed  Google Scholar 

  28. 28.

    Davidson RM, Gowda M, Moghe G, Lin H, Vaillancourt B, Shiu SH, et al. Comparative transcriptomics of three Poaceae species reveals patterns of gene expression evolution. Plant J. 2012;71:492–502.

    CAS  PubMed  Google Scholar 

  29. 29.

    Liu T, Tang S, Zhu S, Tang Q, Zheng X. Transcriptome comparison reveals the patterns of selection in domesticated and wild ramie (Boehmeria nivea L. gaud). Plant Mol Biol. 2014;86:85–92.

    CAS  Article  Google Scholar 

  30. 30.

    Zhao Y, Cao Y, Wang J, Xiong Z. Transcriptome sequencing of Pinus kesiya var langbianensis and comparative analysis in the Pinus phylogeny. BMC Genomics. 2018;19:725.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  31. 31.

    Tuskan GA, DiFazio S, Jansson S, Bohlmann J, Grigoriev I, Hellsten U, et al. The genome of black cottonwood, Populus trichocarpa (Torr. & Gray). Science (80- ). 2006;313:1596–604.

    CAS  Article  Google Scholar 

  32. 32.

    Hossain MA, Mostofa MG, Fujita M. Cross Protection by Cold-shock to Salinity and Drought Stress-induced Oxidative Stress in Mustard (Brassica campestris L.) Seedlings. Mol Plant Breed. 2013.

  33. 33.

    Li Y, Yan M, Yang J, Raman I, Du Y, Min S, et al. Glutathione S-transferase mu 2-transduced mesenchymal stem cells ameliorated anti-glomerular basement membrane antibody-induced glomerulonephritis by inhibiting oxidation and inflammation. Stem Cell Res Ther. 2014;5(1):19.

    Article  Google Scholar 

  34. 34.

    Chu G, Li Y, Dong X, Liu J, Zhao Y. Role of NSD1 in H2O2-induced GSTM3 suppression. Cell Signal. 2014;26:2757–64.

    CAS  Article  Google Scholar 

  35. 35.

    Deng S, Sun J, Zhao R, Ding M, Zhang Y, Sun Y, et al. Populus euphratica APYRASE2 enhances cold tolerance by modulating vesicular trafficking and extracellular ATP in Arabidopsis plants. Plant Physiol. 2015;169:530–48.

    CAS  Article  PubMed  PubMed Central  Google Scholar 

  36. 36.

    Cho CW, Lee HJ, Chung E, Kim KM, Heo JE, Kim JI, et al. Molecular characterization of the soybean L-asparaginase gene induced by low temperature stress. Mol Cells. 2007;23:280–6

    CAS  PubMed  Google Scholar 

  37. 37.

    Meng D, Yu X, Ma L, Hu J, Liang Y, Liu X, et al. Transcriptomic response of Chinese yew (Taxus chinensis) to cold stress. Front Plant Sci. 2017;8.

  38. 38.

    van Buer J, Cvetkovic J, Baier M. Cold regulation of plastid ascorbate peroxidases serves as a priming hub controlling ROS signaling in Arabidopsis thaliana. BMC Plant Biol. 2016;16(1):163.

    Article  Google Scholar 

  39. 39.

    Heddad M, Adamska I. Light stress-regulated two-helix proteins in Arabidopsis thaliana related to the chlorophyll a/b-binding gene family. Proc Natl Acad Sci U S A. 2000;97:3741–6.

    CAS  Article  Google Scholar 

  40. 40.

    Siegele DA. Universal stress proteins in Escherichia coli. J Bacteriol. 2005;187:6253–4.

    CAS  Article  Google Scholar 

  41. 41.

    Tkaczuk KL, Shumilin IA, Chruszcz M, Evdokimova E, Savchenko A, Minor W. Structural and functional insight into the universal stress protein family. Evol Appl. 2013;6:434–49.

    CAS  Article  Google Scholar 

  42. 42.

    O’Connor A, McClean S. The role of universal stress proteins in bacterial infections. Curr Med Chem. 2017;24.

  43. 43.

    Naafs BDA, Rohrssen M, Inglis GN, Lähteenoja O, Feakins SJ, Collinson ME, et al. High temperatures in the terrestrial mid-latitudes during the early Palaeogene. Nature Geoscience. 2018;11(10):–766.

  44. 44.

    Jenkyns HC, Weedon GP. Evidence for rapid climate change in the Mesozoic-Palaeogene greenhouse world. In: Philosophical Transactions of the Royal Society A: Mathematical, Physical and Engineering Sciences. 2003. p. 1885–916.

  45. 45.

    Contreras L, Pross J, Bijl PK, O’Hara RB, Raine JI, Sluijs A, et al. Southern high-latitude terrestrial climate change during the Palaeocene-Eocene derived from a marine pollen record (ODP site 1172, East Tasman plateau). Clim Past. 2014;10:1401–20.

    Article  Google Scholar 

  46. 46.

    Böhme M. The Miocene climatic optimum: evidence from ectothermic vertebrates of Central Europe. Palaeogeogr Palaeoclimatol Palaeoecol. 2003;195:389–401.

    Article  Google Scholar 

  47. 47.

    Herold N, You Y, Müller RD, Seton M. Climate model sensitivity to changes in Miocene paleotopography. Aust J Earth Sci. 2009;56:1049–59.

    CAS  Article  Google Scholar 

  48. 48.

    Pound MJ, Haywood AM, Salzmann U, Riding JB. Global vegetation dynamics and latitudinal temperature gradients during the mid to Late Miocene (15.97-5.33Ma). Earth Sci Rev. 2012;112:1–22.

    Article  Google Scholar 

  49. 49.

    Gibbard PL, Woodcock NH. The Quaternary: History of an Ice Age. In: Geological History of Britain and Ireland. 2nd ed; 2012. p. 409–28.

    Google Scholar 

  50. 50.

    Vroege PW, Stelleman P. Insect and wind pollination in Saux Repens L. and Saux Caprea L. Isr J Bot. 1990;39:1613–9.

    Google Scholar 

  51. 51.

    Peeters L, Totland Ø. Wind to insect pollination ratios and floral traits in five alpine Salix species. Can J Bot. 1999;77:556–63.

    Article  Google Scholar 

  52. 52.

    Tamura S, Kudo G. Wind pollination and insect pollination of two temperate willow species, Salix miyabeana and Salix sachalinensis. Plant Ecol. 2000;147:185–92.

    Article  Google Scholar 

  53. 53.

    Zheng H, Clift PD, Wang P, Tada R, Jia J, He M, et al. Pre-Miocene birth of the Yangtze River. Proc Natl Acad Sci. 2013;110:7556–61.

    Article  PubMed  Google Scholar 

  54. 54.

    Grabherr MG, Haas BJ, Yassour M, Levin JZ, Thompson DA, Amit I, et al. Full-length transcriptome assembly from RNA-Seq data without a reference genome. Nat Biotechnol. 2011;29:644–52.

    CAS  Article  Google Scholar 

  55. 55.

    Huang Y, Niu B, Gao Y, Fu L, Li W. CD-HIT suite: a web server for clustering and comparing biological sequences. Bioinformatics. 2010;26:680–2.

    CAS  Article  Google Scholar 

  56. 56.

    Li W, Godzik A. Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006;22:1658–9.

    CAS  Article  Google Scholar 

  57. 57.

    Li L, Stoeckert CJ, Roos DS. OrthoMCL: identification of ortholog groups for eukaryotic genomes. Genome Res. 2003;13:2178–89.

    CAS  Article  Google Scholar 

  58. 58.

    Conesa A, Götz S, García-Gómez JM, Terol J, Talón M, Robles M. Blast2GO: a universal tool for annotation, visualization and analysis in functional genomics research. Bioinformatics. 2005;21:3674–6.

    CAS  Article  Google Scholar 

  59. 59.

    Blanc G. Widespread Paleopolyploidy in model plant species inferred from age distributions of duplicate genes. PLANT CELL ONLINE. 2004;16:1667–78.

    CAS  Article  Google Scholar 

  60. 60.

    Larkin M, Blackshields G, Brown N, Chenna R, McGettigan P, McWilliam H, et al. ClustalW and ClustalX version 2. Bioinformatics. 2007;23:2947–8.

    CAS  Article  Google Scholar 

  61. 61.

    Yang Z. PAML 4: phylogenetic analysis by maximum likelihood. Mol Biol Evol. 2007;24:1586–91.

    CAS  Article  Google Scholar 

  62. 62.

    Edgar RC. MUSCLE: multiple sequence alignment with high accuracy and high throughput. Nucleic Acids Res. 2004;32:1792–7.

    CAS  Article  Google Scholar 

  63. 63.

    Castresana J. Selection of conserved blocks from multiple alignments for their use in phylogenetic analysis. Mol Biol Evol. 2000;17:540–52.

    CAS  Article  Google Scholar 

  64. 64.

    Tamura K, Stecher G, Peterson D, Filipski A, Kumar S. MEGA6: molecular evolutionary genetics analysis version 6.0. Mol Biol Evol. 2013;30:2725–9.

    CAS  Article  Google Scholar 

Download references


This work was supported by Cloud center of Ali of Southwest Forestry University.


This work was supported by Project of National Natural Science Foundation (61702442 and 61462095).

Availability of data and materials

The raw data of transcriptomes in this study were downloaded from the NCBI Sequence Read Archive (SRA) under the accession number ERR2040399, ERR2040396, ERR1558648, ERR2040397, SRR1086819, SRR1045959 and ERR2040401. The cDNA data of genomes were directly derived the JGI ( ) and NJFU ( ).

Author information




YJZ participated in design of the study and drafted the manuscript. XYL and KRH prepared the tables and Figs. YC and RG participated in the comparative analysis and performed the statistical analysis. YC and FD conceived of the study, and helped to revise the manuscript. All authors read and approved the final manuscript.

Corresponding authors

Correspondence to Yong Cao or Fei Dai.

Ethics declarations

Ethics approval and consent to participate

Not applicable.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Additional files

Additional file 1:

Table S1. Main geographic distributions of 9 Salix used in this work. Table S2. Geographic distributions of main species in Sect. Psilostigmatae. Table S3. GO annotation of shared orthologues in 10 Salicaceae species. Table S4. Information of resistance genes involved in positive selection in Genus Salix. (XLS 60 kb)

Additional file 2:

Figure S1. Length distribution of transcripts in 10 Salicaceae species. (TIF 2038 kb)

Additional file 3:

Sequences of shared orthologues in 10 Salicaceae species. (FA 1803 kb)

Additional file 4:

Figure S2. Original tree of NJ and ML methods. (TIF 1481 kb)

Additional file 5:

Sequences of resistance genes in Genus Salix. (FA 26 kb)

Rights and permissions

Open Access This article is distributed under the terms of the Creative Commons Attribution 4.0 International License (, which permits unrestricted use, distribution, and reproduction in any medium, provided you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The Creative Commons Public Domain Dedication waiver ( applies to the data made available in this article, unless otherwise stated.

Reprints and Permissions

About this article

Verify currency and authenticity via CrossMark

Cite this article

Zhao, Y., Liu, X., Guo, R. et al. Comparative genomics and transcriptomics analysis reveals evolution patterns of selection in the Salix phylogeny. BMC Genomics 20, 253 (2019).

Download citation


  • Salix phylogeny
  • Species migration
  • Comparative transcriptomics
  • Resistance gene
  • Selective evolution