Cis-regulatory signatures of orthologous stress-associated bZIP transcription factors from rice, sorghum and Arabidopsis based on phylogenetic footprints
© Xu et al.; licensee BioMed Central Ltd. 2012
Received: 2 July 2012
Accepted: 14 September 2012
Published: 20 September 2012
The potential contribution of upstream sequence variation to the unique features of orthologous genes is just beginning to be unraveled. A core subset of stress-associated bZIP transcription factors from rice (Oryza sativa) formed ten clusters of orthologous groups (COG) with genes from the monocot sorghum (Sorghum bicolor) and dicot Arabidopsis (Arabidopsis thaliana). The total cis-regulatory information content of each stress-associated COG was examined by phylogenetic footprinting to reveal ortholog-specific, lineage-specific and species-specific conservation patterns.
The most apparent pattern observed was the occurrence of spatially conserved ‘core modules’ among the COGs but not among paralogs. These core modules are comprised of various combinations of two to four putative transcription factor binding site (TFBS) classes associated with either developmental or stress-related functions. Outside the core modules are specific stress (ABA, oxidative, abiotic, biotic) or organ-associated signals, which may be functioning as ‘regulatory fine-tuners’ and further define lineage-specific and species-specific cis-regulatory signatures. Orthologous monocot and dicot promoters have distinct TFBS classes involved in disease and oxidative-regulated expression, while the orthologous rice and sorghum promoters have distinct combinations of root-specific signals, a pattern that is not particularly conserved in Arabidopsis.
Patterns of cis-regulatory conservation imply that each ortholog has distinct signatures, further suggesting that they are potentially unique in a regulatory context despite the presumed conservation of broad biological function during speciation. Based on the observed patterns of conservation, we postulate that core modules are likely primary determinants of basal developmental programming, which may be integrated with and further elaborated by additional intrinsic or extrinsic signals in conjunction with lineage-specific or species-specific regulatory fine-tuners. This synergy may be critical for finer-scale spatio-temporal regulation, hence unique expression profiles of homologous transcription factors from different species with distinct zones of ecological adaptation such as rice, sorghum and Arabidopsis. The patterns revealed from these comparisons set the stage for further empirical validation by functional genomics.
Keywords: Orthologs, Cis-elements, Stress, bZIP transcription factor, Phylogenetic footprinting
Regulatory transcription factors are among the major forces that drive the evolution of multicellular complexity. As such, they represent a group of highly conserved network hubs that directly link gene expression programs to various internal and external signals for development, growth and adaptation. Changes in the regulation of these network hubs lead to a ‘network rewiring effect’, which is manifested by dynamic changes in transcriptome and proteome signatures [2–4]. Indeed, much of the physiological and developmental variation across the evolutionary continuum of the plant kingdom is to a large extent a consequence of how regulatory transcription factors have been reprogrammed over time to create diverse network configurations [5–7].
The separation of the monocot and eudicot lineages of flowering plants spans more than 140 million years of evolutionary history. Comparative analysis of reference genome sequences across a number of representatives from the monocot and eudicot lineages revealed the dynamic changes in genome size, organization and complexity during speciation [9–12]. Gene duplication is a hallmark of such dynamic changes in genome complexity. For regulatory transcription factors, the function of duplicated copies may exhibit various degrees of conservation and divergence at different taxonomic levels. The degree of functional divergence may be apparent not only in terms of structure but, most importantly, in terms of differences in spatio-temporal programs, some of which may have important implications for a fine-tuned physiological process integral to the adaptive properties of a given species to its ecological niche.
Recent studies comparing the evolution of genetic network complexities in flowering plants revealed important roles of regulatory transcription factor evolution in physiological variation among species [6, 15, 16]. Large gene families tend to exhibit greater functional diversity as various members are often co-opted for more specialized biological roles. The transcriptional regulators (CO and FT) of the flowering response pathway in the eudicot plant Arabidopsis thaliana have functional homologs (Hd1 and Hd3a, respectively) in the monocot plant rice (Oryza sativa). The diurnal expression patterns are remarkably conserved between the Arabidopsis and rice homologs through the specific functions of their respective phytochrome chromophore proteins (HY1 in Arabidopsis and SE5 in rice). However, similarities in expression are punctuated by changes in other features that serve to fine-tune physiological functions in the divergent species, while maintaining the broad biological function of the gene. CO evolved to function as an activator that promotes the expression of FT to trigger floral development of Arabidopsis under long days, while Hd1 functions as a repressor that prevents Hd3a expression, keeping the rice plant in vegetative phase under long days [17, 18]. The co-option scenario is illustrated by numerous examples of transcription factor families (e.g., MADS, bHLH, NAC, ERF) that have undergone significant expansion during speciation, leading to lineage-specific functions for both developmental and stress-related responses [1, 7, 15, 19].
The relationships among ‘equivalent’ transcription factors in divergent plant species are traditionally expressed in terms of homologous clusters whose members are likely to have evolved from a common ancestral gene. Within each cluster are the orthologous genes, whose functions are conserved after speciation, and the paralogous genes, whose functions have become more specialized as the outcome of gene duplication. Conventionally, orthologs are established by one-to-one or many-to-many relationships through phylogenetic reconstructions based on coding sequences and conservation of protein domains.
Identification of regulatory transcription factors on the basis of orthologous relationship facilitated a meaningful translation of knowledge regarding biological function from model to non-model species. While such knowledge contributed immensely to the strength of the comparative functional genomics paradigm, understanding the broad biological function of transcription factors in an evolutionary context is not complete without a systematic investigation of the contribution of upstream sequence variation to the potentially unique regulatory features of each member of an orthologous group. In stress physiology, such information could potentially define another layer of complexity and might open a different view of why different species have distinct ecological niches despite their largely homologous gene sets involved in various stress responses.
The bZIP-type transcription factors belong to a relatively large family of regulators with major roles in abiotic and biotic stress response mechanisms and in seed development and maturation, primarily through the ABA signal transduction pathway [22–25]. In our previous study, we identified a subset of bZIP transcription factors involved in the regulation of the low temperature response transcriptome of temperate rice, Oryza sativa ssp. japonica cv. Nipponbare. We proposed that this group functions broadly in various stress regimes by virtue of its linkages with an oxidative-mediated genetic network that plays a key role in almost every type of abiotic and biotic stress. Furthermore, we were particularly interested in understanding the extent to which the function and regulation of these general stress response transcription factors are conserved across species representing distinct stress physiological categories.
In the follow-up study described here, the aforementioned subset of stress-associated bZIP transcription factors was examined further in an evolutionary context by assessing the similarities and differences in upstream regulatory information content of the rice (Oryza sativa) genes and their orthologs in sorghum (Sorghum bicolor) and Arabidopsis (Arabidopsis thaliana) by phylogenetic footprinting [27, 28]. Bona fide upstream sequences based on alignment with full-length cDNAs are critical for establishing biologically meaningful trends from these analyses. The annotated reference sequences of rice, sorghum and Arabidopsis represent a robust comparative genomics system that is particularly suitable for such application. Furthermore, rice, sorghum and Arabidopsis represent a meaningful spectrum of diversity in the plant kingdom, encompassing the monocot and dicot divide with relevant stress physiological attributes. We discuss our interpretations of the possible implications of the patterns that we observed in these analyses within the context of the finer-scale differences that may likely confer unique regulatory attributes to each member of an orthologous group. With the window of resolution afforded by the rice-sorghum-Arabidopsis comparative genomics, we have set the stage for more strategic experimental validation of cis-regulatory signatures of orthologous stress-associated bZIP transcription factors by functional genomics.
Results and discussion
Stress-regulated bZIP transcription factors of japonica rice
Orthologous groups of stress-associated bZIP genes in rice, sorghum and Arabidopsis
Based on all available information, association with some kind of stress (abiotic and/or biotic) response mechanism was presumed to be the common function that defines the orthologous relationships among this panel of homologous bZIP transcription factors. Representative paralogs without an apparent association with stress response based on the rice expression data were also identified in all BlastN and BlastX searches and phylogenetic reconstructions set them apart from the clear orthologous gene sets (Additional file 1).
The COGs of the stress-associated bZIP transcription factors represent five functional sub-classes generally associated with ABA and stress response (Group-A), oxidative and pathogen defenses (Group-D), photomorphogenesis (Group-H), response to oxidative stress and gibberellic acid (Group-I) and stress response and sucrose signaling (Group-S) based on established functional classification schemes (Figure 2) [22, 24]. Group-S is the largest group with two complete COGs with one member from each species (S2 = Os02g03960, Sb04g002700, At1g75390; S3 = Os02g09830, Sb04g006180, At1g13600) and one partial COG without a member from Arabidopsis but duplicated copies in sorghum (S1 = Os08g38020, Sb07g028900, Sb07g028960). Group-H is the smallest group with only one complete COG with one member from each species (H1 = Os02g10860, Sb04g007060, At5g11260). Both Group-A (A1 = Os016400, Sb03g040510, At2g36270; A2 = Os02g52780, Sb04g034190, At1g45249) and Group-D (D1 = Os06g41100, Sb10g024190, At1g08320; D2 = Os05g37170, Sb09g021840, At5g06839) have two complete COGs with one member from each species. Group-I includes two complete COGs with one member from each species (I1 = Os08g43090, Sb07g025270, At4g38900; I2 = Os12g06520, Sb08g003940, At1g43700) and one rice gene (Os04g10260) with no clear orthologs in sorghum and Arabidopsis.
Total regulatory information content of stress-related bZIP COGs
The occurrence of 6- to 8-nucleotide sequence motifs representing putative transcription factor binding sites (TFBS) or cis-elements was first established in the upstream (−1,000 to +200) sequences of all orthologous and paralogous genes included in the study, based on the plant-specific cis-element annotation in the Genomatix, TRANSFAC and PLACE promoter databases (Additional file 2). The TFBS classes that occurred among orthologs but not in paralogs were identified in order to establish the core subset of ortholog-specific cis-elements that could then be used to search for lineage-specific and species-specific conservation patterns. The core subset of ortholog-specific TFBS was also used to assess the differences in total cis-regulatory information content between the different COGs and between the members of each COG. Total cis-regulatory information content represents all the motif classes in a given promoter with an e-value of 10⁻³ or less.
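The two mechanical steps in this paragraph, taking the −1,000 to +200 window around a transcription start site and keeping only motif classes at or below the e-value cutoff, can be sketched as follows. This is a minimal illustration and not the pipeline used in the study: the function names, the tuple layout of a motif hit, and the simplification of treating the TSS as position 0 are all assumptions.

```python
E_VALUE_CUTOFF = 1e-3  # cutoff stated in the text


def promoter_window(tss, strand, upstream=1000, downstream=200):
    """Genomic (start, end) of the -1000..+200 window around a TSS.

    Assumption: the TSS is treated as position 0 and coordinates are
    clamped at 1, the first base of the chromosome.
    """
    if strand == "+":
        return max(1, tss - upstream), tss + downstream
    if strand == "-":
        # On the minus strand, "upstream" lies at higher coordinates.
        return max(1, tss - downstream), tss + upstream
    raise ValueError("strand must be '+' or '-'")


def information_content(hits, cutoff=E_VALUE_CUTOFF):
    """Sorted TFBS classes in one promoter with e-value <= cutoff.

    `hits` is a hypothetical list of (tfbs_class, e_value) pairs.
    """
    return sorted({cls for cls, e_value in hits if e_value <= cutoff})


# Illustrative values only, not data from the study.
example_hits = [("GBOX", 1e-5), ("MYCL", 1e-2), ("DOFF", 1e-3)]
print(promoter_window(5000, "+"))
print(information_content(example_hits))
```

With these made-up hits, the MYCL match is discarded because its e-value exceeds the cutoff.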
Table 1. Total cis-regulatory information content of orthologous stress-associated bZIP transcription factors of rice, sorghum and Arabidopsis. Columns: TFBS in all species; TFBS in Os only; TFBS in Sb only; TFBS in At only; χ² (Os:Sb:At) = 1:1:1; χ² (Os:Sb) = 1:1; χ² (Os:At) = 1:1; χ² (Sb:At) = 1:1.
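The chi-square columns of Table 1 presumably test whether TFBS counts are distributed evenly among the species being compared (1:1:1 for all three, 1:1 for each pair). A minimal sketch of such a goodness-of-fit test is given below; the counts are invented for illustration, and the function name and the 0.05 significance level are assumptions rather than details taken from the study.

```python
# Chi-square critical values at alpha = 0.05 for 1 and 2 degrees of freedom.
CRITICAL_0_05 = {1: 3.841, 2: 5.991}


def chi_square_equal_ratio(counts):
    """Pearson chi-square statistic against equal expected counts."""
    expected = sum(counts) / len(counts)
    return sum((observed - expected) ** 2 / expected for observed in counts)


def departs_from_equal_ratio(counts):
    """True if the counts deviate from 1:1 (or 1:1:1) at alpha = 0.05."""
    degrees_of_freedom = len(counts) - 1
    return chi_square_equal_ratio(counts) > CRITICAL_0_05[degrees_of_freedom]


# Hypothetical TFBS counts for Os, Sb and At (not values from Table 1).
print(chi_square_equal_ratio([30, 20, 10]))
print(departs_from_equal_ratio([30, 20, 10]))
```

A statistic below the critical value would mean that the hypothesis of equal information content among the compared promoters cannot be rejected.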
General patterns of cis-regulatory conservation among COGs
To establish the common denominator among the members of the tri-species COGs with respect to cis-regulatory information content, we first searched for evidence of basal similarities among the orthologous promoters. In particular, we searched for the occurrence of TFBS spatial distribution patterns that are conserved in every member of each COG but not in paralogs. We assumed that such patterns could provide a general measure of the degrees of conservation that occurred during the divergence of monocot and dicot lineages from their last common ancestor or during speciation of rice and sorghum.
The order, relative distance from each other and strand orientation of each putative TFBS that comprised a core module were remarkably conserved within each COG. Different patterns of relatedness were observed among rice, sorghum and Arabidopsis promoters in terms of the patterns of core module location relative to the TSS. Locational similarities between rice and sorghum core modules were more evident in A1, A2, D2 and S1. The other COGs showed either similar core module locations among all three species (I2) or similar locations between either of the monocot species and Arabidopsis (H1, S2) (Figure 3). Potential functional significance of the distance of the core modules from the TSS might be interpreted in terms of the efficiency by which these core modules interact with the core promoter during the formation of initiation complex. The spatially conserved core modules are comprised of various combinations of TFBS classes that were annotated in cis-element databases with key words associated with growth and developmental (MADS box, MYBS/MYB-R1, AHBP/HD-Zip, DOFF, GTBX/MYB, MYBL), gibberellic acid (GA) response (GARP/MYB-related), abiotic stress and pathogen defense (GBOX/bZIP, MYCL/bHLH, NCS1, CAAT/NF-y) related functions. The possible biological significance of these spatially conserved core modules may be interpreted in terms of the shared properties required for basal developmental regulatory programs of each orthologous promoter that were conserved during the divergence of monocot and dicot lineages from their last common ancestor or convergent parallel evolution of regulatory modules in the monocot and dicot lineages.
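The three criteria named here for a core module (order of the sites, their distance from each other, and strand orientation) can be turned into a simple comparison, sketched below. The site coordinates, the 10 bp spacing tolerance, and the function names are hypothetical; the study does not state a numerical tolerance.

```python
def module_signature(sites):
    """Order/strand and spacing of the sites in one promoter.

    `sites` is a list of (tfbs_class, position_relative_to_TSS, strand).
    """
    ordered = sorted(sites, key=lambda site: site[1])
    order = [(cls, strand) for cls, _, strand in ordered]
    spacing = [b[1] - a[1] for a, b in zip(ordered, ordered[1:])]
    return order, spacing


def is_conserved(sites_a, sites_b, tolerance=10):
    """True if order and strand match and every spacing differs by <= tolerance bp."""
    order_a, gaps_a = module_signature(sites_a)
    order_b, gaps_b = module_signature(sites_b)
    if order_a != order_b:
        return False
    return all(abs(x - y) <= tolerance for x, y in zip(gaps_a, gaps_b))


# Made-up coordinates for illustration only.
rice = [("MYBL", -620, "+"), ("GBOX", -580, "-"), ("DOFF", -540, "+")]
sorghum = [("MYBL", -410, "+"), ("GBOX", -372, "-"), ("DOFF", -330, "+")]
paralog = [("GBOX", -700, "-"), ("MYBL", -300, "+"), ("DOFF", -100, "+")]

print(is_conserved(rice, sorghum))
print(is_conserved(rice, paralog))
```

Note that the comparison ignores the absolute distance of the module from the TSS, which the text treats as a separate layer of similarity between species.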
The second pattern is defined by the occurrence of a given set of TFBS classes among the members of a given COG but not among paralogs (‘ortholog-specific’). The third pattern is defined by the common occurrence of a given set of TFBS classes among the orthologous monocot genes but not in the orthologous dicot gene(s) and paralogs (‘lineage-specific’). The fourth pattern is defined by certain TFBS classes with unique occurrence in either rice, sorghum or Arabidopsis orthologs, reflecting the differences that occurred during speciation (‘species-specific’). No robust trend was detected among the paralogs to indicate a clear pattern of ‘paralog-associated’ TFBS occurrence except in S2, consistent with the presumed diverse functions of paralogs.
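The ortholog-specific, lineage-specific and species-specific patterns defined above amount to membership rules on the sets of TFBS classes found in each promoter. The sketch below expresses those rules for a single COG. The class memberships are invented, and the reading of "ortholog-specific" as "present in all three orthologs" is an assumption.

```python
def classify(tfbs_class, os_set, sb_set, at_set, paralog_set):
    """Assign one TFBS class to a conservation pattern within a COG."""
    in_os = tfbs_class in os_set
    in_sb = tfbs_class in sb_set
    in_at = tfbs_class in at_set
    if tfbs_class in paralog_set:
        return "shared with paralogs"
    if in_os and in_sb and in_at:
        return "ortholog-specific"
    if in_os and in_sb and not in_at:
        return "lineage-specific (monocot)"
    if in_os + in_sb + in_at == 1:
        return "species-specific"
    return "other"


# Hypothetical promoter contents, for illustration only.
rice = {"GBOX", "TALE", "MYCL", "DOFF"}
sorghum = {"GBOX", "ROOT", "MYCL", "DOFF"}
arabidopsis = {"GBOX", "EINL", "DOFF"}
paralogs = {"DOFF"}

for name in sorted(rice | sorghum | arabidopsis):
    print(name, "->", classify(name, rice, sorghum, arabidopsis, paralogs))
```

Classes found in exactly one monocot and in Arabidopsis fall into the "other" bucket, since the text does not name a pattern for that case.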
The binary hierarchical clustering dendrograms in Figure 4 further revealed the tendency for the monocot species to be more closely related to each other than to the dicot Arabidopsis in terms of cis-regulatory information content, mirroring the trend established by the coding sequence (nucleotide and amino acid levels) phylogeny. This general trend could also be seen, albeit less strongly, in the patterns shown in Table 1 and Figure 3. All these results are consistent with the basic assumptions of phylogenetic footprinting and imply that the patterns of upstream sequence similarities and differences established for this small group of bZIP transcription factors are robust enough to be used for meaningful functional and biological inferences [27, 40, 41].
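To make the idea of clustering on binary (presence/absence) TFBS data concrete, the sketch below groups promoters by the overlap of their TFBS classes. The text does not state which distance measure or linkage rule was used, so the Jaccard distance and average linkage here are assumptions, as are the promoter contents.

```python
from itertools import combinations


def jaccard_distance(a, b):
    """1 - |A ∩ B| / |A ∪ B| for two sets of TFBS classes."""
    union = a | b
    return 1.0 - len(a & b) / len(union) if union else 0.0


def average_linkage(profiles):
    """Agglomerative clustering; returns the dendrogram as nested tuples."""
    clusters = [(name, [name]) for name in profiles]  # (subtree, member names)

    def mean_distance(pair):
        (_, members_1), (_, members_2) = pair
        total = sum(jaccard_distance(profiles[x], profiles[y])
                    for x in members_1 for y in members_2)
        return total / (len(members_1) * len(members_2))

    while len(clusters) > 1:
        # Merge the two clusters with the smallest average distance.
        left, right = min(combinations(clusters, 2), key=mean_distance)
        clusters = [c for c in clusters if c is not left and c is not right]
        clusters.append(((left[0], right[0]), left[1] + right[1]))
    return clusters[0][0]


# Hypothetical presence/absence profiles, for illustration only.
profiles = {
    "Os": {"GBOX", "DOFF", "TALE", "MYBL"},
    "Sb": {"GBOX", "DOFF", "ROOT", "MYBL"},
    "At": {"GBOX", "EINL", "GCCF"},
}
print(average_linkage(profiles))
```

In this toy example the two monocot promoters merge first, which is the kind of grouping the dendrograms in Figure 4 are described as showing.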
Functional implications of ortholog-specific signatures
The TFBS classes shared by the stress-associated orthologs with other paralogous genes are largely associated with basal growth and developmental functions, ranging in occurrence from simpler non-modular combinations of five TFBS classes per cluster to more complex but still non-modular combinations of more than ten TFBS classes per cluster (Figure 4, Additional file 5). More than 75% of the shared TFBS classes have been annotated in cis-element databases with keywords related to either vegetative or reproductive developmental processes. The DOFF/Zinc finger, MYBL, AHBP/HD-zip, L1BX/homeodomain, GTBX and NCS1 elements have the broadest occurrences across the clusters. Interestingly, many of these TFBS classes are also among the components of the spatially conserved core modules shown in Figure 3. However, the combinatorial (or modular) occurrence of these elements with other elements and their spatial distribution are not conserved among paralogs in the same way that they are conserved as modules in the COGs. Based on these trends, it appears that the functionality of these elements in terms of stress regulation of orthologs is defined by their modular nature, and such organization is apparently lost in the paralogs [41, 42].
Ortholog-specific signatures were also evident from the results of hierarchical clustering analysis, reflecting the additional non-species-specific or non-lineage-specific regulatory information that may be required for the stress-associated function of the orthologs in conjunction with the spatially conserved core modules (Figure 4, Additional file 6). Of the total 18 TFBS classes that occurred in an ortholog-specific manner, about 67% (12) have been annotated in databases with key words related to abiotic and biotic stress responses including temperature extremes, dehydration, UV, oxidative stress, salinity, pathogens and wounding as well as responses related to ABA and ethylene signaling. The other TFBS classes (33%) that were detected only among the orthologs but not in paralogs are associated with various growth and developmental processes including the regulation of cell cycle, morphogenesis and circadian responses. These trends reiterate that the subset of ortholog-specific TFBS classes that probably work in conjunction with the spatially conserved core modules are defined primarily by putative stress-related regulatory functions.
The binary hierarchical clustering dendrogram also shows that each COG is unique by virtue of its cis-regulatory information signature (Figure 4, Additional file 6). Even the COGs that belong to the same functional class (for example, A1/A2 and D1/D2) have very distinctive patterns, distinct enough to place them in distant phylogenetic branches. The biological significance of these elements may be interpreted in terms of their possible roles as ‘regulatory fine-tuners’, perhaps by facilitating the integration of various types of intrinsic (growth) and extrinsic (stress) signals that contribute to the overall expression potential of the gene.
In general, the functional annotations of the TFBS classes that characterized a given COG reflect the functional sub-class of bZIP proteins to which the specific COG belongs. For instance, A1 and A2 belong to the sub-class involved in ABA-mediated stress response signaling (Figure 2). Both COGs contain at least one class of ABA-associated TFBS, such as IBOX/Zinc finger for A1 and AGP1/GATA, OPAQ/bZIP, CE1F, and IBOX/Zinc finger for A2, with IBOX/Zinc finger as the core element shared by both A1 and A2 (Additional file 6). Hierarchical clustering dendrograms also showed the expected patterns of co-occurrence between TFBS classes associated with specific signals. For instance, co-occurrence among ABA-related TFBS classes is quite apparent in most cases, as is the co-occurrence of TFBS classes involved in mechanisms associated with oxidative and pathogen defenses.
Another apparent positive association is shown by D1 and D2, which belong to the sub-class involved in pathogen and oxidative stress associated bZIP proteins (Figure 2). The cis-regulatory signatures of these COGs overlap through the EINL and LREM elements, both of which are associated with AP2/ERF-type transcription factors involved in disease response mechanisms via oxidative and ethylene signal transduction pathways (Additional file 6). However, D1 and D2 are also unique by virtue of the occurrence of other oxidative-associated elements such as AGP1/GATA in D1 and ERSE/bZIP and CAAT/NF-y in D2. In addition to the salient regulatory features of D1 and D2 that were consistent with their presumed roles in oxidative and pathogen defense mechanisms, each COG also contains distinct sets of other elements involved in ABA response mechanism, suggestive of how the regulation of the members of this COG might involve the interplay between oxidative, ethylene and ABA signal transduction pathways [29, 43]. Of all the COGs, D2 also has the most complex combination of stress-associated TFBS based on the highest density in all its members. This trend appears to be consistent with the presumed complex roles of D2 members in both abiotic and biotic (disease and wounding) stress response mechanisms [22, 24].
Similar trends were observed in the Arabidopsis-specific TFBS classes, where both development-associated (29%; AREF/ARF, SEF4/TCP) and stress-associated TFBS classes were represented (Figure 5D). As in the monocots, the stress-associated TFBS classes that are specific to Arabidopsis orthologs were also characterized by functions associated with the regulation of defenses against pathogens (43%), including those involved in oxidative, ethylene, and jasmonic acid (JA) signaling (ASRC/ABI3, EINL/AP2/ERF, GCCF, HMGF/CAMTA, RAV5/LFY). Based on these trends, it appears that the major difference in cis-regulatory information content between the monocot and dicot lineages could be defined in terms of the types of elements with likely important roles in disease-regulated expression. This trend seems to support the possibility that distinct mechanisms are involved in the regulation of orthologous stress-related bZIP transcription factors in monocot and dicot species, conferred by the regulatory fine-tuners interacting with the spatially conserved core modules, perhaps because of the distinct sets of pathogens that co-evolved with monocot and dicot plants.
Rice and sorghum are related members of the grass (Poaceae) family representing about 70 million years of evolutionary history. We compared the various orthologous pairs in rice and sorghum for nine of the ten COGs shown in Figure 2, in order to establish patterns that could provide a glimpse of the finer-scale changes in cis-regulatory complexity as a result of speciation. As in the other levels of comparison (ortholog vs. paralog and monocot ortholog vs. dicot ortholog), the rice vs. sorghum comparison also revealed distinct cis-regulatory signatures defined by various combinations of stress-associated (occurrence of 36% in rice and 57% in sorghum) and growth and development-associated (occurrence of 55% in rice and 42% in sorghum) TFBS classes (Figure 5B and 5C). However, the most apparent commonality between rice and sorghum orthologs, which also distinguished them from Arabidopsis orthologs, was the pronounced abundance of TFBS classes associated with the regulation of root development and growth, with occurrences of 36% in rice and 36% in sorghum. It is possible that the specification of root expression in rice and sorghum is facilitated by distinct combinations of root-specific regulatory signals, with the rice orthologs making use of TALE/Homeodomain, TCPF, HOCT and CDC5/MYB elements, and sorghum orthologs making use of BRRE, E2FF/E2F-DP, SBPD and ROOT elements. This trend was not particularly conserved among Arabidopsis orthologs for most of the COGs, where only one class of putative root-associated TFBS (TEFB) was detected (Figure 5D).
Root function plays an important role in stress responses, particularly in relation to dehydration and osmotic stresses. The trends established based on our current results may then be interpreted in terms of possible differences in the regulation of stress response transcriptomes in the roots of rice and sorghum. It is interesting to note that rice is a C3 plant that thrives best under semi-flooded conditions, hence very sensitive to even mild dehydration, while sorghum is a C4 plant that exhibits high level of tolerance to drought. Differences in the configurations of the root transcriptomes might be a possible implication of these patterns, which may also be relevant to the fundamental differences in root development and physiology between rice and sorghum.
Integration of developmental and stress-related responses
At the very heart of the regulatory sequence conservation among orthologous bZIP transcription factors are the spatially conserved core modules, which appear to be the primary determinants of the basal developmental and spatial regulatory programming. The function of a core module is likely to be further elaborated or specified in relation to specific growth response and stress signals by interaction with the regulatory fine-tuners and their cognate regulator proteins. Core modules tend to be highly conserved during speciation while the regulatory fine-tuners appear to have greater evolutionary flexibility leading to both lineage-specific and species-specific signatures (Figure 6B).
Overall, the resolution afforded by the analysis of tri-species COGs revealed a general trend in which orthologous stress-associated bZIP transcription factors of rice, sorghum and Arabidopsis have finer-scale differences in regulatory information content, both qualitatively and quantitatively, despite the presumed conservation of broad biological function (i.e., stress response). Certain orthologs appeared to have either lost or gained specific regulatory information, and these may have important roles in defining their unique spatio-temporal expression profiles in response to complex combinations of extrinsic and intrinsic signals. These differences can be viewed as possible indications of how natural selection may have impacted the divergence of this group of stress-related transcription factors presumably from their ancestral role of being involved primarily in regulating growth and development. Possible ways that this may have occurred are by the acquisition of additional regulatory signals or maintenance of ancestral regulatory signals that have been favored by natural selection.
Phylogenetic footprinting is based on the overall assumption that upstream sequence motifs that are highly conserved between homologous genes represent functional regulatory elements [27, 28, 39, 41]. Conventional use of this paradigm has focused primarily on the identification of highly conserved elements that define the narrow-scale expression similarities between homologous genes in response to specific signals. The inherent limitation of this relatively simplistic assumption is that it tends to neglect an important view that homologous genes may exhibit expression similarities in response to certain signals and still be unique in a regulatory context if the totality of cis-regulatory information content that defines the large-scale or finer-scale spatio-temporal programming details is taken into consideration. For instance, homologous stress-related genes from two divergent species (orthologs) may share a common cis-element that is critical to stress inducible expression. However, it is also quite conceivable that the regulatory mechanism that integrates stress-related signals with growth and development and with other intrinsic and extrinsic signals that define the gene’s overall expression potential would vary between two evolutionarily diverse species (Figure 6A).
This study is an attempt to extend the basic assumptions of phylogenetic footprinting to allow us to dissect the finer details that contribute to the distinct cis-regulatory signatures of orthologous stress-related bZIP transcription factors from rice, sorghum and Arabidopsis beyond the more obvious patterns of conservation. The patterns and relationships that we have revealed appear to be robust and quite meaningful for use as a conceptual basis for further empirical validation. Furthermore, the trends that we have established have potential physiological significance given that the comparison involved three species with distinct stress physiological properties and adaptation regimes. Our comparisons provided meaningful contrasts at various levels by representing monocot vs. dicot promoter structures, syntenic (rice vs. sorghum) and non-syntenic (rice/sorghum vs. Arabidopsis) genomes, and a spectrum of natural variation for relative sensitivity to various forms of abiotic stresses. Rice is a C3 Gramineae that can withstand only periodic and very mild cold stress and thrives best in semi-flooded soil, while sorghum is a cold-sensitive but drought-tolerant C4 Saccharinae and Arabidopsis is a temperate C3 Brassicaceae that acquires freezing tolerance by cold acclimation [33, 47, 48].
Indeed, the patterns of cis-regulatory conservation revealed in this study were consistent with the established functional sub-classification of bZIP transcription factors, illustrating the strength of the paradigm. For instance, orthologous genes that belong to Group-A (A1, A2) involved in ABA-mediated stress signaling exhibited the characteristic ABA response-associated cis-elements at various combinatorial complexities. Likewise, orthologous genes that belong to Group-D (D1, D2) involved in pathogen and oxidative defenses exhibited the characteristic cis-regulatory signatures expected of such mechanisms [22, 24].
Subtle differences in regulatory information content among orthologous genes were established even at the species level. These subtleties suggest that each member of an orthologous group is potentially unique within the context of regulation. The uniqueness of each member could be a reflection of additional functional attributes that may confer regulatory specificity or precision. Potential biological implications include possible effects of changes in transcription factor regulation to network rewiring and its consequence to the evolution of transcriptional and biochemical network complexities.
Whether the relationships revealed in this relatively small window of information reflect the overall global trends remains to be seen with the analysis of a larger set of orthologous groups from diverse plant species. With the rapid progress in sequencing and annotating representative genomes from virtually any group of flowering plants, the patterns revealed in this study can be validated using a more encompassing number of reference genomes representing a more closely spaced evolutionary continuum. Moreover, the validity of our biological interpretations of the conservation patterns requires fine-scale spatio-temporal expression matrix across a battery of stress conditions throughout the plant’s entire life cycle, which is now achievable by comparative deep sequencing of transcriptomes. Finally, the robust phylogenetic footprints established based on this strategy revealed different patterns of finer-scale regulatory sequence signatures allowing a different layer of contrast between species and perhaps a new perspective that may be useful in understanding the contribution of promoter restructuring to the diversity of stress network complexities in flowering plants.
Validation of rice bZIP transcription factor expression
The composition of the core subset of stress-associated bZIP transcription factors of rice included in this study was based on the low temperature and H2O2 (4 mM) response transcriptomes described previously [23, 26]. Additional expression studies on the 11 genes included in the core subset in response to rapid dehydration and high salt concentration were performed by qRT-PCR following established procedures [3, 26]. Briefly, three- to four-leaf stage (V3 to V4) seedlings of japonica rice cultivar Nipponbare were first established in standard Yoshida hydroponic medium (12 hour photoperiod, 29 °C/24 °C temperature regime) for four days prior to stress treatments. Seedlings were subjected to salinity stress by transferring them from Yoshida medium to an aqueous solution of 300 mM NaCl. Rapid dehydration was imposed by complete withdrawal from the hydroponic tubs. Seedlings were maintained in a growth chamber at a constant temperature of 28 °C with 50% relative humidity during the entire period of rapid dehydration. Transcript abundance was expressed as values normalized against a constitutively expressed actin gene and relative to the control values. Hierarchical clustering of relative expression was performed with the TMEV analysis suite.
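The normalization described above (target transcript against actin, then stress against control) can be illustrated with a short calculation. The sketch below assumes the widely used 2^-ΔΔCt formulation with near-100% amplification efficiency; the text does not state which formula was applied, so this is only one plausible reading. The function name and the Ct values are hypothetical and are not data from this study.

```python
def relative_expression(ct_target_stress, ct_actin_stress,
                        ct_target_control, ct_actin_control):
    """Fold change of a target transcript under stress relative to control.

    ASSUMPTION: 2^-ddCt method with ~100% PCR efficiency. The source only
    states that values were normalized to actin and expressed relative to
    the control.
    """
    # Step 1: normalize the target gene to actin within each sample.
    d_ct_stress = ct_target_stress - ct_actin_stress
    d_ct_control = ct_target_control - ct_actin_control
    # Step 2: express the stressed sample relative to the control sample.
    dd_ct = d_ct_stress - d_ct_control
    # A lower Ct means more template, hence the negative exponent.
    return 2.0 ** (-dd_ct)


if __name__ == "__main__":
    # Hypothetical Ct values, for illustration only.
    fold = relative_expression(24.0, 18.0, 26.0, 18.0)
    print(f"fold change relative to control: {fold:.2f}")
```

A value above 1 indicates induction under stress and a value below 1 indicates repression; log2-transformed fold changes are a common input for hierarchical clustering.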
Phylogenetic reconstruction of orthologs and paralogs
Genomic loci corresponding to the best hits of each rice gene were identified in sorghum and Arabidopsis by BlastX and BlastN searches of the respective reference genome sequences with the rice full-length CDS as query [33, 50]. Parallel BlastX and BlastN analyses yielded hits identical to the output of BlastP searches. Results were further examined by aligning the bait rice gene with the primary hits (putative orthologs) and secondary hits (putative paralogs) from each species for each functional sub-class of bZIP transcription factors, using ClustalW version 1.83.
To establish the various clusters of orthologous groups (COGs) in relation to the accepted functional classification scheme of bZIP transcription factors [22, 24], the full-length CDS of the putative orthologs and paralogs identified were used for phylogenetic reconstruction by the Neighbor-Joining (NJ) method, as implemented in Molecular Evolutionary Genetics Analysis (MEGA) version 3.1. Analysis was conducted with the complete deletion option for gaps and missing data and with the Poisson correction model for distance computation. Bootstrap analysis with 1,000 replicates was conducted to examine the statistical reliability of the tree topology and nodes, with a bootstrap cut-off score of 50%. Orthologous and paralogous relationships established through this method were further verified for consistency with the public comparative genome annotation.
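The two distance settings named above can be made concrete with a small sketch. Complete deletion removes every alignment column in which any sequence has a gap or missing character, and the Poisson correction converts the proportion of differing sites p into a distance d = -ln(1 - p). The code below is an illustration, not the MEGA implementation: it assumes aligned amino acid sequences, treats "-" and "?" as gap and missing-data symbols, and uses made-up toy sequences.

```python
import math

# ASSUMPTION: "-" marks a gap and "?" marks missing data in the alignment.
GAP_CHARS = set("-?")


def complete_deletion(aligned):
    """Drop every column that has a gap or missing character in ANY sequence."""
    seqs = list(aligned.values())
    length = len(seqs[0])
    if any(len(s) != length for s in seqs):
        raise ValueError("sequences must be aligned to equal length")
    keep = [i for i in range(length)
            if all(s[i] not in GAP_CHARS for s in seqs)]
    return {name: "".join(s[i] for i in keep) for name, s in aligned.items()}


def poisson_distance(a, b):
    """Poisson-corrected distance d = -ln(1 - p), p = proportion of differing sites."""
    if len(a) != len(b) or not a:
        raise ValueError("sequences must be non-empty and of equal length")
    p = sum(x != y for x, y in zip(a, b)) / len(a)
    if p >= 1.0:
        raise ValueError("distance is undefined when every site differs")
    return -math.log(1.0 - p)


if __name__ == "__main__":
    # Toy alignment, for illustration only (not sequences from this study).
    toy = {"rice": "MKV-LSA", "sorghum": "MKVALSA", "arabidopsis": "MRVAL?A"}
    clean = complete_deletion(toy)
    names = sorted(clean)
    for i, first in enumerate(names):
        for second in names[i + 1:]:
            d = poisson_distance(clean[first], clean[second])
            print(first, second, round(d, 4))
```

The resulting pairwise distance matrix is the kind of input a Neighbor-Joining routine takes; bootstrap support would be obtained by resampling alignment columns with replacement and rebuilding the tree for each replicate.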
Detection and identification of TFBS motifs
Upstream sequences (−1,000 to +200 relative to the transcription start site, or TSS) of the orthologous genes and their representative paralogs were extracted from reference genome sequences through the Gramene comparative genomics browser. In rice and Arabidopsis, the TSS was established either by ab initio prediction or from the bona fide TSS revealed by the alignment of full-length cDNAs or FL-EST contigs with the reference genomes [50, 53–55]. Delineation of the −1,000 to +200 regions in sorghum orthologs and paralogs was based on the ab initio predicted TSS. The 1,200 bp upstream sequences of orthologs and paralogs were scanned for statistically significant occurrences of sequence motifs (6 to 8 nucleotides) representing potential transcription factor binding sites (TFBS) or cis-elements, primarily using the Genomatix Suite. The output was matched with the most likely TFBS using the MatInspector program Release 184.108.40.206, which comprises a large library and matrix description of known plant-specific cis-elements and their cognate regulators. The parameters used were the standard core similarity and optimized matrix similarity. Independent validation of the TFBS classes detected by Genomatix for D1 and D2 was conducted by parallel analysis with the Dragon Motif Builder algorithm (EM2 option, 0.875 threshold), followed by identification through matching with entries in the PLACE and TRANSFAC databases [26, 57–59]. TFBS classes established in parallel by the two methods were nearly identical. Relative TFBS enrichment was compared between species for each COG by chi-square analysis of species-to-species ratios. TFBS occurrences were expressed in binary format (1 = single-copy to multi-copy occurrence, 0 = absent), and the data matrices were hierarchically clustered to search for patterns of similarity between species, patterns of co-occurrence between TFBS classes and patterns of similarity between the various COGs. Hierarchical clustering was performed using the TMEV analysis suite.
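Two bookkeeping steps in this procedure lend themselves to a brief sketch: locating the −1,000 to +200 window around a TSS on either strand, and converting TFBS counts into the binary presence/absence matrix used for clustering. The code below is a minimal illustration under stated assumptions: coordinates are 1-based and inclusive, the TSS is position +1 with no position 0 (which yields the 1,200 bp window mentioned above), and Jaccard distance stands in for the clustering metric, which the text does not specify. Function names and the toy counts are hypothetical and are not results from this study.

```python
def promoter_window(tss, strand):
    """Genomic (start, end) of the -1,000 to +200 window, 1-based inclusive.

    ASSUMPTION: the TSS itself is position +1 and there is no position 0,
    so the window spans 1,000 bp upstream plus 200 bp downstream.
    """
    if strand == "+":
        return tss - 1000, tss + 199
    if strand == "-":
        # On the minus strand, upstream lies at higher genomic coordinates.
        return tss - 199, tss + 1000
    raise ValueError("strand must be '+' or '-'")


def binarize(counts):
    """Turn {species: {tfbs_class: copies}} into a presence/absence matrix."""
    classes = sorted({c for per_species in counts.values() for c in per_species})
    matrix = {sp: [1 if per_species.get(c, 0) > 0 else 0 for c in classes]
              for sp, per_species in counts.items()}
    return classes, matrix


def jaccard_distance(u, v):
    """1 - shared/total present classes (ASSUMED metric, not from the source)."""
    shared = sum(1 for a, b in zip(u, v) if a and b)
    total = sum(1 for a, b in zip(u, v) if a or b)
    return 0.0 if total == 0 else 1.0 - shared / total


if __name__ == "__main__":
    print(promoter_window(5000, "+"), promoter_window(5000, "-"))
    # Toy counts for one hypothetical COG, for illustration only.
    toy = {
        "rice": {"ABRE": 3, "MYB": 1},
        "sorghum": {"ABRE": 1, "W-box": 2},
        "arabidopsis": {"W-box": 1},
    }
    classes, matrix = binarize(toy)
    print(classes)
    for species, row in matrix.items():
        print(species, row)
    print(jaccard_distance(matrix["rice"], matrix["sorghum"]))
```

Binary encoding deliberately discards copy number, so a promoter with one copy of a motif and one with several are treated alike; the pairwise distances between rows would then feed an agglomerative clustering routine.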
Abbreviations
COG: Cluster of orthologous groups
TFBS: Transcription factor binding site
bZIP: Basic leucine zipper
TSS: Transcription start site
FL-EST: Full-length expressed sequence tags
qRT-PCR: Quantitative real-time polymerase chain reaction.
This study was funded by the USDA-National Research Initiative-Plant Genome Research Program (2006-35604-1669) and Maine Agricultural and Forest Experiment Station (Publication 3267). MRP was supported by Korea Research Foundation Postdoctoral Fellowship (KRF-2006-352-F00002). SJY and MRP were supported by BioGreen 21 Program-RDA (20080401034024), Republic of Korea. The authors thank Dr. Mitch McGrath for his critique of the manuscript and valuable suggestions.
- Riechmann JL, Heard J, Martin G, Reuber L, Jiang CZ, Keddie J, Adam L, Pineda O, Ratcliffe OJ, Samaha RR, Creelman R, Pilgrim M, Broun P, Zhang JZ, Ghandehari D, Sherman BK, Yu GL: Arabidopsis transcription factors: Genome-wide comparative analysis among eukaryotes. Science. 2000, 290: 2105-2110. 10.1126/science.290.5499.2105.
- Alon U: Network motifs: Theory and experimental approaches. Nature Rev Genet. 2007, 8: 450-461. 10.1038/nrg2102.
- Park MR, Yun KY, Herath V, Mohanty B, Xu F, Wijaya E, Bajic VB, Yun SJ, De los Reyes BG: Supra-optimal expression of the cold-regulated OsMyb4 transcription factor in transgenic rice changes the complexity of transcriptional network with major effects on stress tolerance and panicle development. Plant Cell Env. 2010, 33: 2209-2230. 10.1111/j.1365-3040.2010.02221.x.
- Sun MGF, Kim PM: Evolution of biological interaction networks: from models to real data. Genome Biol. 2011, 12: e235. 10.1186/gb-2011-12-12-235.
- Riechmann JL, Meyerowitz EM: The AP2/EREBP family of plant transcription factors. Biol Chem. 1998, 379: 633-646.
- Nardmann J, Werr W: The evolution of plant regulatory networks: What Arabidopsis cannot say for itself? Curr Op Plant Biol. 2007, 10: 653-659. 10.1016/j.pbi.2007.07.009.
- Carretero-Paulet L, Galstyan A, Roig-Villanova I, Martinez-Garcia JF, Bilbao-Castro JR, Robertson DL: Genome-wide classification and evolutionary analysis of the bHLH family of transcription factors in Arabidopsis, poplar, rice, moss and algae. Plant Physiol. 2010, 153: 1398-1412. 10.1104/pp.110.153593.
- Soltis DE, Smith SA, Cellinese N, Wurdack KJ, Tank DC, Brockington SF, Refulio-Rodriguez NF, Walker JB, Moore MJ, Carlsward BS, Bell CD, Latvis M, Crawley S, Black C, Diouf D, Xi Z, Rushworth CA, Gitzendanner MA, Sytsma KJ, Qiu YL, Hilu KW, Davis CC, Sanderson MJ, Beaman RS, Olmstead RG, Judd WS, Donoghue MJ, Soltis PS: Angiosperm phylogeny: 17 genes, 640 taxa. Am J Bot. 2011, 98: 704-730. 10.3732/ajb.1000404.
- Schoof H, Karlowski WM: Comparison of rice and Arabidopsis annotation. Curr Op Plant Biol. 2003, 6: 106-112. 10.1016/S1369-5266(03)00003-7.
- Bennetzen JL: Patterns in grass genome evolution. Curr Op Plant Biol. 2007, 10: 176-181. 10.1016/j.pbi.2007.01.010.
- Ammiraju JS, Lu F, Sanyal A, Yu Y, Song X, Jiang N, Pontaroli AC, Rambo T, Currie J, Collura K, Talag J, Fan C, Goicoechea JL, Zuccolo A, Chen J, Bennetzen JL, Chen M, Jackson S, Wing R: Dynamic evolution of Oryza genomes is revealed by comparative genomic analysis of a genus-wide vertical data set. The Plant Cell. 2008, 20: 3191-3209. 10.1105/tpc.108.063727.
- Sakai H, Itoh T: Massive gene losses in Asian cultivated rice unveiled by comparative genome analysis. BMC Genomics. 2010, 11: e121. 10.1186/1471-2164-11-121.
- Jiang SY, Ma Z, Ramachandran S: Evolutionary history and stress regulation of the lectin superfamily in higher plants. BMC Evol Biol. 2010, 10: e79. 10.1186/1471-2148-10-79.
- Ogata Y, Shibata D: Practical network approaches and biologic interpretations of coexpression analyses in plants. Plant Biotech. 2009, 26: 3-7. 10.5511/plantbiotechnology.26.3.
- De Bodt S, Raes J, Van de Peer Y, Theissen G: And then there were many: MADS goes genomics. Tr Plant Sci. 2003, 8: 475-483. 10.1016/j.tplants.2003.09.006.
- Xu G, Ma M, Nei M, Kong H: Evolution of F-box genes in plants: Different modes of sequence divergence and their relationships with functional diversification. Proc Natl Acad Sci USA. 2009, 106: 835-840. 10.1073/pnas.0812043106.
- Izawa T, Takahashi Y, Yano M: Comparative biology of flowering pathways in rice and Arabidopsis. Curr Op Plant Biol. 2003, 6: 113-120. 10.1016/S1369-5266(03)00014-1.
- Higgins JA, Bailey PC, Laurie DA: Comparative genomics of flowering time pathways using Brachypodium distachyon as a model for the temperate grasses. PLoS One. 2010, 5: e10065.
- Nuruzzaman M, Manimekalai R, Sharoni AM, Satoh K, Kondoh H, Ooka H, Kikuchi S: Genome-wide analysis of NAC transcription factor family in rice. Gene. 2010, 465: 30-44. 10.1016/j.gene.2010.06.008.
- Fitch WM: Homology: A personal view on some of the problems. Tr Genet. 2000, 16: 227-231. 10.1016/S0168-9525(00)02005-9.
- Tatusov R, Koonin EV, Lipman DJ: A genomic perspective on protein families. Science. 1997, 278: 631-637. 10.1126/science.278.5338.631.
- Jakoby M, Weisshaar B, Droge-Laser W, Vicente-Carbajosa J, Tiedemann J, Kroj T, Parcy F: bZIP transcription factors in Arabidopsis. Tr Plant Sci. 2002, 7: 106-111. 10.1016/S1360-1385(01)02223-3.
- Cheng C, Yun KY, Ressom H, Mohanty B, Bajic VB, Jia Y, Yun SJ, De los Reyes BG: An early response regulatory cluster induced by low temperature and hydrogen peroxide in seedlings of chilling-tolerant japonica rice. BMC Genomics. 2007, 8: e175. 10.1186/1471-2164-8-175.
- Guedes-Correa LG, Riano-Pachon DM, Schrago CG, Mueller-Roeber B, dos Santos VR, Vincents M: The role of bZIP transcription factors in green plant evolution: Adaptive features emerging from four founder genes. PLoS One. 2008, 8: e2994.
- Nakashima K, Ito Y, Yamaguchi-Shinozaki K: Transcriptional regulatory networks in response to abiotic stresses in Arabidopsis and grasses. Plant Physiol. 2009, 149: 88-95. 10.1104/pp.108.129791.
- Yun KY, Park MR, Mohanty B, Herath V, Xu F, Mauleon R, Wijaya E, Bajic VB, Bruckiewich R, De los Reyes BG: Transcriptional regulatory network triggered by oxidative signals configures the early response mechanisms of japonica rice to chilling stress. BMC Plant Biol. 2010, 10: e16. 10.1186/1471-2229-10-16.
- McCue LA, Thompson W, Carmack CS, Ryan MP, Liu JS, Derbyshire V, Lawrence CE: Phylogenetic footprinting of transcription factor binding sites in proteobacterial genomes. Nucl Acids Res. 2001, 29: 774-782. 10.1093/nar/29.3.774.
- Zhang Z, Gerstein M: Of mice and men: Phylogenetic footprinting aids the discovery of regulatory elements. J Biol. 2003, 2: e11. 10.1186/1475-4924-2-11.
- Desikan R, Mackerness SAH, Hancock JT, Neill SJ: Regulation of the Arabidopsis transcriptome by oxidative stress. Plant Physiol. 2001, 127: 159-172. 10.1104/pp.127.1.159.
- Buchanan CD, Lim S, Salzman RA, Kagiampakis I, Morishige DT, Weers BD, Klein RR, Pratt LH, Cordonier-Pratt MM, Klein PE, Mullet JE: Sorghum bicolor’s transcriptome response to dehydration, high salinity and ABA. Plant Molec Biol. 2005, 58: 699-720. 10.1007/s11103-005-7876-2.
- The Arabidopsis Genome Initiative: Analysis of the genome sequence of the flowering plant Arabidopsis thaliana. Nature. 2000, 408: 796-815. 10.1038/35048692.
- Remm M, Storm CEV, Sonnhammer ELL: Automatic clustering of orthologs and in-paralogs from pairwise species comparisons. J Molec Biol. 2001, 314: 1041-1052. 10.1006/jmbi.2000.5197.
- Paterson AH, Bowers JE, Bruggman R, Dubchak I, Grimwood J, Gundlach H, Haberer G, Hellsten U, Mitros T, Paliakov A, Schmutz J, Spannagl M, Tang H, Wang X, Wicker T, Bhart AK, Chapman J, Feltus FA, Gowik U, Grigoriev IG, Lyons E, Maher CA, Martis M, Narechania A, Otillar RP, Penning BW, Salamov AA, Wang Y, Zhang L, Carpita NC, et al: The Sorghum bicolor genome and the diversification of grasses. Nature. 2009, 457: 551-556. 10.1038/nature07723. http://www.phytozome.net/sorghum
- Ouyang S, Zhu W, Hamilton J, Lin H, Campbell M, Childs K, Thibaud-Nissen F, Malek RL, Lee Y, Zheng L, Orvis J, Haas B, Wortman J, Buell CR: The TIGR Rice Genome Annotation Resource: improvements and new features. Nucl Acids Res. 2007, 35: D883-D887. 10.1093/nar/gkl976. http://rice.plantbiology.msu.edu
- Liang C, Jaiswal P, Hebbard C, Avraham S, Buckler ES, Casstevens T, Hurwitz B, McCouch S, Ni J, Pujar A, Ravenscroft D, Ren L, Spooner W, Tecle I, Thomason J, Tung CW, Wei X, Yap I, Youens-Clark K, Ware D, Stein L: Gramene: a growing plant comparative genomics resource. Nucl Acids Res. 2008, 36: D947-D953. http://www.gramene.org
- Schmid M, Davison TS, Henz SR, Pape UJ, Demar M, Vingron M, Scholkopf B, Weigel D, Lohmann JU: A gene expression map of Arabidopsis thaliana development. Nature Genet. 2005, 37: 501-506. 10.1038/ng1543.
- Goda H, Sasaki E, Akiyama K, Maruyama-Nakashita A, Nakabayashi K, Li W, Ogawa M, Yamauchi Y, Preston J, Aoki K, Kiba K, Takatsuto S, Fujioka S, Asami T, Nakano T, Kato H, Mizuno T, Sakakibara H, Yamaguchi S, Nambara E, Kamiya Y, Takahashi H, Hirai MY, Sakurai T, Shinozaki K, Saito K, Yoshida S, Shimada Y: The AtGenExpress hormone and chemical treatment data set: Experimental design, data evaluation, model data analysis and data access. The Plant J. 2008, 55: 526-542. 10.1111/j.1365-313X.2008.03510.x.
- Edgar R, Domrachev M, Lash AE: Gene Expression Omnibus: NCBI gene expression and hybridization array data repository. Nucl Acids Res. 2002, 30: 207-210. 10.1093/nar/30.1.207. http://www.ncbi.nlm.nih.gov/geo/
- Kerhornou A, Guigo R: BioMoby web services to support clustering of co-regulated genes based on similarity of promoter configurations. Bioinformatics. 2007, 23: 1831-1833. 10.1093/bioinformatics/btm252.
- Hong RL, Hamaguchi L, Busch MA, Weigel D: Regulatory elements of the floral homeotic gene Agamous identified by phylogenetic footprinting and shadowing. The Plant Cell. 2003, 15: 1296-1309. 10.1105/tpc.009548.
- De Bodt S, Theissen G, Van de Peer Y: Promoter analysis of MADS-box genes in eudicots through phylogenetic footprinting. Mol Biol Evol. 2006, 23: 1293-1303. 10.1093/molbev/msk016.
- Vandepoele K, Quimbaya M, Casneuf T, De Veylder L, Van de Peer Y: Unraveling transcriptional control in Arabidopsis using cis-regulatory elements and coexpression networks. Plant Physiol. 2009, 150: 535-546. 10.1104/pp.109.136028.
- Apel K, Hirt H: Reactive oxygen species: metabolism, oxidative stress and signal transduction. Ann Rev Plant Biol. 2004, 55: 373-399. 10.1146/annurev.arplant.55.031903.141701.
- Koornneef A, Leon-Reyes A, Ritsema T, Verhage A, Den Otter FC, Van Loon LC, Pieterse CMJ: Kinetics of salicylate-mediated suppression of jasmonate signaling reveal a role for redox modulation. Plant Physiol. 2008, 147: 1358-1368. 10.1104/pp.108.121392.
- Won SK, Lee Y-J, Lee H-Y, Heo Y-K, Cho M, Cho H-T: Cis-element and transcriptome-based screening of root hair-specific genes and their functional characterization in Arabidopsis. Plant Physiol. 2009, 150: 1459-1473. 10.1104/pp.109.140905.
- Inada DC, Bashir A, Lee C, Thomas BC, Ko C, Goff SA, Freeling M: Conserved non-coding sequences in the grasses. Genome Res. 2003, 13: 2030-2041. 10.1101/gr.1280703.
- Thomashow MF: Plant cold acclimation: Freezing tolerance genes and regulatory mechanisms. Ann Rev Plant Physiol Plant Molec Biol. 1999, 50: 571-599. 10.1146/annurev.arplant.50.1.571.
- Sasaki T, Antonio BA: Sorghum in sequence. Nature. 2009, 457: 547-556. 10.1038/457547a.
- Saeed AI, Sharov V, White J, Li J, Liang W, Bhagabati N, Braisted J, Klapa M, Currier T, Thiagarajan M, Sturn A, Snuffin M, Rezantsev A, Popov D, Ryltsov A, Kostukovich E, Borisovsky I, Liu Z, Vinsavich A, Trush V, Quackenbush J: TM4: a free, open-source system for microarray data management and analysis. Biotechniques. 2003, 34: 374-378.
- Lamesch P, Berardini TZ, Li D, Swarbreck D, Wilks C, Sasidharan R, Muller R, Dreher K, Alexander DL, Garcia-Hernandez M, Karthikeyan AS, Lee CH, Nelson WD, Ploetz L, Singh S, Wensel A, Huala E: The Arabidopsis Information Resource Center (TAIR): improved gene annotation and new tools. Nucl Acids Res. 2012, 40: D1202-D1210. 10.1093/nar/gkr1090. http://www.arabidopsis.org
- Chenna R, Sugawara H, Koike T, Lopez R, Gibson TJ, Higgins DG, Thompson JD: Multiple sequence alignment with the Clustal series of programs. Nucl Acids Res. 2003, 31: 3497-3500. 10.1093/nar/gkg500.
- Kumar JP, Jamal T, Doetsch A, Turner FR, Duffy JB: CREB binding protein functions during successive stages of eye development in Drosophila. Genet. 2004, 168: 877-893. 10.1534/genetics.104.029850.
- Kikuchi S, Satoh K, Nagata T, Kawagashira N, Doi K, Kishimoto N, Yazaki J, Ishikawa M, Yamada H, Ooka H, Hotta I, Kojima K, Namiki T, Ohneda E, Yahagi J, Suzuki K, Li CJ, Ohtsuki K, Shishiki T, Otomo Y, Murakami K, Iida Y, Sugano S, Fujimura T, Suzuki Y, Tsunoda Y, Kurosaki T, Kodama T, Masuda H, Kobayashi M, et al: Collection, mapping, and annotation of over 28,000 cDNA clones from japonica rice. Science. 2004, 301: 376-379.
- Satoh K, Doi K, Nagata T, Kishimoto N, Suzuki K, Otomo Y, Kawai J, Nakamura M, Hirozane-Kishikawa T, Kanagawa S, Arakawa T, Takahashi-Iida J, Murata M, Ninomiya N, Sasaki D, Fukuda S, Tagami M, Yamagata H, Kurita K, Kamiya K, Yamamoto M, Kikuta A, Bito T, Fujitsuka N, Ito K, Kanamori H, Choi IR, Nagamura Y, Matsumoto T, Murakami K, et al: Gene organization in rice revealed by full-length cDNA mapping and gene expression analysis through microarray. PLoS One. 2007, 2: e1235. http://cdna01.dna.affrc.go.jp/cDNA
- Tanaka T, Antonio BA, Kikuchi S, Matsumoto T, Nagamura Y, Numa H, Sakai H, Wu J, Itoh T, Sasaki T, Aono R, Fujii Y, Habara T, Harada E, Kanno M, Kawahara Y, Kawashima H, Kubooka H, Matsuya A, Nakaoka H, Saichi N, Sanbonmatsu R, Sato Y, Shinso Y, Suzuki M, Takeda J, Tanino M, Todokoro F, Yamaguchi K, Yamamoto N, et al: The Rice Annotation Project Database (RAP-DB): 2008 update. Nucl Acids Res. 2008, 36: D1028-D1033. http://rapdb.dna.affrc.go.jp
- Cartharius K, Frech K, Grote K, Klocke B, Haltmeier M, Klingenhoff A, Frisch M, Bayerlein M, Werner T: MatInspector and beyond: promoter analysis based on transcription factor binding sites. Bioinformatics. 2005, 21: 2933-2942. 10.1093/bioinformatics/bti473.
- Matys V, Fricke E, Geffers R, Gößling E, Haubrock M, Hehl R, Hornischer K, Karas D, Kel AE, Kel-Margoulis OV, Kloos DU, Land S, Lewicki-Potapov B, Michael H, Munch R, Reuter I, Rotert S, Saxel H, Scheer M, Thiele S, Wingender E: TRANSFAC: transcriptional regulation, from patterns to profiles. Nucl Acids Res. 2003, 31: 374-378. 10.1093/nar/gkg108.
- Huang E, Yang L, Chowdhary R, Kassim A, Bajic VB: An algorithm for ab initio DNA motif detection. Information Processing and Living Systems. Edited by: Bajic VB, Wee TT. 2005, London: World Scientific Publishing Ltd, 611-614.
- Higo K, Ugawa Y, Iwamoto M, Korenaga T: Plant cis-acting regulatory DNA elements (PLACE) database. Nucl Acids Res. 1999, 27: 297-300. 10.1093/nar/27.1.297.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.