Genetical Genomics Reveals Candidate Genes for Weevil Resistance and Genomic Regions with Regulatory Hotspots
In the present study we merged genetics and transcriptomics [40, 41] to better understand weevil resistance in spruce, an important phenotypic trait that defines the life history of this economically valuable perennial plant organism. Gene expression variation represents the phenotype most directly related to DNA sequence polymorphism, such that in principle each transcript has a corresponding gene with known position in the genome [42]. Genetical genomics allows assaying thousands of these gene expression traits simultaneously and thus provides data on a large and unbiased set of traits (ibidem); these 'expression phenotypes' are then accessible to standard QTL analysis. Using the gene expression traits obtained from our transcriptome study we achieved fine scale phenotyping that was merged with phenotyping of the conventional traits. This allowed us to identify positional candidate genes for any phenotypic variation by testing for co-segregation of markers with gene expression, individual resistance traits, or the composite trait represented by all sampled defense traits (see below). This way, we found certain members of complex gene families within the phenylpropanoid metabolic pathway individually associated with weevil resistance.
We used the experimental setup as presented in Figure 2. Plant material was harvested - as outlined in the Methods section - from the progeny of resistant-female-by-susceptible-male crosses, which showed wide segregation for weevil resistance and had shared parentage ([43]). Thus, cross 26 of ♀PG87*♂PG165, cross 27 of ♀PG87*♂PG117, cross 29 of ♀PG21*♂PG165 and cross 32 of ♀PG21*♂PG117 forming the partial diallel were chosen for further analysis. The plant material used for expression profiling (bark tissue from tree leaders) was harvested in a randomized fashion to minimize bias due to the sampling procedure. DNA from an expanded mapping population (417 individuals in total) was genotyped using a custom-built 384 multiplexed SNP chip. This information was used to estimate pairwise recombination rates between SNP loci and subsequently construct the framework genetic linkage map for localizing the QTLs (Additional Files 2 and 3). Phenotypic information for QTL analysis was obtained from (a) tree height, (b) weevil attack, (c) oviposition, and (d) transcriptomics data. Measures were taken for the initial tree height in 1995, and heights in years three and five as well as leader length in year five preceding the artificial augmentation of the local weevil population in October of the same year. Attack rates in 2000 and 2001 were classified as successful top kills, failure to kill the leader, and no attack. For the same years, egg counts along the leaders were summarized into five discrete classes. Comprehensive information about these measures can be found in Additional File 4. In the transcriptomics experiment a two color 21.8 k spruce EST array was employed for gene expression profiling in the partial diallel progeny. A distant pair design that maximized direct comparisons between different alleles at each locus [41] was modified for outbred individuals (Additional File 5). Signal intensities can be found under the GEO accession number GSE22116. We used the signal intensity at each EST spot for the following analyses: (a) generation of expression QTLs (eQTLs) and (b) establishing a co-expression network. In the latter the gene-gene interactions were assessed. Specifically, a Gaussian model and a shrinkage method were employed to evaluate direct gene connections (see Methods). QTLs were mapped in the diallel progeny using a likelihood function to assess the phenotype effect conditional on genotypic variation. A QTL was significant at LOD≥3.84 (Additional Files 6 and 7). A goodness-of-fit test assuming a uniform distribution was performed to test whether the observed frequencies of eQTLs along the linkage map differed significantly from the expected value. Following the rejection of this null hypothesis "eQTL hotspots" signified eQTL clusters with ≥38 eQTLs at a given locus. Positional candidate genes were identified by collocation of at least 40% of their eQTLs with phenotypic trait QTLs based on the criteria for identifying significant QTLs (10,000 randomizations, p ≤ 0.05). The positional candidate genes for the general 'resistance' trait were identified based on strong collocations of gene expression variation with all six studied resistance traits (see Material and Methods).
Expression variation from 428 gene spots generated 4,221 significant eQTLs (LOD ≥ 3.84). Figure 1A shows the phenylpropanoid pathway, gene families from the core branch as well as related families; data relevant for this study are found in Additional File 1. Phenylpropanoid pathway-linked genes that showed strong association with the general 'resistance' phenotype were: SAMS, one ADT-like gene (PicglADTL8, Figure 3), an annotated spruce CYP750 (C24, Additional File 8), and a member with similarity to LAR. An additional 34 phenylpropanoid-related candidate genes were identified based on significant co-segregation with individual weevil resistance traits such as attack rates and oviposition estimates: transcription factors (seven putative mybs and one putative NAC), putative members of the upstream shikimate pathway (DAHP and DHQD-SD), one putative PAL, two annotated CHS as well as one STS, two OMTs from group C and one from group D (OMTL), two P450s (CYP75/F3'H and CYP750, respectively), three different representatives of phenylpropanoid reductases (PCBER, PLR and IFR), one DIR (f-family), six PRXR (including the stress inducible PicabPRX2 orthologue), two putative LACs, one putative catechol-O-methyltransferase and two genes with similarity to LAR. Most associations were found for weevil susceptibility measured in the year 2000 and for both years 2000 and 2001. No dedicated monolignol biosynthetic gene co-segregated extensively with any resistance trait. However, 29 genes from our gene set were associated with individual height growth traits (predominantly tree height measures taken in 1999 preceding the artificially enforced weevil attacks), Additional File 1. As many as 11 peroxidases co-segregated with growth traits, among them several genes which are implied in lignin polymerization by radical coupling of the monolignols. Their co-expression pattern with other phenylpropanoid-related genes is shown in Figure 4.
We superimposed at a given SNP marker pQTL maps of individual phenotypic traits and counts of significant eQTLs from the studied gene set (Figure 5). We found regulatory hotspots comprising multiple pQTLs and accumulated eQTLs. Along this QTL density map, eight loci represented hubs of trans-eQTLs, which also corresponded with at least three pQTLs: four loci were associated exclusively with resistance QTLs, and each two loci with growth QTLs and QTLs from individual traits of both growth and resistance, respectively (Figure 5). The composition of those eQTL hotspots with extensive pQTL overlap is given in Additional File 9.
Since the phenylpropanoid pathway proved to be important in insect resistance, we looked at the representation of individual branches within the pathway as well as individual gene families among those regulatory hotspots that were associated with pQTLs. We have summarized the average number of detected eQTLs per gene family as well as gene families whose members have eQTLs at a minimum of two regulatory hotspots with pQTL association (Additional File 10 and Figure 6). Specifically, we compared members from transcription factors myb and NAC, the shikimate pathway, PAL, C4H and 4CL (main branch of the phenylpropanoid pathway), and gene families that are involved in the biosynthesis of various secondary compounds as well as lignin and lignans, respectively (Additional File 10). Among all pathway branches, the branch of the phenylpropanoid pathway that is directly involved in the biosynthesis of the secondary compounds generated on average the highest number of eQTLs, whereas the shikimate pathway generated the lowest number of eQTLs on average. Also, the transcription factors followed by the pathway branch that is directly related to lignin/lignan biosynthesis contributed on average the most members to resistance associated hotspots (Additional File 10). In 20 out of 29 studied gene families, individual family members contributed eQTLs to at least two different phenotype-associated eQTL hotspots. For example, members of CCoAOMT, CCR, ADT, as well as myb, PRXR, OMT, P450 families had eQTLs preferentially associated with resistance hotspots (Figure 6). Specifically, we found ADTs (and one ADT-like gene), OMTs from group C as well as both AEOMT genes, the lignin-forming genes PiglCCoAOMT3, PicabPRX1 and PicabPRX18 orthologs and the stress inducible PicabPIPRX gene, and the P450s that are involved in a wide range of derivatization reactions (CYP75/F3'H, CYP750, F3'5'H/F3H and C4H class, respectively), Additional File 9. The DIR gene family members had eQTLs predominantly associated with growth trait QTLs (Figure 6) that were mainly representatives of the f-family (Additional File 9).
Although most spruce gene markers used to build our framework linkage map lack consistent annotations (67% using TAIR7, 54% using Viridiplantae databases gave no hits), Additional File 2, for the few loci that are potential gene expression regulators located within eQTL hotspots and affect highly complex phenotypes, we further infer their putative gene function in spruce. For example, the growth trait associated locus on LG13 with accumulated eQTLs from 49 array elements (Figure 5) represents a glutamate decarboxylase (GAD) gene. GAD catalyzes gamma-aminobutyric acid (GABA) synthesis via decarboxylation from the amino acid glutamate. The regulation of GAD enzyme activity is vital for normal plant development, and is accomplished by calcium/calmodulin binding to the specific CaM domain of GAD allowing the plant to respond to various external stimuli [44]. Recently, the locally enhanced production of GABA in the plant was shown to be connected with a deterrence reaction of the host responding to herbivore attack [45]. Two loci that encode bona fide CCoAOMT genes (CCoAOMT-1, CCoAOMT-2) on LG6 represented eQTL hotspots that overlapped with QTL regions for various resistance traits (Figure 5). At the two different CCoAOMT loci a common set of pQTLs (i.e., atk_2000, egg_2000 and sum_egg) as well as eQTLs (generated by 24 members of 11 different gene families) clustered. Most prominently the following gene families were involved: PRXR (PicabPRX18, e.g.), DIR (PgeDIR13, a-family; PicsiDIR31, PicsiDIR27, f-family), OMT (AEOMTs PicglOMT-17 and PicglOMT-18; PgOMT-35, clade II), myb (TT2 orthologue, e.g.) and ADT (PicglADT3, PicglADT4), Additional File 9. Both CCoAOMT genes have also cis-eQTLs (Additional File 9) that likely represent promoter polymorphisms involved in the differential expression of the genes. Our finding of cis-eQTLs connected by extensive trans-regulatory interactions describes an example of epistasis that is commonly observed in the regulation of pathway biosynthetic genes [46].
Arogenate Dehydratase and Related Sequences
While arogenate dehydratase activity was first detected in Nicotiana sylvestris[47], gene cloning and functional characterization was only recently reported in Petunia hybrida[48]. Five putative white spruce arogenate dehydratases (ADT) were identified with similarity to characterized bacterial and fungal prephenate dehydratase (PD). They group together with A. thaliana and P. hybrida (Figure 3) implied in the catalysis of the last step of the shikimate pathway specific for the biosynthesis of phenylalanine (the entry molecule for the core phenylpropanoid pathway) (Figure 1). In addition, eight ADT-like white spruce genes fall into two more remote clusters of the small family (Figure 3).
Four ADT genes as well as six ADT-like genes were represented as elements on our microarray. The ADT-like gene PicglADTL8 was identified as positional candidate for the general resistance phenotype per se, Additional File 1. In the co-expression network three ADTs (PicglADT2 WS00810_A15, PicglADT3 WS00921_B18, and PicglADT4 WS0261_E23) as well as one ADT-like (PicglADTL7 WS00937_M15) were present ("PD", PD_4, PD_6, PD_12 and PD_9, respectively; Figure 4). Despite the central importance of ADT in generating the precursor molecule phenylalanine, ADTs were positioned at the edges of long-range subgraphs in gene-gene interactions, Figure 4 (8), (12). PicglADT3 was significantly co-expressed with a class II 4CL gene WS01010_M10 (4CL_11), see below (Figure 4 (8), supporting its function as ADT. Expression of PicglADT2 is negatively correlated with the cluster of constitutive dirigents of the f and b/d class (DIR_2, DIR_32, DIR_31, DIR_20, DIR_18), Figure 4 (12). The ADT-like PicglADTL7 was co-expressed with OPCL PicglACL10 (Figure 4, 4CL_3 WS00729_F23) suggesting both genes act in a related pathway.
4-Coumarate:CoA-Ligase (4CL) and Related Acyl CoA Ligases (ACL)
In the biosynthesis of lignin, soluble and wall bound phenolics and flavonids, 4-coumarate:CoA ligase (4CL; EC 6.2.1.12) plays a gate keeper role as the enzyme generates not only the CoA ester of coumaric acid in the last step of the core phenylpropanoid pathway (Figure 1), but the higher substituted coumarate derivatives, such as caffeic acid, ferulic acid and 5-hydroxyferulic acid, and sinapic acid in some angiosperms as well. 4CL is part of a family of adenylate-forming enzymes present in all organisms, including an Arabidopsis gene recently shown to encode a functional 3-oxo-2(2'-[Z]-pentenyl)-cyclopentane-1-octanoic acid (OPC-8) CoA ligase (OPCL) catalyzing an essential step in jasmonic acid biosynthesis [49, 50]. The 4CL gene family is moderately sized [14] and shows four bona fide 4CL genes for spruce (Figure 7). Three white spruce 4CL candidates clustered closely with Arabidopsis 4CL3 (involved in flavonoid biosynthesis) and one spruce gene was found orthologous to loblolly pine 4CL1 (for which a role in the biosynthesis of lignin in compression wood was implied) (Figure 7) [51, 52]. In the context of adenylate-forming acyl-CoA ligases, 11 white spruce candidates fell into four divergent clusters, out of which four closely grouped with the Arabidopsis OPCL, indicating a similar function (Figure 7). While this represents a large subfamily, our knowledge about the function of the members is limited.
Of 13 array elements with similarity to acyl-CoA ligase genes, three represent sequence orthologs of bona fide 4CL genes, while ten elements have higher similarity to oxo-pentenyl-cyclopentane (OPCL), Additional File 1 and Figure 7. Two spruce 4CL genes (Picgl4CL3 WS00112_J15 and Picsi4CL2 WS01010_M10) were present in our co-expression network, Figure 4. They represent spruce classII 4CL genes, which are involved in the formation of flavonoids and soluble phenolics. Picgl4CL3 (4CL_1) was directly co-expressed with two positional candidates for resistance traits (Picsi-PKS22 WS0014_M21 and LAR WS00929_C24), Figure 4 (1). Picsi4CL2 (4CL_11) is co-expressed with PicglADT3, Figure 4 (8). The absence of the classI representative Picgl4CL4 (Additional File 1, Figure 7) and other lignin forming genes (CCoAOMT, and laccases implicated in constitutive lignifications [53]) along with the presence of other gene family members such as those involved in defense mechanisms like dirigents (a-family), P450s, polyketide synthases and phenylpropanoid reductases (see also Additional File 1) suggests higher importance of the recovered gene-gene interactions for defense mechanisms than for normal plant development.
CCoAOMT Superfamily, Related to Caffeoyl-CoA O-Methyltransferase
Caffeoyl-CoA O-methyltransferase (CCoAOMT, EC 2.1.1.104) catalyses O-methylation of the hydroxyl group at the C3 position of the phenolic ring in conversion of caffeoyl-CoA to feruloyl-CoA (Figure 1) and has been characterized in many plants, including loblolly pine, where it is associated with developmental lignification [54]. Three white spruce genes are closely related with a loblolly pine CCoAOMT in the clade of bona fide CCoAOMT (Figure 8). The diversity observed in the white spruce CCoAOMT clade is the result of lineage-specific gene duplications that were retained throughout evolution.
All three genes involved in monolignol biosynthesis were absent from the co-expression network. However, PicglOMT1, the spruce specific remote OMT (CCOMTL) (Additional File 11) that is putatively involved in phenylpropanoid metabolism and a positional candidate for Ht_1997 growth trait (Additional File 1), was represented by two array spots (IS0014_O19, WS0022_P17) in the network. These two genes termed CCoAOMT_1 and CCoAOMT_2 were directly connected to transcription factors (NAC_10 WS00911_P24 and myb_5 WS00716_G01, respectively) as well as to a putative shikimate EPSPS synthase WS0085_B05, Figure 4 (1). Furthermore, two catechol-O-methyltransferase-like genes WS0107_O19 and WS01013_K15 (Additional File 1) resided at the edges of the network (CCoAOMT_9, CCoAOMT_8, Figure 4): The first clustered with various annotated genes (PicglOMT-13 WS0261_A24, OPCL WS00729_F23, ADT-like WS00937_M15, PgeDIR5 WS00924_E04, see below and Additional File 1), while the second, a positional candidate for the egg_2001 resistance trait, connected to genes with unknown functions. The phylogeny for said catechol-O-methyltransferase (-like) sequences suggests they are ubiquitous genes that lack a counterpart in modern/angiosperm plants (Additional File 12) and are part of a yet to be elucidated pathway in conifer defenses.
Polyketide Synthases Related to Chalcone and Stilbene Synthases (CHS/STS)
Stilbene synthase (STS, EC 2.3.1.95) and chalcone synthase (CHS, EC 2.3.1.74) are plant-specific polyketide synthases at the entry of stilbenoid and flavonoid biosyntheses, respectively (Figure 1). Both types of enzymes perform a sequential condensation of three acetate units to a CoA-ester to form an intermediate that is folded into the aromatic ring systems of naringenin chalcone or the stilbene backbone [55, 56]. While flavonoids, which play a vital role as UV protective pigments in plants, are ubiquitous in the land plant kingdom and were reported in the basal lineages of liverworts and mosses [57], stilbenes have so far been detected in relatively few plant families where they contribute to the resistance of woody tissues to degradation and act as phytoalexins. The main constitutive stilbene glycosides in Picea species are astringin and isorhapontin [58]. Fourteen spruce CHS-like sequences form a tight cluster with the Japanese red pine CHS, indicating several duplications of functionally related polyketide synthases (Figure 9). While the ancestor of modern polyketide synthases predates the evolution of gymnosperms and angiosperms, the close relationship of STS and CHS within the angiosperms and gymnosperms indicates that these functions evolved independently several times. In gymnosperms distinct clades of both pine and spruce orthologues are found for STS and for CHS, consistent with the presence of both functions in a common ancestor of these lineages.
Strong collocation of gene expression and trait variation identified two CHS genes Picsi-PKS21 and Picsi-PKS22 as well as the STS gene Picgl-PKS6 as potential candidates for weevil resistance traits (sum_egg, egg_2001 and egg_2001, respectively), Additional File 1. Picsi-PKS22 WS0014_M21 is present within the network (CHS_1) and co-expressed with Picgl4CL3 and a peroxidase WS0063_P06 that co-segregated with the sum_atk resistance trait (Figure 4 (1)). Picgl-PKS6 WS00925_G22, (CHS_11, Figure 4 (5)) was negatively co-expressed with peroxidase PicsiPRX2 WS0074_A10 that responded to wounding ([59]). The CHS gene WS0024_K18 (termed CHS_3) from the Picsi-PKS14 cluster is associated with a myb transcription factor (myb_32, Figure 4 (3)).
Picgl-PKS2 WS00731_E22 (see also Figure 9) was negatively co-expressed with a-family dirigent PgeDIR2 WS00911_I09 (Figure 4 (11), CHS_8 and DIR_10). The expression of PgeDIR2 was significantly triggered by insect feeding [29], see below. Interestingly, Picgl-PKS2 was co-expressed with an ANAC057 transcription factor WS00929_E05 (NAC_14, Figure 4 (11)) that accumulated with increasing growth rate (unpublished). The contrasting expression pattern found for diverse members of the polyketide synthase gene family reflects the differential gene regulation for individual members.
O-Methyltransferase Superfamily, Related to Caffeic Acid/Coniferaldehyde O-Methyltransferase (COMT)
A total 37 of OMT/COMTL candidates were identified in white spruce through mining of the Treenomix EST database (Spruce V8 database will be published elsewhere, K. Ritland, pers. com.), relative to 17 genes in Arabidopsis, eight in poplar, and six in rice (Figure 10). The founding members of the O-methyl transferase superfamily (OMT, EC 2.1.1.6) involved in the synthesis of methylated plant phenolics are related to the formation of the angiosperm monolignol precursor of S-lignin. However, the S-lignin pathway and sinapic acid-derived metabolites, such as syringyl lignins, are absent in gymnosperms (Figure 1). Even though they are broadly represented in the OMT/COMTL superfamily, evidence for dedicated bona fide COMT orthologues in white spruce is missing.
OMTs were among phenylpropanoid gene families most completely represented in the co-expression network (Figure 4). Three COMTLs (PicglOMT-11 WS00715_G04/WS0261_F17, PicglOMT-10 WS0023_M11, PicglOMT-7 WS02611_F21) similar to the stress inducible O-methyltransferase from scots pine ([17]; clade I, Figure 10 and 11) were highly co-expressed, OMT_5, OMT_21, OMT_22, OMT_1, Figure 4 (2). Their expression was negatively correlated with the expression of a sequence (WS00716_K11) that shares similarity to AtPER16/AtPER45 and is a positional candidate for the Hgt1999 trait (Additional File 1). Five spruce OMTs (WS0071_H13, WS0078_K09, WS0104_J07, WS0046_C03, and WS01013_K11) that are part of the lineage specific clade in the plant OMT/COMTL superfamily (Figure 10) cluster within the network (OMT_4, OMT_10, OMT_19, OMT_3, and OMT_18, Figure 4 (3)) and were co-expressed with a transcription factor that weakly resembles the secondary wall-associated AtMYB83 (WS0079_B12). PicglOMT-1 WS0099_A18, with strong similarity to a beta-alanine betaine synthase (clade III, Figure 10 and 11), represents a positional candidate for the egg_2000 trait and is co-expressed with PgeDIR5, a dirigent of the a-subfamily, DIR_15, Figure 4 (4). Although their classification here is based on phylogenetic analysis alone, and biochemical characterization needs to support the annotation, spruce genes of the OMT/COMTL superfamily possibly extent the repertoire of spruce defense mechanisms.
Cytochrome P450s Involved in Oxygenation of Phenylpropanoids
In addition to a major plant carbon sink - the lignin biosynthesis - cytochromes P450 contribute to the biosynthesis of many bioactive phenolic derivatives (for a review see [60]). In the families of the cinnamte hydroxylase (C4H, CYP73) and coumaroyl-shikimate 3'-hydroxylase (C3'H, CYP98) two and one white spruce candidate were identified, corresponding to single genes in Arabidopsis, while no white spruce coniferylaldehyde 5-hydroxylase (CA5H, CYP84) sequence orthologues were found. In the CYP75 family, seven diverse white spruce candidates form two distinct clusters with the Arabidopsis CYP75B1 encoding a functional F3'H [61]. Phylogenetically related, but without biochemical function assigned, 45 white spruce candidates constitute the unusual large and diverse conifer specific CYP750 family, an outgroup to angiosperm CYP84. Another group of P450-dependent monooxygenases forms the spruce-specific outgroup of nine CYP76-likes (Additional File 8).
Within CYP75/F3'H sequences, family member C59 represents a positional candidate for the sum_egg resistance trait. Another representative (C57, WS00931_D17) is present in our co-expression network (Figure 4 (5)) and is negatively correlated with NAC_1 (WS00713_M11), a NAC transcription factor that is a candidate for the atk_2001 resistance trait. The function of the diverse CYP750 family (Additional File 8) is unknown, however, the sequence relationship with CYP84 (F5H), CYP75(F3'5'H/F3'H), CYP98 (C3H), and CYP73 (C4H) which are involved in oxidation reactions of the phenylpropanoid metabolism indicates that these families and CYP750s may share a common progenitor and possibly a structurally similar substrate of the phenylpropanoid class. On our array 11 spruce specific CYP750 genes were represented (Additional File 1), one CYP750 (C24) was identified in our study as a positional candidate for the weevil resistance phenotype per se, and three CYP750s (C18, C12 and C56) are found in the co-expression network. C18 (F5H_5, WS00934_G23) was co-expressed with a candidate (LAR similarity, DFR_3) for the general resistance phenotype (Figure 4 (7)). C12 (F5H_3, WS00923_E07) was negatively correlated with myb myb_1 (WS00917_H19) that is loosely related to TT2, a putative regulator of proanthocyanidin biosynthesis [62], and in our study a positional candidate for both atk_2000 and sum_atk resistance traits (Additional File 1, Figure 4 (8)). Expression patterns for both CYP750s suggest diverged, though unknown, functions consistent with their position in different subclades (Additional File 8). The third CYP750 C56 (F3'H_8, WS00922_H05) was negatively correlated to a cluster of co-expressed peroxidase genes of which some are implied in monolignol polymerization (PicabPRX18 IS0011_F24 (PRXR_1), spruce ArathPER42 WS0031_G05 (PRXR_13), and ArathPER64 WS01016_D12 (PRXR_58) orthologues, see elsewhere) (Figure 4 (8)). Future characterization of spruce CYP750 candidates will need to uncover their enzymatic function in planta.
Phenylpropanoid Reductases
Phenylpropanoid reductases are a class of closely related enzymes that include leucoanthocyanidin and isoflavone reductases (LAR/IFR), pinoresinol and lariciresinol reductasess (PLR), phenylcoumaran benzylic ether reductase (PBR), and isoeugenol synthase (IES) that have been shown to participate in the biosynthesis of a plethora of constitutive and induced defense related phenylpropanoids and phytoalexins, such as lignans and isoflavans [63–67]. With the exception of the IFR family, for which no white spruce homologues were identified, at least two white spruce candidates were detected in the other families (Figure 12A). A total of 12 white spruce homologs cluster closely with the loblolly pine PBR1 (Figure 12B) which reduces the benzylic ether functionalities of both dehydrodiconiferyl alcohol and dihydrodehydrodiconiferyl alcohol in the formation of 8-5'-linked lignans [66]. However, the phylogenetic distance of the white spruce candidates to the functionally characterized angiosperm and gymnosperm phenylpropanoid reductases does not allow functional predictions extending beyond similarity of the mechanism of the catalyzed reaction.
Out of 21 array elements annotated as phenylpropanoid reductases 11 were present within the co-expression network ("PCBER", Figure 4). Gene expression variation of PicglPPR05 (represented by three array elements: WS00723_J06, WS0048_K01, WS00920_D17 (Figure 4 (9)): PCBER_6, PCBER_3, PCBER_16) was co-expressed with the secondary wall-associated AtMYB20 (myb_19; WS0071_H14), which was a candidate for the ldr99 growth trait. PicglPPR21 is a positional candidate for the sum_egg resistance trait; its transcript abundance was lowest in the spruce mapping family with the most vigorous growth and highest weevil attack rate (unpublished). In the network this PCBER gene (WS0086_D12; PCBER_11) was connected to PicglDIR10 (WS0261_J16; DIR_30), a member of the b/d-subfamily and the cluster of peroxidases involving monolignol forming enzymes (AtPER64, AtPER42, PicabPRX18, see above, and PopalPRX WS0099_B17 (PRXR_51)), Figure 4 (8). Transcripts of PicglPPR13 (WS01011_J14) accumulated with increasing growth rate, and PicglPPR13 (PCBER_19) was negatively co-expressed with the resistance phenotype candidate LAR WS00725_B17, DFR_3, Figure 4 (7). The latter is correlated with another (remote) LAR representative WS00926_B24 (DFR_6) that was lowest expressed in the spruce mapping family 26, the stress inducible PicabPRX2 WS00928_G19 (PRXR_45), and the cluster of dirigents involved in constitutive defenses, DIR_2, DIR_32, DIR_31, DIR_18, DIR_20, Figure 4 (12), see also below. The expression pattern for PicglPPR13 suggests that this PCBER could be involved either in susceptibility or tolerance to insect feeding. However, for other representatives (PicglPPR21) their co-expression pattern suggests involvement in constitutive defenses.
Dirigent Proteins
A recent study on Sitka spruce dirigent-like proteins (DIR) [29] reported 35 unique DIR or DIR-like genes. For our survey, we were able to use expression data from 23 elements annotated as unique DIRs and represented on the microarray covering distinct a-, b/d-, and f-subfamilies (Additional File 1). The DIR f-subfamily is spruce specific, while angiosperm members can be found for both a- and b/d- subfamilies, [29] showed that DIRs from different clades have distinct gene expression patterns with a suggested role in induced defense to weevil feeding for members of the a- subfamily. Representatives of b/d- and f-subfamilies are more likely involved in primary processes or constitutive defense (ibidem).
Within the network PicglDIR12 WS00815_A07 (b/d-subfamily, DIR_7) was co-expressed with a growth trait associated transcription factor resembling secondary wall-associated AtMYB20 (myb_19) and with a cluster of six putative laccases, Figure 4 (9). A cluster of constitutive dirigents involving b/d-subfamily members (PicsiDIR17 WS01012_J06, PicsiDIR11 WS01012_K18, PicsiDIR21 WS02610_M19, PicglDIR7 WS0262_G08 (Additional File 1)), and PicsiDIR26 WS0058_H22 from f-subfamily were co-expressed with remote LAR (DFR_3), one of the identified positional candidates for weevil resistance, Figure 4 (12).
PgeDIR13 (WS00914_H24) was highly up-regulated in bark tissue following weevil feeding and for PgeDIR2 (WS00911_I09), another a-subfamily member, the highest induction was registered, [29]. In the network, PgeDIR13 (DIR_11) was co-expressed with two other a-subfamily members (PicsiDIR19 WS01011_J07, PicsiDIR16 WS01032_M02), DIR_17, DIR_25, Figure 4 (10), and for PgeDIR2 gene expression was negatively regulated with the expression of the polyketide synthase Picgl-PKS2, CHS_8, Figure 4 (11). Thus, our experiment confirms previous observations that dirigent subfamilies in constitutive defenses are differentially regulated than dirigents in induced defenses. Similar to the cluster of spruce/conifer-specific members from the OMT superfamily (OMTL), see above, dirigents from the same subfamily are also strongly co-expressed. This provides evidence for the presence of a multitude of gene copies with the same or highly similar protein functions that act in related pathways.
Class III Peroxidases
Plant class III peroxidases represent a key multifunctional enzyme family that is involved in such diverse processes as auxin catabolism, defenses, general stress responses, and lignification [68]. Peroxidases were suggested to function in the radical reaction of monolignols to the polymer lignin [69]. Misregulation of peroxidases provided indications for altered lignin composition ([70], review of xylem class III peroxidases in lignification: [71] and [72]). Recently, two lignin-forming peroxidases have been identified and characterized from Norway spruce suspension culture showing substrate preference for H- or G-lignin precursor molecules, respectively [15]. To date, 17 peroxidase unigenes for conifer lignin polymerization have been identified and their expression abundance in different tissues and after biotic and abiotic challenges studied [73].
For eight genes (PicabPRX1, PicabPRX2, PicabPRX3, PicabPRX4, PicabPRX6/PicabPRX7 cluster, PicabPRX16/17, PicabPRX18, PicabPIPRX) the respective white spruce orthologues were represented on the microarray (Additional File 1, Additional File 13). While some peroxidases with proposed roles in lignin polymerization are related to developmental lignification (PicabPRX1, PicabPRX4, PicabPRX16/17 cluster, PicabPRX18, array elements: see Additional File 1), others were stress inducible (PicabPRX2, PicabPRX3, PicabPIPRX, elements: Additional File 1) and reflect the specific stress responses of peroxidases ([73, 15]). In addition, gene expression of two other inducible peroxidases (PicsiPRX1 and PicsiPRX2, [59]) were surveyed (Additional File 1). PicsiPRX1 (WS0012_J09, PRXR_4) was co-expressed with several other peroxidases, Figure 4 (6). Constitutive expression levels of PicsiPRX2 (WS0074_A10) increased with higher inherent growth rate of individuals (unpublished). PicsiPRX2 expression (PRXR_26) was negatively correlated with expression of stilbene synthase Picgl-PKS6 (CHS_11), a positional candidate for the egg_2001 resistance trait (Figure 4 (5)). We suggest that PicsiPRX2 and Picgl-PKS6 are involved in different defense strategies. The white spruce PicabPRX2 orthologue WS00928_G19 is significantly associated with the sum_egg resistance trait, and is also co-expressed with the remote leucoanthocyanin reductase, one positional candidate for the "resistance phenotype" per se, (PRXR_45, DFR_3, Figure 4 (7), Additional File 1).
The ability to bind lignin was demonstrated in vitro for PicabPRX18 [73]. The white spruce orthologue of PicabPRX18 is a positional candidate for Hgt1999 growth trait. PicabPRX16/17 is a positional candidate for Hgt1997 trait. A co-expression cluster involving the white spruce orthologue of PopalPRX WS0099_B17, a putative spruce homologue of the secondary wall-associated transcription factor AtMYB83 WS0051_I02 and S-adenosylmethionine synthetase (SAMS) WS0261_C18 were positional candidates for Hgt_1995 trait (PRXR_51, myb_47, SAMS_6, Figure 4 (8)), Additional File 1. Also, the PicabPRX4 orthologue WS00912_J06 (PRXR_37) was co-expressed with a cluster of peroxidases with growth associations (Figure 4 (5), Additional File 1), among them PicabPRX6/PicabPRX7 (WS00910_L21/WS01025_F04: PRXR_36, PRXR_69). Interestingly, transcript abundance for all genes in this cluster decreased with growth rate. This suggests that a number of peroxidases are important in reinforcement of anatomical structures (via lignification, e.g.). Similarly, lignin biosynthetic genes are down-regulated in fast growing individuals [38].