Fruit flavor and aroma profiles are shaped by a mixture of sugars, acids and volatile organic compounds. Individual volatile components have been demonstrated to play key roles in consumer liking, as noted in tomato and strawberry. Several reports have detailed the importance of key compounds to aroma production and the genetic loci or genes that control them [6, 16]. One component contributing to flavor in strawberry fruits is γ-decalactone (γ-D). This compound is associated with the recognizable aroma of peaches and apricots, and is of interest to industry as a flavoring agent. The synthesis of γ-D is not well understood in plants. However, several microorganisms can generate γ-D from fatty-acid substrates and are used as bioreactors to produce this flavor compound.
To identify transcripts related to γ-D production, two parents differing in production of the volatile were crossed. ‘Elyana’ produces large, firm berries and was bred for production in Florida. ‘Mara des Bois’ produces smaller, softer fruit and is not in wide commercial cultivation. Red, ripe fruits of similar size, age, and growing environment were assayed for the volatile and also for transcripts associated with receptacle/achene ripening. The RNAseq data from producers and non-producers were computationally bulked to identify sequences common to each pool (Additional file 1: Figure S1). Strawberry fruit transcriptomes have been examined previously [32–34], and show changes in a substantial number of transcripts throughout the ripening process. The commercial strawberry is octoploid and highly heterozygous. Crossing two plants produces a myriad of progeny phenotypes due to contributions from homoeologous genes. The segregation within subgenomes likely provides additional “noise” against which transcripts relevant to the trait separate clearly from the others, focusing the candidate set. Transcriptomes for each plant were considered independently, so the transcriptomes of γ-D producers and non-producers could be compared in silico.
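The in silico bulking described above can be illustrated with a minimal Python sketch. It assumes a per-plant expression table in RPKM and a producer/non-producer call for each plant. The function name, the 1.0 RPKM detection threshold, the plant and transcript identifiers, and all values are hypothetical and are not taken from the pipeline used in this study.

```python
from typing import Dict, List, Set

# Hypothetical detection threshold (RPKM); not a value reported in the study.
MIN_RPKM = 1.0


def bulk_specific_transcripts(
    rpkm: Dict[str, Dict[str, float]],
    producers: Set[str],
    non_producers: Set[str],
    min_rpkm: float = MIN_RPKM,
) -> List[str]:
    """Return transcripts detected in every producer and in no non-producer.

    rpkm maps transcript id -> {plant id -> RPKM}. A plant missing from a
    transcript's record is treated as having an RPKM of 0.
    """
    hits = []
    for transcript, by_plant in rpkm.items():
        in_all_producers = all(
            by_plant.get(plant, 0.0) >= min_rpkm for plant in producers
        )
        in_no_non_producer = all(
            by_plant.get(plant, 0.0) < min_rpkm for plant in non_producers
        )
        if in_all_producers and in_no_non_producer:
            hits.append(transcript)
    return sorted(hits)


# Toy data: P1 and P2 are producers, N1 and N2 are non-producers (all invented).
toy_rpkm = {
    "transcript_A": {"P1": 35.2, "P2": 12.8, "N1": 0.0, "N2": 0.3},
    "transcript_B": {"P1": 8.1, "P2": 9.4, "N1": 7.7, "N2": 10.2},
    "transcript_C": {"P1": 0.0, "P2": 14.0, "N1": 0.0, "N2": 0.0},
}
candidates = bulk_specific_transcripts(toy_rpkm, {"P1", "P2"}, {"N1", "N2"})
print(candidates)
```

In this toy table only transcript_A is present in both producers and absent from both non-producers. A strict presence/absence filter of this kind suits a qualitative trait and is not meant to capture differences in the amount of volatile produced.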
The test began with detection of γ-D. The γ-D producers show fluctuations in volatile quantity throughout the growing season (Figure 1). The genotypes with the highest concentrations were estimated to possess between 0.028 and 0.035 mM γ-D (data not shown). Accumulation of γ-D was never detected in the ‘Mara des Bois’ parent. The range of differences observed in parental fruits and progeny presented an ideal situation to assay global gene expression coinciding with volatile production.
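For readers more used to mass-based units, the estimated concentrations can be converted with a short calculation. This is only a sketch: it assumes the millimolar values are expressed per litre, and it uses the molar mass of γ-decalactone (C10H18O2, about 170.25 g/mol). The function name is hypothetical.

```python
# Molar mass of gamma-decalactone (C10H18O2), in g/mol.
GAMMA_DECALACTONE_G_PER_MOL = 170.25


def millimolar_to_mg_per_litre(
    conc_mm: float, molar_mass: float = GAMMA_DECALACTONE_G_PER_MOL
) -> float:
    """Convert a concentration in mM to mg/L (mmol/L x g/mol = mg/L)."""
    return conc_mm * molar_mass


low = millimolar_to_mg_per_litre(0.028)
high = millimolar_to_mg_per_litre(0.035)
print(f"{low:.2f} to {high:.2f} mg/L")
```

Under these assumptions the reported range corresponds to roughly 4.8 to 6.0 mg/L.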
When progeny were sorted by presence or absence of γ-D, a small subset of transcripts differed significantly between the phenotypic groups. Pairwise comparisons of RNAseq data based on the fruit phenotype resolved to a single, especially strong candidate: a transcript encoding an omega-6 fatty acid desaturase (FaFAD1). This gene is located near a QTL previously reported to explain about 90% of the γ-D phenotype in strawberry. A gene recently described in peach (PpFAD_1B-6, 71% identical and 82% similar to FaFAD1 at the protein level) was correlated with γ-D in ripening fruit, and demonstrated ω-6 oleate desaturase activity when overexpressed in yeast. Fatty acid desaturases catalyze the formation of double bonds in fatty acyl chains. The report from Sanchez and the biochemistry inferred from the protein sequence each suggest a role for this gene in lactone production, but the precise biochemical steps have not been demonstrated. Future work will examine the role of this transcript in transgenic lines.
FaFAD1 transcript abundance correlated well with the presence of γ-D. The transcript accumulated with ripening, and only in certain environmental conditions (Figure 5). The ability to produce γ-D segregated as a single dominant locus, consistent with previous reports. A PCR-based survey of materials from the population indicated that the FaFAD1 genomic sequence could not be amplified from γ-D non-producers, suggesting either a gene deletion or a radical sequence alteration that prevents PCR amplification. Eight primer combinations, drawn from ten distinctly positioned primers, were used to amplify regions upstream of and within the FaFAD1 gene (shown in Figure 7). In each case, ‘Elyana’ genomic template amplified the expected fragments, but ‘Mara des Bois’ produced no amplicons. This is consistent with the idea that a deletion is responsible for the absence of FaFAD1-related sequence in the ‘Mara des Bois’ genome, which would explain the dominant, single-gene effect for the γ-D phenotype. This finding allowed for development of a PCR-based molecular marker to identify genotypes with the potential to produce γ-D. PCR primers corresponding to the 5′ sequence of FaFAD1 were used to amplify the region shown in Figure 7 (primers A and B). The presence of the PCR product co-segregated precisely with the ability to produce γ-D in the parents, in the progeny, and in backcross populations (Figure 8A).
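The two checks implied above, marker/phenotype co-segregation and fit to a single-locus ratio, can be sketched in Python as follows. All plant identifiers and counts are invented for illustration. The 1:1 expectation is an assumption: it holds only if the producing parent carries a single copy of the dominant allele and the other parent carries none. The study reports single dominant locus segregation but the ratio tested here is not taken from it.

```python
from typing import Dict, Sequence

# Chi-square critical value for 1 degree of freedom at alpha = 0.05.
CHI2_CRIT_DF1 = 3.841


def cosegregates(marker: Dict[str, bool], phenotype: Dict[str, bool]) -> bool:
    """True if marker presence matches gamma-D production in every plant
    scored for both; False if no plant was scored for both."""
    shared = marker.keys() & phenotype.keys()
    return bool(shared) and all(marker[p] == phenotype[p] for p in shared)


def chi_square(observed: Sequence[int], expected_ratio: Sequence[float]) -> float:
    """Pearson chi-square statistic for observed class counts against a ratio."""
    total = sum(observed)
    ratio_sum = sum(expected_ratio)
    expected = [total * r / ratio_sum for r in expected_ratio]
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical scores: True = amplicon present / gamma-D detected.
toy_marker = {"pl1": True, "pl2": False, "pl3": True, "pl4": False}
toy_phenotype = {"pl1": True, "pl2": False, "pl3": True, "pl4": False}
print(cosegregates(toy_marker, toy_phenotype))

# Hypothetical progeny counts: 52 producers, 48 non-producers.
chi2_1_to_1 = chi_square([52, 48], [1, 1])
chi2_3_to_1 = chi_square([52, 48], [3, 1])
print(chi2_1_to_1 < CHI2_CRIT_DF1, chi2_3_to_1 < CHI2_CRIT_DF1)
```

With these invented counts the 1:1 ratio is not rejected, while a 3:1 ratio is. A presence/absence marker of this kind is dominant, so it cannot by itself distinguish plants carrying one copy of the gene from plants carrying several.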
The nature of the deletion is further illustrated using simple sequence repeat (SSR) markers, tools frequently used to fingerprint strawberry germplasm [37–39]. The availability of the genome of the diploid strawberry, F. vesca, makes it possible to position the microsatellite sequences within the structural context of the strawberry genome. One SSR sequence is located 11 kb from the FaFAD1 gene, and its polymorphisms would be predicted to segregate with the candidate gene. The results in Figure 8 show that the presence of allele 205 is a reliable predictor of the ability to produce γ-D, at least in the subset of the population evaluated. These data are important because they provide two independent, PCR-based tests that can differentiate γ-D producers and non-producers. Although it was tested only in the parental genotypes and several progeny, this second primer set offers another potential molecular marker that may be used to verify results from the FaFAD1 sequence. The findings also indicate that the SSR allele is present in the ‘Elyana’ parent and not in ‘Mara des Bois’, suggesting that the missing FaFAD1 gene may be part of a larger deletion.
The potential utility of this molecular marker was demonstrated when it was applied to a set of unrelated germplasm that also varied for presence or absence of γ-D. In this case, the PCR product could only be amplified in genotypes demonstrated to produce γ-D. The evaluation was extended to wild germplasm where fruit could be obtained. Even in these distant accessions, the PCR product was only amplified in lines where γ-D was detectable. The putative marker not only works within a population, but likely will work across commercial breeding populations and wild octoploid accessions. Future studies will survey diploid germplasm in an attempt to reconstruct the origin of the gene, as was done to trace linalool-producing variants of FaNES1 [6, 40].
There are notable limitations to the approach used in this study. The visual correlation between the γ-D phenotype and qRT-PCR results is shown in Figures 4 and 5. This quantitative trend, however, is not consistent in the RNA-seq RPKM data (data not shown). γ-D producers, irrespective of volatile amount produced, could not be distinguished quantitatively, though producers and non-producers were still easily discernible. This suggests that while the RNA-seq approach worked well for the qualitative γ-D phenotype, uncovering quantitative γ-D effects would be very challenging. This further underscores the need to incorporate independent methods (like qRT-PCR) for examining candidate genes in the early stages of discovery.
The development of two independent molecular markers that segregate with the phenotype in a wide range of germplasm will permit improved parental selection and rapid screening of progeny possessing the ability to produce γ-D. A simple and robust assay can rapidly eliminate individual plants from breeding populations, saving time, fuel, space and other resources. Most importantly, the marker allows more rapid integration of a fruity volatile into advanced selections.