Transitions from cross-fertilization to self-fertilization can have profound effects on population genetic structure and patterns of molecular evolution across the genome . Most importantly, homozygosity increases with more intense selfing, which decreases effective population size (Ne) and reduces opportunities for crossing over between heterozygous sites, resulting in elevated linkage disequilibrium [2, 3]. Effective population size is also reduced around selected regions by the effects of genetic hitchhiking, including selective sweeps of beneficial mutations and background selection against deleterious mutations (reviewed in ). Linkage among weakly selected sites with opposing selective forces can also interfere with the ability of selection to act efficiently .
Estimates of Ne in selfers are often lower than expected from the two-fold reduction caused by selfing alone. This result presumably occurs because life-history characteristics associated with selfing promote population subdivision, isolation, and frequent genetic bottlenecks [6–10]. Thus, both genetic and demographic processes in selfing populations should lead to a decrease in the efficacy of natural selection and an increase in the fixation of slightly deleterious mutations, with important consequences for genome evolution. The accumulation of deleterious mutations may also be an important factor in causing species extinction , and could explain the lack of persistence of selfing lineages over longer time scales [12–14]. However, the extent to which theoretical predictions of reduced selection efficacy in selfing populations are borne out in practice is unclear.
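The two-fold reduction attributed to selfing alone follows from a standard population-genetic result, sketched below. This is a minimal illustration, assuming the equilibrium inbreeding coefficient F = σ/(2 − σ) for a selfing rate σ, and Ne = N/(1 + F). The function name and the symbols σ and F are introduced here for illustration and are not taken from the text; demographic effects such as bottlenecks and subdivision are deliberately ignored.

```python
def ne_under_selfing(census_size: float, selfing_rate: float) -> float:
    """Effective population size under partial selfing (demography ignored).

    Assumes F = sigma / (2 - sigma) and Ne = N / (1 + F); both are
    illustrative assumptions, not formulas stated in the text.
    """
    if not 0.0 <= selfing_rate <= 1.0:
        raise ValueError("selfing_rate must lie in [0, 1]")
    # Equilibrium inbreeding coefficient for the given selfing rate.
    inbreeding = selfing_rate / (2.0 - selfing_rate)
    return census_size / (1.0 + inbreeding)


if __name__ == "__main__":
    # Hypothetical census size of 1000 individuals.
    for rate in (0.0, 0.5, 0.9, 1.0):
        print(rate, round(ne_under_selfing(1000, rate), 1))
```

Under these assumptions complete selfing (σ = 1) gives F = 1 and therefore halves Ne, so empirical estimates below N/2 point to additional demographic causes.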
The efficacy of selection depends on the product of Ne and the selection coefficient (s). The reduction of Ne due to selfing is expected to result in a higher rate of fixation of slightly deleterious mutations and a lower rate of fixation for advantageous mutations (reviewed in ). Diverse methods have been developed to detect the footprint of natural selection at the molecular level (reviewed in [15, 16]). One common approach is to quantify the ratio of mutations at non-synonymous sites (dN) versus synonymous or silent sites (dS); hereafter ω. Because selection acts primarily on proteins and not DNA sequences, synonymous changes are often treated as selectively neutral (but see below), thus enabling measurement of the degree of selective constraint on amino acid sequences. Under neutrality ω is expected to equal 1, whereas values of ω less than or greater than 1 indicate purifying or positive selection, respectively. The vast majority of functional proteins that have been examined exhibit ω values much less than one, indicating that most protein sequences are subject to purifying selection. However, in selfers a reduction in the efficacy of selection may result in elevation of this value as a result of the accumulation of deleterious mutations.
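The dependence of selection efficacy on the product of Ne and s can be illustrated with a small numerical sketch. It assumes the classical diffusion approximation for the fixation rate of new mutations relative to neutral ones, S / (1 − exp(−S)) with S = 4·Ne·s, for a diploid population with additive effects and small s. The scaling convention, the function name, and the parameter values are assumptions made for illustration; the sketch ignores the effects of selfing on homozygosity and dominance, and is not the method used in this study.

```python
import math


def relative_fixation_rate(ne: float, s: float) -> float:
    """Fixation rate of new mutations relative to neutral mutations.

    Uses S / (1 - exp(-S)) with S = 4 * Ne * s (assumed convention).
    For a single class of non-synonymous mutations this ratio is a rough
    expectation for omega when synonymous sites are neutral.
    """
    scaled = 4.0 * ne * s
    if abs(scaled) < 1e-12:
        return 1.0  # neutral limit
    if scaled < -700.0:
        return 0.0  # strongly deleterious: avoids overflow in expm1
    return scaled / -math.expm1(-scaled)


if __name__ == "__main__":
    # Hypothetical values: halving Ne, as expected under complete selfing.
    for s in (-0.001, 0.001):
        for ne in (1000, 500):
            print(s, ne, round(relative_fixation_rate(ne, s), 3))
```

With these hypothetical values, halving Ne raises the relative fixation rate of the slightly deleterious class and lowers that of the advantageous class, which is the direction of change described above.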
Although predictions concerning the effect of selfing on levels of polymorphism have been well documented [17, 18], evidence for a reduction of selection efficacy in selfing populations of diverse plants and animals is equivocal. The only study of which we are aware that has attributed an accumulation of deleterious mutations to selfing was a comparison of the selfing plant Arabidopsis thaliana with Drosophila melanogaster. Other studies focusing on closely related outcrossers and selfers [20–25] have failed to detect convincing evidence of reduced selection efficacy, leading to several hypotheses to explain the apparent lack of signal in the molecular data. First, the genomic distribution of selection coefficients is poorly known [26, 27], and if there are few weakly selected mutations, very little effect of the mating system is predicted . Another explanation involves the amount of time that has elapsed since the transition from outcrossing to selfing, which in some instances may be too short for substantial changes to have occurred at the genome level . Finally, no comparisons have been made using genomic data involving very large numbers of loci (e.g. thousands of genes), and it is possible that larger genomic samples may allow the detection of weak effects of selfing despite the stochasticity and slow rate of mutation accumulation.
The reduced efficacy of selection due to decreased Ne is also predicted to diminish the signal of biased codon usage. Proportional usage of synonymous codons often differs between high and low expression genes due to selection for higher translational efficiency and accuracy [29, 30]. However, the strength of selection on synonymous codon usage is expected to be weak relative to purifying selection against amino acid altering mutations. If this is true, reductions in Ne, such as those associated with transitions to selfing, are predicted to reduce the efficacy of selection to the extent that these mutations become effectively neutral [31, 32]. Thus, in principle the reduced efficacy of selection can be inferred from an increase in the frequency of substitutions (or polymorphisms) for unpreferred codons, or by less differentiation of codon usage between high and low expression genes. However, because codon bias is eroded by genetic drift, this process will occur rather slowly and is dependent on the rate of mutation . Therefore, to detect the reduced efficacy of selection based on codon usage it may be necessary either to include old selfing lineages, which may be difficult given the relatively recent origin of most selfers [14, 34], or to obtain data from a very large number of loci in order to detect the small changes that are possible in codon usage, an approach we use here.
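One way to quantify differentiation in codon usage between sets of genes is relative synonymous codon usage (RSCU): the observed count of a codon divided by the count expected if all synonymous codons for that amino acid were used equally. The sketch below is a minimal illustration of that widely used descriptive metric, assuming in-frame coding sequences and the standard genetic code. It is not claimed to be the measure used in this study, and the function name and example sequence are hypothetical.

```python
from collections import Counter
from itertools import product

# Standard genetic code, codons ordered with bases T, C, A, G.
BASES = "TCAG"
AMINO_ACIDS = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {
    "".join(codon): aa
    for codon, aa in zip(product(BASES, repeat=3), AMINO_ACIDS)
}


def rscu(coding_sequences):
    """Return {codon: RSCU} for amino acids with two or more codons."""
    counts = Counter()
    for seq in coding_sequences:
        seq = seq.upper()
        # Read complete in-frame codons; skip any with ambiguous bases.
        for i in range(0, len(seq) - 2, 3):
            codon = seq[i:i + 3]
            if codon in CODON_TABLE:
                counts[codon] += 1

    # Group codons into synonymous families by amino acid.
    families = {}
    for codon, aa in CODON_TABLE.items():
        families.setdefault(aa, []).append(codon)

    values = {}
    for aa, codons in families.items():
        total = sum(counts[c] for c in codons)
        # Stop codons, single-codon amino acids and unobserved families
        # carry no information about synonymous codon preference.
        if aa == "*" or len(codons) < 2 or total == 0:
            continue
        for c in codons:
            values[c] = counts[c] * len(codons) / total
    return values


if __name__ == "__main__":
    # Hypothetical toy sequence: four leucine codons.
    print(rscu(["CTGCTGCTGCTA"])["CTG"])
```

Computing RSCU separately for high and low expression genes in each lineage, and comparing the resulting profiles, would give one rough view of whether differentiation in codon usage is weaker in selfers.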
Empirical tests of these predictions have included selfing and outcrossing species of Brassicaceae [22, 35, 36], Caenorhabditis [23, 37] and members of the Triticeae [24, 25]. The results of these studies are mixed: there is no evidence of reduced codon bias in selfing species of the Triticeae, a slight reduction in codon bias in selfing C. briggsae relative to outcrossing C. sp. 5, and evidence from a recent analysis of an effect of selfing on codon usage in Brassicaceae . Two explanations have been proposed for the lack of a strong effect of selfing on codon usage. First, as mentioned above, many selfing lineages are thought to be of recent origin (e.g.), and there simply may not have been sufficient time for enough mutations to have drifted to fixation. Second, interspecific comparisons introduce confounding effects that are not directly related to selfing because of the independent evolutionary history of the species, thus limiting conclusions about the effect of selfing on codon bias . Ideally, tests of these predictions require contrasts between conspecific selfing and outcrossing lineages with fewer confounding effects; however, this may not be feasible over long evolutionary time spans. The approach we use in this study is to contrast both inter- and intraspecific selfing lineages of different ages and to use a large sample of loci in an effort to detect changes in the efficacy of selection on genomes.
Here, we investigate patterns of molecular evolution in the floral transcriptomes of three independently derived selfing lineages relative to an outcrossing genotype in Eichhornia (Pontederiaceae), a neotropical genus of aquatic plants. Our samples include three individuals of E. paniculata, an annual diploid that has been the subject of detailed studies on the ecology and genetics of mating-system variation over several decades (reviewed in ). Populations of E. paniculata are largely concentrated in N.E. Brazil, with smaller foci in Jamaica and Cuba and isolated localities in Nicaragua and Mexico. Populations in Brazil are mostly outcrossing and possess the sexual polymorphism tristyly, which promotes cross-pollination among the three floral morphs (reviewed in ). Morphological, genetic and biogeographical evidence indicates that tristyly has broken down on multiple occasions in E. paniculata, resulting in independently derived selfing populations [38, 40, 41]. Populations from Jamaica are largely composed of selfing variants of the mid-styled morph (M-morph) in which stamens are elongated to a position adjacent to mid-level stigmas, resulting in the autonomous self-fertilization of flowers. In contrast, plants from Mexico and Nicaragua are selfing variants of the long-styled morph (L-morph) with a different arrangement of sexual organs (see Figure two in ). Although both variants possess the selfing syndrome, comparisons of molecular variation at 10 EST-derived nuclear loci indicate a high level of genetic differentiation consistent with their separate origins from different outcrossing ancestors (see Figure three in ). Our analysis included one individual of each selfing variant and one individual of an outcrossing L-morph from N.E. Brazil, the likely centre of origin of the species. We also included a selfing individual of E. paradoxa, the sister species of E. paniculata (e.g.), to serve as a potentially more ancient selfing phenotype, and as an outgroup for inferences on the ancestral DNA sequence in our samples. Most populations of E. paradoxa are predominantly selfing, although a tristylous population is known from Brazil (see ), indicating that, in common with E. paniculata, selfing has likely arisen from the evolutionary breakdown of tristyly.
We used high-throughput DNA sequencing technology to generate a set of approximately 8000 orthologous ESTs from the floral transcriptomes of the four Eichhornia genotypes. Using this dataset, we investigated the molecular evolution of protein coding genes to address the following specific questions arising from the hypothesis of reduced selection efficacy in selfers: (1) Is there evidence for relaxation of purifying selection against non-synonymous mutations in selfing lineages? (2) Can we detect biased codon usage in our samples, and if so, does this vary among lineages based on their mating systems?