A fundamental challenge in evolutionary biology is to mechanistically link genomic variation to the fitness of organisms in their natural environment. Linking genomes to fitness is a prerequisite for understanding adaptation, speciation, and the interactions between the two processes [1, 2]. In the past, different model systems have been used to understand either the genomic basis of phenotypes or the fitness effects of phenotypic variation under different environmental conditions in nature, but in only a few systems do we have a thorough understanding of both the ecological context in which phenotypic traits evolve and the genomic basis of those traits. Notable exceptions include heavy metal tolerance in Arabidopsis, eco-morphological differentiation in lake trout, whitefish [5, 6], and marine snails, the reduction of armor in freshwater sticklebacks [8, 9], and changes in fur coloration in beach mice.
Elucidating the genomic basis of adaptation and speciation in systems with in-depth knowledge of ecological sources of selection and phenotypic trait variation has been hindered by the lack of genomic resources, which are often available only for model organisms. However, next-generation sequencing (NGS) techniques offer a promising tool in this endeavor [11–13]. While whole-genome sequencing of replicated sets of individuals is still expensive, focusing on the transcribed portion of the genome (the transcriptome) has become increasingly popular. Such transcriptomic studies sequence cDNA libraries constructed from mRNA isolated from specific tissues (RNA-seq [14, 15]) and, in combination with barcoding technologies [16, 17], can provide large amounts of sequence and expression-level data for comparative analyses.
We provide a first characterization of the transcriptome of the Atlantic molly, Poecilia mexicana (Poeciliidae). This livebearing fish species is widely distributed in freshwater environments along the Atlantic versant from northeastern Mexico into lower Central America [18, 19]. Poecilia mexicana has been used as a study organism in animal behavior and behavioral ecology [20–22], predator–prey interactions [23, 24], sensory ecology [25–27], and life history evolution [28, 29]. Furthermore, the species is the maternal ancestor of a unisexual hybrid species, the Amazon molly (P. formosa), and is thus frequently investigated to address questions about the evolution and maintenance of sexual reproduction [30–32].
Most importantly, P. mexicana is an emerging model system for studying adaptation to extreme environments and ecological speciation, as the species has colonized both hydrogen sulfide-rich and cave habitats in southern Mexico. Hydrogen sulfide (H2S) is a potent toxicant that inhibits cellular respiration and is lethal to most metazoans even at micromolar concentrations [34, 35]. The absence of light in caves precludes the use of vision, and cave-dwellers must cope with perpetual darkness, which is especially challenging if they evolved from diurnal surface-dwelling forms, as in poeciliids [36, 37]. Extreme habitats harbor distinct ecotypes of P. mexicana that have evolved convergently across independently colonized sulfidic springs and caves [38, 39]. Compared to conspecifics in adjacent non-sulfidic surface habitats, which harbor the ancestral populations, extremophile populations have diverged in eye size, body shape, and gill morphology [38, 40], physiology, life history strategies [41, 42], and behavior [43, 44]. Despite the lack of physical barriers between extreme and adjacent normal habitats, gene flow across habitat types is very low, and reproductive isolation is at least partially mediated by natural and sexual selection against immigrants [46–48].
Despite the well-characterized selective environments and phenotypic variation in this system, there are currently no genomic resources with which to address questions about the genomic changes underlying trait divergence. Such resources are particularly needed to test whether fixed, positively selected mutations and changes in gene expression show similar patterns of convergence across replicated environmental gradients, or whether unique genomic changes in each evolutionary replicate produced similar phenotypic effects. To address such questions in the future, and to start building genomic resources for other study areas using P. mexicana (and close relatives such as the sailfin molly P. latipinna, the Amazon molly P. formosa, and the guppy P. reticulata), we used RNA sequencing to obtain a first transcriptome of this species. We focused on gill tissue, since many physiological processes involved in the maintenance of homeostasis take place there [49, 50], and used six individuals from ancestral, non-sulfidic populations to facilitate de novo transcriptome assembly in the absence of a reference genome. Our key objectives were (1) to create a database of the gill transcriptome of P. mexicana for future comparative studies, (2) to annotate transcripts based on functional annotations in reference databases, (3) to compare the P. mexicana transcriptome to other published fish transcriptomes, including those of the guppy (Poecilia reticulata), medaka (Oryzias latipes), threespined stickleback (Gasterosteus aculeatus), and zebrafish (Danio rerio), and (4) to identify potential candidate genes of interest in the study of extremophile poeciliids.