De novo assembly, characterization and functional annotation of Senegalese sole (Solea senegalensis) and common sole (Solea solea) transcriptomes: integration in a database and design of a microarray
- Hicham Benzekri1, 2,
- Paula Armesto3,
- Xavier Cousin4, 5,
- Mireia Rovira6,
- Diego Crespo6,
- Manuel Alejandro Merlo7,
- David Mazurais8,
- Rocío Bautista2,
- Darío Guerrero-Fernández2,
- Noe Fernandez-Pozo1,
- Marian Ponce3,
- Carlos Infante9,
- Jose Luis Zambonino8,
- Sabine Nidelet10,
- Marta Gut11,
- Laureana Rebordinos7,
- Josep V Planas6,
- Marie-Laure Bégout4,
- M Gonzalo Claros1, 2 and
- Manuel Manchado3Email author
© Benzekri et al.; licensee BioMed Central Ltd. 2014
Received: 30 April 2014
Accepted: 15 October 2014
Published: 3 November 2014
Senegalese sole (Solea senegalensis) and common sole (S. solea) are two economically and evolutionary important flatfish species both in fisheries and aquaculture. Although some genomic resources and tools were recently described in these species, further sequencing efforts are required to establish a complete transcriptome, and to identify new molecular markers. Moreover, the comparative analysis of transcriptomes will be useful to understand flatfish evolution.
A comprehensive characterization of the transcriptome for each species was carried out using a large set of Illumina data (more than 1,800 millions reads for each sole species) and 454 reads (more than 5 millions reads only in S. senegalensis), providing coverages ranging from 1,384x to 2,543x. After a de novo assembly, 45,063 and 38,402 different transcripts were obtained, comprising 18,738 and 22,683 full-length cDNAs in S. senegalensis and S. solea, respectively. A reference transcriptome with the longest unique transcripts and putative non-redundant new transcripts was established for each species. A subset of 11,953 reference transcripts was qualified as highly reliable orthologs (>97% identity) between both species. A small subset of putative species-specific, lineage-specific and flatfish-specific transcripts were also identified. Furthermore, transcriptome data permitted the identification of single nucleotide polymorphisms and simple-sequence repeats confirmed by FISH to be used in further genetic and expression studies. Moreover, evidences on the retention of crystallins crybb1, crybb1-like and crybb3 in the two species of soles are also presented. Transcriptome information was applied to the design of a microarray tool in S. senegalensis that was successfully tested and validated by qPCR. Finally, transcriptomic data were hosted and structured at SoleaDB.
Transcriptomes and molecular markers identified in this study represent a valuable source for future genomic studies in these economically important species. Orthology analysis provided new clues regarding sole genome evolution indicating a divergent evolution of crystallins in flatfish. The design of a microarray and establishment of a reference transcriptome will be useful for large-scale gene expression studies. Moreover, the integration of transcriptomic data in the SoleaDB will facilitate the management of genomic information in these important species.
KeywordsSoles Transcriptome Microarray Orthology Molecular markers SoleaDB
The term “Soles” refers to a wide group of flatfish species belonging to the Soleidae (true soles) and Cynoglossidae (tongue soles) families. Common sole (Solea solea) and Senegalese sole (Solea senegalensis) are two economically important species highly appreciated worldwide due to the excellent quality of their flesh (low-fat, firm and white) and heavily exploited in industrial fisheries. As a result, sole overfishing has had a profound effect on some life-history traits observing a shift towards earlier sexual maturation and a decline of the spawning biomass [1, 2]. Therefore, aquaculture efforts have focused on S. senegalensis as one of the most promising species for diversification in Southern Europe due to its fast growth rates [3–5]. However, Solea aquaculture is facing several bottlenecks such as the production of high-quality larvae, the improvement and optimization of nutrition for better growth rates and the development of strategies for the control of infectious diseases. In addition, the failure to reproduce successfully soles of the F1 generation in captivity precludes the development of selection programs [4, 6]. Moreover, soles are an excellent model to study development and metamorphosis in fish. Soles undergo drastic morphological, physiological and ethological changes early during development for a short period of time (between 12–18 days after hatching in S. senegalensis). Therefore, sole species have become a suitable model to study larval ontogeny, skin pigmentation, hormonal and epigenetic mechanisms controlling development, sex differentiation, nutritional requirements, asymmetrical development as well as comparative genomics in flatfish [7–10].
In this context, development of large-scale genomic resources is a priority to facilitate the implementation of new technological approaches such as RNA-seq and genome-wide mapping studies, that can assist in the identification of signalling networks controlling metamorphosis, growth, reproduction or disease resistance to advance in the knowledge of their biology and to improve rearing techniques and selective breeding . Several studies have focused previously on the development of genomic resources in S. senegalensis and S. solea species. Molecular markers, including microsatellites (or Single Sequence Repeats; SSRs) and Single Nucleotide Polymorphisms (SNPs), and Bacterial Artificial Chromosomes (BAC) libraries have been developed [2, 7, 11–18]. Being scarce, more polymorphic markers are required for population management and breeding programs. Moreover, a limited set of expressed sequence tags (ESTs) has been described in each species that were used for the design of species-specific oligo-DNA microarrays [19, 20]. Consequently, the number of available ESTs is still far to conform a representative transcriptome as described in other teleosts and further sequencing efforts are required.
Next generation sequencing technologies (NGS) have drastically transformed the way researchers can address genomic questions on non-model species, including soles. The NGS platforms are able to generate quickly an enormous bulk of cost-effective genomic information, even for those species with limited or no previous genomic information available . Although NGS offers different applications, cDNA/RNA sequencing (RNA-seq) has become very popular for genome-wide transcriptome profiling and de novo sequencing of transcriptomes. The high-volume of transcriptomic reads generated constitute a rich and important source of potential molecular markers, including SSRs and SNPs, as well as for transcript processing, making possible the design and implementation of other techniques such as microarray hybridization or qPCR [7, 20]. In aquaculture, several studies have described the implementation of 454 pyrosequencing for de novo transcriptome sequencing of some Mediterranean species such as seabream [21, 22], common sole  or seabass . Most of these studies have focused on the characterization of transcriptomes under specific developmental stages, pathogen challenge or tissue-specific profiles. Nevertheless, the design of a dedicated species-specific database would be useful for easier management of genomic information (detailed annotations, tissue-specific and whole transcriptomes) and development of complementary techniques such as microarrays or RNA-seq.
The main aim of this work was the generation of a representative transcriptome for S. senegalensis and S. solea after processing an important large set of transcriptomic information produced by Roche/454 and Illumina paired-end NGS technologies covering a large number of developmental stages associated to physiological challenges. The main transcriptome features and characteristics were identified. Comparative analysis between soles allowed for the identification of orthologs, new genes and molecular markers. Finally, a browseable database referred to as SoleaDB was constructed and a new microarray tool was designed and validated.
Results and discussion
Pre-processing and assembly of NGS data
Pre-processing summary of raw reads
Reference to Figure1
Total Input Reads
Previous reports that focused on de novo transcriptome assembly used a variable number of input reads (from 2 up to 368 million reads [29–31]). However, a total of 20–30 million reads were considered sufficient to generate a de novo transcriptome, even though assembly is highly influenced by factors such as species or type of sample (whole-body or a specific tissue) . For example, the number of transcripts in six different marine invertebrates varied from 86,897 to 338,254 in spite of the fact that the number of input of reads was similar (56.2-80 million reads). Moreover, more transcripts were discovered using whole-body samples . In this study, a joint analysis of the complete set of samples comprising a wide range of organs and developmental stages was performed to facilitate a maximal gene representation into the final transcriptome assembly. As a whole, more than 1,800 million reads were assembled in each sole species, representing the highest number of reads reported to date for any organism.
Overview of assembled transcriptomes in S. senegalensis and S. solea
Transcripts with ortholog1
Different orthologous IDs
Different, complete ORFs
Transcripts without ortholog1
Putative new transcripts
Non-redundant new transcripts
The extremely large number of reads used to assemble the two sole transcriptomes could favour the accumulation of assembly errors [28, 31]. Evaluation of transcript accuracy was initially based on mapping useful reads of two randomly selected libraries of each sole transcriptome using Bowtie2 (not shown). Since 96.7-98.7% of the reads mapped onto assembled transcripts, the assembly errors can be considered negligible. Interestingly, the longest transcript in S. senegalensis v4 and S. solea v1 transcriptomes (Additional file 2, “Longest transcript”) is clearly not an artifact: in both species, it corresponded to a titin-like protein highly similar to a long mRNA (94,446 bp) previously assembled in tilapia (Acc No XM_005460929). Titin is a giant filamentous protein highly abundant in muscle that forms a separate myofilament system in both skeletal and cardiac muscle . The fact that this transcript is 6-fold longer in S. senegalensis v4 than in S. senegalensis v3 supports the significant contribution of Illumina short-reads to the final assembly.
SoleaDB, a database for browsing Soleatranscriptomes
Therefore, SoleaDB can be extremely useful for data comparison across experiments allowing for the identification of paralogues, alternative spliced transcripts and novel genes. It represents a new, easy-to-use, valuable tool to host NGS data and for sharing genomic information between users applying these techniques. Moreover, whole and reference transcriptomes will be a useful tool for downstream applications such as RNA-seq. Examples of usefulness of SoleaDB can be found in the next sections.
Transcripts without ortholog as a source of putative new sole transcripts
The high number of transcripts without ortholog (Table 2) deserved a deeper analysis. Based on their low testcode index, >91% of the unknown transcripts did not encode for proteins (Table 2; “Unknown”), which could explain in part the lack of orthology. To check the accuracy of these unknown transcripts, genome reads (from several shotgun genomic libraries of S. senegalensis available in our laboratory) were mapped onto the unknown transcripts of S. senegalensis v4, resulting in 462,568 (91.25%) unknown transcripts mapped (440,385 with more than 10 reads). These high mapping percentages indicate that these sequences were not assembly artifacts and that they might correspond to co-purified genomic fragments or immature transcripts.
Among the 7.30% and 8.37% of transcripts without ortholog in S. senegalensis v4 and S. solea v1, respectively, some showed a testcode index >0.94 (Table 2, “Putative new transcripts”) and, therefore, are likely coding transcripts. Those transcripts referred to as non-redundant new transcripts (14,451 and 15,503 transcripts in S. senegalensis v4 and S. solea v1, respectively; Table 2), based on the absence of orthologs in UniProtKB database, may represent “new” proteins (or fragments) in sole.
A reference transcriptome for each sole species
The high number of assembled transcripts indicated an overestimation of sole transcriptomes when compared to other teleosts [24–26]. Probably, a certain number of transcripts could actually represent alleles, paralogs, fragmented transcripts, spliced forms, immature mRNAs and even a combination of them. Therefore, identification of representative transcripts from these transcriptomes would be a useful tool to be used as a reference for future gene expression studies. For this purpose, representative transcripts were selected from (i) the longest transcripts with unique, different orthologous ID and (ii) the putative, non-redundant new transcripts (Table 2). Hence, the reference transcriptome for S. senegalensis (named S. senegalensis v4.1) consisted of 59,514 transcripts and the S. solea (named S. solea v1.1) reference transcriptome contained 54,005 transcripts (Table 2, last row). When useful reads (Additional file 1 and Table 1) were mapped onto these reference transcriptomes, 82.3-87.5% of reads were mapped onto transcripts, while 76.5-93.3% of transcripts received more than one read, suggesting that they represent adequately the transcriptome. Additional verification of the reference transcriptomes (v4.1 and v1.1) was based on an orthology analysis using zebrafish (43,132 entries available in RefSeq and 42,555 in ENSEMBL). In S. senegalensis v4.1, 39,851 reference transcripts were found to be orthologs to 21,542 RefSeq and to 20,753 ENSEMBL zebrafish entries, and in S. solea v1.1 34,949 reference transcripts were found to be orthologs to 20,594 RefSeq and to 19,632 ENSEMBL zebrafish entries (Additional file 3). These numbers suggest that a certain number of alleles, immature mRNAs and lineage-specific genes (or even some non-detected chimeric assemblies) may have been included in the reference transcriptome. Moreover, since the number of RefSeq and ENSEMBL IDs nearly corresponds to half the number of sole transcripts, it is likely that both alleles for each gene were included in the sole reference transcriptomes. This hypothesis is also supported by the fact that most of the samples analyzed corresponded to wild animals or larvae, being mostly heterozygous. It is worth noting that the number of different zebrafish IDs in RefSeq or ENSEMBL (Additional file 3) is close to the ~21,000 genes recently reported for half-smooth tongue sole  and not so different from the 26,206 genes that have been recently reported in zebrafish [24, 27]. Therefore, it can be suggested that most sole genes have been covered in the sole reference transcriptomes.
S. solea and S. senegalensisshow clear functional similarity, are highly orthologous, and contain sole- and flatfish-specific transcripts
S. solea and S. senegalensis are two closely-related species with similar morphology (differing mainly in the pigmentation pattern of pectoral fin), ecology (they usually live in sympatry in estuarine and coastal areas) and feeding habits and preys [35–37]. In silico genome comparisons previously performed among fish species to identify orthologous gene groups identified a high percentage of shared genes (90.5%) and only a small number of species-specific gene families (ranging from 271 in tetraodontiformes to 601 in zebrafish) . Therefore, a comparative analysis of sole transcriptomes could reveal new clues about their biology and evolution and also can provide supporting evidence of reference transcriptome accuracies.
The 21.68% and 22.63% of annotation success in the S. senegalensis v4 and S. solea v1 transcriptomes, respectively, is in concordance with the analysis of Full-LengtherNext (Table 2, “Transcripts with ortholog”). Comparison of GO terms between the two sole species revealed that they were similar (Additional file 4). The highest number of annotated transcripts by biological process was associated with metabolic (15.2%) and cellular (22.2%) processes. By cellular components, the most represented categories were cell (36.3%) and organelle (22.1%). By molecular function, the highest number of annotated transcripts was within the catalytic activity category (30.4%). Interestingly, the channel regulator and antioxidant activity categories were only represented in S. senegalensis. In conclusion, both sole transcriptomes appear to be very similar from a biological and functional point of view.
True orthologs between both sole species were obtained after performing a reciprocal Blast using the reference transcriptomes. In this analysis, two transcripts were considered as Blast-based, true orthologs when a highly restrictive reciprocal Blast (>97% identity) always resulted with their sequences giving the best score and E values . A total of 26,291 reference transcripts of S. senegalensis were highly homologous to reference transcripts of S. solea, and 21,238 reference transcripts of S. solea were highly homologous to reference transcripts of S. senegalensis. Of these homologous sequences, only 11,953 could be considered as true, Blast-based orthologs (data not shown). These include 210 unannotated transcripts with an average length of 900 nt in S. senegalensis and 1,199 nt in S. solea. More interestingly, 137 of them had a testcode ≥0.94, indicating that they likely code for a specialized protein (Additional file 5, “new-transcripts” tab). To confirm this hypothesis, they were compared to other fish proteins (Gadus morhua, Oryzias latipes, Oreochromis niloticus, Tetraodon nigroviridis, Gasterosteus aculeatus) and only 35 (25.5%) failed to find any orthology (Additional file 6). Moreover, 75 transcripts (54.7%) showed a clear ortholog only in the flatfish C. semilaevis, suggesting that these they might represent flatfish-specific transcripts. In conclusion, 11,953 transcripts were identified as true, Blast-based orthologs between both sole species, from which 75 are likely to be flatfish-specific transcripts and 35 are putative new sole transcripts.
Transcriptome comparison across teleosts revealed a set of lineage-specific genes
Sole transcriptomes confirmed the retention of crystallin genes
Sole transcriptomes as a source of molecular markers
SSR summary statistics for whole and reference transcriptomes
Type of SSR
Same repeat motif2
Different repeat motif
A total of 337,315 SNPs were identified in S. senegalensis and 381,404 in S. solea transcriptomes. A significant proportion of SNPs occurred in transcripts containing an ORF (109,235 [32.4%] and 115,746 [30.3%], respectively) with approximately half occurring within the ORF (53,265 and 46,599, respectively). These figures for SNP location in coding regions are similar to those found in other fish species ranging from 17.4 to 24.7% [45–47]. Although SNP prediction is only based on bioinformatic analysis and requires empirical validation to eliminate false positives and sequencing errors [45, 48], these SNPs can also be used as a source of putative molecular markers.
Design and validation of an oligonucleotide microarray for S. senegalensis
Validation of microarray data using qPCR
Angiotensin I converting enzyme 2
Heat shock protein 90-alpha
Complement component C3
Thyroid stimulating hormone, beta
De novo transcriptomes of S. solea and S. senegalensis covering their main developmental stages and organs were described based on a combined assembly approach that can be applied to other transcriptomic studies. The huge volume of reads processed in each species (>1,800 millions, the highest number of reads reported to date for any organism) produced a high number of transcripts that were mined to obtain a representative reference transcriptome for each species. The organization and deposit of all this information at SoleaDB offers the scientific community a new powerful resource for the management of genomic information in soles. Transcriptome comparisons and orthology analyses showed that both species are highly homologous and even contain transcripts with the same sequence. Moreover, comparisons across teleost transcriptomes allowed for the identification of some subsets of transcripts considered as new, species-specific and flatfish-specific transcripts. Transcriptome analysis followed by a phylogenetic analysis confirmed the retention of crystallins crybb2 and crybb3 confirming species-specific events during flatfish evolution. In conclusion, this study not only provided functional information about soles, but also provides new tools to the scientific community in the form of a database, SSR and SNP markers, and a new microarray with 43,403 features in Senegalese sole.
Biological materials and sample preparation
To cover the most important developmental stages and physiological and environmental conditions in soles, libraries were prepared using different technologies. A total of eight Roche/454 libraries were constructed mixing RNA from tissues related to the immune system (head kidney, spleen, gill, thymus and brain, obtained from 10 individuals stimulated using lipopolysaccharide, poly(I:C), peptidoglycan, zymosan A, and lipoteioic acid) , osmoregulation (gills, intestine, kidney and brain of 18 individuals challenged to three different salinities), and gonads, hypothalamus and pituitary (from 18 sole male and female wild-type and F1 broodstock; mean weight: males: 1567.3 g ± 487.7 g; females: 1891.1 g ±573.3 g) (see Additional file 1, “454” tab). In this latter case, animals were classified according to their sex and origin (F1 or wild) and RNA samples were equally pooled separately for these conditions (F1 males, F1 females, wt males and wt females). Illumina libraries (see Additional file 1, “Illumina” tab) were prepared from larvae and embryos selected at different developmental stages (early and late gastrula, early neurula and early somitogenesis in embryos and S0-S4 in metamorphosis) and treated with 4-diethylaminobenzaldehyde, all-trans RA, TTNPB, DMSO and thiourea in S. solea and S. senegalensis. Moreover, some libraries were prepared from S. senegalensis larvae cultivated under different environmental and nutritional conditions and exposed to methimazole, mifepristone and iopanoic acid (see Additional file 1, “Illumina tab”).
Samples of larvae incubated at two salinities were prepared as follows. Fertilized eggs of Senegalese sole were collected from spontaneous spawns at “El Toruño” facilities (El Puerto de Santa María, Cádiz). Water temperature in the broodstock tanks during spawning was approximately 18.5°C and salinity 34 ppt. Eggs were transferred to a 1,000 ml measuring cylinder to separate buoyant (viable) from non-buoyant (non-viable) eggs and the number in each fraction was estimated using volumetric methods (1,100 eggs ml−1). After estimating the number of fertilized eggs, embryos were incubated (at the gastrula stage) in 15 l cylinder tanks at an initial density of 2,000 embryos l−1. After seeding, two salinities (10 and 36 ppt) were established using a recirculation system that kept constant temperature (20°C ±0.5) and target salinity. Water turnover was maintained at one total renewal per hour during the experiment. Trial was done in triplicate tanks for each salinity. After hatching, larvae were sampled at day 3 using a 350 μm-mesh net. One pool of larvae were collected from each tube (~100 larvae/pool and n = 3 for each condition), washed with DEPC water, frozen directly in liquid nitrogen and stored at −80°C until analysis. The experimental procedures comply with the Guidelines of the European Union Council (86/609/EU) and IFAPA and IFREMER (17–010) rules for the use of animals in research.
RNA Isolation, library preparation and NGS analysis
Homogenization of tissues, including juvenile organs and the pools of larvae and embryos was carried out in the Fast-prep FG120 instrument (Bio101) using Lysing Matrix D (Q-Bio- Gene) for 40 s at speed setting 6. Total RNA was isolated from 50 mg of tissues or pools of embryos and larvae using the RNeasy Mini Kit (Qiagen). RNA integrity was further investigated using the Bioanalyzer 2100 (Agilent Technologies) before preparation of Roche/454 and Illumina libraries. The Roche/454 libraries were normalized, processed and sequenced by the Unitat de Genòmica (CCiT-UB, Barcelona, Spain) as described previously [22, 50]. Illumina libraries were constructed at the Centre Nacional d’Anàlisi Genòmica (Barcelona, Spain) for S. senegalensis using mRNA-Seq sample preparation kit and MGX platform (Montpellier, France) for S. solea using TruSeq RNA Sample Preparation Kit v2, in both cases according to manufacturer’s protocols. Briefly, 0.5 μg of total RNA was used for poly-A based mRNA enrichment selection using oligo-dT magnetic beads followed by fragmentation by divalent cations at elevated temperature resulting into fragments of 80–250 nt, with the major peak at 130 nt. First strand cDNA synthesis by random hexamers and reverse transcriptase was followed by the second strand cDNA synthesis performed using RNAseH and DNA Pol I. Double stranded cDNA was end repaired, 3´adenylated and the 3´-“T” nucleotide at the Illumina adaptor was used for the adaptor ligation. The ligation product was amplified with 15 cycles of PCR. Each library was sequenced using TruSeq SBS Kit v3-HS, in paired end mode, 2 × 76 bp (Solea senegalensis) and 2 × 100 bp (Solea solea), in a fraction of a lane (1/6 or 2/13 for S. senegalensis and 1/7 for S. solea) of a HiSeq2000 sequencing system (Illumina, Inc) following the manufacturer’s protocol, generating minimally 15 million paired-end reads for each sample. Images from the instrument were processed using the manufacturer’s software to generate FASTQ sequence files.
Pre-processing and assembly
The detailed strategy for transcriptome pre-processing and assembly is depicted in Figure 1. Roche/454 long-reads and Illumina short-reads were pre-processed using SeqTrimNext pipeline (http://www.scbi.uma.es/seqtrimnext) available at the Plataforma Andaluza de Bioinformática (University of Málaga, Spain) using the specific NGS technology configuration parameters. This pre-processing removes low quality, ambiguous and low complexity stretches, linkers, adaptors, vector fragments, organelle DNA, polyA/polyT tails, and contaminated sequences while keeping the longest informative part of the read. SeqTrimNext also served to discard sequences below 20 (short reads) or 40 bp (long reads).
The assembly strategy used follows the rationale that not a single assembler is satisfactory and, consequently, that two different algorithms (and/or parameter sets) should be used. Here MIRA3 (based on overlap-layout-consensus algorithm ) and Euler-SR (based on a strict de Bruijn graph analyzed by an Eulerian path ) were used. For S. senegalensis, Roche/454 long-reads were pre-assembled using MIRA3 with 454 settings. The same long reads were also assembled using Euler-SR with the default parameters and a k-mer = 29 (maximum length allowed). To remove artifactual sequences, contigs (consensus sequences) obtained using Euler-SR were mapped with the original reads using Bowtie2  allowing 2 mismatches to confirm the goodness of the final consensus. Unmapped contigs were considered a sign of misassembling and were submitted to Full-LengtherNext (see below) analysis to recover putative coding sequences. Illumina short-reads were pre-assembled using Oases (based on de Bruijn graphs ) with two k-mers: a small k-mer to recover lowly-expressed transcripts and a big k-mer for recovering highly-expressed transcripts since the use of multiple k-mers is reported to improve the quality and good performance of de novo assembling. k-mers from 19 to 69 were scanned in both species in seek of those that produce the lesser number of artifacts and the highest number of annotated transcripts (results not shown). As a result, 23 and 47 were used for S. senegalensis, and 25 and 69 for S. solea. Processed single reads and paired reads were assembled independently, providing distinct contig sets using a static coverage cutoff of 3. For computing efficiency and accuracy, the redundant and nested contigs were clustered using CD-HIT  at 99% identity, recovering only the longest contigs. After that, an in-house script was used to discard misassembled contigs based on the presence of exact, internal, direct or inverse repetitions. Pre-assemblies were finally reconciled using CAP3 with default parameters to provide the maximal set of transcripts for each transcriptome.
Transcripts were annotated using Sma3s  with default parameters and the vertebrate division of UniProtKB to provide gene description, GO terms, EC keys, KEGG maps and InterPro codes for every sequence. AutoFact  was used as a second gene description approach based on gene and EST databases. Orthology to zebrafish was determined using blastx with sole transcripts and the information available in ENSEMBL v72 and RefSeq at the time of writing (1/27/14), filtering for E <10−10 and a minimal identity of 30%.
Transcripts were also analyzed with Full-LengtherNext (http://www.scbi.uma.es/fulllengthernext) available at the Plataforma Andaluza de Bioinformática (University of Málaga, Spain) to provide a third gene description, as well as additional information about transcripts containing full-length ORFs, identification of ncRNAs, and transcripts with a putative start and stop codons and a predicted amino-acid sequence. Moreover, this software was used to remove or split chimeric transcripts providing a quick preview of the transcriptome and extracting the minimum set of transcripts that can be considered a reference transcriptome. Finally, putative SSRs were detected using Mreps (http://bioinfo.lifl.fr/mreps/) with default parameters counting repeats whose period was at least 2 and size at least 12 and a coverage of up to 1000 reads. The putative collection of SNPs was obtained mapping the original reads using Bowtie2 to the corresponding transcriptome and then analyzing the resulting SAM files with SAMtools  as described in http://samtools.sourceforge.net/mpileup.shtml.
SoleaDB was built using Ruby On Rails 2.0 (http://rubyonrails.org/) that allows the use of a model-view-controller pattern to maintain strict separation between the web interface (views) code, database tables (models), and all methods that handle interactions between views and database (controllers), as well as testing and production environments for each development phase of the database. The database tables were implemented in MySQL. Bulk imports, updates, and database managements were automated by means of Ruby scripts. An automated pipeline that combines all tools described above is executed on every SoleaDB update. SoleaDB can be browsed, retrieved and downloaded at http://www.juntadeandalucia.es/agriculturaypesca/ifapa/soleadb_ifapa/.
The microarray probes were designed following the workflow in Figure 7. The 252,416 transcripts of S. senegalensis v3 were analysed based on the Full-LengtherNext status provided for this transcriptome (available in SoleaDB). Transcripts qualified as N-terminal, internal or C-terminal were clustered by sequence to obtain the longest, non-redundant transcript that expands the maximum possible to the 3’-end that enable the design of specific probes in the fast evolving 3’-UTR region . Complete transcripts follow a similar reduction step but the criteria for representative transcript selection is based only on their length. Then, both collections were combined and clustered to provide the set of longest, non-redundant, annotated transcripts. Since the Agilent eArray (https://earray.chem.agilent.com/earray/) panel was limited to 45,220 60-mer probes and required the inclusion of 1,417 Agilent controls and 400 features corresponding to replicate probes of putative housekeeping genes (40 × 10), 13,881 additional probes were selected from the set of 21,099 “Coding” transcripts predicted with Full-lengtherNext and OrfPredictor  after removal of redundant sequences and sorting by testcode index. Since the resulting number clearly overloaded the microarray capacity (51,218 transcripts), these transcripts were sorted according to their testcode index. Then, all the selected transcripts were divided into 8 categories according to Blast2go, Sma3 and AUTOFACT annotations. Those probes that did not satisfied the quality criteria for cross-hybridization (BC3 and BC4) were then discarded and replaced by new transcripts until the total number of probes required for the design of a 4 × 44 format microarray was reached. As a result, the final microarray included a total of 45,220 probes, 43,403 specific for S. senegalensis, 1,417 Agilent controls, and 400 probes corresponding to replicates of putative housekeeping genes (40 × 10). The design of the array is stored in the NCBI Gene Expression Omnibus (GEO) database under accession GPL18543.
Microarray hybridization and qPCR validation
RNA labeling, hybridizations, scanning and data processing were performed according as previously described using the Agilent One-Color Microarray-Based Gene Expression analysis (Low Input Quick Amp Labeling kit) along with Agilent One-color RNA SpikeIn kit . Four pools of larvae incubated at 10 and 36 ppt were analyzed in two 4 × 44 chips. qPCR procedure for microarray validation was performed as previously described [10, 65–67]. Three pools of larvae incubated at 10 and 36 ppt at day 3 were analyzed. Real-time analysis was carried out on a CFX96™ Real-Time System (Bio-Rad) using Senegalese sole specific primers (Additional file 10). Real-time reactions were performed in duplicate containing cDNA generated from 10 ng of original RNA template, 300 nM each of specific forward and reverse primers, and 5 μl of iQ™ SYBR Green Supermix (Bio-Rad) in a 10 μl final volume. The amplification protocol used was as follows: initial 7 min denaturation and enzyme activation at 95°C, 40 cycles of 95°C for 15 s and 70°C for 30 s. Each PCR assay was performed in duplicate. For normalization of cDNA loading, all samples were run in parallel with the reference gene glyceraldehyde-3-phosphate dehydrogenase (gapdh2). Relative mRNA expression was determined using the 2-(∆∆Ct) method . Results were expressed as mean ± SEM. A Welch t-test was performed using GraphPad Prism v5 and significance was accepted at p <0.05.
Availability of supporting data
All 454 and Illumina data have been deposited in the Sequence Read Archive (SRA) database with bioproject numbers PRJNA255461 (http://www.ncbi.nlm.nih.gov/bioproject/255461), PRJNA241068 (http://www.ncbi.nlm.nih.gov/bioproject/241068) and PRJNA261151 (http://www.ncbi.nlm.nih.gov/bioproject/261151) for S. senegalensis and PRJNA261810 for S. solea (http://www.ncbi.nlm.nih.gov/bioproject/261810). Microarray hybridization data have been deposited in NCBI’s Gene Expression Omnibus and are accessible through GEO Series accession number GSE57173 (http://www.ncbi.nlm.nih.gov/geo/query/acc.cgi?acc=GSE57173). Additional information about the SSR codes for selected Illumina data used for transcriptome assembly in S. senegalensis is also included in the Additional file 1.
This study has been funded by project AQUAGENET (SOE2/P1/E287) program INTERREG IVB SUDOE and European Regional Development Fund (FEDER /ERDF) and P10-CVI-6075 from Junta de Andalucía. PA is supported by a PhD fellowship of IFAPA (Consejería de Agricultura y Pesca de la Junta de Andalucía) and funded by the Operational Program of European Social Fund 2007–2013 of Andalucía, within the priority axis 3 (Expand and improve investment in human capital) in 80%. MR was supported by the AQUAGENET project. The authors also thankfully acknowledge the computer resources and the technical support provided by the Plataforma Andaluza de Bioinformática of the University of Málaga.
- Mollet FM, Kraak SBM, Rijnsdorp AD: Fisheries-induced evolutionary changes in maturation reaction norms in North Sea sole Solea solea. Mar Ecol Prog Ser. 2007, 351: 189-199.View ArticleGoogle Scholar
- Cuveliers EL, Volckaert FA, Rijnsdorp AD, Larmuseau MH, Maes GE: Temporal genetic stability and high effective population size despite fisheries-induced life-history trait evolution in the North Sea sole. Mol Ecol Notes. 2011, 20 (17): 3555-3568.Google Scholar
- APROMAR: La Acuicultura Marina en España. 2012Google Scholar
- Cañavate JP: Opciones del lenguado senegalés Solea senegalensis Kaup, 1858 para diversificar la acuicultura marina. Bol Inst Esp Oceanogr. 2005, 21: 147-154.Google Scholar
- Schmidt G, Espinos F, Ruiz F, Segarra M, Mañanos E, Muñoz JL, Soler E, Chirivella J, Dove C, Barrera R, Lacomba T, Balach S, Santiago JM, Ambrosio L, López J, Ojeda J: Diversification in Aquaculture: A Tool for Sustainability. 2011, Madrid: Spanish Ministry of Environmental, Rural and Marine Affairs, Publications CentreGoogle Scholar
- Guzman JM, Ramos J, Mylonas CC, Mañanos EL: Comparative effects of human chorionic gonadotropin (hCG) and gonadotropin-releasing hormone agonist (GnRHa) treatments on the stimulation of male Senegalese sole (Solea senegalensis) reproduction. Aquaculture. 2011, 316: 121-128. 10.1016/j.aquaculture.2011.03.014.View ArticleGoogle Scholar
- Cerda J, Manchado M: Advances in genomics for flatfish aquaculture. Genes Nutr. 2013, 8 (1): 5-17. 10.1007/s12263-012-0312-8.PubMed CentralPubMedView ArticleGoogle Scholar
- Manchado M, Infante C, Asensio E, Cañavate JP, Douglas SE: Comparative sequence analysis of the complete set of 40S ribosomal proteins in the Senegalese sole (Solea senegalensis Kaup) and Atlantic halibut (Hippoglossus hippoglossus L.) (Teleostei: Pleuronectiformes): phylogeny and tissue- and development-specific expression. BMC Evol Biol. 2007, 7: 107-10.1186/1471-2148-7-107.PubMed CentralPubMedView ArticleGoogle Scholar
- Matsuoka MP, Infante C, Reith M, Cañavate JP, Douglas SE, Manchado M: Translational machinery of senegalese sole (Solea senegalensis Kaup) and Atlantic halibut (Hippoglossus hippoglossus L.): comparative sequence analysis of the complete set of 60s ribosomal proteins and their expression. Mar Biotechnol. 2008, 10 (6): 676-691. 10.1007/s10126-008-9104-y.PubMedView ArticleGoogle Scholar
- Infante C, Matsuoka MP, Asensio E, Cañavate JP, Reith M, Manchado M: Selection of housekeeping genes for gene expression studies in larvae from flatfish using real-time PCR. BMC Mol Biol. 2008, 9: 28-10.1186/1471-2199-9-28.PubMed CentralPubMedView ArticleGoogle Scholar
- Ponce M, Salas-Leiton E, Garcia-Cegarra A, Boglino A, Coste O, Infante C, Gisbert E, Rebordinos L, Manchado M: Genomic characterization, phylogeny and gene regulation of g-type lysozyme in sole (Solea senegalensis). Fish Shellfish Immunol. 2011, 31 (6): 925-937. 10.1016/j.fsi.2011.08.010.PubMedView ArticleGoogle Scholar
- Funes V, Zuasti E, Catanese G, Infante C, Manchado M: Isolation and characterization of ten microsatellite loci for Senegal sole (Solea senegalensis Kaup). Mol Ecol Notes. 2004, 4: 339-341. 10.1111/j.1471-8286.2004.00690.x.View ArticleGoogle Scholar
- Molina-Luzon MJ, Lopez JR, Navajas-Perez R, Robles F, Ruiz-Rejon C, De La Herran R: Validation and comparison of microsatellite markers derived from Senegalese sole (Solea senegalensis, Kaup) genomic and expressed sequence tags libraries. Mol Ecol Resour. 2012, 12 (5): 956-966. 10.1111/j.1755-0998.2012.03163.x.PubMedView ArticleGoogle Scholar
- Chen S-L, Shao C-W, Xu G-B, Liao X-L, Tian Y-S: Development of 15 novel dinucleotide microsatellite markers in the Senegalese sole Solea senegalensis. Fisheries Sci. 2008, 74: 1357-1359. 10.1111/j.1444-2906.2008.01668.x.View ArticleGoogle Scholar
- De La Herran R, Robles F, Navas JI, Hamman-Khalifa AM, Herrera M, Hachero I, Mora MJ, Ruiz-Rejon C, Garrido-Ramos M, Ruiz-Rejon M: A highly accurate, single PCR reaction for parentage assignment in Senegal sole based on eight informative microsatellite loci. Aquaculture Res. 2008, 39: 1169-1174. 10.1111/j.1365-2109.2008.01979.x.View ArticleGoogle Scholar
- Garoia F, Marzola S, Guarniero I, Trentini M, Tinti F: Isolation of polymorphic DNA microsatellites in the common sole Solea vulgaris. Mol Ecol Resour. 2006, 6 (1): 144-146.Google Scholar
- Diopere E, Hellemans B, Volckaert FA, Maes GE: Identification and validation of single nucleotide polymorphisms in growth- and maturation-related candidate genes in sole (Solea solea L.). Mar Genomics. 2013, 9: 33-38.PubMedView ArticleGoogle Scholar
- Garcia-Cegarra A, Merlo MA, Ponce M, Portela-Bens S, Cross I, Manchado M, Rebordinos L: A preliminary genetic map in Solea senegalensis (Pleuronectiformes, Soleidae) using BAC-FISH and next-generation sequencing. Cytogenet Genome Res. 2013, 141 (2–3): 227-240.PubMedView ArticleGoogle Scholar
- Cerda J, Mercade J, Lozano JJ, Manchado M, Tingaud-Sequeira A, Astola A, Infante C, Halm S, Vinas J, Castellana B, Asensio E, Cañavate P, Martinez-Rodriguez G, Piferrer F, Planas JV, Prat F, Yufera M, Durany O, Subirada F, Rosell E, Maes T: Genomic resources for a commercial flatfish, the Senegalese sole (Solea senegalensis): EST sequencing, oligo microarray design, and development of the Soleamold bioinformatic platform. BMC Genomics. 2008, 9: 508-10.1186/1471-2164-9-508.PubMed CentralPubMedView ArticleGoogle Scholar
- Ferraresso S, Bonaldo A, Parma L, Cinotti S, Massi P, Bargelloni L, Gatta PP: Exploring the larval transcriptome of the common sole (Solea solea L.). BMC Genomics. 2013, 14: 315-10.1186/1471-2164-14-315.PubMed CentralPubMedView ArticleGoogle Scholar
- Garcia de la Serrana D, Estevez A, Andree K, Johnston IA: Fast skeletal muscle transcriptome of the gilthead sea bream (Sparus aurata) determined by next generation sequencing. BMC Genomics. 2012, 13: 181-10.1186/1471-2164-13-181.PubMed CentralPubMedView ArticleGoogle Scholar
- Yufera M, Halm S, Beltran S, Fuste B, Planas JV, Martinez-Rodriguez G: Transcriptomic characterization of the larval stage in gilthead seabream (Sparus aurata) by 454 pyrosequencing. Mar Biotechnol. 2012, 14 (4): 423-435. 10.1007/s10126-011-9422-3.PubMedView ArticleGoogle Scholar
- Sarropoulou E, Galindo-Villegas J, Garcia-Alcazar A, Kasapidis P, Mulero V: Characterization of European sea bass transcripts by RNA SEQ after oral vaccine against V. anguillarum. Mar Biotechnol. 2012, 14 (5): 634-642. 10.1007/s10126-012-9466-z.PubMedView ArticleGoogle Scholar
- Howe K, Clark MD, Torroja CF, Torrance J, Berthelot C, Muffato M, Collins JE, Humphray S, McLaren K, Matthews L, McLaren S, Sealy I, Caccamo M, Churcher C, Scott C, Barrett JC, Koch R, Rauch GJ, White S, Chow W, Kilian B, Quintais LT, Guerra-Assuncao JA, Zhou Y, Gu Y, Yen J, Vogel JH, Eyre T, Redmond S, Banerjee R, et al: The zebrafish reference genome sequence and its relationship to the human genome. Nature. 2013, 496 (7446): 498-503. 10.1038/nature12111.PubMed CentralPubMedView ArticleGoogle Scholar
- Collins JE, White S, Searle SM, Stemple DL: Incorporating RNA-seq data into the zebrafish Ensembl genebuild. Genome Res. 2012, 22 (10): 2067-2078. 10.1101/gr.137901.112.PubMed CentralPubMedView ArticleGoogle Scholar
- Chen S, Zhang G, Shao C, Huang Q, Liu G, Zhang P, Song W, An N, Chalopin D, Volff JN, Hong Y, Li Q, Sha Z, Zhou H, Xie M, Yu Q, Liu Y, Xiang H, Wang N, Wu K, Yang C, Zhou Q, Liao X, Yang L, Hu Q, Zhang J, Meng L, Jin L, Tian Y, Lian J, et al: Whole-genome sequence of a flatfish provides insights into ZW sex chromosome evolution and adaptation to a benthic lifestyle. Nat Genet. 2014, 46: 253-260. 10.1038/ng.2890.PubMedView ArticleGoogle Scholar
- Kettleborough RN, Busch-Nentwich EM, Harvey SA, Dooley CM, de Bruijn E, van Eeden F, Sealy I, White RJ, Herd C, Nijman IJ, Fenyes F, Mehroke S, Scahill C, Gibbons R, Wali N, Carruthers S, Hall A, Yen J, Cuppen E, Stemple DL: A systematic genome-wide analysis of zebrafish protein-coding gene function. Nature. 2013, 496 (7446): 494-497. 10.1038/nature11992.PubMed CentralPubMedView ArticleGoogle Scholar
- Cahais V, Gayral P, Tsagkogeorga G, Melo-Ferreira J, Ballenghien M, Weinert L, Chiari Y, Belkhir K, Ranwez V, Galtier N: Reference-free transcriptome assembly in non-model animals from next-generation sequencing data. Mol Ecol Resour. 2012, 12 (5): 834-845. 10.1111/j.1755-0998.2012.03148.x.PubMedView ArticleGoogle Scholar
- Garg R, Patel RK, Tyagi AK, Jain M: De novo assembly of chickpea transcriptome using short reads for gene discovery and marker identification. DNA Res. 2011, 18 (1): 53-63. 10.1093/dnares/dsq028.PubMed CentralPubMedView ArticleGoogle Scholar
- Nakasugi K, Crowhurst RN, Bally J, Wood CC, Hellens RP, Waterhouse PM: De novo transcriptome sequence assembly and analysis of RNA silencing genes of Nicotiana benthamiana. PLoS One. 2013, 8 (3): e59534-10.1371/journal.pone.0059534.PubMed CentralPubMedView ArticleGoogle Scholar
- Francis WR, Christianson LM, Kiko R, Powers ML, Shaner NC, Haddock SHD: A comparison across non-model animals suggests an optimal sequencing depth for de novo transcriptome assembly. BMC Genomics. 2013, 14: 167-10.1186/1471-2164-14-167.PubMed CentralPubMedView ArticleGoogle Scholar
- Gregorio CC, Granzier H, Sorimachi H, Labeit S: Muscle assembly: a titanic achievement?. Curr Opin Cell Biol. 1999, 11 (1): 18-25. 10.1016/S0955-0674(99)80003-9.PubMedView ArticleGoogle Scholar
- Fernandez-Pozo N, Canales J, Guerrero-Fernandez D, Villalobos DP, Diaz-Moreno SM, Bautista R, Flores-Monterroso A, Guevara MA, Perdiguero P, Collada C, Cervera MT, Soto A, Ordas R, Canton FR, Avila C, Canovas FM, Claros MG: EuroPineDB: a high-coverage web database for maritime pine transcriptome. BMC Genomics. 2011, 12: 366-10.1186/1471-2164-12-366.PubMed CentralPubMedView ArticleGoogle Scholar
- Canales J, Bautista R, Label P, Gomez-Maldonado J, Lesur I, Fernandez-Pozo N, Rueda-Lopez M, Guerrero-Fernandez D, Castro-Rodriguez V, Benzekri H, Canas RA, Guevara MA, Rodrigues A, Seoane P, Teyssier C, Morel A, Ehrenmann F, Le Provost G, Lalanne C, Noirot C, Klopp C, Reymond I, Garcia-Gutierrez A, Trontin JF, Lelu-Walter MA, Miguel C, Cervera MT, Canton FR, Plomion C, Harvengt L, et al: De novo assembly of maritime pine transcriptome: implications for forest breeding and biotechnology. Plant Biotechnol J. 2014, 12 (3): 286-299. 10.1111/pbi.12136.PubMedView ArticleGoogle Scholar
- Vinagre C, Maia A, Cabral HN: Effect of temperature and salinity on the gastric evacuation of juvenile sole Solea solea and Solea senegalensis. J Appl Ichthyol. 2007, 23: 240-245. 10.1111/j.1439-0426.2007.00852.x.View ArticleGoogle Scholar
- Teixeira CM, Cabral HN: Comparative analysis of the diet, growth and reproduction of the soles, Solea solea and Solea senegalensis, occurring in sympatry along the Portuguese coast. J Mar Biol Assoc UK. 2010, 90: 995-1003. 10.1017/S0025315410000238.View ArticleGoogle Scholar
- Infante C, Catanese G, Manchado M: Phylogenetic relationships among ten sole species (Soleidae, Pleuronectiformes) from the Gulf of Cadiz (Spain) based on mitochondrial DNA sequences. Mar Biotechnol. 2004, 6 (6): 612-624. 10.1007/s10126-004-3081-6.PubMedView ArticleGoogle Scholar
- Muñoz-Merida A, Viguera E, Claros MG, Trelles O, Perez-Pulido AJ: Sma3s: a three-step modular annotator for large sequence datasets. DNA Res. 2014, 21 (4): 341-353. 10.1093/dnares/dsu001.PubMed CentralPubMedView ArticleGoogle Scholar
- Padhi A, Verghese B: Evidence for positive Darwinian selection on the hepcidin gene of Perciform and Pleuronectiform fishes. Mol Divers. 2007, 11 (3–4): 119-130.PubMedView ArticleGoogle Scholar
- Martin-Antonio B, Jimenez-Cantizano RM, Salas-Leiton E, Infante C, Manchado M: Genomic characterization and gene expression analysis of four hepcidin genes in the redbanded seabream (Pagrus auriga). Fish Shellfish Immunol. 2009, 26 (3): 483-491. 10.1016/j.fsi.2009.01.012.PubMedView ArticleGoogle Scholar
- Santos MD, Yasuike M, Kondo H, Hirono I, Aoki T: Teleostean IL11b exhibits complementing function to IL11a and expansive involvement in antibacterial and antiviral responses. Mol Immunol. 2008, 45 (12): 3494-3501. 10.1016/j.molimm.2008.02.004.PubMedView ArticleGoogle Scholar
- Toth G, Gaspari Z, Jurka J: Microsatellites in different eukaryotic genomes: survey and analysis. Genome Res. 2000, 10 (7): 967-981. 10.1101/gr.10.7.967.PubMed CentralPubMedView ArticleGoogle Scholar
- Liu SR, Li WY, Long D, Hu CG, Zhang JZ: Development and characterization of genomic and expressed SSRs in citrus by genome-wide analysis. PLoS One. 2013, 8 (10): e75149-10.1371/journal.pone.0075149.PubMed CentralPubMedView ArticleGoogle Scholar
- Brenner S, Elgar G, Sandford R, Macrae A, Venkatesh B, Aparicio S: Characterization of the pufferfish (Fugu) genome as a compact model vertebrate genome. Nature. 1993, 366 (6452): 265-268. 10.1038/366265a0.PubMedView ArticleGoogle Scholar
- Vera M, Alvarez-Dios JA, Millán A, Pardo BG, Bouza C, Hermida M, Fernández C, de la Herran R, Molina-Luzon MJ, Martínez P: Validation of single nucleotide polymorphism (SNP) markers from an immune Expressed Sequence Tag (EST) turbot, Scophthalmus maximus, database. Aquaculture. 2011, 313: 31-41. 10.1016/j.aquaculture.2011.01.038.View ArticleGoogle Scholar
- Hayes B, Laerdahl JK, Lien S, Moen T, Berg P, Hindar K, Davidson WS, Koop BF, Adzhubei A, Hoyheim B: An extensive resource of single nucleotide polymorphism markers associated with Atlantic salmon (Salmo salar) expressed. Aquaculture. 2007, 265: 82-90. 10.1016/j.aquaculture.2007.01.037.View ArticleGoogle Scholar
- Hubert S, Bussey JT, Higgins B, Curtis BA, Bowman S: Development of single nucleotide polymorphism markers for Atlantic cod (Gadus morhua) using expressed sequences. Aquaculture. 2009, 296: 7-14. 10.1016/j.aquaculture.2009.07.027.View ArticleGoogle Scholar
- Che R, Sun Y, Wang R, Xu T: Transcriptomic analysis of endangered chinese salamander: identification of immune, sex and reproduction-related genes and genetic markers. PLoS One. 2014, 9 (1): e87940-10.1371/journal.pone.0087940.PubMed CentralPubMedView ArticleGoogle Scholar
- Osuna-Jimenez I, Williams TD, Prieto-Alamo MJ, Abril N, Chipman JK, Pueyo C: Immune- and stress-related transcriptomic responses of Solea senegalensis stimulated with lipopolysaccharide and copper sulphate using heterologous cDNA microarrays. Fish Shellfish Immunol. 2009, 26 (5): 699-706. 10.1016/j.fsi.2009.02.020.PubMedView ArticleGoogle Scholar
- Pereiro P, Balseiro P, Romero A, Dios S, Forn-Cuni G, Fuste B, Planas JV, Beltran S, Novoa B, Figueras A: High-throughput sequence analysis of turbot (Scophthalmus maximus) transcriptome using 454-pyrosequencing for the discovery of antiviral immune genes. PLoS One. 2012, 7 (5): e35369-10.1371/journal.pone.0035369.PubMed CentralPubMedView ArticleGoogle Scholar
- Falgueras J, Lara AJ, Fernandez-Pozo N, Canton FR, Perez-Trabado G, Claros MG: SeqTrim: a high-throughput pipeline for pre-processing any type of sequence read. BMC Bioinformatics. 2010, 11: 38-10.1186/1471-2105-11-38.PubMed CentralPubMedView ArticleGoogle Scholar
- Chevreux B, Pfisterer T, Drescher B, Driesel AJ, Muller WE, Wetter T, Suhai S: Using the miraEST assembler for reliable and automated mRNA transcript assembly and SNP detection in sequenced ESTs. Genome Res. 2004, 14 (6): 1147-1159. 10.1101/gr.1917404.PubMed CentralPubMedView ArticleGoogle Scholar
- Pevzner PA, Tang H, Waterman MS: An Eulerian path approach to DNA fragment assembly. Proc Natl Acad Sci U S A. 2001, 98 (17): 9748-9753. 10.1073/pnas.171285098.PubMed CentralPubMedView ArticleGoogle Scholar
- Langmead B, Salzberg SL: Fast gapped-read alignment with Bowtie 2. Nat Methods. 2012, 9 (4): 357-359. 10.1038/nmeth.1923.PubMed CentralPubMedView ArticleGoogle Scholar
- Schulz MH, Zerbino DR, Vingron M, Birney E: Oases: robust de novo RNA-seq assembly across the dynamic range of expression levels. Bioinformatics. 2012, 28 (8): 1086-1092. 10.1093/bioinformatics/bts094.PubMed CentralPubMedView ArticleGoogle Scholar
- Li W, Godzik A: Cd-hit: a fast program for clustering and comparing large sets of protein or nucleotide sequences. Bioinformatics. 2006, 22 (13): 1658-1659. 10.1093/bioinformatics/btl158.PubMedView ArticleGoogle Scholar
- Koski LB, Gray MW, Lang BF, Burger G: AutoFACT: an automatic functional annotation and classification tool. BMC Bioinformatics. 2005, 6: 151-10.1186/1471-2105-6-151.PubMed CentralPubMedView ArticleGoogle Scholar
- Lara A, Pérez-Trabado G, Villalobos D, Díaz-Moreno S, Cantón F, Claros MG: A Web Tool to Discover Full-Length Sequences: Full-Lengther. Innovations in Hybrid Intelligent Systems. Edited by: Corchado E, Corchado JM, Abraham A. 2007, Berlin: Springer, 361-368.View ArticleGoogle Scholar
- Kolpakov R, Bana G, Kucherov G: mreps: efficient and flexible detection of tandem repeats in DNA. Nucleic Acids Res. 2003, 31 (13): 3672-3678. 10.1093/nar/gkg617.PubMed CentralPubMedView ArticleGoogle Scholar
- Li H, Handsaker B, Wysoker A, Fennell T, Ruan J, Homer N, Marth G, Abecasis G, Durbin R, Genome Project Data Processing S: The Sequence Alignment/Map format and SAMtools. Bioinformatics (Oxford, England). 2009, 25 (16): 2078-2079. 10.1093/bioinformatics/btp352.View ArticleGoogle Scholar
- Cross I, Merlo A, Manchado M, Infante C, Cañavate JP, Rebordinos L: Cytogenetic characterization of the sole Solea senegalensis (Teleostei: Pleuronectiformes: Soleidae): Ag-NOR, (GATA)n, (TTAGGG)n and ribosomal genes by one-color and two-color FISH. Genetica. 2006, 128 (1–3): 253-259.PubMedView ArticleGoogle Scholar
- Merlo MA, Cross I, Chairi H, Manchado M, Rebordinos L: Analysis of three multigene families as useful tools in species characterization of two closely-related species, Dicentrarchus labrax, Dicentrarchus punctatus and their hybrids. Genes Genet Syst. 2010, 85 (5): 341-349. 10.1266/ggs.85.341.PubMedView ArticleGoogle Scholar
- Min XJ, Butler G, Storms R, Tsang A: OrfPredictor: predicting protein-coding regions in EST-derived sequences. Nucleic Acids Res. 2005, 33 (Web Server issue): W677-W680.PubMed CentralPubMedView ArticleGoogle Scholar
- Calduch-Giner JA, Echasseriau Y, Crespo D, Baron D, Planas JV, Prunet P, Perez-Sanchez J: Transcriptional assessment by microarray analysis and large-scale meta-analysis of the metabolic capacity of cardiac and skeletal muscle tissues to cope with reduced nutrient availability in gilthead sea bream (Sparus aurata L.). Mar Biotechnol. 2014, in pressGoogle Scholar
- Manchado M, Infante C, Asensio E, Cañavate JP: Differential gene expression and dependence on thyroid hormones of two glyceraldehyde-3-phosphate dehydrogenases in the flatfish Senegalese sole (Solea senegalensis Kaup). Gene. 2007, 400 (1–2): 1-8.PubMedView ArticleGoogle Scholar
- Manchado M, Infante C, Asensio E, Crespo A, Zuasti E, Cañavate JP: Molecular characterization and gene expression of six trypsinogens in the flatfish Senegalese sole (Solea senegalensis Kaup) during larval development and in tissues. Comp Biochem Physiol B Biochem Mol Biol. 2008, 149 (2): 334-344. 10.1016/j.cbpb.2007.10.005.PubMedView ArticleGoogle Scholar
- Manchado M, Infante C, Asensio E, Planas JV, Cañavate JP: Thyroid hormones down-regulate thyrotropin beta subunit and thyroglobulin during metamorphosis in the flatfish Senegalese sole (Solea senegalensis Kaup). Gen Comp Endocrinol. 2008, 155 (2): 447-455. 10.1016/j.ygcen.2007.07.011.PubMedView ArticleGoogle Scholar
- Livak KJ, Schmittgen TD: Analysis of relative gene expression data using real-time quantitative PCR and the 2(−Delta Delta C(T)) Method. Methods. 2001, 25 (4): 402-408. 10.1006/meth.2001.1262.PubMedView ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly credited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.