RNA-seq data and de novo transcriptome assembly
To systematically investigate the transcriptome dynamics of the gills, liver, and kidney at different time points after copper ion treatment, we obtained 36 transcriptome samples. After removing low-quality sequences and trimming adapter sequences, 7.18–9.36 Gb of 125-bp paired-end clean reads were generated from each library. The total counts of clean reads were between 47,873,998 and 62,386,764.
Since there is no published genome of this species or a closely related species, we assembled the transcripts de novo and used this assembly as a reference. De novo assembly using all clean reads identified 704,405 transcript fragments, ranging in length from 201 to 23,517 bp, with an average length of 774 bp. The N50 (the length of the contig at 50% of the assembly arranged in descending order according to contig length) and N90 (the length of the contig at 90% of the assembly arranged in descending order according to contig length) values of the obtained transcripts were 1265 and 298 bp, respectively. In further assembly analysis, 486,221 gene fragments were identified across all transcripts, with an average length of 994 bp. The N50 and N90 values of the obtained gene fragments were 1480 and 438 bp, respectively.
The number of transcripts obtained in this work is similar to many published transcriptome assemblies [13, 17,18,19,20,21], including de novo transcriptome assemblies in several teleost species [13, 20, 21]. This indicates that our assembly quality is reliable to be used for subsequent transcriptome data analysis.
Gene functional annotation
All 486,221 gene fragments were annotated against seven public databases: NCBI non-redundant protein sequences (Nr), NCBI non-redundant nucleotide sequences (Nt), Protein family (Pfam), Clusters of Orthologous Groups of proteins/eukaryotic Ortholog Groups (KOG/COG), Swiss-Prot, Kyoto Encyclopedia of Genes and Genomes (KEGG), and Gene Ontology (GO). A total of 152,017 gene fragments were annotated against the Nr protein database (31.26%). The most common species with high sequence similarity in the Nr database was Danio rerio (64.8%), followed by Astyanax mexicanus (5.7%), Clupea harengus (2.8%), Ichthyophthirius multifiliis (2.1%), and Oncorhynchus mykiss (2.0%). Most of these species are teleost fish, suggesting the gene fragments used in this study were correctly assembled and annotated.
After GO annotation, the predicted G. eckloni genes were further cataloged in the next subdirectory under the three categories (Biology Process, Cellular Component, and Molecular Function). A total of 56 functional groups are displayed in Fig. 1.
KOG annotation assigned the 55,116 genes into 26 groups, which are shown in Fig. 2. Among these categories, the most sequences were assigned to “Signal transduction mechanisms” (10,376 genes), followed by “General function prediction only” (8966 genes), “Posttranslational modification, protein turnover, chaperones” (5602 genes), “Intracellular trafficking, secretion, and vesicular transport” (4198 genes), and “Transcription” (3644 genes).
A total of 50,669 genes were annotated using the KEGG database (Fig. 3). These genes were further divided into five groups: A, Cellular Processes (9905 genes); B, Environmental Information Processing (11,646 genes); C, Genetic Information Processing (6162 genes); D, Metabolism (11,077 genes); and E, Organismal Systems (20,366 genes).
Differentially expressed genes and functional enrichment
The responses of different tissues of the piebald naked carp to the copper ion treatment were different. In gills, after 6 h of metal ion treatment, 410 and 361 genes were respectively up- and down-regulated. After 36 h of treatment, the numbers of up and down-regulated genes were 327 and 324, respectively. The number of differentially expressed genes (DEGs) was abruptly increased after 72 h of treatment; 2238 and 609 genes were up- and down-regulated, respectively (Fig. 4a). The trend of gene expression patterns in the liver was similar to the gills. The period with the most differences in gene expression level occurred after 72 h of metal ion treatment, with 401 and 555 genes up- and down-regulated, respectively. The respective numbers were 285 and 246 after 6 h, and 325 and 553 after 36 h (Fig. 5a). The numbers of DEGs were relatively constant in different periods in the kidney. After 6 h of copper ion treatment, 422 and 326 genes were up- and down-regulated, respectively. The respective numbers were 332 and 357 after 36 h, and 295 and 255 after 72 h.
To further explore the mechanism of damage of metal ions to organs and the pathways of responses of the organs, we analyzed the functional enrichment of the DEGs. In GO functional enrichment in gills and liver, the most DEGs were enriched in “metabolic process” (Fig. 4b and 5b) under the Biological Process category. In the gills, the DEGs were enriched in the terms “organic substance metabolic process”, “primary metabolic process” under the Biological Process category; “cell” and “cell part” under the Cellular Component category; and “catalytic activity” under the Molecular Function category. In the liver, the DEGs were enriched in the terms “organonitrogen compound metabolic process”, “cellular component biogenesis” under the Biological Process category; “intracellular” and “intracellular part” under the Cellular Component category; and “heterocyclic compound binding” and “organic cyclic compound binding” under the Molecular Function category.
De novo transcriptome assembly often results in a large number of short contigs. To evaluate the potential impact of short contigs on the identification of DEGs, we plotted the length distribution of DEGs (Fig. 4c and 5c). The results indicated that the proportion of short contigs (< 500 bp) in the DEGs was minor (6.81% in gills and 6.49% in the liver).
In KEGG functional enrichment, the pattern of DEGs was also similar in the gills and liver. The most enriched pathway of DEGs was “Ribosome” in both tissues. The “Protein processing in endoplasmic reticulum”, “Oxidative phosphorylation”, “Lysosome”, and “Glycolysis/Gluconeogenesis” were also enriched in both tissues (Supplementary file 1 and 2) (Fig. 6).
Genes involved in oxidative stress and protein synthesis after copper exposure
The DEGs of both gills and liver were mostly enriched in “Ribosome” and oxidative stress response in the copper treated group as compared with control. Therefore, the expression levels of genes involved in oxidative stress response and protein synthesis were measured using qPCR. Our results showed that the expression levels of these genes displayed different patterns after copper treatment. Among the genes involved in oxidative stress, the expression of Hmox1 (Cluster-47,180.157989), Nqo1 (Cluster-47,180.117564), Gpx1 (Cluster-47,774.0), Cu/Zn-SOD (Cluster-47,180.196300), Prdx1 (Cluster-47,180.233418), Gadd45b1 (Cluster-47,180.146985), Hspa9 (Cluster-47,180.10385), Hspa14 (Cluster-47,180.185583), and Mn-SOD (Cluster-47,180.180330) was greatly up-regulated in the gills (Fig. 7), while the expression of Nqo1 (Cluster-47,180.117564), Cu/Zn-SOD (Cluster-47,180.117564), Prdx1 (Cluster-47,180.233418), Gadd45b1 (Cluster-47,180.233418), Hspa14 (Cluster-47,180.185583), and Mn-SOD (Cluster-47,180.180330) was greatly up-regulated in the liver (Fig. 8). Among the genes involved in protein synthesis, the expression levels of translation initiation factors Eif4ebp1 (Cluster-47,180.374620) and eIF2α (Cluster-47,180.297183), and elongation factors eEF2 (Cluster-47,180.277438) and Eef1b2 (Cluster-47,453.0), were up-regulated (Figs. 7 and 8). In addition, the expression of Hmox1 (Cluster-47,180.157989), Gpx1 (Cluster-47,774.0), Eef1b2 (Cluster-47,453.0), and Hspa8b (Cluster-47,180.176206) was down-regulated in the liver (Fig. 8). In summary, it can be seen that some transcriptome data showed large differences among duplicate samples, but there was general agreement between transcriptome results for whole heavy metal stress and qPCR validation tests, and transcripts basically represented true expression differences.