Construction of a high-density genetic map and mapping of growth related QTLs in the grass carp (Ctenopharyngodon idellus)

Abstract

Background

Grass carp (Ctenopharyngodon idellus) is an important species in Asian aquaculture. A draft genome for grass carp was published in 2015, but a suitable genetic linkage map is still required to arrange the scaffolds on chromosomal frameworks. QTL analysis is a powerful tool for detecting key loci for quantitative traits, especially in aquaculture. Although growth is one of the main focuses of grass carp culture, no growth-related QTLs have yet been published for this species.

Results

In this study, a pair of distantly related parent grass carp and their 100 six-month-old full-sib offspring were used to construct a high-density genetic map with 6429 single nucleotide polymorphisms (SNPs) by 2b-RAD technology. The total length of the consensus map is 5553.43 cM, with an average marker interval of 1.92 cM. The map shows good collinearity with both the grass carp draft genome and the zebrafish genome, and it assembled 89.91% of the draft genome to the chromosomal level. Additionally, based on the growth-related traits of the progeny, 30 quantitative trait loci (QTLs), including 7 for body weight, 9 for body length, 5 for body height and 9 for total length, were identified in 16 locations on 5 linkage groups. The phenotypic variance explained by these QTLs ranges from 13.4 to 21.6%. Finally, 17 genes located in these regions were considered growth-related because they carried functional mutations predicted from the resequencing data of the parents.

Conclusion

A high-density genetic linkage map of grass carp was built, and it assembled the draft genome to the chromosomal level. Thirty growth-related QTLs were detected. After cross-analysis with the parental resequencing data, 17 candidate genes were obtained for further research.

Background

The grass carp (Ctenopharyngodon idellus) belongs to the family Cyprinidae and is the only species of the genus Ctenopharyngodon. It is one of the most important freshwater-cultured fish: in 2015, its global production was approximately 5.8 million tons, accounting for 12.18% of global freshwater fish production [1]. To date, most studies on grass carp have focused on immunity [2,3,4], nutrition [5,6,7] and stress resistance [8]. The few growth-related studies in grass carp have focused on the effects of additives, diet or growth hormone [9]. Therefore, the genes underlying growth traits remain to be revealed.

Growth traits are typical quantitative traits influenced by multiple genes, and perhaps no single gene has a significant impact on such a trait. It is therefore very difficult to discover these genes through reverse genetics, especially in grass carp, a species that usually breeds only once a year. Forward genetic techniques, such as quantitative trait locus (QTL) mapping, are more effective for dissecting the genes behind complex traits. In the early 2000s, QTL analysis was applied to body weight in rainbow trout (Oncorhynchus mykiss) [10], the first economically important fish with a genetic linkage map [11]. In the same year, a QTL for body length in tilapia (Oreochromis mossambicus and Oreochromis aureus) was published [12]. Since then, more detailed studies of growth-related traits have been undertaken in other teleost fishes, such as Atlantic salmon (Salmo salar) [13]. In recent years, QTL studies on growth-related traits have been reported for some of the main farmed species in China, e.g., common carp (Cyprinus carpio) [14] and bighead carp (Hypophthalmichthys nobilis) [15]. However, no similar research has yet been reported for grass carp.

Microsatellites (simple sequence repeats, SSRs), currently the dominant molecular markers in grass carp research [16,17,18,19,20], are not suitable for high-throughput genotyping. With technological advances, high-throughput SNP genotyping methods, such as SNP arrays [21] and next-generation sequencing (NGS), have been widely applied to the construction of genetic maps and the mapping of QTLs in teleost fishes [14, 15, 22]. SNP calling and genotyping have become feasible in grass carp since the publication of the draft genome [23].

Earlier NGS methods for QTL analysis, including reduced representation sequencing, complexity reduction of polymorphic sequences (CRoPS), restriction site associated DNA sequencing (RAD-seq) and low-coverage genotyping, have been discussed previously [24]. The main advantage of RAD-seq is the reduction of labor and cost owing to the pooled library. However, genotyping errors caused by many factors [25] have already been reported for RAD-seq. An improved method, 2b-RAD, avoids most of the potential errors that may arise from size selection or sequencing depth. Furthermore, 2b-RAD is suitable for parallel genotyping of more samples and allows more flexible adjustment of marker density [26].

Linkage maps provide a framework for genomic and genetic studies, such as molecular marker-assisted selection (MAS) for quality traits [27] and quantitative traits (QTL), as well as chromosomal frameworks for genome scaffolds. Genetic maps have been published for many aquaculture species [14, 15, 28,29,30,31]. A high-density genetic linkage map allows more precise localization of the loci related to target traits and mounts more genomic contigs. The first genetic linkage map of grass carp had a low density, with 279 markers [32]. A high-density map is urgently needed for the genome framework and for locating trait-related loci and genes.

In this study, a high-density genetic linkage map was constructed as a chromosomal framework for draft genome assembly. It mounted 89.91% of the genome sequences, much higher than the mounting rate (64%) of the first genetic map. Thirty QTLs for growth-related traits were then located on the map, and a list of candidate genes for subsequent growth research was obtained.

Results

SNP marker filtration

The high-throughput genomic sequencing data from the 2 parents and 100 progeny were screened with SOAP2 [33] and RADtyping [34]. As a result, 5818 codominant markers (SNPs) and 3531 dominant markers (InDels), belonging to 16,359 tags, were preliminarily selected. After further filtration using bowtie and bowtie 2, a total of 8608 truly unique tags were obtained.

After excluding markers with significant segregation distortion (χ2 test, p < 0.05, df = 2), 6658 markers on 6602 tags were retained for constructing the genetic linkage map. The average interval between markers in the genome was 0.119 Mb. The markers were distributed over 610 supercontigs, which covered 93.47% of the grass carp draft genome. Because JoinMap 4.1 limits the number of markers that can be handled, and in order to construct the linkage maps quickly and accurately, markers that mapped to identical locations (i.e., co-segregating markers) were merged, allowing for missing genotypes in the offspring. Within each merged group, the marker with the highest number of successfully genotyped offspring was used as the representative marker for map construction. Of all the markers, 3381 were divided into 767 groups, and 3099 were not consistent with any other marker. A further 122 markers were ambiguous because, owing to missing genotypes, they could be assigned to two or more different groups. Therefore, after the ambiguous markers were removed, 3866 actual markers (a-markers) were obtained for the map.
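
The segregation-distortion filter can be illustrated with a minimal sketch. It assumes a marker that is heterozygous in both parents (ab × ab), for which the expected offspring ratio is 1:2:1 and the χ2 statistic has df = 2, as stated above. The function names and the example counts are hypothetical and are not taken from the study.

```python
import math


def segregation_chi2(n_aa, n_ab, n_bb):
    """Chi-square test of a 1:2:1 ratio for an ab x ab marker (df = 2).

    Returns the chi-square statistic and its p-value.
    """
    total = n_aa + n_ab + n_bb
    expected = (0.25 * total, 0.50 * total, 0.25 * total)
    observed = (n_aa, n_ab, n_bb)
    chi2 = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
    # For df = 2 the chi-square survival function is exactly exp(-chi2 / 2).
    p_value = math.exp(-chi2 / 2.0)
    return chi2, p_value


def is_distorted(n_aa, n_ab, n_bb, alpha=0.05):
    """True if the marker deviates significantly from 1:2:1 at level alpha."""
    return segregation_chi2(n_aa, n_ab, n_bb)[1] < alpha


# Hypothetical counts among 100 offspring.
print(segregation_chi2(25, 50, 25))  # perfect 1:2:1 segregation
print(is_distorted(40, 40, 20))      # strongly skewed marker
```

Markers of the ab × aa or aa × ab type would instead be tested against a 1:1 ratio with df = 1, which this sketch does not cover.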

The construction of the linkage map

The 3866 a-markers and all related SNPs were divided into 24 groups at LOD ≥ 5.0. The ML algorithm in JoinMap 4.1 and MergeMap [35] were used to construct the male, female and consensus linkage maps. The male map consisted of 3875 markers distributed over 1973 loci, with a total length of 6301.59 cM and an average interval of 3.23 cM (Supplementary Table S1). The female map consisted of 3742 markers distributed over 1898 loci, with a total length of 5680.51 cM and an average interval of 2.89 cM (Supplementary Table S1). The consensus map consisted of 6429 SNPs distributed over 3340 loci, with a total length of 5553.43 cM and an average interval of 1.92 cM (Table 1). Because ML distances are longer than regression distances, the linkage groups obtained here were generally longer than those of other maps generated with the regression algorithm [14, 15, 36, 37], including the first grass carp genetic map [32].

Table 1 Summary statistics of the sex-averaged linkage map of grass carp

Of all the markers in the female, male and consensus maps, only 1188 were heterozygous in both parents (ab × ab), accounting for 18.47% of all markers. Therefore, the total number of SNPs on the consensus map far exceeded that of the male or female map. This finding confirmed that the parents selected in our research were indeed very different and that their offspring were suitable for constructing genetic maps (pseudo-testcross).

Genome scaffold anchoring and synteny analysis of zebrafish

All 6429 markers were distributed over 605 supercontigs with a total length of 0.81 Gb, or 89.91% of the grass carp draft genome. The 99 supercontigs with more than 20 SNPs were selected for scaffold anchoring, and they showed a good linear relationship with the linkage groups (Fig. 1a). These 99 supercontigs, covering 642 Mb (74.39%) of the total length, were longer than the 573 Mb used previously [23]. In addition, 45 of these supercontigs were reversed.

Fig. 1

The consensus genetic map and growth-related QTLs of the grass carp. The outermost circle is the consensus genetic map. The inner circles show the LOD score of each marker for the four growth-related traits, in the order BW, BL, TL and BH from the outside inwards. The QTLs are marked in dark red

A similar strategy for screening unique tags was applied to map the SNP tags to the zebrafish genome, yielding 511 unique tags, 506 of which fell on zebrafish chromosomes. These SNPs showed good macro-collinearity between grass carp and zebrafish (Fig. 1b). The fact that LG13 was syntenic to ZF10 (NC_007121.6) and ZF22 (NC_007133.6) was consistent with previous results [32].

QTL mapping of growth-related traits

Pairwise comparisons among the four growth-related traits (TL, BL, BH, BW) were conducted using Pearson’s correlation coefficient. All of the traits were highly correlated (p < 2.2e-16). The correlation coefficients of BW/BL, BW/TL and BW/BH were 0.95, 0.95 and 0.93, respectively. The highest coefficient was 0.97, between BL and TL, and the lowest was 0.89, between BL and BH (Table S2). BL and TL conformed to the normal distribution (pBL = 0.175, pTL = 0.550), and the logarithms of BW and BH also conformed to the normal distribution (plog(BW) = 0.274, plog(BH) = 0.096).
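
The pairwise trait comparison can be sketched as follows. This is only an illustration: the trait values are invented, the function names are hypothetical, and the natural logarithm is an assumption, since the base of the logarithm applied to BW and BH is not stated (the base does not affect the correlation or normality of the transformed values).

```python
import math


def pearson_r(x, y):
    """Pearson's correlation coefficient between two equally long sequences."""
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("x and y must have the same length (at least 2)")
    mean_x = sum(x) / len(x)
    mean_y = sum(y) / len(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def log_transform(values):
    """Natural-log transform, used here for the skewed traits (BW, BH)."""
    return [math.log(v) for v in values]


# Invented measurements for five fish (cm and g).
body_length = [10.2, 11.5, 12.1, 13.4, 14.0]
total_length = [12.3, 13.9, 14.4, 16.0, 16.9]
body_weight = [15.0, 22.0, 26.0, 35.0, 41.0]

print(pearson_r(body_length, total_length))
print(pearson_r(body_length, log_transform(body_weight)))
```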

Based on the transformed phenotype data, 30 growth-related QTLs were found in 10 genome regions (GRs), corresponding to 16 genetic linkage regions (GLRs) on five LGs: LG2, LG10, LG14, LG16 and LG18 (Table 2, Fig. 2). These LGs and the corresponding supercontigs were syntenic (Fig. 1c).

Table 2 The statistics of QTLs

Fig. 2

The collinearity of the genetic maps with the genomes of grass carp and zebrafish. a The collinearity of the LGs with the 99 supercontigs in the grass carp genome. b The synteny of the LGs with the zebrafish genome. c The five LGs on which growth-related QTLs were located

For BW, there were 7 QTL regions, among which qBW1b had the highest LOD (5.1). qBW1b was located on LG2 (at 221.359–226.697 cM) and accounted for 20.9% of the phenotypic variance (PVE). Nine QTL regions were identified for BL, and qBL5b showed the highest LOD (4.64). qBL5b was located at the same place as qBW1b and accounted for 19.2% of the PVE. For TL, 9 QTL regions were identified. Although the correlation between TL and BL was the highest among the traits, the QTL regions with the highest LOD for TL and BL were slightly different. The QTL with the highest LOD value was qTL14b, which was located on LG18 (at 298.478–314.903 cM) and accounted for 21.6% of the PVE. For BH, five QTL regions were found. The highest LOD (4.26) was found for qBH15b and qBH17, which explained 17.8% of the PVE.
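
As a rough cross-check, the LOD and PVE values reported here are consistent with the common single-QTL approximation PVE = 1 − 10^(−2·LOD/n), taking n = 100 progeny. The sketch below is an illustration under that assumption only; the calculation performed inside MapQTL may differ, and the function name is hypothetical.

```python
def pve_from_lod(lod, n):
    """Approximate proportion of phenotypic variance explained by one QTL.

    Uses PVE = 1 - 10 ** (-2 * LOD / n), where n is the number of
    phenotyped individuals.
    """
    return 1.0 - 10.0 ** (-2.0 * lod / n)


# LOD scores quoted in the text, with n = 100 progeny.
for name, lod in [("qBW1b", 5.1), ("qBL5b", 4.64), ("qBH15b", 4.26)]:
    print(name, round(100 * pve_from_lod(lod, 100), 1))
```

Under this approximation the three LOD scores correspond to about 20.9%, 19.2% and 17.8%, in line with the PVE values given for these QTLs.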

Candidate gene identification for growth-related traits

To detect candidate genes more accurately, the whole genomes of the parents were resequenced at an average depth of 30×. A total of 2,415,558 SNPs were heterozygous in at least one parent. The main genotype combinations were ab × aa, aa × ab and ab × ab, which accounted for 39.19, 38.99 and 21.70% of all SNPs, respectively, while the remaining SNPs (0.13%) had three or more genotypes in the parents (Table S3). In addition, 1,135,559 InDels were obtained.

Since the 30 QTLs were located in 16 GLRs or 10 GRs (Table 2), these regions were scanned for candidate genes. The start and end sites of each GR were marked by the SNPs adjacent to the QTL. For eight of the GRs, both endpoints were located on the same supercontig, e.g., qBW1a (Table 2); for these GRs, the genes located within the interval were extracted. For the other two GRs, the start and end sites were located on different short supercontigs, e.g., qBW4a (Table 2); for these two GRs, all genes on the supercontigs were extracted. As a result, 49 pre-candidate genes were discovered. These were further filtered by retaining only the genes with at least one functional SNP/InDel, and finally 17 candidate genes were selected (Table 3).
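
The two extraction rules described above can be sketched as follows. The gene records, supercontig names and coordinates are hypothetical, and treating "within the interval" as "overlapping the interval" is an assumption of this sketch rather than a statement of the authors' exact rule.

```python
from typing import NamedTuple


class Gene(NamedTuple):
    name: str
    contig: str
    start: int
    end: int


def genes_in_region(genes, contig, region_start, region_end):
    """Genes on `contig` that overlap the interval between the two endpoints.

    Applies to a genome region whose endpoints lie on the same supercontig.
    """
    lo, hi = sorted((region_start, region_end))
    return [g for g in genes
            if g.contig == contig and g.start <= hi and g.end >= lo]


def genes_on_contigs(genes, contigs):
    """All genes on the given supercontigs.

    Applies to a genome region whose endpoints lie on different short
    supercontigs.
    """
    wanted = set(contigs)
    return [g for g in genes if g.contig in wanted]


# Hypothetical annotation.
annotation = [
    Gene("g1", "CI01", 100, 500),
    Gene("g2", "CI01", 900, 1500),
    Gene("g3", "CI01", 3000, 4000),
    Gene("g4", "CI02", 100, 200),
]
print([g.name for g in genes_in_region(annotation, "CI01", 400, 1000)])
print([g.name for g in genes_on_contigs(annotation, ["CI02"])])
```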

Table 3 Statistics of growth-related candidate genes with mutations

Discussion

The collinearity of the genetic maps and the draft genome

High-density linkage maps provide chromosomal frameworks for genome assembly and validation. A total of 99 supercontigs with more than 20 SNPs were anchored onto the chromosomal framework, and 45 of them were reversed in direction relative to the other supercontigs. Most of the supercontigs were collinear with their linkage groups, but there were a few exceptions, such as the obvious scattering between LG12 ~ CI49 and CI50. This may be caused by inaccuracies in the original draft sequences, which need to be refined to obtain more precise results. The LGs and supercontigs were not perfectly collinear when all markers were included in the analysis. The reason for this is hard to determine, because parental linkage phase information is lacking and can only be estimated from the offspring data, which introduces some deviations. Therefore, the genetic linkage map is suitable for scaffold assembly and partial verification, but not for the assembly of fragmented sequences.

The length of LG with different mapping algorithm and different gender

Pearson’s correlation coefficients among loci number, LG length and average interval were calculated for three species: grass carp, common carp and bighead carp (Table S4). The male, female and consensus maps of grass carp were constructed with the ML algorithm, whereas the regression algorithm was used for the other fish. Loci number and LG length were strongly correlated (≥0.92) in the ML maps, while the coefficients were around 0.6 in the regression maps (0.60 in common carp, 0.69 in bighead carp). This difference may reflect the different mapping algorithms, and it indicates that in the ML maps a larger number of loci goes together with longer LGs.

Additionally, there were differences between the sexes. In our ML maps, the male map had slightly fewer loci than the female map (1973 < 1989), but its total LG length was greater (6301.59 cM > 5680.51 cM). The same pattern was found in the common carp regression maps [14], which shows that it was not caused by differences between mapping algorithms.

The odd length of LG5

Notably, LG5 is almost 3 times longer than the other LGs (Fig. 1). The LOD scores of the markers on LG5 were above 11.0, so grouping error was excluded as a cause. Missing offspring genotypes would adversely affect the accuracy of map distances, so the mean number of missing genotypes per marker was calculated for each LG. The value for LG5 was 1.33, below the overall mean of 1.56. This indicated that the abnormal length of LG5 was not caused by missing genotypes.

To find a reasonable explanation, all markers on LG5 were evaluated against two versions of the grass carp genome: the version published online [23] and a version assembled from PacBio sequences (unpublished data).

The N90 of the published genome (179,941 bp) [23] was used as the threshold for division, and the 164,368 supercontigs were divided into two groups: ‘N90 seqs’ and ‘fragment seqs’. Fragment seqs are generally sequences that are difficult to assemble because of repeated sequences, DNA secondary structure and other factors. The number of fragment seqs located in each LG of the consensus map was counted (Fig. 3).
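
The N90-based division can be illustrated with a small sketch. It uses the usual definition of N90 (the sequence length at which sequences of that length or longer cover 90% of the assembly) and assumes that sequences at least as long as the N90 form the ‘N90 seqs’ group; the function names and the toy lengths are hypothetical.

```python
def nx_length(lengths, fraction=0.9):
    """Length L such that sequences of length >= L cover `fraction` of the total."""
    ordered = sorted(lengths, reverse=True)
    target = fraction * sum(ordered)
    running = 0
    for length in ordered:
        running += length
        if running >= target:
            return length
    raise ValueError("lengths must be non-empty")


def split_by_threshold(lengths_by_name, threshold):
    """Split sequences into (long 'N90 seqs', short 'fragment seqs') by length."""
    n90_seqs = {name for name, length in lengths_by_name.items()
                if length >= threshold}
    fragment_seqs = set(lengths_by_name) - n90_seqs
    return n90_seqs, fragment_seqs


# Toy assembly of five supercontigs (lengths in bp).
toy = {"sc1": 100, "sc2": 80, "sc3": 60, "sc4": 40, "sc5": 20}
n90 = nx_length(toy.values())
print(n90)
print(split_by_threshold(toy, n90))
```

Counting, for each LG, how many mapped tags fall on the short group then gives the kind of per-LG proportion summarized in Fig. 3.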

Fig. 3

The number of abnormal tags in each of the LGs. The blue bars display the proportion of ‘fragment seqs’ in each LG. The red bars show the proportion of repeat tags in each LG, as detected through the PacBio contigs

Moreover, 44 Gb of PacBio RSII sequence data were used for de novo genome assembly. The sequenced sample was an adult grass carp from a third gynogenetic generation, with a nearly homozygous genome. The long sequencing reads and the homozygous genome made it easier to resolve repeated regions than in the published genome version. The markers on the map were then mapped to the new de novo assembled contigs, and the ‘new’ non-unique tags were counted (Fig. 3).

Judging from these statistics, LG5 had the highest percentage of SNP tags on both ‘fragment seqs’ and ‘repeated tags’. It was therefore speculated that the anomalous length of LG5 was related to its larger number of repeated tags. However, whether there were more recombination hot spots on LG5 could not be determined.

The growth-related QTL in the grass carp

QTL mapping is one of the important applications of genetic mapping. It is a bridge from structural to functional genome research and has important practical value for production. Growth-related traits are the most important economic traits in aquaculture animals, and they are controlled by multiple genes, the environment and their interactions. Current QTL mapping research on growth-related traits is mainly concentrated on Atlantic salmon, rainbow trout, perch, common carp and tilapia in Asia, while QTL mapping of growth traits in grass carp has rarely been reported. In this study, 30 growth-related QTLs were identified by analyzing 4 growth-related phenotypes together with the high-density genetic linkage map of grass carp. Most of these QTLs overlapped in their locations on the linkage groups (Table 2). For example, marker ref-27,615 was located on LG18 and corresponded to 4 QTLs: qBW4a, qBL9a, qTL14b and qBH18. The reason for this might be the high correlation coefficients among the 4 traits in grass carp: as shown in Table S2, the lowest correlation coefficient was 0.89, between BL and BH, and the highest was 0.97, between BL and TL.

The candidate genes of the growth-related QTL

Because of differences in genomic structure, it is difficult to compare QTLs directly between species, whereas comparison of homologous genes is feasible. However, we did not find any overlap between the candidate genes of grass carp and the genes in the growth-related QTLs reported in salmonid fishes [38]. A reasonable explanation is that a QTL analysis can find only a very limited set of genes and misses most growth-related genes, so a gene is unlikely to be detected repeatedly in different QTL studies. In addition, we cannot rule out the possibility that different fish species (or even different parents of the same species) carry unique alleles that lead to differential growth in their offspring.

The candidate gene list does not include well-known growth-related genes such as the growth hormone (GH) gene. This likely indicates that these essential growth-related genes are functionally conserved and that structural mutations in them are rare in the natural environment. However, some of the candidate genes in our research have been shown to be directly or indirectly related to growth. For example, the gene rapunzel 4 (rpz4) lies in qTL14b, which was identified as the most significant QTL for TL in our research; its heterozygous missense mutation could result in axial skeletal overgrowth [39].

Another example is nitric oxide synthase (NOS). As a multifunctional messenger molecule, nitric oxide (NO) can be involved in neurogenesis, cell migration, immunity and apoptosis [40]. In zebrafish, nitric oxide synthase 2 (NOS2) has two isoforms, NOS2a and NOS2b [41]. NOS2a is an innate immune factor and has been studied in mammals [42, 43] and fish [44, 45]. However, NOS2b was not localized in immune cells during embryonic development, and whole-mount in situ hybridization showed that it may play a role in the neurohypophysis and thyroid primordium [41]. The NOS2b protein in fish has a myristoylation consensus site at the extremity of the N-terminus and is similar to mammalian NOS3, which catalyzes NO production and mediates vascular endothelial growth factor (VEGF)-induced angiogenesis in coronary vessels [46]. All of these studies are consistent with our results, in which NOS2b was significantly related to BW and BH.

Conclusion

A high-density genetic linkage map of grass carp was built. The map’s correctness is supported by its good collinearity with both the grass carp draft genome and the zebrafish genome, and its effectiveness is demonstrated by a mounting rate much higher than that of the first map. A total of 30 growth-related QTLs were detected, and 17 candidate genes were obtained from a cross-analysis with the resequencing data of the parent fish, while genes located in the QTLs without separable or effective SNPs were excluded.

Methods

The polymorphic SSRs genotyping of all parents

To select a suitable mapping population, fin samples were collected from 89 grass carp parents captured from wild populations in the Yangtze River, Pearl River and Xiangjiang River. Genomic DNA was extracted from the samples with the standard phenol-chloroform protocol [47]. The 89 samples were genotyped by PCR with 11 SSR markers (Table S5).

PCR for each SSR was performed in a 10 μL volume containing 1 μL (about 20 ng) of sample DNA, 5 μL of 2× Es Taq MasterMix (CWBIO, China), 0.1 μmol forward primer and 0.1 μmol reverse primer, under the following conditions: 94 °C for 3 min; 35 cycles of 30 s at 94 °C, 30 s at 53 °C and 30 s at 72 °C; and a final extension of 5 min at 72 °C. The PCR products were genotyped on an ABI3730 (ABI, USA), and the matrix of PCR band sizes of the 89 samples at all 11 SSRs was obtained. The observed heterozygosity (Ho), expected heterozygosity (He) and polymorphism information content (PIC) of each SSR were calculated with Cervus (v3.0.7) [34]. The average PIC of these markers was 0.84 and the minimum PIC was 0.78, indicating that these markers are highly polymorphic (Table 4).
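
For readers unfamiliar with these diversity statistics, the following minimal sketch computes Ho, He and PIC for one locus from diploid genotypes. It uses the textbook definitions (He = 1 − Σp², and PIC = He − ΣΣ 2p_i²p_j² over allele pairs) without any sample-size correction, so values from Cervus may differ slightly; the function name and genotypes are hypothetical.

```python
from collections import Counter
from itertools import combinations


def ssr_diversity(genotypes):
    """Return (Ho, He, PIC) for one SSR locus.

    `genotypes` is a list of (allele1, allele2) tuples, one per individual,
    where alleles are band sizes treated as labels.
    """
    n = len(genotypes)
    # Observed heterozygosity: share of individuals with two different alleles.
    ho = sum(1 for a, b in genotypes if a != b) / n
    counts = Counter(allele for pair in genotypes for allele in pair)
    freqs = [c / (2 * n) for c in counts.values()]
    # Expected heterozygosity without small-sample correction.
    he = 1.0 - sum(p * p for p in freqs)
    # Polymorphism information content.
    pic = he - sum(2 * p * p * q * q for p, q in combinations(freqs, 2))
    return ho, he, pic


# Hypothetical locus with two equally frequent alleles (band sizes 152 and 156).
print(ssr_diversity([(152, 156), (152, 156), (152, 152), (156, 156)]))
```

For two equally frequent alleles this gives Ho = 0.5, He = 0.5 and PIC = 0.375; highly polymorphic loci with many alleles approach the PIC values of about 0.8 reported above.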

Table 4 The genetic diversity of SSR markers

Hierarchical cluster analysis of all parents

Hierarchical clustering of the samples was performed with an R script. Band size was treated as a factor rather than a numerical value. For every pair of samples, the number of differing bands at each SSR locus was calculated as a score; the Euclidean distance was then calculated to determine genomic similarity, and the tree was obtained (Fig. 4). Among all samples, G1-G10 were closely related and were correctly clustered into a single branch, which supported the reliability of this method. Based on these results, male M3 from the Yangtze River and female F8 from the Pearl River were selected as the parents, and 100 of their progeny, sampled at random, were used to construct the CP population.
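
One possible reading of this distance measure is sketched below. It is an interpretation, not the authors' R script: the per-locus score is taken to be the number of bands not shared between two genotypes (0, 1 or 2), and the distance between two samples is the Euclidean norm of these scores over all loci. Function names and genotypes are hypothetical.

```python
import math
from collections import Counter


def locus_score(g1, g2):
    """Number of bands not shared between two diploid genotypes at one locus.

    Band sizes are compared as labels (factors), not as numbers.
    """
    shared = sum((Counter(g1) & Counter(g2)).values())
    return 2 - shared


def sample_distance(sample1, sample2):
    """Euclidean distance over the per-locus scores of two samples."""
    return math.sqrt(sum(locus_score(a, b) ** 2
                         for a, b in zip(sample1, sample2)))


# Two hypothetical fish genotyped at two SSR loci.
fish_a = [(100, 104), (200, 200)]
fish_b = [(100, 108), (200, 204)]
print(sample_distance(fish_a, fish_b))
```

A matrix of such pairwise distances could then be passed to any standard hierarchical clustering routine to obtain a tree like the one in Fig. 4.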

Fig. 4

The hierarchical clustering tree of the 89 grass carp parents based on SSR polymorphism. The orange branches are closely related fish, the red branch is the mother (F8) and the blue branch is the father (M3)

Mapping population and phenotypic data

The F1 progeny were bred in May 2015, and 100 six-month-old individuals were randomly sampled. Growth-related traits, including body length (BL), total length (TL), body weight (BW) and body height (BH), were measured. The caudal fins of all samples, including the 89 adults and 100 progeny, were preserved in 95% ethanol, and genomic DNA was extracted following the standard phenol-chloroform protocol [47]. After sampling, all fish were released.

2b-RAD sequencing and screening of SNP tags

Libraries for 2b-RAD with BsaXI were prepared for the two parents and 100 progeny [26] and then sequenced on an X-10 (Illumina Inc.). Quality control was applied to remove low-quality tags and tags without restriction sites (Table S6); the parents’ data were then mapped to the grass carp draft genome using SOAP2 with default values for all parameters. The tags that mapped uniquely to the genome were further filtered by RADtyping to exclude those with too high or too low a sequencing depth [48]. After this process, the remaining unique reads from the parents were used as the new reference sequences, to which all the data from the progeny were mapped for genotyping. Markers were selected as preliminary SNP tags from the progeny reads by two criteria: (1) the markers were genotyped successfully in at least 80% of the progeny; (2) the markers were heterozygous in both of the parents.
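
The first screening criterion, a genotyping success rate of at least 80% among the progeny, can be sketched as below. The representation of a missing call as `None`, the function name and the example data are assumptions made for illustration; the parental-genotype criterion is not modelled here.

```python
def passes_call_rate(genotypes, min_rate=0.8):
    """True if the share of successfully genotyped progeny is at least min_rate.

    `genotypes` holds one entry per progeny; None marks a failed call.
    """
    called = sum(1 for g in genotypes if g is not None)
    return called / len(genotypes) >= min_rate


# Hypothetical marker scored in 100 progeny, with 20 failed calls.
marker = ["ab"] * 45 + ["aa"] * 35 + [None] * 20
print(passes_call_rate(marker))
```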

To eliminate errors that any single software might introduce into the determination of marker uniqueness, the preliminary SNP tags were mapped to the grass carp genome using bowtie (v1.2) [49] and bowtie 2 (v2.3.3) [50], each with default parameters; reads with more than 2 mismatches or indels in either mapping were excluded. Finally, the alignment results were merged, markers with significant segregation distortion were removed using the χ2 test (performed in R), and the final unique SNP tags were obtained.

Because linkage group 5 (LG5) was remarkably long, all tags were also mapped to an upgraded grass carp genome, assembled from long reads (44×) generated by PacBio RSII, in order to validate it.

Construction of genetic maps and QTL mapping

To reduce calculation time and achieve the most accurate linkage results, SNPs with the same parental genotype combination, such as ‘ab x aa’, were tested to determine whether they were completely linked. An in-house Python script was written for this purpose; it allows markers with missing genotypes in some of the offspring.
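
The in-house script itself is not available, so the following is only a hypothetical sketch of how completely linked markers might be grouped while tolerating missing genotypes. It assumes genotypes encoded as one character per offspring ('h' for ab, 'a' for aa, '-' for missing), compares each marker only with the representative of each group, and flags markers that fit more than one group as ambiguous.

```python
def compatible(g1, g2):
    """True if two genotype strings agree wherever both are called."""
    return all(x == y or x == "-" or y == "-" for x, y in zip(g1, g2))


def group_markers(markers):
    """Group markers of one parental type that segregate identically.

    `markers` maps marker name -> genotype string across offspring.
    Returns (groups, ambiguous); the first member of each group is its
    representative, i.e. the member with the fewest missing genotypes.
    """
    # Visit the most complete markers first so that they become representatives.
    order = sorted(markers, key=lambda name: markers[name].count("-"))
    groups, ambiguous = [], []
    for name in order:
        hits = [g for g in groups if compatible(markers[name], markers[g[0]])]
        if not hits:
            groups.append([name])
        elif len(hits) == 1:
            hits[0].append(name)
        else:
            # Fits two or more groups because of missing genotypes.
            ambiguous.append(name)
    return groups, ambiguous


# Hypothetical 'ab x aa' markers scored in five offspring.
example = {"m1": "hahah", "m2": "hah-h", "m3": "hahaa", "m4": "haha-"}
print(group_markers(example))
```

In this toy example m2 joins m1, m3 stays alone, and m4 is flagged as ambiguous because its missing call makes it fit both groups.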

The markers were grouped with a LOD threshold of 5.0 into 24 LGs using JoinMap 4.1 [51] with default parameters. The male and female maps were also calculated by JoinMap 4.1 with the Monte Carlo ML algorithm [51], and the consensus map was merged by MergeMap [52]. Pearson’s correlations among the four growth-related traits (BW, BH, BL and TL) were calculated across all progeny. QTL mapping for growth traits used the Multiple QTL Mapping (MQM) method, with a LOD interval of 1 cM, in MapQTL 6.0 [53]. The consensus map and QTL mapping were visualized using circos (v0.66) [54].

The synteny analysis of grass carp draft genome and zebrafish

After QTL mapping, the correspondence between marker positions on the genetic map and on the draft genome was obtained. The supercontigs with more than 20 markers located on the genetic map were retained. The markers on these supercontigs were used for synteny analysis, and ggplot [55] was used to visualize the results. To make the image easier to read, the supercontigs were renamed; the corresponding supercontigs are listed in Table S17.

For the synteny analysis with zebrafish, markers located on the grass carp genetic map were mapped to the zebrafish genome (GRCz10) using bowtie and bowtie 2 with default parameters, and repeated tags that mapped to multiple locations were excluded. The remaining unique SNP tags were used for the synteny analysis, and collinearity was visualized with circos (version: 0.66). The zebrafish chromosomes were also renamed for readability, and the corresponding list is shown in Table S27.

Acquisition and analysis of parental resequencing data

DNA sequencing libraries with an insert size of 300 bp were constructed for the parents and paired-end sequenced on an Illumina Xten sequencer. The filtered reads were then mapped to the grass carp draft genome with BWA (version: 0.7.12) using the default parameters. Duplicated reads were filtered with Picard (version: 2.1.1). SNP and indel calling was performed using the Genome Analysis Toolkit (GATK, version: 3.5) with the adjustment of parameter ‘-glm’.

SNPs and indels that were homozygous in both parents were first excluded, and the rest were annotated with the gene transfer format (GTF) file of grass carp [20] and SnpEff [56] using default parameters. As a result of the annotation, every SNP was assigned an ‘effect impact’ label, which takes one of four ratings (High, Moderate, Low and Modifier) and can be used for subsequent filtering. The upstream and downstream interval for each gene was set to 5 kb in our analysis. The SNPs located in the gene-related regions, including upstream, 5’UTR, exon, intron, 3’UTR and downstream regions, were then counted for each gene. We removed the markers tagged ‘Modifier’, which may have little effect on gene function, and defined the remaining SNPs and indels as functional mutations. Finally, the number of functional SNPs/indels per gene was calculated.
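
The impact-based filter can be sketched as a simple count per gene. The sketch assumes the annotation has already been reduced to (gene, impact) pairs, which is a hypothetical intermediate format and not the SnpEff output itself; the gene names are invented.

```python
from collections import Counter


def functional_counts(annotations):
    """Count variants per gene after dropping those rated 'Modifier'.

    `annotations` is an iterable of (gene, impact) pairs, where impact is one
    of High, Moderate, Low or Modifier (case-insensitive).
    """
    return Counter(gene for gene, impact in annotations
                   if impact.upper() != "MODIFIER")


# Hypothetical annotated variants.
variants = [
    ("geneA", "HIGH"),
    ("geneA", "MODIFIER"),
    ("geneB", "MODIFIER"),
    ("geneA", "LOW"),
    ("geneC", "MODERATE"),
]
counts = functional_counts(variants)
print(counts)
print(sorted(counts))  # genes retaining at least one functional variant
```

Genes absent from the resulting counter, such as geneB here, carry only ‘Modifier’ variants and would be dropped from the candidate list.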

Availability of data and materials

The raw sequence data reported in this paper have been deposited in the Genome Sequence Archive [57] in BIG Data Center, Beijing Institute of Genomics (BIG), Chinese Academy of Sciences, and can be publicly downloaded under accession numbers PRJCA001074 (https://bigd.big.ac.cn/bioproject/browse/PRJCA001074) and PRJCA001162 (https://bigd.big.ac.cn/bioproject/browse/PRJCA001162).

Abbreviations

SNPs: Single nucleotide polymorphisms
QTLs: Quantitative trait loci
NGS: Next-generation sequencing
CRoPS: Complexity reduction of polymorphic sequences
RAD-seq: Restriction site associated DNA sequencing
MAS: Marker-assisted selection
Ho: Observed heterozygosity
He: Expected heterozygosity
PIC: Polymorphism information content
SSRs: Simple sequence repeats
BL: Body length
TL: Total length
BW: Body weight
BH: Body height
LG: Linkage group
GATK: Genome Analysis Toolkit
GTF: Gene transfer format
a-markers: Actual markers
GRs: Genome regions
GLRs: Genetic linkage regions
PVE: Phenotypic variance explained
GH: Growth hormone
NOS: Nitric oxide synthase
NO: Nitric oxide
MQM: Multiple QTL Mapping

References

  1. 1.

    FAO (2017). "Fishery and Aquaculture Statistics. 2015." 107p.

  2. 2.

    Rangel AAC, Rockemann DD, Hetrick FM, Samal SK. Identification of grass carp haemorrhage virus as a new genogroup of aquareovirus. J Gen Virol. 1999;80:2399–402.

    CAS  PubMed  Article  PubMed Central  Google Scholar 

  3. 3.

    Su JG, Zhang RF, Dong J, Yang CR. Evaluation of internal control genes for qRT-PCR normalization in tissues and cell culture for antiviral studies of grass carp (Ctenopharyngodon idella). Fish Shellfish Immunol. 2011;30(3):830–5.

  4.

    Wang LQ, Wu Z, Huang QL, Huang KY, Qi GQ, Wu CX, Mao HL, Xu XW, Wang HZ, Hu CY. Grass carp (Ctenopharyngodon idella) STAT3 regulates the eIF2 alpha phosphorylation through interaction with PKR. Dev Comp Immunol. 2018;78:26–34.

  5.

    Du ZY, Liu YJ, Tian LX, Wang JT, Wang Y, Liang GY. Effect of dietary lipid level on growth, feed utilization and body composition by juvenile grass carp (Ctenopharyngodon idella). Aquac Nutr. 2005;11(2):139–46.

  6.

    Du Z-Y, Liu Y-J, Tian L-X, He J-G, Cao J-M, Liang G-Y. The influence of feeding rate on growth, feed efficiency and body composition of juvenile grass carp (Ctenopharyngodon idella). Aquac Int. 2006;14(3):247–57.

  7.

    Zhang Y, Liu WT, Li GY, Shi B, Miao YQ, Wu XH. Isolation and partial characterization of pepsin-soluble collagen from the skin of grass carp (Ctenopharyngodon idella). Food Chem. 2007;103(3):906–12.

  8.

    Law SHW, Wu RSS, Ng PKS, Yu RMK, Kong RYC. Cloning and expression analysis of two distinct HIF-alpha isoforms - gcHIF-1alpha and gcHIF-4alpha - from the hypoxia-tolerant grass carp, Ctenopharyngodon idellus. BMC Mol Biol. 2006;7:13.

  9.

    Wang W, Sun Y, Wang Y, Zhu Z. Expression of grass carp growth hormone in the yeast Pichia pastoris. Acta Genet Sin. 2003;30(4):301–6.

  10.

    O'Malley KG, Sakamoto T, Danzmann RG, Ferguson MM. Quantitative trait loci for spawning date and body weight in rainbow trout: testing for conserved effects across ancestrally duplicated chromosomes. J Hered. 2003;94(4):273–84.

  11.

    Young WP, Wheeler PA, Coryell VH, Keim P, Thorgaard GH. A detailed linkage map of rainbow trout produced using doubled haploids. Genetics. 1998;148:839–50.

  12.

    Cnaani A, Hallerman M, Ron EM, Weller JI, Indelman M, Kashi Y, Gall GAE, Hulata G. Detection of a chromosomal region with two quantitative trait loci, affecting cold tolerance and fish size, in an F-2 tilapia hybrid. Aquaculture. 2003;223:117–28.

  13.

    Moghadam HK, Poissant J, Fotherby H, Haidle L, Ferguson MM, Danzmann RG. Quantitative trait loci for body weight, condition factor and age at sexual maturation in Arctic charr (Salvelinus alpinus): comparative analysis with rainbow trout (Oncorhynchus mykiss) and Atlantic salmon (Salmo salar). Mol Gen Genomics. 2007;277(6):647–61.

  14.

    Peng W, Xu J, Zhang Y, Feng J, Dong C, Jiang L, Feng J, Chen B, Gong Y, Chen L, Xu P. An ultra-high density linkage map and QTL mapping for sex and growth-related traits of common carp (Cyprinus carpio). Sci Rep. 2016;6:26693.

  15.

    Fu B, Liu H, Yu X, Tong J. A high-density genetic map and growth related QTL mapping in bighead carp (Hypophthalmichthys nobilis). Sci Rep. 2016;6.

  16.

    Li O, Zhao Y-y, Guo N, Lu C-y, Sun X-w. Effects of sample size and loci number on genetic diversity in wild population of grass carp revealed by SSR. Zool Res. 2009;30(2):121–30.

  17.

    Bai S, Herrera-Abreu M, Rohn JL, Racine V, Tajadura V, Suryavanshi N. Identification and characterization of a set of conserved and new regulators of cytoskeletal organisation, cell morphology and migration. BMC Biol. 2011;9(1):18.

  18.

    Zhang M, Nie JK, Shen YB, Xu XY, Dang YF, Wang RQ, Li JL. Isolation and characterization of 25 novel EST-SNP markers in grass carp (Ctenopharyngodon idella). Conserv Genet Resour. 2015;7(4):819–22.

  19.

    Zheng G, Chen J, Jiang X, Zou S. Detection of EST-SSRs markers and genetic structure of different populations of grass carp in Yangtze River system. Acta Hydrobiologica Sinica. 2015;39(5):1003–11.

  20.

    Chen YX, Shi MJ, Zhang WT, Cheng YY, Wang YP, Xia XQ. The grass carp genome database (GCGD): an online platform for genome features and annotations. Database Journal of Biological Databases and Curation. 2017:8.

  21.

    Xu J, Zhao ZX, Zhang XF, Zheng XH, Li JT, Jiang YL, Kuang YY, Zhang Y, Feng JX, Li CJ, Yu JH, Li Q, Zhu YY, Liu YY, Xu P, Sun XW. Development and evaluation of the first high-throughput SNP array for common carp (Cyprinus carpio). BMC Genomics. 2014;15:10.

  22.

    Zhang GQ, Zhang XH, Ye HZ, Jiang SJ, Yu H, Li J, Shi Q, Chen GH, Zhou Z, Luo J, You XX. Construction of high-density genetic linkage maps and QTL mapping in the golden pompano. Aquaculture. 2018;482:90–5.

  23.

    Wang YP, Lu Y, Zhang Y, Ning ZM, Li Y, Zhao Q, Lu HY, Huang R, Xia XQ, Feng Q, Liang XF, Liu KY, Zhang L, Lu TT, Huang T, Fan DL, Weng QJ, Zhu CR, Lu YQ, Li WJ, Wen ZR, Zhou CC, Tian QL, Kang XJ, Shi MJ, Zhang WT, Jang SH, Du FK, He S, Liao LJ, Li YM, Gui B, He HH, Ning Z, Yang C, He LB, Luo LF, Yang R, Luo Q, Liu XC, Li SS, Huang W, Xiao L, Lin HR, Han B, Zhu ZY. The draft genome of the grass carp (Ctenopharyngodon idellus) provides insights into its evolution and vegetarian adaptation. Nat Genet. 2015;47(6):625–31.

  24.

    Davey JW, Hohenlohe PA, Etter PD, Boone JQ, Catchen JM, Blaxter ML. Genome-wide genetic marker discovery and genotyping using next-generation sequencing. Nat Rev Genet. 2011;12(7):499.

  25.

    Mastretta-Yanes A, Arrigo N, Alvarez N, Jorgensen TH, Pinero D, Emerson BC. Restriction site-associated DNA sequencing, genotyping error estimation and de novo assembly optimization for population genetic inference. Mol Ecol Resour. 2015;15(1):28–41.

  26.

    Wang S, Meyer E, McKay JK, Matz MV. 2b-RAD: a simple and flexible method for genome-wide genotyping. Nat Methods. 2012.

  27.

    Fuji K, Kobayashi K, Hasegawa O, Coimbra MRM, Sakamoto T, Okamoto N. Identification of a single major genetic locus controlling the resistance to lymphocystis disease in Japanese flounder (Paralichthys olivaceus). Aquaculture. 2006;254(1):203–10.

  28.

    Pang MX, Fu BD, Yu XM, Liu HY, Wang XH, Yin Z, Xie SQ, Tong JG. Quantitative trait loci mapping for feed conversion efficiency in crucian carp (Carassius auratus). Sci Rep. 2017;7:11.

  29.

    Guo JQ, Li CJ, Teng T, Shen FF, Chen YN, Wang YF, Pan CL, Ling QF. Construction of the first high-density genetic linkage map of pikeperch (Sander lucioperca) using specific length amplified fragment (SLAF) sequencing and QTL analysis of growth-related traits. Aquaculture. 2018;497:299–305.

  30.

    Li CY, Wang JP, Song K, Meng J, Xu F, Li L, Zhang GF. Construction of a high-density genetic map and fine QTL mapping for growth and nutritional traits of Crassostrea gigas. BMC Genomics. 2018;19:15.

  31.

    Wang JL, Li Q, Zhong XX, Song JL, Kong LF, Yu H. An integrated genetic map based on EST-SNPs and QTL analysis of shell color traits in Pacific oyster Crassostrea gigas. Aquaculture. 2018;492:226–36.

  32.

    Xia JH, Liu F, Zhu ZY, Fu JJ, Feng JB, Li JL, Yue GH. A consensus linkage map of the grass carp (Ctenopharyngodon idella) based on microsatellites and SNPs. BMC Genomics. 2010;11.

  33.

    Li R, Yu C, Li Y, Lam TW, Yiu SM, Kristiansen K, Wang J. SOAP2: an improved ultrafast tool for short read alignment. Bioinformatics. 2009;25(15):1966–7.

  34.

    Slate J, Marshall T, Pemberton J. A retrospective assessment of the accuracy of the paternity inference program CERVUS. Mol Ecol. 2000;9(6):801–8.

  35.

    Wu Y, Close TJ, Lonardi S. On the accurate construction of consensus genetic maps. Comput Syst Bioinform Conf. 2008;7:285–96.

  36.

    Bouza C, Hermida M, Pardo BG, Fernandez C, Fortes GG, Castro J, Sanchez L, Presa P, Perez M, Sanjuan A, de Carlos A, Alvarez-Dios JA, Ezcurra S, Cal RM, Piferrer F, Martinez P. A microsatellite genetic map of the turbot (Scophthalmus maximus). Genetics. 2007;177(4):2457–67.

  37.

    Wang CM, Zhu ZY, Lo LC, Feng F, Lin G, Yang WT, Li J, Yue GH. A microsatellite linkage map of barramundi, Lates calcarifer. Genetics. 2007;175(2):907–15.

  38.

    Pedersen S, Liu L, Glebe B, Leadbeater S, Lien S, Boulding EG. Significant Synteny and Colocalization of ecologically relevant quantitative trait loci within and across species of salmonid fishes. Genome. 2018;61(1):33–42.

  39.

    Green J, Taylor JJ, Hindes A, Johnson SL, Goldsmith MI. A gain of function mutation causing skeletal overgrowth in the rapunzel mutant. Dev Biol. 2009;334(1):224–34.

  40.

    Estrada C, Murillo-Carretero M. Nitric oxide and adult neurogenesis in health and disease. Neuroscientist. 2005;11(4):294–307.

  41.

    Poon KL, Richardson M, Korzh V. Expression of zebrafish nos2b surrounds oral cavity. Dev Dyn. 2008;237(6):1662–7.

  42.

    Bogdan C. Nitric oxide and the immune response. Nat Immunol. 2001;2(10):907–16.

  43.

    MacMicking J, Xie QW, Nathan C. Nitric oxide and macrophage function. Annu Rev Immunol. 1997;15:323–50.

  44.

    Laing KJ, Hardie LJ, Aartsen W, Grabowski PS, Secombes CJ. Expression of an inducible nitric oxide synthase gene in rainbow trout Oncorhynchus mykiss. Dev Comp Immunol. 1999;23(1):71–85.

  45.

    Saeij JP, Stet RJ, Groeneveld A, Verburg-van Kemenade LB, van Muiswinkel WB, Wiegertjes GF. Molecular and functional characterization of a fish inducible-type nitric oxide synthase. Immunogenetics. 2000;51(4–5):339–46.

  46.

    Fukumura D, Gohongi T, Kadambi A, Izumi Y, Ang J, Yun CO, Buerk DG, Huang PL, Jain RK. Predominant role of endothelial nitric oxide synthase in vascular endothelial growth factor-induced angiogenesis and vascular permeability. Proc Natl Acad Sci U S A. 2001;98(5):2604–9.

  47.

    Sambrook J, Fritsch EF, Maniatis T. Molecular cloning: a laboratory manual. Cold Spring Harbor Laboratory Press; 1989.

  48.

    Fu XT, Dou JZ, Mao JX, Su HL, Jiao WQ, Zhang LL, Hu XL, Huang XT, Wang S, Bao ZM. RADtyping: an integrated package for accurate De novo Codominant and dominant RAD genotyping in mapping populations. PLoS One. 2013;8(11).

  49.

    Langmead B, Trapnell C, Pop M, Salzberg SL. Ultrafast and memory-efficient alignment of short DNA sequences to the human genome. Genome Biol. 2009;10(3).

  50.

    Langmead B, Salzberg SL. Fast gapped-read alignment with bowtie 2. Nat Methods. 2012;9(4):357–U354.

  51.

    Van Ooijen JW. JoinMap 4, software for the calculation of genetic linkage maps in experimental populations. 2006.

  52.

    Wu YH, Close TJ, Lonardi S. Accurate construction of consensus genetic maps via integer linear programming. Ieee-Acm Transact Comput Biol Bioinform. 2011;8(2):381–94.

  53.

    Van Ooijen JW. MapQTL 6: software for the mapping of quantitative trait loci in experimental populations of diploid species. The Netherlands: Wageningen; 2011.

  54.

    Krzywinski M, Schein J, Birol I, Connors J, Gascoyne R, Horsman D, Jones SJ, Marra MA. Circos: an information aesthetic for comparative genomics. Genome Res. 2009;19(9):1639–45.

  55.

    Wickham H. ggplot2: elegant graphics for data analysis. J Stat Softw. 2010;35(1):65–88.

  56.

    Cingolani P, Platts A, Wang LL, Coon M, Nguyen T, Wang L, Land SJ, Lu XY, Ruden DM. A program for annotating and predicting the effects of single nucleotide polymorphisms, SnpEff: SNPs in the genome of Drosophila melanogaster strain w (1118); iso-2; iso-3. Fly. 2012;6(2):80–92.

  57.

    Wang Y, Song F, Zhu J, Zhang S, Yang Y, Chen T, Tang B, Dong L, Ding N, Zhang Q, Bai Z, Dong X, Chen H, Sun M, Zhai S, Sun Y, Yu L, Lan L, Xiao J, Fang X, Lei H, Zhang Z, Zhao W. GSA: genome sequence archive. Genomics Proteomics Bioinform. 2017;15(1):14–8.

Acknowledgements

This research was supported by the Wuhan Branch, Supercomputing Center, Chinese Academy of Sciences, China.

Funding

This research was supported by the Strategic Priority Research Program of the Chinese Academy of Sciences (Grant No. XDA08020201), the National Natural Science Foundation of China (Grant No. 31571275), the National High-Technology Research and Development Program (863 Program, Grant No. 2011AA100403), and the 100-talent Program, Chinese Academy of Sciences (Grant No. Y22302).

Author information

Contributions

XQX, YPW and MJS conceived the project. YPW provided the parent grass carp. XLH, YXJ, WTZ, YYC, XCM, YD, LX and YXC performed the experiments. MJS and WTZ analyzed the data. MJS, XQX, YXJ and NW wrote and revised the manuscript. The manuscript was read and approved by XQX.

Corresponding authors

Correspondence to Mijuan Shi or Xiao-Qin Xia.

Ethics declarations

Ethics approval and consent to participate

All fish were cultured in the Huanggang Fisheries Research Institute in Huanggang, Hubei Province, China. The experiment was carried out in the Aquatic Bioinformatics Laboratory. The sampling and subsequent experiments were approved by the Animal Care and Use Committee of the Institute of Hydrobiology, the Chinese Academy of Sciences.

Consent for publication

Not applicable.

Competing interests

The authors declare that they have no competing interests.

Additional information

Publisher’s Note

Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.

About this article

Cite this article

Huang, X., Jiang, Y., Zhang, W. et al. Construction of a high-density genetic map and mapping of growth related QTLs in the grass carp (Ctenopharyngodon idellus). BMC Genomics 21, 313 (2020). https://doi.org/10.1186/s12864-020-6730-x

Keywords

  • Grass carp
  • Linkage map
  • Growth-related trait
  • Quantitative trait loci
  • Single nucleotide polymorphism