A high-density SNP-based genetic map in Pelteobagrus vachelli and mapping of growth, sex determination and hypoxia tolerance related loci


 Background: As is known to all, when we are doing QTL fine mapping, analyzing comparative genome, identifying candidate genes, making marker-assisted selection in aquaculture species, the high-density genetic linkage map is of great significance. Pelteobagrus vachelli is a very popular commercial species in Asia. However, some specific characters made it difficult to do traditional selective breeding based on phenotypes. For instance, potential problems include lacking in genomic resource which is in large scale and short of markers that is related tightly to growth, sex determination and hypoxia tolerance related traits.
Results: By making use of 5059 ddRAD markers in P. vachelli，a high-resolution genetic linkage map was successfully constructed. The map’ length was 4047.01 cM by using an interval of 0.11 cm, which is an average marker standard. It is reflected that bu using comparative genome mapping, a large majority (83.2%) of markers with a one-to-one correspondence were observed between P. vachelli and P. fulvidraco. Based on the genetic map, 8 significant genome-wide QTLs for 4 weight, 1 body proportion, 2 sex determination, and 1 hypoxia tolerance related traits were detected on 4 LGs. Some SNPs from these significant genome-wide QTLs were observably associated with these phenotypic traits in other individuals by Kompetitive Allele Specific PCR. In addition, two candidate gene for weight, Sipa1 and HSD11B2, were diﬀerentially expressed between fast-, medium- and slow-growing P. vachelli. Sema7a, associated with hypoxia tolerance, was induced after hypoxia exposure and reoxygenation.
Conclusions: We mapped a set of suggestive and significant QTLs as well as candidate genes for 12 growth, 1 sex determination and 1 hypoxia tolerance related traits based on a high-density genetic linkage map by making use of SNP markers for P. fulvidraco. Our results have offered a valuable method about the much more efficient production of all-male, fast growth and hypoxia tolerance P. vachelli for the aquaculture industry.


Background
It is universally acknowledged that high-resolution linkage maps have become more and more signi cant for a large number of genetic studies, such as ne-scale quantitative trait locus (QTL) mapping, comparative genomic analysis (CGA), genome scaffolding and assembly (GSA) and mapping molecular marker-assisted selection (MAS) [ 1,2]. For the construction of high-density genetic linkage maps, singlenucleotide polymorphism (SNP) markers representing the most abundant sources of variation in the genome are increasingly utilized [ 3]. Newly developed genotyping methods, such as reducedrepresentation sequencing, SNP arrays, and re-sequencing, have allowed for the discovery and simultaneous scoring of thousands of SNP markers from a single sequencing run for dozens of individuals [ 4]. These techniques have already been applied in some sh species, such as Ictalurus punctatus [ 5], Siniperca chuatsi [ 6], Scophthalmus maximus [ 7], Paralichthys olivaceus [ 8 ], and Megalobrama amblycephala [ 9]. In addition, signi cant QTLs for weight, gender and disease resistance in these sh species have been detected. MAS using QTL analysis is the most effective breeding method for many animals and plant species and it is a direct choice for genotyping the control trait loci [ 10].
In the Pelteobagrus genus, Pelteobagrus vachelli is the largest individual. In Asia, the reason why it is a very popular commercial species is that it has relatively high yield together as well as an affordable price for consumers to buy [ 11,12]. This species is also known as the male parent of the new aquatic species 'Huangyou-1 cat sh' (P. vachelli × P. fulvidraco ), which has advantages such as faster growth, a high feeding rate, and greater immunity when compared to its parent species [ 13]. As with a lot of other cultured sh species, the production of P. vachelli seedlings by catching wild parents and carrying out multiple generations of inbreeding has given rise to varying qualities and the degradation of economic characteristics, e.g., body length, weight, head length and condition factors. These problems result in farming production e ciencies and economic bene ts that are di cult to guarantee, in addition to the declining competition for these goods in sh markets, which have seriously affected the sustainable development of the P. vachelli industry. Identi cation of molecular markers from important QTLs for growth-related characters could contribute to e cient breeding varieties, similar to several other aquaculture species, including Lates calcarifer [ 14 ], Hypophthalmichthys nobilis [ 15], Oncorhynchus nerka [ 16 ], and O. tshawytscha [ 17].
Signi cant sexual dimorphism is often found in sh species. In terms of those commercial species, such as Dicentrarchus labrax [ 18 ], Hippoglossus hippoglossus [ 19 ], and Oreochromis niloticus [ 20 ], size as well as growth rates are always signi cantly various between female and male individuals, thus in uencing their commercial value to a large extent. Bagridae sh, the Pelteobagrus genus in particular, also exhibit signi cant sexual dimorphism [ 21,22]. Females tend to grow more slowly than males and lower body weights (up to 30% less) have been reported at harvest time compared to the body weights of identical cohorts of males [ 23]. Therefore, similar to the above mentioned commercial sh, screening for sex-associated markers will lead to shorter time that is asked for the development of all-male sh lines for aquaculture and will be of great help in elucidating sex determination and the mechanisms of sex chromosome differentiation.
Acute and chronic stress responses, which are important for the health of cultured organisms (such as their growth, reproduction, immunity, and other energy demanding activities), are the result of exposure to hypoxia [ 24]. A lot of environmental factors (i.e., elevated temperatures, decomposition, algal blooms, and organic matter accumulation via faeces and unconsumed food, and the high stocking density of intensive sh culture systems) will lead to rising biological oxygen demand and make the hypoxia status more serious for animals in the system [ 25]. Living in hypoxic conditions can place selection pressure on genotypes which could tolerate hypoxia at the genomic level more [ 26], thus developing lines by MAS that are more tolerant to hypoxia is an economical as well as sustainable solution to the problem about aquaculture. However, progress in improving hypoxia tolerance in sh, including Bagridae sh, has been very slow due to the shortage of genetic and physiological knowledge of hypoxia tolerance traits. With aquaculture sh species, analysing genetic of the QTLs associated with hypoxia tolerance is important for e cient MAS in sh, as has been demonstrated in I. punctatus [ 27,28] and O. niloticus [ 29].
Although considerable work has been carried out in P. vachelli to increase production, some speci c characteristics have continued to hinder the development of traditional phenotype-based selective breeding techniques, such as lacking in large-scale genomic resources as well as few markers that are tightly associated with growth-, sex determination-and hypoxia tolerance-related traits. MAS technology, which uses a series of selected markers that are connected to economical traits tightly, can be used to develop stress tolerant and fast growing lines more effectively. As a result, it is of great necessity to start a selective breeding programme for P. vachelli to promote these economical traits. Comparing to existing RAD-Seq approaches, as a reduced-representation sequencing method, double digest restriction-site associated DNA sequencing (ddRAD-seq) is a way for de novo SNP discovery and genotyping in nonmodel species that is e cient and cost-saving. This technique has already been applied in some sh, such as Oncorhynchus kisutch [ 30] and Polyprion oxygeneios [ 31]. For the speci c characteristics of P. vachelli, the ddRAD-seq technique is a very effective and appropriate method for the MAS of a selective breeding programme.
In this research, a high-resolution genetic linkage map of P. vachelli was established using the ddRAD-seq technique for comparative genome mapping with assembled genomes of I. punctatus and P. fulvidraco, as well asfor the ne mapping of economical QTL traits, including 13 growth-, 1 sex determination-, and 1 hypoxia tolerance-related trait. Kompetitive Allele Speci c PCR (KASP) was used to test tight chain SNPs for conducting MAS and the candidate genes screened across signi cant genome-wide (GW) QTL localization intervals were further analysed by qRT-PCR. Our results will facilitate the elucidation of the molecular mechanisms that determine growth kinetics, sex determination and hypoxia tolerance-related traits. In future MAS and de novo GSA in P. vachelli and in CGA in Siluriformes sh, they will also be very useful. We provide a good case study illustrating the utility of genomic data for investigating the genetic basis of important phenotypic traits using reduced-representation sequencing.

Phenotypic trait characteristics
The mapping family in this study consisted of 200 P. vachelli progeny and their phenotypic growth (e.g., W, TL, HL, BL/HL, BH, BL, BW/ED, BL/SL) and hypoxia tolerance-related traits were in concordance with the normal distribution (Kolmogorov-Smirnov test, asymptotic signi cance > 0.05; Table S1). All growthrelated trait values are shown in Table S1, as well as their phenotypic sexes and time to balance loss.
With regards to the growth-related traits, BL/HL, CF, BW/ED, BH/ED, and BL/SL were related to body type proportions and W, BL, HL, TL, BW, BH, CL, and CH were directly related to weight, with these eight traits showing a strong correlation with each other (r = 0.593-0.967, P < 0.001 for all) ( Table 1). The highest correlation value (r = 0.967) was observed between BL and TL. There was a strong correlation between W, TL and BL, with the W being strongly correlated with TL (r = 0.927) and BL (r = 0.934). HL was highly correlated with TL (r = 0.813), BL (r = 0.813) and W (r = 0.823). By phenotype sexing of the mapping family, 101 and 99 individuals were identi ed as males and females, respectively, with a sex ratio of 1.02:1.  (Table S2).

SNP discovery and genotyping
A total of 461,909 raw polymorphic markers were detected using the STACKS pipeline. After stringent selection, 5165 polymorphic markers were successfully genotyped in both parents and offspring. These SNPs were classi ed into three categories: maternal heterozygous (1881 SNPs), paternal heterozygous (2405 SNPs), and heterozygous in both (879 SNPs). All SNPs are listed in Table S3.

High-resolution genetic map construction
A high-resolution ddRAD-based linkage map of P. vachelli was constructed using a pseudo-testcross strategy (a mapping population is developed by hybridizing two unrelated, highly heterozygous parents to produce a set of F1 progeny) [ 8]. The resulting integrated map consisted of 26 linkage groups (LGs), including 5059 segregating SNPs (97.9% of all 5165 polymorphic markers). The number of LGs was congruent with the number of P. vachelli chromosomes (2n = 52) [ 32] The total map length was 4047.01 cM with an average interlocus distance of 0.11 cM. The genetic length of each LG ranged from 122.47 cM (LG1) to 189.03 cM (LG24) with an average interlocus distance of 0.03-0.83 cM ( Table 2 and Fig. 1). The locus names and SNP positions in the 26 LGs of the integrated genetic map are listed in Table S4.

Comparative genome mapping
Within the comparative genome mapping between the LGs of P. vachelli and the chromosomes of P. fulvidraco, 3879 (76.7% of 5059) markers were uniquely mapped to the assembled LGs and were used for synteny analysis ( Fig. 2A). Overall, a one-to-one correspondence was observed between the LGs of P. vachelli and the chromosomes of P. fulvidraco. Notably, for each LG of P. vachelli, approximately 83.2% of the 3879 synteny markers were located on a single chromosome of P. fulvidraco (Fig. 2B), while the remaining were dispersed into various LGs (Table S5). A total of 871 (17.2% of 5059) markers in the linkage map of P. vachelli were uniquely aligned to the chromosomes of I. punctatus (Table S5 and

Fine QTL mapping for phenotypic traits
In this study, we conducted 15 QTL location analyses, including 13 growth (5 body proportion-and 8 weight-related traits), sex determination-, and hypoxia tolerance-related traits. For body proportion-related traits, the CW and GW LOD signi cance thresholds varied from 3.1 to 3.3 and from 4.9 to 5.7, respectively, based on the permutation test. Using the interval mapping algorithm, no signi cant QTLs were associated with BW/ED. A total of 11 QTLs that associated with weight-related traits were detected in 6 LGs, including 1 signi cant GW QTL (qBL/HL5-a; Fig. 3D) and 10 signi cant CW QTLs, with LOD scores ranging from 3.1 to 5.1 (Table 3).
For weight-related traits, the CW and GW LOD signi cance thresholds varied from 3 to 3.6 and from 5 to 5.1, respectively, based on the permutation test. A total of 36 QTLs associated with weight-related traits were detected in 10 LGs using the interval mapping algorithm, including 4 signi cant GW QTLs (qW14-a, qTL3-a, qTL14-a and qHL14-a; Fig. 3A 3A). Owing to the strong correlation between W, TL and BL, QTLs for BL and TL were located within the same con dence intervals along three LGs (Table 4). Six QTLs for HL were mapped to LG14, LG19 and LG20 with PVE values ranging from 7.2 to 11.2%. Two QTLs associated with BW were located within LG2 and LG6 with PVE values of 7.4% and 7.7%, respectively. For CH, ve QTLs located within LG14 and LG22 contributed PVE values of 7.3-9%, with LOD scores of 3.3-4.09. Four QTLs for CL were mapped to LG3, LG19 and LG23 with PVE values ranging from 7.5-9.4%. Two QTLs associated with BH were located within LG15 with PVE values of 7.4% and 7.5% (Fig. 3d). Among all QTLs, a total of 18 con dence intervals within LG3 and LG14 were associated with weight-related traits and in addition, all signi cant GW QTLs were mapped to these two LGs (Table 4).
For the detection of sex determination loci, the CW and GW LOD signi cance thresholds varied from 3.3-3.6 and 80.4, respectively, based on the permutation test. A total of 6 QTLs associated with sex-related traits were detected within 3 LGs, including 2 signi cant GW QTLs (qSXE14-a/b) and 4 signi cant CW QTLs, with LOD scores ranging from 3.46 to 111.33 ( Fig. 3E and Table 5).
For hypoxia tolerance-related traits, the CW and GW LOD signi cance thresholds varied from 3.2-3.7 and 5.1, respectively, based on the permutation test. A total of 11 QTLs associated with hypoxia tolerancerelated traits were detected within 7 LGs using the interval mapping algorithm, including 1 signi cant GW QTL (qHT12-a) and 10 signi cant CW QTLs, with LOD scores ranging from 3.34 to 5.22 ( Fig. 3F and Table 5).

SNP phenotypic association
To con rm and re ne the position of these signi cant GW QTLs, the 200 remaining individuals and the two parents of the 400 offspring, as well as 48 independent wild individuals (24 :24 ) were used for SNP phenotypic association analysis by Kompetitive Allele Speci c PCR (KASP) assays (Table S6). Our results showed that growth-related traits, including body proportion (BL/HL; un_16957051) and weight (W, BL, TL, HL, and CH) were closely related to several markers (un_15322259, un_40276574, un_36488182, un_47032303, un_34909841, and un_54585281) with a K* of 3.884-9.534 and a signi cance of 0.002-0.049. The un_28380227 marker showed the highest association with the hypoxia locus from 1 signi cant GW QTL (qHT12-a) with a K* of 4.72 and a signi cance of 0.03.
All of the genotypic types (hkxhk, lmxll, and nnxnp) of the 200 remaining individuals were veri ed and in accordance with their parents. For these 12 sex-associated SNPs, the loci with hkxhk and nnxnp were found to be completely related to the gender phenotype in the 200 remaining individuals (over 97%). For example, in each case the remaining individuals were homozygous (nn) male and heterozygous (np) female when the parents were a homozygous and a heterozygous (nnxnp) for these markers. If both parents were heterozygous (hkxhk), the offspring were homozygous with one exclusively male (hh) and the other exclusively female (kk), and the heterozygous offspring (hk) were either male or female. Sexassociated SNPs could not be determined with the lmxll marker (approximately 50%-64.7%; Fig. 4 and Table S6). Most loci in the wild population showed no gender difference (approximately 50%), but there were several potentially valuable loci to consider as a marker for potential gender determination, with up to 85.2% association (un_43077153, un_13604800, and un_63006763).

Potential candidate genes for the 8 signi cant GW QTLs
Based on the non-redundant annotation information of the P. fulvidraco genome, 72 genes in total were located within the 8 signi cant GW QTLs (4 weight, 1 body proportion, 2 sex determination, and 1 hypoxia tolerance; Table S7). Among them, 2 genes from 1 body proportion-related QTL (qBL/HL5-a) are related to minus end-directed microtubule transport and sphingolipid synthesis. A total of 14 genes from 4 weightrelated QTLs (qW14-a, qTL3-a, qTL14-a, qHL14-a) play important roles in the genetic regulation of development, cell proliferation, immunity, and energy metabolism. The expression of the rapid growthrelated genes, Signal-induced proliferation-associated 1-like protein 1 (Sipa1) and Corticosteroid 11-betadehydrogenase isozyme 2 (HSD11B2), was signi cantly higher in fast-growing sh than in slow-growing sh in both the liver and the brain (P < 0.05, two-tailed t-test) and Sipa1 expression in the muscle of slowgrowing sh was signi cantly lower than in fast-and medium-growth sh (Fig. 5).
A total of 55 candidate QTL genes from 2 sex-related QTLs (qSXE14-a/b) were identi ed, however, no genes associated with sexual differentiation or determination were identi ed within this region based on the non-redundant annotation information of the P. fulvidraco genome (Table S7). The gene Semaphorin-7A (Sema7a) from 1 of the hypoxia tolerance-related traits (qHT12-a) has been reported to play an important role in hypoxia adaption (Table S7). The temporal expression of Sema-7a mRNA changed in the liver and brain after hypoxia and reoxygenation. Sema-7a mRNA expression peaked in both the liver and the brain after 1.5 h of hypoxia and remained high throughout the recovery.

Discussion
High-resolution genetic map construction using ddRAD-based SNPs To date, through using expressed sequence tags [ 33], AFLP [ 34], and SSR [ 35], a large majority of cat sh genetic maps have been constructed. However, these linkage maps in cat sh is still less dense, compared to an average density of 1.40-15.20 cM. Currently, a few attempts to construct high-density genetic maps of cat sh have been started, but thus far, only channel cat sh (I. punctatus) [ 36] and southern cat sh (Silurus meridionalis) [ 37] maps have been constructed. High-density linkage maps for Bagridae cat sh species have not been generated at all. In the present research, a high-resolution genetic linkage map of dark barbell cat sh (P. vachelli) was constructed with 5059 SNPs by applying the ddRAD-seq technique, which represents the most saturated linkage map to date. This is the very rst kind of high-density Bagridae sh genetic map that has been reported. A high-density linkage map, such as the one we have constructed, offers information on genomic structure in detail and is a usefel resource for comparative genomics and ne-scale QTL mapping in cat sh.

Comparative genome analysis
It is an e cient method to use comparative mapping with model species for evaluating the accuracy of genetic maps of non-model species. In this research, syntenic analyses were performed between our map and the reference genome of P. fulvidraco, which belongs to as same genus as P. vachelli. The exact number of LGs was congruent with the total number of P. fulvidraco chromosomes (2n = 52) [ 21]. A near 1:1 relationship was discovered between our map and the draft genome of P. fulvidraco and a high level of conserved genomic synteny (76.7%) and syntenic boxes (83.2%) were presented, which indicates a high genetic relationship between P. vachelli and P. fulvidraco. Aa for the previous linkage map of Cyprinus carpio [ 38], the similar phenomenon was detected, too. However, only 17.2% of the mapped markers were aligned speci cally to the chromosomes of I. punctatus (2n = 58), indicating the presence of extensive intra-chromosomal rearrangements between the two cat sh species either or issues with the assembly of the draft genome [ 39].

Fine scale QTL mapping and SNPs phenotypic association
It is e cient to identify genetic loci and candidate genes underlying quantitative traits of interest by using Fine scale QTL mapping method. Up to now, growth-related QTL studies have occurred in various kinds of aquatic species, but with the exception of those in I. punctatus [ 27,28], very few studies including ne scale QTL mapping have been performed in cat sh. QTL ne mapping and candidate gene identi cation were hard to achieve in P. vachelli because of the absent high-density linkage map. Within this study, our map consists of enough high con dence QTLs relevant to weight, body proportion, sex determination and hypoxia tolerance. These ndings show one promising platform for MAS and GAS in cat sh and would help to clarify the molecular mechanisms of the above phenotypic traits.
It is very important to validate the results of the QTL association analysis. The fact that the phenotypeassociated SNPs showed very strong relation with weight, body proportion (BL/HL), sex determination, and hypoxia tolerance traits when tested in the 200 remaining individuals, two parents and 48 independent wild individuals using KASP assays is promising. Interestingly, we found that several markers were associated with multiple traits (W, BL, TL, HL, CH) at the same time (Table 6), which also re ects the strong correlation between these weight-related traits. For the 12 sex-associated SNPs, several loci, including hkxhk and nnxnp, showed great relevance to the gender phenotypes (over 97% association in the remaining individuals and over 85.2% association in the wild population), but most loci showed no gender differences (approximately 50% association) in the wild population. The above results also demonstrate the limitations of using SNPs as a sex marker, because they are susceptible to the in uence of the parents within the family, which has also been observed in sex association analyses of O. niloticus L. [ 20], H. Hippoglossus [ 19], and P. oxygeneios [ 31]. Nonetheless, the use of these markers is possible to conduct genetic gender identi cation and all-male breeding based on the P. vachellifamily line selection. Using these SNPs, MAS could also be conducted, providing a valuable tool towards the much more e cient production of all-male, fast growing and hypoxia tolerant P. vachelli for the aquaculture industry.

Growth-related traits and candidate genes
It is reported in some aquaculture specie that there are strong correlations among the growth-related traits (BW, TL, BL, BH and HL) [ 38]. Similarly, these ve growth-related traits were also highly correlated with each other in this study ( Table 1). As expected, based on the relatively high correlation value (mean r = 0.943) between W, TL and BL, QTLs for W, TL and BL had the same distribution patterns within the three LGs (LG3, LG14, and LG24). Additionally, all the signi cant GW QTLs associated with weight-related traits were detected within LG3 and LG14, indicating that these two major LGs are involved in rapid growth and that the candidate genes associated with weight may also be located within these LGs. Among the signi cant GW QTLs, it is worth noting that several genes related to T cell development (thymocyte selection-associated high mobility group box protein, TOX), T cell antigen recognition (T cell receptor beta variable region), and the inducible expression of T cell cytokine genes (nuclear factor of activated T cells, cytoplasmic 3) were identi ed, indicating that strong disease resistance may contribute to individual growth [ 40,41].
In addition, two genes, Sipa1 and HSD11B2, were the most likely to be associated with growth, given their important roles in cell proliferation and development. Sipa1 stimulates the GTPase activity of the Rasrelated protein Rap-2a, which may well regulate cytoskeletal rearrangements, cell migration, cell adhesion and cell spreading [ 42,43], as well as promote the reorganization of the actin cytoskeleton and the recruitment of disks large homologue 4 to F-actin [ 44]. In mammals, glucocorticoid is very decisive in cell proliferation and differentiation, with the two glucocorticoid metabolic enzymes, HSD11B1 and HSD11B2, considered as anti-proliferation and pro-proliferation switches in cells, respectively. During embryonic development, HSD11B2 prevents cells from the growth-inhibiting and/or pro-apoptotic effects of cortisol [ 45]. In sh, HSD11B2 was considered to have an activity that is similar to that of mammalian HSD11B2 in preventing the cortisol activation of the glucocorticoid receptors, which could mediate cell development and stress responses. Additionally, HSD11B2 catalyses 11b-hydroxytestosterone converses to 11-ketotestosterone, a sh androgen. Previous studies have reported that androgen-related genes (e.g., 17α-Methyltestosterone) could improve the GH/IGF axis and growth in P. fulvidraco [ 46 ]. It was also found that HSD11B2 increases gradually during the period of individual growth in two cat sh species [ 47,48]. Interestingly, the expression of Sipa1 and HSD11B2 in both the liver and brain of the fast-growing sh was signi cantly higher than that in the medium-or slow-growth sh. This is consistent with the function of these genes in cell proliferation and development, supporting the notion that Sipa1 and HSD11B2 are the two good candidate genes for growth. However, with the current data, we cannot rule out the involvement of other genes within the signi cant QTLs for growth and further detailed studies on gene function are required to understand the underlying mechanisms of differential growth among individuals.

Sex determination-related traits and candidate genes
The male P. vachelli exhibit >30% faster growth than females and therefore, in both the P. vachelli farming and academic elds, the development of all-male P. vachelli for aquaculture has been a topic of great concern. The sex determination of P. vachelli is XX/XY because sex ratios in conventional diploid offspring approximate 1:1 [ 22,32] and gynogenetic offspring are all female (unpublished data), which is the same in P. fulvidraco. However, similar to other sh species, it is not easy to distinguish between the sex chromosomes (X and Y), as well as between the sex and autosomal chromosomes based on current cytogenetic techniques. Screening of sex-associated markers will make the time required for all-male P to develop much shorter. vachelli. In this study, two signi cant GW QTLs for sex determination in P. vachelli were identi ed on LG14 (with 92.3% of PVE). There were also some signi cant CW QTLs within LG21 and LG24, which is very similar to S. maximus. The most sex-related QTLs in S. maximus are found within LG21 (with 99.9% of PVE), with a linkage span of 70.882 cM. In LG7 and LG14, a small quantity of QTLs with high feasibility are discovered too [ 7]. In some other sh species, the similar phenomena have also been found. In H. hippoglossus, four markers which can be located in LG13 were found to be related to sex determination signi cantly, with 82% of PVE associated with sex [ 19]. Additionally, in Sparus aurata L. [ 6] and S. chuatsi [ 49], the signi cant QTLs affecting sex determination were found to be located in LG21 and LG23, respectively. It is suggested that, in some sh species, a main LG or chromosome is involved in sex determination and that in this LG, the sex determination genes may be located.
Among the signi cant GW QTLs of sex determination, not a single gene associated with sexual differentiation or determination was identi ed in this area, which may be due to the limitations of our simpli ed genome sequencing data or to the limitation that the P. fulvidraco genome came from an XX individual. Among the candidate genes, thrombospondin-3b was the most likely to be associated with gender development. Thrombospondin (TSP) proteins are a major component of cortical rods in mature oocytes in invertebrates [ 50]. A previous study has reported that two types of TSPs from O. niloticus involved in the formation of gonads and oogenesis also play an important role in the 14 day spawning cycle [ 51].

Hypoxia tolerant traits and candidate genes
The hypoxia tolerant trait was found to be a quantitative trait that is controlled by many genes in animals [ 52]. Mapping QTLs for hypoxia traits associated with tolerance is the rst step in improving the stress tolerance of breeding sh, but few QTL mapping studies on hypoxia traits in sh have been conducted. Only recently, two genome-wide association studies were carried out to identify QTLs for hypoxia tolerance using 250K SNP arrays and ddRAD-seq with channel cat sh families [ 27,28] and O. niloticus [ 29], respectively. It is worth noting that we identi ed a marker, un_28380227, that could be directly used in the MAS of tolerant lines in P. vachelli. Additionally, our ndings of the markers located on signi cant GW QTL intervals in LG12 provide a useful resource. Semaphorin-7A (Sema7a) was located in the signi cant GW hypoxia QTL. Sema7a plays an important role in integrin-mediated signalling and functions in regulating both cell migration and immune responses. Co-immunoprecipitation assays showed that hypoxia-inducible factor-1 (hif-1) had a structure that interacts with Sema7a and that overexpression of hif-1 induces Sema7a production even in normoxic conditions [ 53]. Previous reports suggested that hypoxia-induced Sema7a could promote neutrophil migration and a pro-tumourigenic mesenchymal phenotype [ 53,54], which also correlated with the severity of in ammation and osteoclastogenesis [ 55]. Furthermore, Sema7a was highly expressed in the early stages of hypoxic stimulation and remained high in various tissues throughout the recovery period. These results suggest that Sema7a plays a vital role during hypoxia and reoxygenation, indicating that Sema7a may be a good candidate gene for the hypoxic response.

Conclusion
We constructed a high-resolution genetic linkage map with a length of 4047.01 cM and an average marker interval of 0.11 cM using 5059 ddRAD markers in P. vachelli. Comparative genome mapping revealed that a high proportion (83.2%) of markers with a one-to-one correspondence were observed between P. vachelli and P. fulvidraco. Several SNPs from 8 signi cant genome-wide QTLs (4 weight, 1 body proportion, 2 sex, and 1 hypoxia) were found to be associated with these phenotypic traits in other individuals by KASP assays. In addition, two candidate genes for weight, Sipa1 and HSD11B2, were differentially expressed between fast-, medium-and slow-growing P. vachelli. Sema7a, which is associated with hypoxia tolerance, was induced after hypoxia exposure and reoxygenation. We mapped a set of suggestive and signi cant QTLs, as well as candidate genes for growth-, sex determination-and hypoxia tolerance-related trait based on the P. fulvidraco genome. Our results provide a valuable tool for more e cient production of all-male, fast growing and hypoxia tolerant P. vachelli for the aquaculture industry.

Mapping population and phenotypic data
The male parent was selected from a group of P. vachelli derived from a wild population from the Pearl River in Guangdong province and the female parent was chosen from a cultured population from the ChenQiang sh farm in Changzhou, Jiangsu Province, China. Fish breeding and cultivation were performed at the Nanjing Normal University. The juvenile sh were fed live Artemia nauplii to satiation twice daily until the mouth gape sizes matched the size of the arti cial compound feed. The feed was slightly over-supplied based on previous consumption.
Fish were acclimated at an ambient temperature of 26 ± 1 °C in aerated flow-through water with a bioltered water recirculation system for 1 week, followed by feed restriction for two days. For phenotypic data collection, a total of 400 offspring from a full-sib family at 105 days post-hatch were randomly selected and stocked in an aquarium (580 L of volume). The dissolved oxygen level was reduced gradually from 8.23 to 1.51 mg/L over 2.8 hours by bubbling pure nitrogen gas. After that, the surface of the water was enclosed by a plastic lm and sh were monitored for signs of balance loss due to hypoxic stress as the dissolved oxygen level was reduced from 1.51 to 0.32 mg/L over 2.6 hours. The time and sequence of balance loss for each sh was recorded for the analysis of hypoxia tolerance.
Fish were killed by dissection after mild anaesthetization in a eugenol bath (1:10,000). Growth related traits including weight (W), body length (BL), condition factor (CF), head length (HL), BL/HL, total length (TL), body width (BW), body height (BH), BW/ED (eye distance), BH/ED (eye diameter), BL/SL (snout length), caudal peduncle length (CL), and caudal peduncle height (CH) were measured by electronic scales or Vernier calliper for all 400 offspring, at which time the sexes were also identi ed by the stained squash technique [ 22]. After the measurements were taken, a portion of the ns and muscle tissues were frozen in liquid nitrogen and stored at -80 °C. This family exhibited a high variation in their growth-and hypoxia tolerance-related traits and therefore 200 randomly selected P. vachelli offspring (F1) from the above 400 offspring and their parents (F0) were used for the development of a genetic linkage map and QTL analysis. The 200 remaining individuals were used to test SNP markers in KASP assays.
Genomic DNA was extracted from the n tissue of each of the 200 randomly selected offspring (F1) and two parents (F0). ddRAD libraries for all F1 lines were constructed as described in a previous study [ 39].
Brie y, 500 ng of DNA template from each individual was double-digested using the restriction enzymes EcoRI and NlaIII (20 U/ reaction; New England Biolabs, Ipswich, MA, USA) in a combined reaction for 30 min at 37 °C. Subsequently, each fragmented sample was puri ed using a Qiagen MinElute Reaction Cleanup Kit (Qiagen, Valencia, CA, USA) and eluted in 20 μL of elution buffer (EB). The fragments were then ligated to P1 (includes a unique 4-to 8-bp multiplex identi er used to distinguish each individual) and P2 adapters that bound to the EcoRI and NlaIII overhangs, respectively. In each 40-μL reaction, 500 ng of DNA, 1 μL of P1 adapter (10 mM), 1 μL of P2 adapter (10 mM), 1 μL of T4 ligase (1000 U/mL), 4 μL of 10X T4 ligation buffer, and double-distilled water were mixed. The ligation was performed in a PCR machine using the following conditions: 37 °C for 30 min, 65 °C for 10 min, followed by a decrease in temperature to 20 °C at a rate of 1.3 °C/min. The samples were pooled and size-selected (400-600 bp) from an agarose gel. The DNA product was subsequently puri ed using a Qiagen MinElute Gel Puri cation Kit and eluted in 10 μL of EB. The paired-end (150 bp) sequencing of the ddRAD products from the 202 individuals was performed using the Illumina HiSeqXten sequencing platform (Illumina, Inc., San Diego, CA, USA). The sequencing data for each individual was extracted according to the speci c multiplex identi er.

SNP discovery and genotyping
We rst ltered out Illumina short reads lacking sample-speci c multiplex identi ers and the expected restriction enzyme motifs. Thereafter, the reads were ltered on the basis of their quality scores using Trimmomatic (v0.32) [ 56] in three steps: (1) removal of adapters; (2) removal of reads with bases from the start or end of a read, if below the quality threshold of 3; and (3) scanning of the reads with a 4-bp sliding window, removing reads with an average Phred quality per base below 15.
The STACKS pipeline (Version 1.32) [ 57] was used to assemble the loci, de novo, from the sequencing data for SNP calling. The USTACKS, CSTACKS, SSTACKS, and GENOTYPE programmes were then used to create libraries of the loci, i.e., one for each individual and one for all the loci shared among the individuals. The minimum depth of coverage required to create a stack was 3; the maximum distance (in nucleotides) allowed between stacks was 3; 2 mismatches were allowed between sample loci when the Only the SNPs with miss rates (number of samples with no genotype information/number of total samples) less than 10% along with biallelic SNPs were selected using custom PERL scripts (https://github.com/Niuyongchao/Fish_linkage_map) to avoid sequencing errors.

Linkage map construction
A linkage map was constructed using JoinMap 4.1 [ 58]. The linkage group assignments were made under a logarithm of odds (LOD) score limit of 5.0-7.0. The regression mapping algorithm and Kosambi's mapping function were used for map construction with the following settings: Rec = 0.4, LOD = 1.0, Jump = 5. The resulting linkage maps were drawn using a custom PERL script (https://github.com/Niuyongchao/Fish_linkage_map).

Comparative genome analysis
Using a method from a previous study [ 59], comparative mapping was performed between our genetic map and the reference genomes of three Siluriformes sh, including the channel cat sh (I. punctatus) and the yellow cat sh(P. fulvidraco). In brief, the ddRAD tag markers were mapped to the two cat sh genomes using the BLAST software suite (NCBI BLAST+ version 2.6.0) with an e-value cut-off of 10 -10 . In cases where the search of a query sequence hit two or more loci, the hit with the smallest e-value was considered signi cant [ 37]. Finally, markers with a single unique position in the three reference genomes were used to perform comparative mapping and the genomic synteny was visualized using Circos software.

QTL mapping
The QTLs were identi ed using MapQTL 6.0 with an interval mapping algorithm [ 60]. Automatic cofactor selection (backward elimination, P < 0.05) was used for the detection of signi cantly associated markers as cofactors. The LOD signi cance threshold levels were determined by a permutation test on the basis of 1000 permutations at a signi cance level of P < 0.05. The location of each QTL was determined according to its LOD peak location and surrounding region. The percentage of the phenotypic variance explained by a QTL (R 2 ) was estimated at the highest probability peak. The QTL results were drawn using a custom PERL script (Genepioneer Biotechnologies, Nanjing, China). QTLs with LOD scores exceeding the chromosome-wide (CW) LOD threshold at P < 0.05 were considered as suggestive, while those with LOD scores exceeding the genome-wide (GW) LOD threshold at P < 0.05 were considered as signi cant.

Kompetitive Allele Speci c PCR Assay
To con rm and re ne the positions of the signi cant GW QTLs detected in the F1 mapping population, the 200 remaining individuals from the 400 total offspring of a full-sib family at 105 days post-hatch were used. In addition, in order to avoid the restriction of the remaining population on the genotyping of the parental sh, the sex marker associations were tested in the two parents of the 400 offspring (1 :1 ) and in 48 independent wild individuals (24 :24 ) originating from the Yangtze River, Yellow River, Amur River and Pearl River. Marker phenotype association was tested using KASP assays based on SNPs that were commonly found in the mapping families and spanned the regions of 8 signi cant GW associations with weight (un_15322259, un_40276574, un_36488182, un_47032303, un_34909841, un_54585281), body proportion (un_16957051), sex determination (un_42441442, un_40145170, un_43077153, un_47032303, un_36488182, un_15322259, un_40276574, un_34909841, un_13604800, un_25544718, un_63006763, un_55377625), and hypoxia tolerance (un_28380227).
The KASP assay was carried out according to the manufacturer's recommendations (LGC Genomics, Beverly, MA, USA) and a previous study [ 61]. Marker sequences and KASP assay primers are listed in Table S8. Ampli cation was carried out starting with 15 min at 94 °C, followed by 10 touchdown cycles of 20 s at 94 °C and 60 s at 65-57 °C and 26-35 cycles of 94 °C for 20 s and 60 °C for 1 min. Endpoint genotyping was done using the CFX Manager 3.1 software. The speci city and sensitivity of all tested markers are listed in Table S2. Afterwards, an association analysis was performed using the Kruskal-Wallis test with SPSS 22.0. The Kruskal-Wallis test statistic (K*) and asymptotic signi cance (P < 0.05) were applied in order to test the magnitude of associations between the SNP genotypes and phenotype.

Identi cation of potential candidate genes
For those markers that were located in the con dence intervals of GW QTLs and mapped at a single position on the assembled genome of P. fulvidraco, their sequences were extended by adding 500 nucleotide sequences from each side of the genome. The extended markers were used to identify conserved regions in the genome of P. fulvidraco and the potential candidate genes were identi ed in the conserved regions based on the annotation information.
We identi ed a number of candidate genes including Semaphorin-7A (Sema7a), Signal-induced proliferation-associated 1-like protein 1 (Sipa1) and Corticosteroid 11-beta-dehydrogenase isozyme 2 (HSD11B2), which were possibly responsible for the rapid growth and hypoxia tolerance traits (see below). The relative expression levels of Sipa1 and HSD11B2 were detected between fast-, medium-and slow-growing P. vachelli (fast: 13.2±1.7 g; medium: 9.2±1.1 g; slow: 4.7±0.6 g) by qRT-PCR. Each group had nine individuals and three sh were pooled to create a sample. The materials for the hypoxia experiment were the same as our previously published articles [ 62,63]. In brief, control sh were removed from three aquaria for direct organ dissection. Afterwards, the water was deoxygenated for 25-30 min by bubbling pure nitrogen gas in order to decrease the oxygen concentration from 6.8 mg/L to 0.7 mg/L. After the sh were subjected to hypoxia for 1.5, 4 and 6.5 h, sh were quickly removed for organ dissection. After hypoxia for 6.5 h, the bubbling of nitrogen gas was replaced by bubbling with air, and the oxygen concentration was returned to a normal level (6.8 mg/L) in the three aquaria after 25-30 min.
During the reoxygenation for 1.5, 4 and 6.5 h, sh were quickly removed for organ dissection.
The harvested organs were immediately frozen in liquid nitrogen and stored at -80 °C. To evaluate the role of Sema7a in response to hypoxia and reoxygenation in P. vachelli, the sampling of liver and brain tissues from nine sh of the control (C) and challenged groups (hypoxia for 1.5, 4 and 6.5 h: H1.5, H4 and H6.5, respectively; reoxygenation for 1.5, 4 and 6.5 h: R1.5, R4 and R6.5, respectively) in the three aquaria was also conducted. Three sh were pooled to create a sample from an aquarium. qRT-PCR with β-actin as an internal control was used to explore mRNA expression levels. qRT-PCR was performed with the SYBR Green Master kit according to the manufacturer's instructions (Roche, Basel, Switzerland). The primers for qRT-PCR are listed in Table S8. The experiments were carried out in triplicate in an ABI Stepone TM Plus (Applied Biosystems, USA) with a total volume of 20 μL containing 10 μL of SYBR green mastermix, 4 μL of cDNA (500 ng), and 3 μL each of forward and reverse primers (2 μmol/L). The qRT-PCR was carried out as follows: 95 ºC for 10 min, followed by 40 cycles of 95 ºC for 15 s, and 55 ºC for 1 min. The expression levels were calculated by the 2 -△△CT method and subjected to statistical analysis. During the analysis of mRNA expression, fold change was determined by comparing to the expression of a reference gene. Table S1. Phenotypic data, including growth-related trait values, phenotypic sex, and time to balance loss, for ddRAD-seq and SNP phenotypic associations. Table S2. Data production of ddRAD sequencing for each individual.  Table S8. List of the allele speci c primers and common primers designed for the allele speci c PCR genotyping assays, as well as basic information on the 12 markers. Linkage group lengths and marker distributions of the high-resolution ddRAD-based SNP linkage map of Pelteobagrus vachelli. Within each linkage group, red, blue, and yellow lines represent paternal heterozygous SNPs, maternal heterozygous SNPs, and SNPs heterozygous in both parents, respectively. The details of the genetic map are given in Table S4.   Table S6. nnxnp: homozygous and heterozygous ; hkxhk: heterozygous and heterozygous ; lmxll: homozygous and heterozygous . Left side: The expression pattern of Signal-induced proliferation-associated 1-like protein 1 (Sipa1) and Corticosteroid 11-beta-dehydrogenase isozyme 2 (HSD11B2) in slow-medium-and fast-growing sh.

Supplementary Legends
Right side: The temporal expression pro les of Semaphorin-7A (Sema-7a) mRNA in the liver and brain after acute hypoxia and reoxygenation, as determined by qRT-PCR. Statistical analyses of differences were conducted by one-way analysis of variance (ANOVA) using SPSS 22.0 software. Error bars represent the standard deviation of three replicates. The asterisks above the bars represents statistically signi cant differences from control sh (*P < 0.05). Control, hypoxia (1.5, 4 and 6.5 h) and reoxygenation (1.5, 4 and 6.5 h) values are shown as bars C, H1.5, H4, H6.5, R1.5, R4 and R6. 5