Signatures of selection for resistance to Haemonchus contortus in sheep and goats

Background Gastrointestinal nematode infection (GNI) is the most important disease affecting the small ruminant industry in U.S. The environmental conditions in the southern United States are ideal for the survival of the most pathogenic gastrointestinal nematode, Haemonchus contortus. Host genetic variation for resistance to H. contortus allows selective breeding for increased resistance of animals. This selection process increases the prevalence of particular alleles in sheep and goats and creates unique genetic patterns in the genome of these species. The aim of this study was to identify loci with divergent allelic frequencies in a candidate gene panel of 100 genes using two different approaches (frequentist and Bayesian) to estimate Fst outliers in three different breeds of sheep and goats exposed to H. contortus. Results Our results for sheep populations showed SNPs under selection in C3AR1, CSF3, SOCS2, NOS2, STAT5B, TGFB2 and IL2RA genes using frequentist and Bayesian approaches. For goats, SNPs in CD1D, ITGA9, IL12A, IL13RA1, CD86 and TGFB2 genes were under selection. Common signatures of selection in both species were observed in NOS2, TGFB2 and TLR4 genes. Directional selection was present in all SNPs evaluated in the present study. Conclusions A total of 13 SNPs within 7 genes of our candidate gene panel related to H. contortus exposure were identified under selection in sheep populations. For goats, 11 SNPs within 7 genes were identified under selection. Results from this study support the hypothesis that resistance to H. contortus is likely to be controlled by many loci. Shared signatures of selection related to mechanisms of immune protection against H. contortus infection in sheep and goats could be useful targets in breeding programs aimed to produce resistant animals with low FEC.


Background
Small ruminant industry in the US is a growing industry due to ethnic markets and increasing demand for organically produced livestock. Gastrointestinal nematode infection (GNI) is one of the most prevalent health problems in sheep and goats and represents a major productivity threat for small ruminants [1]. High disease incidence has been observed in the Southeast US regions [2] and infection with Haemonchus contortus is common throughout the year [3,4]. This blood sucking parasite inhabits the abomasum of the host and it is responsible for weight loss, anemia and reduced performance [2].
Recent advances in genomic research have provided tools to unravel the genetics underlying phenotypic variation in complex traits [5], including resistance to GNI. Host genetic variation for GNI promises great opportunities for selective breeding of sheep and goats with increased resistance to H. contortus. Fecal egg count (FEC) is currently the method of choice to identify resistance to GNI and is the standard phenotypic measure to achieve rapid genetic progress [6]. Host resistance based on FEC is a heritable trait in both sheep and goats, with heritability estimates ranging between 0.01 to 0.65, and 0.1 to 0.33, respectively [7][8][9][10][11][12][13][14][15][16]. In accordance, breeding studies of small ruminants have revealed a FEC reduction after concurrent selective breeding of naturally resistant sheep to GIN infection [17][18][19].
Sheep and goats were the first livestock species to be domesticated by humans and were initially used mainly for meat, rather than wool or milk [20]. Natural selection and artificial selective breeding are the main driving forces shaping genetic variation across the sheep and goat genomes, and have gradually changed the phenotypes of these species. Within breeding strategies, selection increases the frequency of particular alleles at different loci in the population and creates unique genetic patterns in the DNA sequence that can be traced back and investigated for further analyses [21].
Two of the most used statistical methods for the analysis of signatures of selection are the detection of long haplotypes and the identification of differences in allele frequencies. The long haplotype detection method requires accurate allele assignment to one of the parental chromosomes (chromosome phasing) and ancestral allele identification, which sometimes can be a limitation when information about ancestors and pedigree is not available [22]. On the other hand, genetic differentiation among groups can be computed using the Fst method. This approach allows for identification of loci showing differences in allelic frequencies between two or more divergent populations, and therefore believed to be under selection. Highly genetic divergent loci between populations have more extreme Fst values (greater than 0.25) than low genetic divergent loci [23], and extreme Fst values are associated with either natural or artificial selection.
Using this approach for sheep, few loci have been identified as regions under selection for resistance or susceptibility to GNI [24], and in goats, information is even more scarce [25]. Some candidate markers within Ovar-DRA and Ovar-DRB genes have been identified as possible genetic markers associated with low H. contortus FEC in sheep and goat populations [26]. However, more knowledge is required to understand the genetic architecture underlying resistance against GNI in these species. Thus, the aim of this study was to identify immune loci (among a candidate gene panel of 100 genes) with divergent allelic frequencies in three different lines of sheep and goats, respectively, using the Fst statistic.

Results
Genotyping, quality control and population structure in sheep and goats The sheep and goat DNA samples were sequenced with a median depth of 24x across 5000 probes (average of 50 probes/gene) The initial SNP data set consisted of 5346 SNPs for both sheep and goats. Only biallelic SNPs were identified in our populations. After quality control, the final SNP data set contained 1339 SNPs for sheep and 1020 SNPs for goats.

Signatures of selection using Fst in sheep
Shared signatures of selection between both approaches were identified in C3AR1, SOCS2, NOS2, CSF3, TGFB2, IL2RA and STAT5B genes (    IL12A, TLR4, IL33, TGFB2, ITGA9, CD86, CD1D, and IL13RA1 genes (Additional file 5: Table S4, Figure 4). All the SNPs were under directional selection and located in exonic, intronic and untranslated regions ( Figure 4). Information regarding shared SNPs under selection in goats using both frequentist and Bayesian approaches is presented in Table 2 and Figure 3. For the Kiko and Spanish vs Boer analysis (Table 2), IL12A, TLR4 and ITGA9 genes had 4 SNPs under selection. The highest Fst value was observed in the CHR8:106725462 (TLR4). The CHR8:106725462 and CHR8:106725265 in exon 3 of TLR4 gene are synonymous mutations.
For the Kiko vs Boer analysis (Table 2), high genetic differentiation was observed in a SNP (CHR1:66217253) located in an intronic region of CD86 gene. For the Spanish vs Boer analysis (Table 2), CD1D, TGFB2, Finally, for the Spanish vs Kiko analysis (Table 2), 5 SNPs within TLR4, and ITGA9 genes were observed under selection. The majority of the SNPs with divergent allelic frequencies were located in TLR4 gene and within exon 3 and 4. Two (CHR8:106725462 and CHR8:106725265) out of 4 of the SNPs in TLR4 gene were also observed under selection in the Kiko and Spanish vs Boer analysis. The CHR22:11106216 was also under directional selection in the Kiko and Spanish vs Boer analysis, and in the Spanish vs Boer analysis.

Genomic regions under selection in both species
After examination of the Fst results per species, several loci in TGFB2, NOS2, and TLR4 genes were observed under selection in both species and are presented in Table 3 and Figure 3.

Discussion
Domestication, breed formation, and selective breeding leave detectable patterns within the genome of livestock species such as sheep and goats. Identification of these genomic patterns in the DNA sequence could help to identify of genes controlling resistance to H. contortus or other gastrointestinal parasites. Several studies have attempted to identify the genetic variation controlling gastrointestinal parasite resistance in sheep and goats by using SNP markers and Genome Wide Association Studies (GWAS) but few research studies has been devoted to identify signatures of selection for GNI resistance in these species [27][28][29]. Signatures of selection for resistance to GNI have not been identified in goats, and for sheep, only Perendale and Romney breeds have been evaluated [24]. In the present study, we unravel signatures of selection using a targeted sequencing approach in three different breeds of sheep and goats. The SNPs potentially under selection identified in this study spanned a myriad of candidate genes related to immune response and cellular mechanisms against H. contortus.  In our study, all the signatures of selection identified in sheep and goats were under directional selection. Directional selection is one of the primary cause of phenotypic diversification and has been used to increase the frequency of favorable additive alleles [30]. This selection process has not exhausted the genetic variation for most economically important traits in livestock [31]. Our results suggest that some of the SNPs in genes related to resistance to GNI are under directional selection. This could be possible due to selection for resistance to GNI is focused on resistant individuals, and susceptible animals are usually removed from the flock or not used in the mating process.

Signatures of selection in sheep populations
For Katahdin vs Dorper, and Katahdin and St. Croix vs Dorper analyses, loci within SOCS2, NOS2, TGFB2 and IL2RA genes were observed under selection. The SOCS2 gene has been previously associated with FEC in Dorper x Red Maasai sheep using GWAS and expression of this candidate gene was observed in abomasal tissue, mesenteric lymph nodes, and Peyer's patches from ewes and lambs [27]. Thus, it is possible that SOCS2 gene could be used as candidate gene for future studies to validate previous and current results in Dorper and Dorper × Red Maasai sheep.
The SOCS2 gene is a broad key regulator of cytokine responses, including IL2, IL3, IL4, IFN-γ, CSF, and Jak-STAT signaling pathways in bone marrow and T cells [32]. Studies on mice infected with Tripanozoma cruzi have shown that expression of SOCS2 facilitates inflammatory and immune responses to prevent myocardial dysfunction but increases parasitemia [33]. On the contrary, SOCS2 −/− mice infected with Schistosoma mansoni expressed increased Th2 response with higher IgE and eosinophil production than SOCS2 +/+ mice [34]. Also, SOCS2 −/− mice have shown increased body weight and gigantism possibly due to downregulation of growth hormone and insulin-like growth factor-I (IGF-I) signaling [35]. In Scottish Blackface sheep infected with Teladorsagia circumcincta, SOCS2 gene was found differentially expressed between resistant and control animals [36].
Nitrogen oxygen synthase 2 or NOS2 is a key molecule involved in Th1 response. It participates in the production of nitric oxide to kill invading microbes in phagocytes during classical macrophage activation by IFN-γ and TNF-α. Differential expression of this gene has been observed in the abomasum of Merino sheep during H. contortus challenge [37]. In that study, mRNA expression NOS2 gene was downregulated in susceptible individuals. While there is a proposed interplay between Th1, Th2, and Treg responses during GNI [38], susceptibility to these infections has been related to Th1 and Th17 responses, and Th2 has been associated with resistance to helminth infections in sheep [39].
TGFB2 protein has been reported as an anti-inflammatory cytokine, and was observed in high concentration in the gut mucosa of sheep after H. contortus infection [40]. In pigs, PAS1, a product of Ascaris suum, induces IL-10 and TGFB2 production in macrophages and has been related to loss of pro-inflammatory cytokine production [41]. In humans and animal models, it has been shown that inhibition of T-cell proliferation might be triggered by an increase of IL-10 and TGFB production in antigen presenting cells or T-cells as a result of down-modulatory molecules that are released by the parasites to enhance survival [42]. Thus, parasites are prone to use IL-10 and TGFB to downregulate host immune response.
IL2RA protein is mainly expressed in CD4+ Treg cells and it constitutes one of the three subunits of the IL2R. In humans, induction of Treg cells increases during natural and long term gastrointestinal nematode infections [43,44]. In sheep, expression of the mRNA of IL2RA gene in the abomasum has been related to subsequent H. contortus infections in resistant sheep while its expression in the jejunal mucosa has been linked to susceptibility of Trichostrongylus columbriformis [37]. Thus, differential expression between susceptible and resistant individuals could depend on the stage of the host immune response, the infection period, as well as the nematode species.
For St. Croix vs Dorper, and Katahdin and St. Croix vs Dorper analyses, SNPs in C3AR1, CSF3, STAT5B, TGFB2 and IL2RA genes were found to be under selection. C3AR1 protein plays an important role during innate immune responses. It is part of the complement cascade. Reduced T cell responses has been observed when host animals lack of C3AR [45]. Recent work in mice, using bone marrow transplant and RNA Seq analysis, identified that signaling by C3AR mediates macrophage recruitment after induced injury with cardiotoxin and muscle regeneration [46]. The exact role of C3AR1 has not been evaluated in sheep during gastrointestinal nematode infections, but some studies suggest that the complement activation is one of the first mechanisms of protection against helminth infections [47] and classical and alternative complement pathways can be activated in resistant sheep to H. contortus [48]. The STAT5 gene can be activated by many cytokines such as GM-CSF and thymic stromal lymphopoietin (TSLP) in the dendritic cells. Activation of STAT5 by TSLP has been shown to trigger Th2 responses at barrier surfaces [49]. Also, STAT5 signaling has been related to many biological processes, such as TCR signaling and basal proliferation of naïve CD4+ T cells [50]. Moreover, STAT5B mediates the signal transduction of IL2, IL4, CSF1, and different growth hormones. Thus, it is possible that STAT5B gene is responsible for many cellular functions during H. contortus exposure and further analysis is required to confirm our results.

Signatures of selection in goat populations
For many years, there has been a debate about the immune mechanisms controlling H. contortus infections in sheep and goats. The same helminth species can parasitize both species but previous studies suggest higher levels of infection in goats [51].
For Kiko and Spanish vs Boer, genes with genetic differentiation were IL12A, TLR4 and ITGA9. IL12 protein is a major cytokine that controls the maturation of CD4+ T cells into Th1 cells and promotes IFNG production in response to intracellular parasites. IL12 protein is composed of an alpha chain (p35 subunit) and a beta chain (p40 subunit) linked by a disulfide bond [52]. In Nelore cattle, some studies have suggested that susceptibility to gastrointestinal parasites is associated with an increase of Th1 response with high elevated worm burden and elevated IFNG and IL12 production [53].
Toll-like receptors (TLR) are vital for the detection of invading pathogens and are commonly expressed in antigen presenting cells and other immune cells [54]. In resistant sheep infected with H. contortus and T. colubriformis, upregulation of several TLR genes, including TLR4, was observed in the abomasum. In the same study, susceptible individuals presented lower expression of this gene [37]. Contrary to sheep, susceptible Angus yearlings infected with Ostertagia, Cooperia and Nematodirus spp., TLR4 showed higher expression in the mesenteric lymph nodes [55]. In goats, increased expression of TLR4 gene in blood has been observed during inclusion of Sericea lespedeza in the diet [56]. This observation could be related to biologically active tannin fractions from plants containing tannins such as S. lespedeza. Several studies have shown plant tannins are able to modulate the innate immune response and act as γ-T cell agonists [56,57].
The ITGA9 gene encodes an alpha integrin that compose the integral membrane glycoproteins that mediates cell-cell and cell matrix adhesion. In resistant Merino sheep infected with H. contortus, transcriptome analysis results revealed ITGA9 gene as part of an enriched gene set related to the extracellular matrix receptor interaction pathway [58]. The exact role of ITGA9 gene in goats is unknown, but further analysis could help to understand possible mechanisms of protection against H. contortus and other gastrointestinal parasites.
For Kiko vs Boer, a SNP in CD86 gene was observed under selection. This gene encodes a membrane bound protein in antigen presenting cells that binds CD28 and CTLA-4 proteins localized in the T cell membrane. Binding with C28 leads to T cell proliferation and cytokine production, while binding with CTLA-4 negatively regulates T cell response [59]. Thus, it is possible that CD86 controls some T cell mechanisms in goats.
For Spanish vs Boer analysis, signatures of selection were identified in CD1D, TGFB2, ITGA9 and IL13RA1 genes. CD1D is a major histocompatibility complex class I related protein that regulates presentation of glycolipids antigens to natural killer T cells [60]. In resistant cattle naturally exposed to gastrointestinal parasites, CD1D was upregulated in the mesenteric lymph nodes [55]. In goats, the role of CD1D has not been studied but it is possible that this gene could play an important role during presentation of glycoproteins from H. contortus to T cells.
As observed in sheep, TGFB2 could be used by H. contortus to promote infection. In tropical cattle, susceptibility to Theileria annulata has been associated with TGFB2 induction and increased TGF-b2 production by Theileria-infected macrophages promote invasiveness [61]. In sheep, TGFB-like molecules have been identified in larvae from H. contortus and T. circumcincta [62]. Thus, activation of TGFB and TGFB-like molecules from gastrointestinal parasites could control downregulation of the immune response. The exact role of TGFB2 in goats is unclear and more research is needed to understand its function during H. contortus infections.
IL13RA1 subunit together with IL4RA can form a functional receptor for IL13 [63]. In goats, no evidence of IL13RA1 function has been reported but in Hereford Shorthorn cattle infected with Boophilus microplus, results showed that IL13RA1 precursor was differentially expressed [64].
Finally for Spanish vs Kiko, TLR4 and ITGA9 had SNPs under selection. For this analysis, the same SNP identified under selection in ITGA9 gene in Kiko and Spanish vs Boer analysis was observed. For TLR4, 2 more SNPs in exon 4 were identified under selection.
Thus, it seems that for goats, TLR4 and ITGA9 genes could play important roles during H. contortus infection.

Common signatures of selection in sheep and goats
During the last two decades, results have shown differences in feeding behavior and gastrointestinal nematode parasitism between sheep and goats. Feeding behavior is one important aspect that differentiates sheep and goats. Sheep are typically raised in grazing systems with parasites commonly found within the pastures, and have to counteract the negative effects of GNI by developing an effective immune response. Goats are common browsers which allows them to rely less on immune response mechanisms [65].
For many years, there has been a question of the importance of immune effector molecules and the mucosal response in goats during GNI. Our findings suggest a possible interplay between Th1 and Th2 responses with conserved breed specific mechanisms. For both species, our results suggest a possible interplay between Th1 and Th2 response during GNI, as previously described by Hassan et al. [39] and Pernthaner et al. [66].
One of the most interesting findings from this study is the identification of shared immune response mechanisms between sheep and goats (Fig. 3). It is possible that some immune response mechanisms are shared between both species and induce an effective immune response against H. contortus. The NOS2, TGFB2, and TLR4 genes, observed under selection in both species, are key modulators of Th1 and Th2 responses, and active players of antigen recognition. Several studies in cashmere and dairy goats have evaluated the responsiveness of resistant animals to GNI and have found a negative correlated response between worm counts and eosinophil, mast cell, and globule leucocyte counts [67][68][69]. In sheep, similar cellular immune response has been associated to GNI [70,71]. Thus, it is possible that some mechanisms of immune protection are shared between these species but more studies are required to understand these events during H. contortus infection.
The conserved mechanisms of protective response against H. contortus are most likely to be useful targets in the development of alternative nematode control strategies in both species, as they can be widely applied in production systems. For this reason our future efforts will focus on validation of the results observed in the present study to unravel genetic mechanisms used for controlling H. contortus or other GNI between sheep and goats.

Conclusion
Results from this study support the hypothesis that resistance to gastrointestinal parasites such as H. contortus is likely to be controlled by many loci. Different immune response mechanisms between sheep and goats are used to control H. contortus but some aspects are shared in both species. Shared mechanisms of immune protection could be useful targets in breeding programs aimed to produce resistant animals and future research is necessary to validate our findings.

Animal populations
The research protocol for the present study was approved by the Langston University Animal Care and Use Committee. Sire candidates were randomly selected in the first year from five commercial farms in Arkansas (CWC Farm), Kansas (Hogan Ranch), and Missouri (CMI Dorpers, Thousand Oak Ranch and Persimmon Creek Ranch) and American Institute for Goat Research at Langston University, Oklahoma and transferred to Langston University for a central sire performance test with an artificial H. contortus infection described later. In the second and third years, young male offspring of resistant or moderate resistant breeding groups to GIN were tested with the artificial H. contortus infection. Sheep and goats were grouped per breed and species in adjacent pens with automated feeders allowing free-choice access to a 15% crude protein diet at Langston University. Overall, 145 offspring sheep from Dorper (n = 48), Katahdin (n = 57), and St. Croix (n = 40) breeds and 144 offspring goats from Boer (n = 52), Kiko (n = 44) and Spanish (n = 48) breeds were used in this study.
Deworming and H. contortus artificial infection methods are described in a previous publication [25]. Briefly, sheep and goats were treated with albendazole (Valbazen®; 10 and 20 mg per kg of body weight, respectively) and levamisole (Prohibit®; 12 and 18 mg per kg of body weight, respectively) during 2 weeks of adaptation. Then, animals were screened for FEC reduction (< 100 epg) and received an oral dose of 10,000 L 3 larvae of H. contortus. FEC was recorded at 28, 35 and 42 days post-infection. Animals returned to commercial farms at the end of the study.
Genotyping and data quality control Blood samples from sheep and goats were collected by puncture of the jugular vein using vacutainer tubes with anticoagulant EDTA. Subsequently, genomic DNA was isolated using DNeasy Blood & Tissue Kit (Qiagen, Valencia, CA) according the manufacturer's instructions and stored at − 20°C. The DNA yield was calculated from a spectrophotometric measurement at 260 nm (NanoDrop-1000, Thermo Scientific), and the purity was assessed using a ratio 260/ 280 nm.
Two hundred and fifty ng/μL of genomic DNA was genotyped using Capture-Sequencing by RAPiD Genomics (Gainesville, Florida) to target 100 genes related to the immune response against H. contortus or other GNI.
The candidate gene panel was selected for targeted sequencing based on results from previous studies in sheep [55,[72][73][74][75][76] and goats [77]. In addition, genes related to the immune response against H. contortus and other GNI were considered as candidates for targeted sequencing (Additional file 6: Table S5).
The Oar_v4.0 reference genome available at the National Center for Biotechnology Information (NCBI) genome browser was used to design biotinylated 120-mer probes that captured sequences at each target locus. For library preparation, Nextera tagmentation protocol from Illumina was used. Then, biotin-labeled probes hybridized denatured libraries and streptavidin-coated beads were used to capture the probe-library complex. Streptavidin-coated beads were magnetically pulled down and DNA fragments were eluted. Libraries were captured by complimentary surface bound oligos and library amplification was performed using bridge amplification according to Illumina's guidelines. The probe set used for sequence capture contained 5000 probes representing 100 genes. Target enriched libraries were sequenced using the Illumina HiSeq 3000 PE100 platform to generate 2 × 101 bp paired-end reads.
Data was demultiplexed using bcl2fastq conversion software from Illumina, cleaned, and trimmed. The 3′ ends were trimmed and low quality bases with < 20 Phred quality score reads were removed. Clean reads were mapped to the sheep (Oar_v4.0) and goat (ASM170441v1) reference genomes with MOSAIK software [78]. Freebayes was used for identification of SNPs and VCFtools [79] was used to generate VCF files. Samples were filtered based on maximum missing count [3], minimum number of alleles [2], mean read depth (750), call rate (< 95%) and MAF (≤ 0.01). Thus, SNPs with a call rate < 95% and MAF ≤ 0.01 were removed.
Principal component analysis plots were generated to illustrate population structure using JMP Genomics 9 software from SAS (SAS Institute Inc., Cary, NC). Individuals included in the principal component analysis and further Fst analysis were selected based on the identity by state threshold of ≤0.5. For these analyses, one hundred and twenty sheep from Dorper (n = 40), Katahdin (n = 40), and St. Croix (n = 40) breeds and 129 goats from Boer (n = 43), Kiko (n = 43) and Spanish (n = 43) breeds were used.

Identification of signatures of selection using allele frequencies
Signatures of selection were identified using Fst statistic at each SNP using frequentist and Bayesian approaches which are focused on the identification of differences in allele frequencies between subpopulations. To identify genetic divergence between subpopulations, analyses between resistant and susceptible breeds within species were carried out. To identify any signatures of selection different in the two resistant breeds, an additional analysis was performed per species by comparing the St. Croix against Katahdin (S vs K, resistant vs resistant breed) for sheep, and Spanish against Kiko (S vs K, resistant vs resistant breed) for goats.
For the frequentist Fst, calculation of average allele frequency across breeds, estimation of total variance, and deviation of each population from mean and Fst computation were performed using the R software and the R codes from Gondro et al. [22]. The formula used to estimate Fst values was the following: Fst ¼ deviation of each population from mean ð Þ 2 total variance where σ 2 subpopulation is the variance of the deviation of each population from mean and σ 2 total is the total variance. Estimates corresponding to the 1% extreme Fst values were used to define a significance threshold and identify regions under selection. Bayesian Fst was estimated using BayeScan software. In this approach, a Bayesian likelihood method implemented via reversible jump Markov Chain Monte Carlo (MCMC) was used which assumes that allele frequencies follow a Dirichlet distribution [80]. The main advantage of this approach is that Fst statistic is modelled using logistic regression methods by decomposing locus-population Fst coefficients into a population-specific component (beta), shared by all loci and a locus-specific component (alpha) shared by all the populations. Then, departure from neutrality at a given locus is assumed when the locus-specific component is necessary to explain the observed pattern of diversity (alpha significantly different from 0). Diversifying selection can be assumed if positive values of alpha are observed, whereas negative alpha values suggest balancing or purifying selection. Consequently, two alternative models are generated for each locus, including or not the alpha component to model