Insertion of an endogenous Jaagsiekte sheep retrovirus element into the BCO2 - gene abolishes its function and leads to yellow discoloration of adipose tissue in Norwegian Spælsau (Ovis aries)

Background The accumulation of carotenoids in adipose tissue leading to yellow fat is, in sheep, a heritable recessive trait that can be attributed to a nonsense mutation in the beta-carotene oxygenase 2 (BCO2) gene. However, not all sheep breeds suffering from yellow fat have this nonsense mutation, meaning that other functional mechanisms must exist. We investigated one such breed, the Norwegian spælsau. Results In spælsau we detected an aberration in BCO2 mRNA. Nanopore sequencing of genomic DNA revealed the insertion of a 7.9 kb endogenous Jaagsiekte Sheep Retrovirus (enJSRV) sequence in the first intron of the BCO2 gene. Close examination of its cDNA revealed that the BCO2 genes first exon was spliced together with enJSRV-sequence immediately downstream of a potential -AG splice acceptor site at enJSRV position 415. The hybrid protein product consists of 29 amino acids coded by the BCO2 exon 1, one amino acid coded by the junction sequence, followed by 28 amino acids arbitrary coded for by the enJSRV-sequence, before a translation stop codon is reached. Conclusions Considering that the functional BCO2 protein consists of 575 amino acids, it is unlikely that the 58 amino acid BCO2/enJSRV hybrid protein can display any enzymatic function. The existence of this novel BCO2 allele represents an alternative functional mechanism accounting for BCO2 inactivation and is a perfect example of the potential benefits for searching for structural variants using long-read sequencing data. Supplementary Information The online version contains supplementary material available at 10.1186/s12864-021-07826-5.


Background
The yellow coloration of adipose tissue in sheep is known to be a heritable trait in different sheep breeds and is caused by the accumulation of carotenoids [1][2][3][4]. In terms of consumer preferences, "yellow fat" is an undesirable meat quality that leads to loss of product value and is therefore not wanted in meat animals.
In animals, two enzymes displaying different cleavage effects on carotenoids have been identified; BCO1 (beta-carotene oxygenase 1) which cleaves β-carotene symmetrically into two retinal molecules, and BCO2 (beta-carotene oxygenase 2) which cleaves β-carotene asymmetrically into β-apo-10′-carotenal (C27) and βionone (C13) [5][6][7][8]. An important function of these two enzymes is to execute the first step in degradation of the large amounts of carotenoids absorbed from the plant rich diet of ruminants. Functional failure of these enzymes may lead to accumulation of intact carotenoids in body tissues like fat or muscle. This effect has been illustrated by mutations influencing either transcript levels or protein coding sequence for both genes in several organisms [9][10][11][12]. Mutations in BCO2 especially seem to result in accumulation of carotenoids and yellow colouration of tissues [13,14], possibly due to its broader substrate specificity [15,16].
Earlier we have reported that a mutation that introduces a stop codon in amino acid position 66 of ovine BCO2 is associated with yellow fat in Norwegian White Sheep breed [12]. However, in the native Norwegian breed Spaelsau, the yellow fat trait is found to segregate in the absence of this mutation, suggesting that alternative functional mechanism is influencing this trait. In the present study we searched for novel BCO2 mutations in the Spaelsau breed that could explain the yellow fat phenotype and included Nanopore long range sequencing to also detect structural variants that possibly could influence this trait.

Results
Our initial experiments (sample set 1) to better understand the mechanism(s) responsible for yellow fat in Norwegian Spaelsau included performing PCR on cDNA from animals presenting white or yellow fat to detect expression of BCO2. Four primer pairs targeting BCO2 produced 4 fragments of expected size in the three individuals showing white fat, while no fragments were amplified from the individual with yellow fat (Fig. 1 Six of these (see Table S1) were found in both individuals (both white and yellow fat phenotype). These were disregarded as candidate SVs explaining the yellow fat phenotype. The single remaining SV consisted of a 7939 bp insertion at reference genome position NC_040266.1: 25,022,547 which is 730 bp downstream from the end of BCO2's first exon (Fig. 2). The inserted sequence showed high similarity (99.0% identity) and covered the full length (7941 bp) of the endogenous Jaagsiekte sheep retrovirus (enJSRV; accession MF175071.1). Compared to the virus genome reference sequence (NC_001494.1), which is 7462 bp long, the endogenous version is 479 bp longer due to longer repeats at each end of the enJSRV. The virus sequence is 92.8% identical and shows 98% coverage to the inserted enJSRV sequence. The enJSRV insertion was verified with PCR using primer pairs 7554-7555 and 7556-7557 in the two sequenced animals (20025 and 70346), which gave expected products of 686 bp and 766 bp, respectively, spanning the upstream and downstream sheep BCO2 -enJSRV junctions, respectively. The primer pair spanning the insertions site 7552-7553 did not amplify any product in the yellow fat animal (70346), while it did in the heterozygous white fat animal (20225) (Fig. 2). Complete BCO2 intron 1 sequence, including the 7939 bp enJSRV insertion is available under accession LR701838.1.
To test whether exon 1 (located upstream of the insertion point) could still be expressed in a homozygous carrier we amplified a subregion of exon 1 (primers 7550-7551) from cDNA. A band of expected size (112 bp) was produced, indicating that BCO2 exon 1 was being transcribed in individual 70346.
By combining a BCO2 exon 1 forward primer (7550) with enJSRV specific reverse primer (7555) we were able to generate a 188 bp PCR product using cDNA from the yellow fat individual (70346). Sanger sequencing this fragment between the two primer binding sites revealed it to be composed of 94 bp from BCO2 exon 1, and 55 bp of enJSRV sequence (in a non-protein coding region), beginning at the base corresponding to position 415 in the enJSRV sequence (MF175071.1) (Fig. 3).
Analysis of sample set 2 (26 individuals with an expected relatively high frequency of the yellow fat allele) showed that 10 were heterozygous for the insertion, 4 were homozygous for the insertion, and the remaining 12 lambs were homozygous wild type. The abattoir's meat-grading report noted that the 4 homozygous animals displayed yellow fat, while the remaining group had white fat (Table 1).

Discussion
In this study we investigated whether changes the BCO2 gene could be responsible for yellow fat phenotype observed in Norwegian spaelsau. By performing PCR on cDNA, we were able to confirm the expression of BCO2 in 3 individuals displaying the white fat phenotype, however no amplification products were observed in the yellow fat individual. This indicated that the yellow fat phenotype was due to downregulated expression of BCO2 mRNA, but our results did not offer insight into the mechanism underlying this.
To investigate if novel genomic rearrangements (for example structural variants) in the BCO2 region could account for the lack of detectable mRNA, we sequenced DNA from both a single yellow-and white-fat individual. This revealed the presence of a complete enJSRVsequence inserted between exon 1 and exon 2 in the BCO2 gene. By amplifying a subregion of exon 1 from cDNA, we could establish that the insertion does not appear to interfere with expression of exon 1 in a homozygous carrier, leading us to hypothesize that gene regulation is unaffected by the insertion and that the enJSRV sequence within intron 1 is interrupting the normal splicing process.
By combining a forward primer located in the BCO2 exon 1 with a reverse primer located 470 bp into the enJSRV sequence, a hybrid cDNA product was amplified consisting of 94 bp from BCO2 exon 1 and 55 bp of enJSRV sequence. If translated, this hybrid mRNA sequence would give rise to 29 amino acids of the wildtype ovine BCO2 protein chimerized to 1 amino acid coded by the junction sequence followed by 28 amino acids coded by the enJSRV-sequence, before the protein is terminated by a stop codon. Knowing that the fulllength protein consists of 575 amino acids, we theorize that it is highly unlikely that these 58 peptides can perform the functions of wildtype BCO2 enzyme in animals homozygous for the enJSRV insertion.
To strengthen our finding that the enJSRV insertion is functionally impacting transcript processing of BCO2, we genotyped 26 additional individuals (sample set 2) with an expected high frequency of the yellow fat allele. Comparing our genotypes with the reported phenotypes of these 26 individuals, we observed a genotype pattern consistent with recessive inheritance of the yellow fat phenotype.
Due to the relatively low frequency of this phenotype in the population and access to individuals, the number of individuals included in the present study is low. However, based on the available evidence we conclude that the enJSRV insertion in BCO2 (intron 1) is a strong candidate for the yellow fat phenotype in spael sheep. To the best of our knowledge, this insertion also represents the first example of a structural variant (excluding copy number variation) affecting a quality trait in this production livestock species.

Conclusions
In this study we identified an insertion of an endogenous Jaagsiekte Sheep Retrovirus element in the BCO2 gene in the Norwegian spaelsau breed by using Nanopore sequencing technology. The insertion is localised in the first intron of BCO2 and interrupts the mRNA splicing process. The resulting mRNA consists of BCO2 exon 1 sequence fused with endogenous retrovirus sequence, leaving an open reading frame of totally 58 amino acids. Given that the wild type BCO2 enzyme consists of 575 amino acids, we find it highly unlikely that the hybrid protein of 58 amino acids can have any enzyme function. Our results make it possible for the sheep farming industry to screen their breeding candidates for this variant to avoid the yellow fat phenotype. We also think this paper clearly illustrate the strength of using long-read sequencing technology to search for structural variation and it also exemplifies how such variation can directly influence gene function.

Animals
In Norwegian abattoirs, fat colour in sheep carcasses is graded as "normal" Genotypes are scored as wild type (wt) or carrier of the endogenous Jaagsiekte Sheep Retrovirus (ins). Example lanes are copied from 3 different gels used for genotyping (Fig. S2b,c,d). The primer combinations 7552/7553, 7554/7555 and 7556/7557 are shown from left to right. The lanes resulting from each of the 3 primer pairs are grouped together for the genotypes wt/wt, wt/ins and ins/ins, respectively. The size marker in lane 1 is 100 bp GeneRuler®. E The corresponding phenotypes of the genotypes shown in previous section (d). Photo: Nortura SA Norway (sample set 2). Lambs from each farm were half-sibs descending from one of two rams registered with some offspring displaying the yellow fat trait. Similarly, mothers of these lambs were also known to be related to individuals presenting yellow fat, and therefore had an increased probability of carrying at least one yellow fat allele. To be specific, 5 ewes had offspring with yellow fat, 9 ewes had a parent with offspring with yellow fat and one ewe was half-sib of individual 70203 in sample set 1. The farms contributing to both sample set 1 and 2 are members of the Norwegian Sheep Recording System.

RNA extraction and cDNA synthesis
RNA was extracted using the RNeasy®Plus Universal Mini Kit from QIAGEN according to the manufacturer's instructions. The concentration and purity of the RNA was measured using a NanoDrop 8000 (Thermo Scientific), and the integrity of RNA was measured using a Bioanalyzer 2100 (Agilent). All samples had an RNA integrity value (RIN) of at least 8.1. cDNA was produced using a SuperScript™ II Reverse Transcriptase kit from Invitrogen according to manufacturer's instructions.

PCR amplification of BCO2 cDNA
To amplify the complete cDNA sequence of BCO2, four pairs of validated primers (A: 7541-7542, B: 7543-7544, C: 7545-7546, D:7547-7548) were mixed with cDNA and AmpliTaq Gold® Polymerase (Applied Biosystems) in four separate PCR reactions generating overlapping regions of the BCO2 cDNA [12]. The PCR was performed using 10 min at 95°C, and 40 cycles of 95°C for 30 s, 57°C for 30 s and 72°C for 1.5 min. Expected fragment sizes are 692 bp, 822 bp, 491 bp and 660 bp for primer pairs A, B, C and D, respectively (Fig. 1).

Nanopore sequencing
High-molecular-weight DNA from white (20025) and yellow fat (70346) individuals was extracted from liver tissue using Genomic-tips (G/100) from Qiagen, and fragments smaller than 25Kb were progressively depleted using a Short Read Eliminator kit (Circulomics; USA). Sequencing libraries were prepared using Ligation sequencing kit (LSK109, Oxford Nanopore) with native barcoding (EXP-NBD104, Oxford Nanopore). Equal masses of libraries were combined, 300 ng was loaded into a single PromethION flowcell (FLO-PRO002). After 23 h the run was stopped (producing 36 Gb raw data) and the flow-cell regenerated according to Oxford Nanopore's nuclease flush protocol. The remaining pooled library (230 ng) was added to the same flow-cell and a new sequencing run performed (22.2 h; 22 Gb raw data).

Identification of intronic insertion in BCO2
Candidate SVs within the BCO2 gene region were filtered for presence only in individual 70346 or following expected allele pattern among the two individuals (homozygous in individual 70346 and heterozygous in individual 20025. On-line blastn [23] was used to identify the inserted sequence in BCO2 intronic region (INS_ chr15_25022547). Insertion boundaries were determined by aligning contigs of sheep_70346 genome assembly against the Oar_rambouillet_v1.0 reference genome.
Detection of BCO2 exon 1 and BCO2-JSRV hybrid cDNA in the yellow fat animal A 112 bp exon 1 fragment of BCO2 from the yellow fat individual (70346) was amplified using primers 7550/ 7551 (Table 2). A hybrid fragment consisting of sheep BCO2 exon 1 sequence and JSRV sequence were amplified from the same individual using primers 7550/7555. In both cases cDNA was used as template with the following thermo cycler program: 10 min at 95°C and 40 cycles at 95°C for 30 s, 57°C for 30 s and 72°C for 30 s using AmpliTaq Gold® Polymerase (Applied Biosystems). The 7550/7555 product was DNA sequenced by Sanger technology at Eurofins Scientific, Luxembourg, using the same two primers for the sequencing reaction.
PCR to detect the presence or absence of the endogenous virus sequence in genomic DNA To detect the presence or absence of endogenous virus sequence in genomic DNA, three pairs of primers were designed ( Table 2): one pair (7552/7553) was designed to amplify across the insertion point without any insertion, one pair (7554/7555) to amplify the junction between the BCO2 intron sequence and the upstream enJSRV region and one (7556/7557) to amplify the junction between the BCO2 intron sequence and the downstream enJSRV region. Expected fragments lengths are 221 bp, 686 bp and 766 bp, respectively. DNA was denatured for 10 min at 95°C and PCR run for 40 cycles at 95°C for 30 s, 58°C for 30 s and 72°C for 1.5 min using using AmpliTaq Gold® Polymerase (Applied Biosystems).