The objective of this study was to produce a targeted gene knock-in Bos taurus bull by direct cytoplasmic microinjection of single-cell bovine embryos, using the homology-mediated end joining (HMEJ) approach with a donor template containing the bovine SRY promoter and coding sequence and the gfp coding sequence with an SV40 promoter. Once a pregnancy was established, phenotypic sex was determined by transrectal ultrasound. Following birth, genotypic sex was determined, and on-target and off-target integration of the donor template was evaluated using short- and long-read whole genome sequencing.
Ovaries were obtained from cull Bos taurus cows of unknown breed at a local processing plant and transported in warm sterile saline at 35–37 °C. Cumulus-oocyte complexes (COCs) were aspirated from follicles using a vacuum aspiration system and cultured in groups of 50 COCs in 500 μL of BO-IVM culture media (IVF Biosciences, Falmouth, UK) for 18 h at 38.5 °C in a humidified 5% CO2 incubator. COCs were then washed and transferred in groups of 25 to 60 μL drops of SOF-IVF media with 2 × 10⁶ sperm per mL and covered in mineral oil. Sperm and COCs were incubated for 6 h at 38.5 °C in a humidified 5% CO2 incubator. Presumptive zygotes were then denuded by light vortexing and transferred to 25 μL of BO-IVC culture media (IVF Biosciences, Falmouth, UK). Embryos were cultured for 7 days at 38.5 °C in a humidified atmosphere of 5% CO2, 5% O2, and 90% N2.
Guide-RNA and donor plasmid construction
The guide-RNA (gRNA) targeting the H11 safe harbor locus on bovine chromosome 17 was designed as previously described (TAGCCATAAGACTACCTAT) and commercially synthesized (Synthego, Redwood City, CA, USA). The donor plasmid construct was designed as previously described and contained the endogenous bovine sex-determining region Y (SRY) promoter and coding sequence, the green fluorescent protein (GFP) coding sequence and SV40 promoter, and 1 kb homology arms flanked on either side by the CRISPR target site (Fig. 1a). Each piece was commercially synthesized (GeneWiz, LLC, South Plainfield, NJ, USA) and inserted into a pUC19 plasmid using Gibson Assembly Master Mix (New England Biolabs, Inc., Ipswich, MA). Plasmids were clonally amplified using 5-alpha Chemically Competent E. coli (High Efficiency) (New England Biolabs, Inc., Ipswich, MA) and extracted using the EndoFree Plasmid Maxi Kit (Qiagen, Inc., Valencia, CA).
Embryo injection and evaluation
Approximately 200 in vitro fertilized bovine zygotes were injected approximately 6 h post insemination using laser-assisted cytoplasmic injection with 6 pL of solution containing 67 ng/μL of synthetic gRNA, 167 ng/μL of Cas9 protein (PNA Bio, Inc., Newbury Park, CA) and 133 ng/μL of donor plasmid. Embryos were then cultured in BO-IVC culture media (IVF Biosciences, Falmouth, UK) for 7 days at 38.5 °C in a humidified atmosphere of 5% CO2, 5% O2, and 90% N2. On day seven, embryos were scored for the developmental stage reached, and 22 high grade seven blastocysts were selected and analyzed by fluorescence microscopy on a Nikon Eclipse TE2000-U advanced inverted epifluorescence microscope at 20X magnification using a filter specific for eGFP fluorescence. Fluorescent images of GFP-expressing blastocysts were taken using an Echo Revolve 4 upright, inverted, brightfield microscope at 10X magnification using transillumination for bright field and FITC for GFP expression.
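As a rough check on the injected dose, the mass of each component delivered per zygote follows from the injection volume and the stated concentrations. The sketch below assumes the nominal 6 pL is delivered exactly; the function and variable names are illustrative and not part of the original protocol.

```python
# Per-zygote injected mass from the nominal injection volume and the
# stated concentrations. Assumes exactly 6 pL is delivered per zygote.
INJECTION_VOLUME_PL = 6.0

CONCENTRATIONS_NG_PER_UL = {
    "gRNA": 67.0,
    "Cas9 protein": 167.0,
    "donor plasmid": 133.0,
}


def injected_mass_pg(conc_ng_per_ul: float, volume_pl: float) -> float:
    """Return the injected mass in picograms.

    1 ng/uL is equivalent to 0.001 pg/pL, so mass (pg) = conc * volume * 1e-3.
    """
    return conc_ng_per_ul * volume_pl * 1e-3


for component, conc in CONCENTRATIONS_NG_PER_UL.items():
    mass = injected_mass_pg(conc, INJECTION_VOLUME_PL)
    print(f"{component}: {mass:.3f} pg per zygote")
```

Under this assumption each zygote receives roughly 0.4 pg of gRNA, 1.0 pg of Cas9 protein and 0.8 pg of donor plasmid.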
Estrus synchronization was initiated in 15 nulliparous heifers from the Department of Animal Science, University of California, Davis commercial cow herd by inserting an intravaginal progesterone device (1.38 g; Eazi-Breed CIDR; Zoetis) and intramuscular administration of gonadorelin (100 mcg; Factrel; Zoetis) on day 0 (16 days prior to transfer). This number of recipients was chosen with the objective of obtaining at least two SRY knock-in bull calves, assuming a 60% response to synchronization and an expectation that 27% of recipients receiving conventional in vitro produced (IVP) embryos would result in a live calf. On day 7, the CIDR was removed and intramuscular prostaglandin (25 mg; Lutalyse; Zoetis) was administered. Recipients were monitored for estrus, and a second intramuscular dose of gonadorelin (100 mcg; Factrel; Zoetis) was administered on day 9. Prior to transfer on day 16, recipient response to synchronization was confirmed via detection of an appropriate corpus luteum with transrectal ultrasonography, and nine recipients were deemed suitable for embryo transfer. Prior to transfer, each recipient received a caudal epidural using 100 mg 2% lidocaine (Xylocaine; Fresenius). A total of nine embryos were transferred via a non-surgical, transcervical technique, with each GFP-positive blastocyst being deposited into the uterine horn ipsilateral to the corpus luteum. Pregnancy was diagnosed on day 35 of embryonic development by transrectal ultrasonography (5.0 MHz linear probe; EVO Ibex, E.I. Medical Imaging), and sex was likewise determined at day 68 of development. A single knock-in bull calf was born in April 2020, and was monitored and maintained at the University of California, Davis Beef Cattle Barn under normal husbandry conditions. The bull calf remains at this facility while he develops to sexual maturity.
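The recipient number stated above can be reproduced with simple expected-value arithmetic. The sketch below is illustrative only: it treats the two stated rates as independent point estimates and applies the 27% live-calf expectation for conventional IVP embryos directly, which is a simplifying assumption.

```python
# Expected-value arithmetic behind the choice of 15 recipients.
# Both rates are the planning assumptions stated in the text.
SYNCHRONIZED_HEIFERS = 15
EXPECTED_SYNC_RESPONSE = 0.60   # fraction responding to synchronization
EXPECTED_LIVE_CALF_RATE = 0.27  # live calves per recipient receiving an embryo


def expected_outcomes(heifers: int, response: float, calf_rate: float):
    """Return (expected suitable recipients, expected live calves)."""
    recipients = heifers * response
    calves = recipients * calf_rate
    return recipients, calves


recipients, calves = expected_outcomes(
    SYNCHRONIZED_HEIFERS, EXPECTED_SYNC_RESPONSE, EXPECTED_LIVE_CALF_RATE
)
print(f"Expected suitable recipients: {recipients:.1f}")
print(f"Expected live calves: {calves:.2f}")
```

With these inputs the expectation is nine suitable recipients, matching the number actually transferred, and about 2.4 live calves, consistent with the stated target of at least two.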
DNA extraction and PCR analysis
Whole blood (5 mL) was collected in EDTA vacutainers (Becton Dickinson) by a veterinarian from the UC Davis veterinary hospital large animal clinic. DNA was extracted either from the buffy coat using the DNeasy Blood and Tissue kit (Qiagen, Inc., Valencia, CA, USA) or from whole blood using red blood cell lysis, SDS/Proteinase K cell lysis, phenol/chloroform/isoamyl alcohol clean up and ethanol precipitation. The placental cotyledon was collected and cut into small pieces. DNA was then extracted from 25 mg of placental cotyledon using the DNeasy Blood and Tissue kit. An ear punch biopsy was taken from the bull calf and used to establish a fibroblast line in culture. Cells were passaged twice in DMEM media (Thermo Fisher, Waltham, MA, USA) containing 10% fetal bovine serum (Thermo Fisher, Waltham, MA, USA), 1% Glutamax (Thermo Fisher, Waltham, MA, USA) and 1% penicillin/streptomycin (Thermo Fisher, Waltham, MA, USA). After the second passage, cells were collected and DNA was extracted using the DNeasy Blood and Tissue kit (Qiagen Inc., Valencia, CA, USA). The target regions were amplified by polymerase chain reaction (PCR) with primers (F – CCCCAGTGTTGTGCATGTAG; R – GTGAATGCCACTGCTGTGTT) for the H11 locus and primers (F – AGGAAGCCAGGAAAGTAA; R – CATCCACGTTCTAAGTCTC) for genotypic sexing. The knock-in PCR was performed on a SimpliAmp Thermal Cycler (Applied Biosystems, Foster City, California, USA) with 12.5 μL LongAmp Taq 2X Master Mix (New England Biolabs, Inc., Ipswich, MA), 9.5 μL of H2O, 1 μL of each primer at 10 mM and 1 μL of DNA for 5 min at 94 °C, 35 cycles of 30 s at 94 °C, 30 s at 60 °C and 4 min at 65 °C, followed by 15 min at 65 °C.
The sexing PCR was performed on a SimpliAmp Thermal Cycler (Applied Biosystems, Foster City, California, USA) with 12.5 μL GoTaq Green Master Mix (Promega Biosciences, LLC, San Luis Obispo, CA, USA), 9.5 μL of H2O, 1 μL of each primer at 10 mM and 1 μL of DNA for 5 min at 94 °C, 35 cycles of 30 s at 94 °C, 30 s at 55 °C and 30 s at 72 °C, followed by 5 min at 72 °C. Products were visualized on a 1% agarose gel using a ChemiDoc-ItTS2 Imager (UVP, LLC, Upland, CA), purified using the QIAquick Gel Extraction Kit (Qiagen, Inc., Valencia, CA) and Sanger sequenced (GeneWiz, South Plainfield, NJ).
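For planning purposes, the reaction volume and programmed hold time of the two PCR protocols can be tallied from the stated components and cycling conditions. This is a minimal sketch with illustrative names; it counts only programmed hold times and ignores thermal cycler ramping, so actual run times will be longer.

```python
# Tally reaction volume and programmed hold time for the two PCR protocols.
# Ramp time between temperatures is not included (an assumption).

def reaction_volume_ul(master_mix, water, primer_each, dna, n_primers=2):
    """Total reaction volume in microliters."""
    return master_mix + water + n_primers * primer_each + dna


def program_minutes(initial_s, cycles, step_durations_s, final_s):
    """Programmed hold time in minutes for one thermal cycling program."""
    return (initial_s + cycles * sum(step_durations_s) + final_s) / 60


volume = reaction_volume_ul(12.5, 9.5, 1.0, 1.0)

# Knock-in PCR: 5 min, 35 x (30 s, 30 s, 4 min), 15 min final extension.
knock_in_min = program_minutes(300, 35, [30, 30, 240], 900)

# Sexing PCR: 5 min, 35 x (30 s, 30 s, 30 s), 5 min final extension.
sexing_min = program_minutes(300, 35, [30, 30, 30], 300)

print(f"Reaction volume: {volume} uL")
print(f"Knock-in PCR hold time: {knock_in_min} min")
print(f"Sexing PCR hold time: {sexing_min} min")
```

Both reactions total 25 μL; the long extension step of the knock-in PCR accounts for most of the difference in run time between the two programs.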
Whole genome sequencing and mapping
Genomic DNA extracted from the buffy coat was submitted to Novogene for library construction and whole genome sequencing. Samples were sequenced on an Illumina NovaSeq 6000 sequencer with paired-end, 150 bp reads. Raw reads were aligned to the donor plasmid backbone, as well as the predicted knock-in map, using Bowtie2-default v220.127.116.11. SAM files were converted to BAM files, sorted and indexed using SAMtools v1.12.0. Depth was called at each base along the alignment using SAMtools depth v1.12.0.
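The per-base depth output can be summarized per reference sequence with a few lines of Python. The sketch below assumes the standard three-column, tab-separated output of `samtools depth` (reference name, 1-based position, depth); the function name, reference names and toy input are hypothetical and are not data from this study.

```python
from collections import defaultdict
from typing import Dict, Iterable


def summarize_depth(lines: Iterable[str]) -> Dict[str, Dict[str, float]]:
    """Summarize `samtools depth` output per reference sequence.

    Each line is expected to be: reference <TAB> position <TAB> depth.
    """
    totals = defaultdict(int)
    counts = defaultdict(int)
    maxima = defaultdict(int)
    for line in lines:
        fields = line.rstrip("\n").split("\t")
        if len(fields) < 3:
            continue  # skip blank or malformed lines
        ref, depth = fields[0], int(fields[2])
        totals[ref] += depth
        counts[ref] += 1
        maxima[ref] = max(maxima[ref], depth)
    return {
        ref: {
            "positions": counts[ref],
            "mean_depth": totals[ref] / counts[ref],
            "max_depth": maxima[ref],
        }
        for ref in counts
    }


# Toy, made-up input for illustration only.
toy_lines = [
    "pUC19_backbone\t1\t4",
    "pUC19_backbone\t2\t6",
    "knock_in_map\t1\t30",
]
summary = summarize_depth(toy_lines)
print(summary)
```

Note that `samtools depth` omits zero-coverage positions unless run with `-a`, so the mean here is taken over reported positions only.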
Assessment of long reads
Genomic DNA extracted from whole blood was submitted for PacBio long-read sequencing in a Sequel II SMRTcell (GeneWiz, USA). Input BAM files were converted into FASTA files, then assembly-stats (https://github.com/sanger-pathogens/assembly-stats.git) was used to assess the size of the reads. There were 19,709,419 reads with a total length of 292,074,095,630 bp, which is ~97× coverage of the bovine genome. The average read length was 14,819.01 bp, while the longest read was 249,262 bp.
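The reported read statistics can be cross-checked with simple arithmetic. The sketch below assumes a bovine genome size of about 3.0 Gb, a round figure chosen here because it reproduces the stated ~97× coverage; the genome size actually used in the calculation is not given in the text.

```python
# Cross-check of the long-read summary statistics reported above.
TOTAL_READS = 19_709_419
TOTAL_BASES = 292_074_095_630
ASSUMED_GENOME_SIZE_BP = 3.0e9  # assumption, not stated in the text

mean_read_length = TOTAL_BASES / TOTAL_READS
coverage = TOTAL_BASES / ASSUMED_GENOME_SIZE_BP

print(f"Mean read length: {mean_read_length:,.2f} bp")
print(f"Approximate coverage: {coverage:.1f}x")
```

A smaller assumed genome size would give a proportionally higher coverage estimate.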
Identification of candidate long reads
To identify the reads that had any similarity to the possible inserts, cloning plasmid and/or insertion locus on chromosome 17, a bait file was generated containing:
The wild-type locus on Chr17 (1 kb before and after the break point).
The pUC19 plasmid backbone.
The originally proposed knock-in sequence including the homology arms (complete template).
The 26 bp allele detected by Sanger sequencing.
Alignment of all input long reads against the bait file using BLASR (5.3.3-SL-release-8.0.0) recruited 314 reads.
Identification of possible structures of edited alleles
To determine the structure of any allele connecting the introduced sequences to the bovine chromosomes, a new reference was generated that included the bovine reference ARS-UCD1.2 together with the pUC19 plasmid and complete template sequences. All candidate reads were aligned against the new reference. To enable better delineation of the allele structure, each read was fragmented into 1 kb subreads with 0.5 kb overlap. Read alignments were inspected manually and classified into groups that support each allele structure. All suggested structures supported by at least two reads from more than one cluster were considered for further analyses.
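The fragmentation step can be illustrated as a sliding window of 1 kb advanced in 0.5 kb steps. This is a minimal sketch rather than the tool actually used; the function name is hypothetical, and the handling of the final, shorter fragment is an assumption because the text does not describe it.

```python
from typing import List, Tuple


def fragment_read(sequence: str, window: int = 1000,
                  step: int = 500) -> List[Tuple[int, str]]:
    """Split a read into overlapping fragments.

    Returns (start_offset, fragment) pairs. Consecutive fragments overlap
    by window - step bases. The last fragment may be shorter than `window`
    (an assumption; the original handling is not described).
    """
    fragments = []
    start = 0
    while start < len(sequence):
        fragments.append((start, sequence[start:start + window]))
        if start + window >= len(sequence):
            break  # this fragment already reaches the end of the read
        start += step
    return fragments


# A 2.5 kb toy read yields fragments starting at 0, 500, 1000 and 1500.
toy_read = "A" * 2500
for offset, fragment in fragment_read(toy_read):
    print(offset, len(fragment))
```

Keeping the start offset with each fragment lets alignments of individual fragments be mapped back to their position in the parent read.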
Identification of possible junctional sequences between the allele blocks
The last 50 nucleotides of each edge of the plasmid and the proposed complete template were converted into overlapping k-mers of 25 bp length, so that each 50 bp edge was transformed into 26 k-mers. Any short read containing at least one of these k-mers or its reverse complement was selected. The k-mer-selected reads (660 reads covering the plasmid edges and 3246 reads covering the complete template edges) were error trimmed using the khmer 2 software package. Trimmed reads were assembled into contigs using SSAKE. Contigs were annotated by BLAST alignment against the plasmid and complete template sequences and examined manually to identify the novel junction sequences.
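The k-mer baiting step can be sketched as follows. A sequence of length 50 contains 50 − 25 + 1 = 26 overlapping 25-mers, which is where the figure of 26 k-mers per edge comes from. The function names and the toy edge sequence are hypothetical, and the code assumes uppercase A/C/G/T/N input; it is not the script used in the study.

```python
from typing import Iterable, List, Set

_COMPLEMENT = str.maketrans("ACGTN", "TGCAN")


def reverse_complement(seq: str) -> str:
    """Reverse complement of an uppercase DNA sequence."""
    return seq.translate(_COMPLEMENT)[::-1]


def edge_kmers(edge: str, k: int = 25) -> List[str]:
    """All overlapping k-mers of an edge sequence, in order."""
    return [edge[i:i + k] for i in range(len(edge) - k + 1)]


def build_bait_set(edges: Iterable[str], k: int = 25) -> Set[str]:
    """k-mers from every edge plus their reverse complements."""
    baits = set()
    for edge in edges:
        for kmer in edge_kmers(edge, k):
            baits.add(kmer)
            baits.add(reverse_complement(kmer))
    return baits


def read_matches(read: str, baits: Set[str], k: int = 25) -> bool:
    """True if the read contains at least one bait k-mer."""
    return any(read[i:i + k] in baits for i in range(len(read) - k + 1))


# Toy 50 bp edge for illustration only (not a sequence from this study).
toy_edge = "ACGT" * 12 + "AC"
baits = build_bait_set([toy_edge])
print(len(edge_kmers(toy_edge)))  # number of 25-mers in a 50 bp edge
```

Storing the baits in a set keeps each lookup constant time, which matters when screening the full short-read data set.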
Confirmation of the allele sequences
To further confirm the exact sequence of the three putative alleles, long reads covering each substructure of the proposed alleles were subjected to multiple sequence alignment using MAFFT (http://europepmc.org/article/MED/30976793). To improve the quality of the alignment, most of the wild-type sequences were trimmed from the reads. The aligned sequences were used to generate a consensus sequence using the cons tool of the EMBOSS package. The consensus sequence was re-aligned using BLAST to the nr database or the proposed sequences. The full sequences of the three insertion alleles and the surrounding bovine genomic sequence are included in Supplementary Materials.
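To illustrate the idea of deriving a consensus from aligned reads, the sketch below applies a simple per-column majority vote. This is a swapped-in simplification, not the EMBOSS cons algorithm, which applies its own scoring and plurality settings that are not reproduced here. The function name and the toy alignment are hypothetical.

```python
from collections import Counter
from typing import Sequence


def majority_consensus(aligned: Sequence[str], gap: str = "-",
                       ambiguous: str = "N") -> str:
    """Per-column majority-vote consensus of equal-length aligned sequences.

    Columns where a gap is the most common symbol are dropped; columns
    with a tie for the most common symbol are reported as `ambiguous`.
    """
    if not aligned:
        raise ValueError("no sequences given")
    if len({len(seq) for seq in aligned}) != 1:
        raise ValueError("aligned sequences must have equal length")

    consensus = []
    for column in zip(*aligned):
        ranked = Counter(base.upper() for base in column).most_common()
        top_symbol, top_count = ranked[0]
        if len(ranked) > 1 and ranked[1][1] == top_count:
            consensus.append(ambiguous)  # tie between symbols
        elif top_symbol != gap:
            consensus.append(top_symbol)
    return "".join(consensus)


# Toy alignment for illustration only.
toy_alignment = ["ACGT-A", "ACGTTA", "ACCT-A"]
print(majority_consensus(toy_alignment))
```

A majority vote of this kind is most reliable at high read depth, since random long-read errors rarely recur at the same alignment column.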
Fluorescence in situ hybridization
The fibroblast line derived from an ear punch biopsy taken from the bull calf was plated in a T75 flask in DMEM media (Thermo Fisher, Waltham, MA, USA) containing 10% fetal bovine serum (Thermo Fisher, Waltham, MA, USA), 1% Glutamax (Thermo Fisher, Waltham, MA, USA) and 1% penicillin/streptomycin (Thermo Fisher, Waltham, MA, USA). Once the cells reached ~80% confluency, 1.25% Gibco KaryoMAX Colcemid Solution (Thermo Fisher, Waltham, MA, USA) was added to the media and the cells were incubated at 37 °C and 5% CO2 for 1 h. Cells were then collected, resuspended in 10 mL of 0.56% KCl hypotonic solution and incubated at 37 °C for 10 min. Cells were then fixed in a 3:1 methanol to glacial acetic acid solution at 4 °C for 1 week. Once fixed, three to four drops of cell solution were applied to slides at a 45° angle and allowed to air dry. Slides were then hardened, i.e., aged at −20 °C for 1 week. Slides were then used for fluorescence in situ hybridization as previously described [38, 39] using the Roche DIG-Nick Translation Kit and Anti-Digoxigenin-Fluorescein, Fab fragments (Roche Applied Science, Upper Bavaria, Germany) to label the plasmid containing SRY. The BTA 17 chromosome-specific BAC (CHORI 371i17) with a centromere-proximal location (15,482,193–15,677,551) was labelled using Red dUTP (Abbott Molecular, Des Plaines, IL) and a direct Nick Translation Kit (Abbott Molecular). Chromosomes were visualized using Vectashield mounting media with DAPI (Abcam, Cambridge, MA, USA). Mitotic metaphase chromosomes were examined, and images were collected using an Olympus BX41 epifluorescence microscope equipped with an automatic filter wheel (Chroma Technology 82000, DAPI/FITC/TRITC filter set), an X-cite 120 Series metal-halide fiber optic lamp and Applied Imaging software (CytoVision version 7.4 GENUS, Leica Biosystems).