Experimental fish and sperm sample collection and preparation for DNA extraction
The fish used for this study were part of the National Program for Genetic Improvement and Selective Breeding for the Hybrid Striped Bass Industry research program at the Harry K. Dupree Stuttgart National Aquaculture Research Center (HKD-SNARC), Agricultural Research Service (ARS), US Department of Agriculture, Stuttgart, Arkansas, and the North Carolina State University (NCSU) Pamlico Aquaculture Field Laboratory (NCSU-PAFL), Aurora, North Carolina. Animal care and experimental protocols were approved by the HKD-SNARC Institutional Animal Care and Use Committee (IACUC) and conformed to ARS Policies and Procedures 130.4 and 635.1. Broodstock for this study consisted of domesticated four-year-old female white bass (WB) and domesticated five-year-old male striped bass (SB). Domesticated broodstock were obtained as fingerlings from the NCSU-PAFL and reared to maturity at the HKD-SNARC. The domesticated WB line, originally established at the NCSU-PAFL by outcrossing WB from Lake Erie with fish from the Tennessee River, has been domesticated for over 8 generations. The domesticated SB line was originally established at the NCSU-PAFL by outcrossing SB from six stocks (Canadian, Hudson River, Roanoke River, Chesapeake Bay, Santee-Cooper Reservoir, and Florida-Gulf of Mexico) and has been domesticated for over 6 generations.
Broodstock were conditioned in outdoor 1-acre ponds and brought into cold bank facilities one week prior to spawning. Female WB were given 75 μg GnRHa Ovaplants® (Syndel Laboratories, Cat. No. 13460), injected into the dorsal musculature. Fish to be spawned were chosen arbitrarily from those that were conditioned and available at the time of spawning. All fish were strip spawned, and the eggs from each WB female were divided between two labeled spawning bowls, each of which was fertilized with sperm from a different SB male to produce conspecific hybrid striped bass (one male SB for each WB bowl, with eggs from the same female serving as the control for female fertility). Well water was added to the egg/sperm mixture to activate the sperm. The eggs and sperm were allowed to stand in the well water for approximately 2 min and were then gently poured into McDonald hatching jars, where they were held until fertile eggs began to hatch (approximately 36-40 h post-fertilization). The fertilization rate was determined by examining 200-300 eggs, approximately two hours post-fertilization, for signs of development using a dissecting light microscope (10X). Eggs that showed no development were assumed to be unfertilized or dead (arrested in embryonic development).
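The fertilization rate described above is the percentage of examined eggs that show signs of development. A minimal sketch of that calculation follows; the function name and the example counts are hypothetical and are not taken from the study data.

```python
def fertilization_rate(developing: int, examined: int) -> float:
    """Return the percentage of examined eggs that show development.

    developing: eggs with visible signs of development
    examined:   total eggs inspected (200-300 per bowl in the protocol)
    """
    if examined <= 0:
        raise ValueError("at least one egg must be examined")
    if not 0 <= developing <= examined:
        raise ValueError("developing must lie between 0 and examined")
    return 100.0 * developing / examined


# Hypothetical counts: 188 developing eggs out of 200 examined.
rate = fertilization_rate(188, 200)
print(f"{rate:.1f}%")
```

Eggs scored as showing no development are counted as unfertilized or dead, so they appear only in the denominator.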
We selected four representative high-fertility (HF) male striped bass (animal ID: 1922, 1927, 1929 and 1930, with fertility rates of 94%, 87%, 76%, and 76%, respectively) and three representative sub-fertility (SF) male striped bass (animal ID: 1916, 1917 and 1938, with fertility rates of 11%, 0%, and 0%, respectively) from the population of fish spawned. The white bass egg duplication allowed us to de-select any males that were part of a given spawning where no eggs were fertilized from either WB bowl (i.e., presumption of poor egg quality or female infertility). All fish were anaesthetized to initial loss of equilibrium using 30 ppm Tricaine-S (Western Chemical, Ferndale, WA, USA). Gametes were stripped using light abdominal pressure. To minimize stress, fish were then placed in a recovery tank of well-oxygenated water containing 1% NaCl before being returned to their culture tanks. Aliquots of striped bass semen were collected at the time of spawning for DNA extraction and sequencing and were snap frozen in liquid nitrogen. Briefly, approximately 1 mL of semen was put into a sterile conical tube and diluted with isosmotic (350 mOsmol/kg) Striped Bass Extender to provide an extended striped bass semen mixture containing approximately one billion sperm cells per 100 μL of extended semen. One mL of this mixture was pipetted into a sterile, RNase-free tube and immediately plunged into liquid nitrogen. It was kept frozen until it was thawed to extract and purify the DNA from the striped bass sperm. Care was taken to avoid any contact of water with semen prior to collection and freezing.
DNA extraction and MBD-Seq library construction
The MBD-Seq method was employed to identify methylated DNA regions in striped bass sperm. Briefly, we extracted the genomic DNA from sperm using the OmniPrep™ for Tissue kit (G Biosciences, Cat. No. 786-395) and purified the DNA samples using the MinElute PCR Purification Kit (Qiagen, Cat. No. 28006). The purified DNA concentration was measured by the Qubit dsDNA Broad-Range Assay (Invitrogen, Cat. No. Q32850), and each sample was adjusted to 0.1 μg/μl in a final volume of 55 μl and then sheared into 300–500 bp fragments. The Methyl Cap Kit (Diagenode, Cat. No. C02020010) was used to obtain the methylated DNA, according to the manufacturer’s instructions.
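Adjusting each sample to 0.1 μg/μl in 55 μl is a standard C1·V1 = C2·V2 dilution, which corresponds to 5.5 μg of DNA per sample. The sketch below shows the arithmetic. The function name and the stock concentration in the example are hypothetical, since the measured Qubit values are not reported here.

```python
def dilution_volumes(stock_conc, target_conc=0.1, final_volume=55.0):
    """Return (stock_ul, diluent_ul) needed to reach the target concentration.

    Concentrations are in ug/ul and volumes in ul; uses C1*V1 = C2*V2.
    """
    if stock_conc <= 0:
        raise ValueError("stock concentration must be positive")
    if stock_conc < target_conc:
        raise ValueError("stock is more dilute than the target")
    stock_ul = target_conc * final_volume / stock_conc
    return stock_ul, final_volume - stock_ul


# Hypothetical stock measured at 0.5 ug/ul.
stock_ul, diluent_ul = dilution_volumes(0.5)
print(f"{stock_ul:.1f} ul stock + {diluent_ul:.1f} ul diluent")
```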
The MBD-Seq library was constructed as follows. The NEBNext End Repair Module (NEB, Cat. No. E6050S) was used for end repair of the fragmented, methylated DNA. After the 3′ poly “A” was added, a pair of Solexa adaptors (Illumina) was ligated to the repaired ends using T4 ligase (Promega, Cat. No. M1801). The ligated products were then electrophoresed through 2% agarose gels, and the fragments, ranging from 200 to 500 bp, were purified using a Quick Gel Extraction Kit (QIAGEN, Cat. No. 28704). We then enriched the purified DNA templates through PCR (98 °C for 30 s; 22 cycles of 98 °C for 10 s, 60 °C for 30 s, and 72 °C for 10 s; and a final extension at 72 °C for 5 min), purified the PCR products using a MinElute PCR Purification Kit (QIAGEN, Cat. No. 28004), and then measured the concentration of the library using the Qubit Assay (Life Technology, Cat. No. Q32850). The MBD-Seq library was sequenced on the Solexa 1G Genome Analyzer (Illumina) platform following the specifications provided by the manufacturer.
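For reference, the enrichment PCR program can be written down as a small data structure. The sketch below sums only the programmed hold times. Ramp times between temperatures depend on the thermocycler and are ignored, and all names are illustrative.

```python
# (temperature in deg C, hold time in seconds), as stated in the protocol
INITIAL_DENATURATION = (98, 30)
CYCLE_STEPS = [(98, 10), (60, 30), (72, 10)]  # denature, anneal, extend
N_CYCLES = 22
FINAL_EXTENSION = (72, 5 * 60)


def total_hold_seconds(initial, cycle_steps, n_cycles, final):
    """Sum the programmed hold times, excluding instrument ramp times."""
    cycle_time = sum(seconds for _, seconds in cycle_steps)
    return initial[1] + n_cycles * cycle_time + final[1]


total = total_hold_seconds(
    INITIAL_DENATURATION, CYCLE_STEPS, N_CYCLES, FINAL_EXTENSION
)
print(f"{total} s of programmed holds ({total / 60:.1f} min)")
```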
MBD-Seq data analysis
The quality of the raw short reads was evaluated with FastQC, a software tool that provides a thorough examination of the reads. Then, Bowtie was used to align the reads to the reference genome established and maintained by the Reading laboratory: http://appliedecology.cals.ncsu.edu/striped-bass-genome-project/data-downloads/. The draft striped bass genome sequence assembly is 585 million bases (585 Mb) and comprises 35,010 contigs, with a GC content of 40.0%, similar to that of the congeneric white bass (39.5%) (Reading et al., unpublished data). The assembly has a CEGMA completeness score of 86.29% (partial) and 70.56% (complete), indicating that the complete genome is probably 600-700 Mb in size. The MAKER annotation pipeline was used to identify 27,485 protein coding genes.
During the quality filtering step, we trimmed the first 15 bases of each short read to maintain a high sequence quality score, which resulted in 35 bp tags. For our analysis, a combination of procedures available in SAMtools and BEDtools was applied for data filtration and format conversion.
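The trimming step can be illustrated on a single FASTQ record. This is a minimal sketch, not the tool used in the study, which is not named. It assumes 50-base reads, as implied by the 35 bp tags, and the record contents are invented.

```python
def trim_read(record, n_trim=15):
    """Drop the first n_trim bases (and their qualities) from a FASTQ record.

    record is a (header, sequence, separator, quality) tuple of strings.
    """
    header, seq, sep, qual = record
    if len(seq) != len(qual):
        raise ValueError("sequence and quality strings differ in length")
    return header, seq[n_trim:], sep, qual[n_trim:]


# Invented 50-base read with a uniform quality string.
read = ("@read_1", "ACGTN" * 10, "+", "I" * 50)
trimmed = trim_read(read)
print(len(trimmed[1]))  # length of the remaining tag
```

The sequence and quality strings are trimmed together so that each remaining base keeps its own quality score.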
Model-based Analysis of ChIP-Seq (MACS) was used for peak identification in each sample. This software uses a dynamic Poisson distribution to capture local bias effectively, improving the reliability of the prediction. After a contrast between conditions was created, the R package DiffBind was used to run an edgeR analysis with a false discovery rate (FDR) < 0.1 to call the differentially methylated regions (DMRs). Then, the ChIPpeakAnno package was used for the genomic annotation of the previously identified DMRs. This software provides information about the distance, relative position, and overlaps for the queried feature.
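The idea behind the dynamic Poisson model can be sketched as follows: the expected read count for a candidate region is taken as the maximum of a genome-wide background rate and rates estimated from windows of increasing size around the region, and the observed count is then tested against that local rate. The code below is a simplified illustration under those assumptions, not the MACS implementation. The window sizes, counts, and function names are hypothetical.

```python
import math


def poisson_upper_tail(k, lam):
    """Return P(X >= k) for X ~ Poisson(lam), by direct summation."""
    if k <= 0:
        return 1.0
    term = math.exp(-lam)  # P(X = 0)
    cdf = term
    for i in range(1, k):
        term *= lam / i    # P(X = i) from P(X = i - 1)
        cdf += term
    return max(0.0, 1.0 - cdf)


def local_lambda(background_lambda, window_counts, region_width):
    """Largest expected count among the background and the local windows.

    window_counts maps a window width in bp to the reads observed in it;
    each count is rescaled to the width of the candidate region.
    """
    scaled = [n * region_width / width for width, n in window_counts.items()]
    return max([background_lambda] + scaled)


# Hypothetical 200 bp region with 10 reads in a locally read-rich area.
lam = local_lambda(2.0, {1000: 30, 5000: 50, 10000: 60}, region_width=200)
p_background = poisson_upper_tail(10, 2.0)
p_local = poisson_upper_tail(10, lam)
print(lam, p_background, p_local)
```

Using the larger local rate makes the test more conservative in regions where reads pile up for reasons unrelated to enrichment. Direct summation loses precision for very small p-values, so a production tool would work on a log scale.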
Pyrosequencing for MBD-Seq validation
Equal amounts of DNA from the samples of each group were pooled together, serving as the template for the bisulfite conversion and the bisulfite PCR. Sodium bisulfite conversion reagents were used to treat 500 ng of each DNA pool (MethylEdge™ Bisulfite Conversion System, Promega). The DMRs for validation were randomly selected from the bioinformatics analysis results. The PCR primers were designed with PSQ Assay Design software (Biotage, Sweden) and are shown in Additional file 1: Table S1.
Each pyrosequencing reaction contained 10-20 ng of bisulfite-converted DNA, and pyrosequencing methylation analysis was performed using the Pyro Q-CpG system (PyroMark ID, Biotage). Briefly, the PCR products were bound to streptavidin-coated Sepharose beads (GE Healthcare Bio-Sciences AB, Sweden). Then, the beads were purified in 70% ethanol for 5 s, denatured in Denature buffer (Biotage) for 5 s, and washed with washing buffer (Biotage) for 10 s in the pyrosequencing Vacuum Prep Tool (Biotage). Next, 0.5 mM sequence primer was annealed to the purified single-stranded PCR product, and pyrosequencing was carried out using the Pyro Q-CpG system. The methylation level was expressed for each cytosine locus on CpG sites as the percentage of mC/(mC + C), where “mC” is methylated cytosine and “C” is unmethylated cytosine. Non-CpG cytosine residues were used as internal controls to verify bisulfite conversion. Finally, the DNA methylation level was obtained.
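The per-site methylation level defined above can be computed as in the following minimal sketch. The function name and the signal values are hypothetical. In practice the inputs would be the signals that the pyrosequencing software assigns to methylated and unmethylated cytosine at each CpG site.

```python
def methylation_percent(mc_signal, c_signal):
    """Return 100 * mC / (mC + C) for one CpG site."""
    if mc_signal < 0 or c_signal < 0:
        raise ValueError("signals must be non-negative")
    total = mc_signal + c_signal
    if total == 0:
        raise ValueError("no signal at this site")
    return 100.0 * mc_signal / total


# Hypothetical signals for three CpG sites in one amplicon.
sites = [(75.0, 25.0), (40.0, 60.0), (90.0, 10.0)]
levels = [methylation_percent(mc, c) for mc, c in sites]
print(levels)
```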
DMR gene annotation enrichment analysis
The positional information of the DMRs was used to identify neighboring predicted genes on each of the genome assembly contigs. A list of the predicted genes was then compiled, and the identities of the informative loci were used to retrieve official gene symbols, which were submitted to DAVID Gene Functional Classification for Gene Ontology (GO) class enrichment analysis. Default parameters were used for DAVID, with the complete gene list serving as the background. Functional characteristics of the annotated genes associated with DMRs were also evaluated, focusing mainly on the GO biological process, cellular component, and molecular function terms that were enriched overall in the gene set.
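The neighbor lookup can be sketched as a nearest-interval search restricted to the contig that carries the DMR. This is an illustrative simplification, not the procedure or software used in the study. The coordinates, gene names, and function names below are invented.

```python
from typing import NamedTuple, Optional, Sequence


class Interval(NamedTuple):
    contig: str
    start: int
    end: int
    name: str


def gap(a: Interval, b: Interval) -> int:
    """Distance in bp between two intervals; 0 if they overlap."""
    return max(0, a.start - b.end, b.start - a.end)


def nearest_gene(dmr: Interval, genes: Sequence[Interval]) -> Optional[Interval]:
    """Return the closest gene on the DMR's contig, or None if there is none."""
    candidates = [g for g in genes if g.contig == dmr.contig]
    if not candidates:
        return None
    return min(candidates, key=lambda g: gap(dmr, g))


# Invented annotation and DMRs.
genes = [
    Interval("contig_1", 100, 500, "geneA"),
    Interval("contig_1", 2000, 3000, "geneB"),
    Interval("contig_2", 100, 200, "geneC"),
]
hit = nearest_gene(Interval("contig_1", 1500, 1700, "dmr_1"), genes)
print(hit.name if hit else None)
```

Because the draft assembly consists of many contigs, a DMR on a contig with no annotated gene returns no neighbor and would drop out of the gene list.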