- Open Access
Whole blood transcriptional profiling comparison between different milk yield of Chinese Holstein cows using RNA-seq data
BMC Genomics volume 17, Article number: 512 (2016)
The objective of this research was to investigate the variation of gene expression in the blood transcriptome profile of Chinese Holstein cows associated to the milk yield traits.
We used RNA-seq to generate the bovine transcriptome from the blood of 23 lactating Chinese Holstein cows with extremely high and low milk yield. A total of 100 differentially expressed genes (DEGs) (p < 0.05, FDR < 0.05) were revealed between the high and low groups. Gene ontology (GO) analysis demonstrated that the 100 DEGs were enriched in specific biological processes with regard to defense response, immune response, inflammatory response, icosanoid metabolic process, and fatty acid metabolic process (p < 0.05). The KEGG pathway analysis with 100 DEGs revealed that the most statistically-significant metabolic pathway was related with Toll-like receptor signaling pathway (p < 0.05). The expression level of four selected DEGs was analyzed by qRT-PCR, and the results indicated that the expression patterns were consistent with the deep sequencing results by RNA-Seq. Furthermore, alternative splicing analysis of 100 DEGs demonstrated that there were different splicing pattern between high and low yielders. The alternative 3’ splicing site was the major splicing pattern detected in high yielders. However, in low yielders the major type was exon skipping.
This study provides a non-invasive method to identify the DEGs in cattle blood using RNA-seq for milk yield. The revealed 100 DEGs between Holstein cows with extremely high and low milk yield, and immunological pathway are likely involved in milk yield trait. Finally, this study allowed us to explore associations between immune traits and production traits related to milk production.
Milk yield and milk composition of lactating cows are the most important economic traits in dairy cattle. In the past century, genetic selection on milk yield has improved milk production in cattle. With the development of quantitative trait loci (QTLs), genome-wide association studies (GWAS) and RNA sequencing (RNA-seq) technologies, a large number of candidate genes and SNPs associated with milk performance traits have been identified, such as DGAT1 and GHR genes [1–6]. Previous studies described the associations between DGAT1 K232A polymorphism and milk production traits [7, 8]. Blott et al. identified a significant SNP (BFGL-NGS-118998) inside the GHR gene that has an important role related to milk traits .
Mammary gland is an important organ to synthesize and secrete milk. The mammary epithelial cell has a remarkable ability to convert blood circulating nutrients into milk components . Thus, almost all the studies related to milk performance traits are based on mammary gland. For example, Cui et al. collected mammary gland samples from four lactating cows after slaughter. They used RNA-seq to generate the bovine mammary transcriptome with high and low milk fat and protein percentage . Finucane et al. compared bovine mammary expression profiles before and after parturition using microarray . However, there are still limitations in analysis of milk performance traits using bovine mammary tissue, such as sampling difficultly, high cost of sampling and tremendous damage to the lactating cows, which resulted in small sample size.
Milk is composed of fat, protein, lactose, minerals, vitamins and water, and all these nutrients derive from blood [10, 13–15]. Some data shows that synthesizing 1L milk requires 400-500L blood flow though the mammary gland. So plenty of blood is essential for synthesizing milk. Previous studies associated with milk traits were only focused on blood physiological and biochemical indexes, and there are very few studies associating gene expression with milk traits in cattle blood. A gene expression profile from blood provides new opportunities to clarify the basic mechanisms of complex traits in cattle milk. Besides, whole blood is a complex mixture of cells and can accurately reflect the physiological condition and health level of cows. Many studies have used blood to diagnose disease in dairy cattle, such as mastitis [16, 17]. Most importantly, blood is easier to sample in comparison with other tissues and involves limited handling of animals. Furthermore, the lactation process requires multiple tissues and organs to complete, such as mammary gland, liver and adipose tissue. Hence, blood has the potential ability to represent milk performance traits more directly and comprehensively. Sandri et al. (2015) analyzed the gene expression profile in the blood related to the gene merit for milk productive traits using microarray . Marcel et al. analyzed porcine peripheral blood mononuclear cells transcription profile of pigs with divergent humoral immune response and lean growth performance . To complete understand the impact of blood transcriptome profiles on milk yield, comprehensive cataloguing of gene expression change within high yielders and low yielders is required. The aim of this study is to compare gene expression profiles in bovine whole blood of high and low milk yield cows, and to investigate potential molecular biomarkers in the blood transcriptome that relate to the productive potential of lactating cows using RNA-seq techniques.
Analysis of expressed transcripts between high and low yielders
From 23 samples, we obtained total 74.6 Gb RNA-Seq data files (26,763,546 to 51,313,614 paired-end reads per sample). Nearly 68 % of the reads were mapped to the bovine genome UMD3.1.66 and approximately 62 % of the reads in every individual were uniquely mapped to the bovine genome. The alignment information for each sample is presented in Table 1. Of these, 16,314 and 16,151 expressed transcripts were revealed in high yielders and low yielders, respectively (Additional file 1: Table S1).
DEGs and splice events between high and low yielders
To provide a better understanding of the biological mechanism of milk yield, it is essential to identify the DEGs between high and low milk yield cows. Based on the Cuffdiff analysis, a total of 100 DEGs (p<0.05, FDR< 0.05) were examined between the high and low yielders (Fig. 1). All the DEGs were located in chromosomes randomly, but there were no DEGs identified in chromosome 14, 20, 27, and 28. The expression level of 100 DEGs was from 2 to 1063 FPKM in high group, and 0.4 to 1794 FPKM in low group. In addition, 43 of the 100 DEGs were highly expressed in the high yielders; whereas, the other 57 DEGs showed lower expression in low yielders. The expression level of 100 DEGs is shown in Fig. 2 and the detail information is presented in Additional file 2: Table S2.
Splice events are thought to contribute to phenotypic complexity during the mammalian evolution . In total, we obtained 44,572 and 36,467 splice events in high and low yielders, respectively, compared to the annotated bovine genome UMD3.1.66. Of these, 214 (in high yielders) and 202 (in low yielders) differentially expressed splice events were identified. Further analysis showed that the 214 and 202 splice events involved 59 DEGs and 57 DEGs, respectively. Major splicing events such as, exon skipping (ES), intron retention (IR), alternative 5’ splicing site (A5SS), alternative 3’ splicing site (A3SS) and mutually exclusive exon (MXE) were detected in our studied bovine blood transcriptomes. The A3SS was the major splicing pattern observed in high group; but in low group, the major type was ES (Fig. 3). This suggests that high milk yield cows are more inclined to take the A3SS pattern. In addition, more splice sites were found in chromosome 26 in both groups.
Functional classification of DEGs
The DAVID tool  was used to annotate the function of the 100 DEGs with the particular categories focusing on the gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway. A total of 55 clusters (p<0.05) were significantly annotated with GO terms within three major function groups: biological process (BP), cellular component (CC), and molecular function (MF). The most significant GO categories observed were defense response, immune response, inflammatory response, icosanoid metabolic process, and fatty acid metabolic process (p<0.05) (Fig. 4). Only one KEGG pathway was enriched, which was the Toll-like receptor signaling pathway. On the other hand, based on the enrichment analysis of DEGs containing alternative splice sites revealed that the PPAR signaling pathway was only detected in the high yielded group. Detailed information of the DEG functional annotations are showed in Table 2.
qRT-PCR validation of DEGs in high and low milk yielders
Furthermore, we randomly selected 4 DEGs identified from the RNA-seq data, LGALSL, IL-8, FAM213B and CCL5, to validate their expression patterns using qRT-PCR. The results from the qRT-PCR confirmed that the DEGs had the same expression pattern observed with the RNA-seq (Fig. 5). This indicates that the gene expression observed in blood transcriptome between high yielders and low yielders was highly credible.
The object of this study is to investigate potential molecular biomarkers in whole blood related to milk production traits in lactating dairy cows, with the aim of putting forward a non-invasive method that identifies the DEGs for milk performance traits. In this study, whole blood genome transcriptome profiles of high and low yield milk cows were investigated using RNA-seq technique. RNA-seq has many advantages over traditional cDNA microarray technologies and it can easily detect low-abundance genes [22, 23]. Marioni et al. demonstrated that RNA-seq and qRT-PCR have a high correlation, and that the Pearson correlation could reach 0.929 , which means RNA-Seq is accurate and reproducible.
Among these 100 DEGs, many genes were also detected by other studies associated with milk yield, such as BOLA-DQA1. BOLA-DQA1 is one of the primary histocompatibility complex (MHC) class II molecules that plays an important role in the immune system. It is thought that MHC genes indirectly affect milk production traits by increasing the disease resistance of the cows. In this present study, the expression level of BOLA-DQA1 was higher in high milk yielders than in low milk yielders. Other research has reported that the BOLA-DQA1 gene was associated with resistance to mastitis progression .
Splicing event analysis showed that DEGs in high and low milk yield cows were different. In low yielders, cows are more inclined to take the ES pattern but in high yielders A3’SS was more likely. Oxidized low-density lipoprotein receptor (OLR1), which can degrade the oxidized form of low-density lipoprotein and plays important roles in fatty acid transport, was reported as a potential gene for milk-fat percentage and milk fat yield . The splice site analysis in our study also found that AS in OLR1 was significant. There were 5 splice sites of OLR1 in high milk yield including one ES, one A5’SS and three other events. But in low yielders, there was only one splice site on OLR1, which was A5’SS. Also, the expression level of OLR1 was higher in high yielders than in low yielders. Splice events analysis of DEGs can reveal differences between high and low milk yield cows. Among those DEGs containing splice sites, gene ontology enrichment indicated that the PPAR signaling pathway was significantly different between high and low groups. Study showed that PPARA is expressed in heart, liver, adipose tissue, and muscle tissue and is involved in fatty acid catabolism ; PPARs play important roles in the regulation of metabolic and inflammatory signaling pathways [26, 27]. It is also reported that PPAR-γ (PPARG) is over-expressed in adipose tissue and macrophages and primarily regulates adipogenesis [27, 28]. PPAR-γ has been reported to significantly increase its expression during lactation in bovine mammary gland . Also, anti-inflammatory properties were observed between PPARA and PPARG . In our study, we found that this PPAR pathway is also in the blood transcriptome, and it was only detected in high yielders. Moreover, these three PPAR pathway DEGs: LPL, OLR1, and CPT1A, were highly expressed in high yielders. Also, the Toll-like receptor signaling pathway existed only in high yielders. The Toll-like receptor pathway is also involved in innate immunity . These results indicate that the regulation of metabolic and immune function is more active in high milk yield cows. The immune system plays a key role in health maintenance, pathogenesis, diseases resistance, and production performance. Further research is required to explore the relation between immune response and milk performance traits in cattle blood transcriptomes.
The present study provides a non-invasive method to identify the DEGs in cattle blood using RNA-seq for milk yield traits. The study revealed 100 DEGs between high yielders and low yielders, and discovered different alternative splicing patterns between the two groups. The enrichment analysis also revealed that specific metabolic and immunological pathways are related to cattle milk yield traits, and could be considered a signature of blood biomarkers selection in dairy cattle. These results provide the valuable resources of biological research in Chinese Holstein cows milk production, but also offer some potential guidelines to understand the relationship between milk production and the immune function.
Blood samples collection
Twenty three Chinese Holstein cows in their second or third lactation were selected based on their current lactation and previous lactation records from BingZhouHai Dairy Farm. All the cows are reared under the same standard. 13 high yielders: ~28kg/day/cow and 10 low yielders: ~18kg/day/cow were selected for this study. 5ml of whole blood was drawn from the jugular vein of each cow. The blood samples were frozen in liquid nitrogen then transferred to −80 °C for further RNA extraction. All experimental methods used in this study were approved by the Inner Mongolia Agricultural University (Hohhot, China) Institutional Animal Care and Use Committee.
RNA extraction and sequencing
We used TRIzol reagent (TaKaRa, USA) to extract total RNA from blood samples following the manufacturer’s instructions. RNA degradation and purity were monitored on 1% agarose gels. Illumina TruSeq Stranded mRNA Sample Preparation Kit (Illumina, San Diego, CA, USA) was used to generate cDNA libraries according to the manufacturer’s recommendations. A total of 1μg of high quality RNA from each individual was used to prepare the sequencing library. The poly-T oligo-attached magnetic beads selection procedure was used to obtained mRNAs from the total RNA. After fragmentation, random oligonucleotides and SuperScript II was used to synthesize first-strand cDNA. DNA polymerase I and RNase H was used to synthesis Second-strand cDNA. After adenylating of 3’ ends of DNA fragments, Hybridization was initiated by ligating Illumina PE adapter and index. cDNA fragments (200 bp) were generated by the AMPure XP system in purifying the library. DNA fragments were selectively enriched to construct the final sequencing library using Illumina PCR Primer Cocktail . An Illumina Hiseq 2000 platform sued to sequence the libraries.
Differentially expressed genes and splice events analysis between high and low yielders
Clean reads were acquired by removing low quality reads with adapter and poly-N from raw reads and were used for subsequent analysis. We downloaded the cattle genome (UMD3.1.66) and annotation files from the ensembl database. Bowtie v0.12.8 was used to build the index of cattle reference genome  and TopHat v2.1.0  was employed to align clean reads for each sample against the cattle reference genome. DEGs between the high and low yielders were detected using Cuffdiff . P value <0.05 and FDR<0.05 were as the threshold in this study to select differentially expressed genes between high and low yielders.
The Alternative Splicing Transcriptional Landscape Visualization Tool (Astalavista) web server (http://genome.crg.es/astalavista/) extracts and displays splice events from genomic annotation of exon-intron gene coordinates. Astalavista v3.0  was employed to identify the alternative splice events for all available transcripts and to study the five basic splice events of 100 DEGs for 23 samples.
Enrichment analysis of DEGs
To further investigate the function of 100 DEGs, we performed the gene ontology (GO) and Kyoto Encyclopedia of Genes and Genomes (KEGG) pathway analysis using DAVID bioinformatics resource tool . GO and KEGG terms with p<0.05 were considered significantly enriched.
Validation of DEGs by qRT-PCR
DEGs identified by the RNA-seq method were validated using qRT-PCR. GAPDH was used as the internal quality control. RT-PCR experiments were performed with 2×SYBR Green master mix technology (Takara) on the Mx3000P Real-Time PCR System (Agilent, USA). The reaction was performed using the following program: 95 °C for 10 min; 40 cycles of 95 °C for 15s and 61 °C for 60s; 95 °C for 30s, 55 °C for 30s and 95 °C for 30s. Primer sequences can be found in Additional file 3: Table S3.
A3SS, alternative 3’ splicing site; A5SS, alternative 5’ splicing site; AS, Alternative splicing; BP, biological process; CC, cellular component; DEG, differentially expressed gene; ES, exon skipping; EST, expressed sequence tag; FDR, false discovery rate corrected p values; FPKM, Fragments per kilobase of transcript per million mapped fragments; GO, Gene ontology; IR, intron retention; KEGG, Kyoto Encyclopedia of Genes and Genomes; MF, molecular function; MXE, mutually exclusive exon; RNA-Seq, RNA-Sequencing
Sun D, Jia J, Ma Y, Zhang Y, Wang Y, Yu Y, Zhang Y. Effects of DGAT1 and GHR on milk yield and milk composition in the Chinese dairy population. Anim Genet. 2009;40(6):997–1000.
Grisart B, Coppieters W, Farnir F, Karim L, Ford C, Berzi P, Cambisano N, Mni M, Reid S, Simon P. Positional candidate cloning of a QTL in dairy cattle: identification of a missense mutation in the bovine DGAT1 gene with major effect on milk yield and composition. Genome Res. 2002;12(2):222–31.
Boichard D, Grohs C, Bourgeois F, Cerqueira F, Faugeras R, Neau A, Rupp R, Amigues Y, Boscher YM, Levéziel H. Detection of genes influencing economic traits in three French dairy cattle breeds. Genet Sel Evol. 2003;35(1):77–101.
Daetwyler HD, Schenkel FS, Sargolzaei M, Robinson JAB. A genome scan to detect quantitative trait loci for economically important traits in Holstein cattle using two methods and a dense single nucleotide polymorphism map. J Chengdu Sport Univ. 2008;91(8):3225–36.
Kolbehdari D, Wang Z, Grant JR, Murdoch B, Prasad A, Xiu Z, Marques E, Stothard P, Moore SS. A whole genome scan to map QTL for milk production traits and somatic cell score in Canadian Holstein bulls. J Anim Breed Genet. 2009;126(3):216–27.
Grisart B, Farnir F, Karim L, Cambisano N, Kim J-J, Kvasz A, Mni M, Simon P, Frère J-M, Coppieters W. Genetic and functional confirmation of the causality of the DGAT1 K232A quantitative trait nucleotide in affecting milk yield and composition. Proc Natl Acad Sci U S A. 2004;101(8):2398–403.
Kaupe B, Brandt H, E-M P, Erhardt G. Joint analysis of the influence of CYP11B1 and DGAT1 genetic variation on milk production, somatic cell score, conformation, reproduction, and productive lifespan in German Holstein cattle. J Anim Sci. 2007;85(1):11–21.
Bovenhuis H, Visker M, Poulsen N, Sehested J, van Valenberg H, van Arendonk J, Larsen L, Buitenhuis A. Effects of the diacylglycerol o-acyltransferase 1 (DGAT1) K232A polymorphism on fatty acid, protein, and mineral composition of dairy cattle milk. J Dairy Sci. 2016.
Sarah B, Jong-Joo K, Sirja M, Anne SK, Anne C, Paulette B, Nadine C, Christine F, Bernard G, Dave J. Molecular dissection of a quantitative trait locus: a phenylalanine-to-tyrosine substitution in the transmembrane domain of the bovine growth hormone receptor is associated with a major effect on milk yield and composition. Genetics. 2003;163(1):253–66.
Bauman DE, Mather IH, Wall RJ, Lock AL. Major advances associated with the biosynthesis of milk. J Dairy Sci. 2006;89(4):1235–43.
Cui X, Hou Y, Yang S, Xie Y, Zhang S, Zhang Y, Zhang Q, Lu X, Liu GE, Sun D. Transcriptional profiling of mammary gland in Holstein cows with extremely different milk protein and fat percentage using RNA sequencing. BMC Genomics. 2014;15(1):1–15.
Finucane KA, Mcfadden TB, Bond JP, Kennelly JJ, Zhao FQ. Onset of lactation in the bovine mammary gland: gene expression profiling indicates a strong inhibition of gene expression in cell proliferation. Funct Integr Genomics. 2008;8(3):251–64.
Akers RM: Lactation and the mammary gland: Iowa State Press; 2002
Mcmanaman JL, Neville MC. Mammary physiology and milk secretion. Adv Drug Deliv Rev. 2003;55(5):629–41.
Baldi A, Cheli F, Pinotti L, Pecorini C. Nutrition in mammary gland health and lactation: advances over Eight Biology of Lactation in Farm Animals meetings. J Anim Sci. 2008;86(13_suppl):3–9.
Zeng R, Bequette BJ, Vinyard BT, Bannerman DD. Determination of milk and blood concentrations of lipopolysaccharide-binding protein in cows with naturally acquired subclinical and clinical mastitis. J Dairy Sci. 2009;92(3):980–9.
Jhambh R, Dimri U, Gupta VK, Rathore R. Blood antioxidant profile and lipid peroxides in dairy cows with clinical mastitis. Veterinary World. 2013;6(6):271–3.
Sandri M, Stefanon B, Loor JJ. Transcriptome profiles of whole blood in Italian Holstein and Italian Simmental lactating cows diverging for genetic merit for milk protein. J Dairy Sci. 2015;98(9):6119–27.
Marcel A, Eduard M, Siriluck P, Klaus W. PBMC transcription profiles of pigs with divergent humoral immune responses and lean growth performance. Int J Biol Sci. 2013;9(9):907–16.
Kang X, Liu G, Liu Y, Xu Q, Zhang M, Fang M. Transcriptome profile at different physiological stages reveals potential mode for curly fleece in Chinese tan sheep. Plos One. 2013;8(8):e71763.
Dennis G, Sherman BT, Hosack DA, Yang J, Wei G, Lane HC, Lempicki RA. DAVID: Database for Annotation, visualization, and Integrated Discovery. Genome Biol. 2003;2003:3–P3.
Croucher NJ, Thomson NR. Studying bacterial transcriptomes using RNA-seq. Curr Opin Microbiol. 2010;13(5):619–24.
Marioni JC, Mason CE, Mane SM, Stephens M, Gilad Y. RNA-seq: an assessment of technical reproducibility and comparison with gene expression arrays. Genome Res. 2008;18(9):1509–17.
Miyasaka T, Takeshima SN, Matsumoto Y, Kobayashi N, Matsuhashi T, Miyazaki Y, Tanabe Y, Ishibashi K, Sentsui H, Aida Y. The diversity of bovine MHC class II DRB3 and DQA1 alleles in different herds of Japanese Black and Holstein cattle in Japan. Gene. 2010;472(1–2):42–9.
Schennink A, Bovenhuis H, Léon-Kloosterziel KM, Arendonk JAMV, Visker MHPW. Effect of polymorphisms in the FASN, OLR1, PPARGC1A, PRL and STAT5A genes on bovine milk-fat composition. Anim Genet. 2009;40(40):909–16.
Chinetti G, Fruchart JC, Staels B. Peroxisome proliferator-activated receptors (PPARs): nuclear receptors at the crossroads between lipid metabolism and inflammation. Inflamm Res. 2000;49(10):497–505.
Cynthia H, Peter T. Coordination of inflammation and metabolism by PPAR and LXR nuclear receptors. Curr Opin Genet Dev. 2008;18(5):461–7.
Bionaz M, Loor JJ. Gene networks driving bovine milk fat synthesis during the lactation cycle. BMC Genomics. 2008;9(1):366.
Szanto A, Nagy L. The many faces of PPARgamma: anti-inflammatory by any means? Immunobiology. 2008;213(9–10):789–803.
Trapnell C, Williams BA, Pertea G, Mortazavi A, Kwan G, van Baren MJ, Salzberg SL, Wold BJ, Pachter L. Transcript assembly and quantification by RNA-Seq reveals unannotated transcripts and isoform switching during cell differentiation. Nat Biotechnol. 2010;28(28):511–5.
This work was supported by National Natural Science Foundation of China (No.30960246), and “National Science and Technology Support Plan” of China (No.2011BAD28B05).
The publication costs for this article were funded by the corresponding author.
This article has been published as part of BMC Genomics Volume 17 Supplement 7, 2016: Selected articles from the International Conference on Intelligent Biology and Medicine (ICIBM) 2015: genomics. The full contents of the supplement are available online at http://bmcgenomics.biomedcentral.com/articles/supplements/volume-17-supplement-7.
Availability of data and materials
The sequence data of 23 Chinese Holstein cows used in this paper can be used by public. Datasets supporting the results of this article are included in the additional files.
WZ and YB conceived and designed the experiments. XB and XJ performed the experiments. XB, ZZ and BL analyzed the data. XB, YB and WZ wrote the paper. All authors read and approved the final manuscript.
The authors declare that they have no competing interests.
Consent for publication
Ethics approval and consent to participate
All of the animals were handled in strict, accordance with good animal practices as defined by the relevant national and local animal welfare bodies. The experimental procedure was approved by the Animal Care and Use Committee of Inner Mongolia Agricultural University, China and was performed in accordance with the animal welfare and ethics guidelines.