Impact of breed and sex on porcine endocrine transcriptome: a bayesian biometrical analysis
© Pérez-Enciso et al; licensee BioMed Central Ltd. 2009
Received: 05 August 2008
Accepted: 24 February 2009
Published: 24 February 2009
Transcriptome variability is due to genetic and environmental causes, much like any other complex phenotype. Ascertaining the transcriptome differences between individuals is an important step to understand how selection and genetic drift may affect gene expression. To that end, extant divergent livestock breeds offer an ideal genetic material.
We have analyzed with microarrays five tissues from the endocrine axis (hypothalamus, adenohypophysis, thyroid gland, gonads and fat tissue) of 16 pigs from both sexes pertaining to four extreme breeds (Duroc, Large White, Iberian and a cross with SinoEuropean hybrid line). Using a Bayesian linear model approach, we observed that the largest breed variability corresponded to the male gonads, and was larger than at the remaining tissues, including ovaries. Measurement of sex hormones in peripheral blood at slaughter did not detect any breed-related differences. Not unexpectedly, the gonads were the tissue with the largest number of sex biased genes. There was a strong correlation between sex and breed bias expression, although the most breed biased genes were not the most sex biased genes. A combined analysis of connectivity and differential expression suggested three biological processes as being primarily different between breeds: spermatogenesis, muscle differentiation and several metabolic processes.
These results suggest that differences across breeds in gene expression of the male gonads are larger than in other endocrine tissues in the pig. Nevertheless, the strong presence of breed biased genes in the male gonads cannot be explained solely by changes in spermatogenesis nor by differences in the reproductive tract development.
It is well known that variability at the transcriptome is in part due to genetic causes, much like any other complex phenotype, e.g., . Thus, the large phenotypic differences that we observe between extant breeds of domestic species or between different ecotypes in wild species must be correlated also to differences at the transcriptome level. The extent of these differences and their nature is, however, not fully known. Considering, moreover, that the transcriptome programme differs widely between tissues and across development stages of the same organism while simultaneously many expression levels are highly inter correlated adds an additional layer of complexity to the problem.
Although the differences between tissues and across development stage transcriptomes is a well known fact [2–4], the variability across tissues or development stages is less studied than within a single tissue due to increased costs of longitudinal or across-tissue studies. Certainly, choosing the tissue and the timing to be studied is a critical aspect of any experimental design in transcriptomics. Yet, this choice is not necessarily obvious. An example is growth and leanness in livestock. These two traits are the most important selection objectives in the majority of breeding programmes. But they are very complex phenotypes that depend on a large number of tissues. As for growth, the most valuable tissue in economic terms is muscle. Although muscle would be a logical tissue to be studied, muscle development does depend on signals external to the muscle itself like endocrine and paracrine factors that also change along development. Similarly, differences in the amount of fat tissue are more likely to be caused by genetic signals that originate in the hypothalamus or in endocrine tissues rather than in the fat tissue itself.
In order to study the impact of breed differentiation on the pig's transcriptome, we have analyzed the breed and sex differences across different tissues. Among tissues that are of interest, those involved in the different endocrine axes stand out as a promising choice, considering their fundamental biological role and that their transcriptomes have not been widely analyzed. Here we report a detailed microarray analysis of five tissues that make up two main endocrine axes, the HPTA (hypothalamic-pituitary-gonadal) and HPT (hypothalamic-pituitary-thyroid) axes, plus fat tissue, in four highly divergent porcine breeds. Both HPTA and HPT axes are highly influential endocrine axes and, we conjectured, must be responsible for at least some of the large phenotypic differences between breeds caused by artificial selection in livestock, e.g., in fat content.
The tissues sampled were hypothalamus (HYPO), adenohypophysis (AHYP), thyroid gland (THYG), gonads (GONA) from both sexes, males (GONAM) and females (GONAF), and back fat tissue (FATB). Some of their primary endocrine roles are in Additional file 1. The four breeds were Large White (LW), Duroc (DU), Youli (YL) and Iberian (IB). The Large White is a very lean and high growth breed, it is used as sire (male) line. Although there are many different Duroc lines differing in their lean content and growth performance, the Duroc line employed here was a maternal line with good reproductive performance and high intramuscular fat content. Youli, also a maternal line, results from crossing Landrace breed with a hybrid line made up of Chinese and European breeds, and it is a highly prolific line. Finally, the Iberian breed is a traditional 'unimproved' breed of slow growth, high fat deposition and low prolificacy but high intramuscular fat content and renowned meat quality.
Means (SD) of the variance ratios' posterior distributions.
3 × 10-3 (10-4)
4 × 10-3 (10-4)
3 × 10-3 (10-4)
3 × 10-3 (10-5)
4 × 10-3 (10-4)
4 × 10-3 (10-4)
4 × 10-3 (10-4)
3 × 10-3 (10-4)
3 × 10-3 (10-5)
3 × 10-4 (10-5)
3 × 10-3 (10-5)
3 × 10-3 (10-5)
4 × 10-3 (10-5)
4 × 10-3 (10-5)
7 × 10-4 (10-4)
5 × 10-3 (10-4)
6 × 10-4 (10-5)
When microarrays were analyzed separately by sex or breed, neither the influence of tissue nor of sex varied. That is, the ratio of Probeset × Tissue variance () was constant across sexes and breeds, as was the ratio of Probeset × Sex variance () across breeds (Table 1). When each tissue was analyzed separately, however, the picture changed. First, the Probeset × Sex variance ratio was maximum for the gonad tissue: = 0.05 vs. ≤ 10-3 in the rest of tissues. As this ratio measures the relevance of sex in the probeset variability, this result is not completely unexpected, and agrees with previous evidence indicating the largest number of sex differentially expressed genes occurs in the reproductive organs . It is far more interesting, though, the observation that the Probeset × Breed variance ratio was larger in male ( = 0.01) than in female gonads ( = 0.0007). That is, the breed effect was over ten times larger in the testicle than in the ovarian transcriptome. Among the tissues studied, the largest transcriptome divergence between breeds corresponded to the male gonad and the minimum, to the female gonad.
Sex hormone levels in serum (ng/ml) relative to Large White (LW)
0.17 ± 0.01
1.20 ± 0.49
0.86 ± 0.34
0.016 ± 0.005
Duroc – LW
-0.01 ± 0.02
0.79 ± 0.70
0.15 ± 0.43
-0.003 ± 0.006
Youli – LW
0.02 ± 0.02
0.92 ± 0.85
0.28 ± 0.44
0.000 ± 0.006
Iberian – LW
0.01 ± 0.02
0.71 ± 0.70
0.20 ± 0.43
-0.015 ± 0.006*
Differential sex expression is predominant at gonad tissues
Means (SD) of sex Bayesian z-scores, absolute values, for each tissue.
Correlation of sex Bayesian z-scores between tissues
This distinct genetic programme in the gonads, one would expect, should consist of genes primarily involved in gametogenesis. To investigate this further, we selected the 100 most biased probesets in the gonads and, among those, we ranked the most biased probesets in AHYP and THYG. The complete list is in Additional File 2, but nine probesets stood out as highly sex – biased across all three tissues, whereas the 91 remaining probesets did not exhibit any pronounced sex bias expression neither in AHYP nor in THYG. The nine probesets corresponded to genes DDX3Y, EIFS23, FAM5C, EIFIAY, DENND4A, PTPRM, LPHN2, CLOCK and TMSB4X. These nine genes were among the 10 most differentially sex expressed genes that we obtained in a previous work , where 16 tissues in four animals were analyzed. Some of the genes were confirmed by quantitative real time PCR (QRT-PCR). They are also among the most sex biased porcine genes identified in independent studies . We also performed a gene ontology analysis of the remaining 91 probesets which did show, as presumed, that spermatogenesis genes were over represented (P-value = 0.03) but also were other processes: multicellular organism development (P = 0.03), hemophilic cell adhesion (P = 0.01) or synaptic transmission (P = 0.04). Thus, although gametogenesis partially explains the distinct sex-bias programmes between gonads and the rest of tissues, it does not explain the full story. There is not a simplistic explanation or a general common role for specific gonad sex – biased genes.
We also carried out a differential gene ontology (GO) study with the most sex biased probesets in the gonads, irrespective of whether they were also sex – biased in other tissues. Initially, we aimed at studying the 5154 sex biased probesets detected at FDR = 0.05 but this was not feasible computationally. Thus, we selected the top 1700 genes, the number of selected genes when all tissues are analyzed. The most significant and numerous GO classes are in Additional File 3. Overall, most GO categories were related to development, which makes sense because ovaries and testicles follow different development trajectories since early embryogenesis, from about four weeks of embryo age in pigs . As expected, there was also an overrepresentation of spermatogenesis and male gonad development, but also lactation. An interesting observation was the excess of genes related to the MAPKKK cascade (MAP3K5, MAPK1, FGFR3, RAPGEF2, NF1, AGT), one of the most important signalling pathways in the cell. All of these genes are also involved in development, in particular, AGT (angiotensinogen) plays a role in female pregnancy and in ovarian follicle development [12, 13].
The study of breed differences at the transcriptome level is fundamental to elucidate the impact of selection on gene expression, and thus on gene regulation. As mentioned, the ratio or Probeset × Breed heritability is a summary of the breed influence on the transcriptome. Although the overall breed influence was much smaller than that of tissue or the probeset itself (Table 1), it was a remarkable observation that was maximum in male gonads and minimum at the female gonads: 0.01 vs. 0.0007, i.e., at least 10 times larger in testicles than in ovaries. Note that the SD of these figures are very small (= 10-4) and thus the two estimates are clearly distinct. To gain further insight and to assess the influence of each individual probeset, we computed the standard deviation of the breed Bayesian z-scores, i.e., for probeset i SDzbreed, i = , where z ij is the breed z-score of probeset i at breed j. This is a rough measure of expression variability across breeds but allows us to carry out a ranking among probesets in terms of how much do their expression differ between breeds. Additional File 4 shows the mean SDzbreed for each tissue, the complete list of breed z-scores is in Additional File 5. The z-score breed variability was larger in the male gonads than in the rest of tissues, in agreement with the variance component analysis from Table 1. Only when all tissues or both gonads were considered jointly was variability larger than in male gonads. The relation between breed and sex bias expression is discussed in the next section.
Genes that are among the 200 most variable in at least four tissues
GO Biological process
Lysosomal trafficking regulator
Small VCP/p97-interacting protein
forkhead box F1
Organ morphogenesis, lung and gut development
Death inducer-obliterator 1
Major histocompatibility complex class B
Heterogeneous nuclear ribonucleoprotein H2
Vesicle trafficking protein homolog B
ER to Golgi transport
Microtubule-associated protein 6
Negative regulation of microtubule depolymerization
Rho-associated, coiled-coil containing kinase 1
Actin cytoskeleton organization and biogenesis
Family with sequence similarity 92, member A1
Frizzled homolog 4 (Drosophila)
Multicellular organism develop, wnt signaling pathway
Armadillo repeat containing, X-linked 1
Development, maintenance of tissue integrity
Lipopolysaccharide-induced TNF factor
Vasoactive intestinal peptide receptor 2
Cell cell signaling/G-protein coupled receptor protein signaling
Amyloid beta (A4) precursor-like protein 2
G-protein coupled receptor protein signaling
Tousled-like kinase 2
Cell cycle/chromatin assembly
Immunoglobulin heavy constant mu
The most extreme breed for each of the geneprobes is shaded in Table 5. Although the Youli synthetic breed seems 'enriched' in extreme probes, may be because of its highly heterogenous backgroung, it is of interest to study the extreme probes between Iberian (a non selected breed) vs. the rest of breeds, which have undergone a rather intense artificial selection process. We did that for all tissues jointly and for each of individual tissues except the gonads. In all cases, the gene LYST, involved in cellular defense, was the most extreme Iberian probe. We also looked for enriched ontology categories among the most extreme probes. When all tissues are examined together, the most 100 extreme genes were enriched in defense (antigen processing and response) and development (endodermal cell fate commitment, brain morphogenesis, mast cell biogenesis). As for each tissue independently, thyoroid gland was enriched in defense genes, whereas back fat or hypothalamus showed, additionally, an excess of muscle development genes.
Over represented Gene Ontology (GO) categories in modules with largest discrepancy between observed and expected number of probesets
Muscle cell differentiation
Mitotic sister chromatid cohesion
Negative regulation of transcription
Aldehide metabolic process
RNA metabolic process
Glycogen metabolic process
Striated muscle contraction
Regulation of striated m. contraction
Ventricular cardiac muscle morphogenesis
Is there a link between breed and sex biased expressions?
The 15 most variable genes among breeds in the male gonads together with their sex z-score.
GO biological process
Phosphorylase kinase, alpha 1
Muscle glycogenosis (X located)
Cysteine-rich secretory protein 1
Mitogen-activated protein kinase kinase kinase 5
MAPKKK cascade, apoptosis
Lysosomal associated protein transmembrane 4 beta
Caspase 8, apoptosis-related cysteine peptidase
Y box binding protein 2
coiled-coil domain containing 71
Protamine 1 (testis specific)
Progestin and adipoQ receptor family member VII
Aldehyde dehydrogenase 1 family, member A1
Aldehide metabolic process
Protein phosphatase 1E
Protein amino acid dephosphorylation
Sperm autoantigenic protein 17
As for the most variable genes across several tissues (Table 5), there was an excess of male biased genes, except ROCK1, all were overexpressed in males. Again, they were not the most sex biased genes. Note that none of the 15 most variable genes among breeds in the male gonads (Table 7) were also present in Table 5, that is, the most variable genes among breeds in the gonads were not the most variable across all tissues. To summarize, the most variable genes among breeds were predominantly male biased. However, the most variable genes were not the most sex biased. An important percentage of the most variable genes were involved in spermatogenesis (Table 7), suggesting that artificial selection targets this biological process, either directly or indirectly.
Discussion and Conclusion
Animal breeding has resulted in breeds that are extremely diverse for a large number of traits. These phenotypic differences are influenced by DNA variants and mediated by distinct transcriptome programmes across breeds. The dissection of the transcriptome breed differences will thus largely illuminate the physiological and genetic causes underlying artificial selection and breed differentiation. Here we reasoned that many changes observed in target selection tissues, i.e., muscle and fat, might actually be due to signals external to the tissue itself, notably through the endocrine system. In addition, and despite the relevance of the endocrine system in animals, the knowledge of its transcriptome is rather scarce.
Among the many statistical approaches employed to analyze microarrays e.g. , here we have adopted a Bayesian approach that is closely related to mixed model methods . The main advantages of these approaches are their modeling flexibility while allowing the whole dataset to be analyzed simultaneously. Both characteristics are important to contrast a number of biological hypotheses with a minimum standard error and, consequently, maximum power. In addition the Bayesian method provides an exact measure of error for variances and variance ratios, whereas convolute approximations are needed in the classical mixed model method. There are also potential hindrances in the analyses reported here. The main one is that variance homogeneity is assumed, unaccounting for differences in variability across probes other than the effects included in the model. For instance, a gene whose expression level is very low across all tissues is less variable than a gene expressed in some tissues and switched off in another tissues. There is a rich literature on heteroskedasticity, especially within the Bayesian paradigm. However, accounting for variance heterogeneity obliges to fit a distinct variance for a gene or a group of genes, which can be extremely hard to compute given the large number of genes in microarrays. An alternative is to analyze each gene separately, but this is also undesirable because many parameters are estimated and the risk of false positive increases.
Nevertheless, the models used here fitted the data quite well, as evidenced by the high heritabilities reported (Table 1), an there are several relevant conclusions that can be drawn from our study. First, we show that probeset is by far the most influential factor, accounting for at least 85% of total variability, whereas tissue explains in the order of 10%. Breed and sex contribute only marginally to total variance in the transcriptome. There are no differences across breeds nor between sexes in this respect. These results agree extremely well with a previous study from our group, although there we employed a different statistical methodology and we analyzed 16 tissues in a smaller number of animals (four) . Although sex and breed were, globally, much less relevant than the probeset effect, sex and breed do influence largely the expression of a subset of genes: those with most extreme z-scores.
Also importantly, we report a strong link between sex – bias and breed variability, which is not caused by large differences in reproductive development (Table 2). The male gonad is the tissue with largest breed heritability (, Table 1). This result is coherent with several independent observations. First, many of the genes that have been identified as undergoing selection in the human and other species are involved in spermatogenesis [17–19]. It is plausible then that artificial selection and breed divergence, which operates through the same mechanisms as evolution, affects also spermatogenesis. Second, modern breeding in livestock targets primarily the male because a sire can leave much more offspring than a dam, and thus selection intensity is usually much higher in males than in females. This would help to explain why sex biased and breed biased genes partially overlap (Table 7, Figure 4). And third, recent work in Drosophila  and references therein have confirmed that sex biased genes exhibit a faster rate of evolution than non biased genes; in addition, male biased genes show a stronger signal of adaptive selection than female biased genes. Nevertheless, although the most breed – biased genes tend to be also sex biased, the most sex biased genes are not among the most breed biased genes. Thus, these two phenomena are inextricably but only partially linked. Similarly, not all genes among those with largest breed – variability are involved in spermatogenesis (Table 7, Additional File 5). Thus, the high breed heritability in the male gonads cannot be explained solely by changes in spermatogenesis. This is certainly an area meriting further research.
A worth noting observation is an elevated number of myogenesis related processes among genes involved in breed differentiation (Table 7). The muscle – the major component of the meat – is the tissue that has been the main target of artificial selection in the pig. We have previously shown  that a number of genes involved in myogenesis were differentially expressed in both tissues. Thus, the excess of muscle development genes in fat and gonads might simply reflect a pleiotropic change caused by a primary effect in muscle. Thus, our initial hypothesis that breed transcriptome differences might affect primarily the endocrine system should be reevaluated, as is not fully supported by our experimental results. In fact, it is quite remarkable that both sex and breed differences at the hypothalamus, one of the key endocrine organs, is smaller than in the rest of tissues studied (Table 3, Additional File 4). Certainly, the endocrine system plays a fundamental role in animal's physiology and consequently in breed and sex differences, but may be transcriptome differences are more pronounced at development stages other than that studied here or affect a very small subset of genes.
As expected in the light of previous research in the pig and in other species, e.g., [3, 7, 21], we find extensive evidence of sex biased probesets. Not surprisingly, the gonads are the most sex biased tissue overall (Table 3, all data are in additional file 2). Globally, the most sex biased genes are also sex biased across a range of tissues, except in the gonads (Table 4, low triangular, and Figure 3). Most sex biased genes in Table 4 were identified previously by us and by an independent group, and some were confirmed by quantitative real time PCR [3, 10]. Several of the gonad sex biased genes identified here are known to be involved in gonadal development in mice and pigs [22, 23], like LHX9, PODL, GATA4, AMH (zsex = 6.7 in gonads, zsex ~ 0 in the rest of tissues), SOX9 (zsex = 12.0 in gonads, ~0 in the rest of tissues). In contrast, we do not find any sex bias for sex determining region (SRY, zsex = 0.08), which initiates the sex differentiation cascade, probably because its temporal expression is very narrow, 10 – 12 days post coitum in the mouse . Follistatin, a glycoprotein forming part of the inhibin-activin-follistatin axis that plays an important role in follicular development within the ovary, is highly overexpressed in ovary (zsex = -16.7) but no significant bias appears in the rest of tissues analyzed. The most female biased gene, nonetheless, is protein tyrosine phosphatase receptor type M (PTPRM), an important signaling molecule that regulates cell growth and differentiation. This gene was already identified in a previous study  as being also strongly female biased.
As genes work coordinately and thus their expression levels are correlated, considering gene modules should be a more powerful approach than analyzing each gene separately. We observe that connectivity varies across tissues (Figure 2) and that the least connected transcriptome occurs when gonads of both sexes are jointly analyzed. This is likely a result of large heterogeneity in expression patterns between ovaries and testicles even before puberty. But, for our purposes, the main use of detecting modules was combining expression bias and connectivity in order to increase power and discover more subtle signals that may not be evident when studying each gene in isolation . Several approaches can be envisaged to attain this. Here, first we identified sets of highly correlated genes (modules) within each tissue using standard techniques , followed by an assessment of whether any module was enriched in breed biased genes. Finally, we looked for over represented gene ontologies among all genes in that module. We found that different modules were enriched in specific ontologies (Table 7), reflecting the modularity of gene expression. Importantly, we identify a series of biological processes (spermatogenesis, muscle differentiation and several metabolic processes) that have been the likely target, direct or indirect, of artificial selection. The next logical step will be to verify whether genes that have been the target of selection (showing evidence, e.g., of a selective sweep) in the pig are enriched in these gene ontologies. At least in humans, there a significant excess of genes undergoing natural selection are involved in spermatogenesis .
Sixteen animals, four from each of four breeds, Large White (LW), Duroc (DU), Youli (YL) and Iberian (IB) piglets were sampled. These breeds represent a wide genetic variability in current pig breeding schemes. There were two males and two females per breed except in Youli, represented by three males and one female. Animals were bought from three breeding companies and transferred to the University experimental farms at weaning, i.e., aged one month approximately. Pigs were housed simultaneously, fed the same diets during the fattening period, that lasted two months, and were weighed at weekly intervals. At the time of slaughter, the average ages were 87, 83, 80 and 89 days for Large White, Duroc, Youli and Iberian pigs, respectively. Their mean live weights at that time were 27.2 (LW), 23.1 (DU), 18.9 (YL) and 17.4 kg (IB).
Animals were euthanized, after 24 h fasting, by an overdose of intravenous sodium thiobarbital. At necropsy, tissue samples were collected, snap frozen in liquid nitrogen and stored at -80°C. The average time gap between euthanasia and tissue collection was ~15 minutes, maximum time was 25 minutes. The tissues collected were hypothalamus (HYPO), adenohypophysis (AHYP), which was separated from the neurohypophysis, thyroid gland (THYG), gonads (GONA) from both sexes, males (GONAM) and females (GONAF), and back fat tissue (FATB). The hypothalamus included the mamillary body and grey tubercle but excluded the chiasma opticum. Throughout this work, each sample was identified by the acronym of the tissue followed by the animal id, e.g., FATB_LWF1 refers to back fat tissue from female 1 Large White. All procedures were approved by the Ethical and Animal Welfare Committee of the Universitat Autònoma de Barcelona, in accordance with the guidelines of the Good Experimental Practices.
RNA extraction and microarray hybridization
Total RNA was extracted from 100 mg tissue using the RiboPure™ kit (Ambion, Austin, USA) according to the manufacturer's protocol. RNA was quantified with the NanoDrop ND-1000 spectrophotometer (NanoDrop Technologies, Wilmington, USA) and the RNA integrity was assessed by Agilent Bioanalyser 2100 and RNA Nano 6000 Labchip kit (Agilent Technologies, Palo Alto, USA). Due to high variation in concentrations of the total RNA obtained in different tissues, all samples were concentrated and cleaned using the RNAeasy MiniElute Cleanup kit (Qiagen, Basel, Switzerland) obtaining final concentrations between 500 and 1000 ng/μl.
A total of 80 microarrays (16 animals × 5 tissues) were hybridized and scanned at the Institut de Recerca Hospital Universitari Vall d'Hebron (Barcelona, Spain). Briefly, the cDNA synthesis was undertaken with 5 μg of total RNA, labelled with biotin and hybridized to individual high-density oligonucleotide microarray chips (GeneChip® Porcine) from Affymetrix (Santa Clara CA) containing a total of 23,935 probeset sets, representing 20,201 Sus scrofa genes, 11,265 of these genes were annotated by Tsai et al. (2006). The hybridization was done according to Affymetrix standard protocols and microarray expression data were generated with GeneChip Operating Software (GCOS). As the annotation provided by the manufacturer is not too detailed, the results in this work are based in the annotation developed by . The complete data set, both GCRMA and original CEL files, are available at Gene Expression Omnibus (GEO) under accession number GSE14739. The original material, tissue conserved at -80 C, is also available on request. Contact the authors for details.
Data processing and statistical analysis
Quality control of CEL files was done with the Affy package of bioconductor : RNA degradation and the raw data distribution were ascertained. All CEL files were normalized simultaneously with the GCRMA procedure . After normalization, a Bayesian approach was adopted for statistical inference. The full model employed was:
y ijkgl = Tissue i + Breed j + Sex k + Probeset g + PT gi + PB gj + PS gk + e ijkgl ,
where PT, PB and PS stand for the probeset × tissue, probeset × breed and probeset × sex interactions, respectively. The subscript l refers to l-th individual. In the Bayesian paradigm, all effects are random. Here we used uninformative flat priors for all parameters. We defined the following heritabilities or ratios of variances: , , and , where the denominator is the total phenotypic variance, i.e., . We also analyzed data subsets, e.g., only females or males, or each breed or each tissue separately. In these instances, appropriate effects were deleted from model (1), e.g., Sex and PS are removed when data from a single sex are analyzed. Bayesian analyses were carried out using standard theory . A Gibbs sampling approach was performed with a home made program (A. Legarra, INRA, France, personal communication). Priors were flat bounded between two very large numbers (-106 – 106); given the large amount of data, priors should have a minor influence here. We employed 10,000 iterates with 2000 burning initial iterates and discarding every 50 to minimize autocorrelation. Note that the dimension of the system in (1) is huge, it used 23,935 probesets × 80 samples ~ 1.9 106 records and contained 287,229 equations. Total CPU time was in the order of three days on a Linux Itanium server. Most of time was spent reading the data and building the equations via linked lists rather than in the Gibbs sampling process itself.
Bayesian methodology offers several advantages over more traditional least square or maximum likelihood approaches and these will not be discussed here [30, 31]. It suffices to mention that the output of the Gibbs sampler are values that follow the marginal posterior distribution of each of the t parameters in the model, p(θt|y), and thus it is straighforward to compute any desired statistics from that distribution taking into account uncertainty on the rest of parameters in the model. Here we report the mean and SD of the posterior distributions of the random components ratios (h2). We also defined the Bayesian z-score that, in analogy to the usual z-score, was obtained from E(θt|y)/SD(θt|y). We computed sex and breed z-scores; the sex z-score for g-th probeset is E [(PSg1 - PSg2)|y]/SD [(PSg1 - PSg2)|y] where subscripts 1 and 2 refer to males and females, respecively. Similarly, the j-th breed z-score for g-th probeset is defined as z gj = E(PBgj|y)/SD(PBgj|y). A rough measure of how much the expression varies across breeds is computed as the standard deviation of the breed Bayesian z-scores, i.e., for probeset g SDzbreed, g = . We also computed an analogous False Discovery Rate statistics . We obtained this from the sex Bayesian z-scores; given that these z-scores follow a N(z-score,1) distribution, we can compute the probability that z-score > 0 and assimilate these to P-values. The complete list of z-scores is available at additonal files 4 and 5. Thoroughout, we employed dendrograms to visualize multivariate results using the hclust R function. The distance chosen was one minus the Pearson correlation (r) across variables, this distance is bound between 0 (r = 1) and 2 (r = -1).
As genes function in concerted action, which is best described by networks, we studied differential connectivity and checked whether most differentially expressed genes belonged to a specific group of highly co-regulated genes (a module). To do that, we characterized the number of modules using the approach in  employing distance 1-r and a cut level of 95%; the minimum number of probesets per module was set to 30. Due to computing constraints, we analyzed only the 12,000 probesets with higher SD than the median. We identified modules using all data jointly or for each tissue separately. For gonads, we did the analysis with both sexes pooled and within sex. In order to identify a potential connection between modularity and bias (differential expression), we studied whether any of the modules contained a larger number of probesets than expected. Suppose a list of nd probesets are either breed or sex biased, and, among them, a subset of nOBS probesets belong to module j. In addition, suppose that nj is the total number of probesets in the module j. The expected number of probesets in the module is simply nEXP = nd × nj/12,000 (because we have selected 12,000 probesets). This value is corrected when not all nd probesets are among the 12,000 employed for the module analysis. We computed the chi-squared statistics, (nOBS - nEXP)2/nEXP, and the associated P-values. We did this for each module and tissue separately.
Gene ontologies and GO over representation were analyzed with onto-tools .
Hormonal concentrations were measured in duplicate in plasma using commercial testosterone (EIA #402510, Neogen Corporation, Lexington, USA), estradiol (EIA #402210, Neogen Corporation) and progesterone EIA kits (EIA #402310, Neogen Corporation). All assays were conducted following the manufacturer's protocol. The assays were validated for pig plasma by demonstrating that serial dilutions of plasma were parallel to the displacement curve for the reference standards. Hormone standards spiked with pig plasma produced accurate results of hormone recovery (102.0 ± 3.6% to 106.5 ± 9.2% in 12 assays, r2 = 0.97).
The data used in this study have been deposited in GEO under accession number GSE14739. The original material (tissue frozen at -80 C) is also available on request. Contact the author for details.
Duroc pig breed
False Discovery Rate
Iberian pig breed
GC Robust Multiarray Average
Large White pig breed
Real Time Quantitative Reverse Transcription Polymerase Chain reaction
Unweighted Pair-Group Method with Arithmetic Mean
Youli pig breed.
We thank A. Legarra for his Bayesian analysis program, N. Navarro for help with R scripts, and T. Reverter for discussions. Thanks also to all people who helped in the dissection and in planning the experiment (J.M. Folch, L.T. Fernandes, I. Garcia-Ispierto, C. Andreu, L. Tussell, S. Tort, S. Morán). The microarray hibridizations were carried out at the Microarray Service, Core Facility of Institute of Research of University Hospital Vall d'Hebron (UCTS IR-HUVH, Barcelona) and subsidized by Fundación Genoma España http://www.gen-es.org. ALJF was recipient of a fellowship from National Council of Technological and Scientific Development (CNPq, Brazil) during this research, AO is funded by a PhD fellowship (Ministry of Education and Science – MEC – Spain). Work funded by MEC grant AGL2007-65563-C02-01/GAN.
- Williams RBH, Chan EKF, Cowley MJ, Little PFR: The influence of genetic variation on gene expression. Genome Res. 2007, 17 (12): 1707-1716. 10.1101/gr.6981507.View ArticlePubMedGoogle Scholar
- Zhang W, Morris QD, Chang R, Shai O, Bakowski MA, Mitsakakis N, Mohammad N, Robinson MD, Zirngibl R, Somogyi E, et al: The functional landscape of mouse gene expression. J Biol. 2004, 3 (5): 21-10.1186/jbiol16.PubMed CentralView ArticlePubMedGoogle Scholar
- Ferraz A, Ojeda A, Lopez-Bejar M, Fernandes L, Castello A, Folch J, Perez-Enciso M: Transcriptome architecture across tissues in the pig. BMC Genomics. 2008, 9 (1): 173-10.1186/1471-2164-9-173.PubMed CentralView ArticlePubMedGoogle Scholar
- Shyamsundar R, Kim YH, Higgins JP, Montgomery K, Jorden M, Sethuraman A, Rijn van de M, Botstein D, Brown PO, Pollack JR: A DNA microarray survey of gene expression in normal human tissues. Genome Biol. 2005, 6 (3): R22-10.1186/gb-2005-6-3-r22.PubMed CentralView ArticlePubMedGoogle Scholar
- Reverter A, Wang YH, Byrne KA, Tan SH, Harper GS, Lehnert SA: Joint analysis of multiple cDNA microarray studies via multivariate mixed models applied to genetic improvement of beef cattle. J Anim Sci. 2004, 82 (12): 3430-3439.PubMedGoogle Scholar
- Kerr MK, Martin M, Churchill GA: Analysis of variance for gene expression microarray data. J Comput Biol. 2000, 7 (6): 819-837. 10.1089/10665270050514954.View ArticlePubMedGoogle Scholar
- Mank J, Hultin-Rosenberg L, Webster M, Ellegren H: The unique genomic properties of sex-biased genes: Insights from avian microarray data. BMC Genomics. 2008, 9 (1): 148-10.1186/1471-2164-9-148.PubMed CentralView ArticlePubMedGoogle Scholar
- Tsuma VT, Einarsoon S, Madej A, Forsberg M, Lundeheim N: Plasma levels of progesterone and cortisol after ACTH administration in lactating primiparous sows. Acta Vet Scand. 1998, 39: 71-76.PubMedGoogle Scholar
- Haeussler S, Wagner A, Welter H, Claus R: Changes of testicular aromatase expression during fetal development in male pigs (Sus scrofa). Reproduction. 2007, 133 (1): 323-330. 10.1530/rep.1.01169.View ArticlePubMedGoogle Scholar
- Fernández AI, Fernández A, Óvilo C, Barragán C, Nieto M, Rodríguez M, Toro M, Silió L: Preferential transcript expression in porcine liver conditional on sex and feeding level. ISAFG Meeting: 2008; Edinburgh. 2008Google Scholar
- Pelliniemi LJ, Lauteala L: Development of sexual dimorphism in the embryonic gonad. Hum Genet. 1981, 58 (1): 64-67. 10.1007/BF00284151.View ArticlePubMedGoogle Scholar
- Cooper AC, Robinson G, Vinson GP, Cheung WT, Broughton Pipkin F: The localization and expression of the renin-angiotensin system in the human placenta throughout pregnancy. Placenta. 1999, 20 (5–6): 467-474. 10.1053/plac.1999.0404.View ArticlePubMedGoogle Scholar
- Yoshimura Y: The ovarian renin-angiotensin system in reproductive physiology. Front Neuroendocrinol. 1997, 18 (3): 247-291. 10.1006/frne.1997.0152.View ArticlePubMedGoogle Scholar
- Ellegren H, Parsch J: The evolution of sex-biased genes and sex-biased gene expression. Nat Rev Genet. 2007, 8 (9): 689-698. 10.1038/nrg2167.View ArticlePubMedGoogle Scholar
- Gentleman R, Carey VJ, Huber W, Irizarry RA, Dudoit S: Bioinformatics and computational biology solutions using R and bioconductor. Springer. 2005Google Scholar
- Kerr MK, Churchill GA: Statistical design and the analysis of gene expression microarray data. Genet Res. 2001, 77 (2): 123-128.PubMedGoogle Scholar
- Thornton KR, Jensen JD, Becquet C, Andolfatto P: Progress and prospects in mapping recent selection in the genome. Heredity. 2007, 98 (6): 340-348.PubMedGoogle Scholar
- Nielsen R, Bustamante C, Clark AG, Glanowski S, Sackton TB, Hubisz MJ, Fledel-Alon A, Tanenbaum DM, Civello D, White TJ, et al: A scan for positively selected genes in the genomes of humans and chimpanzees. PLoS Biol. 2005, 3 (6): e170-10.1371/journal.pbio.0030170.PubMed CentralView ArticlePubMedGoogle Scholar
- Voight BF, Kudaravalli S, Wen X, Pritchard JK: A map of recent positive selection in the human genome. PLoS Biol. 2006, 4 (3): e72-10.1371/journal.pbio.0040072.PubMed CentralView ArticlePubMedGoogle Scholar
- Baines JF, Sawyer SA, Hartl DL, Parsch J: Effects of X-Linkage and Sex-Biased Gene Expression on the Rate of Adaptive Protein Evolution in Drosophila. Mol Biol Evol. 2008, 25 (8): 1639-1650. 10.1093/molbev/msn111.PubMed CentralView ArticlePubMedGoogle Scholar
- Yang X, Schadt EE, Wang S, Wang H, Arnold AP, Ingram-Drake L, Drake TA, Lusis AJ: Tissue-specific expression and regulation of sexually dimorphic genes in mice. Genome Res. 2006, 16 (8): 995-1004. 10.1101/gr.5217506.PubMed CentralView ArticlePubMedGoogle Scholar
- Park SY, Jameson JL: Minireview: Transcriptional Regulation of Gonadal Development and Differentiation. Endocrinology. 2005, 146 (3): 1035-1042. 10.1210/en.2004-1454.View ArticlePubMedGoogle Scholar
- Parma P, Pailhoux E, Cotinot C: Reverse transcription-polymerase chain reaction analysis of genes involved in gonadal differentiation in pigs. Biol Reprod. 1999, 61 (3): 741-748. 10.1095/biolreprod61.3.741.View ArticlePubMedGoogle Scholar
- Reverter A, Ingham A, Lehnert SA, Tan SH, Wang Y, Ratnakumar A, Dalrymple BP: Simultaneous identification of differential gene expression and connectivity in inflammation, adipogenesis and cancer. Bioinformatics. 2006, 22 (19): 2396-2404. 10.1093/bioinformatics/btl392.View ArticlePubMedGoogle Scholar
- Ghazalpour A, Doss S, Zhang B, Wang S, Plaisier C, Castellanos R, Brozell A, Schadt EE, Drake TA, Lusis AJ, et al: Integrating genetic and network analysis to characterize genes related to mouse weight. PLoS Genet. 2006, 2 (8): e130-10.1371/journal.pgen.0020130.PubMed CentralView ArticlePubMedGoogle Scholar
- Nielsen R, Hellmann I, Hubisz M, Bustamante C, Clark AG: Recent and ongoing selection in the human genome. Nat Rev Genet. 2007, 8 (11): 857-868. 10.1038/nrg2187.PubMed CentralView ArticlePubMedGoogle Scholar
- Tsai S, Cassady JP, Freking BA, Nonneman DJ, Rohrer GA, Piedrahita JA: Annotation of the Affymetrix porcine genome microarray. Animal Genetics. 2006, 37 (4): 423-424. 10.1111/j.1365-2052.2006.01460.x.View ArticlePubMedGoogle Scholar
- Reimers M, Carey VJ: Bioconductor: an open source framework for bioinformatics and computational biology. Methods Enzymol. 2006, 411: 119-134. 10.1016/S0076-6879(06)11008-3.View ArticlePubMedGoogle Scholar
- Irizarry RA, Bolstad BM, Collin F, Cope LM, Hobbs B, Speed TP: Summaries of Affymetrix GeneChip probe level data. Nucleic Acids Res. 2003, 31 (4): e15-10.1093/nar/gng015.PubMed CentralView ArticlePubMedGoogle Scholar
- Sorensen D, Gianola D: Likelihood, Bayesian, and McMc Methods in Quantitative Genetics. 2002, New York: Springer VerlagView ArticleGoogle Scholar
- Robert CP: The Bayesian Choice. Springer. 2007, 2Google Scholar
- Benjamini Y, Hochberg : Controlling the false discovery rate – A practical and powerful approach to multiple testing. J Royal Stat Soc Ser B. 1995, 57: 289-300.Google Scholar
- Draghici S, Khatri P, Bhavsar P, Shah A, Krawetz SA, Tainsky MA: Onto-Tools, the toolkit of the modern biologist: Onto-Express, Onto-Compare, Onto-Design and Onto-Translate. Nucleic Acids Res. 2003, 31 (13): 3775-3781. 10.1093/nar/gkg624.PubMed CentralView ArticlePubMedGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.