### Gene expression dataset

We use the gene expression data from Hudson et al.[18] profiling the genome-wide expression in bovine LTL muscle sampled across 26 experimental conditions as follows: ten developmental time points (3 pre-natal, birth and 6 post-natal) across two beef cattle breed crosses (Piedmontese × Hereford and Wagyu × Hereford) and three time points throughout a nutritional deprivation and re-alimentation experiment comprising 3 adult time points for each of the two treatments (hence, 6 experimental conditions comprised of 3 time points × 2 nutrition treatments). The entire gene expression dataset comprising 48 microarrays has been deposited into Gene Expression Omnibus http://www.ncbi.nlm.nih.gov/geo/ and can be accessed using accession number GSE25554.

Following previously described approaches [

60], we fitted the following ANOVA mixed-effect model to normalize the gene expression data:

where Y_{
ikvmn
}represents the *n* -th background-adjusted, base-2 log-intensity from the *m* -th gene (*m* = 1, 2,..., 13094 probes), at the *v* -th experimental condition variety (*v* = 1, 2,..., 26) taken from the *i* -th array (*i* = 1, 2,..., 48 microarray chips), and *k* -th dye channel; μ is the overall mean; C represents a comparison group fixed effect defined as those intensities that originate from the same microarray slide, printing block and dye channel; G represents the random gene (or probe) effects with 13,094 levels; AG, DG, and VG are the random interaction effects of array × gene, dye × gene, and variety × gene, respectively; and e is the random error term. Using standard stochastic assumptions, the effects of G, AG, DG, VG and e were assumed to follow a normal distribution with zero mean and between-gene, between-gene within-array, between-gene within-dye, between-gene within-variety and within-gene components of variance, respectively. Restricted maximum likelihood estimates of variance components and solutions to model effects were obtained using VCE6 software ftp://ftp.tzv.fal.de/pub/vce6/. The solutions to the VG effect were used as the normalized mean expression of each gene (or probe) in each of the 26 experimental conditions under scrutiny.

### Promoter sequence analysis

The genome-wide promoter sequence data for bovine was downloaded from Genomatix database http://www.genomatix.de/ (ElDorado Btau 4, v-07-09). A total of 60,131 promoter sequences derived from 22,050 genes were downloaded. We introduced several filtering steps to ensure only high confidence promoter sequences are selected for the analysis. First, we applied the concept of orthologous promoters [61] and retained only those promoter sequences for which phylogenetically conserved promoter sequences were documented in the human and mouse genomes. Using this criterion we retained 39,696 promoter sequences distributed over 13,623 genes. In the next step, we applied a threshold of 1 (100% confidence) to core and matrix similarities [62]. These editing criteria resulted in a total of 310,316 high confidence TFBS that were used for integration with the gene expression data.

### The Promoterome Matrix

We built a 'Promoterome' matrix (P-matrix) relating predicted TGs with TFs based on TFBSs located in their promoter regions. The rows of P-matrix correspond to TGs for which expression data is available, and the columns correspond to TF genes retrieved from the Genomatix database. We identified 9,242 genes (rows of P-matrix) whose promoter sequence (or sequences) harbours at least one predicted TFBS corresponding to at least one of the 333 TFs (columns of P-matrix). The element P(*i*, *j*) of P-matrix is set to "1" if a promoter region of the *i* -th TG contains a TFBS corresponding to the *j* -th TF, otherwise is set to "0". Among the 333 TFs represented in the columns of P-matrix, there were 178 with expression in the Hudson et al. [18] dataset, including 143 with promoter region information in Genomatix (i.e. these 143 TFs were also represented among the rows of P-matrix). We used PermutMatrix [63] to visualise and analyse the resulting P-matrix (See Additional file 1). In addition, the cross-product matrix resulting from multiplying the P-matrix by its transpose, results in a square and symmetric matrix, P^{T}P, of dimension equal to 333 (i.e. the number of TFs). Diagonal values of P^{T}P matrix (P^{T}P(*j,j*)) correspond to the number of TGs for the *j* -th TF. Off-diagonal values of P^{T}P (P^{T}P(*j,j'*)) correspond to the number of promoter regions in which the TFBS for the *j* -th and the *j'* -th TF co-occur. Co-occurrence is used to build a network of TFs. The hierarchical tree shows a pattern consistent with the non-random assortment of the connectivity distribution with most TFs having few TGs and few TFs having lots of TGs and consistent with a scale-free power-law distribution [64, 65].

### Statistical significance in the co-expression to co-regulation relationship

Following Yu et al. [

21], we used the log odds ratio (LOD) to ascertain the enrichment of a particular TF - TG relationship with respect to random expectation for the occurrence of the observed co-expression and according to the following formulae:

where *P* (co-expression/regulation) is the probability of gene pairs with certain regulatory relationship (e.g. sharing the same predicted TFBS) showing a specific correlation co-expression; and *P* (co-expression) is the probability of randomly selected gene pairs having the same co-expression correlation. LOD values above zero signify observations that are more common than expected by chance, and vice versa.

### Simulation schema

In order to gain further insights into the mechanistic rules governing the observed relationship between co-expression and co-regulation, we generated a series of simulated datasets under various scenarios. Source code in MATLAB (The MathWorks, Inc.) was developed for the simulation schema. In our simulation process a given TF was allowed to act as "dual" (i.e., acting as both activator and repressor of its TGs) or "mono" (i.e., either activator or repressor of its TGs, but not both). In detail, our simulation process followed the following schema:

**Step1** : Define the input vector *D* (*k*) representing the percentage of TFs with "dual" behavior. In detail, *D* (*k*) = (*k* -1) × 10, for *k* = 1, 2,..., 11. That is, *D* was allowed to range from 0% to 100%, by 10% increment.

**Step2** : At the *r-* th iteration (for *r* = 1 to 93), select a total 100 TGs from P-matrix (i.e., 100 rows of P). Selection is at random and without replacement, ensuring with *r* = 93 that all rows of P were sampled.

**Step3** : Randomly assign the 333 TFs from the P-matrix as follows: *D* (*k*) percentage of TFs are set with dual characteristic while the rest are set with 4 "mono" behaviors in equal amounts: activator, repressor, strong activator, strong repressor.

**Step4** : For the regulation mechanism, allow 5 values of activation/repression as follows: *V* = {-6,-3,0,+3,+6}, where 0 indicates the (lack of) impact of a TF to a non-TG, ± 3 for moderate activator (+3) or moderate repressor (-3), and ± 6 for strong activator (+6) or strong repressor (-6) regulation. These values correspond to either one (± 3) or two (± 6) standard deviation units (see next).

**Step5** : Simulate the expression data at 334 time points corresponding to Time 0 and up to Time 333 for each consecutive TF. In order to approximate the distributional properties of the real data, the expression data for the 100 TGs at Time 0 was simulated at random from a normal distribution with a mean of 9 and a variance of 9 (i.e. *N* (9,9)), and truncated at 0 and 18, as lower and upper bounds, respectively.

At Time *t* (*t* = 1, 2,..., 333), the *j* -th TF (where *j* = *t*; hence as many time points as TFs) comes into action and the expression of the *i* -th TG is be modified according to the value of *V*. At Time 333, each of the 100 TGs has expression data at 334 dimensions (or time points).

**Step6** : Return to Step2 until the *k* percentage levels and all 9,242 TGs from P-matrix have been explored.

During the simulation process, we also considered the situation in which TFs came into action in groups of five (i.e. having 20 groups from the 100 TFs considered in each iteration). However, the results did not differ from those obtained using the one-at-a-time scenario described above.

### TF network

The strength of the similarity existing for a given TF pair can be inferred from the number of common TGs. We used the columns of P-matrix as the input for the PCIT algorithm [8] to generate a network of TFs. The resulting network contained 333 nodes (i.e. as many as TFs) linked by 1,395 edges. Edges in the network link TFs predicted to share a significantly large number of targets. Consistent with a non-random assortment in the connectivity distribution, the 10 most connected TFs (i.e. ~3% of the total 333 TFs), referred to as 'hub' TFs, had at least 17 connections each, and were connected to 196 TFs (i.e. ~60% of the total).

### Gene modules and significance of TF to module assortment

The functional modules that emerged in the landscape of Hudson et al. [18]were subjected to further scrutiny to generate a curated list of module genes (Additional file 3). This additional examination was based on gene proximity in the hierarchical cluster analysis according to the PermutMatrix software [63] as well as on the molecular function gene ontology term http://www.geneontology.org.

We focused our attention on six modules as follows: cell cycle, fat, immune, mitochondria, muscle/glycolysis, and the ribosome. These modules were chosen for their likely roles in the determination of muscle mass, intramuscular fat development and energetic efficiency.

The enrichment of the affinity between the TFs and the functionally coherent modules was explored by means of the hypergeometric test of significance. To this respect, for every TF - Module combination, we calculated the probability of having the observed number of module genes among its targets using the following hypergeometric equation:

where *N* is the total number of genes (9,242), *n*
_{
j
}is the total number of TGs for the *j* -th TF, *m*
_{
k
}is the number of genes in the *k* -th module and *m*
_{
j
}is the number of TG of the *j* -th TF belonging to the *k* -th module.

Those TFs with H(*j,k*) <5% and a proportion (m_{
j
}/n_{
j
})/(m_{
k
}/N) >1 are referred to as 'module-specific' TFs. The second condition represents an odds ratio and was applied to account for low hypergeometric probabilities resulting from under-enrichment (i.e. those TFs with a proportion of module genes among their targets less than the proportion of module genes among all genes). Finally, among 'module-specific' TF, we classify as 'module-specific and expressed' TF to refer to those for which expression data is available. These criteria resulted in 44 'module-specific' TFs of which 32 were 'module-specific and expressed' TFs (Table 2).

The odds ratio criteria resulted in 127 out of 333 TFs being allocated to one of the six modules (Table 3). In order to ascertain if this module assignment of TFs provided information about the topology of the network we tested if there exists an independent pairing assortment in the resulting network of 127 TFs and 306 edges using the chi-square test of independence (P-value = 2.8359E-62).