### Construction of the TF binding network

The results presented in this study are based on one choice of parameters used to derive a predictive binding transcriptional network. To validate the robustness of our general observations, we constructed several predictive transcriptional binding networks by exploring the parameter space. Specifically, we used different scanning ranges of promoter regions (1000 bp or 2000 bp upstream of the transcription start site (TSS) to 200bp, 1000bp, 2000bp or position of the second exon downstream of the TSS), and several TRANSFAC thresholds for determining a protein-DNA binding site, including the requirement that there is a sufficient match between the TF PWM in the homologous gene in mouse. We also used only the promoter regions that are conserved in both human and mouse. The results reported here are based on networks constructed using existing experimental binding data and predicted bindings that were generated by using sequences 2000 bp upstream and 2000 bp downstream of RefSeq identified transcription start sites and cut-offs intended to minimize false positives, as provided by TRANSFAC.

The human and mouse Refseq IDs were extracted from the most recently updated human and mouse genome builds (NCBI Build 36.1 and NCBI Build 36 respectively). To define direct regulation connectivity between a transcription factor (TF) and a gene, we used the default parameters (minimization of false positives) for matching the position weight matrices (PWM) stored in Transfac^{®}(version 9.4) with the promoter regions by using the Transfac^{®}MATCH algorithm [49]. TF-target gene links found in the matching step were retained only if the PWM match with the human promoter region, as well as with the promoter of the orthologous gene in the mouse genome (found in the HomoloGene database [50]), were above the default cutoff.

### Enrichment analysis

P-values representing TF-miRNA enrichment were calculated using Fisher's Exact Test. For a TF-miRNA pair such that the TF targets

*m*genes (out of a possible

*N*) and the miRNA targets

*n*genes, the p-value for TF-miRNA association is given by

where*k*is the number of genes targeted by both the TF and the miRNA.

We implemented a False Discovery Rate correction by transforming the Fisher's Exact Test p-values into q-values [31]. We used the Statistics program R and associated "qvalue" package to perform these calculations.

We propose a new test extending Fisher's Exact Test to a situation in which we wish to test for a 3-group association. Given a set of

*N*genes and three subsets of these genes of sizes

*n*,

*m*and

*o*, the

*p*-value of a joint association between all three subsets is calculated as:

where*k*is the overlap between the three groups.

### Bayesian Methodology

Our likelihood model is based upon the multinomial distribution, which is a common choice in modelling count data. For each TF-miRNA pair, suppose that we partition the set of all genes into four subsets according to whether each gene is a target of the TF or miRNA and count the number of genes in each subset:

*N*
_{11},

*N*
_{12},

*N*
_{21}and

*N*
_{22}(e.g.

*N*
_{11}is the number of genes targeted by both the TF and the miRNA and

*N*
_{12}is the number of genes targeted by the TF but not by the miRNA). We regard each model parameter

*p*
_{
ij
}as the probability that a gene contributes to the count

*N*
_{
ij
}(which we assume is constant for all genes). Under the assumption that the genes are independent with respect to whether or not they are targets of each regulator, the likelihood equation is:

for

*p*
_{
ij
}≥ 0 and ∑

_{
ij
}
*p*
_{
ij
}= 1. The Dirichlet prior distribution we are using has the form:

*f*(

*p*
_{11},

*p*
_{12},

*p*
_{21},

*p*
_{22}) ∝

*p*
_{11}
^{α- 1}
*p*
_{22}
^{α- 1}[

53] and taking the product of the likelihood and prior, we obtain :

as our posterior distribution. As an alternative to Fisher's Exact Test, we use the statistic:

where*c*is a positive threshold to be chosen from the data and*I*{*A*} represents the indicator function of a set*A*. This can be interpreted as the posterior probability that the log Odds Ratio (logOR) of association between a TF and a miRNA exceeds*c*. Interestingly, if we choose*α*= 0 within our prior distribution, the*p*-value according to Fisher's Exact Test is equal to*P*{log*OR*> 0} [54], implying perfect agreement between the two methods when*c*= 0. We decided to set*α*= 0.001, which practically gives the same results as Fisher's Exact Test (again when*c*= 0), but guarantees that our posterior distribution will be proper. The choice for*c*is important: choosing a small value of*c*would subject our analyses to sample-size effects, yet choosing a value too large can imply even reasonably large significant associations are ignored.

*Choosing a value of c:*Suppose we look at the 5% most highly ranked miRNA-TF pairs (~2000 pairs). As mentioned earlier, if we use p-values from Fisher's Exact Test to create our ranking, most of the TF-miRNA pairs in our list show a large number of targets for both the miRNA and the TF. More explicitly, if we define*a*^{(0)}= (*a*_{l,l}^{(0)},*a*_{l,g}^{(0)},*a*_{g,l}^{(0)},*a*_{g,g}^{(0)}) = (0.006, 0.016, 0.255, 0.722) where, for example,*a*_{l,g}^{(0)}is the proportion of the top 2000 pairs for which the number of TF targets is less than 376 and the number of miRNA targets is greater than 253 (note that 376 and 253 are the median number of targets for the respective sets of all TFs and all miRNAs), we see that ranking by Fisher's Exact Test can produce too many pairs where both the TF and miRNA have a large number of targets. We can define a similar vector: *a*^{(c)}= (*a*_{l,l}^{(c)},*a*_{l,g}^{(c)},*a*_{g,l}^{(c)},*a*_{g,g}^{(c)}) for other thresholds. We also define: *a*^{(s)}= (*a*_{l,l}^{(s)},*a*_{l,g}^{(s)},*a*_{g,l}^{(s)},*a*_{g,g}^{(s)}) to be the expected vector of corresponding proportions that would result from a 'ranking procedure' that is not subject to such a sample-size bias. Our choice of*c*minimizes the Euclidean Distance between*a*^{(c)}and an estimate of*a*^{(s)}. Our method for estimating*a*^{(s)}is motivated by the observation that if one uses the observed log Odds Ratio to rank TF-miRNA associations, there will be no sample-size bias if the variance of the estimated log Odds Ratio does not change over the pairs. More specifically, to estimate*a*^{(s)}, we first fit a Generalized Additive Model (GAM) [55] to create a 3D surface displaying predicted logORs as a function of the number of targets of the TF and number of targets of the miRNA. Then, for each TF-miRNA pair, we simulate a value of
by adding a residual to the GAM predicted logOR for that pair. The residual for each pair is chosen to be a random draw from the residuals of the GAM. Having simulated a logOR for each pair, we rank the pairs according to their simulated logORs, and store the resulting vector of proportions:
_{1}^{(s)}. Repeating this simulation*n*times and recording
_{
i
}^{(s)}on step*i*we use
as an estimate of*a*^{(s)}. Using n = 1000, and a coarse grid of*c*values {0, 0.1, 0.2, ..., 1}, the optimal threshold for*c*was found to be between 0.6 and 0.7. We used a value of 0.6 in our analysis.