Functional annotation of novel lineage-specific genes using co-expression and promoter analysis
© Kumar et al; licensee BioMed Central Ltd. 2010
Received: 23 November 2009
Accepted: 9 March 2010
Published: 9 March 2010
The diversity of placental architectures within and among mammalian orders is believed to be the result of adaptive evolution. Although the genetic basis for these differences is unknown, some may arise from rapidly diverging and lineage-specific genes. Previously, we identified 91 novel lineage-specific transcripts (LSTs) from a cow term-placenta cDNA library, which are excellent candidates for adaptive placental functions acquired by the ruminant lineage. The aim of the present study was to infer functions of previously uncharacterized lineage-specific genes (LSGs) using co-expression, promoter, pathway and network analysis.
Clusters of co-expressed genes preferentially expressed in liver, placenta and thymus were found using 49 previously uncharacterized LSTs as seeds. Over-represented composite transcription factor binding sites (TFBS) in promoters of clustered LSGs and known genes were then identified computationally. Functions were inferred for nine previously uncharacterized LSGs using co-expression analysis and pathway analysis tools. Our results predict that these LSGs may function in cell signaling, glycerophospholipid/fatty acid metabolism, protein trafficking, regulatory processes in the nucleus, and processes that initiate parturition and immune system development.
The placenta is a rich source of lineage-specific genes that function in the adaptive evolution of placental architecture and functions. We have shown that co-expression, promoter, and gene network analyses are useful methods to infer functions of LSGs with heretofore unknown functions. Our results indicate that many LSGs are involved in cellular recognition and developmental processes. Furthermore, they provide guidance for experimental approaches to validate the functions of LSGs and to study their evolution.
Placentae exhibit remarkable variation in tissue structure and morphology within and between mammalian clades, and even within a single mammalian order. The diversity of placental architectures is thought to be the result of adaptive evolution arising from rapidly diverging and novel genes [2–4]. A greater understanding of the functional roles that these genes play would provide insights into the molecular basis for the unique phenotypic and metabolic adaptations among closely related mammalian species. Toward that end, we previously identified and bioinformatically characterized novel transcripts in cattle using placenta as a source tissue. These transcripts are lineage-specific (LSTs), and the genes that encode them have no detectable homology to genes outside of that lineage (LSGs). Functional elucidation of LSGs remains a daunting task and only a few have been characterized beyond their expression patterns [5–10]. A complementary approach that would direct the genetic and biochemical characterization of LSGs and their products is functional inference using co-expression and promoter analysis.
Gene expression is regulated by a complex interaction of transcription factors (TFs) and their binding sites (TFBSs) on the gene promoter. Co-expression analysis is based upon the assumption that a high degree of similarity in gene expression profiles correlates with relatedness of their functions. Genes that are highly co-expressed are often regulated by common transcription factor(s), forming sub-networks of genes with a common function. As a general rule, co-regulated genes share a specific arrangement of TFBSs on their promoters. The TFBSs are often located in a specific order relative to the transcription start site (TSS) as well as in a particular orientation with respect to the promoter. For example, Kindy et al. showed that both strands of the c-myc gene are transcribed in an overlapping fashion and that transcription of the coding and non-coding strands is regulated independently. Yu and coauthors showed a strong correlation between inter-TFBS distances and their orientation with respect to each other, demonstrating that a combination of TFs rather than an individual TF is the functional unit in tissue-specific gene regulation. Others have shown that the inter-TFBS distance between functionally over-represented TFBS pairs can vary significantly from 10 to 200 bp, although it may be greater than 200 bp in some cases [16–18]. These findings provide insights into factors governing the interactions between specific TFs and document TF pairs that are predicted to act synergistically in a tissue-specific manner or at specific stages of development.
In previous work we identified 91 cattle- and cetartiodactyl-specific novel transcripts that included coding sequences and noncoding RNAs (ncRNAs). In the present work, we have inferred functions of a subset of these LSTs using co-expression analysis. In addition, we identified over-represented TFBSs and their composites in the promoters of co-expressed genes and searched existing databases and the literature for pathways and functions in which these TFs may play a synergistic role in a specific tissue or developmental stage. Using these functional inferences, we predicted sub-networks of genes that may be co-regulated with the LSGs. Our results predict that subsets of these LSGs function in glycerophospholipid/fatty acid metabolism and protein trafficking in liver and near-term placenta, and in processes involving the initiation of parturition and immune system development.
Identification of tissue-specific and time-series clusters
From the liver time-series dataset, 28 of the 49 LSTs that had tissue profiles and 4,711 known expressed genes were selected for clustering after data filtering (see Methods). Two large clusters were identified with average pairwise Pearson correlation r ≥ 0.75, and r ≥ 0.90 between any LST and transcripts encoded by known genes. The identity of the genes in these clusters overlapped, and instead of merging the clusters by lowering the correlation threshold, we selected the largest cluster containing four LSTs and 208 known transcripts (LIVR) for further analysis. These transcripts were co-expressed at seven time-points and two diets (Figure 2). Apart from liver, the genes in this cluster were expressed at higher levels in adrenal gland, cerebrum, and placentome (Additional file 1).
LIVR cluster and functional inference for LSGs
Table 1. Summary information for nine LSTs co-expressed with known genes.
Bt, Ss, Oa
Bt, Ss, Oa, Ch, Ec,
Table 2. Over-represented ordered TFBS pairs and unordered TFBS triplets in LIVR, PLAC and THYM co-expression clusters.
TFBS singles and pairs
CRSD pathway (<10-03) and
Agrin in postsynaptic differentiation;
Agrin in postsynaptic differentiation
Wnt signaling pathway;
nicotinate and nicotinamide metabolism; signal transduction
AP-2, ZF5, c-Ets
adipocytokine signaling pathway;
HIV-I Nef: negative effector of Fas
glycerolipid metabolism (with STAT family); prion pathway; mod027529
EGFR-specific transcription factor (ETF) not found in CRSD
mod003360; mod065501; mod070287
Phosphatidylinositol signaling system; N-glycan biosynthesis; ribosome; phospholipase C-epsilon pathway
Table 3. Ingenuity Pathway Analysis of gene clusters.
Significant functions (F) and canonical pathways (C)
Genes included in the function
glycerophospholipid metabolism (C)
LYPLA1, PGS1, PLCE1, PLCL2
FABP5, GLRX, GRRP1, MET, MLH1, PLCE1, PLXNB2, ASB2, KCTD11, CDK3
repair of DNA (F)
CDC5L, ERCC1, MLH1, NHEJ1, NTHL1, POLI, XRCC1
immune response of organism (F)
CD48, GATA3, MX1
development of epidermis (F)
ALDH3A2, FABP5, GJB5
Wnt/beta-catenin signaling (C)
CDH1, CSNK1G2, DKK1, TLE4
acute-phase response signaling (C)
FOS, HMOX2, PTPN11, SOD2
tissue morphology--size (F)
CDKN1C, DLX5, IGF2, STC1, PTGS2, FOS, CDH1
small molecule biochemistry--transport of amino acids and synthesis of prostaglandin (F)
SLC7A3, STX1A, COMT, PTGS2, IGF2, FOS, IGFBP7, CYP4A22, BCAT, STC1, MAN2A1, PTPN11, TFPI, SOD2
embryonic development-- proliferation and formation of embryonic tissue (F)
ESM1, MED28, PTGS2, CDH1, DKK1, FOS, HAND1
development of embryonic and trophoblast cells (F)
CDKN1C, HAND1, IGF2, PTPN11
cell cycle--entry into cell stage (F)
CDH1, CDKN1C, FOS, MAD2L1, PTPN11, SOD2
CYP4A22, IGF2, IGFBP7, PTGS2, STC1, PTPN11, COMT
cell adhesion (F)
CASK, CD151, CDH1, IGFBP7, MAD2L1, MAN2A1, PTPN11, PVRL2, TFPI
CASK, CDH1, CDKN1C, DKK1, DLX5, FADD, FOS, GATAD2A, HAND1, IGF2, MED28, MSX1, PTPN11, RP13-122B23.3, SNAPC2, SOD2, SPEN, TARBP2, THOC4, TLE4, UBTF, ZNF281
cancer--cell death of tumor cell lines (F)
CDH1, CDKN1C, DKK1, FADD, FOS, IGF2, IGFBP7, IHPK2, MAD2L1, MSX1, PTGS2, PTPN11, SOD2, UBTF
cellular growth and proliferation (F)
BTG1, CDCA7, ELF1, HMGB1, NCOR2, PCNA, PTK2, TCF12, ZFP36L2, ASXL1
cell death (F)
PCNA, TRAP1, PLA2G7, BTG1, NCOR2, HMGB1, PTK2, TCF12, ZFP36L2
gene expression--transcription and transactivation (F)
HMGB1, HMGB2, PCNA, ELF1, ASXL1, NCOR2, ZBTB7A, BTG1, PTK2, TCF12, NXF1
immune and lymphatic system development and function (F)
HMGB1, TCF12, PTK2, CDCA7, NCOR2
PLAC cluster and functional inference for LSGs
The PLAC cluster was expressed preferentially in placentome and consisted of 116 genes, including three that are LSGs, 34FL, 22JE, and 104JE (Table 1; Figure 3A). On the basis of a PSI-BLAST search and multiple sequence alignments we have annotated one of the LSTs, 34FL [GenBank: NM_001105478], as an SSLP-1 (secreted seminal vesicle protein) homolog, which belongs to a class of secreted Ly6 domain containing proteins. The predicted protein product of 34FL, like the SSLP-1 glycoprotein in mouse, has 10 cysteines and contains the conserved C-terminal CCXXXXXCN motif, indicating that it is a member of the SSLP-1 secreted Ly-6 glycoprotein subfamily (Additional file 3). In addition, 34FL was predicted by PSORTII to contain a signal peptide and to localize to the extracellular region, providing evidence that it is a secreted protein. Furthermore, the 34FL gene was located on BTA29 in an orthologous region that is syntenic with mouse SSLP-1 on MMU9.
The PLAC cluster was not found to be enriched for any single TFBS or TFBS triplets. However, we identified four TFBS composite pairs in the cluster (Table 2). The pair, STAT*Pax-2 (signal transducer and activator of transcription; paired homeobox 2), was predicted in the LSG 34FL, PAG2 (Pregnancy associated glycoprotein), and PTGS2 (COX2, prostaglandin-endoperoxide synthase 2). The motifs predicted ab initio by ANN-Spec in the cluster had significant matches to NF-κB (nuclear factor kappa B), MAZ (Myc-associated zinc finger), and Sp1 TFBSs. All three sites were predicted in the cluster at varying frequency, although none were predicted in an LSG (Additional file 2). The cluster was found to be enriched for Wnt/β-catenin signaling and acute phase response (APR) signaling pathways. Other enriched IPA functions in the PLAC cluster were transport of amino acids and synthesis of prostaglandins, adhesion, development of trophoblast cells, and lipid metabolism.
THYM cluster and functional inference for LSGs
A thymus-specific cluster (THYM) was identified, consisting of 32 genes, including two LSGs, 383NG and 21PW. Both of these are single-exon transcripts (Figure 3B) and have multiple ESTs from different libraries as evidence of transcription. 383NG is a paralog that has been duplicated in two other locations on the same chromosome. The THYM cluster was found to be enriched for v-Myb (myeloblastosis viral oncogene homolog) and KROX (also EGR, early growth response gene) TFBSs (Table 2). Three TFBS composite pairs were over-represented in the THYM cluster, of which one, Nkx2-5*CdxA, was predicted in the LSG 21PW and ASXL1 (Additional sex comb-like 1). An ab initio predicted motif matched the IRF (Interferon regulatory factor) TFBS. IRF-1 was identified in 21% of the genes in the cluster, including the LSGs 383NG and 21PW (Additional file 2). An analysis of the THYM cluster using IPA showed enrichment for genes involved in apoptosis, immune and lymphatic system development, transcription and trans-activation, and cell proliferation (Table 3).
Gene interaction network for the LIVR cluster
Functional elucidation of a novel gene is a challenging task. We have used an informatics-based strategy (Figure 1) to infer functions of a set of LSGs first found expressed in a cattle term-placenta cDNA library. This was accomplished by generating co-expression clusters (Figures 2 and 3) using LSTs as seeds to cluster other genes from two microarray datasets consisting of transcript profiles from 18 cattle tissues and from liver of animals fed two different diets at several peripartal time-points. We then identified over-represented TFBSs and their composites in the promoters of co-expressed genes, and searched existing databases and the literature for pathways and functions in which these TFs may play a role in a specific tissue or developmental stage. Yu and co-authors found that genes targeted by the same TF tend to be co-expressed, with the degree of co-expression increasing if genes share more than one TF. This finding supports our approach and gives us confidence in the sub-networks of co-regulated genes that were identified. We present below a synthesis of our results with the aim of supporting the inferred functions of LSGs in each cluster.
Evidence supporting inferred functions for LSGs in the LIVR cluster
The LIVR cluster was found to be enriched for genes in the glycerophospholipid metabolism pathway, DNA repair, transport, cell death, organ development of epidermis, and immune response functions (Table 3). These pathways and functions are also characteristic of term placenta, which was the source tissue used to create the cDNA library from which the LSTs were identified. In support of the correlated pathways and functions of genes in liver and placenta we also found that the LIVR cluster genes are expressed in placentome (Additional file 1). Glycerophospholipid metabolism plays a significant role in the onset of labor in humans, and apoptosis and immunological processes are known to represent important cellular functions in term placenta. The overlapping functions likely represent common subpopulations of cells in liver and placenta, such as macrophages and lymphocytes.
Genes in the LIVR cluster were enriched for p53 and Oct-1 TFBSs. p53 exerts a variety of regulatory effects following DNA damage. An Oct-1 TFBS has been predicted in the 39NG promoter along with a PPARγ (peroxisome proliferator-activated receptor gamma) site. PPARγ works in concert with Oct-1 to mediate transcriptional activation of GADD45 (growth arrest and DNA damage-inducible gene 45). The presence of both Oct-1 and PPARγ sites on the 39NG promoter suggests a role for the encoded protein in DNA repair processes in response to DNA damage. In addition, the protein is predicted by PSORTII to be a nuclear protein, which supports such a role. A paired TFBS composite, Srebp-1*Pax-8, was significantly over-represented in the LIVR cluster, and was predicted in two LSGs, 237NG and 266NG. It was also predicted in PLCE1, NGLY1, and TRIP10, which have known roles in fatty acid (FA) metabolism, turnover of glycoproteins, and lipid binding, respectively (see Additional file 4 for protein functions). Srebp-1 is known to regulate genes involved in the biosynthesis of fatty acids, triglycerides and phospholipids in liver and adipocytes, and has been shown to play a role in glycerophospholipid metabolism, suggesting that 237NG and 266NG are also involved in these processes. Smith and coauthors reviewed evidence showing that Pax-8 works together with Srebp-1 to target PPARγ in adipocytes and liver. Some of these LIVR genes were shown to form sub-networks that participate in glycerophospholipid metabolism, protein transport and signaling pathways in liver (Figure 5). The LSG 237NG is inferred to play a role in glycerophospholipid metabolism and cytokine signaling, and is one of the hub genes.
Table 4. Inferred biological functions of LSTs.
LIVR: Involved in glycerophospholipid/fatty acid metabolism, cell signaling and protein trafficking in epithelial cells. 39NG possibly plays a role in DNA repair processes in response to DNA damage. Responsive to differences in pre-partum plane of nutrition at time-points +1, +14 after onset of lactation (Figures
PLAC: Preferential expression in placentome; involved in immune response, acute phase and inflammatory processes. 34FL is a pre-term and term placentome-specific SSLP-1 glycoprotein, possibly involved with PAG2 and PTGS2 in the final events before parturition at the feto-maternal interface.
THYM: Preferentially expressed in thymus and may play a role in immune system development and cell-proliferation. 21PW may play a role in gene activation in fetal thymus development.
Expression of the LIVR genes was found to be affected by pre-partum diet. They were down-regulated by restricted feeding at +1 and +14 days postpartum, suggesting that the predicted functions (e.g., apoptosis, glycerophospholipid metabolism, DNA repair mechanisms, and cell signaling) are down-regulated during the early post-partum period when the animals are fed restricted diets that do not meet 100% of the estimated energy requirements during the non-lactating period. This management strategy is more successful in preparing the animal for the onset of parturition and lactation, and leads to a lower incidence of metabolic disease. Therefore, animals on a higher plane of nutrition (i.e., consuming diets to meet or exceed energy requirements) show increased inflammatory responses, apoptosis, and DNA repair, a conclusion shared by Loor and coauthors. Above, we suggested that glycerophospholipid metabolism is a common function in liver and near-term placenta in animals approaching labor and delivery. Metabolic processes in both tissues have been shown to be affected by diet in non-ruminants. For example, in pregnant mice the FA composition in the mother's diet influences the maternal liver and fetal placenta FA composition [45, 46]. These findings suggest that the LIVR genes, many of which are involved in FA-linked functions, protein transport and cell signaling, play similar diet-responsive roles in both liver and placenta of pregnant animals (Figure 5), given that nearly all (99%) of the LIVR genes, including the LSGs, are also expressed in the placenta (Additional file 1).
Evidence supporting inferred functions of LSGs in the PLAC cluster
The PLAC cluster genes were found to be preferentially up-regulated in placentome and enriched for specific processes in the placenta, e.g., transport of amino acids and synthesis of prostaglandins, trophoblast cell adhesion, lipid metabolism, transcription, and cell proliferation (Table 3). The cluster is also enriched for acute phase response (APR) genes, which function to restore homeostasis. These APR gene products are a variety of serum proteins synthesized in increased amounts in response to trauma and infection. Given that labor and delivery result in oxidative and immunological stresses, with APR and apoptotic responses in placental tissue, APR enrichment provides a snapshot of these processes in near-term placenta. The cluster is also enriched for Wnt/β-catenin signaling, which has been shown to play a central role in coordinating uterus-embryo interactions required for implantation in mouse.
The composite TFBS pair, STAT*Pax-2, was over-represented in three co-expressed genes: 34FL, PAG2, and PTGS2. The PWM for the predicted STAT binding site is common to a range of STAT proteins that are involved in the development and function of the immune system and play a role in maintaining immune tolerance and tumor surveillance. PTGS2 is a biosynthetic isoenzyme that was shown in pregnant cows and guinea-pigs to be involved in intrauterine prostaglandin (PG) synthesis, which is crucial for the initiation of parturition [49, 50]. PTGS2 expression was found to be 20-fold greater in cattle term placentomes (delivery at 260 days or later) compared with preterm placentomes (delivery between day 174 and day 260 of gestation), further supporting its role in parturition. Given that our data show that 34FL (a predicted SSLP-1 glycoprotein), PAG2, and PTGS2 are highly co-expressed and predicted to be regulated by STAT TFs, we suggest that 34FL also plays a role in pregnancy and/or parturition.
The ANN-Spec motifs predicted ab initio in the PLAC cluster have significant matches to TFBSs for NF-κB (nuclear factor kappa B), MAZ (Myc-associated zinc finger), and Sp1 (Additional file 2). NF-κB is known to initiate transcription for a variety of genes that are involved in immune response, acute phase and inflammatory processes. It has been located in human fetal membranes and decidua at term and pre-term delivery. The physiological expression of COX-2 (PTGS2) in rat trophoblast involves a sustained activation of NF-κB, and its inhibition abrogates the inducibility of PTGS2. This result functionally links NF-κB and PTGS2 with the other co-expressed genes in the PLAC cluster, suggesting a complex role for glycoproteins including 34FL in initiating and orchestrating the cell biology at the feto-maternal interface before parturition (Table 4).
Evidence supporting functional inference for LSGs in the THYM cluster
The thymus is an immune system organ that is of central importance to the maturation of T lymphocytes. Genes in the THYM cluster are enriched for the related functions of immune and lymphatic system development, cell death, and cellular growth and proliferation (Table 3). The v-Myb TFBS was over-represented in the THYM cluster and predicted in LSG 383NG. The v-Myb oncogene product causes late-onset T cell lymphomas when expressed in the T cell lineage of transgenic mice, suggesting a role for LSG 383NG in cell proliferation. The TFBS composite pair Nkx2-5*CdxA was over-represented in promoters of 21PW and ASXL1 (additional sex comb-like 1). CdxA and Nkx2 have been shown to be markers for endoderm germ layer patterning during gastrulation, a process necessary for formation of the thymus. The AsxL1 gene in Drosophila is required to maintain homeotic gene activation and silencing, and its homologs have been identified in mouse and found to be expressed in thymus. The roles played by the TFs CdxA and Nkx2 in endoderm germ layer patterning, and that of ASXL1 in homeotic gene activation and silencing, support a role for the LSG 21PW in thymus development. Furthermore, the IRF-1 TFBS, which regulates IL-15 gene expression and influences the development of T-cells and natural killer cells in the thymus, is predicted in LSGs 383NG and 21PW (Additional file 2). Taken together, these findings implicate 383NG and 21PW in immune system development and cell proliferation (Table 4).
We selected the placenta as a model system to identify and functionally characterize novel LSGs because of its unique characteristics as a rapidly evolving physiological system in mammals. As we and others have shown, the placenta is a rich source of expressed LSGs and rapidly diverging genes [2, 3, 57–60]. Such genes are candidates for adaptive placental functions acquired by the ruminant lineage. We used a combination of cluster analysis, promoter analysis, WGCNA, and gene annotation to predict the functions of nine previously uncharacterized LSGs (Table 4) from a starting set of 49 (18%). The stringent analysis criteria produced unique and highly correlated gene expression clusters among 18 different tissues and across seven time-points and two diets in liver (Figures 2 and 3). The three clusters analyzed contained nine LSTs, seven of which are encoded by presumptive novel protein-encoding LSGs and two of which are presumptive ncRNAs. Our results represent a major advance in characterizing the novel LSTs expressed in bovine placenta and have yielded predictions of functions that are consistent with their putative role in ruminant reproductive and immune physiology.
As additional animal genomes are sequenced and the numbers of novel genes with unknown functions increases, our approach establishes a valuable precedent for future studies. We show that it is possible to identify and characterize a significant fraction of lineage-specific genes bioinformatically, which may guide hypothesis-driven experiments to determine their biochemical and cellular functions. These may in turn yield new insights into the role of LSGs in speciation and adaptive evolution.
In a previous study, 91 novel transcripts were identified in a cattle placenta cDNA library. These LSTs were characterized on the basis of their genomic distribution and annotation in Btau_2.0 and expression patterns in 18 cattle tissues. For the present work, the annotation was updated to Btau_3.1 (December 2007). Of the original 91 LSTs, 63 currently have no matches to non-Cetartiodactyl sequences in public databases. The remaining 28 transcripts were not considered in this study as they were re-annotated as representing divergent homologs.
Two cDNA microarray expression datasets profiling ~7,000 cattle genes were used. The cDNAs used for the array were selected from a near-term cattle placenta cDNA library. The first dataset [GEO: GSE3029] was obtained by profiling total RNA from 18 cattle tissues. For this dataset, transcripts were included in the analysis if the intensity was above the median signal intensity of negative control spots present on the array, and in addition, the minimum intensity was 250 units in at least one sample-point. The second dataset [GEO: GSE3331] was generated by temporal gene expression profiling of liver RNA during the peripartal period in Holstein cows fed either a moderate-energy diet ad libitum or a restricted diet in which the animals were fed to consume ca. 80% of their calculated energy requirements from -65 days until parturition. The temporal data spanned -65 to +49 days relative to parturition for animals receiving each diet. Expression levels of the transcripts were analyzed further if the intensity was above the median signal intensity of negative control spots present on the array, the minimum raw intensity was 150 in at least one sample-point, and the relative expression compared to the control was statistically significant in at least one sample-point with a raw P-value (P) < 0.05. For both datasets, only those intensity spots that were flagged as 'present' were included in the analysis.
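The intensity filter described above can be illustrated with a minimal Python sketch. The function name and arguments are hypothetical, and the sketch assumes one reading of the criteria: a transcript is kept if its highest intensity across sample-points both exceeds the median of the negative control spots and reaches the stated minimum (250 for the tissue dataset, 150 for the liver dataset). The 'present' flag and the P-value criterion are not modelled.

```python
from statistics import median


def passes_filter(intensities, negative_controls, min_intensity=250.0):
    """Return True if a transcript passes the two intensity criteria.

    Assumption (not stated explicitly in the text): both criteria are
    applied to the transcript's highest intensity across sample-points.
    """
    background = median(negative_controls)  # median of negative control spots
    peak = max(intensities)                 # best sample-point for this transcript
    return peak > background and peak >= min_intensity
```

For the liver time-series dataset the same function would be called with `min_intensity=150.0`.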
Tissue expression profile clustering
Among the 63 LSTs, 49 were present in at least one of the 18 tissues with a raw intensity of at least 250 (Additional file 5). In addition to these 49 LSTs, expression levels of 6,178 transcripts passed this filter. The LSTs were clustered using a Pearson correlation (r) threshold of 0.90. A representative was selected from each cluster and the un-clustered LSTs were self-represented. Genes on the array that were co-expressed with each of these LSTs at r ≥ 0.90 were grouped into clusters that included all co-expressed LSTs. Each cluster was then adjusted to bring the average cluster r to ≥ 0.75.
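The seed-based step can be sketched as follows. This is an illustrative reimplementation in Python, not the software used in the study; `pearson` and `seed_cluster` are hypothetical names, and expression profiles are assumed to be equal-length lists of values across the 18 tissues.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    if sxx == 0 or syy == 0:
        return 0.0  # a flat profile carries no co-expression signal
    return sxy / sqrt(sxx * syy)


def seed_cluster(seed_profile, profiles, threshold=0.90):
    """Names of genes co-expressed with the seed LST at r >= threshold."""
    return sorted(
        name
        for name, profile in profiles.items()
        if pearson(seed_profile, profile) >= threshold
    )
```

With a seed profile `[1, 2, 3, 4]`, a gene with profile `[2, 4, 6, 8]` is retained, whereas a gene with the reversed profile `[4, 3, 2, 1]` (r = -1) is not.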
Clustering of temporal liver gene expression profile
Of the 49 LSTs, 28 were present in at least one liver sample with a raw intensity of 150 and significantly expressed at P < 0.05 compared to a control mixture of tissues excluding liver. Expression of 4,711 unique genes passed these filter conditions. The temporal profiles of the LSTs were clustered hierarchically using gene-condition clustering as implemented in GeneSpring. The liver gene expression profiles of representative LSTs were used as seeds to identify co-expressed genes from the 4,711 genes on the array using Pearson correlation (r) at a threshold r ≥ 0.90. As before, the clusters were adjusted to bring the average cluster r ≥ 0.75.
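The text does not specify how clusters were adjusted to reach an average r ≥ 0.75. One plausible procedure, shown here purely as an assumption, is to remove the least well-correlated non-seed member repeatedly until the mean pairwise correlation reaches the target. All function names are hypothetical.

```python
from itertools import combinations
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return 0.0 if sxx == 0 or syy == 0 else sxy / sqrt(sxx * syy)


def mean_pairwise_r(names, profiles):
    """Average Pearson r over all pairs of cluster members."""
    pairs = list(combinations(names, 2))
    if not pairs:
        return 1.0
    return sum(pearson(profiles[a], profiles[b]) for a, b in pairs) / len(pairs)


def trim_cluster(names, profiles, protected=(), target=0.75):
    """Greedy trimming (an assumed procedure, not the authors' stated method)."""
    members = list(names)
    while len(members) > 2 and mean_pairwise_r(members, profiles) < target:
        candidates = [m for m in members if m not in protected]
        if not candidates:
            break

        def mean_r_to_rest(m):
            others = [o for o in members if o != m]
            return sum(pearson(profiles[m], profiles[o]) for o in others) / len(others)

        # Drop the member that correlates worst with the rest of the cluster.
        members.remove(min(candidates, key=mean_r_to_rest))
    return members
```

Passing the seed LSTs as `protected` keeps them in the cluster regardless of their individual correlations, which mirrors the use of LSTs as seeds.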
A mixed effects model using the SAS procedure Proc MIXED (SAS Institute, Cary, North Carolina, USA) was used on the 212 unique genes in the liver cluster to determine expression differences between groups of animals on two diets (moderate energy ad-libitum and restricted) at different time points (-65, -30, -14, +1, +14, +28, +49 days). The log2-transformed ratios were analyzed for each gene using a mixed model that included the fixed effect of diet within time point. P-values for the models were adjusted for multiple comparisons using the Benjamini-Hochberg false discovery rate (FDR) correction.
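For readers unfamiliar with the Benjamini-Hochberg correction, the adjustment can be written in a few lines of Python. This is a generic sketch of the standard step-up procedure, not the SAS code used in the study; the function name is hypothetical.

```python
def benjamini_hochberg(pvalues):
    """Return Benjamini-Hochberg adjusted P-values in the input order."""
    m = len(pvalues)
    order = sorted(range(m), key=lambda i: pvalues[i])  # indices by ascending P
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest P-value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, pvalues[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

For raw P-values 0.01, 0.04, 0.03 and 0.20, the adjusted values are 0.04, 0.0533, 0.0533 and 0.20 (to three significant figures).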
Functional annotation and assignment of genome coordinates of genes in clusters
Functions, gene symbols and genome coordinates were assigned to each clone accession on the array using RefSeq (Btau_3.1) and human protein annotations in UCSC genome browser tables. Manual curation of the clusters involved removing identical genes and using the UCSC browser to check whether each gene was annotated with the correct gene symbol and genome coordinates. This manual inspection was crucial for ensuring that the transcription start site of each gene, and hence its promoter region, was correctly identified.
Mammalian regulatory elements are concentrated near transcription start sites (TSS). For this reason, promoter analysis was concentrated on the proximal promoter region, -1000 to +100 bp relative to the TSS. Both unmasked and repeat-masked promoter sequences [-1000, +100] were extracted for gene clusters from Btau_3.1 using the UCSC Genome Table browser. To identify TFBSs that are over-represented in the gene clusters, we used promoters of unique cattle RefSeq genes as the background set. The coordinates for a non-redundant set of Btau_3.1 RefSeq genes were downloaded from UCSC Genome Table Browser, and the proximal promoters [-1000, +100] were extracted as described for the clusters.
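The promoter window depends on the strand of the gene. The following Python sketch shows one way to compute the genomic interval for [-1000, +100]; it is illustrative only. It assumes 1-based inclusive coordinates, that the TSS itself is position +1 with no position 0, and that the window is clipped at the start of the chromosome. The function name is hypothetical.

```python
def promoter_window(tss, strand, upstream=1000, downstream=100):
    """Return (start, end), 1-based inclusive, for the proximal promoter.

    Assumptions: the TSS is position +1 (there is no position 0), so the
    window spans `upstream + downstream` bases before clipping.
    """
    if strand == "+":
        start, end = tss - upstream, tss + downstream - 1
    elif strand == "-":
        # On the minus strand, upstream lies at higher genomic coordinates.
        start, end = tss - downstream + 1, tss + upstream
    else:
        raise ValueError("strand must be '+' or '-'")
    return max(1, start), end
```

For a TSS at position 5,000, the window is 4,000 to 5,099 on the plus strand and 4,901 to 6,000 on the minus strand, 1,100 bp in both cases.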
Identification of transcription factor binding site (TFBS)
Vertebrate-specific TFBSs were predicted by scanning the repeat-masked promoters using the Match program with a core similarity threshold of 0.9 and a matrix similarity threshold of 0.85. The promoters in each cluster were searched against a predefined matrix profile in the TRANSFAC Professional 11.4 database. This database contained a set of 214 high-quality, vertebrate-specific, non-redundant position weight matrices (PWMs) with minimized false positives ("vertebrate_non_redundant_minFP" with high-quality matrices selected). Only a single occurrence of a TFBS was counted in each promoter, and the predicted TFBS counts in each cluster were compared to those in the cattle RefSeq promoter set using Fisher's exact test (FET), which is based on a hypergeometric distribution. The computed P-values were adjusted for multiple comparisons using the Benjamini-Hochberg FDR correction.
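The over-representation test can be illustrated with a short Python sketch of a one-sided Fisher's exact test computed from the hypergeometric distribution. The text does not state whether a one- or two-sided test was used, so the one-sided form is an assumption, as is treating the cluster and background promoter sets as disjoint. The function name is hypothetical.

```python
from math import comb


def fisher_enrichment_p(hits_cluster, n_cluster, hits_background, n_background):
    """One-sided P-value that a TFBS is over-represented in the cluster.

    Each promoter is counted once (site present or absent). The cluster and
    background sets are assumed not to overlap.
    """
    total = n_cluster + n_background          # all promoters
    with_site = hits_cluster + hits_background  # promoters carrying the site
    upper = min(with_site, n_cluster)
    # P(X >= hits_cluster) under the hypergeometric distribution.
    numerator = sum(
        comb(with_site, i) * comb(total - with_site, n_cluster - i)
        for i in range(hits_cluster, upper + 1)
    )
    return numerator / comb(total, n_cluster)
```

As a toy example, a site present in 8 of 10 cluster promoters but only 10 of 90 background promoters gives a P-value below 0.0001. The resulting P-values for all PWMs would then be FDR-adjusted.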
Identification of over-represented co-occurring TFBS combinations
Two TFBSs were defined as co-occurring if they were distinct, non-overlapping, and their PWMs met the core similarity threshold of 0.9 and the matrix similarity threshold of 0.85 in the output from the Match program. The Match output lists the PWM matches in their positional order on the promoter. Both ordered (A-B ≠ B-A relative to the TSS) and unordered (A-B = B-A) TFBS pairs and triplets were predicted separately, and the orientation of the TFBSs was ignored. To prevent double-counting, only a single occurrence of a combination was counted per gene. For the unordered combinations, non-redundancy was ensured by collapsing each identified combination into its sorted order (A-B-C, A-C-B, or B-A-C = A,B,C), and then counting only a unique occurrence of the unordered combination within a gene promoter. Unordered TFBS composites were denoted with a comma separating the sites. Composite ordered TFBSs were denoted with an asterisk between the sites, indicating that they were predicted to be co-occurring in that order in the promoter relative to the TSS. TFBS pairs and triplets were predicted for three different minimum threshold distances of 20 bp, 50 bp, and 100 bp between the TFBSs to identify all adjacent non-overlapping TFBS combinations. The maximum allowed inter-TFBS distance was set to 250 bp.
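The pair-finding rules can be sketched in Python as follows. This is an illustrative reimplementation, not the scripts used in the study. The function name is hypothetical, each hit is assumed to be a (name, start, end) tuple in promoter coordinates, and the inter-TFBS distance is assumed to be measured from the end of the upstream site to the start of the downstream site.

```python
from itertools import combinations


def tfbs_pairs(hits, min_gap=20, max_gap=250, ordered=True):
    """Return the set of co-occurring TFBS pairs in one promoter.

    A set is returned so that each combination is counted once per gene.
    """
    hits = sorted(hits, key=lambda h: h[1])  # positional order on the promoter
    found = set()
    for (name_a, _, end_a), (name_b, start_b, _) in combinations(hits, 2):
        if name_a == name_b:
            continue  # the two sites must be distinct
        gap = start_b - end_a  # negative gap means the sites overlap
        if gap < min_gap or gap > max_gap:
            continue
        if ordered:
            found.add((name_a, name_b))  # A*B: A lies upstream of B
        else:
            found.add(tuple(sorted((name_a, name_b))))  # A,B in sorted order
    return found
```

Running the function with `min_gap` set to 20, 50 and 100 reproduces the three distance thresholds. Triplets can be obtained in the same way by iterating over `combinations(hits, 3)` and checking both consecutive gaps.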
The counts of the ordered pair and triplet TFBSs were computed for each cluster of genes and compared to the counts of the respective pairs and triplets in the background RefSeq promoters using Fisher's exact test. A minimum cell count of five was necessary for comparisons. The computed P-values were adjusted for multiple comparisons using the FDR correction as before, and comparisons were deemed significant if the adjusted P-value was ≤ 0.1. The entire analysis was carried out for both repeat-masked and unmasked promoters, and significant predictions in the repeat-masked promoters also had to be predicted in the unmasked cluster promoters to be selected. This precautionary measure ensured that no predictions were within repeats. TFBS prediction results were manually checked for the presence of over-represented composites.
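The selection criteria above can be summarized in a short Python sketch. It is illustrative only and assumes that "minimum cell count of five" applies to every cell of the 2x2 contingency table and that adjusted P-values have already been computed. The function name `select_composites` and the input layout are hypothetical.

```python
def select_composites(masked, unmasked_names, min_count=5, alpha=0.1):
    """Filter composite TFBS predictions from repeat-masked promoters.

    masked: dict mapping composite name -> (table, adjusted_p), where
            table holds the four cell counts of the 2x2 table (assumed layout).
    unmasked_names: set of composites also predicted in unmasked promoters.
    """
    selected = []
    for name, (table, adj_p) in masked.items():
        if min(table) < min_count:
            continue  # too few observations for a comparison
        if adj_p <= alpha and name in unmasked_names:
            selected.append(name)  # significant and confirmed in unmasked set
    return sorted(selected)
```

A composite therefore survives only if it passes all three checks: sufficient counts, an adjusted P-value of at most 0.1, and a matching prediction in the unmasked promoters.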
Ab initio motif prediction and comparison to known TFBSs
ANN-Spec was used for ab initio prediction of motifs that were common to an entire cluster. For each cluster of genes, the motif predictions were made on unmasked promoters, using as background the unmasked Btau_3.1 RefSeq promoters from which the cluster genes had been subtracted. ANN-Spec was run iteratively by varying the predicted motif length from 6 to 16 bp and setting the run cycle (parameter m) to 100. The PWMs of predicted motifs were parsed from the ANN-Spec output. Tomtom was used to compare the predicted PWMs with TRANSFAC v11.4 PWMs, and comparisons with P < 0.01 were deemed significant. Logos depicting the frequency of each nucleotide at each position of a motif were generated for the ANN-Spec-predicted PWMs and the corresponding matching TRANSFAC PWMs using the enoLOGOS web server.
Functional classification of clusters
Ingenuity Pathway Analysis (version 5.5) was used to identify functional enrichment in the clusters, using the respective source gene sets (6,149 genes in tissue experiments, 4,711 in liver time-series experiments) as reference. The Ingenuity Pathway Knowledge Base (IPKB) was used as the source database for biological function and pathway assignment to genes. The significance threshold for function and pathway enrichment was P ≤ 0.05.
To identify known pathways in which the TFs were involved, we queried the CRSD, which consists of miRNA, TF and gene expression regulatory signatures assigned to specific BioCarta and KEGG pathways using genome-wide enrichment analysis. A Perl script was written that accepted a TFBS composite and parsed the dataset for the co-occurrence of the TFs in the composite in a common pathway at P < 10⁻³. In addition, we used the Predicted Regulatory Module (PREMOD) database to identify any known modules within our set of TFBS composites.
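The pathway lookup described above was performed with a Perl script that is not shown here. The following Python sketch only illustrates the logic under stated assumptions. It assumes the regulatory signatures have been loaded into a dictionary mapping each pathway to the enrichment P-values of its TFs, which is a hypothetical layout and not the CRSD file format. The function name `shared_pathways` is also hypothetical.

```python
def shared_pathways(composite, signatures, p_cutoff=1e-3):
    """Return pathways in which every TF of a composite is enriched.

    composite: string such as "SP1*AP2" (ordered) or "AP2,SP1" (unordered).
    signatures: dict pathway -> {tf_name: p_value} (assumed data layout).
    """
    # Accept both the asterisk and the comma notation for composites.
    tfs = [tf for tf in composite.replace("*", ",").split(",") if tf]
    return sorted(
        pathway
        for pathway, tf_pvalues in signatures.items()
        # A TF missing from a pathway is treated as not enriched (P = 1).
        if all(tf_pvalues.get(tf, 1.0) < p_cutoff for tf in tfs)
    )
```

A composite is reported for a pathway only when all of its TFs pass the cutoff in that same pathway, which matches the co-occurrence requirement in the text.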
Identification of genes with the highest connectivity using WGCNA
Following the standard WGCNA formulation, the adjacency in the weighted co-expression network was defined as

$$a_{ij} = |\mathrm{cor}(x_i, x_j)|^{\beta}$$

where $a_{ij}$ is the adjacency between two genes $i$ and $j$, $x_i$ is the expression of gene $i$, and β is the power factor for a scale-free network. For the LIVR cluster, this power was 8 as determined by the scale-free network criterion provided by the authors of the method. Default parameters were used for module generation. Gene connectivity was determined, and the top five genes with the highest connectivity (hub genes) were identified using 1.2 as a cutoff for gene significance and an intramodular connectivity (K/Kmax) cutoff of 0.95.
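The connectivity calculation can be illustrated with a small Python sketch. It does not use the WGCNA software and is not the code used in the study. It assumes Pearson correlation, an unsigned adjacency (absolute correlation raised to the power β), and that all genes passed in belong to a single module. The function names `pearson` and `scaled_connectivity` are hypothetical.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = sqrt(sum((a - mx) ** 2 for a in x))
    sy = sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)


def scaled_connectivity(expr, beta=8):
    """Scaled connectivity K/Kmax for each gene in one module.

    expr: dict gene -> list of expression values (assumed input format).
    Adjacency is |cor|**beta (unsigned network, an assumption).
    """
    genes = list(expr)
    k = {}
    for gi in genes:
        # Connectivity is the sum of adjacencies to all other genes.
        k[gi] = sum(abs(pearson(expr[gi], expr[gj])) ** beta
                    for gj in genes if gj != gi)
    kmax = max(k.values())
    return {g: v / kmax for g, v in k.items()}
```

Genes whose scaled connectivity is at least 0.95 would be the hub candidates under the cutoff given above. The gene significance filter is a separate step that is not shown.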
Network inference using GeneGo
GeneGo MetaCore was used to identify known interactions in the LIVR cluster of genes, modeled on the human interaction database included in GeneGo.
List of Abbreviations used
TFBS: transcription factor binding site
PWM: position weight matrix
LIVR: cluster of genes expressed in liver and showing effect of diet
cluster of genes preferentially expressed in cattle placenta
cluster of genes preferentially expressed in cattle thymus
cluster of genes preferentially expressed in cattle skin
cluster of genes preferentially expressed in cattle adrenal gland, thalamus, and cerebellum
FET: Fisher's exact test
Benjamini-Hochberg false discovery rate.
We would like to thank Prof. Sheng Zhong, Department of Bioengineering, University of Illinois at Urbana-Champaign, for providing critical advice on gene clustering using the seeding approach and Dr. Denis Larkin for his helpful comments on the manuscript. We would also like to thank the UCSC genome bioinformatics staff for their help at various times with data download using the genome browser tables.
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/2.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited.