Reconstruction of composite regulator-target splicing networks from high-throughput transcriptome data
- Panagiotis Papasaikas1, 2, 3Email author,
- Arvind Rao†4, 5,
- Peter Huggins†4, 6,
- Juan Valcarcel2, 3, 7 and
- A Javier Lopez1, 4
© Papasaikas et al.; 2015
Published: 2 October 2015
We present a computational framework tailored for the modeling of the complex, dynamic relationships that are encountered in splicing regulation. The starting point is whole-genome transcriptomic data from high-throughput array or sequencing methods that are used to quantify gene expression and alternative splicing across multiple contexts. This information is used as input for state of the art methods for Graphical Model Selection in order to recover the structure of a composite network that simultaneously models exon co-regulation and their cognate regulators. Community structure detection and social network analysis methods are used to identify distinct modules and key actors within the network. As a proof of concept for our framework we studied the splicing regulatory network for Drosophila development using the publicly available modENCODE data. The final model offers a comprehensive view of the splicing circuitry that underlies fly development. Identified modules are associated with major developmental hallmarks including maternally loaded RNAs, onset of zygotic gene expression, transitions between life stages and sex differentiation. Within-module key actors include well-known developmental-specific splicing regulators from the literature while additional factors previously unassociated with developmental-specific splicing are also highlighted. Finally we analyze an extensive battery of Splicing Factor knock-down transcriptome data and demonstrate that our approach captures true regulatory relationships.
Alternative splicing is a widespread regulatory mechanism whose correct function is important for normal development and cellular behavior, and whose dysfunction plays an increasingly prominent role in the causation and modulation of human disease . Basic splicing processes are fairly well understood, and progress has been made in defining several of the mechanisms and factors involved in alternative splicing regulation .
A large part of our current view of alternative splicing regulation is based on studies of single splicing events. These efforts have substantially advanced our understanding of the process, but they are often limited by their narrow scope and generalizing the conclusions is not straightforward. At the other end of the spectrum, high-throughput approaches (genome-wide knock downs, CLIP-related techniques) can provide more general information on splicing regulation; however, practical considerations typically constrain their focus to small groups of factors. Furthermore, computational approaches to solve "splicing codes" (e.g [3, 4]) are mainly focused on genome-wide prediction of cis-regulatory elements. Validation of results and inferences from these approaches still requires slow and expensive low-throughput approaches.
Another limitation of current approaches is their static focus (e.g. the patterns observed in the presence or absence of a given factor in a given cell line or developmental stage). The dynamic nature of splicing regulation in different contexts remains a challenge, but is also a rich source of information for elucidating regulatory mechanisms. High-throughput analyses are now generating copious data on stage-and tissue-specific transcriptome composition but the problem remains of how to exploit and build upon such data to arrive efficiently at comprehensive regulatory models.
To address these challenges, we propose an approach based on the application of network reconstruction and analysis methods to high-throughput transcriptome data. Our approach derives a heterogeneous graphical model that simultaneously captures co-regulated groups of exons as well as their putative cognate regulators. This composite representation can be valuable for highlighting and assigning function to previously uncharacterized components of the system and for pinpointing novel regulatory relationships, while analysis of its topology can reveal organizational features such as central players that are critical for the flow of regulatory information or modules of densely interconnected units that correspond to meaningful functional groupings. As a proof of concept this strategy is applied to the Drosophila developmental data generated from the modENCODE project . Detailed analysis of module and central component characteristics in the final model and the overlay with orthogonal annotation and functional data demonstrate the merits of our approach.
The splicing machinery, mechanisms, and known regulatory factors are well conserved between Drosophila and mammals, and the frequency and complexity of alternative splicing are comparable. Thus, the strategies developed in this study and the biological insights are transferable to more complex mammalian systems.
Results and discussion
Quantification of alternative splicing during Drosophila development from modENCODE tiling-array and RNAseq data
Alternative Splicing was quantified for 28 time points during fly development using the publicly available modENCODE transcriptome data [5, 6]. The time points correspond to 12 embryonic stages taken at 2h intervals, 6 larval stages, 6 pupal stages, day 1 and 5 adult males, and day 1 and 5 adult females. First, individual exon Percent Spliced In (PSI, ) indices were calculated for all internal exons using the tiling array data for each developmental time point. Estimation is based on normalizing the average array signal over the exon, over robust estimates for total gene expression (see Methods for details). Out of the total 31,342 strictly internal exons, 2,855 exhibited appreciable PSI fluctuations during development (coefficient of variation ).
Next, we obtained independent estimates of exonic PSIs by reanalyzing the RNAseq data for the same timepoints (~3.5 billion 76nt reads, ~125 million reads per condition on average). RNAseq-based PSIs were calculated as the ratio of inclusion-supporting junction reads over the sum of inclusion and skipping junction reads (see also Methods). We constrained our analysis to exons with sufficient junction coverage in the dataset (≥10 inclusion plus skipping junction reads for at least half of the developmental timepoints), resulting in estimates for <50% of the exons quantified using the first type of analysis. Of those, 1,277 appeared variable in development using the same criterion as before.
Comparison of the PSI values from the two types of analysis shows reasonably good correlation across developmental stages suggesting our methods yield reliable estimates of exon usage (Additional File 1: Figure S1). For the purposes of this study we decided to use the more populous dataset of fluctuating exons and the corresponding PSI-values coming from the tiling array based analysis. The PSIs of all development-variable exons were then centered and scaled. The resulting values reflect changes over the basal (average) developmental inclusion at each time point. Standardization of the events also ensures that only the shape and not the magnitude of the fluctuations is important during network reconstruction. The developmental profiles of both the raw and standardized PSIs are shown in Additional File 1: Figure S2.
In a next step we analyzed 421 RNA-Binding Protein (RBP) genes implicated in RNA processing according to Flybase annotation. Among these, we identified 349 genes with variable expression levels across development (cv >0.2). The standardized expression values of these genes were used along with the standardized PSIs to compute a single sample covariance matrix. In the case of highly correlated entries coming from the same transcription unit only one of them was selected, based on the L1-norm of its correlation vector, in order to minimize redundancies in the final network (see Methods). This dataset was used as input in order to reconstruct a composite network that captures both developmental splicing profiles and potential regulatory relationships between exon targets and RBPs.
Reconstruction of a composite splicing regulatory network for fly development
The time series partial correlations of exon inclusion levels as well as RBP expression formed the basis for deriving a splicing regulatory network for Drosophila development. Our strategy for network reconstruction was based on the graphical lasso (glasso) algorithm  for sparse network estimation and is described in detail in Methods. Briefly, the composite sample covariance matrix described above was passed on as input to glasso for estimation of a sparse inverse covariance matrix, which directly corresponds to a unique undirected graph (see also Methods). The distinction between the two types of network nodes (fluctuating exons as targets and RBPs as putative regulators) was incorporated naturally as prior information in glasso by specifying a lower regularization parameter for regulator-to-exon edges and a higher one for exon-to-exon edges (see also Methods). This strategy encourages connections between potential regulators and alternative exons during network reconstruction and attempts to at least partially explain exon variability during development as a consequence of regulator variability.
Identified exon modules display distinct functional profiles
Top Enriched Go Terms(pval<1e-3)
Interacting Genes (pval<1e-3)
Top Targets (non-RBPs)
instal larval or pupal morphogenesis, metamorphosis, post-embryonic morphogenesis
DP, hp, Rbf, slpr, Myd88
Fib:2, CG6995:19, eIF-4a:6, sm:3, CG17838
CG17273:3, CSN3:4, CG9797:2, Abi:4, Akap200:5
organ development, instar larval or pupal development, post-embryonic morphogenesis
fz2, E2f, hop, hep, dco
Rbp2:3, Patr-1:3, dom:4, pum:3 CG7185:6
Trf2:4, CG3662:4, CG2943:7, Arf51F:4, CG2947:5
regulation of cellular process, organ development, generation of neurons
Su(H), Notch, mew, E2f, mys
(2)d:5 gw:52, CG30122:7 CG4119:3 Sxl:15
CG8116:2, CG42551:3, CG9132:2, Spt6:3, Gprk1:12
organ development, locomotion, neuron di_erentiation
Pc, mew, hop, ph-d, w
CG11266:4 su(w[a]):6 Dcp2:4, Zn72D:4, Psi:3
RhoGAP71E:5, CG4699:10, prod:2, Abl:6, l(1)G0232:6
organ morphogenesis, sensory organ development, epithelium development
rux, Pvf1, sli, Egfr, CycB
me31B:3, msi:6, CG11133, spn-E, xmas-2
CG12424:5, CG16791:13, tutl:6, LanB2:8, CG3907:3
regulation of primary metabolic process, regulation of biosynthetic process, regulation of gene expression
trx, w, Pc, arm, mor
CG6995:14 CG1646:10 Psi:5, brm:5, qkr58E-2:4
CG1244:10, CG42245:5, Syb:2 slmb:4, CG42614:12
imaginal disc development, wing disc morphogenesis, post-embryonic appendage morphogenesis
wupA, fz2, NiPp1, Tm2, Prm
CG34362:8, CG13124:10, shot:6, CG3209:4 CG8547:5
stv:8, CG15105:6, CG15105:9, CG34398:13, CG13188:6
Smg6:5, Chd1:2, snRNA:U5:38ABb, snRNA:U5:14B, snRNA:U5:23D
gammaCop:6, Nat1:2 Cip4:8, sec24:4, Utx:11
sarcomere, myo_bril, tissue development
Mhc, Dp, arm, l(2)gl, Rbf
Sxl:14, AGO1:6, Nop56:2, CG6209, SF2:3
CG42783:7, lqf:12, Zasp:10, CG42446:4, CG7188:2
Rho1, Pvf1, Taf1, Egfr, Ras85D
CG13124:7, CG4266:2, CG3558:4
CG15609:7, CG8677:4, CG2991:5, RhoGEF2:21, Pen:4
anatomical structure morphogenesis, system development, protein-binding, multicellular organismal development
Sb, Rho1, Pc, mew, baz
CG30122:7, (2)d:5, pUf68:5, CG4119:3, gw:52
stv:8, psq:17, CG8116:2, l(1)G0232:6, trx:17
Identification of splicing regulatory modules using community structure detection
Network modules or communities can be defined loosely as sets of nodes with a more dense connection pattern among their members than between their members and the remainder of the network [9, 10]. Module detection in real-world graphs is of considerable practical interest as they often capture meaningful functional groupings [9–11].
Here, we identify modules in the developmental splicing network using the greedy community detection algorithm  (see also Methods). The goal is to identify groups of exons that exhibit similar splicing regulatory patterns in the course of development and/or are under the control of the same regulatory circuitry. This analysis reveals a network community structure with a modularity value of 0.42, which indicates strong natural divisions in the network . Both the identified network community structure as well as a simplified network topology illustrating the relationships between the largest modules are shown in Figure 1. RBP and exon components of these modules are listed in Additional File 4.
Modules 1,2 and 3 show similar peaks during pupal stages, although modules 1 and 3 present more complex splicing dynamics. All are enriched for terms related to instar larva or pupal morphogenesis and development. In addition modules 1 and 2 are both significantly enriched for interactions with genes required for dorsal closure (hep, slpr) as well as components of the DREAM complex (E2f, Dp, Rbf) which regulates transcription of developmentally controlled targets, whereas module 3 is specifically GO-enriched for neuron generation and for interactions with components of the Notch pathway (Notch, Su-H), which regulates neurogenesis in the central nervous system and the imaginal disks. Modules 4 and 6 present inflections during late pupal and mid-to-late embryonic stages and are both enriched for interactions with polycomb group (PcG) components (Pc, ph-d). Module 4 is nevertheless distinct in that it also shows a pronounced switch during the L2 to L3 larval transition and is highly enriched for terms related to locomotion and neuron differentiation. On the other hand, module 6 has additional interactions with genes (trx, mor) of the PcG-antagonisitic trithorax group (trxG) and is almost exclusively enriched for GO-terms related to metabolic processes.
The densely connected modules 5 and 7 show profiles characteristic of maternally loaded transcripts, with larger peaks at the beginning of embryonic development and less pronounced ones in the 5-day female. However, module 5 dynamics suggest rapid degradation prior to cellularization (2-3h post-fertilization) while module 7 isoforms persist past the maternal-to-zygotic transition (completed ~3h post-fertilization). In addition, module 7 is mostly enriched for GO-terms related to imaginal, wing-disc and appendage morphogenesis and for interactions with genes involved in muscle filament generation and contraction (wupA, Tm-2, Prm).
Modules 8 and 10 are exceptional in that they show no clear GO-term enrichment despite the fact that they exhibit simple splicing dynamics. These modules mostly contain peripheral, loosely connected nodes suggesting an ancillary role in development. Finally module 9 shows a distinctive male-specific profile with a switch immediately prior to metamorphosis. It is enriched for biological functions related to anatomical structure development and for components of the sarcomere and contractile fibers suggesting a role in the sexually-dimorphic fly musculature (, see next section).
Inference of global and within-module splicing regulators using centrality measures
Centrality measures are structural attributes of network nodes that express the importance of their location within the network topography and/or how they influence network information flow . These measures are typically used in the analysis of Social Networks but they have also been applied in studies of biological networks (see e.g [11, 16–19]. Here we employed three different measures of centrality, namely Closeness , Betweenness  and PageRank  that capture different structural properties of the network nodes.
The closeness measure is based on geodesic distance between a node and all other nodes in the network. A node that is closeness-central can quickly interact with all the others, not only its neighbors [22, 15]. Betweenness measures how often a node participates in the shortest paths that connect pairs of other nodes in the network. A node with high betweenness can control the flow of information, and may also serve as a liaison between distant regions of the network [22, 15]. PageRank is a measure that quantifies the relative importance of individual nodes . Nodes with high PageRank centrality have high visibility, meaning that they are often traversed, irrespectively of which path is chosen. As a result, perturbations in PageRank-central nodes can affect and propagate to large parts of the network.
Interestingly 3 of the 6 non-RBP components of the core network are exons of the PcG and trxG chromatin remodeling complexes (trx:17, psq:17, dom:4 ) and for two of these cases (trx:17 and psq:17 ) developmental specific alternative splicing is also supported by RNAseq data. In psq:17 retention of a short intron results in introduction of 18aa overlapping the first Helix-Turn-Helix domain of the protein (Figure 4B). In trx:17 exon inclusion results in a longer 5'UTR which is associated with lower expression values (Figure 4B) suggesting less efficient transcription and/or decreased mRNA stability. These results highlight the interplay between splicing and chromatin regulation as key for the establishment of differential gene expression profiles that underly fly development.
Knock-downs of RBPs show significantly enhanced effects on their predicted targets
In this work we presented a method for the reconstruction of composite splicing regulatory networks that can recover simultaneously both modules of co-regulated exons as well as regulatory relationships between regulator (RBPs) and target (exons) components. Reconstruction is based on the estimation of a sparse covariance matrix that includes both types of components using a regularization approach that naturally incorporates putative regulatory relationships as prior information in the form of non-uniform regularization parameters. Our analysis on the characteristics of the derived exon modules suggests that exon groupings undergoing coordinated splicing regulation reflect coherent functional groupings implicated in distinct stages and processes during fly development. This observation indicates that the obtained exon modules can be used as proxies for inferring the function of non-characterized network components. In addition, experimental knock-down of splicing regulators produces larger effects on their computationally inferred targets as opposed to non-targets, indicating that this approach is also valuable as a hypothesis generator for regulatory relationships.
For this study only two possible values of the regularization parameter were used (see Methods) in order to specify two possible types of network relationships (regulator-target and everything else). However it is straightforward to modify this approach to incorporate more elaborate patterns of prior knowledge based for example on enrichment of RBP binding motifs near the exons or available binding or functional data for subsets of the regulators. In addition, the same approach can also be directly applied for deciphering and contrasting the configuration and organizational properties of the splicing regulatory circuitry involved in different organisms and biological processes including mammalian tissue-specific splicing or cell differentiation and reprogramming. Finally the computational framework described here can easily be extended to other layers of gene expression regulation (e.g alternative promoter selection of polyadenylation, mRNA export and translation) as it is only limited by the availability of suitable datasets.
Pre-processing of tiling array data and array-based PSI estimation
The publicly available modENCODE tiling array data consist of signal values for 25mer probes covering the entire fly genome. The tiling array probes have an average center base spacing of 38 bp and are designed to be identical to the plus strand of the genome sequence. Three biological replicates were hybridized for each experiment. The data from replicate arrays are quantile-normalized and all arrays are scaled to median array intensity. Prior to data analysis unreliable probes are removed based on their variance and mean values. In particular a probe p is removed if its coefficient of variation cv p = σ p /µ p >5 or its mean value µ p < −200 or µ p >20000.
Gene signals S(G) are calculated as the average of all probes that cover the gene regions that are constitutively included in the mature RNA as these are defined based on Flybase and modENCODE gene models (MB8 and MB9). Exon signals S(E i ) are calculated as the average of tiling array probes covering the genomic coordinates of exon i. Next, exon inclusion indices for each developmental time point are determined for every exon as the fraction S(E i )/S(GE i ).
Missing values are assigned to exon inclusion indices in cases where a reliable estimate cannot be acquired due to low gene expression or sparse probe coverage. Exons with >50% of missing values are discarded from subsequent analysis. Missing values for the remaining exons are filled in using 10-nearest neighbors imputation (, R function impute available as part of the bio-conductor project at http://www.bioconductor.org/). Finally both exon inclusion and gene expression time series data were scaled and centered.
Pre-processing of RNAseq data and RNAseq-based PSI estimation
FASTQ files for the RNAseq data for both developmental time-points and RNAi treatments were downloaded from the modENCODE ftp repository (ftp://data.modencode.org/). For every sample the reads were mapped using the STAR aligner v2.4.0 and a two-pass mapping procedure  with standard ENCODE mapping parameters as those are detailed in the software manual. Genome sequence files and annotation were based on the dm3 assembly freeze and the corresponding flybase annotation  available from the UCSC table browser. For PSI calculation only junction spanning reads were used and the index was calculated as described in . Missing values were assigned to Exons with less than 10 total inclusion and skipping junction reads per sample. Finally, missing value imputation and data scaling and centering was carried out as described above.
Selection of representative transcription-unit events
For this work we set k = 10.
Graphical model selection for network reconstruction using graphical lasso
where R is a p × p regularization parameter matrix and ∗ indicates component-wise multiplication. This formulation provides a natural way to incorporate in glasso prior information on the network structure . In this application we take advantage of this formulation and define a bimodal regularization matrix R that biases for edges between alternative exons and potential splicing regulators. In particular, our network is a mixture of two distinct types of variables: Variables of type A correspond to alternative exons that show differential inclusion between at least two developmental stages. Variables of type B correspond to candidate regulating elements. These can be either genes or exons of genes that are known or predicted to be implicated in RNA processing according to Flybase annotation AND show differential expression (in the case of genes) or inclusion (in the case of exons) during development. Connections between type A and type B variables are assigned a regularization parameter r1 and all other type of connections are constrained by a regularization parameter r2 > r1. Optimal choice of the regularization parameter for covariance estimation is an open theoretical problem [28–30]. In general, since the regularization parameter controls for type I errors in the network, it can be relaxed as the variance of the variables decreases . For this application we set r1 to 0.75 and r2 to 0.85. We note however, that the social network analysis results presented here (identified modules, centrality analysis) are robust to different choices of values for r1, and r2 as long as r2 > r1 (data not shown). The glasso process for network reconstruction was implemented using the glasso R package by Jerome Friedman, Trevor Hastie and Rob Tibshirani (http://statweb.stanford.edu/~tibs/glasso/).
Modularity-based community detection and measures of centrality
where m is the number of modules in p, l k is the number of connections within module k, L is the total number of network connections and d k is the sum of the degrees of the nodes in module k. Here, we identified modules of exons that exhibit similar profiles across development by maximizing the network's modularity using the greedy community detection algorithm  implemented in the fastgreedy.community function of the igraph package , http://igraph.org).
Extensive definitions and algorithmic details for the computation of Closeness and Betweeness centralities and the Pagerank index can be found in . All functions for centrality measure calculation are available through the igraph library (, http://igraph.org).
We acknowledge support of the Spanish Ministry of Economy and Competitiveness, ‘Centro de Excelencia Severo Ochoa 2013-2017’, SEV-2012-0208. Work in JV laboratory is supported by Fundación Botín, by Banco de Santander through its Santander Universities Global Division, Consolider RNAREG, Ministerio de Economía y Competitividad and AGAUR.
Publication charges for this work were jointly funded by Fundacion Botin, Banco de Santander through its Santander Universities Global Division, Consolider RNAREG, Ministerio de Economia y Competitividad and AGAUR and the corresponding author's institution.
This article has been published as part of BMC Genomics Volume 16 Supplement 10, 2015: Proceedings of the 13th Annual Research in Computational Molecular Biology (RECOMB) Satellite Workshop on Comparative Genomics: Genomics. The full contents of the supplement are available online at http://www.biomedcentral.com/bmcgenomics/supplements/16/S10.
- Cooper TA, Wan L, Dreyfuss G: Rna and disease. Cell. 2009, 136 (4): 777-793.View ArticleGoogle Scholar
- Fu XD, Ares M: Context-dependent control of alternative splicing by rna-binding proteins. Nature Reviews Genetics. 2014, 15 (10): 689-701.View ArticleGoogle Scholar
- Barash Y, Calarco JA, Gao W, Pan Q, Wang X, Shai O, Blencowe BJ, Frey BJ: Deciphering the splicing code. Nature. 2010, 465 (7294): 53-59.View ArticleGoogle Scholar
- Xiong HY, Alipanahi B, Lee LJ, Bretschneider H, Merico D, Yuen RK, Hua Y, Gueroussov S, Najafabadi HS, Hughes TR, et al: The human splicing code reveals new insights into the genetic determinants of disease. Science. 2015, 347 (6218): 1254806-View ArticleGoogle Scholar
- Celniker SE, Dillon LA, Gerstein MB, Gunsalus KC, Henikoff S, Karpen GH, Kellis M, Lai EC, Lieb JD, MacAlpine DM, et al: Unlocking the secrets of the genome. Nature. 2009, 459 (7249): 927-930.View ArticleGoogle Scholar
- Graveley BR, Brooks AN, Carlson JW, Duff MO, Landolin JM, Yang L, Artieri CG, van Baren MJ, Boley N, Booth BW, et al: The developmental transcriptome of drosophila melanogaster. Nature. 2011, 471 (7339): 473-479.View ArticleGoogle Scholar
- Katz Y, Wang ET, Airoldi EM, Burge CB: Analysis and design of rna sequencing experiments for identifying isoform regulation. Nature methods. 2010, 7 (12): 1009-1015.View ArticleGoogle Scholar
- Friedman J, Hastie T, Tibshirani R: Sparse inverse covariance estimation with the graphical lasso. Biostatistics. 2008, 9 (3): 432-441.View ArticleGoogle Scholar
- Radicchi F, Castellano C, Cecconi F, Loreto V, Parisi D: Defining and identifying communities in networks. Proceedings of the National Academy of Sciences of the United States of America. 2004, 101 (9): 2658-2663.View ArticleGoogle Scholar
- Girvan M, Newman ME: Community structure in social and biological networks. Proceedings of the national academy of sciences. 2002, 99 (12): 7821-7826. 10.1073/pnas.122653799.View ArticleGoogle Scholar
- Guimera R, Amaral LAN: Functional cartography of complex metabolic networks. Nature. 2005, 433 (7028): 895-900.View ArticleGoogle Scholar
- Clauset A, Newman ME, Moore C: Finding community structure in very large networks. Physical review E. 2004, 70 (6): 066111-View ArticleGoogle Scholar
- Newman ME, Girvan M: Finding and evaluating community structure in networks. Physical review E. 2004, 69 (2): 026113-View ArticleGoogle Scholar
- Taylor BJ: Differentiation of a male-specific muscle in drosophila melanogaster does not require the sex-determining genes doublesex or intersex. Genetics. 1992, 132 (1): 179-191.Google Scholar
- Brandes U, Pich C: Centrality estimation in large networks. International Journal of Bifurcation and Chaos. 2007, 17 (07): 2303-2318.View ArticleGoogle Scholar
- Carlson MR, Zhang B, Fang Z, Mischel PS, Horvath S, Nelson SF: Gene connectivity, function, and sequence conservation: predictions from modular yeast co-expression networks. BMC genomics. 2006, 7 (1): 40-View ArticleGoogle Scholar
- Ozgür A, Vu T, Erkan G, Radev DR: Identifying gene-disease associations using centrality on a literature mined gene-interaction network. Bioinformatics. 2008, 24 (13): 277-285.View ArticleGoogle Scholar
- Jovelin R, Phillips PC, et al: Evolutionary rates and centrality in the yeast gene regulatory network. Genome Biol. 2009, 10 (4): 35-View ArticleGoogle Scholar
- Buljan M, Chalancon G, Eustermann S, Wagner GP, Fuxreiter M, Bateman A, Babu MM: Tissue-specific splicing of disordered segments that embed binding motifs rewires protein interaction networks. Molecular cell. 2012, 46 (6): 871-883.View ArticleGoogle Scholar
- Freeman LC: Centrality in social networks conceptual clarification. Social networks. 1979, 1 (3): 215-239.View ArticleGoogle Scholar
- Brin S, Page L: Reprint of: The anatomy of a large-scale hypertextual web search engine. Computer networks. 2012, 56 (18): 3825-3833.View ArticleGoogle Scholar
- Borgatti SP: Centrality and network flow. Social networks. 2005, 27 (1): 55-71.View ArticleGoogle Scholar
- Penalva LO, Sánchez L: Rna binding protein sex-lethal (sxl) and control of drosophila sex determination and dosage compensation. Microbiology and molecular biology reviews. 2003, 67 (3): 343-359.View ArticleGoogle Scholar
- Hastie T, Tibshirani R, Sherlock G, Eisen M, Brown P, Botstein D: Imputing missing data for gene expression arrays. Stanford University Statistics Department Technical report. 1999Google Scholar
- Dobin A, Davis CA, Schlesinger F, Drenkow J, Zaleski C, Jha S, Batut P, Chaisson M, Gingeras TR: Star: ultrafast universal rna-seq aligner. Bioinformatics. 2013, 29 (1): 15-21.View ArticleGoogle Scholar
- Marygold SJ, Leyland PC, Seal RL, Goodman JL, Thurmond J, Strelets VB, Wilson RJ, Consortium F, et al: Flybase: improvements to the bibliography. Nucleic acids research. 2013, 41 (D1): 751-757.View ArticleGoogle Scholar
- Christley S, Nie Q, Xie X: Incorporating existing network information into gene network inference. PloS one. 2009, 4 (8): 6799-View ArticleGoogle Scholar
- Zhao P, Yu B: On model selection consistency of lasso. The Journal of Machine Learning Research. 2006, 7: 2541-2563.Google Scholar
- Banerjee O, El Ghaoui L, d'Aspremont A: Model selection through sparse maximum likelihood estimation for multivariate gaussian or binary data. The Journal of Machine Learning Research. 2008, 9: 485-516.Google Scholar
- Scheinberg K, Asadi NB, Rish I: Sparse mrf learning with priors on regularization parameters. Technical report, Technical Report RC24812. 2009, IBM TJ Watson Research CenterGoogle Scholar
- Csardi G, Nepusz T: The igraph software package for complex network research. InterJournal, Complex Systems. 2006, 1695 (5): 1-9.Google Scholar
- Aggarwal C: Social Network Data Analytics. 2011, Springer, New York, 1View ArticleGoogle Scholar
- Lyne R, Smith R, Rutherford K, Wakeling M, Varley A, Guillier F, Janssens H, Ji W, Mclaren P, North P, et al: Flymine: an integrated database for drosophila and anopheles genomics. Genome biology. 2007, 8 (7): 129-View ArticleGoogle Scholar
This article is published under license to BioMed Central Ltd. This is an Open Access article distributed under the terms of the Creative Commons Attribution License (http://creativecommons.org/licenses/by/4.0), which permits unrestricted use, distribution, and reproduction in any medium, provided the original work is properly cited. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated.