From: Determining the optimal number of independent components for reproducible transcriptomic data analysis

Analysis of the reproducibility of previously identified metagenes in independent components of the METABRIC dataset. a, e, f, g, h show, as a function of the chosen data dimension M, the correlations (see Methods section) of the cell cycle, inflammation, myofibroblast, interferon signaling, and immune-related metagenes from [3] with the best matched component, together with the stability of that component. c shows the gap, that is, the ratio between the correlation of the proliferation metagene with its best matched component and its second-best correlation. d shows the overlap (Jaccard index) between Freeman’s cell cycle signature [19] and the set of top-contributing genes (projection > 5.0) of the proliferation-associated independent component. b, i show the correlation of the cell cycle and immune-related metagenes with the best matched component in the M = 100 ICA decomposition as a function of the stability-based component rank. In all plots, the vertical dashed line marks the MSTD value for the METABRIC dataset.
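The quantities plotted in panels c and d can be illustrated with a minimal sketch. It is not the authors' code: the function names, the use of absolute Pearson correlation (chosen here because the sign of an ICA component is arbitrary), and the toy inputs are all assumptions made for illustration. The correlation measure actually used is defined in the Methods section. Only the projection threshold of 5.0 comes from the caption.

```python
import numpy as np


def jaccard_index(set_a, set_b):
    """Jaccard index |A ∩ B| / |A ∪ B| of two gene sets (0.0 if both are empty)."""
    a, b = set(set_a), set(set_b)
    union = a | b
    if not union:
        return 0.0
    return len(a & b) / len(union)


def top_contributing_genes(gene_names, projections, threshold=5.0):
    """Genes whose projection onto a component exceeds the threshold.

    The default of 5.0 mirrors the caption; everything else is hypothetical.
    """
    return {g for g, p in zip(gene_names, projections) if p > threshold}


def correlation_gap(metagene, components):
    """Best matched component, its correlation, and the best/second-best ratio.

    `components` has shape (n_components, n_genes) with n_components >= 2.
    Absolute Pearson correlation is an assumption of this sketch.
    """
    corrs = np.array([abs(np.corrcoef(metagene, c)[0, 1]) for c in components])
    order = np.argsort(corrs)[::-1]  # component indices, highest correlation first
    best, second = corrs[order[0]], corrs[order[1]]
    return int(order[0]), float(best), float(best / second)


# Toy usage with made-up values (not METABRIC data).
signature = {"A", "B", "C"}
top_genes = top_contributing_genes(["B", "C", "D", "E"], [6.1, 5.4, 7.0, 2.3])
overlap = jaccard_index(signature, top_genes)

metagene = np.array([1.0, 2.0, 3.0, 4.0])
components = np.array([[1.0, 3.0, 2.0, 4.0], [1.0, 2.0, 3.0, 4.0]])
best_index, best_corr, gap = correlation_gap(metagene, components)
```

A gap well above 1 indicates that a single component matches the metagene much better than any other, which is the property panel c tracks across values of M.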

