Figure 1 | BMC Genomics

Figure 1

From: Simultaneous analysis of distinct Omics data sets with integration of biological knowledge: Multiple Factor Analysis approach

Schematic of our MFA-based approach to combine 'Omics' data and to integrate biological knowledge. (I) The heart of MFA is a PCA in which weights are assigned to the variables: (i) When several sets of variables describe the same set of individuals (tumors), it is possible to consider the merged data set $K = [K_1, K_2, \ldots, K_J]$, where each $K_j$ corresponds to an 'Omics' data table. (ii) Separate analyses are performed by principal component analysis (PCA) on each group $j$ of variables. Each variable belonging to group $j$ is weighted by $1/\lambda_1^j$, where $\lambda_1^j$ denotes the first eigenvalue of the variance-covariance matrix associated with data table $K_j$. (iii) A global analysis is performed. The corresponding graphical displays (Individual Factor Map and Variables Representation) are read as for PCA.
(II) MFA makes it possible to look for common factors by providing a representation of each matrix of variables (Groups Representation). It visualizes the specific and common structures emerging from the $K_j$, and allows the main factors of variability to be compared by linking the groups and variables representations. The coordinate of set $j$ on the axis of rank $s$ is equal to $L_g(z_s, K_j)$, so set coordinates always lie between 0 and 1, and a small distance between two sets along axis $s$ means that both include the structure expressed by factor $s$ with the same intensity.
(III) The ability of MFA to handle supplementary groups of variables is used to integrate biological knowledge. The BP modules are formalized as matrices $K_{BP_i}$ containing the restriction of the whole data set to the genes associated with the $i$th BP. Each $K_{BP_i}$ is projected by means of its scalar product matrix between individuals. This matrix, denoted $W_i$, is an $(I \times I)$ matrix ($W_i = K_{BP_i} K'_{BP_i}$) and can be considered as an element of the space $\mathbb{R}^{I^2}$. This element is then projected on the dimensions of $\mathbb{R}^{I^2}$ issued from MFA. The resulting representation of the groups is displayed as a scatter plot in which each $K_{BP_i}$ is a point. It is read as follows: the closer the coordinate of a group is to 1, the more highly the variables of this group are correlated (positively or negatively) with the dimension issued from the MFA. Hence, the closer two groups are, the closer the structures they induce on the observations.
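
The weighting, global analysis, and group coordinates described in the caption can be sketched in NumPy. This is a minimal illustration under stated assumptions, not the authors' implementation: the two tables are synthetic, variables are only centered (not standardized), individuals carry uniform weights, and the function names and the "BP module" column subset are hypothetical. The supplementary group is assumed to be weighted by its own first eigenvalue, like the active groups.

```python
import numpy as np


def center(table):
    """Subtract each column's mean (variables are columns, tumors are rows)."""
    return table - table.mean(axis=0)


def weight_by_first_eigenvalue(table):
    """Center a table and scale it so its first covariance eigenvalue equals 1.

    Dividing the columns by sqrt(lambda_1) is equivalent to giving every
    variable of the group the weight 1 / lambda_1 in the global PCA.
    """
    centered = center(table)
    n = centered.shape[0]
    singular_values = np.linalg.svd(centered, compute_uv=False)
    lambda_1 = singular_values[0] ** 2 / n  # top eigenvalue of X'X / n
    return centered / np.sqrt(lambda_1)


def global_factors(weighted_tables, n_factors=2):
    """PCA of the merged, weighted tables; returns factors z_s and eigenvalues."""
    merged = np.hstack(weighted_tables)
    n = merged.shape[0]
    u, s, _ = np.linalg.svd(merged, full_matrices=False)
    z = np.sqrt(n) * u[:, :n_factors]  # each z_s has unit variance
    return z, s[:n_factors] ** 2 / n


def group_coordinates(weighted_table, z):
    """Lg(z_s, K_j) computed from the scalar product matrix W = K K'."""
    n = weighted_table.shape[0]
    w = weighted_table @ weighted_table.T  # (I x I) matrix between individuals
    return np.sum(z * (w @ z), axis=0) / n**2


# Synthetic stand-ins for two 'Omics' tables measured on the same 20 tumors.
rng = np.random.default_rng(0)
shared = rng.normal(size=(20, 1))  # structure common to both tables
k1 = shared @ rng.normal(size=(1, 30)) + rng.normal(size=(20, 30))
k2 = shared @ rng.normal(size=(1, 15)) + rng.normal(size=(20, 15))

active = [weight_by_first_eigenvalue(k) for k in (k1, k2)]
z, eigvals = global_factors(active)
coords = np.array([group_coordinates(k, z) for k in active])

# Hypothetical BP module: an arbitrary subset of columns from both tables,
# projected as a supplementary group (it does not influence the factors).
k_bp = np.hstack([k1[:, :5], k2[:, :3]])
bp_coords = group_coordinates(weight_by_first_eigenvalue(k_bp), z)

print("active group coordinates:\n", coords)
print("supplementary BP coordinates:", bp_coords)
```

Under these assumptions, the coordinates of the active groups on each axis sum to that axis's eigenvalue in the global analysis, and every coordinate lies between 0 and 1 because each weighted table has a first eigenvalue of 1.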
