### Datasets

#### Human miRNA-disease associations

We collect the known human miRNA-disease associations from the HMDD V2.0 database (June 2014) [11], obtaining 3693 associations among 368 miRNAs and 383 diseases.

#### MiRNA functional similarity and sequence similarity

The functional similarities among miRNAs are calculated by the method proposed in [22], and we download the similarity data from http://www.cuilab.cn. Since a miRNA’s function is closely related to its sequence [23], we also obtain the miRNA sequence similarity from http://www.mirbase.org/ftp.shtml. The integrated similarity between two miRNAs is defined as the average of their functional similarity and their sequence similarity, and the integrated similarity matrix for miRNAs is denoted as *S*_{m}.
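As a minimal illustration of the averaging step, the sketch below combines two toy similarity matrices with NumPy. The matrix values and the names `S_func` and `S_seq` are made up for the example and are not taken from the downloaded data.

```python
import numpy as np

# Toy 3x3 similarity matrices (hypothetical values, symmetric with unit diagonal).
S_func = np.array([[1.0, 0.2, 0.4],
                   [0.2, 1.0, 0.6],
                   [0.4, 0.6, 1.0]])
S_seq = np.array([[1.0, 0.6, 0.0],
                  [0.6, 1.0, 0.2],
                  [0.0, 0.2, 1.0]])

# Integrated miRNA similarity: element-wise average of the two sources.
S_m = 0.5 * (S_func + S_seq)
print(S_m)
```

Averaging two symmetric matrices keeps the result symmetric, which the graph Laplacian built from *S*_{m} later relies on.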

#### Two disease semantic similarities

To calculate disease semantic similarities, Wang et al. [22] and Xuan et al. [24] proposed two methods based on the Medical Subject Headings (MeSH) descriptors, which can be downloaded from the National Library of Medicine (http://www.nlm.nih.gov/).

Wang’s method [22] first calculates the semantic value and the contribution value of a disease, and then uses these two values to compute the semantic similarity between two diseases. Xuan et al. [25] improve the calculation of the semantic value, but likewise use the semantic value and the contribution value to compute the semantic similarity. We average the two types of semantic similarity and denote the resulting integrated similarity matrix for diseases as *S*_{d}.

### Our proposed method via matrix tri-factorization

#### Problem statement and notations

We are given the integrated similarity matrix \(S_{m}\in R^{n_{m}\times n_{m}}\) among *n*_{m} miRNAs \(\left \{m_{1},\cdots,m_{n_{m}}\right \}\), and the integrated similarity matrix \(S_{d}\in R^{n_{d}\times n_{d}}\) among *n*_{d} diseases \(\left \{d_{1},\cdots,d_{n_{d}}\right \}\). We are also given the miRNA-disease association (MDA) indicator matrix \(A\in R^{n_{m}\times n_{d}}\), defined as follows:

$$\begin{aligned} A(i,j)= \left\{\begin{array}{ll} 1,& {i}-\text{th miRNA } m_{i} \text{ is associated with } {j}-\text{th disease } d_{j}, \\ 0,& \text{association between } {i}-\text{th miRNA } m_{i} \text{ and } {j}-\text{th disease } d_{j} \text{ is unknown.} \end{array} \right. \end{aligned} $$

We denote by *Ω*={(*i*,*j*)|*A*_{ij}=1} the index set of the miRNA-disease pairs that are known to be associated, and by *Ω*^{c}={(*i*,*j*)|*A*_{ij}=0} the index set of all pairs whose associations are unknown. For any matrix *M*, we define \(\mathcal {R}_{\Omega }(M)\) as the matrix that keeps the *Ω* part of *M* and sets its *Ω*^{c} part to zero, that is,

$$\mathcal{R}_{\Omega}(M)_{ij}= \left\{\begin{array}{ll} M_{ij}, & \text{if } (i,j)\in\Omega,\\ 0, & \text{if } (i,j)\in\Omega^{c}. \end{array}\right. $$
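The operator \(\mathcal {R}_{\Omega }\) amounts to an element-wise mask. A minimal NumPy sketch follows; the helper name `projection_omega` and the toy matrices are hypothetical and serve only as illustration.

```python
import numpy as np

def projection_omega(M, A):
    """Keep the entries of M at positions where A == 1 (the set Omega); zero the rest."""
    mask = (A == 1)
    return np.where(mask, M, 0.0)

# Toy indicator matrix for 3 miRNAs and 2 diseases (made-up values).
A = np.array([[1, 0],
              [0, 1],
              [0, 0]])

# Any matrix of the same shape can be projected onto Omega.
M = np.arange(1.0, 7.0).reshape(3, 2)
R = projection_omega(M, A)
print(R)
```

Because the mask is derived directly from *A*, the sets *Ω* and *Ω*^{c} never need to be stored as explicit index lists.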

Our aim in this work is to fill in the *Ω*^{c} part of matrix *A* and thereby recover the complete matrix \(\tilde {A}\).

#### MTFMDA model

We build our MTFMDA method on the following three considerations. First, the unknown complete miRNA-disease association (MDA) matrix \(\tilde {A}\) can be factorized into three matrices: a feature matrix for miRNAs \(P\in R^{n_{m}\times r_{m}}\), a feature matrix for diseases \(Q\in R^{n_{d}\times r_{d}}\), and a feature relationship matrix \(D\in R^{r_{m}\times r_{d}}\). The factorization \(\tilde {A} = PDQ^{T}\) implies that the column vectors of \(\tilde {A}\) lie in the subspace spanned by the column vectors of *P*, and the row vectors of \(\tilde {A}\) lie in the subspace spanned by the column vectors of *Q*. *D* is generally required to be low rank, and *P* and *Q* have orthonormal columns, satisfying *P*^{T}*P*=*I* and *Q*^{T}*Q*=*I*. Second, the completed \(\tilde {A}\) should recover the known associations between miRNAs and diseases, i.e., the *Ω* part of the difference matrix \(\left (A-\tilde {A}\right)\) should be zero or as small as possible. Third, the feature vectors in *P* and *Q* should preserve the similarity information contained in *S*_{m} and *S*_{d}, respectively, so two Laplacian regularizers are used to preserve this geometric structure. Combining these three considerations, we propose the following MTFMDA model:

$$ \begin{aligned} &\underset{P,D,Q}{\min} \left\|\mathcal{R}_{\Omega}\left(A-P D Q^{T}\right)\right\|^{2}_{F} +\lambda_{1} tr\left(P^{T} L_{m} P\right) +\lambda_{2} tr\left(Q^{T} L_{d} Q\right) +\lambda_{3}\|D\|_{*}\\ &s.t.\quad P^{T} P=I, Q^{T} Q=I, \end{aligned} $$

(1)

where *λ*_{1}, *λ*_{2} and *λ*_{3} are regularization parameters that control the trade-offs among the terms. The first term recovers the known MDAs in *A*. In the second term, *L*_{m}=*D*_{m}−*S*_{m} is the Laplacian matrix for the miRNAs, where *D*_{m} is a diagonal matrix whose *i*-th diagonal element is the sum of the *i*-th row of *S*_{m}. In the third term, *L*_{d} is the Laplacian matrix for diseases, defined in the same way from *S*_{d}. The fourth term, the nuclear norm of *D*, encourages *D* to be low rank. Once the optimal *P*, *D* and *Q* are obtained from optimization problem (1), the completed MDA matrix is given by \(\tilde {A}=PDQ^{T}\). The flowchart of our method is shown in Fig. 1.
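To make the four terms of (1) concrete, the following NumPy sketch builds a graph Laplacian from a similarity matrix and evaluates the objective for given factors. It is an illustration under stated assumptions, not the authors' implementation: the function names and the toy inputs are hypothetical, and the orthonormality constraints are assumed to be satisfied by the inputs rather than enforced.

```python
import numpy as np

def graph_laplacian(S):
    """L = D - S, where D is the diagonal matrix of the row sums of S."""
    return np.diag(S.sum(axis=1)) - S

def mtfmda_objective(A, P, D, Q, L_m, L_d, lam1, lam2, lam3):
    """Evaluate the objective of problem (1) for given P, D and Q."""
    # Data-fitting term, restricted to the known associations (A == 1).
    residual = np.where(A == 1, A - P @ D @ Q.T, 0.0)
    fit = np.sum(residual ** 2)
    # Laplacian regularizers on the miRNA and disease feature matrices.
    smooth_m = np.trace(P.T @ L_m @ P)
    smooth_d = np.trace(Q.T @ L_d @ Q)
    # Nuclear norm of D (sum of its singular values).
    nuclear = np.linalg.norm(D, ord="nuc")
    return fit + lam1 * smooth_m + lam2 * smooth_d + lam3 * nuclear

# Toy example with 2 miRNAs and 2 diseases (made-up similarities).
S = np.array([[1.0, 0.5],
              [0.5, 1.0]])
L = graph_laplacian(S)
A = np.eye(2)
value = mtfmda_objective(A, np.eye(2), np.eye(2), np.eye(2), L, L, 1.0, 1.0, 1.0)
print(value)
```

In this toy case the fit term vanishes because *PDQ*^{T} reproduces *A* on *Ω*, so the value is determined by the three regularizers alone.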

#### Optimization algorithm

To solve optimization problem (1), we develop an alternating iteration algorithm that updates *P*, *D* and *Q* in turn.

**Step 1**: Fix *P* and *Q*, solve *D*.

By fixing *P* and *Q* in the optimization problem (1), the sub-problem to solve *D* can be obtained as follows:

$$ \begin{aligned} \underset{D}{\min} \left\|\mathcal{R}_{\Omega}\left(A-P D Q^{T}\right)\right\|^{2}_{F}+\lambda_{3}\|D\|_{*}.\\ \end{aligned} $$

(2)

The sub-problem can be solved by an accelerated gradient descent algorithm [26] with the following iterations,

$$ \begin{aligned} &Y_{k}=D_{k}+\gamma_{k}\left(\gamma_{k-1}^{-1}-1\right)\left(D_{k}-D_{k-1}\right),\\ &D_{k+1}=\arg\min\nolimits_{D} \lambda_{3}\left\|D\right\|_{*}+\frac{s}{2}\left\|D-\left(Y_{k}-\frac{1}{s}\bigtriangledown f\left(Y_{k}\right)\right)\right\|^{2}_{F},\\ &\gamma_{k+1}=\left(\sqrt{\gamma^{4}_{k}+4\gamma^{2}_{k}}-\gamma^{2}_{k}\right)/2, \end{aligned} $$

(3)

where *f* denotes the smooth first term of (2), and *s* is a proximal parameter that approximates the second-order information of *f*(*Y*). The second equation in (3) can be solved by the linearized Bregman iteration, a special form of Uzawa’s algorithm, proposed in Cai et al. [27].
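The iterations in (3) can be sketched in NumPy as follows. This is an illustration under stated assumptions rather than the authors' implementation: *f* is taken to be the data-fitting term of (2); the proximal step is solved by singular value soft-thresholding, its standard closed-form solution; the step parameter defaults to `s = 2`, which bounds the Lipschitz constant of the gradient when *P* and *Q* have orthonormal columns; and both momentum parameters are initialized to 1. All function names are hypothetical.

```python
import numpy as np

def svt(G, tau):
    """Singular value soft-thresholding: argmin_D tau*||D||_* + 0.5*||D - G||_F^2."""
    U, sigma, Vt = np.linalg.svd(G, full_matrices=False)
    return (U * np.maximum(sigma - tau, 0.0)) @ Vt

def grad_f_D(D, A, P, Q):
    """Gradient of f(D) = ||R_Omega(A - P D Q^T)||_F^2 with respect to D."""
    residual = np.where(A == 1, A - P @ D @ Q.T, 0.0)
    return -2.0 * P.T @ residual @ Q

def solve_D(A, P, Q, lam3, s=2.0, n_iter=200):
    """Accelerated proximal iterations of (3) for the D sub-problem (2)."""
    D_prev = np.zeros((P.shape[1], Q.shape[1]))
    D = D_prev.copy()
    gamma_prev, gamma = 1.0, 1.0  # assumed initialization
    for _ in range(n_iter):
        # Extrapolation point Y_k.
        Y = D + gamma * (1.0 / gamma_prev - 1.0) * (D - D_prev)
        D_prev = D
        # Proximal step: gradient step on f, then shrink the singular values.
        D = svt(Y - grad_f_D(Y, A, P, Q) / s, lam3 / s)
        # Update of the momentum parameter gamma_{k+1}.
        gamma_prev, gamma = gamma, (np.sqrt(gamma**4 + 4 * gamma**2) - gamma**2) / 2.0
    return D

# Sanity check on toy data: with no regularization and every entry observed,
# the solution with P = Q = I is A itself.
D_hat = solve_D(np.ones((2, 2)), np.eye(2), np.eye(2), lam3=0.0)
print(D_hat)
```

A fixed iteration count is used here for simplicity; a stopping rule based on the change in *D* between iterations would be a natural refinement.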

**Step 2**: Fix *D* and *Q*, solve *P*.

By fixing *D* and *Q* in optimization problem (1), we obtain the sub-problem of *P* as follows:

$$ \begin{aligned} \underset{P}{\min} \left\|\mathcal{R}_{\Omega}\left(A-P D Q^{T}\right)\right\|^{2}_{F}+\lambda_{1} tr\left(P^{T} L_{m} P\right).\\ \end{aligned} $$

(4)

As with *D* in Step 1, we use the accelerated proximal gradient (APG) scheme to update *P* as follows:

$$ \begin{aligned} &\hat{Y}_{k}=P_{k}+\gamma_{k}\left(\gamma_{k-1}^{-1}-1\right)\left(P_{k}-P_{k-1}\right),\\ &P_{k+1}=\arg\min\nolimits_{P} \lambda_{1} tr\left(P^{T} L_{m} P\right)+\frac{s}{2}\left|\left|P-\left(\hat{Y}_{k}-\frac{1}{s}\bigtriangledown f\left(\hat{Y}_{k}\right)\right)\right|\right|^{2}_{F},\\ &\gamma_{k+1}=\left(\sqrt{\gamma^{4}_{k}+4\gamma^{2}_{k}}-\gamma^{2}_{k}\right)/2. \end{aligned} $$

(5)

The second equation in (5) is solved by Wen’s algorithm [28].
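For reference, the gradient required in (5) can be written out explicitly. The expression below is a worked derivation under the assumption that *f* denotes the smooth data-fitting term of (4), i.e. the part that is not kept inside the proximal step.

$$ \begin{aligned} &f(P)=\left\|\mathcal{R}_{\Omega}\left(A-P D Q^{T}\right)\right\|^{2}_{F},\\ &\bigtriangledown f(P)=-2\,\mathcal{R}_{\Omega}\left(A-P D Q^{T}\right) Q D^{T}. \end{aligned} $$

The result follows because \(\mathcal {R}_{\Omega }\) is a linear, self-adjoint projection, so differentiating the squared Frobenius norm with respect to *P* leaves the masked residual multiplied on the right by \(\left (DQ^{T}\right)^{T}=QD^{T}\).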

**Step 3**: Fix *D* and *P*, solve *Q*.

By fixing *D* and *P* in optimization problem (1), we obtain the sub-problem of *Q* as follows:

$$ \begin{aligned} \underset{Q}{\min} \left\|\mathcal{R}_{\Omega}\left(A-P D Q^{T}\right)\right\|^{2}_{F}+\lambda_{2} tr\left(Q^{T} L_{d} Q\right).\\ \end{aligned} $$

(6)

The sub-problem (6) for *Q* has the same form as the sub-problem (4) for *P*, so it is solved in the same way and we omit the details.
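The symmetry between (4) and (6) can be made concrete: transposing the residual turns the *Q* sub-problem into a *P*-type sub-problem in which *A* and *D* are replaced by their transposes and *P* takes the role of *Q*. The NumPy sketch below illustrates this for the gradient of the data-fitting term only. The function names and the random toy data are hypothetical, and *f* is assumed to be the first term of (4) and (6).

```python
import numpy as np

def fit_term(A, P, D, Q):
    """Data-fitting term ||R_Omega(A - P D Q^T)||_F^2."""
    residual = np.where(A == 1, A - P @ D @ Q.T, 0.0)
    return np.sum(residual ** 2)

def grad_f_P(P, A, D, Q):
    """Gradient of the data-fitting term with respect to P."""
    residual = np.where(A == 1, A - P @ D @ Q.T, 0.0)
    return -2.0 * residual @ Q @ D.T

def grad_f_Q(Q, A, D, P):
    """Gradient with respect to Q, obtained by transposing the P-type problem."""
    return grad_f_P(Q, A.T, D.T, P)

# Random toy data: 4 miRNAs, 3 diseases, 2 features on each side.
rng = np.random.default_rng(0)
A = (rng.random((4, 3)) > 0.5).astype(float)
P = rng.standard_normal((4, 2))
D = rng.standard_normal((2, 2))
Q = rng.standard_normal((3, 2))

G_Q = grad_f_Q(Q, A, D, P)
print(G_Q.shape)
```

Reusing the *P* routine on the transposed problem means only one gradient needs to be implemented and checked.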

Overall, the framework of our algorithm is shown as follows: