 Research
 Open access
 Published:
Inferring circRNAdrug sensitivity associations via dual hierarchical attention networks and multiple kernel fusion
BMC Genomics volumeÂ 24, ArticleÂ number:Â 796 (2023)
Abstract
Increasing evidence has shown that the expression of circular RNAs (circRNAs) can affect the drug sensitivity of cells and significantly influence drug efficacy. Therefore, research into the relationships between circRNAs and drugs can be of great significance in increasing the comprehension of circRNAs function, as well as contributing to the discovery of new drugs and the repurposing of existing drugs. However, it is timeconsuming and costly to validate the function of circRNA with traditional medical research methods. Therefore, the development of efficient and accurate computational models that can assist in discovering the potential interactions between circRNAs and drugs is urgently needed. In this study, a novel method is proposed, called DHANMKF , that aims to predict potential circRNAdrug sensitivity interactions for further biomedical screening and validation. Firstly, multimodal networks were constructed by DHANMKF using multiple sources of information on circRNAs and drugs. Secondly, comprehensive intratype and intertype node representations were learned using bityped multirelational heterogeneous graphs, which are attentionbased encoders utilizing a hierarchical process. Thirdly, the multikernel fusion method was used to fuse intratype embedding and intertype embedding. Finally, the Dual Laplacian Regularized Least Squares method (DLapRLS) was used to predict the potential circRNAdrug sensitivity associations using the combined kernel in circRNA and drug spaces. Compared with the other methods, DHANMKF obtained the highest AUC value on two datasets. Code is available at https://github.com/cuntjx/DHANMKF.
Introduction
Circular RNA (circRNA) is a unique type of RNA that differs from other RNAs in that it forms a covalently closed loop and is typically considered noncoding. With the advancement of highthroughput genomics technology, circRNA has become a hot topic in RNA biology research [1]. Since the discovery of the first circRNA in RNA viruses in the 1970s [2], the advancement of biomedical technology has resulted in the discovery of an increasing amount of circRNAs. However, research into circRNA function has progressed very slowly over several decades, until 2013, when Memczak et al. and Hansen et al. proved that the circular RNA of human cerebellar degenerationrelated protein has an important function in neural development [3, 4]. This discovery led to a great increase in the study of circRNA function. The most notable function of circRNAs is that they act as miRNA sponges, which regulates target gene expression by inhibiting miRNA activity. One circRNA can regulate one or multiple miRNAs through multiple miRNA binding sites in a circular sequence [5]. Previous studies have found that circRNA can regulate alternative splicing or transcription [6, 7], as well as parental gene expression [8, 9]. The results of these studies have also indicated that circRNA plays an important role in physiological and pathological processes, and that the dysregulation of circRNA is closely related to many human diseases [10]. over the past two decades, several verified biological function experiments have shown that circRNA has potential as a new clinical diagnostic marker.
Over the years, an increasing number of studies have demonstrated that circRNA can significantly affect the drug sensitivity of cells. For example, Gao et al. [11] screened 18 circRNAs from 3093 circRNAs and then verified them in realtime by quantitative reverse transcription PCR. Finally, hsa_circ_0006528 was found to play an important role in chemotherapy resistance in breast cancer patients. Peng et al. [12] first used nextgeneration sequencing (NGS) technology to identify the comprehensive circRNA expression profile of multidrugresistant osteosarcoma(OS) cell lines and found that hsa_circ_ 0004674 was significantly elevated in OSresistant cells and tissues, and was associated with poor prognosis. This was then verified by quantitative realtime PCR (qRTPCR). A study by Wu et al. [13] found that hsa_circ_0001546 is decreased in gastric cancer, which is associated with poor prognosis and also inhibits drug resistance via the ATM/Chk2/p53dependent pathway. Ruan et al. [14] used four identification algorithms to describe the expression profile of circRNA in approximately 1000 human cancer cell lines and observed a strong correlation between circRNA expression and drug response. That study systematically demonstrated the effect of circRNAs on drug sensitivity. However, research into the relationship between circRNA and drug sensitivity is a newly emerging field that has developed rapidly over the past decade, so our understanding of this relationship is still in its early stages.
The process of validating the relationships between circRNA and drug sensitivity using traditional biomedical methods is timeconsuming and costly. Therefore, some researchers have developed computational models that can help to reveal the potential relationships between circRNA and drugs. For example, Deng et al.[15] proposed a computational model called GATECDA for predicting the association between circRNA and drug sensitivity. GATECDA is based on the Graph Attention Autoencoder(GATE) [16]. First, sequence information data for circRNAs, structural data for drugs, and circRNAdrug sensitivity association data were collected. Then the similarity between circRNAs and drugs were each calculated and these data as well as circRNAdrug sensitivity association data were input into the GATE, in order to generate lowdimensional vector representations of circRNA and drug nodes. Finally, the lowdimensional vector representations generated by the GATE were input into a fully connected neural network for circRNAdrug sensitivity association prediction. Later, Yang et al. [17] proposed a model called MNGACDA. The model constructs a multimodal network based on multiple information sources on circRNAs and drugs. Then, a nodelevel attention Graph AutoEncoder was used to obtain lowdimensional embeddings of circRNAs and drugs from the multimodal network. Finally, the lowdimensional embeddings of circRNAs and drugs were input into an inner product decoder to score the association between circRNAs and drug sensitivity. To our knowledge, these were the first models to apply computational methods to predict the potential association between circRNAs and diseases. Thus far, no other new models have been applied in this field, and considerable advancement in the creation of new and improved models for this field of research is much needed.
Since Multiple Kernel Learning (MKL) [18] was proposed, it has been widely applied to bipartite biological networks for the improvement of model performance. Specifically, the information contained in the samples were used by MKL to compute the multiple kernel matrix, and then the optimal kernel matrix was obtained by fusing multiple kernel matrices. For example, MKGCN, which is based on MKL and GCN [19], was proposed by Yang et al. to infer novel microbedrug associations. Yan et al. [20] proposed a computational methods called MKLCBiRW, based on MKL and Birandom walk algorithm, to predict potential drugtarget interactions by integrating diverse drugrelated and targetrelated heterogeneous information.
In this study, we propose a novel method, called DHANMKF, that aims to predict potential circRNAdrug sensitivity associations for further biomedical screening and validation. Firstly, multimodal networks were constructed by DHANMKF using multiple sources of information on circRNAs and drugs. Secondly, comprehensive intratype and intertype node representations were learned using multirelational heterogeneous graphs, which are attentionbased encoders under a hierarchical process. Thirdly, a multikernel fusion method was used to fuse intratype embedding and intertype embedding. Fourthly, the Dual Laplacian Regularized Least Squares (DLapRLS) method was used to predict the potential circRNAdrug sensitivity associations by the combined kernel in circRNA and drug spaces. In order to evaluate the effectiveness of DHANMKF, it was compared with six stateoftheart methods on a benchmark data set under 5fold crossvalidations (5CV). Compared with the other methods, DHANMKF obtained the highest AUC. Furthermore, an ablation study was performed to compare the experimental results from different perspectives. Finally, case studies were conducted to demonstrate that the DHANMKF model can be a useful tool for helping with the study of circRNAdrug sensitivity associations in real situations. To the best of our knowledge, DHANMKF is the first algorithm to use dual hierarchical attention networks for the prediction of circRNAdrug sensitivity associations. Our main contributions, differing from previous approaches, are summarized as follows: (1) We classify nodes into two types, i.e., head nodes and tail nodes, based on the degree of the nodes, and then define the types of edges based on the associations between different kinds of nodes. (2) Based on the differences in types of edges, we use dual hierarchical attention networks to extract the information on circRNAs and drugs, use the multikernel fusion method to fuse this information, and then use the dual graph regularized least squares method to predict potential circRNAdrug associations. (3) We tested DHANMKF on two datasets, and the results show that multirelational dual hierarchical attention networks perform better than the other methods in predicting potential circRNAdrug associations. These results can provide new insights for further research on circRNAdrug associations.
Materials and methods
Datasets
Two datasets, data271 and data251, were used in this study. Data271 is from Deng et al. [15] and data251 is from Deng et al. [15] and Peng et al. [21]. CircRNAdrug sensitivity associations were collected from the circRic database [14] by Deng et al. [15], where drug sensitivity data were obtained from the GDSC database [22]. After Wilcoxon tests with a false discovery rate \(<0.05\), these significant circRNAdrug sensitivity associations were extracted as the data271 dataset, which contains \({N}_{c}=271\) circRNAs, \({N}_{d}=218\) drugs and 4134 circRNAdrug sensitivity associations. Integrating with the dataset of Peng et al. [21], we removed circRNAs with hostgene interaction scores \(\le 0.5\) and nodes with a degree of 0. This resulted in the data251 dataset, containing \({N}_{c}=251\) circRNAs, \({N}_{d}=217\) drugs, and 3635 circRNAdrug sensitivity associations. Additional information on these two datasets can be found in the Supplementary file. In our experiment, circRNAs and drugs were represented as two different types of nodes in the network. The node set of \(\textbf{N}_{c}\) circRNAs was defined as \(\textbf{C}=\{{c}_1, \ldots , {c}_{N_c}\}\). Similarly, the node set of \(\textbf{N}_{d}\) drugs was described as \(\textbf{D}= \{{d}_1, \ldots , {d}_{N_d}\}\). An adjacency matrix \(\textbf{Y} \in R^{{N}_{c} \times {N}_{d}}\) was created for the storage of circRNAdrug associations. In this matrix, \({N}_{c}\) rows represent the number of circRNAs and \({N}_{d}\) columns represent the number of drugs. If circRNA \({c}_{i}\)(\({1 \le i \le {N}_{c}}\)) is associated with drug \({d}_{j}\)(\({1 \le j \le {N}_{d}}\)), \(\textbf{Y}_{ij}=1\), otherwise \(\textbf{Y}_{ij}=0\). During the training phase, all the \(\textbf{Y}_{ij}=1\) are treated as positive samples and the others are treated as negative samples. We randomly masked some positive samples from \(\textbf{Y}\) to get \(\textbf{Y}_{\textbf{train}}\). In order to calculate the similarity of circRNAs and drugs, the host gene sequences of circRNAs were downloaded from the National Center for Biotechnology Information (NCBI) gene database [23] and the drug structure data were downloaded from NCBIâ€™s PubChem database [24].
Sequence similarity of host genes of circRNAs
Applying methods similar to those of Deng et al. [15] and Yang et al. [17], we treated the sequence similarity between host genes of circRNAs as the similarity between circRNAs. In this way, the similarity calculation between circRNAs became the sequence similarity calculation between host genes of circRNAs. The sequence similarity between host genes of circRNAs was calculated based on the sequence Levenshtein distance, which was obtained using the ratio function of Pythonâ€™s Levenshtein package. A similarity matrix \(\textbf{CSS} \in R^{{N}_{c} \times {N}_{c}}\) was created for storing the circRNA sequence similarity.
Structural similarity of drugs
The structure of drugs has a great impact on their function. Therefore, it has become a common practice to measure the similarity of drugs based on their structure. As in previous studies [25, 26], RDKit [27] toolkit and the Tanimoto method were used to calculate the structural similarities between drugs. The specific process was as follows: first, the structural data on several drugs were obtained from the PubChem database. Then, RDKit was used to calculate the topological fingerprint of each drug. After that, the structural similarity between drugs was calculated using the Tanimoto method. Finally, the drugs structural similarity matrix \(\textbf{DSS} \in R^{{N}_{d} \times {N}_{d}}\) was derived.
Gaussian interaction profile kernel similarity for circRNAs and drugs
The Gaussian Interaction Profile (GIP) kernel similarity [28] algorithm is a collaborative filtering algorithm that has been widely used in previous studies for similarity calculation [29, 30], and it helps to obtain topological information on circRNAs and drugs in relational graphs. Therefore, we calculated the GIP kernel similarity for circRNAs and drugs using the circRNAdrug association network. Firstly, based on the assumption that similar circRNAs are more likely to be associated with similar drugs, we utilized a binary vector \(\textbf{BI}(c_{i})\), which is the ith row of the \(\textbf{Y}_{\textbf{train}}\) matrix, representing the associations between circRNAs \(c_{i}\) and all drugs in the training matrix of \(\textbf{Y}\). Then, the GIP kernel similarity for circRNAs \(\textbf{CGS}(c_{i}, c_{j})\) between circRNA \(c_{i}\) and \(c_{j}\) was calculated as below:
Here, \(\alpha _{c}\) has been set to 1 referring to [28]â€™s studies. And similarly, we calculated the GIP of drug \(\textbf{DGS}(d_{i}, d_{j})\) between drugs \(d_{i}\) and \(d_{j}\) as follows:
Here, the binary vector \(\textbf{BI}(d_{i})\) is the ith column of the \(\textbf{Y}_{\textbf{train}}\) matrix, representing the associations between drugs \(d_{i}\) and all circRNAs in the training matrix of \(\textbf{Y}\). \(\alpha _{d}\) has been set to 1 referring to [28] studies.
Integrated similarity for circRNAs and drugs
Inspired by the study of Wang et al. [31], we used a nonlinear fusion method to integrate the circRNA similarity and the drug similarity. With circRNA similarity, for example, we first normalized the sequence similarity of host genes of circRNAs using the following formula:
Then, the K Nearest Neighbors (KNN) algorithm was used to measure \(\textbf{CSS}\)â€™s local affinity as follows:
\(N_{i}^{'}\) in Eq. (6) is the set of KNN of \(c_{i}\), including \(c_{i}\) in \(\textbf{CSS}\). This operation is based on the assumption that the higher the local similarity, the more reliable it is. Therefore, the nearend similarity is high while the farend similarity is set to 0. Similarly, we repeated the process for \(\textbf{CGS}\) and then we obtained \(\textbf{CGS}^{\mathbf {'}}\) and \(\textbf{CKNN2}\). After that, we updated the similarity matrix for each kind of data as follows:
After each iteration, \(\textbf{CSS}^{\mathbf {'(t+1)}}\) is normalized by formula Eq. (5). Similarly, \(\textbf{CGS}^{\mathbf {'(t+1)}}\) performs the same normalization. The iteration does not stop until the convergence condition is met, and the convergence condition is met when the relative change in \(\frac{\Vert \textbf{CSS}^{\mathbf {'(t+1)}}  \textbf{CSS}^{\mathbf {'(t)}}\Vert }{\Vert \textbf{CSS}^{\mathbf {'(t)}} \Vert }\) and \(\frac{\Vert \textbf{CGS}^{\mathbf {'(t+1)}}  \textbf{CGS}^{\mathbf {'(t)}}\Vert }{\Vert \textbf{CGS}^{\mathbf {'(t)}} \Vert }\) is less than \(10^{6}\). Assuming that the process involves t iterations, the overall comprehensive similarity matrix of circRNA can be obtained by Eq. (9) when the iteration ends.
Based on these rules, the similarity matrix \(\textbf{S}_{\textbf{c}}\) is an asymmetry matrix. Therefore, we calculated the \(\textbf{S}_{\textbf{c}} = \frac{\textbf{S}_{\textbf{c}} + \textbf{S}_{\textbf{c}}^{\textbf{T}}}{2}\) as the circRNA comprehensive similarity matrix. For drugs, we applied the same rules to \(\textbf{DSS}\) and \(\textbf{DGS}\), then we obtained the comprehensive drug similarity matrix \(\textbf{S}_{\textbf{d}}\).
DHANMKF
Dual Hierarchical Attention Networks (DHAN) were proposed by Zhao et al. [32] in 2022. Comprehensive node representations are learned with intratype and intertype attentionbased encoders using a hierarchical process based on the bityped multirelational heterogeneous graphs in DHAN. Specifically, DHAN uses two encoders, one to aggregate information on nodes of the same type and the other to aggregate node representations of different type neighbors. Then, the complex structure of the bityped multirelational heterogeneous graph is captured by the model by a hierarchical process and duallevel attention operation. It is worth noting that the association matrix Y of circRNAdrug is a bityped singlerelation heterogeneous graph. Therefore, in order to fully utilize the extraction ability of DHAN for node embedding, it is necessary to classify the relationships between nodes.
It is wellknown that the adjacency matrix describing different objects in the biomedical field is sparse. This means that there are many nodes with small degrees. Histograms of the degree distributions of circRNAs and drugs can be found in the Supplementary file. It can be seen that most of the nodes have small degrees regardless of whether they are circRNA nodes or drug nodes. Intuitively, the biomedical significance of a drug being associated with only a few circRNAs or a drug being associated with many circRNAs are different. Inspired by Liu et al. [33], we categorized the nodes into head nodes and tail nodes according to the value of their degrees. That is, for every node \(v \in V\), where V is the set of nodes in a graph. \(N_{v}\) is denoted by the set of neighboring nodes of v, and the number of elements in set \(N_{v}\) is defined as the degree of v. Here we let \(V_h\) and \(V_t\) denote the set of head and tail nodes, respectively. For some threshold K, we define tail nodes as nodes with a degree not exceeding K, i.e., \(V_t = \{ v :N_v \le K \}\), whereas head nodes are the complement of tail nodes, that is, \(V_h = \{ v :N_v > K \}\). K is treated as a hyperparameter in our study. In this way, the association of circRNAs with drugs in dataset data271 changes from being one type to being the following four types.

1.
Association between the head node of circRNA and the head node of the drug.

2.
Association between the head node of circRNA and the tail node of the drug.

3.
Association between the tail node of circRNA and the head node of the drug.

4.
Association between the tail node of circRNA and the tail node of the drug.
Because the node types in the circRNA similarity network and the drug similarity network are the same, that is, they are either all circRNA or all drugs, the following three types of associations will be in these two similarity networks.

1.
Associations between the head nodes.

2.
Associations between head node and tail node.

3.
Associations between tail nodes.
In summary, we represent intratype relationships and intertype relationships as \(R_{intra}=\{1,2,3\}\) and \(R_{inter}=\{1,2,3,4\}\), respectively. Whereas in the data251 dataset, circRNAs were split into two types depending on whether the host gene of the circRNA was associated with a disease or not, which is analogous to splitting circRNAs into head nodes and tail nodes. Thus the same number of edge types can also be obtained, and the definition is more biologically meaningful in this way.
Intratype attentionbased encoder
After the computational process above, the associated network of circRNA and drug becomes a bitype multirelationship heterogeneous network, given a node pair \((n_i,n_j) \in C\) that are connected via node intratype relationship \(\Phi _k \in R_{intra}^{(c)}=\{1,2,3\}\). Firstly, we initialized the representation matrix of circRNA to \(\textbf{H}_0^{c}=\textbf{S}_{c}\textbf{Y}_{train}\textbf{W}_{intra}^c=[\textbf{h}_{1}^T, \dots , \textbf{h}_{N_c}^T]^T\). Where \(\textbf{W}_{intra}^c \in R^{N_d \times d^{'}}\) is a learnable parameter \(\textbf{H}_0^{c} \in R^{N_c \times d^{'}}\), and \(\textbf{h}_{i} \in R^{d^{'}}\)(\({1 \le i \le {N}_{c}}\)) is the feature vector of the node \(n_i\). Secondly, selfattention was performed on the circRNA nodes to formulate the importance \(e_{ij}^{\Phi _{k}}\) of a specificrelation based node pair \((n_i, n_j )\) as follows:
Where  denotes the concatenate operation, and \(\textbf{a}_{\Phi _{k}^T} \in R^{2d^{'} \times 1}\) denotes the shared nodelevel attention weight vector under relation \(\Phi _{k}\). LeakyRelu is the nonlinearity activation function, which is widely used in attentionbased neural networks. In the third step, \(e_{ij}^{\Phi _{k}}\) is standardized using the Eq. (11) to facilitate comparison of importance between different nodes.
Where \(N_{intra}^{\Phi _k}(n_i)\) denotes specific relationbased neighbors of \(n_i\), the embedding \(\textbf{h}_{1}^{\Phi _k}\) of node \(n_i\) under given relation \(\Phi _k\) is obtained as follows:
Where \(Norm_{\Phi _{k}}\) denotes the relationspecific layer normalization operation; \(\textbf{h}_{i}^{\Phi _{k}}\) is semanticspecific. Therefore, by using Eq. (12) to fuse the aggregated information of nodes with different specific relations, more comprehensive node embeddings can be obtained as follows:
Where \(\textbf{q} \in R^{2d^{'} \times 1}\) is a trainable parameter. Similar to Eq. (11), we standardize \(g_{i}^{\Phi _{k}}\) by using the softmax function as follows:
Here \(\beta _{ij}^{\Phi _{k}}\) is used to measure the local importance of intrarelation \(\Phi _{k}\). Finally, the intratype attentionbased representation of circRNA node \(n_i\) can be obtained as follows:
Where \(\textbf{z}_{i}^{c} \in R^{d^{'} \times 1}\), \(\beta _{G}^{\phi _l}\) denotes how important intratype \(\Phi _l\) is for all circRNA nodes and can be regarded as a global importance parameter. The global and local importance of the intratype relationship \({\Phi _l}\) is smoothed by the parameter t. Both \(\beta _{G}^{\phi _l}\) and t can be learned from training. The aggregated information for node \(n_i\) under intratype relation \(\Phi _l\) is represented by \(\textbf{h}_{i}^{\Phi _l}\). Initialize the representation matrix of drug to \(\textbf{H}_0^{d}=\textbf{S}_{d}\textbf{Y}_{train}^T\textbf{W}_{intra}^d\), where \(\textbf{W}_{intra}^d \in R^{N_c \times d^{'}}\) is a learnable parameter and \(\textbf{H}_0^{d} \in R^{N_d \times d^{'}}\). Using the same process above, we can get the intratype attentionbased representation of drug node \(n_i\), which can be represented as \(\textbf{z}_{i}^{d} \in R^{d^{'}} \times 1\). Let \(\textbf{Z}_1^{c}=[(\textbf{z}_{1}^{c})^T,\dots ,(\textbf{z}_{N_c}^{c})^T]^T\) and \(\textbf{Z}_1^{d}=[(\textbf{z}_{1}^{d})^T,\dots ,(\textbf{z}_{N_d}^{d})^T]^T\) respectively represent the first layer output of the intratype attentionbased encoder, that is, the node embedding matrix of circRNAs and drugs. Assuming that the intratype attentionbased encoder has t layers, the output of the previous layer is taken as the input of the next layer. Repeating this process can obtain t node embedding matrices about circRNA and drugs as follows: \(\textbf{Z}_1^{c},\dots ,\textbf{Z}_{t}^{c}\), \(\textbf{Z}_1^{d},\dots ,\textbf{Z}_{t}^{d}\).
Intertype attentionbased encoder
The purpose of the intratype attentionbased encoder is to learn node embeddings by aggregating the node information of the same type neighbors, while the purpose of the intertype attentionbased encoder is to handle interactions between different types of nodes. Let \(n_i \in C\) and \(n_j \in D\), respectively. \(\textbf{z}_{i}^{c}\) and \(\textbf{z}_{j}^{d}\) are the learned representations of the circRNA node \(n_i\) and drug node \(n_j\) by intratype attention networks, respectively. The nodelevel importance \(c_{ij}^{\Phi _m}\) can be calculated by Eq. (16) and normalized by Eq. (17) as follows:
\(N_{inter}^{\Phi _m}(n_i)\) denotes the neighbors of node \(n_i\) under specific interrelation \(\Phi _m\). \(\textbf{W}_{inter}^{c}\) and \(\textbf{W}_{inter}^{d} \in R^{d^{'} \times d^{'}}\) are two typespecific matrices that map their features \(\textbf{z}_{i}^{c}\) and \(\textbf{z}_{i}^{d}\) into a common space. \(\textbf{a}_{\Phi _m} \in R^{2d^{'}}\) is a learnable weight vector. The relationship embedding of circRNA node \(n_i\) can be aggregated from the embeddings of its neighbors of different types(that is, the nodes of a drug), with corresponding coefficients as follows:
\(Norm_{\Phi _{m}}\) denotes the layer normalization operation related to the intertype relation. Then, the importance of relation embedding \(\textbf{z}_{i}^{\Phi _{m}}\) related to node \(n_i\) are obtained by fusing all relational representations by Eq. (19), and it is normalized by Eq. (20) for making relation importance comparable within intertype relations.
Finally, the representation \(\textbf{u}_i\) of circRNA node \(n_i\) is obtained by fusing these relationspecific representations as follows:
Similarly, we can get the intertype attentionbased representation of drug node \(n_j\), which can be represented as \(\textbf{u}_{j}^{d}\). Let \(\textbf{U}_1^{c}=[(\textbf{u}_{1}^{c})^T,\dots ,(\textbf{u}_{N_c}^{c})^T]^T\) and \(\textbf{U}_1^{d}=[(\textbf{u}_{1}^{d})^T,\dots ,(\textbf{u}_{N_d}^{d})^T]^T\) respectively represent the first layer output of the intertype attentionbased encoder, that is, the node embedding matrix of circRNAs and drugs. Assuming the intertype attentionbased encoder has M layers, the output of the previous layer is taken as the input of the next layer. Repeating this process can obtain M node embedding matrices about circRNA and drugs as follows: \(\textbf{U}_1^{c},\dots ,\textbf{U}_{M}^{c}\), \(\textbf{U}_1^{d},\dots ,\textbf{U}_{M}^{d}\).
Multikernel fusion
We can extract multiple embeddings from the intratype attentionbased encoder and the intertype attentionbased encoder that represent the information on circRNA nodes and drug nodes of different types and different relationships. For all the embeddings of circRNA and drug, we used the GIP kernel similarity function to calculate the circRNA and drug kernel matrices in each layer as follows:
Where \(\textbf{IC}_{l}\in R^{N_{c} \times N_{c}}\), \(\textbf{ID}_{l}\in R^{N_{d} \times N_{d}}\), \(\textbf{H}_{c}^{(l)}(i)\) and \(\textbf{H}_{d}^{(l)}(i)\) represent the ith row of matrix \(\textbf{H}_{c}^{(l)}\) and \(\textbf{H}_{d}^{(l)}\) respectively; \(\textbf{H}_{c}^{(l)}\) and \(\textbf{H}_{d}^{(l)}\) are the lth element of the circRNA embedding matrix set \(\textbf{H}_c = \{ \textbf{H}_{0}^{c}, \textbf{Z}_1^{c},\dots ,\textbf{Z}_{t}^{c}, \textbf{U}_1^{c},\dots ,\textbf{U}_{M}^{c} \}\) and the drug embedding matrix set \(\textbf{H}_d = \{ \textbf{H}_{0}^{d}, \textbf{Z}_1^{d},\dots ,\textbf{Z}_{t}^{d}, \textbf{U}_1^{d},\dots ,\textbf{U}_{M}^{d} \}\), respectively. \(\gamma _{l}\) denotes the corresponding bandwidth, we set \(\gamma _{l}=\gamma ,l=1,\cdots ,K+1\), and \(\gamma\) is a hyperparameter, and \(K+1\) is the number of elements in the circRNA embedding matrix set \(\textbf{H}_c\) and the drug embedding matrix set \(\textbf{H}_d\).
We integrated all the kernels above with multiple kernel fusion in order to fully utilize the information and improve the performance of predicting circRNAdrug associations, then the final kernel matrices of circRNA and drug were obtained as follows:
Where \(\textbf{IC}\in R^{N_{c} \times N_{c}}\), \(\textbf{ID}\in R^{N_{d} \times N_{d}}\), \(\omega _{i}^{m}=\frac{1}{K+1}\), and \(\omega _{i}^{d}=\frac{1}{K+1}\) are the corresponding weight of circRNA kernels and drug kernels, respectively.
Dual Laplacian regularized least squares model
Inspired by previous studies [34] and [35], the Dual Laplacian Regularized Least Squares (DLapRLS) method was adopted by us to predict circRNAdrug associations. Overfitting was avoided by adding graph regularization with DLapRLS. Thus, the loss function can be defined as follows:
Where \(\Vert \cdot \Vert _{F}\) is the Frobenius norm, \(\varvec{\alpha }_c\) and \(\varvec{\alpha }_{d}^{T}\in R^{N_{c}\times N_{d}}\) are learnable matrices, \(\phi _{c}\) and \(\phi _{d}\) are regularization parameters.; \(\textbf{L}_{c} \in R^{N_{c}\times N_{c}}\) and \(\textbf{L}_{d} \in R^{N_{d} \times N_{d}}\) are the normalized Laplacian matrices, as follows:
Where \(\textbf{V}_{c}=\sum _{i=1}^{N_{c}}\textbf{IC}\) and \(\textbf{V}_{d}=\sum _{i=1}^{N_{d}}\textbf{ID}\) are diagonal degree matrix. Finally, the prediction \(\hat{\textbf{F}}\) for circRNAdrug associations from \(\textbf{IC}\) and \(\textbf{ID}\) is obtained as follows:
Training
Except for parameters \(\varvec{\alpha }_{c}\) and \(\varvec{\alpha }_{d}\), the parameters of our model are updated by Adam [36]. The parameters of \(\varvec{\alpha }_{c}\) and \(\varvec{\alpha }_{d}\) are updated by calculating the partial derivatives for the parameters of DLapRLS. The specific calculation process is as follows: we first assume that \(\varvec{\alpha }_{d}\) is a constant matrix when \(\varvec{\alpha }_{d}\) is optimized. Thus, the partial derivative of the loss function Eq. (26) with respect to \(\varvec{\alpha }_{c}\) can be calculated as follows:
Let \(\frac{\partial J}{\partial \varvec{\alpha }_{c}} = 0\), then \(\varvec{\alpha }_{c}\) can be obtained as follows:
Similarly, the partial derivative of the loss function Eq. (26) with respect to \(\varvec{\alpha }_{d}\) can be calculated as follows:
Same as above, we let \(\frac{\partial J}{\partial \varvec{\alpha }_{d}} = 0\), and then \(\varvec{\alpha }_{d}\) can be obtain as follows:
\(\varvec{\alpha }_{c}\) and \(\varvec{\alpha }_{d}\) were randomly initialized at the beginning of our model training, and then they were calculated by Eqs. (31) and (33) directly in each iteration, while other parameters were optimized by Adam. The flowchart of our model is shown in Fig. 1. All experimentally verified circRNAdrug associations were treated as positive samples, and the unknown circRNAdrug associations were treated as negative samples, similar to the work of Deng et al. [15] and Yang et al. [17]. Then, the same number of negative samples were randomly selected from all the the unknown circRNAdrug associations. Finally, the same number of positive and negative samples were selected for training.
Results
Implementation details and performance evaluation
The model used in this study was implemented based on PyTorch and PyG, and we evaluated the predictive performance of our model using 5fold crossvalidation (5CV). The training epochs were set to 40, the learning rate to 0.05 and the weight decay to 0.01. The number of layers for both the intratype attentionbased encoder and the intertype attentionbased encoder were set to 1 and the output dimensions were both set to 16. The thresholds for distinguishing the head and tail nodes of circRNAs and drugs were set at 27 and 39, respectively. Multiheaded attention was set to 5, and the remaining hyperparameters were set as follows: \(\phi _{c}=\phi _{d}=\frac{1}{120}\), \(\gamma _c=\gamma _d=\gamma =\frac{1}{75}\).
During evaluation, we randomly divided all the samples into 5 folds. Four of these folds were used as a training set while the remaining fold was treated as a test set. Seven metrics are used to compare model performance: AUC, AUPR, Accuracy, Precision, Recall, F1Score, and Specificity. It is wellestablished that improved model performance is reflected by higher AUC and AUPR values. F1Score is the average of accuracy and recall, while specificity measures the ability of the classifier to correctly identify negative cases.
Performance comparison with other methods under 5CV
The current computational methods for predicting circRNAdrug sensitivity associations are restricted. We found that GATECDA [15] and MNGACDA [17] are specifically designed for predicting circRNAdrug sensitivity associations. Thus, like Ref. [15] and Ref. [17], we compared our model with seven stateoftheart models from different domains, namely MNGACDA [17], GATECDA [15], MINIMDA [37], LAGCN [38], MMGCN [39], and GANLDA [40] . Brief descriptions of these models are provided below:

MNGACDA [17] : a computational framework for predicting circRNAdrug sensitivity associations. This model uses multimodal networks to learn the embedded representations of circRNAs and drugs, then captures the internal information between nodes in the networks with nodelevel attention Graph AutoEncoder.

GATECDA [15] : a computational model based on Graph Attention Autoencoder (GATE) for predicting circRNAdrug sensitivity associations.

MKGCN [35] : a computational model based on GCN and MKL for predicting microbedrug associations.

MINIMDA [37] : a method of predicting miRNAdisease associations by constructing integrated similarity networks and using multimodal networks to obtain embedding representations of miRNAs and diseases. These representations are then fed into a multilayer perceptron for prediction.

LAGCN [38] : LAGCN integrates various associations into a heterogeneous network, learns embeddings of drugs and diseases by Graph Convolution operations, and then combines multiple layers by using an attention function.

MMGCN [39] : MMGCN differs from simple multisource integration in that it uses a GCN encoder to obtain miRNA and disease features in different similarity views and enhances the learned representations for association prediction by using multichannel attention that adaptively learns the importance of different features.

GANLDA [40] : this method combines heterogeneous data of lncRNA and disease as original features and reduces noise by using Principal Component Analysis (PCA). Then the Graph Attention Network is used to extract information from the features. Finally, a multilayer perceptron is used to predict lncRNAdisease associations.
The prediction performance of each method was evaluated by a 5CV experiment using the same settings and optimal parameters recommended in their respective studies. From Table 1, it can be seen that DHANMKF achieved the highest AUC and AUPR values. This indicates that DHANMKF performed better overall compared to the other models.
Evaluation of parameters
The prediction performance of DHANMKF is affected by various parameter values. The parameters of DHANMKF can be divided into four parts: the parameters in the intertype attentionbased encoder and the intratype attentionbased encoder, bandwidth parameter \(\gamma\) in MKF, regularization parameters (\(\phi _c\) and \(\phi _d\)) in DLapRL, and degree threshold parameters (\(K_c\) and \(K_d\)) for distinguishing circRNA and drug nodes as head and tail nodes. Here, the process of parameter evaluation is demonstrated using data271 as the baseline dataset.The parameter settings of DHANMKF on the data251 dataset have been put into the Supplementary file.
Optimizable parameters in the intratype attentionbased encoder and the intertype attentionbased encoder

Learning rate and its weight decay. Learning rate and its weight decay are the same in the intratype attentionbased encoder and the intertype attentionbased encoder. Based on the research conducted by Zhao et al. [32], we set them as 0.05 and 0.01 respectively.

Dropout and number of model training epochs. We selected the dropout to be \(\{0.02,0.021,\dots ,0.03\)}. When the value of the dropout is 0.026, the model performance reaches its optimum, and with an increase in the value of the dropout, the model performance gradually declines. The loss of DHANMKF started converging at 40 training epochs, so the number of epochs for our model was set to 40.

The number of attention heads. To have a more powerful representation learning capacity, the multihead attention mechanism was incorporated into the model. This parameters was tuned using 5CV. As shown in Fig. 2A, when the number of attention heads is equal to 5, the model performance reaches its optimum.

The output dimensions. We analyzed the output dimensions of the intratype attentionbased encoder and intertype attentionbased encoder, as shown in Fig. 2B. When the output dimension was 16, the AUC performance was best.

The number of layers of the intratype attentionbased encoder and the intertype attentionbased encoder. As shown in Fig. 3A, when the number of layers of the intratype attentionbased encoder and the intertype attentionbased encoder are both 1, the AUC of DHANMKF reaches its optimal value.
Optimizable parameters in MKF and DLapRL

The bandwidth parameter \(\gamma\) in MKF is actually the \(\frac{1}{2\sigma ^2}\) of the Gaussian kernel function, that is, \(\gamma =\frac{1}{2\sigma ^2}\). Parameter \(\sigma\) determines the smoothness of the Gaussian filter. The larger \(\sigma\) is, the smoother it is. Therefore, by adjusting \(\gamma\), a compromise can be reached between oversmoothing and undersmoothing. As shown in Fig. 3B, when \(\gamma = \frac{1}{75}\), the AUC of DHANMKF reaches its optimal value.

The parameters \(\phi _c\) and \(\phi _d\) play a regulating role in DLapRL, and they can be adjusted to balance underfitting and overfitting. From Fig. 4A, we can see that the AUC of the model reaches its maximum when \(\phi _c\) and \(\phi _d\) are both \(\frac{1}{120}\).
Optimization of head and tail node thresholds
The threshold of the head and tail nodes can adjust the number of head and tail nodes and the number of associated types, thus affecting the embedding of corresponding nodes. As shown in Fig. 4B, the maximum AUC value of the model is achieved when the thresholds of the circRNA node and drug node are 27 and 39, respectively.
Ablation tests
Ablation experiments were conducted from two perspectives: 1. Analyzing the importance of the intratype attentionbased encoder and the intertype attentionbased encoder; 2. Analyzing the effects of multiple relationships. Therefore, we constructed three ablation experiments. The first one is called DHANMKFintra, which means that DHANMKF removes the embedding produced by the intratype attentionbased encoder when doing MultiKernel Fusion. The second one is called DHANMKFinter, which means that DHANMKF removes the embedding produced by the intertype attentionbased encoder when doing multicore fusion. The third one is called DHANMKFmulti, which means that the model no longer divides the relationships between nodes into multiple categories.
Table 2 shows the comparison results of the 5CV. From Table 2 we can see that DHANMKF performs better than all other models. This shows that: (1) Compared with DHANMKFintra and DHANMKFinter, DHANMKF performs better, which means that the embeddings produced by the intratype attentionbased encoder and the intertype attentionbased encoder improve the performance of the model. (2) Compared with DHANMKFmulti, DHANMKF can generate node embeddings corresponding to different relationships between nodes. In summary, there are two main reasons why DHANMKF can outperform other models. The first reason is that our model can fully capture the complex structures of the bityped multirelational heterogeneous graphs. The second reason is that biological networks in reality are sparse, so it is reasonable to divide the nodes in biological networks into head nodes and tail nodes for analysis.
Case studies
To further evaluate the predictive performance of our model, we selected two drugs, PAC1 and Vorinostat, for case studies. Similar to Deng et al. [15] and Yang et al. [17], we used the circRNAdrug associations in the GDSC database as the training set and those in the CTRP database as the testing set. For each drug, we chose the top 20 circRNAs with the highest predicted scores from our modelâ€™s circRNAdrug association prediction outputs for validation.
PAC1 is the first known small molecule drug that directly activates procaspase3 to caspase3 [41]. It not only enhances procaspase3 activity but also induces cancer cell apoptosis. Vitro experiments have shown that PAC1 exhibits cytotoxicity against lymphoma, multiple myeloma, and many other cancer cells [42]. Currently, PAC1 has been used in clinical trials for the treatment of various tumors, including but not limited to lymphoma, melanoma, solid tumors, breast cancer, and lung cancer [43]. As shown in Table 3, among the top 20 circRNAs predicted by our method to be associated with PAC1, 16 have been identified in CTRP.
Belinostat is a smallmolecule hydroxamatetype inhibitor that can inhibit the activity of class I, II and IV histone deacetylase enzymes. It has been used to treat relapsed or refractory peripheral Tcell lymphoma [44]. Table 4 shows that 17 of the top 20 circRNAs predicted by our method have been confirmed in circRic.
In order to demonstrate the performance of DHANMKF in predicting the potential association between new drugs and circRNA, we chose two drugs for ab initio testing, both of which had only one known circRNAdrug association. During the training phase, we removed the unique association between these two drugs and circRNA. At this point, these two drugs were not associated with any circRNAs and were treated as new drugs during training. These two drugs were Bortezomib and MS275 (Entinostat). Bortezomib is a novel proteasome inhibitor with potent chemo/radiosensitizing effects that can overcome the traditional resistance of tumors when used in combination with chemotherapy [45]. In addition, existing clinical applications have shown that Bortezomib can improve clinical outcomes in the treatment of hematologic malignancies [46].
MS275, also known as Entinostat, is effective in human leukemia cells and lymphoma cells. It can reduce the level of BclXL in cells, induce p21 protein expression, cause cell cycle arrest (G1 phase), and induce cell apoptosis [47]. In addition, when used in combination with other drugs, entinostat can enhance the activity of some anticancer drugs, including Rituximab, Gemcitabine, Doxorubicin, Sorafenib and Bortezomib. Currently, Entinostat is undergoing phase III clinical trials and its clinical data shows that it has great potential for treating breast cancer [48].
As shown in Table 5, 6 of the top 10 predicted circRNAs associated with Bortezomib have been confirmed in circRic, and 7 of the top 10 circRNAs related to MS275 have been confirmed in circRic.
Conclusions
Recent research over the past twenty years has shown that circRNA plays an important role in drug sensitivity. Therefore, predicting the potential association between circRNA and drug sensitivity can be helpful in drug development and utilization, thus benefiting patients. In this study, we proposed a method, based on intratype attention and intertype attention called DHANMKF, for discovering potential circRNAdrug sensitivity associations. To verify the effectiveness of the model, DHANMKF was compared with six stateoftheart methods based on 5CV on benchmark datasets. The results showed that DHANMKF achieved the best performance. In addition, to further evaluate the ability of the model to discover new drugs, a case study was conducted and the modelâ€™s prediction results were validated using an independent database. The validation results clearly demonstrate that DHANMKF is an effective tool for predicting new circRNAdrug sensitivity associations.
The results show that our model outperforms the baseline models. We believe the main reasons are the following: (1) We classify the nodes into head and tail nodes, which in turn defines the types of edges connecting these two types of nodes. This allows our model to extract node embeddings from the circRNAdrug heterogeneous graph based on different types of edges. (2) The intratype attentionbased encoder can efficiently aggregate information from nodes of the same type. (3) The intertype attentionbased encoder adequately extracts node representations from different types of nodes . (4) The MKL method fuses the multirelational heterogeneous graph information captured by the two encoders in order to improve the overall performance of the model. In future studies, we plan to integrate more biomedical data, in order to generate more comprehensive circRNA and drug kernels and further improve model performance. Currently, there are few studies that use computational methods to predict potential associations between circRNA and drug sensitivity, so further investigation in this field is merited.
Availability of data and materials
The data sets and source codes used in this study are freely available at https://github.com/cuntjx/DHANMKF.
References
Nisar S, Bhat AA, Singh M, Karedath T, Rizwan A, Hashem S, et al. Insights into the role of CircRNAs: biogenesis, characterization, functional, and clinical impact in human malignancies. Frontiers Cell Dev Biol. 2021;9:617281.
Sanger HL, Klotz G, Riesner D, Gross HJ, Kleinschmidt AK. Viroids are singlestranded covalently closed circular RNA molecules existing as highly basepaired rodlike structures. Proc Natl Acad Sci. 1976;73(11):3852â€“6.
Memczak S, Jens M, Elefsinioti A, Torti F, Krueger J, Rybak A, et al. Circular RNAs are a large class of animal RNAs with regulatory potency. Nature. 2013;495(7441):333â€“8.
Hansen TB, Jensen TI, Clausen BH, Bramsen JB, Finsen B, Damgaard CK, et al. Natural RNA circles function as efficient microRNA sponges. Nature. 2013;495(7441):384â€“8.
Yu CY, Kuo HC. The emerging roles and functions of circular RNAs and their generation. J Biomed Sci. 2019;26(1):1â€“12.
AshwalFluss R, Meyer M, Pamudurti NR, Ivanov A, Bartok O, Hanan M, et al. circRNA biogenesis competes with premRNA splicing. Mol Cell. 2014;56(1):55â€“66.
Jeck WR, Sharpless NE. Detecting and characterizing circular RNAs. Nat Biotechnol. 2014;32(5):453â€“61.
Zhang Y, Yang L, Chen LL. Life without A tail: new formats of long noncoding RNAs. Int J Biochem Cell Biol. 2014;54:338â€“49.
Li F, Zhang L, Li W, Deng J, Zheng J, An M, et al. Circular RNA ITCH has inhibitory effect on ESCC by suppressing the Wnt/\(\beta\)catenin pathway. Oncotarget. 2015;6(8):6001.
Fan C, Lei X, Tie J, Zhang Y, Wu FX, Pan Y. CircR2Disease v2. 0: an updated web server for experimentally validated circRNAâ€“disease associations and its application. Genomics Proteomics Bioinforma. 2022;20(3):435â€“45.
Gao D, Zhang X, Liu B, Meng D, Fang K, Guo Z, et al. Screening circular RNA related to chemotherapeutic resistance in breast cancer. Epigenomics. 2017;9(9):1175â€“88.
KunPeng Z, XiaoLong M, Lei Z, ChunLin Z, JianPing H, TaiCheng Z. Screening circular RNA related to chemotherapeutic resistance in osteosarcoma by RNA sequencing. Epigenomics. 2018;10(10):1327â€“46.
Wu Q, Wang H, Liu L, Zhu K, Yu W, Guo J. Hsa_circ_0001546 acts as a miRNA421 sponge to inhibit the chemoresistance of gastric cancer cells via ATM/Chk2/p53dependent pathway. Biochem Biophys Res Commun. 2020;521(2):303â€“9.
Ruan H, Xiang Y, Ko J, Li S, Jing Y, Zhu X, et al. Comprehensive characterization of circular RNAs in\(^{\sim }\) 1000 human cancer cell lines. Genome Med. 2019;11:1â€“14.
Deng L, Liu Z, Qian Y, Zhang J. Predicting circRNAdrug sensitivity associations via graph attention autoencoder. BMC Bioinformatics. 2022;23(1):1â€“15.
Salehi A, Davulcu H.Â Graph attention autoencoders. arXivÂ preprintÂ arXiv:1905.10715. 2019.
Yang B, Chen H. Predicting circRNAdrug sensitivity associations by learning multimodal networks using graph autoencoders and attention mechanism. Brief Bioinforma. 2023;24(1):bbac596.
GÃ¶nen M, AlpaydÄ±n E. Multiple kernel learning algorithms. J Mach Learn Res. 2011;12:2211â€“68.
Kipf TN, Welling M.Â Semisupervised classification with graph convolutional networks.Â arXiv preprint arXiv:1609.02907.Â 2016.
Yan XY, Zhang SW, He CR. Prediction of drugtarget interaction by integrating diverse heterogeneous information source with multiple kernel learning and clustering methods. Comput Biol Chem. 2019;78:460â€“7.
Peng W, Wu R, Dai W, Ning Y, Fu X, Liu L, et al. MiRNAâ€“gene network embedding for predicting cancer driver genes. Brief Funct Genomics. 2023;22(4):341â€“50. https://doi.org/10.1093/bfgp/elac059.
Yang W, Soares J, Greninger P, Edelman EJ, Lightfoot H, Forbes S, et al. Genomics of Drug Sensitivity in Cancer (GDSC): a resource for therapeutic biomarker discovery in cancer cells. Nucleic Acids Res. 2012;41(D1):D955â€“61.
Rangwala SH, Kuznetsov A, Ananiev V, Asztalos A, Borodin E, Evgeniev V, et al. Accessing NCBI data using the NCBI sequence viewer and genome data viewer (GDV). Genome Res. 2021;31(1):159â€“69.
Wang Y, Bryant SH, Cheng T, Wang J, Gindulyte A, Shoemaker BA, et al. Pubchem bioassay: 2017 update. Nucleic Acids Res. 2017;45(D1):D955â€“63.
Shen L, Liu F, Huang L, Liu G, Zhou L, Peng L. VDARWLRLS: An antiSARSCoV2 drug prioritizing framework combining an unbalanced birandom walk and Laplacian regularized least squares. Comput Biol Med. 2022;140:105119.
Tian X, Shen L, Wang Z, Zhou L, Peng L. A novel lncRNAprotein interaction prediction method based on deep forest with cascade forest structure. Sci Rep. 2021;11(1):18881.
Landrum G, et al. RDKit: A software suite for cheminformatics, computational chemistry, and predictive modeling. Greg Landrum. 2013;8:1â€“31.
Chen X, Yan CC, Zhang X, You ZH, Deng L, Liu Y, et al. WBSMDA: within and between score for MiRNAdisease association prediction. Sci Rep. 2016;6(1):1â€“9.
Van Laarhoven T, Nabuurs SB, Marchiori E. Gaussian interaction profile kernels for predicting drugtarget interaction. Bioinformatics. 2011;27(21):3036â€“43.
Zou S, Zhang J, Zhang Z. A novel approach for predicting microbedisease associations by birandom walk on the heterogeneous network. PLoS ONE. 2017;12(9):e0184394.
Wang B, Mezlini AM, Demir F, Fiume M, Tu Z, Brudno M, et al. Similarity network fusion for aggregating data types on a genomic scale. Nat Methods. 2014;11(3):333â€“7.
Zhao Y, Wei S, Du H, Chen X, Li Q, Zhuang F, et al. Learning Bityped Multirelational Heterogeneous Graph via Dual Hierarchical Attention Networks. IEEE Trans Knowl Data Eng. 2022;35(9):9054â€“66.
Liu Z, Nguyen TK, Fang Y. Tailgnn: Tailnode graph neural networks. In: Proceedings of the 27th ACM SIGKDD Conference on Knowledge Discovery & Data Mining. New York, United States: Association for Computing Machinery; 2021. p. 1109â€“19.
Ding Y, Tang J, Guo F. Identification of drugtarget interactions via dual laplacian regularized least squares with multiple kernel fusion. KnowlBased Syst. 2020;204:106254.
Yang H, Ding Y, Tang J, Guo F. Inferring human microbedrug associations via multiple kernel fusion on graph neural network. KnowlBased Syst. 2022;238:107888.
Kingma, Diederik P., and Jimmy Ba. "Adam: A method for stochastic optimization."Â arXiv preprint arXiv:1412.6980. 2014.
Lou Z, Cheng Z, Li H, Teng Z, Liu Y, Tian Z. Predicting miRNAâ€“disease associations via learning multimodal networks and fusing mixed neighborhood information. Brief Bioinforma. 2022;23(5).
Yu Z, Huang F, Zhao X, Xiao W, Zhang W. Predicting drugâ€“disease associations through layer attention graph convolutional network. Brief Bioinforma. 2021;22(4):1â€“11.
Tang X, Luo J, Shen C, Lai Z. Multiview multichannel attention graph convolutional network for miRNAâ€“disease association prediction. Brief Bioinforma. 2021;22(6):bbab174.
Lan W, Wu X, Chen Q, Peng W, Wang J, Chen YP. GANLDA: graph attention network for lncRNAdisease associations prediction. Neurocomputing. 2022;469:384â€“93.
Peterson QP, Goode DR, West DC, Ramsey KN, Lee JJ, Hergenrother PJ. PAC1 activates procaspase3 in vitro through relief of zincmediated inhibition. J Mol Biol. 2009;388(1):144â€“58.
SÂ Roth H, JÂ Hergenrother P. Derivatives of procaspaseactivating compound 1 (PAC1) and their anticancer activities. Curr Med Chem. 2016;23(3):201â€“41.
Assempour N, Iynkkaran I, Liu Y, Maciejewski A, Gale N, Wilson A, etÂ al. 5.0: a major update to the DrugBank database for 2018. Nucleic Acids Res. 2018;46:D1074â€“82.
Poole RM. Belinostat: first global approval. Drugs. 2014;74:1543â€“54.
Adams J, Kauffman M. Development of the proteasome inhibitor Velcadeâ„¢(Bortezomib). Cancer Investig. 2004;22(2):304â€“11.
Mujtaba T, Dou QP. Advances in the understanding of mechanisms and therapeutic use of bortezomib. Discov Med. 2011;12(67):471.
Knipstein J, Gore L. Entinostat for treatment of solid tumors and hematologic malignancies. Expert Opin Investig Drugs. 2011;20(10):1455â€“67.
Connolly RM, Rudek MA, Piekarz R. Entinostat: a promising treatment option for patients with advanced breast cancer. Future Oncol. 2017;13(13):1137â€“48.
Acknowledgements
Not applicable.
Funding
This work was supported in part by the major key project of PCL(PCL2021A12), the Macau Science and Technology Development Funds Grands No. 0056/2020/AFJ from the Macau Special Administrative Region of the Peopleâ€™s Republic of China, the Key Project for University of Educational Commission of Guangdong Province of China Funds (Natural, Grant No. 2019GZDXM005), Young and Middle aged Teachers Research Basic Ability Improvement Project of Guangxi Universities,China (No. 2022KY0608).
Author information
Authors and Affiliations
Contributions
SHL designed the framework, conducted the experiments, and wrote the manuscript. YL, LL, SLL, YFZ, CJY, and DOY modifed the manuscript. All author(s) revised and approved the manuscript.
Corresponding author
Ethics declarations
Ethics approval and consent to participate
Not applicable.
Consent for publication
Not applicable.
Competing interests
The authors declare no competing interests.
Additional information
Publisherâ€™s Note
Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.
Supplementary Information
Rights and permissions
Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons licence, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons licence, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons licence and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this licence, visit http://creativecommons.org/licenses/by/4.0/. The Creative Commons Public Domain Dedication waiver (http://creativecommons.org/publicdomain/zero/1.0/) applies to the data made available in this article, unless otherwise stated in a credit line to the data.
About this article
Cite this article
Lu, S., Liang, Y., Li, L. et al. Inferring circRNAdrug sensitivity associations via dual hierarchical attention networks and multiple kernel fusion. BMC Genomics 24, 796 (2023). https://doi.org/10.1186/s1286402309899w
Received:
Accepted:
Published:
DOI: https://doi.org/10.1186/s1286402309899w