MDAPlatform: a Component-based Platform for Constructing and Assessing miRNA-disease association Prediction Methods

2021 ◽  
Vol 16 ◽  
Author(s):  
Yayan Zhang ◽  
Guihua Duan ◽  
Cheng Yan ◽  
Haolun Yi ◽  
Fang-Xiang Wu ◽  
...  

Background: Increasing evidence has indicated that miRNA-disease association prediction plays a critical role in the study of clinical drugs. Researchers have proposed many computational models for miRNA-disease prediction. However, there is no unified platform for comparing and analyzing the pros and cons of these models or for sharing their code and data. Objective: In this study, we develop an easy-to-use platform (MDAPlatform) to construct and assess miRNA-disease association prediction methods. Methods: MDAPlatform integrates the miRNA, disease, and miRNA-disease association data used in previous miRNA-disease association prediction studies. Based on a componentized model, it implements the different components of previous computational methods. Results: Users can conduct cross-validation experiments and compare their methods with other methods; visualized comparison results are also provided. Conclusion: Based on the componentized model, MDAPlatform provides easy-to-operate interfaces for constructing miRNA-disease association prediction methods, which should benefit the development of new miRNA-disease association prediction methods in the future.
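The cross-validation experiments mentioned in this abstract can be illustrated with a minimal sketch. It is not MDAPlatform's code or interface: the function name `k_fold_splits`, the pair format, and the toy identifiers are assumptions made for illustration. Known associations are shuffled and dealt into k folds, and each fold is held out once as the test set.

```python
import random


def k_fold_splits(pairs, k=5, seed=0):
    """Yield (train, test) lists so that every pair is held out exactly once."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    folds = [shuffled[i::k] for i in range(k)]  # deal pairs round-robin into k folds
    for i, test in enumerate(folds):
        train = [p for j, fold in enumerate(folds) if j != i for p in fold]
        yield train, test


# Hypothetical known associations (toy identifiers, not real data).
known = [(f"miRNA-{m}", f"disease-{d}") for m in range(5) for d in range(2)]
splits = list(k_fold_splits(known, k=5))
for fold_index, (train, test) in enumerate(splits):
    print(fold_index, len(train), len(test))
```

A method under evaluation would be fitted on each `train` list and scored on the matching `test` list, so that no held-out association leaks into training.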

2021 ◽  
Vol 21 (S1) ◽  
Author(s):  
Yu-Tian Wang ◽  
Qing-Wen Wu ◽  
Zhen Gao ◽  
Jian-Cheng Ni ◽  
Chun-Hou Zheng

Abstract Background: MicroRNAs (miRNAs) have been confirmed to have a close relationship with various complex human diseases. The identification of disease-related miRNAs provides great insights into the underlying pathogenesis of diseases. However, identifying which miRNAs are related to diseases remains a major challenge. As experimental methods are generally expensive and time‐consuming, it is important to develop efficient computational models to discover potential miRNA-disease associations. Methods: This study presents a novel prediction method called HFHLMDA, which is based on high-dimensionality features and hypergraph learning, to reveal the associations between diseases and miRNAs. First, the miRNA functional similarity and the disease semantic similarity are integrated to form an informative high-dimensionality feature vector. Then, a hypergraph is constructed by the K-Nearest-Neighbor (KNN) method, in which each miRNA-disease pair and its k most relevant neighbors are linked as one hyperedge to represent the complex relationships among miRNA-disease pairs. Finally, a hypergraph learning model is designed to learn the projection matrix, which is used to calculate the association scores of uncertain miRNA-disease pairs. Results: Compared with four state-of-the-art computational models, HFHLMDA achieved the best results of 92.09% and 91.87% in leave-one-out cross validation and fivefold cross validation, respectively. Moreover, in case studies on Esophageal Neoplasms, Hepatocellular Carcinoma, and Breast Neoplasms, 90%, 98%, and 96% of the top 50 predictions, respectively, were manually confirmed by previous experimental studies. Conclusion: MiRNAs have complex connections with many human diseases. In this study, we proposed a novel computational model to predict the underlying miRNA-disease associations. All results show that the proposed method is effective for miRNA–disease association prediction.
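The KNN hypergraph construction described in this abstract can be sketched as follows. This is an illustration under stated assumptions, not the HFHLMDA implementation: the function name `knn_hypergraph`, the use of Euclidean distance as the relevance measure, and the toy feature vectors are all hypothetical. Each sample (standing in for a miRNA-disease pair) defines one hyperedge that contains itself and its k nearest neighbors, and the result is a vertex-by-hyperedge incidence matrix.

```python
import math


def knn_hypergraph(features, k):
    """Return an n x n incidence matrix: entry [v][e] is 1 if vertex v lies in hyperedge e."""
    n = len(features)
    incidence = [[0] * n for _ in range(n)]
    for e, center in enumerate(features):
        # Rank every other sample by Euclidean distance to the hyperedge's centre.
        others = sorted(
            (v for v in range(n) if v != e),
            key=lambda v: math.dist(center, features[v]),
        )
        for v in [e] + others[:k]:  # the centre itself plus its k nearest neighbours
            incidence[v][e] = 1
    return incidence


# Toy feature vectors for four hypothetical miRNA-disease pairs.
features = [(0.0, 0.0), (0.0, 1.0), (5.0, 5.0), (5.0, 6.0)]
incidence = knn_hypergraph(features, k=1)
for row in incidence:
    print(row)
```

With k = 1 the two nearby samples in each cluster share hyperedges, which is the grouping effect the hypergraph is meant to capture.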


Author(s):  
Xing Chen ◽  
Lian-Gang Sun ◽  
Yan Zhao

Abstract Emerging evidence shows that microRNAs (miRNAs) play a critical role in diverse fundamental biological processes associated with human diseases. Inferring potential disease-related miRNAs and employing them as biomarkers or drug targets could contribute to the prevention, diagnosis and treatment of complex human diseases. Given that traditional biological experiments cost much time and many resources, computational models can serve as a complementary means of uncovering potential miRNA–disease associations. In this study, we proposed a new computational model named Neighborhood Constraint Matrix Completion for MiRNA–Disease Association prediction (NCMCMDA) to predict potential miRNA–disease associations. The main task of NCMCMDA was to recover the missing miRNA–disease associations based on the known miRNA–disease associations and integrated disease (miRNA) similarity. In this model, we integrated a neighborhood constraint with matrix completion, which provided a novel way of using similarity information to assist the prediction. After the recovery task was transformed into an optimization problem, we solved it with a fast iterative shrinkage-thresholding algorithm. As a result, the AUCs of NCMCMDA in global and local leave-one-out cross validation were 0.9086 and 0.8453, respectively. In 5-fold cross validation, NCMCMDA achieved an average AUC of 0.8942 with a standard deviation of 0.0015, demonstrating NCMCMDA’s superior performance over many previous computational methods. Furthermore, NCMCMDA was applied to three different types of case studies to further evaluate its prediction reliability and accuracy. As a result, 84% (colon neoplasms), 98% (esophageal neoplasms) and 98% (breast neoplasms) of the top 50 predicted miRNAs were verified by recent literature.
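The AUC values reported in this abstract measure how well predicted scores rank true associations above non-associations. The sketch below computes AUC from its pairwise definition; it is an illustration only, and the function name `roc_auc` and the toy scores and labels are assumptions rather than anything taken from NCMCMDA.

```python
def roc_auc(scores, labels):
    """AUC as the probability that a random positive outranks a random negative."""
    positives = [s for s, y in zip(scores, labels) if y == 1]
    negatives = [s for s, y in zip(scores, labels) if y == 0]
    if not positives or not negatives:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5  # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy prediction scores and ground-truth labels (hypothetical values).
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_auc = roc_auc(toy_scores, toy_labels)
print(toy_auc)
```

The quadratic loop is kept for clarity; a rank-based formulation gives the same value more efficiently on large candidate lists.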


2020 ◽  
Vol 21 (1) ◽  
Author(s):  
Yubin Xiao ◽  
Zheng Xiao ◽  
Xiang Feng ◽  
Zhiping Chen ◽  
Linai Kuang ◽  
...  

Abstract Background: Accumulating evidence has demonstrated that long non-coding RNAs (lncRNAs) are closely associated with human diseases, and knowing the relationships between lncRNAs and diseases is useful for the diagnosis and treatment of diseases. Because traditional bio-experiments are costly and time-consuming, more and more computational methods have been proposed in recent years to infer potential lncRNA-disease associations. However, these state-of-the-art prediction methods still have various limitations. Results: In this manuscript, a novel computational model named FVTLDA is proposed to infer potential lncRNA-disease associations. The major novelty of FVTLDA lies in the integration of direct and indirect features related to lncRNA-disease associations, such as the feature vectors of lncRNA-disease pairs and their corresponding association probability fractions, which ensures that FVTLDA can be used to predict diseases without known related lncRNAs and lncRNAs without known related diseases. Moreover, FVTLDA neither relies solely on known lncRNA-disease associations nor requires any negative samples, which ensures that it can infer potential lncRNA-disease associations more equitably and effectively than traditional state-of-the-art prediction methods. Additionally, to avoid the limitations of single-model prediction techniques, we combine FVTLDA with Multiple Linear Regression (MLR) and with an Artificial Neural Network (ANN) for data analysis. Simulation experiment results show that FVTLDA with MLR can achieve reliable AUCs of 0.8909, 0.8936 and 0.8970 in 5-Fold Cross Validation (fivefold CV), 10-Fold Cross Validation (tenfold CV) and Leave-One-Out Cross Validation (LOOCV), respectively, while FVTLDA with ANN can achieve reliable AUCs of 0.8766, 0.8830 and 0.8807 in fivefold CV, tenfold CV, and LOOCV, respectively.
Furthermore, in case studies of gastric cancer, leukemia and lung cancer, experiment results show that 8, 8 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with MLR, and 8, 7 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with ANN, have been verified by recent literature. Compared with the representative prediction model KATZLDA, FVTLDA with MLR and FVTLDA with ANN achieve average case study contrast scores of 0.8429 and 0.8515, respectively, both notably higher than the average case study contrast score of 0.6375 achieved by KATZLDA. Conclusion: The simulation results show that FVTLDA has good prediction performance and is a good supplement to future bioinformatics research.


2020 ◽  
Author(s):  
Yubin Xiao ◽  
Zheng Xiao ◽  
Xiang Feng ◽  
Zhiping Chen ◽  
Linai Kuang ◽  
...  

Abstract Background: Accumulating evidence has demonstrated that long non-coding RNAs (lncRNAs) are closely associated with human diseases, and knowing the relationships between lncRNAs and diseases is useful for the diagnosis and treatment of diseases. Because traditional bio-experiments are costly and time-consuming, more and more computational methods have been proposed in recent years to infer potential lncRNA-disease associations. However, these state-of-the-art prediction methods still have various limitations. Results: In this manuscript, a novel computational model named FVTLDA is proposed to infer potential lncRNA-disease associations. The major novelty of FVTLDA lies in the integration of direct and indirect features related to lncRNA-disease associations, such as the feature vectors of lncRNA-disease pairs and their corresponding association probability fractions, which ensures that FVTLDA can be used to predict diseases without known related lncRNAs and lncRNAs without known related diseases. Moreover, FVTLDA neither relies solely on known lncRNA-disease associations nor requires any negative samples, which ensures that it can infer potential lncRNA-disease associations more equitably and effectively than traditional state-of-the-art prediction methods. Additionally, to avoid the limitations of single-model prediction techniques, we combine FVTLDA with Multiple Linear Regression (MLR) and with an Artificial Neural Network (ANN) for data analysis. Simulation experiment results show that FVTLDA with MLR can achieve reliable AUCs of 0.8909, 0.8936 and 0.8970 in 5-Fold Cross Validation (5-fold CV), 10-Fold Cross Validation (10-fold CV) and Leave-One-Out Cross Validation (LOOCV), respectively, while FVTLDA with ANN can achieve reliable AUCs of 0.8766, 0.8830 and 0.8807 in 5-fold CV, 10-fold CV, and LOOCV, respectively.
Furthermore, in case studies of gastric cancer, leukemia and lung cancer, experiment results show that 8, 8 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with MLR, and 8, 7 and 8 of the top 10 candidate lncRNAs predicted by FVTLDA with ANN, have been verified by recent literature. Compared with the representative prediction model KATZLDA, FVTLDA with MLR and FVTLDA with ANN achieve average case study contrast scores of 0.8429 and 0.8515, respectively, both notably higher than the average case study contrast score of 0.6375 achieved by KATZLDA. Conclusion: The simulation results show that FVTLDA has good prediction performance and is a good supplement to future bioinformatics research.


2019 ◽  
Vol 20 (7) ◽  
pp. 1549 ◽  
Author(s):  
Yang Liu ◽  
Xiang Feng ◽  
Haochen Zhao ◽  
Zhanwei Xuan ◽  
Lei Wang

Accumulating studies have shown that long non-coding RNAs (lncRNAs) are involved in many biological processes and play important roles in a variety of complex human diseases. Developing effective computational models to identify potential relationships between lncRNAs and diseases can not only help us understand disease mechanisms at the lncRNA molecular level, but also promote the diagnosis, treatment, prognosis, and prevention of human diseases. In this paper, a network-based model called NBLDA is proposed to discover potential lncRNA–disease associations, in which two novel lncRNA–disease weighted networks are constructed. The networks were first built from known lncRNA–disease associations and the topological similarity of the lncRNA–disease association network; then an lncRNA–lncRNA weighted matrix and a disease–disease weighted matrix were obtained based on a resource allocation strategy of unequal allocation and unbiased consistence. Finally, a label propagation algorithm was applied to predict associated lncRNAs for the investigated diseases. To estimate the prediction performance of NBLDA, leave-one-out cross validation (LOOCV) was implemented, and simulation results showed that NBLDA can achieve reliable areas under the ROC curve (AUCs) of 0.8846, 0.8273, and 0.8075 on three known lncRNA–disease association datasets downloaded from the lncRNADisease database, respectively. Furthermore, in case studies of lung cancer, leukemia, and colorectal cancer, simulation results demonstrated that NBLDA can be a powerful tool for identifying potential lncRNA–disease associations.
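The label propagation step mentioned in this abstract can be illustrated with a generic sketch. The abstract does not give NBLDA's exact update rule, so the form used here, the row normalization, the parameter `alpha`, the function name `label_propagation`, and the toy network are all assumptions. Scores start from the known labels and are repeatedly mixed with the scores of weighted neighbors.

```python
def label_propagation(weights, labels, alpha=0.5, iterations=100):
    """Iterate f <- alpha * W_norm @ f + (1 - alpha) * y on a weighted graph."""
    n = len(weights)
    # Row-normalize the weights so each node averages over its neighbours.
    normalized = []
    for row in weights:
        total = sum(row)
        normalized.append([w / total if total else 0.0 for w in row])
    scores = list(labels)
    for _ in range(iterations):
        scores = [
            alpha * sum(normalized[i][j] * scores[j] for j in range(n))
            + (1 - alpha) * labels[i]
            for i in range(n)
        ]
    return scores


# Toy chain of three hypothetical lncRNAs: 0 - 1 - 2, with lncRNA 0 known
# to be associated with the disease under investigation.
weights = [[0, 1, 0], [1, 0, 1], [0, 1, 0]]
labels = [1.0, 0.0, 0.0]
scores = label_propagation(weights, labels)
print(scores)
```

In this toy chain the unlabeled node closest to the known association receives the higher score, which is how propagation turns network proximity into a candidate ranking.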


Complexity ◽  
2020 ◽  
Vol 2020 ◽  
pp. 1-16
Author(s):  
Yang Yujun ◽  
Yang Yimei ◽  
Xiao Jianhua

The stock market is a chaotic, complex, and dynamic financial market. The prediction of future stock prices is a topic of concern and controversy among researchers, who have proposed more and more analysis and prediction methods. In this paper, we propose a hybrid method for the prediction of future stock prices using LSTM and ensemble EMD. We use ensemble EMD to decompose the complex original stock price time series into several subsequences that are smoother, more regular and more stable than the original time series. Then, we use the LSTM method to train on and predict each subsequence. Finally, we obtain the prediction values of the original stock price time series by fusing the prediction values of the subsequences. In the experiment, we selected five data sets to fully test the performance of the method. Comparison with four other prediction methods shows that the proposed method yields more accurate predicted values. The hybrid prediction method we propose is effective and accurate in future stock price prediction. Hence, it has practical application and reference value.


2021 ◽  
Vol 18 (6) ◽  
pp. 7419-7439
Author(s):  
Huiqing Wang ◽  
Sen Zhao ◽  
Jing Zhao ◽  
Zhipeng Feng

The development of new drugs is a time-consuming and labor-intensive process. Therefore, researchers use computational methods to explore other therapeutic effects of existing drugs, and drug-disease association prediction is an important branch of this work. Existing drug-disease association prediction methods ignore the prior knowledge contained in the drug-disease association data, which provides a strong basis for research. Moreover, previous methods paid attention only to the high-level features in the network when extracting features, and directly fused or concatenated them, resulting in a loss of information. Therefore, we propose a novel deep learning model for drug-disease association prediction, called DCNN. The model introduces the Gaussian interaction profile kernel similarity for drugs and diseases, and combines it with the structural similarity of drugs and the semantic similarity of diseases to jointly construct the feature space. Then a dense convolutional neural network (DenseCNN) is used to capture the feature information of drugs and diseases, and a convolutional block attention module (CBAM) is introduced to weight features at the channel and spatial levels to achieve adaptive optimization of features. The ten-fold cross-validation results of DCNN and the experimental results of the case study show that it is superior to existing drug-disease association predictors and effectively predicts drug-disease associations.
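The Gaussian interaction profile kernel similarity named in this abstract can be sketched using its commonly used definition, K(i, j) = exp(-γ‖IP(i) − IP(j)‖²), with γ set to a bandwidth divided by the mean squared norm of the profiles. The abstract does not state DCNN's exact settings, so the bandwidth of 1, the function name `gip_kernel_similarity`, and the toy association matrix are assumptions.

```python
import math


def gip_kernel_similarity(profiles, bandwidth=1.0):
    """Gaussian interaction profile kernel between the rows of a 0/1 association matrix."""
    n = len(profiles)
    # Normalize the kernel width by the average squared norm of the profiles.
    mean_sq_norm = sum(x * x for row in profiles for x in row) / n
    if mean_sq_norm == 0:
        raise ValueError("all interaction profiles are empty")
    gamma = bandwidth / mean_sq_norm
    return [
        [
            math.exp(-gamma * sum((a - b) ** 2 for a, b in zip(row_i, row_j)))
            for row_j in profiles
        ]
        for row_i in profiles
    ]


# Toy drug-by-disease association matrix (rows are hypothetical drugs).
drug_disease = [[1, 0, 1], [1, 0, 1], [0, 1, 0]]
similarity = gip_kernel_similarity(drug_disease)
for row in similarity:
    print([round(value, 4) for value in row])
```

Applying the same function to the transposed matrix gives the corresponding similarity between diseases.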


2021 ◽  
Vol 11 (1) ◽  
Author(s):  
Asieh Amousoltani Arani ◽  
Mohammadreza Sehhati ◽  
Mohammad Amin Tabatabaiefar

Abstract Among the many kinds of genetic variation, missense variants are a major class, and a small subset of them may disrupt protein function and ultimately lead to human diseases. Various machine learning methods have been proposed to differentiate deleterious from benign missense variants by means of a large number of features, including structure, sequence, interaction networks, gene-disease associations, and phenotypes. However, a reliable and accurate algorithm for merging heterogeneous information is still highly needed, as it could capture all the information about the complex interactions in the networks in which genes participate. In this study we proposed a new method based on the non-negative matrix tri-factorization clustering method. We outlined two versions of the proposed method: two-source and three-source algorithms. The two-source algorithm aggregates individual deleteriousness prediction methods and the PPI network, and the three-source algorithm additionally incorporates gene-disease associations. Four benchmark datasets were employed for internal and external validation of both algorithms of our predictor. The results on all datasets confirmed that our method outperforms most state-of-the-art variant prediction tools. Two key features of our variant effect prediction method are worth mentioning. First, although incorporating gene-disease information in the three-source algorithm can improve prediction performance compared with the two-source algorithm, our method is not hindered by type 2 circularity error, unlike some recent ensemble-based prediction methods. Type 2 circularity error occurs when the predictor annotates variants on the basis of the genes on which they are located. Second, the performance of our predictor is superior to that of other ensemble-based methods for variants located on genes for which we do not have enough information about pathogenicity.


2020 ◽  
Vol 20 (6) ◽  
pp. 452-460
Author(s):  
Lin Tang ◽  
Yu Liang ◽  
Xin Jin ◽  
Lin Liu ◽  
Wei Zhou

Background: Accumulating experimental studies have demonstrated that long non-coding RNAs (LncRNAs) play crucial roles in the occurrence and development of various complex human diseases. Nonetheless, only a small portion of LncRNA–disease associations have been experimentally verified at present. Automatically predicting LncRNA–disease associations with computational models can save the huge cost of wet-lab experiments. Methods and Result: To develop effective computational models that integrate various heterogeneous biological data for the identification of potential disease-LncRNA associations, we propose a hierarchical extension based on the Boolean matrix for LncRNA-disease association prediction model (HEBLDA). HEBLDA discovers the intrinsic hierarchical correlation based on the property of the Boolean matrix from various relational sources. Then, HEBLDA integrates these hierarchical associated matrices by fusion weights. Finally, HEBLDA uses the hierarchical associated matrix to reconstruct the LncRNA–disease association matrix by hierarchical extending. HEBLDA is able to work for potential diseases or LncRNAs without known association data. In 5-fold cross-validation experiments, HEBLDA obtained an area under the receiver operating characteristic curve (AUC) of 0.8913, improving on previous classical methods. In addition, case studies show that HEBLDA can accurately predict candidate diseases for several LncRNAs. Conclusion: Based on its ability to discover the richer correlated structure of various data sources, we can anticipate that HEBLDA is a potential method for obtaining more comprehensive association predictions in a broad field.


2019 ◽  
Vol 20 (1) ◽  
Author(s):  
Zhou Huang ◽  
Leibo Liu ◽  
Yuanxu Gao ◽  
Jiangcheng Shi ◽  
Qinghua Cui ◽  
...  

Abstract Background: A series of miRNA-disease association prediction methods have been proposed to prioritize potential disease-associated miRNAs. Independent benchmarking of these methods is warranted to assess their effectiveness and robustness. Results: Based on more than 8000 novel miRNA-disease associations from the latest HMDD v3.1 database, we perform a systematic comparison among 36 readily available prediction methods. Their overall performance is evaluated with rigorous precision-recall curve analysis, in which 13 methods show acceptable accuracy (AUPRC > 0.200) while the top two methods achieve a promising AUPRC over 0.300, and most of these methods are also highly ranked when considering only the causal miRNA-disease associations as the positive samples. The potential for performance improvement is demonstrated by combining different predictors or adopting a more updated miRNA similarity matrix, which would result in AUPRC improvements of up to 16% and 46% compared to the best single predictor and the predictors using the previous similarity matrix, respectively. Our analysis suggests a common issue of the available methods: the prediction results are severely biased toward well-annotated diseases with many known associated miRNAs, and the methods cannot further stratify the positive samples by discriminating the causal miRNA-disease associations from the general miRNA-disease associations. Conclusion: Our benchmarking results not only provide a reference for biomedical researchers to choose appropriate miRNA-disease association predictors for their purpose, but also suggest future directions for the development of more robust miRNA-disease association predictors.
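The precision-recall evaluation described in this abstract can be illustrated with average precision, one common estimator of the area under the precision-recall curve. The benchmark's exact AUPRC computation is not given in the abstract, so this estimator, the function name `average_precision`, and the toy data are assumptions.

```python
def average_precision(scores, labels):
    """Mean of the precision values measured at the rank of each positive sample."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    total_positives = sum(label for _, label in ranked)
    if total_positives == 0:
        raise ValueError("need at least one positive sample")
    hits = 0
    precision_sum = 0.0
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precision_sum += hits / rank  # precision at this positive's rank
    return precision_sum / total_positives


# Toy prediction scores and ground-truth labels (hypothetical values).
toy_scores = [0.9, 0.8, 0.3, 0.1]
toy_labels = [1, 0, 1, 0]
toy_ap = average_precision(toy_scores, toy_labels)
print(round(toy_ap, 4))
```

Tied scores are ordered arbitrarily by the sort in this sketch, so a careful evaluation would group ties before computing precision.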

