UMPred-FRL: A New Approach for Accurate Prediction of Umami Peptides Using Feature Representation Learning

2021 ◽  
Vol 22 (23) ◽  
pp. 13124
Author(s):  
Phasit Charoenkwan ◽  
Chanin Nantasenamat ◽  
Md Mehedi Hasan ◽  
Mohammad Ali Moni ◽  
Balachandran Manavalan ◽  
...  

Umami ingredients have been identified as important factors in food seasoning and production. Traditional experimental methods for characterizing peptides exhibiting umami sensory properties (umami peptides) are time-consuming, laborious, and costly. As a result, it is preferable to develop computational tools for the large-scale identification of available sequences in order to identify novel peptides with umami sensory properties. Although a computational tool has been developed for this purpose, its predictive performance is still insufficient. In this study, we use a feature representation learning approach to create a novel machine-learning meta-predictor called UMPred-FRL for improved umami peptide identification. We combined six well-known machine learning algorithms (extremely randomized trees, k-nearest neighbor, logistic regression, partial least squares, random forest, and support vector machine) with seven different feature encodings (amino acid composition, amphiphilic pseudo-amino acid composition, dipeptide composition, composition-transition-distribution, and pseudo-amino acid composition) to develop the final meta-predictor. Extensive experimental results demonstrated that UMPred-FRL was effective and achieved more accurate performance on the benchmark dataset compared to its baseline models, and consistently outperformed the existing method on the independent test dataset. Finally, to aid in the high-throughput identification of umami peptides, the UMPred-FRL web server was established and made freely available online. It is expected that UMPred-FRL will be a powerful tool for the cost-effective large-scale screening of candidate peptides with potential umami sensory properties.
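Two of the feature encodings named in the abstract, amino acid composition and dipeptide composition, are simple enough to sketch directly. The following is a minimal illustration in plain Python, not the UMPred-FRL implementation: the function names, the residue ordering in `AMINO_ACIDS`, and the example peptide are assumptions made for the example.

```python
from itertools import product

# The 20 standard residues; this ordering is an arbitrary choice for the sketch.
AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"


def amino_acid_composition(seq: str) -> list[float]:
    """Return the fraction of each standard residue in seq (20 values)."""
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must contain at least one residue")
    return [seq.count(aa) / len(seq) for aa in AMINO_ACIDS]


def dipeptide_composition(seq: str) -> list[float]:
    """Return the fraction of each ordered residue pair in seq (400 values)."""
    seq = seq.upper()
    if len(seq) < 2:
        raise ValueError("sequence must contain at least two residues")
    # Overlapping adjacent pairs: "EEL" gives ["EE", "EL"].
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    return [
        pairs.count(a + b) / len(pairs)
        for a, b in product(AMINO_ACIDS, repeat=2)
    ]


# Hypothetical tripeptide used only to show the vector shapes.
aac = amino_acid_composition("EEL")
dpc = dipeptide_composition("EEL")
print(len(aac), len(dpc))
```

Each encoding yields a fixed-length vector regardless of peptide length, which is what allows peptides of different sizes to be passed to the same classifier.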

2016 ◽  
Vol 2016 ◽  
pp. 1-5 ◽  
Author(s):  
Yun Wu ◽  
Yufei Zheng ◽  
Hua Tang

Conotoxins are neurotoxins that interact specifically with potassium, sodium, and calcium channels. They have become potential drug candidates for treating conditions such as chronic pain, epilepsy, and cardiovascular disease. Correctly identifying the types of ion channel-targeted conotoxins will therefore provide important clues for understanding their function and finding potential drugs. Based on this consideration, we developed a new computational method to rapidly and accurately predict the types of ion channel-targeted conotoxins. Three new kinds of residue properties were proposed for use in pseudo amino acid composition to formulate conotoxin samples. A support vector machine was used as the classifier. A feature selection technique based on the F-score was used to optimize the features. Jackknife cross-validation showed an overall accuracy of 94.6%, which is higher than previously published results, indicating that the proposed method outperforms published methods. Hence the current method may play a complementary role to other existing methods for recognizing the types of ion channel-targeted conotoxins.
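The abstract does not state which F-score variant was used. As a rough sketch, the widely used two-class form is shown below: for each feature, the squared distances of the class means from the overall mean are divided by the sum of the within-class sample variances. The function names and the toy data are hypothetical, and the three-class conotoxin problem would need a multi-class generalization that is not shown here.

```python
from statistics import mean, variance


def f_score(pos: list[float], neg: list[float]) -> float:
    """Two-class F-score of one feature (assumed form, see lead-in).

    pos and neg hold the feature's values in the positive and negative
    classes; each needs at least two values for the sample variance.
    """
    overall = mean(pos + neg)
    between = (mean(pos) - overall) ** 2 + (mean(neg) - overall) ** 2
    within = variance(pos) + variance(neg)
    if within == 0:
        # Constant within both classes: perfectly separating or useless.
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(rows: list[list[float]], labels: list[int]) -> list[int]:
    """Return feature indices sorted from highest to lowest F-score."""
    scores = []
    for j in range(len(rows[0])):
        pos = [r[j] for r, y in zip(rows, labels) if y == 1]
        neg = [r[j] for r, y in zip(rows, labels) if y == 0]
        scores.append(f_score(pos, neg))
    return sorted(range(len(scores)), key=lambda j: scores[j], reverse=True)


# Toy data: feature 0 separates the classes, feature 1 does not.
rows = [[1.0, 5.0], [2.0, 4.0], [3.0, 6.0],
        [4.0, 5.0], [5.0, 4.0], [6.0, 6.0]]
labels = [1, 1, 1, 0, 0, 0]
print(rank_features(rows, labels))
```

A higher score means the feature's class means are far apart relative to its spread within each class, so features are typically added to the classifier in ranked order until accuracy stops improving.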


2014 ◽  
Author(s):  
Sukanta Mondal ◽  
Priyadarshini P. Pai

Antifreeze proteins (AFPs) in living organisms play a key role in tolerance to extremely cold temperatures and have a wide range of biotechnological applications. Because of their diversity, however, identifying them has been challenging for biologists. Earlier work in this area did not incorporate sequence order information, which is known to represent important properties of various proteins and protein systems for predicting their attributes. In this study, the effect of Chou's pseudo amino acid composition, which captures the sequence order of proteins, was systematically explored using support vector machines for AFP prediction. Our findings suggest that introducing sequence order information helps identify AFPs with an accuracy of 84.75% on an independent test dataset, outperforming approaches such as AFP-Pred and iAFP. The relative performance calculated using Youden's Index (Sensitivity + Specificity − 1) was 0.71 for our predictor (AFP-PseAAC), 0.48 for AFP-Pred, and 0.05 for iAFP. We hope this novel prediction approach will aid AFP-based research for biotechnological applications.
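Youden's Index, as defined in the abstract, can be computed directly from confusion-matrix counts. The sketch below uses made-up counts purely to show the arithmetic; they are not the study's data, and the function name is an assumption.

```python
def youden_index(tp: int, fn: int, tn: int, fp: int) -> float:
    """Return sensitivity + specificity - 1 from confusion-matrix counts."""
    sensitivity = tp / (tp + fn)  # fraction of true positives recovered
    specificity = tn / (tn + fp)  # fraction of true negatives recovered
    return sensitivity + specificity - 1


# Hypothetical counts: 90 of 100 positives and 80 of 100 negatives correct.
j = youden_index(tp=90, fn=10, tn=80, fp=20)
print(round(j, 2))
```

The index is 0 for a predictor that performs no better than chance and 1 for a perfect one, which is why a value of 0.05 indicates almost no discriminative power on the test set.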



