UMPred-FRL: A New Approach for Accurate Prediction of Umami Peptides Using Feature Representation Learning

Phasit Charoenkwan; Chanin Nantasenamat; Md Mehedi Hasan; Mohammad Ali Moni; Balachandran Manavalan; Watshara Shoombuatong

doi:10.3390/ijms222313124

International Journal of Molecular Sciences ◽

10.3390/ijms222313124 ◽

2021 ◽

Vol 22 (23) ◽

pp. 13124

Author(s):

Phasit Charoenkwan ◽

Chanin Nantasenamat ◽

Md Mehedi Hasan ◽

Mohammad Ali Moni ◽

Balachandran Manavalan ◽

...

Keyword(s):

Machine Learning ◽

Amino Acid ◽

Amino Acid Composition ◽

Acid Composition ◽

Large Scale ◽

Representation Learning ◽

Sensory Properties ◽

Feature Representation ◽

Support Vector ◽

Pseudo Amino Acid Composition

Umami ingredients have been identified as important factors in food seasoning and production. Traditional experimental methods for characterizing peptides exhibiting umami sensory properties (umami peptides) are time-consuming, laborious, and costly. As a result, it is preferable to develop computational tools for the large-scale identification of available sequences in order to identify novel peptides with umami sensory properties. Although a computational tool has been developed for this purpose, its predictive performance is still insufficient. In this study, we use a feature representation learning approach to create a novel machine-learning meta-predictor called UMPred-FRL for improved umami peptide identification. We combined six well-known machine learning algorithms (extremely randomized trees, k-nearest neighbor, logistic regression, partial least squares, random forest, and support vector machine) with seven different feature encodings (amino acid composition, amphiphilic pseudo-amino acid composition, dipeptide composition, composition-transition-distribution, and pseudo-amino acid composition) to develop the final meta-predictor. Extensive experimental results demonstrated that UMPred-FRL was effective and achieved more accurate performance on the benchmark dataset compared to its baseline models, and consistently outperformed the existing method on the independent test dataset. Finally, to aid in the high-throughput identification of umami peptides, the UMPred-FRL web server was established and made freely available online. It is expected that UMPred-FRL will be a powerful tool for the cost-effective large-scale screening of candidate peptides with potential umami sensory properties.
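Two of the encodings named in the abstract, amino acid composition and dipeptide composition, are simple enough to sketch directly. The following is a minimal illustration in plain Python, not the UMPred-FRL implementation: the function names and the toy peptide string are hypothetical, and the feature representation learning step (training baseline models on each encoding and passing their outputs to a meta-predictor) is not shown.

```python
from itertools import product

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues, one-letter codes


def amino_acid_composition(seq):
    """Return 20 residue frequencies for seq, ordered as in AMINO_ACIDS.

    Non-standard letters are counted in the length but get no feature,
    so the frequencies sum to 1 only for purely standard sequences.
    """
    seq = seq.upper()
    if not seq:
        raise ValueError("sequence must not be empty")
    return [seq.count(a) / len(seq) for a in AMINO_ACIDS]


def dipeptide_composition(seq):
    """Return 400 frequencies of adjacent residue pairs (AA, AC, ..., YY)."""
    seq = seq.upper()
    pairs = [seq[i:i + 2] for i in range(len(seq) - 1)]
    if not pairs:
        raise ValueError("sequence must have at least two residues")
    counts = {}
    for pair in pairs:
        counts[pair] = counts.get(pair, 0) + 1
    return [counts.get(a + b, 0) / len(pairs)
            for a, b in product(AMINO_ACIDS, repeat=2)]


def encode(seq):
    """Concatenate both encodings into one 420-dimensional feature vector."""
    return amino_acid_composition(seq) + dipeptide_composition(seq)


if __name__ == "__main__":
    features = encode("AEDA")  # arbitrary toy string, not a known umami peptide
    print(len(features))
```

Each encoding yields a fixed-length vector regardless of peptide length, which is what allows standard classifiers such as those listed in the abstract to be trained on peptides of varying size.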

Using Chou's amphiphilic pseudo-amino acid composition and support vector machine for prediction of enzyme subfamily classes

Journal of Theoretical Biology ◽

10.1016/j.jtbi.2007.06.001 ◽

2007 ◽

Vol 248 (3) ◽

pp. 546-551 ◽

Cited By ~ 220

Author(s):

Xi-Bin Zhou ◽

Chao Chen ◽

Zhan-Chao Li ◽

Xiao-Yong Zou

Keyword(s):

Support Vector Machine ◽

Amino Acid ◽

Amino Acid Composition ◽

Acid Composition ◽

Support Vector ◽

Pseudo Amino Acid Composition

Prediction of β-lactamase and its class by Chou’s pseudo-amino acid composition and support vector machine

Journal of Theoretical Biology ◽

10.1016/j.jtbi.2014.10.008 ◽

2015 ◽

Vol 365 ◽

pp. 96-103 ◽

Cited By ~ 98

Author(s):

Ravindra Kumar ◽

Abhishikha Srivastava ◽

Bandana Kumari ◽

Manish Kumar

Keyword(s):

Support Vector Machine ◽

Amino Acid ◽

Amino Acid Composition ◽

Acid Composition ◽

Support Vector ◽

Pseudo Amino Acid Composition

Identifying the Types of Ion Channel-Targeted Conotoxins by Incorporating New Properties of Residues into Pseudo Amino Acid Composition

BioMed Research International ◽

10.1155/2016/3981478 ◽

2016 ◽

Vol 2016 ◽

pp. 1-5 ◽

Cited By ~ 8

Author(s):

Yun Wu ◽

Yufei Zheng ◽

Hua Tang

Keyword(s):

Amino Acid ◽

Ion Channel ◽

Amino Acid Composition ◽

Acid Composition ◽

Current Method ◽

Computational Method ◽

Support Vector ◽

Pseudo Amino Acid Composition ◽

Feature Selection Technique ◽

Drug Candidates

Conotoxins are a kind of neurotoxin that can specifically interact with potassium, sodium, and calcium channels. They have become potential drug candidates for treating conditions such as chronic pain, epilepsy, and cardiovascular diseases. Correctly identifying the types of ion channel-targeted conotoxins will therefore provide important clues for understanding their function and finding potential drugs. Based on this consideration, we developed a new computational method to rapidly and accurately predict the types of ion channel-targeted conotoxins. Three new kinds of residue properties were incorporated into the pseudo amino acid composition to formulate conotoxin samples. A support vector machine was used as the classifier, and a feature selection technique based on the F-score was used to optimize the features. Jackknife cross-validation showed that an overall accuracy of 94.6% was achieved, which is higher than previously published results. Hence the current method may play a complementary role to other existing methods for recognizing the types of ion channel-targeted conotoxins.
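The abstract above mentions F-score based feature selection without giving the formula. As a hedged illustration, the sketch below uses the common two-class F-score (squared distance of each class mean from the overall mean, divided by the sum of the within-class sample variances). The conotoxin study is a multi-class problem and may use a different variant, so this form, the function names, and the toy data are assumptions.

```python
from statistics import mean, variance


def f_score(pos, neg):
    """Two-class F-score of one feature.

    pos, neg: values of the feature in the positive and negative class
    (at least two values each, since sample variance is used).
    """
    m_pos, m_neg = mean(pos), mean(neg)
    m_all = mean(list(pos) + list(neg))
    between = (m_pos - m_all) ** 2 + (m_neg - m_all) ** 2
    within = variance(pos) + variance(neg)  # sample variances (n - 1)
    if within == 0:
        # Constant within both classes: perfectly separating or useless.
        return float("inf") if between > 0 else 0.0
    return between / within


def rank_features(x_pos, x_neg):
    """Score every column and return (indices sorted best-first, scores)."""
    n_features = len(x_pos[0])
    scores = [
        f_score([row[j] for row in x_pos], [row[j] for row in x_neg])
        for j in range(n_features)
    ]
    order = sorted(range(n_features), key=lambda j: scores[j], reverse=True)
    return order, scores


if __name__ == "__main__":
    # Toy data: feature 0 separates the classes, feature 1 does not.
    x_pos = [[1.0, 5.0], [2.0, 5.5], [3.0, 4.5]]
    x_neg = [[7.0, 5.0], [8.0, 5.5], [9.0, 4.5]]
    print(rank_features(x_pos, x_neg))
```

A larger score means the feature discriminates better between the classes. In a typical workflow the top-ranked features are added one at a time and the subset with the best cross-validated accuracy is kept.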

An Ensemble Classifier of Support Vector Machines Used to Predict Protein Structural Classes by Fusing Auto Covariance and Pseudo-Amino Acid Composition

The Protein Journal ◽

10.1007/s10930-009-9222-z ◽

2010 ◽

Vol 29 (1) ◽

pp. 62-67 ◽

Cited By ~ 15

Author(s):

Jiang Wu ◽

Meng-Long Li ◽

Le-Zheng Yu ◽

Chao Wang

Keyword(s):

Amino Acid ◽

Support Vector Machines ◽

Amino Acid Composition ◽

Acid Composition ◽

Ensemble Classifier ◽

Support Vector ◽

Pseudo Amino Acid Composition ◽

Vector Machines ◽

Auto Covariance ◽

Structural Classes

Weighted-support vector machines for predicting membrane protein types based on pseudo-amino acid composition

Protein Engineering Design and Selection ◽

10.1093/protein/gzh061 ◽

2004 ◽

Vol 17 (6) ◽

pp. 509-516 ◽

Cited By ~ 138

Author(s):

M. Wang ◽

J. Yang ◽

G.-P. Liu ◽

Z.-J. Xu ◽

K.-C. Chou

Keyword(s):

Amino Acid ◽

Support Vector Machines ◽

Membrane Protein ◽

Amino Acid Composition ◽

Acid Composition ◽

Support Vector ◽

Pseudo Amino Acid Composition ◽

Vector Machines

Prediction of metalloproteinase family based on the concept of Chou’s pseudo amino acid composition using a machine learning approach

Journal of Structural and Functional Genomics ◽

10.1007/s10969-011-9120-4 ◽

2011 ◽

Vol 12 (4) ◽

pp. 191-197 ◽

Cited By ~ 75

Author(s):

Majid Mohammad Beigi ◽

Mohaddeseh Behjati ◽

Hassan Mohabatkar

Keyword(s):

Machine Learning ◽

Amino Acid ◽

Amino Acid Composition ◽

Acid Composition ◽

Learning Approach ◽

Pseudo Amino Acid Composition ◽

Machine Learning Approach ◽

Family Based

Pseudo amino acid composition improves antifreeze protein prediction

10.7287/peerj.preprints.224v1 ◽

2014 ◽

Author(s):

Sukanta Mondal ◽

Priyadarshini P. Pai

Keyword(s):

Amino Acid ◽

Amino Acid Composition ◽

Acid Composition ◽

Support Vector ◽

Order Information ◽

Biotechnological Applications ◽

Pseudo Amino Acid Composition ◽

Cold Temperatures ◽

Wide Range ◽

Prediction Approach

Antifreeze proteins (AFPs) play a key role in the tolerance of living organisms to extremely cold temperatures and have a wide range of biotechnological applications. Because of their diversity, however, identifying them has been challenging for biologists. Earlier work in this area did not incorporate sequence order information, which is known to represent important properties of various proteins and protein systems for the prediction of their attributes. In this study, the effect of Chou's pseudo amino acid composition, which captures the sequence order of proteins, was systematically explored using support vector machines for AFP prediction. Our findings suggest that introducing sequence order information helps identify AFPs with an accuracy of 84.75% on an independent test dataset, outperforming approaches such as AFP-Pred and iAFP. The relative performance calculated using Youden's Index (Sensitivity + Specificity - 1) was 0.71 for our predictor (AFP-PseAAC), 0.48 for AFP-Pred, and 0.05 for iAFP. We hope this novel prediction approach will aid AFP-based research for biotechnological applications.
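Youden's Index, as defined in the abstract above (sensitivity + specificity - 1), can be computed from confusion-matrix counts. The following is a minimal sketch. The function name and the example counts are made up for illustration and are not taken from the study.

```python
def youden_index(tp, fn, tn, fp):
    """Youden's Index J = sensitivity + specificity - 1.

    tp, fn: true positives and false negatives (actual positives)
    tn, fp: true negatives and false positives (actual negatives)
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    sensitivity = tp / (tp + fn)  # fraction of positives correctly found
    specificity = tn / (tn + fp)  # fraction of negatives correctly rejected
    return sensitivity + specificity - 1


if __name__ == "__main__":
    # Hypothetical counts: 80 of 100 positives and 90 of 100 negatives correct.
    print(round(youden_index(tp=80, fn=20, tn=90, fp=10), 2))
```

J ranges from -1 to 1. A value near 0 means the predictor does no better than chance, which puts the reported gap between 0.71 and 0.05 in context.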

Pseudo amino acid composition improves antifreeze protein prediction

10.7287/peerj.preprints.224v2 ◽

2014 ◽

Author(s):

Sukanta Mondal ◽

Priyadarshini P. Pai

Keyword(s):

Amino Acid ◽

Amino Acid Composition ◽

Acid Composition ◽

Support Vector ◽

Order Information ◽

Biotechnological Applications ◽

Pseudo Amino Acid Composition ◽

Cold Temperatures ◽

Wide Range ◽

Prediction Approach

Prediction of Protein Secondary Structure Content by Using the Concept of Chou’s Pseudo Amino Acid Composition and Support Vector Machine

Frontiers in Protein and Peptide Sciences ◽

10.2174/9781608058624114010010 ◽

2014 ◽

pp. 161-173

Keyword(s):

Support Vector Machine ◽

Amino Acid ◽

Secondary Structure ◽

Amino Acid Composition ◽

Acid Composition ◽

Protein Secondary Structure ◽

Support Vector ◽

Pseudo Amino Acid Composition ◽

Secondary Structure Content

Pseudo amino acid composition and multi-class support vector machines approach for conotoxin superfamily classification

Journal of Theoretical Biology ◽

10.1016/j.jtbi.2006.06.014 ◽

2006 ◽

Vol 243 (2) ◽

pp. 252-260 ◽

Cited By ~ 91

Author(s):

Sukanta Mondal ◽

Rajasekaran Bhavna ◽

Rajasekaran Mohan Babu ◽

Suryanarayanarao Ramakumar

Keyword(s):

Amino Acid ◽

Support Vector Machines ◽

Amino Acid Composition ◽

Acid Composition ◽

Support Vector ◽

Pseudo Amino Acid Composition ◽

Vector Machines
