ECNet is an evolutionary context-integrated deep learning framework for protein engineering

Yunan Luo; Guangde Jiang; Tianhao Yu; Yang Liu; Lam Vo; Hantian Ding; Yufeng Su; Wesley Wei Qian; Huimin Zhao; Jian Peng

doi:10.1038/s41467-021-25976-8

Nature Communications ◽

10.1038/s41467-021-25976-8 ◽

2021 ◽

Vol 12 (1) ◽

Author(s):

Yunan Luo ◽

Guangde Jiang ◽

Tianhao Yu ◽

Yang Liu ◽

Lam Vo ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Protein Engineering ◽

Learning Algorithm ◽

Learning Algorithms ◽

Structural Features ◽

Machine Learning Algorithms ◽

Success Rates ◽

General Sequence ◽

Evolutionary Context

Abstract: Machine learning has been increasingly used for protein engineering. However, because the general sequence contexts that existing machine learning algorithms capture are not specific to the protein being engineered, their accuracy is rather limited. Here, we report ECNet (evolutionary context-integrated neural network), a deep-learning algorithm that exploits evolutionary contexts to predict functional fitness for protein engineering. This algorithm integrates the local evolutionary context from homologous sequences, which explicitly models residue-residue epistasis for the protein of interest, with the global evolutionary context, which encodes rich semantic and structural features from the enormous protein sequence universe. As such, it enables accurate mapping from sequence to function and provides generalization from low-order mutants to higher-order mutants. We show, using ~50 deep mutational scanning and random mutagenesis datasets, that ECNet predicts the sequence-function relationship more accurately than existing machine learning algorithms. Moreover, we used ECNet to guide the engineering of TEM-1 β-lactamase and identified variants with improved ampicillin resistance with high success rates.

Stock price prediction using DEEP learning algorithm and its comparison with machine learning algorithms

Intelligent Systems in Accounting Finance & Management ◽

10.1002/isaf.1459 ◽

2019 ◽

Vol 26 (4) ◽

pp. 164-174 ◽

Cited By ~ 3

Author(s):

Mahla Nikou ◽

Gholamreza Mansourfar ◽

Jamshid Bagherzadeh

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Stock Price ◽

Learning Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Stock Price Prediction ◽

Price Prediction ◽

Deep Learning Algorithm

Wheat Lodging Detection from UAS Imagery Using Machine Learning Algorithms

Remote Sensing ◽

10.3390/rs12111838 ◽

2020 ◽

Vol 12 (11) ◽

pp. 1838 ◽

Cited By ~ 8

Author(s):

Zhao Zhang ◽

Paulo Flores ◽

C. Igathinathane ◽

Dayakar L. Naik ◽

Ravi Kiran ◽

...

Keyword(s):

Neural Network ◽

Machine Learning ◽

Deep Learning ◽

Standard Deviation ◽

Learning Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Superior Performance ◽

Support Vector ◽

Unmanned Aerial Systems

The current mainstream approach of using manual measurements and visual inspections for crop lodging detection is inefficient, time-consuming, and subjective. An innovative method for wheat lodging detection that can overcome or alleviate these shortcomings would be welcomed. This study proposed a systematic approach for wheat lodging detection in research plots (372 experimental plots), which consisted of using unmanned aerial systems (UAS) for aerial imagery acquisition, manual field evaluation, and machine learning algorithms to detect whether lodging had occurred. UAS imagery was collected on three different dates (23 and 30 July 2019, and 8 August 2019) after lodging occurred. Traditional machine learning and deep learning were evaluated and compared in terms of classification accuracy and standard deviation. For traditional machine learning, five types of features (i.e., gray level co-occurrence matrix, local binary pattern, Gabor, intensity, and Hu-moment) were extracted and fed into three traditional machine learning algorithms (i.e., random forest (RF), neural network, and support vector machine) for detecting lodged plots. For the datasets on each imagery collection date, the accuracies of the three algorithms were not significantly different from each other. For each of the three algorithms, accuracies were lowest on the first-date dataset and highest on the last-date dataset. Incorporating standard deviation as a measurement of performance robustness, RF was determined to be the most satisfactory. Regarding deep learning, three different convolutional neural networks (simple convolutional neural network, VGG-16, and GoogLeNet) were tested. For each of the single-date datasets, GoogLeNet consistently had superior performance over the other two methods. Further comparisons between RF and GoogLeNet demonstrated that the detection accuracies of the two methods were not significantly different from each other (p > 0.05); hence, the choice between the two would not affect the final detection accuracies. However, considering that the average accuracy of GoogLeNet (93%) was higher than that of RF (91%), it was recommended to use GoogLeNet for wheat lodging detection. This research demonstrated that UAS RGB imagery, coupled with the GoogLeNet machine learning algorithm, can be a novel, reliable, objective, simple, low-cost, and effective (accuracy > 90%) tool for wheat lodging detection.
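
The abstract above lists the gray level co-occurrence matrix (GLCM) among its hand-crafted texture features. As a rough illustration only, the sketch below builds a GLCM for a single horizontal offset and derives the standard contrast feature from it. The function names, the toy image, and the choice of offset are assumptions made for this example; the study's actual feature extraction settings are not given in the abstract.

```python
def glcm_horizontal(image, levels):
    """Count how often gray level i sits immediately to the left of gray level j."""
    matrix = [[0] * levels for _ in range(levels)]
    for row in image:
        # Pair every pixel with its right-hand neighbour in the same row.
        for left, right in zip(row, row[1:]):
            matrix[left][right] += 1
    return matrix


def glcm_contrast(matrix):
    """Contrast: sum of p(i, j) * (i - j)^2 over the normalized co-occurrence matrix."""
    total = sum(sum(row) for row in matrix)
    if total == 0:
        return 0.0
    return sum(
        count / total * (i - j) ** 2
        for i, row in enumerate(matrix)
        for j, count in enumerate(row)
    )


# Toy 2x3 image with 3 gray levels (values invented for illustration).
toy_image = [[0, 0, 1], [1, 2, 2]]
toy_glcm = glcm_horizontal(toy_image, levels=3)
toy_contrast = glcm_contrast(toy_glcm)
```

In practice a feature vector would usually combine several offsets and several GLCM statistics before being passed to a classifier such as a random forest.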

AERIAL POINT CLOUD CLASSIFICATION WITH DEEP LEARNING AND MACHINE LEARNING ALGORITHMS

ISPRS - International Archives of the Photogrammetry, Remote Sensing and Spatial Information Sciences ◽

10.5194/isprs-archives-xlii-4-w18-843-2019 ◽

2019 ◽

Vol XLII-4/W18 ◽

pp. 843-849

Author(s):

E. Özdemir ◽

F. Remondino ◽

A. Golkar

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Learning Algorithm ◽

Learning Algorithms ◽

Point Clouds ◽

Machine Learning Algorithms ◽

Geometric Features ◽

Semantic Classes ◽

3D Point Clouds ◽

City Models

Abstract. With recent advances in technology, 3D point clouds are requested and used more and more frequently, not only for visualization needs but also, for example, by public administrations for urban planning and management. 3D point clouds are also a very frequent source for generating 3D city models, which have recently become more available for many applications, such as urban development plans, energy evaluation, navigation, visibility analysis and numerous other GIS studies. While the main data sources have remained the same (namely aerial photogrammetry and LiDAR), the way these city models are generated has been evolving towards automation with different approaches. As most of these approaches are based on point clouds with proper semantic classes, our aim is to classify aerial point clouds into meaningful semantic classes, e.g. ground level objects (GLO, including roads and pavements), vegetation, buildings’ facades and buildings’ roofs. In this study we tested and evaluated various machine learning algorithms for classification, including three deep learning algorithms and one machine learning algorithm. In the experiments, several hand-crafted geometric features, chosen depending on the dataset, are used and, unconventionally, these geometric features are also used for deep learning.

A Deep Learning Algorithm to Predict Hazardous Drinkers and the Severity of Alcohol-Related Problems Using K-NHANES

Frontiers in Psychiatry ◽

10.3389/fpsyt.2021.684406 ◽

2021 ◽

Vol 12 ◽

Author(s):

Suk-Young Kim ◽

Taesung Park ◽

Kwonyoung Kim ◽

Jihoon Oh ◽

Yoonjae Park ◽

...

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Large Scale ◽

Learning Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Deep Learning Algorithm ◽

Conventional Machine ◽

Large Scale Survey ◽

Alcohol Related Problems

Purpose: The number of patients with alcohol-related problems is steadily increasing. A large-scale survey of alcohol-related problems has been conducted. However, studies that predict hazardous drinkers and identify which factors contribute to the prediction are limited. Thus, the purpose of this study was to predict hazardous drinkers and the severity of alcohol-related problems of patients using a deep learning algorithm based on large-scale survey data. Materials and Methods: Datasets of the National Health and Nutrition Examination Survey of South Korea (K-NHANES), a nationally representative survey for the entire South Korean population, were used to train deep learning and conventional machine learning algorithms. Datasets from 69,187 and 45,672 participants were used to predict hazardous drinkers and the severity of alcohol-related problems, respectively. Based on the degree of contribution of each variable to deep learning, it was possible to determine which variable contributed significantly to the prediction of hazardous drinkers. Results: Deep learning showed higher performance than conventional machine learning algorithms. It predicted hazardous drinkers with an AUC (area under the receiver operating characteristic curve) of 0.870 (logistic regression: 0.858, linear SVM: 0.849, random forest classifier: 0.810, K-nearest neighbors: 0.740). Among 325 variables for predicting hazardous drinkers, energy intake was the factor showing the greatest contribution to the prediction, followed by carbohydrate intake. Participants were classified into Zone I, Zone II, Zone III, and Zone IV based on the degree of alcohol-related problems, showing AUCs of 0.881, 0.774, 0.853, and 0.879, respectively. Conclusion: Hazardous drinking groups could be effectively predicted and individuals could be classified according to the degree of alcohol-related problems using a deep learning algorithm. This algorithm could be used to screen people who need treatment for alcohol-related problems among the general population or hospital visitors.
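
The results above are reported as AUC values. For readers unfamiliar with the metric, here is a minimal sketch of how AUC can be computed directly from its pairwise definition: the fraction of (positive, negative) pairs in which the positive example receives the higher score, with ties counted as half. The function name and the toy labels and scores are assumptions for illustration and are not taken from the study.

```python
def roc_auc(labels, scores):
    """Area under the ROC curve via pairwise comparison of positives and negatives."""
    positives = [s for y, s in zip(labels, scores) if y == 1]
    negatives = [s for y, s in zip(labels, scores) if y == 0]
    if not positives or not negatives:
        raise ValueError("AUC needs at least one positive and one negative example")
    wins = 0.0
    for p in positives:
        for n in negatives:
            if p > n:
                wins += 1.0   # positive ranked above negative
            elif p == n:
                wins += 0.5   # ties count as half a win
    return wins / (len(positives) * len(negatives))


# Toy data (invented): 1 = hazardous drinker, 0 = not; scores are model outputs.
toy_labels = [0, 0, 1, 1]
toy_scores = [0.1, 0.4, 0.35, 0.8]
toy_auc = roc_auc(toy_labels, toy_scores)
```

This quadratic-time version is meant to make the definition visible; for datasets of the size described above, a sort-based implementation from an established library would be the more practical choice.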

Evaluating Impact of Race in Facial Recognition across Machine Learning and Deep Learning Algorithms

Computers ◽

10.3390/computers10090113 ◽

2021 ◽

Vol 10 (9) ◽

pp. 113

Author(s):

James Coe ◽

Mustafa Atay

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Racial Bias ◽

Learning Algorithm ◽

Facial Recognition ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Design Development ◽

Deep Learning Algorithm ◽

The Impact

The research aims to evaluate the impact of race in facial recognition across two types of algorithms. We give a general insight into facial recognition and discuss four problems related to it. We review our system design, development, and architectures, and give an in-depth evaluation plan for each type of algorithm and dataset, along with a look into the software and its architecture. We thoroughly explain the results and findings of our experimentation and provide analysis for the machine learning algorithms and deep learning algorithms. To conclude the investigation, we compare the two kinds of algorithms on accuracy, metrics, miss rates, and performance to observe which algorithms mitigate racial bias the most. We evaluate racial bias across five machine learning algorithms and three deep learning algorithms using racially imbalanced and balanced datasets. We evaluate and compare the accuracy and miss rates of all tested algorithms and report that SVC is the superior machine learning algorithm and VGG16 is the best deep learning algorithm based on our experimental study. Our findings indicate that the algorithm that mitigates the bias the most is VGG16, and all our deep learning algorithms outperformed their machine learning counterparts.

Intelligent system of English composition scoring model based on improved machine learning algorithm

Journal of Intelligent & Fuzzy Systems ◽

10.3233/jifs-189235 ◽

2020 ◽

pp. 1-11

Author(s):

Jie Liu ◽

Lin Lin ◽

Xiufang Liang

Keyword(s):

Machine Learning ◽

Evaluation System ◽

Intelligent System ◽

Learning Algorithm ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Assessment System ◽

English Composition ◽

Region Extraction ◽

Constraint Model

The online English teaching system has certain requirements for the intelligent scoring system, and the most difficult stage of intelligent scoring in the English test is to score the English composition through the intelligent model. In order to improve the intelligence of English composition scoring, based on machine learning algorithms, this study combines intelligent image recognition technology to improve machine learning algorithms, and proposes an improved MSER-based character candidate region extraction algorithm and a convolutional neural network-based pseudo-character region filtering algorithm. In addition, in order to verify whether the algorithm model proposed in this paper meets the requirements of the group text, that is, to verify the feasibility of the algorithm, the performance of the model proposed in this study is analyzed through design experiments. Moreover, the basic conditions for composition scoring are input into the model as a constraint model. The research results show that the algorithm proposed in this paper has a certain practical effect, and it can be applied to the English assessment system and the online assessment system of the homework evaluation system algorithm system.

Semantic segmentation of PolSAR image data using advanced deep learning model

Scientific Reports ◽

10.1038/s41598-021-94422-y ◽

2021 ◽

Vol 11 (1) ◽

Author(s):

Rajat Garg ◽

Anil Kumar ◽

Nikunj Bansal ◽

Manish Prateek ◽

Shashi Kumar

Keyword(s):

Machine Learning ◽

Remote Sensing ◽

Deep Learning ◽

Urban Area ◽

Urban Areas ◽

Learning Algorithms ◽

Semantic Segmentation ◽

Learning Model ◽

Machine Learning Algorithms ◽

Deep Learning Model

Abstract: Urban area mapping is an important application of remote sensing which aims at both estimation of and change in land cover under the urban area. A major challenge in analyzing Synthetic Aperture Radar (SAR) based remote sensing data is that highly vegetated urban areas and oriented urban targets closely resemble actual vegetation. This similarity between some urban areas and vegetation leads to misclassification of the urban area into forest cover. The present work is a precursor study for the dual-frequency L and S-band NASA-ISRO Synthetic Aperture Radar (NISAR) mission and aims at minimizing the misclassification of such highly vegetated and oriented urban targets into the vegetation class with the help of deep learning. In this study, three machine learning algorithms, Random Forest (RF), K-Nearest Neighbour (KNN), and Support Vector Machine (SVM), have been implemented along with a deep learning model, DeepLabv3+, for semantic segmentation of Polarimetric SAR (PolSAR) data. It is a general perception that a large dataset is required for the successful implementation of any deep learning model, but in the field of SAR based remote sensing, a major issue is the unavailability of a large benchmark labeled dataset for the implementation of deep learning algorithms from scratch. In the current work, it has been shown that a pre-trained deep learning model, DeepLabv3+, outperforms the machine learning algorithms for the land use and land cover (LULC) classification task even with a small dataset using transfer learning. The highest pixel accuracy of 87.78% and overall pixel accuracy of 85.65% have been achieved with DeepLabv3+, and Random Forest performs best among the machine learning algorithms with an overall pixel accuracy of 77.91%, while SVM and KNN trail with overall accuracies of 77.01% and 76.47% respectively. The highest precision of 0.9228 is recorded for the urban class for the semantic segmentation task with DeepLabv3+, while the machine learning algorithms SVM and RF gave comparable results with precisions of 0.8977 and 0.8958 respectively.
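
The segmentation results above are stated as pixel accuracy and per-class precision. The following sketch shows one common way to compute both from a predicted label map and a ground-truth label map. The function names, class names, and the tiny 2x3 maps are assumptions made for this example and do not come from the paper.

```python
def _paired_pixels(predicted, truth):
    """Flatten two equally shaped label maps into (predicted, true) pairs."""
    return [(p, t) for prow, trow in zip(predicted, truth) for p, t in zip(prow, trow)]


def pixel_accuracy(predicted, truth):
    """Fraction of pixels whose predicted class equals the true class."""
    pairs = _paired_pixels(predicted, truth)
    return sum(p == t for p, t in pairs) / len(pairs)


def class_precision(predicted, truth, cls):
    """Of the pixels predicted as cls, the fraction that truly belong to cls."""
    true_labels = [t for p, t in _paired_pixels(predicted, truth) if p == cls]
    if not true_labels:
        return 0.0  # class never predicted; convention chosen for this sketch
    return sum(t == cls for t in true_labels) / len(true_labels)


# Toy label maps (invented for illustration).
toy_truth = [["urban", "urban", "forest"], ["forest", "water", "water"]]
toy_pred = [["urban", "forest", "forest"], ["forest", "water", "urban"]]

toy_accuracy = pixel_accuracy(toy_pred, toy_truth)
toy_urban_precision = class_precision(toy_pred, toy_truth, "urban")
```

Here four of six pixels match, and one of the two pixels predicted as urban is truly urban, which is the kind of urban-versus-vegetation confusion the abstract describes.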

Reviewing the relationship between machines and radiology: the application of artificial intelligence

Acta Radiologica Open ◽

10.1177/2058460121990296 ◽

2021 ◽

Vol 10 (2) ◽

pp. 205846012199029

Author(s):

Rani Ahmad

Keyword(s):

Artificial Intelligence ◽

Machine Learning ◽

Deep Learning ◽

Health Care Professionals ◽

Learning Algorithms ◽

Machine Learning Algorithms ◽

Health Science ◽

Computer Algorithms ◽

Learning Models ◽

Specificity And Sensitivity

Background: The scope and productivity of artificial intelligence applications in health science and medicine, particularly in medical imaging, are rapidly progressing, with relatively recent developments in big data and deep learning and increasingly powerful computer algorithms. Accordingly, there are a number of opportunities and challenges for the radiological community. Purpose: To provide a review of the challenges and barriers experienced in diagnostic radiology on the basis of the key clinical applications of machine learning techniques. Material and Methods: Studies published in 2010–2019 that report on the efficacy of machine learning models were selected. A single contingency table was selected for each study to report the highest accuracy of radiology professionals and machine learning algorithms, and a meta-analysis of studies was conducted based on contingency tables. Results: The specificity for all the deep learning models ranged from 39% to 100%, whereas sensitivity ranged from 85% to 100%. The pooled sensitivity and specificity were 89% and 85% for the deep learning algorithms for detecting abnormalities compared to 75% and 91% for radiology experts, respectively. The pooled specificity and sensitivity for comparison between radiology professionals and deep learning algorithms were 91% and 81% for deep learning models and 85% and 73% for radiology professionals (p < 0.000), respectively. The pooled sensitivity detection was 82% for health-care professionals and 83% for deep learning algorithms (p < 0.005). Conclusion: Radiomic information extracted from images through machine learning programs may not be discernible through visual examination, and thus may improve the prognostic and diagnostic value of data sets.
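
Because the review above derives its figures from contingency tables, a short sketch of how sensitivity and specificity follow from the four cells of a 2x2 table may be helpful. The function name and the counts used are invented for illustration; they are not data from any of the reviewed studies.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Return (sensitivity, specificity) from the cells of a 2x2 contingency table.

    sensitivity = TP / (TP + FN): share of truly abnormal cases that were flagged
    specificity = TN / (TN + FP): share of truly normal cases that were cleared
    """
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one abnormal and one normal case")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical counts: 50 abnormal scans (45 detected), 50 normal scans (40 cleared).
toy_sensitivity, toy_specificity = sensitivity_specificity(tp=45, fn=5, tn=40, fp=10)
```

Pooling such values across studies, as a meta-analysis does, requires a weighting model on top of this per-study calculation, which the sketch does not attempt.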

Pervasive Lying Posture Tracking

Sensors ◽

10.3390/s20205953 ◽

2020 ◽

Vol 20 (20) ◽

pp. 5953 ◽

Cited By ~ 1

Author(s):

Parastoo Alinia ◽

Ali Samadani ◽

Mladen Milosevic ◽

Hassan Ghasemzadeh ◽

Saman Parvaneh

Keyword(s):

Machine Learning ◽

Deep Learning ◽

Computational Models ◽

Learning Algorithms ◽

Pressure Sensors ◽

Machine Learning Algorithms ◽

Sensor System ◽

Accurate Detection ◽

Research Questions ◽

Posture Tracking

Automated lying-posture tracking is important in preventing bed-related disorders, such as pressure injuries, sleep apnea, and lower-back pain. Prior research studied in-bed lying posture tracking using sensors of different modalities (e.g., accelerometer and pressure sensors). However, there remain significant gaps in research regarding how to design efficient in-bed lying posture tracking systems. These gaps can be articulated through several research questions, as follows. First, can we design a single-sensor, pervasive, and inexpensive system that can accurately detect lying postures? Second, what computational models are most effective in the accurate detection of lying postures? Finally, what physical configuration of the sensor system is most effective for lying posture tracking? To answer these important research questions, in this article we propose a comprehensive approach for designing a sensor system that uses a single accelerometer along with machine learning algorithms for in-bed lying posture classification. We design two categories of machine learning algorithms based on deep learning and traditional classification with handcrafted features to detect lying postures. We also investigate what wearing sites are the most effective in the accurate detection of lying postures. We extensively evaluate the performance of the proposed algorithms on nine different body locations and four human lying postures using two datasets. Our results show that a system with a single accelerometer can be used with either deep learning or traditional classifiers to accurately detect lying postures. The best models in our approach achieve an F1 score that ranges from 95.2% to 97.8% with a coefficient of variation from 0.03 to 0.05. The results also identify the thighs and chest as the most salient body sites for lying posture tracking. 
Our findings in this article suggest that, because accelerometers are ubiquitous and inexpensive sensors, they can be a viable source of information for pervasive monitoring of in-bed postures.
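
The abstract above summarizes performance with an F1 score and a coefficient of variation. The sketch below shows how both quantities are commonly computed. The function names and numbers are illustrative assumptions, and the use of the population standard deviation is also an assumption, since the abstract does not say which variant the authors used.

```python
import statistics


def f1_score(tp, fp, fn):
    """Harmonic mean of precision and recall for one class."""
    if tp == 0:
        return 0.0  # no true positives means precision and recall are both zero
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    return 2 * precision * recall / (precision + recall)


def coefficient_of_variation(values):
    """Standard deviation divided by the mean (population std, by assumption)."""
    return statistics.pstdev(values) / statistics.mean(values)


# Hypothetical confusion counts for one posture class.
toy_f1 = f1_score(tp=8, fp=2, fn=2)

# Hypothetical per-fold scores, used only to exercise the function.
toy_cv = coefficient_of_variation([2, 4, 4, 4, 5, 5, 7, 9])
```

A low coefficient of variation across folds or subjects indicates that the score is stable relative to its mean, which is how the abstract uses it alongside F1.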

A Robust Method to Predict Fluid Properties Based on Big Data and Machine Learning Algorithms

10.2523/iptc-21356-ms ◽

2021 ◽

Author(s):

Yingxian Liu ◽

Cunliang Chen ◽

Hanqing Zhao ◽

Yu Wang ◽

Xiaodong Han

Keyword(s):

Machine Learning ◽

Physical Properties ◽

Learning Algorithm ◽

Direct Method ◽

Learning Algorithms ◽

Small Error ◽

Machine Learning Algorithms ◽

Well Test ◽

Empirical Formulas ◽

Fluid Properties

Abstract: Fluid properties are key factors for predicting single well productivity, well test interpretation and oilfield recovery prediction, which directly affect the success of ODP program design. The most accurate and direct method of acquisition is underground sampling. However, not every well has samples, due to technical reasons such as excessive well deviation or high cost during the exploration stage. Therefore, analogies or empirical formulas have to be adopted to carry out research in many cases, but a large number of oilfield developments have shown that the errors caused by these methods are very large. How to obtain fluid physical properties quickly and accurately is therefore of great significance. In recent years, with the development and improvement of artificial intelligence and machine learning algorithms, their applications in the oilfield have become more and more extensive. This paper proposes a method for predicting crude oil physical properties based on machine learning algorithms. The method uses PVT data from nearly 100 wells in the Bohai Oilfield; 75% of the data is used for training to obtain the prediction model, and the remaining 25% is used for testing. Practice shows that the prediction results of the machine learning algorithm are very close to the actual data, with a very small error. Finally, the method was applied to the preliminary plan design of the BZ29 oilfield, a new oilfield, and in particular was used to predict fluid physical properties for the unsampled sand bodies. The paper also compares the influence of the analogy method on the scheme, which provides potential and risk analysis for scheme design. This method will be applied in more oil fields in the Bohai Sea in the future and has important promotion value.
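
The 75%/25% division of data described above can be sketched as follows. This is a generic shuffled hold-out split; the function name, the seed, and the placeholder well identifiers are assumptions for illustration, and the abstract does not say whether the authors split randomly or by some other rule.

```python
import random


def train_test_split(records, train_fraction=0.75, seed=0):
    """Shuffle a copy of the records and cut it into training and test portions."""
    shuffled = list(records)
    random.Random(seed).shuffle(shuffled)  # seeded so the split is reproducible
    cut = round(len(shuffled) * train_fraction)
    return shuffled[:cut], shuffled[cut:]


# Placeholder identifiers standing in for roughly 100 wells of PVT data.
wells = [f"well_{i}" for i in range(100)]
train_wells, test_wells = train_test_split(wells)
```

Splitting by well, as here, keeps all samples from one well on the same side of the split, which avoids leaking information from test wells into training. Whether the original work did this is not stated.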
