Learning from Failure: Predicting Electronic Structure Calculation Outcomes with Machine Learning Models

Chenru Duan; Jon Paul Janet; Fang Liu; Aditya Nandy; Heather J. Kulik

doi:10.1021/acs.jctc.9b00057

Designing in the Face of Uncertainty: Exploiting Electronic Structure and Machine Learning Models for Discovery in Inorganic Chemistry

Inorganic Chemistry; doi:10.1021/acs.inorgchem.9b00109; 2019; Vol 58 (16); pp. 10592-10606; Cited By ~ 26

Author(s): Jon Paul Janet; Fang Liu; Aditya Nandy; Chenru Duan; Tzuhsiung Yang; ...

Keyword(s): Inorganic Chemistry; Machine Learning; Electronic Structure; Learning Models; The Face; Machine Learning Models

Learning from Failure: Predicting Electronic Structure Calculation Outcomes with Machine Learning Models

doi:10.26434/chemrxiv.7616009.v1; 2019

Author(s): Chenru Duan; Jon Paul Janet; Fang Liu; Aditya Nandy; Heather Kulik

Keyword(s): Neural Network; Machine Learning; Electronic Structure; Transition Metal; Geometry Optimization; Series Data; Learning Models; Transition Metal Chemistry; Metal Chemistry; Machine Learning Models

High-throughput computational screening for chemical discovery mandates the automated and unsupervised simulation of thousands of new molecules and materials. In challenging materials spaces, such as open shell transition metal chemistry, characterization requires time-consuming first-principles simulation that often necessitates human intervention. These calculations can frequently lead to a null result, e.g., the calculation does not converge or the molecule does not stay intact during a geometry optimization. To overcome this challenge toward realizing fully automated chemical discovery in transition metal chemistry, we have developed the first machine learning models that predict the likelihood of successful simulation outcomes. We train support vector machine and artificial neural network classifiers to predict simulation outcomes (i.e., geometry optimization result and degree of deviation) for a chosen electronic structure method based on chemical composition. For these static models, we achieve an area under the curve of at least 0.95, minimizing computational time spent on nonproductive simulations and therefore enabling efficient chemical space exploration. We introduce a metric of model uncertainty based on the distribution of points in the latent space to systematically improve model prediction confidence. In a complementary approach, we train a convolutional neural network classification model on simulation output electronic and geometric structure time series data. This dynamic model generalizes more readily than the static classifier by becoming more predictive as input simulation length increases. Finally, we describe approaches for using these models to enable autonomous job control in transition metal complex discovery.
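
The abstract above reports classifier quality as an area under the (ROC) curve and motivates skipping simulations that are predicted to fail. As a minimal sketch of how such a score might be computed and used, the Python below evaluates AUC with the pairwise (Mann-Whitney) formulation and applies a simple probability cutoff. It is not the authors' code: the function names `roc_auc` and `should_run`, the 0.5 threshold, and the toy labels and scores are all hypothetical choices made for illustration.

```python
from typing import Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Area under the ROC curve via the pairwise (Mann-Whitney) formulation.

    labels: 1 for a successful simulation, 0 for a failed one.
    scores: classifier output, higher meaning "more likely to succeed".
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative label")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0  # positive ranked above negative
            elif p == n:
                wins += 0.5  # ties count as half
    return wins / (len(pos) * len(neg))


def should_run(score: float, threshold: float = 0.5) -> bool:
    """Hypothetical job filter: run only if predicted success is likely."""
    return score >= threshold


# Toy data, invented for this example (not taken from the paper).
labels = [1, 1, 1, 0, 1, 0, 0, 0]
scores = [0.95, 0.90, 0.80, 0.70, 0.60, 0.40, 0.30, 0.10]

auc = roc_auc(labels, scores)
print(f"AUC = {auc:.4f}")  # AUC = 0.9375
print([should_run(s) for s in scores])
```

The pairwise form is quadratic in the number of samples, which is acceptable for a small illustration; a rank-based implementation would be preferable for large data sets.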

Learning from Failure: Predicting Electronic Structure Calculation Outcomes with Machine Learning Models

doi:10.26434/chemrxiv.7616009; 2019

Author(s): Chenru Duan; Jon Paul Janet; Fang Liu; Aditya Nandy; Heather Kulik

Keyword(s): Neural Network; Machine Learning; Electronic Structure; Transition Metal; Geometry Optimization; Series Data; Learning Models; Transition Metal Chemistry; Metal Chemistry; Machine Learning Models

Improving XGBoost with Imagination Sampling

Communications of the Blyth Institute; doi:10.33014/issn.2640-5652.2.1.holloway.1; 2020; Vol 2 (1); pp. 3-6

Author(s): Eric Holloway

Keyword(s): Machine Learning; General System; Learning Models; Starting Point; Machine Learning Models

Imagination Sampling is the usage of a person as an oracle for generating or improving machine learning models. Previous work demonstrated a general system for using Imagination Sampling for obtaining multibox models. Here, the possibility of importing such models as the starting point for further automatic enhancement is explored.

Development of Machine Learning Models to Predict Student Performance in Computer Literacy Courses

International Review on Computers and Software (IRECOS); doi:10.15866/irecos.v13i1.16863; 2018; Vol 13 (1); pp. 21

Author(s): George Anderson; Oduronke T. Eyitayo

Keyword(s): Machine Learning; Student Performance; Computer Literacy; Learning Models; Machine Learning Models

Experimental Comparison of Machine Learning Models in Malware Packing Detection

2020 21st Asia-Pacific Network Operations and Management Symposium (APNOMS); doi:10.23919/apnoms50412.2020.9237007; 2020

Author(s): Jong-Wouk Kim; Juhong Namgung; Yang-Sae Moon; Mi-Jung Choi

Keyword(s): Machine Learning; Experimental Comparison; Learning Models; Machine Learning Models

Epigenetic Target Prediction with Accurate Machine Learning Models

doi:10.26434/chemrxiv.13522313; 2021

Author(s): Norberto Sánchez-Cruz; Jose L. Medina-Franco

Keyword(s): Machine Learning; Small Molecules; Predictive Models; Large Scale; Target Prediction; Quantitative Measure; Learning Models; Discovery Research; Drug Discovery Research; Machine Learning Models

Epigenetic targets are a significant focus for drug discovery research, as demonstrated by the eight approved epigenetic drugs for treatment of cancer and the increasing availability of chemogenomic data related to epigenetics. These data represent a large number of structure-activity relationships that have not been exploited thus far for the development of predictive models to support medicinal chemistry efforts. Herein, we report the first large-scale study of 26318 compounds with a quantitative measure of biological activity for 55 protein targets with epigenetic activity. Through a systematic comparison of machine learning models trained on molecular fingerprints of different design, we built predictive models with high accuracy for the epigenetic target profiling of small molecules. The models were thoroughly validated, showing mean precisions up to 0.952 for the epigenetic target prediction task. Our results indicate that the herein reported models have considerable potential to identify small molecules with epigenetic activity. Therefore, our results were implemented as a freely accessible and easy-to-use web application.

A Comparative Study of Machine Learning Models for Stock Market Rate Prediction

International Journal of Computer Sciences and Engineering; doi:10.26438/ijcse/v7i6.985990; 2019; Vol 7 (6); pp. 985-990

Author(s): reeraksha M S; Bhargavi M S

Keyword(s): Machine Learning; Stock Market; Comparative Study; Learning Models; Rate Prediction; Market Rate; Machine Learning Models

An Intelligent Approach for Prediction of Liver Disease using Machine Learning Models

International Journal of Emerging Trends in Engineering Research; doi:10.30534/ijeter/2020/568102020; 2020; Vol 8 (10); pp. 6974-6983

Keyword(s): Machine Learning; Liver Disease; Learning Models; Intelligent Approach; Machine Learning Models

Utilizing Blockchain Technology in Social Media Bot Identification

doi:10.36227/techrxiv.12049374; 2020

Author(s): Shreya Reddy; Lisa Ewen; Pankti Patel; Prerak Patel; Ankit Kundal; ...

Keyword(s): Machine Learning; Social Media; Gold Standard; The Internet; Learning Models; Current Time; Machine Learning Methods; Blockchain Technology; Modern Age; Machine Learning Models

As bots become more prevalent and smarter in the modern age of the internet, it becomes ever more important that they be identified and removed. Recent research indicates that machine learning methods are accurate and are the gold standard for bot identification on social media. Unfortunately, machine learning models have drawbacks, such as lengthy training times, difficult feature selection, and overwhelming pre-processing tasks. To overcome these difficulties, we propose a blockchain framework for bot identification. It is currently unknown how this method will perform, but it serves to expose a large research gap in this area.
