Molecular Transformer – A Model for Uncertainty-Calibrated Chemical Reaction Prediction

2019 ◽  
Author(s):  
Philippe Schwaller ◽  
Teodoro Laino ◽  
Theophile Gaudin ◽  
Peter Bolgar ◽  
Costas Bekas ◽  
...  

Organic synthesis is one of the key stumbling blocks in medicinal chemistry. A necessary yet unsolved step in planning synthesis is solving the forward problem: given reactants and reagents, predict the products. Similar to other work, we treat reaction prediction as a machine translation problem between SMILES strings of reactants-reagents and the products. We show that a multi-head attention Molecular Transformer model outperforms all algorithms in the literature, achieving a top-1 accuracy above 90% on a common benchmark dataset. Our algorithm requires no handcrafted rules, and accurately predicts subtle chemical transformations. Crucially, our model can accurately estimate its own uncertainty, with an uncertainty score that is 89% accurate in terms of classifying whether a prediction is correct. Furthermore, we show that the model is able to handle inputs without reactant-reagent split and including stereochemistry, which makes our method universally applicable.
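To make the translation framing concrete, the following Python sketch turns a reaction SMILES into a tokenized source/target sentence pair, with or without the reactant-reagent split, and scores a predicted sequence by the product of its token probabilities. The token pattern, the function names, the example reaction, and the choice of score are assumptions made for illustration; the abstract does not specify any of them.

```python
import math
import re

# Illustrative token pattern (an assumption, not taken from the abstract):
# bracket atoms, two-letter halogens, single-letter atoms, ring-closure
# labels, and bond/branch/separator symbols each become one token.
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|[BCNOSPFIbcnosp]|%\d{2}|\d|[()=#\-+\\/:~@?>*$.]"
)


def tokenize(smiles: str) -> list[str]:
    """Split a SMILES string into tokens; reject characters the pattern misses."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"cannot tokenize SMILES: {smiles!r}")
    return tokens


def to_translation_pair(reaction: str, split_reagents: bool = True) -> tuple[str, str]:
    """Convert 'reactants>reagents>products' into (source, target) sentences.

    With split_reagents=False, reactants and reagents are merged into one
    dot-separated input, mimicking an input without reactant-reagent split.
    """
    reactants, reagents, products = reaction.split(">")
    if split_reagents:
        source = reactants + (">" + reagents if reagents else "")
    else:
        source = ".".join(part for part in (reactants, reagents) if part)
    return " ".join(tokenize(source)), " ".join(tokenize(products))


def sequence_confidence(token_log_probs: list[float]) -> float:
    """Hypothetical confidence score: product of the predicted token probabilities."""
    return math.exp(sum(token_log_probs))


if __name__ == "__main__":
    # Acid-catalysed esterification, written as reactants>reagents>product.
    reaction = "CC(=O)O.OCC>OS(=O)(=O)O>CC(=O)OCC"
    print(to_translation_pair(reaction))
    print(to_translation_pair(reaction, split_reagents=False))
    print(sequence_confidence([math.log(0.9), math.log(0.5)]))
```

A score of this kind can be thresholded to decide whether to trust a prediction, which is the sort of correct/incorrect classification the abstract describes; the threshold itself would have to be chosen on held-out data.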


Synthesis ◽  
2019 ◽  
Vol 52 (05) ◽  
pp. 673-687 ◽  
Author(s):  
Yan-Ping Meng ◽  
Shi-Meng Wang ◽  
Wan-Yin Fang ◽  
Zhi-Zhong Xie ◽  
Jing Leng ◽  
...  

The sulfur(VI) fluoride exchange reaction (SuFEx), developed by Sharpless and co-workers in 2014, is a new category of click reaction that creates molecular connections with absolute reliability and unprecedented efficiency through a sulfur(VI) hub. Ethenesulfonyl fluoride (ESF), as one of the most important sulfur(VI) hubs, exhibits extraordinary reactivity in SuFEx click chemistry and organic synthesis. This review summarizes the chemical properties and applications of ESF in click chemistry, organic chemistry, materials science, medicinal chemistry and in many other fields related to organic synthesis.

1 Introduction
2 Chemical Transformations of ESF
3 Chemical Transformations of 2-Arylethenesulfonyl Fluorides
4 Novel SuFEx Reagents Derived from ESF
5 Applications of ESF Derivatives in Medicinal Chemistry
6 Applications of ESF Derivatives in Materials Science
7 Conclusion


2021 ◽  
Vol 13 (1) ◽  
Author(s):  
Jiazhen He ◽  
Huifang You ◽  
Emil Sandström ◽  
Eva Nittinger ◽  
Esben Jannik Bjerrum ◽  
...  

A main challenge in drug discovery is finding molecules with a desirable balance of multiple properties. Here, we focus on the task of molecular optimization, where the goal is to optimize a given starting molecule towards desirable properties. This task can be framed as a machine translation problem in natural language processing, where in our case, a molecule is translated into a molecule with optimized properties based on the SMILES representation. Typically, chemists would use their intuition to suggest chemical transformations for the starting molecule being optimized. A widely used strategy is the concept of matched molecular pairs, where two molecules differ by a single transformation. We seek to capture the chemist's intuition from matched molecular pairs using machine translation models. Specifically, the sequence-to-sequence model with attention mechanism and the Transformer model are employed to generate molecules with desirable properties. As a proof of concept, three ADMET properties are optimized simultaneously: logD, solubility, and clearance, which are important properties of a drug. Since desirable properties often vary from project to project, the user-specified desirable property changes are incorporated into the input as an additional condition together with the starting molecules being optimized. Thus, the models can be guided to generate molecules satisfying the desirable properties. Additionally, we compare the two machine translation models based on the SMILES representation with a graph-to-graph translation model, HierG2G, which has shown state-of-the-art performance in molecular optimization. Our results show that the Transformer can generate more molecules with desirable properties by making small modifications to the given starting molecules, which can be intuitive to chemists. A further enrichment of diverse molecules can be achieved by using an ensemble of models.
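One way to picture the conditioning described above is to prepend one token per desired property change to the tokenized starting molecule, and to use the other member of a matched molecular pair as the translation target. The Python sketch below does exactly that. The token names such as `logD_decrease`, the three-level change vocabulary, and the example molecules are hypothetical; the abstract does not state how the condition is encoded.

```python
import re

# Illustrative SMILES token pattern (an assumption for this sketch).
SMILES_TOKEN = re.compile(
    r"\[[^\]]+\]|Br|Cl|[BCNOSPFIbcnosp]|%\d{2}|\d|[()=#\-+\\/:~@?>*$.]"
)

# The three properties named in the abstract; the change labels are hypothetical.
PROPERTIES = ("logD", "solubility", "clearance")
CHANGES = ("decrease", "no_change", "increase")


def tokenize(smiles: str) -> list[str]:
    """Split a SMILES string into tokens; reject characters the pattern misses."""
    tokens = SMILES_TOKEN.findall(smiles)
    if "".join(tokens) != smiles:
        raise ValueError(f"cannot tokenize SMILES: {smiles!r}")
    return tokens


def build_source(start_smiles: str, desired: dict[str, str]) -> str:
    """Prefix the starting molecule with one condition token per property."""
    condition = []
    for name in PROPERTIES:
        change = desired[name]  # KeyError if a property is left unspecified
        if change not in CHANGES:
            raise ValueError(f"unknown change {change!r} for {name}")
        condition.append(f"{name}_{change}")
    return " ".join(condition + tokenize(start_smiles))


def training_example(start: str, target: str, desired: dict[str, str]) -> tuple[str, str]:
    """Build (source, target) sentences from one matched molecular pair."""
    return build_source(start, desired), " ".join(tokenize(target))


if __name__ == "__main__":
    # Hypothetical pair: phenol and anisole differ by a single transformation.
    wanted = {"logD": "increase", "solubility": "decrease", "clearance": "no_change"}
    print(training_example("c1ccccc1O", "c1ccccc1OC", wanted))
```

At inference time only `build_source` would be needed: the user supplies the starting molecule and the desired changes, and the model generates candidate target sequences.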


2020 ◽  
Author(s):  
Chengyun Zhang ◽  
Ling Wang ◽  
Yejian Wu ◽  
Yun Zhang ◽  
An Su ◽  
...  

Atom mapping reveals the correspondence between reactant and product atoms in chemical reactions, which is important for drug design, exploration of underlying chemical mechanisms, reaction classification and so on. Here, we present a new method that links atom mapping and neural machine translation using the transformer model. In contrast to previous algorithms, our method runs reaction prediction and captures the information of corresponding atoms in parallel. Meanwhile, we use a set of approximately 360K reactions without atom mapping information to obtain general chemical knowledge and transfer it to the atom mapping task on another dataset, which contains 50K atom-mapped reactions. With manual evaluation, the top-1 accuracy of the transformer model in atom mapping reaches 91.4%. We hope our work can provide an important step toward solving the challenging problem of atom mapping from a linguistic perspective.
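For readers unfamiliar with atom maps, the short Python sketch below reads the map numbers from an atom-mapped reaction SMILES and reports which reactant atom ends up at which product position. It only illustrates what an atom mapping encodes and is not the method of the paper. It assumes every atom is written as a bracket atom with a `:n` map number, and the example reaction and function names are chosen here for illustration.

```python
import re

# Matches a bracket atom that carries a map number, e.g. [CH3:1] or [O:6].
MAPPED_ATOM = re.compile(r"\[[^\]:]+:(\d+)\]")


def map_numbers(side: str) -> list[int]:
    """Return the map numbers of one reaction side, in order of appearance."""
    return [int(n) for n in MAPPED_ATOM.findall(side)]


def atom_correspondence(reaction: str) -> tuple[dict[int, tuple[int, int]], list[int]]:
    """Relate reactant and product atoms through their shared map numbers.

    Returns (kept, lost): kept maps each map number to its
    (reactant position, product position); lost lists the map numbers of
    reactant atoms that do not appear in the recorded product.
    """
    reactants, _agents, products = reaction.split(">")
    reactant_pos = {n: i for i, n in enumerate(map_numbers(reactants))}
    product_pos = {n: i for i, n in enumerate(map_numbers(products))}
    kept = {n: (reactant_pos[n], product_pos[n]) for n in product_pos if n in reactant_pos}
    lost = sorted(set(reactant_pos) - set(product_pos))
    return kept, lost


if __name__ == "__main__":
    # Esterification: the alcohol oxygen (map 6) is retained in the ester,
    # while the acid hydroxyl oxygen (map 4) leaves as water.
    rxn = "[CH3:1][C:2](=[O:3])[OH:4].[CH3:5][OH:6]>>[CH3:1][C:2](=[O:3])[O:6][CH3:5]"
    kept, lost = atom_correspondence(rxn)
    print(kept)
    print(lost)
```

Positions here are simply the order in which mapped atoms appear in each SMILES string; a cheminformatics toolkit would normally supply proper atom indices.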





