Research Brief | Computing+ Finance Professor Mingli Song: Recent Advances in Deep Learning for Retrosynthesis

Source: Shanghai Institute for Advanced Study English website

A cornerstone of organic chemistry, retrosynthesis gives chemists in materials and drug manufacturing access to scarce and entirely new molecules. Conventional rule-based or expert-based computer-aided synthesis has clear limitations, such as high labor costs and a limited search space. In recent years, breakthroughs driven by artificial intelligence have transformed retrosynthesis, enabling chemists to carry out molecular synthesis with greater efficiency and precision.

Conventional computer-assisted retrosynthesis requires matching against manually encoded chemical rules. AI-based retrosynthesis needs no manual encoding and offers strong generalization and context-capturing capacity, which makes it possible to screen and design synthetic routes more effectively. Professor Mingli Song of Zhejiang University and his collaborators have been combining retrosynthesis and AI in both theoretical research and experimental studies to identify new routes for drug molecules. Recently, they published the review ‘Recent advances in deep learning for retrosynthesis’ in WIREs Computational Molecular Science, providing a systematic overview of the development and application of deep learning for retrosynthesis.

The review summarized the progress and goals of AI in single-step and multistep retrosynthesis, and presented a thorough taxonomy of existing methods and evaluation metrics. The authors first described computational approaches to representing molecules, e.g., molecular fingerprints, SMILES strings, and molecular graphs. Next, they presented a full collection of methods for both settings. For single-step retrosynthesis, they divided the methods into selection-based and generation-based according to how the decision is made; for multistep retrosynthesis, they categorized the methods by the search algorithm used.
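To make the idea of molecular representations concrete, the following minimal Python sketch shows how one molecule, ethanol, can be written as a SMILES string and as a molecular graph (a list of atoms plus a list of bonds). It is an illustration only and is not taken from the review: the function name `chain_smiles_to_graph` is hypothetical, and the sketch assumes an unbranched chain of single-letter atoms joined by single bonds, which covers only a tiny fraction of real SMILES syntax.

```python
# Illustrative sketch, not a real SMILES parser.
# Assumption: the input is an unbranched chain of single-letter atoms
# (B, C, N, O, F, P, S, I) connected by implicit single bonds, e.g. "CCO".
# Rings, branches, charges, aromatic atoms and two-letter elements
# such as Cl or Br are NOT handled.

ALLOWED_ATOMS = "BCNOFPSI"


def chain_smiles_to_graph(smiles):
    """Return (atoms, bonds) for a simple chain SMILES string.

    atoms is a list of element symbols indexed by position;
    bonds is a list of (i, j) index pairs, one per single bond.
    """
    atoms = list(smiles)
    if not atoms or not all(a in ALLOWED_ATOMS for a in atoms):
        raise ValueError("only unbranched chains of single-letter atoms are supported")
    # In a chain, each atom is bonded to the next one in the string.
    bonds = [(i, i + 1) for i in range(len(atoms) - 1)]
    return atoms, bonds


# Ethanol: the same molecule as a string and as a graph.
atoms, bonds = chain_smiles_to_graph("CCO")
print(atoms)
print(bonds)
```

The string form is compact and suits sequence models, while the graph form makes the bonding structure explicit. Practical work would rely on a cheminformatics toolkit rather than a hand-written parser like this one.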

The authors also compared the effectiveness of different methods through additional experiments on multiple reaction datasets such as USPTO-50K, and found that some methods that performed best on small datasets did not do as well on large ones. In addition, they analyzed popular retrosynthesis databases as well as established platforms with interfaces, providing an easy reference for organic synthesis researchers. They also pointed out that current AI-based retrosynthesis still suffers from problems such as a lack of evaluation metrics and of interpretability. In conclusion, having reviewed progress and existing studies, the authors discussed promising research directions in this field.

To learn more about the review, please visit https://wires.onlinelibrary.wiley.com/doi/pdf/10.1002/wcms.1694