We have developed an end-to-end retrosynthesis system, named ChemiRise, that can propose complete retrosynthesis routes for organic compounds rapidly and reliably. The system was trained on a processed patent database of over 3 million organic reactions. Experimental reactions were atom-mapped and clustered, and reaction templates were extracted from them. We then trained a one-step reaction proposer, based on a graph convolutional neural network, using template embeddings, and developed a guiding algorithm that operates on the directed acyclic graph (DAG) of chemical compounds to select the best candidate to explore next. The atom-mapping algorithm and the one-step reaction proposer were benchmarked against previous approaches and outperformed them: the atom-mapping algorithm achieved a higher complete mapping rate than Marvin and Indigo, and the reaction proposer achieved higher top-N accuracy than the model of Coley et al. The complete system was demonstrated on retrosynthesis routes that were reviewed and rated by human experts, showing satisfactory functionality and a potential productivity boost in real-life use cases.
System architecture of ChemiRise; information flows from the bottom to the top.
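
To make the idea of a guided search over a compound graph concrete, here is a minimal sketch of a generic best-first exploration. It is not ChemiRise's actual guiding algorithm, which is not specified here. The names `explore`, `is_solved`, `propose`, and `buyable`, the scoring rule (the product of proposer scores along a path), and the toy reaction data are all assumptions made for illustration.

```python
import heapq
from typing import Callable, Dict, FrozenSet, List, Set, Tuple

# Hypothetical type: (proposer score in (0, 1], precursor compounds)
Reaction = Tuple[float, Tuple[str, ...]]


def explore(
    target: str,
    propose: Callable[[str], List[Reaction]],
    buyable: Set[str],
    max_expansions: int = 50,
) -> Dict[str, List[Reaction]]:
    """Expand compounds best-first and return the explored compound graph."""
    graph: Dict[str, List[Reaction]] = {}
    # heapq is a min-heap, so priorities are stored negated.
    frontier: List[Tuple[float, str]] = [(-1.0, target)]
    expansions = 0
    while frontier and expansions < max_expansions:
        neg_priority, compound = heapq.heappop(frontier)
        if compound in graph or compound in buyable:
            continue  # already expanded, or nothing left to do
        reactions = propose(compound)
        graph[compound] = reactions
        expansions += 1
        for score, precursors in reactions:
            for precursor in precursors:
                if precursor not in graph and precursor not in buyable:
                    # Assumed priority: product of scores along the path.
                    heapq.heappush(frontier, (neg_priority * score, precursor))
    return graph


def is_solved(
    compound: str,
    graph: Dict[str, List[Reaction]],
    buyable: Set[str],
    _path: FrozenSet[str] = frozenset(),
) -> bool:
    """A compound is solved if it is buyable, or if at least one of its
    reactions has only solved precursors."""
    if compound in buyable:
        return True
    if compound in _path:  # guard against accidental cycles in toy data
        return False
    path = _path | {compound}
    return any(
        all(is_solved(p, graph, buyable, path) for p in precursors)
        for _, precursors in graph.get(compound, [])
    )


# Toy data (invented): compound names stand in for real structures.
toy_reactions: Dict[str, List[Reaction]] = {
    "target": [(0.9, ("A", "B")), (0.4, ("C",))],
    "A": [(0.8, ("D",))],
}
buyable = {"B", "D"}

graph = explore("target", lambda c: toy_reactions.get(c, []), buyable)
solved = is_solved("target", graph, buyable)
```

In this toy run the search expands `target` first, then `A` (priority 0.9) before `C` (priority 0.4), and the target is solved through the route `target <- A + B`, `A <- D`. A real system would replace `propose` with the trained one-step proposer and would likely use a more informed priority.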

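Evaluation results of atom-mapping algorithms

| Algorithm | Complete mapping rate | Average number of C-C bonds broken |
|-----------|-----------------------|------------------------------------|
| ChemiRise | 73%                   | 0.28                               |
| Marvin    | 63%                   | 0.30                               |
| Indigo    | 57%                   | 0.37                               |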

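Evaluation results of reaction proposers (top-N accuracy, %)

| Model        | Top-1 | Top-3 | Top-5 | Top-10 | Top-20 | Top-50 |
|--------------|-------|-------|-------|--------|--------|--------|
| ChemiRise    | 43.8  | 62.1  | 70.1  | 78.3   | 85.0   | 90.5   |
| Coley et al. | 37.3  | 54.7  | 63.3  | 74.1   | 82.0   | 85.3   |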
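
For readers unfamiliar with the metric, top-N accuracy is the percentage of test reactions for which the recorded answer appears among a model's N highest-ranked proposals. The sketch below shows one common way to compute it. The function name `top_n_accuracy` and the template identifiers are hypothetical, and the exact evaluation protocol used for the table above may differ.

```python
from typing import List, Sequence


def top_n_accuracy(
    ranked_predictions: Sequence[Sequence[str]],
    ground_truth: Sequence[str],
    n: int,
) -> float:
    """Percentage of examples whose true label is among the first n predictions."""
    if len(ranked_predictions) != len(ground_truth):
        raise ValueError("predictions and ground truth must have the same length")
    if not ground_truth:
        return 0.0
    hits = sum(
        1
        for predictions, truth in zip(ranked_predictions, ground_truth)
        if truth in predictions[:n]
    )
    return 100.0 * hits / len(ground_truth)


# Invented example: each inner list is a model's ranking, best first.
predictions: List[List[str]] = [
    ["t1", "t2", "t3"],
    ["t4", "t5", "t6"],
    ["t7", "t8", "t9"],
    ["t2", "t1", "t3"],
]
truth = ["t1", "t5", "t0", "t3"]

top1 = top_n_accuracy(predictions, truth, 1)  # 25.0
top3 = top_n_accuracy(predictions, truth, 3)  # 75.0
```

Because a larger N can only add hits, the values in each table row necessarily increase from left to right.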