Deep Synthesis is a deep-learning-driven application for predicting the products of an organic chemical reaction. Deep Synthesis runs a deep learning machine translation model, inspired by Schwaller et al., that takes as input the SMILES string representation of chemical reactants and "translates" it to the SMILES strings of the products.
Deep Synthesis allows chemists and chemistry enthusiasts to experiment with reactions in silico.
Deep Synthesis is running online at deepsynthesis.xyz.
Deep-Synthesis
├── Synthesis
│ └── Main program files
├── build_aws
│ └── Instructions for AWS setup
├── build_local
│ └── Instructions for local setup
├── configs
├── data
│ └── Small sample datasets
├── media
└── train
  └── Code for retraining the model
Main application files are stored in the Synthesis directory.
Code and instructions for setup on AWS are contained in the build_aws directory.
Code and instructions for local setup are contained in the build_local directory.
Code for retraining the model (assuming a local install) is contained in the train directory.
The Deep Synthesis repo supports local installation and setup on AWS.
Deep Synthesis can easily be set up on your local machine, either as a Docker container or a conda environment. Local setup is best if you want to run bulk predictions or tinker with the app.
Local setup supports bulk prediction from a text file of SMILES strings, either through the app GUI or a command line interface.
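Before running a bulk prediction, it can help to clean the input file. The following is a minimal sketch, assuming the file holds one reactant SMILES string per line; the exact format the app expects is documented in the build_local README, and `load_smiles_file` is a hypothetical helper, not part of the Deep Synthesis code.

```python
from pathlib import Path
from typing import List


def load_smiles_file(path: str) -> List[str]:
    """Read SMILES strings from a text file, one per line (assumed format).

    Surrounding whitespace is stripped and blank lines are skipped.
    No chemical validation is performed here.
    """
    smiles = []
    for line in Path(path).read_text(encoding="utf-8").splitlines():
        entry = line.strip()
        if entry:
            smiles.append(entry)
    return smiles
```

The returned list preserves the order of the file, so predictions can be matched back to their input lines.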
The quickest way to get up and running is to install Deep Synthesis using Docker. If you do not have Docker installed, follow the Docker Download Instructions. Then run the following commands:
git clone https://github.com/kheyer/Deep-Synthesis
cd Deep-Synthesis
docker build -f build_local/local.Dockerfile -t deep_synthesis .
docker run -d -p 8501:8501 deep_synthesis
Deep Synthesis is now running locally on port 8501.
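To confirm that the container is accepting connections, a quick check like the one below can be used. This is only a sketch: `port_is_open` is a hypothetical helper, and it tests that something is listening on port 8501, not that the app has finished loading the model.

```python
import socket


def port_is_open(host: str = "localhost", port: int = 8501, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds within the timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, timeouts, and unreachable hosts.
        return False
```

Calling `port_is_open()` with the defaults checks the port mapped by the `docker run` command above.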
For additional local installation instructions, see the README in the build_local directory. It details how to set up Deep Synthesis as a Conda environment as an alternative to Docker, how to run bulk predictions on your local install, and how to run predictions from the command line.
For a more scalable setup, Deep Synthesis can be run on AWS. We can set up a Kubernetes cluster on AWS to host the front end of the application and an AWS Lambda function to handle inference.
This is the framework being used to host the application at deepsynthesis.xyz. For full details on setting up Kubernetes and AWS Lambda, see the build_aws directory.
Note that AWS setup is much more involved than local setup, and requires an AWS IAM account with permissions for EKS, EC2 and Lambda.
Deep Synthesis is running online at deepsynthesis.xyz. Using the web app is great if you want to play with the model or run a small number of predictions.
Input your SMILES string into the text box, or choose one of the examples in the drop down menu on the left. Clicking the "Predict Products" button generates a set of predicted reaction products.
Predictions can be further inspected by looking at attention maps between reactant and predicted product strings.
The model used is a sequence-to-sequence transformer model, implemented in OpenNMT. This model takes as input the SMILES string representation of reactants, and "translates" it to the SMILES of the product molecule. This method was originally developed by Schwaller et al. Their work is available at the Molecular Transformer Repo.
Unlike the model of Schwaller et al., the model shown here was trained from scratch in PyTorch 1.1.0 using the expanded Patent Reaction Dataset. The new model also uses character-level tokenization, which reduces model size and removes the need for the chemically constrained beam search procedure used by Schwaller et al.
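As an illustration of character-level tokenization, the sketch below splits a SMILES string into single characters separated by spaces, the whitespace-delimited form that OpenNMT reads as a token sequence. It assumes the simplest possible scheme; the preprocessing actually used for training lives in the train directory, and both function names here are hypothetical.

```python
def char_tokenize(smiles: str) -> str:
    """Split a SMILES string into space-separated single-character tokens."""
    return " ".join(smiles.strip())


def char_detokenize(tokens: str) -> str:
    """Rebuild a SMILES string from space-separated character tokens."""
    return tokens.replace(" ", "")


print(char_tokenize("CC(=O)O"))  # C C ( = O ) O
print(char_tokenize("BrCCO"))    # B r C C O
```

Note that under this scheme a two-letter element such as Br becomes two tokens, so the vocabulary stays limited to the individual characters that appear in the data.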
For more details on the project, see the presentation slides.