Generation Evaluation Scripts

This directory contains scripts to sample molecules from Molecule Chef and to compute the metrics on the generated molecules that are reported in Table 1.

Folders

generate_for_mchef

This folder contains the scripts that sample reactants and put them in a format ready for the Molecular Transformer. To generate molecules, follow these steps:

  1. Run scripts/evaluate/generation/generate_for_mchef/create_reactant_bags.py to generate tokenized reactant bags for the transformer.
  2. Feed these through the transformer to get tokenized product bags. You can run this translation with the Molecular Transformer code using a command such as:
     python translate.py -model <transformer-weight-path> \
                         -src <path-to-tokenized-reactants> \
                         -output <path-for-tokenized-products> \
                         -batch_size 300 -replace_unk -max_length 500 -fast -gpu 1 -n_best 5
  3. Use the script scripts/evaluate/put_together_molecular_transformer_predictions.py to combine the tokenized predictions into a file of generated SMILES. This file can be placed in the generated_smiles folder.
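To make the data flow through the three steps above more concrete, here is a minimal sketch of what "tokenized" SMILES look like and how tokenized predictions could be turned back into SMILES strings. It is not the code from create_reactant_bags.py or put_together_molecular_transformer_predictions.py. The function names are hypothetical, the regular expression is the one commonly used with the Molecular Transformer (assumed here, not confirmed by this README), and the grouping step assumes that the n_best predictions for each input are written on consecutive lines of the output file.

```python
import re
from typing import List

# Token pattern commonly used for SMILES with the Molecular Transformer.
# ASSUMPTION: check it against the tokenizer actually used by the scripts.
SMILES_TOKEN_PATTERN = re.compile(
    r"(\[[^\]]+]|Br?|Cl?|N|O|S|P|F|I|b|c|n|o|s|p|\(|\)|\.|=|#|-|\+|\\|/|:|~|@|\?|>|\*|\$|%[0-9]{2}|[0-9])"
)


def tokenize_smiles(smiles: str) -> str:
    """Split a SMILES string into space-separated tokens (hypothetical helper)."""
    tokens = SMILES_TOKEN_PATTERN.findall(smiles)
    # Fail loudly if the pattern skipped any character of the input.
    if "".join(tokens) != smiles:
        raise ValueError(f"Could not tokenize SMILES: {smiles!r}")
    return " ".join(tokens)


def detokenize(line: str) -> str:
    """Remove the whitespace between tokens to recover a SMILES string."""
    return "".join(line.split())


def group_n_best(lines: List[str], n_best: int) -> List[List[str]]:
    """Group detokenized predictions into one list per input reactant bag.

    ASSUMPTION: the n_best predictions for each input sit on consecutive lines.
    """
    if len(lines) % n_best != 0:
        raise ValueError("Number of prediction lines is not a multiple of n_best")
    return [
        [detokenize(line) for line in lines[start:start + n_best]]
        for start in range(0, len(lines), n_best)
    ]


if __name__ == "__main__":
    print(tokenize_smiles("CC(=O)O.c1ccccc1Br"))
    print(group_n_best(["C C O", "C C N", "C O", "C"], n_best=2))
```

With the command shown above (-n_best 5), the grouping would use n_best=5. Which of the five predictions to keep for each reactant bag is a separate choice that this sketch does not make.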

generated_smiles

Stores generated SMILES strings from the models.

metrics

This folder contains the scripts to evaluate the molecules generated by a model. Modify tables_spec.json to control which metrics are evaluated, then run python evaluate_metrics.py to create the table.
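As a rough illustration of the kind of computation such an evaluation involves, the sketch below computes two metrics that are commonly reported for generative models: uniqueness and novelty. It is not the code in evaluate_metrics.py. The function names are hypothetical, this README does not say whether these are among the metrics in Table 1, and the sketch assumes that every SMILES string has already been canonicalized so that plain string comparison is meaningful.

```python
from typing import Dict, Iterable, Set


def uniqueness(generated: Iterable[str]) -> float:
    """Fraction of generated SMILES that are distinct (hypothetical helper)."""
    generated = list(generated)
    if not generated:
        return 0.0
    return len(set(generated)) / len(generated)


def novelty(generated: Iterable[str], training: Set[str]) -> float:
    """Fraction of distinct generated SMILES that are absent from the training set."""
    unique = set(generated)
    if not unique:
        return 0.0
    return len(unique - training) / len(unique)


def summarize(generated: Iterable[str], training: Set[str]) -> Dict[str, float]:
    """Collect both metrics into one dictionary, e.g. for one row of a table."""
    generated = list(generated)
    return {
        "uniqueness": uniqueness(generated),
        "novelty": novelty(generated, training),
    }


if __name__ == "__main__":
    # ASSUMPTION: all strings are canonical SMILES.
    samples = ["CCO", "CCO", "CCN", "c1ccccc1"]
    print(summarize(samples, training={"CCO"}))
```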

Quality Filters

The quality filters require the rd_filters package to run. This can be installed, for instance, with:

     pip install git+https://github.com/PatWalters/rd_filters.git

The rules and alerts that we use come from the GuacaMol [1] supplementary information, which can be found on the publication web page.

Refs

  1. GuacaMol: Benchmarking Models for de Novo Molecular Design. Nathan Brown, Marco Fiscato, Marwin H.S. Segler, and Alain C. Vaucher. Journal of Chemical Information and Modeling 2019, 59 (3), 1096-1108. DOI: 10.1021/acs.jcim.8b00839