ScattBO Benchmark - Bayesian optimisation for materials discovery

A self-driving laboratory (SDL) is an autonomous platform that conducts machine learning (ML) selected experiments to achieve a user-defined objective. An objective can be to synthesise a specific material.[1] Such an SDL will synthesise a material, evaluate if this is the target material and if necessary optimise the synthesis parameters for the next synthesis. One way to evaluate if the material is the target material is by measuring scattering data and comparing that to the scattering pattern of the target material. However, these types of SDLs can be expensive to run, which means that intelligent experimental planning is essential. At the same time, only a few people have access to an SDL for materials discovery. Therefore, it is currently challenging to benchmark Bayesian optimisation algorithms for experimental planning tasks in SDLs.

Here, we present a Python-based benchmark (ScattBO) that is an in silico simulation of an SDL for materials discovery. Based on a set of synthesis parameters, the benchmark ‘synthesises’ a structure, calculates the scattering pattern[2] and compares this to the scattering pattern of the target structure. Note: Scattering data may not be enough to conclusively validate that the target material has been synthesised.[3] The benchmark can include other types of data as long they can be simulated.

Synthesis parameters:

pH (float): The pH value, which scales the size of the structure. Range: [0, 14]
pressure (float): The pressure value, which controls the lattice constant of the structure. Range: [0, 100]
solvent (int): The solvent type, which determines the structure type for large clusters. 0 for 'Ethanol', 1 for 'Methanol', 2 for 'Water', 3 for 'Others'

Features:

Continous parameters (pH and pressure) and discrete parameter (solvent).
Two sizes of benchmark is provided: 1) ScatterBO_small_benchmark and 2) ScatterBO_large_benchmark.
Possibility of two different objectives: 1) The scattering data in Q-space (Sq) or 2) the scattering data in r-space (Gr).
Possibility of multi-objective optimisation using both Sq and Gr.
Possibility of four target scattering patterns: 1) simulated Sq, 2) simulated Gr, 3) experimental Sq, and 4) experimental Gr.
OBS: The scattering pattern calculations can be slow on CPU. It is recommended to use GPU.

Scoreboard

These scoreboards represent the performance of different BO algorithms on various of the ScattBO benchmarks. If you have new scores to report, feel free to contact us.

Small benchmark

Scoreboard for Sq - Simulated

Steps for Convergence¹	Name of Algorithm
25	bayes_opt
51	skopt
51	Dragonfly
4808	Bruteforce in structure space
6400	Bruteforce in synthesis space

Scoreboard for Sq - Experimental

Steps for Convergence³	Name of Algorithm
20	skopt
27	Dragonfly
>200	bayes_opt
4808	Bruteforce in structure space
6400	Bruteforce in synthesis space

Scoreboard for Gr - Simulated

Steps for Convergence²	Name of Algorithm
39	Dragonfly
39	skopt
41	bayes_opt
4808	Bruteforce in structure space
6400	Bruteforce in synthesis space

Scoreboard for Gr - Experimental

Steps for Convergence⁴	Name of Algorithm
16	Dragonfly
18	bayes_opt
38	skopt
4808	Bruteforce in structure space
6400	Bruteforce in synthesis space

Scoreboard for Multi-objective - Simulated

Steps for Convergence⁵	Name of Algorithm
37	bayes_opt (psedu-MOBO)
45	Dragonfly (psedu-MOBO)
45	skopt (psedu-MOBO)
4808	Bruteforce in structure space
6400	Bruteforce in synthesis space

Scoreboard for Multi-objective - Experimental

Steps for Convergence⁵	Name of Algorithm
>200	Dragonfly (psedu-MOBO)
>200	skopt (psedu-MOBO)
>200	bayes_opt (psedu-MOBO)
4808	Bruteforce in structure space
6400	Bruteforce in synthesis space

¹ Steps for Convergence for simulated Sq data is defined as the number of steps until Rwp < 0.04.
² Steps for Convergence for simulated Gr data is defined as the number of steps until Rwp < 0.04.
³ Steps for Convergence for experimental Sq data is defined as the number of steps until Rwp < 0.84.
⁴ Steps for Convergence for experimental Gr data is defined as the number of steps until Rwp < 0.80.
⁵ Steps for Convergence for multi-objective optimisation is defined for both above criteria.

Large benchmark

Scoreboard for Sq - Simulated

Steps for Convergence¹	Name of Algorithm
11	bayes_opt
30	Dragonfly
4808	Bruteforce in structure space
6400	Bruteforce in synthesis space

Scoreboard for Sq - Experimental

Steps for Convergence³	Name of Algorithm
35	Dragonfly
48	bayes_opt
4808	Bruteforce in structure space
6400	Bruteforce in synthesis space

Scoreboard for Gr - Simulated

Steps for Convergence²	Name of Algorithm
36	bayes_opt
114	Dragonfly
4808	Bruteforce in structure space
6400	Bruteforce in synthesis space

Scoreboard for Gr - Experimental

Steps for Convergence⁴	Name of Algorithm
33	Dragonfly
34	bayes_opt
4808	Bruteforce in structure space
6400	Bruteforce in synthesis space

Scoreboard for Multi-objective - Simulated

Steps for Convergence⁵	Name of Algorithm
36	bayes_opt
78	Dragonfly (psedu-MOBO)
4808	Bruteforce in structure space
6400	Bruteforce in synthesis space

Scoreboard for Multi-objective - Experimental

Steps for Convergence⁵	Name of Algorithm
>200	Dragonfly (psedu-MOBO)
>200	bayes_opt (psedu-MOBO)
4808	Bruteforce in structure space
6400	Bruteforce in synthesis space

¹ Steps for Convergence for simulated Sq data is defined as the number of steps until Rwp < 0.04.
² Steps for Convergence for simulated Gr data is defined as the number of steps until Rwp < 0.04.
³ Steps for Convergence for experimental Sq data is defined as the number of steps until Rwp < 0.84.
⁴ Steps for Convergence for experimental Gr data is defined as the number of steps until Rwp < 0.80.
⁵ Steps for Convergence for multi-objective optimisation is defined for both above criteria.

Installation

Ensure that you have PyTorch installed. Follow the instructions on the official PyTorch website to install the appropriate version for your system: PyTorch Installation Guide.

ScattBO is a Python package you can install locally. We recommend creating a new environment (say, with conda), and then installing it:

conda create -n scattbo python=3.10
conda activate scattbo
pip install -e .

This command will install all requirements. Otherwise, you can find them in the requirements.txt file.

Usage

See (https://github.com/AndySAnker/ScattBO/tree/main/tools) for examples of single-objective optimisation with Dragonfly or skopt.

Example usage with Dragonfly

from ase.lattice.cubic import FaceCenteredCubic, SimpleCubic, BodyCenteredCubic
from ase.lattice.hexagonal import HexagonalClosedPacked
from ase.cluster.icosahedron import Icosahedron
from ase.cluster.decahedron import Decahedron
from ase.cluster.octahedron import Octahedron
from debyecalculator import DebyeCalculator
import torch
import numpy as np
import plotly.graph_objects as go
import matplotlib.pyplot as plt
from plotly.subplots import make_subplots
from ScattBO.utils import generate_structure, ScatterBO_small_benchmark, ScatterBO_large_benchmark

from dragonfly import minimise_function

def benchmark_wrapper(params):
    """
    Wrapper function for the ScatterBO_small_benchmark function.

    Parameters:
    params (list): The parameters to pass to the ScatterBO_small_benchmark function.

    simulated_or_experimental (str): If 'simulated', use the filename 'Data/Gr/Target_PDF_small_benchmark.npy'. 
                                     If 'experimental', use the filename 'T2_0p7boro_15hrs_powder.npy'. Default is 'simulated'.
    
    scatteringfunction= (str): The scattering function to use. 'Gr' for pair distribution function, 'Sq' for structure factor. Default is 'Sq'.

    Returns:
    The result of the ScatterBO_small_benchmark function.
    """
    return ScatterBO_small_benchmark(params, plot=False, simulated_or_experimental='simulated', scatteringfunction='Gr')

# Define the domain for each parameter
domain = [
    [2, 12],  # pH values range from 2 to 12
    [15, 80],  # pressure values range from 15 to 80
    [0, 1]  # solvent can be 0 ('Ethanol') or 1 ('Methanol')
]

# Define the maximum capital
max_capital = 10

# Minimize the function
min_val, min_pt, history = minimise_function(benchmark_wrapper, domain, max_capital)

Visualise results

# Print the minimum value and the corresponding parameters
print ("min_val: ", min_val)
print ("min_pt: ", min_pt)

# Use the parameters that minimize the function as input to the generate_structure function
pH, pressure, solvent = min_pt
cluster = generate_structure(pH, pressure, solvent, atom='Au')
print ("Cluster type: ", cluster.structure_type)
print ("Number of atoms: ", len(cluster))

# Use the parameters that minimize the function as input to the ScatterBO_small_benchmark function
ScatterBO_small_benchmark((pH, pressure, solvent), plot=True, simulated_or_experimental='simulated', scatteringfunction='Gr')

Contributing to the software

We welcome contributions to our software! To contribute, please follow these steps:

Fork the repository.
Make your changes in a new branch.
Submit a pull request.

We'll review your changes and merge them if they are appropriate.

Reporting issues

If you encounter any issues or problems with our software, please report them by opening an issue on our GitHub repository. Please include as much detail as possible, including steps to reproduce the issue and any error messages you received.

Seeking support

If you need help using our software, please reach out to us on our GitHub repository. We'll do our best to assist you and answer any questions you have.

References

[1] Szymanski, Nathan J., et al., An autonomous laboratory for the accelerated synthesis of novel materials, Nature, 624(7990), 86-91 (2023)

[2] Frederik L. Johansen & Andy S. Anker, et al., A GPU-Accelerated Open-Source Python Package for Calculating Powder Diffraction, Small-Angle-, and Total Scattering with the Debye Scattering Equation, Journal of Open Source Software, 9(94), 6024 (2024)

[3] Leeman, Josh, et al., Challenges in High-Throughput Inorganic Materials Prediction and Autonomous Synthesis, PRX Energy, 3(1), 011002 (2024)

Name		Name	Last commit message	Last commit date
Latest commit History 84 Commits
Data		Data
presentation		presentation
results		results
src/ScattBO		src/ScattBO
tools		tools
.gitignore		.gitignore
LICENSE		LICENSE
README.md		README.md
pyproject.toml		pyproject.toml
requirements.txt		requirements.txt
setup.cfg		setup.cfg

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

ScattBO Benchmark - Bayesian optimisation for materials discovery

Scoreboard

Small benchmark

Scoreboard for Sq - Simulated

Scoreboard for Sq - Experimental

Scoreboard for Gr - Simulated

Scoreboard for Gr - Experimental

Scoreboard for Multi-objective - Simulated

Scoreboard for Multi-objective - Experimental

Large benchmark

Scoreboard for Sq - Simulated

Scoreboard for Sq - Experimental

Scoreboard for Gr - Simulated

Scoreboard for Gr - Experimental

Scoreboard for Multi-objective - Simulated

Scoreboard for Multi-objective - Experimental

Installation

Usage

Example usage with Dragonfly

Visualise results

Contributing to the software

Reporting issues

Seeking support

References

About

Releases

Packages

Contributors 3

Languages

License

AndySAnker/ScattBO

Folders and files

Latest commit

History

Repository files navigation

ScattBO Benchmark - Bayesian optimisation for materials discovery

Scoreboard

Small benchmark

Scoreboard for Sq - Simulated

Scoreboard for Sq - Experimental

Scoreboard for Gr - Simulated

Scoreboard for Gr - Experimental

Scoreboard for Multi-objective - Simulated

Scoreboard for Multi-objective - Experimental

Large benchmark

Scoreboard for Sq - Simulated

Scoreboard for Sq - Experimental

Scoreboard for Gr - Simulated

Scoreboard for Gr - Experimental

Scoreboard for Multi-objective - Simulated

Scoreboard for Multi-objective - Experimental

Installation

Usage

Example usage with Dragonfly

Visualise results

Contributing to the software

Reporting issues

Seeking support

References

About

Resources

License

Stars

Watchers

Forks

Releases

Packages 0

Contributors 3

Languages

Packages