
GaussianProcessRegression() optimize does not work in a subprocess #645

Open
ioananikova opened this issue Oct 26, 2022 · 5 comments
Labels: bug (Something isn't working)

@ioananikova

Describe the bug
When a pool of processes is used to execute calls (for example with concurrent.futures.ProcessPoolExecutor), the optimize() method of GaussianProcessRegression() hangs and never finishes. More specifically, it hangs inside evaluate_loss_of_model_parameters().

To reproduce
Steps to reproduce the behaviour:

  1. Create a pool of processes
  2. Create a GPR model inside one of the worker processes
  3. Update the model
  4. Try to optimize the model (this is where it hangs)

A minimal reproducible code example is attached below to illustrate the problem (rename from .txt to .py to run); a sketch of the same pattern follows the attachment.
test_concurrent_trieste.txt
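
For reference, here is a minimal sketch of the failing pattern, based on the steps above (the objective and model setup are illustrative, not necessarily identical to the attached file):

# Failing pattern: trieste/TensorFlow are imported at module level in the parent
# process, and the model is then built, updated and optimized in a worker process.
import concurrent.futures

import trieste
from trieste.models.gpflow import GaussianProcessRegression, build_gpr
from trieste.objectives.single_objectives import Branin

def run_in_subprocess(num_initial_points):
    search_space = Branin.search_space
    observer = trieste.objectives.utils.mk_observer(Branin.objective)
    initial_data = observer(search_space.sample_halton(num_initial_points))

    gpflow_model = build_gpr(initial_data, search_space, likelihood_variance=1e-7)
    model = GaussianProcessRegression(gpflow_model)

    model.update(initial_data)
    model.optimize(initial_data)  # hangs here, in evaluate_loss_of_model_parameters()

if __name__ == "__main__":
    with concurrent.futures.ProcessPoolExecutor(max_workers=1) as executor:
        executor.map(run_in_subprocess, [10])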

Expected behaviour
The expected behaviour is that optimize() behaves as it would in the main process (rather than a subprocess). Usually this step takes less than a second to finish.

System information

  • OS: Ubuntu-20.04 (in WSL), on Windows 10
  • Python version: 3.9.9
  • Trieste version: 0.13.0 (installed via pip)
  • TensorFlow version: 2.10.0
  • GPflow version: 2.6.3

Additional context
Even when the import statements are placed inside the subprocess, it fails.

@ioananikova added the bug label on Oct 26, 2022
@uri-granta
Collaborator

Confirmed that this is still broken with the latest version; it appears to be hitting some sort of deadlock.

@uri-granta
Collaborator

This is somehow connected to the use of tf.function compilation. Disabling tracing with tf.config.run_functions_eagerly(True) allows the code example to run (though at the obvious expense of executing everything eagerly each time). Will investigate further.
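
For anyone needing an interim workaround, a minimal sketch (assuming the flag is set at the top of the script, before any models are built or optimized):

import tensorflow as tf

# Disable tf.function tracing so everything runs eagerly; this avoids the hang
# at the cost of slower, eager execution.
tf.config.run_functions_eagerly(True)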

@uri-granta
Collaborator

It's also somehow connected to something trieste or one of its dependent libraries does:

# COMMENTING OUT EITHER import trieste OR @tf.function MAKES THIS PASS!
import concurrent.futures
import tensorflow as tf
import trieste

@tf.function
def say_hi():
    tf.print("hi")

def concurrency_test(n):
    print("I'm going to say hi!")
    say_hi()

if __name__ == "__main__":
    with concurrent.futures.ProcessPoolExecutor(max_workers=1) as executor:
        executor.map(concurrency_test, [10])

@uri-granta
Collaborator

OK, so it looks like this is due to some state initialisation performed by TensorFlow when it is first called. Replacing import trieste with tf.constant(42) or similar in the example above also causes the hang.

The solution is to avoid importing trieste until you're inside the subprocess:

import concurrent.futures

WORKERS = 1

def test_concurrent(num_initial_points):
    from trieste.objectives.single_objectives import Branin
    import trieste
    from trieste.models.gpflow import GaussianProcessRegression, build_gpr
    print(f'num_initial_points: {num_initial_points}')
    branin_obj = Branin.objective
    search_space = Branin.search_space
    observer = trieste.objectives.utils.mk_observer(branin_obj)

    initial_query_points = search_space.sample_halton(num_initial_points)
    initial_data = observer(initial_query_points)
    print('initial data created')

    gpflow_model = build_gpr(initial_data, search_space, likelihood_variance=1e-7)
    model = GaussianProcessRegression(gpflow_model)
    print('model created')

    model.update(initial_data)
    print('model updated')
    model.optimize(initial_data)
    print('model optimized')


if __name__ == "__main__":
    with concurrent.futures.ProcessPoolExecutor(max_workers=WORKERS) as executor:
        executor.map(test_concurrent, [10])
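
A possible alternative sketch (untested here) is to keep the imports at the top level but create the pool with the "spawn" start method, so each worker starts from a fresh interpreter rather than a forked copy of the parent's TensorFlow state; mp_context is a standard concurrent.futures parameter. Replacing the __main__ block above:

import concurrent.futures
import multiprocessing

if __name__ == "__main__":
    # "spawn" starts each worker in a fresh Python interpreter instead of forking
    # the parent, so no partially initialised TensorFlow state is inherited.
    ctx = multiprocessing.get_context("spawn")
    with concurrent.futures.ProcessPoolExecutor(max_workers=WORKERS, mp_context=ctx) as executor:
        executor.map(test_concurrent, [10])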

@uri-granta
Collaborator

I'll see whether we can document this anywhere. Does this solve your issue? (if you can remember back to October 2022!)
