Updating for new libE interface #88

Merged
merged 29 commits into from
Jul 27, 2023
shuds13
Collaborator

@shuds13 shuds13 commented May 10, 2023

Updates optimas to use the latest libEnsemble release.

Highlights

  • Use the new libEnsemble interface for specifying resources (set num_procs and num_gpus in gen_specs["out"]). This is more portable than the previous method and works with both NVIDIA and AMD GPUs.
  • All the output of an optimas Exploration (including the logs) is now stored in the exploration folder. This is done using the new workflow_dir feature of libEnsemble. The simulation folders are located in exploration/evaluations.

Note

It currently needs the develop libE branch.

Do not merge until these fixes have been included in a libEnsemble release:

Before merging:

  • Update libEnsemble line in pyproject.toml

Changes

  • Update to libEnsemble 0.10.2.
  • Use new libEnsemble interface for specifying resources (set num_procs and num_gpus in gen_specs["out"]). This is more portable than the previous method in which we were setting CUDA_VISIBLE_DEVICES.
  • Replace set_env_to_slots with set_env_to_gpus in the generator function.
  • The generator uses resources even when gen_resources is set to 0. To make sure no resources are allocated for the generator, we now set libE_specs['zero_resource_workers'] = [1].
  • All the output of an optimas Exploration (including the logs) is now stored in the exploration folder. This is done using the new workflow_dir feature of libEnsemble. The simulation folders are located in exploration/evaluations.
  • Reduce the number of simulation workers to 2 in the tests so that they can run properly on GitHub Actions (the new libEnsemble will raise an error if we have more workers than CPUs).
  • Add a new _reset_libensemble method to the Exploration that resets some module variables in libEnsemble after the run finishes. This is needed in order to launch several runs from the same script.
  • The paths to the sim_dir_copy_files are now made absolute before passing them to libEnsemble.
  • n_proc renamed to n_procs in the TemplateEvaluator.
  • By default n_gpus=0 (it used to be 1).
  • Increase version to 0.2.0.
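
As a rough illustration of the libEnsemble-facing changes above, here is a minimal sketch of what the resulting specs might look like as plain dictionaries. The helper `build_specs`, its arguments, and the `x` field are hypothetical and not taken from the optimas code; only the keys `num_procs`, `num_gpus`, `zero_resource_workers` and `sim_dir_copy_files` come from the description above.

```python
import os


def build_specs(copy_files, base_dir=None):
    """Hypothetical helper returning (gen_specs, libE_specs) as plain dicts."""
    base_dir = base_dir or os.getcwd()
    gen_specs = {
        "out": [
            ("x", float, (2,)),   # illustrative parameter field (assumption)
            ("num_procs", int),   # processes requested for each evaluation
            ("num_gpus", int),    # GPUs requested for each evaluation
        ],
    }
    libE_specs = {
        # Worker 1 runs the generator and should be given no resources.
        "zero_resource_workers": [1],
        # Paths are made absolute before being handed to libEnsemble.
        "sim_dir_copy_files": [
            os.path.abspath(os.path.join(base_dir, f)) for f in copy_files
        ],
    }
    return gen_specs, libE_specs


gen_specs, libE_specs = build_specs(["template_simulation_script.py"])
```

The generator would then fill the num_procs and num_gpus fields for each point it proposes, instead of setting CUDA_VISIBLE_DEVICES itself.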

From the changes above, these will affect the user:

  • All the output is now stored in the exploration directory, with the simulation folders being in exploration/evaluations.
  • n_proc renamed to n_procs in the TemplateEvaluator. (This does not really have an impact, as n_proc was just a placeholder with no effect until now.)
  • By default n_gpus=0 (it used to be 1).
  • An exploration run will now fail to start if the simulations require more resources than are available (e.g., more workers than CPUs). This comes from the new libEnsemble resource assignment.
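
Since a run now fails to start when more workers are requested than CPUs are available, a user script could check the worker count before launching. The sketch below is only an assumption about how such a pre-flight check might look; `check_worker_count` is a hypothetical helper and is not the check that libEnsemble itself performs.

```python
import os


def check_worker_count(n_sim_workers, n_cpus=None):
    """Raise ValueError if more workers are requested than CPUs are available.

    Hypothetical pre-flight check; returns the CPU count that was used.
    """
    if n_cpus is None:
        # os.cpu_count() may return None on some platforms; fall back to 1.
        n_cpus = os.cpu_count() or 1
    if n_sim_workers > n_cpus:
        raise ValueError(
            f"Requested {n_sim_workers} simulation workers, "
            f"but only {n_cpus} CPUs are available."
        )
    return n_cpus


# Two workers, as used in the tests on GitHub Actions (CPU count given explicitly).
check_worker_count(2, n_cpus=2)
```

A check of this kind only compares workers against CPUs; GPU availability would need a separate, platform-specific query.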

@AngelFP AngelFP marked this pull request as ready for review June 15, 2023 14:06
@AngelFP AngelFP merged commit 808051e into main Jul 27, 2023
2 checks passed
@AngelFP AngelFP deleted the feature/libE_1.10.0 branch July 27, 2023 07:46