Enable progress bar under pmap #712

Merged: 6 commits, Aug 7, 2024

andrewdipper
Contributor

Enables proper progress bar behavior of progress_bar_scan under pmap. However the progress bars are not tied to physical devices since io_callback won't expose the device id. The bars behave as if they are physically tied but are always sorted by most to least complete (they're actually randomly shuffled but one bar will always be first and another last).

I haven't visually tested this under a multi-GPU setup as I don't have an identical pair of GPUs - I have only tested with JAX_PLATFORMS='cpu'.

#655 - Related: changes progress bar to tqdm and requires the user to label devices - a different approach

A few important guidelines and requirements before we can merge your PR:

  • We should be able to understand what the PR does from its title only;
  • There is a high-level description of the changes;
  • There are links to all the relevant issues, discussions and PRs;
  • The branch is rebased on the latest main commit;
  • Commit messages follow these guidelines;
  • The code respects the current naming conventions;
  • Docstrings follow the numpy style guide;
  • pre-commit is installed and configured on your machine, and you ran it before opening the PR;
  • There are tests covering the changes;
  • The doc is up-to-date.

@junpenglao
Member

Thanks! I like this better than #655 because it does not require an explicit num_chain. @zaxtax thoughts?

@zaxtax
Contributor

zaxtax commented Aug 4, 2024 via email

@andrewdipper
Contributor Author

The initial random shuffle is a bit wonky. I believe it comes from fastprogress - the bars should always be in sorted order if we used the change from fastprogress to tqdm. That's based on similar tests with numpyro's progressbar and tqdm - but I'll verify it. Would that be better?

@junpenglao
Member

> The initial random shuffle is a bit wonky. I believe it comes from fastprogress - the bars should always be in sorted order if we used the change from fastprogress to tqdm. That's based on similar tests with numpyro's progressbar and tqdm - but I'll verify it. Would that be better?

Yes - if that works, how about I merge this PR and then @zaxtax you can update #655?

@andrewdipper
Contributor Author

I had it wrong - the randomization was a race condition. I added a lock and now the bars are always ordered from most to least complete.
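A minimal sketch of the kind of race a lock removes, written in plain Python rather than JAX. The `ProgressState` class, its `create_bar` method, and `create_bars_concurrently` are hypothetical names for illustration, not blackjax code; the threads stand in for host callbacks arriving from several devices at once.

```python
import threading


class ProgressState:
    """Hypothetical stand-in for the shared state behind the progress bars."""

    def __init__(self):
        self.bars = []  # one step counter per bar
        self._lock = threading.Lock()

    def create_bar(self):
        # Without the lock, two callbacks could both read len(self.bars)
        # before either appends, and so claim the same slot.
        with self._lock:
            slot = len(self.bars)
            self.bars.append(0)
            return slot


def create_bars_concurrently(num_threads=8):
    """Create one bar per thread and return the claimed slots, sorted."""
    state = ProgressState()
    slots = []

    def worker():
        slots.append(state.create_bar())

    threads = [threading.Thread(target=worker) for _ in range(num_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return sorted(slots), state.bars
```

With the read and the append inside one critical section, every callback gets a distinct slot regardless of arrival order.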

@zaxtax
Contributor

zaxtax commented Aug 5, 2024 via email

@andrewdipper
Contributor Author

No, the order will always be in decreasing order of chain progress. As such the chain to progress bar mapping will change if one overtakes another. We'd need the equivalent of a chain/device id to avoid this - but from what I've seen so far the new callbacks don't support identifying the underlying device.
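To illustrate why the mapping shifts, here is a rough host-side sketch in plain Python of how progress alone can stand in for an id. It is an assumption about the general idea, not the PR's code: `AnonymousBars` and `update` are hypothetical names, and the callback is assumed to receive only the 1-based iteration number.

```python
import threading


class AnonymousBars:
    """Hypothetical bars updated by callbacks that carry no chain id."""

    def __init__(self, num_chains):
        self.counts = [0] * num_chains
        self._lock = threading.Lock()

    def update(self, iter_num):
        """Record that some chain finished step `iter_num`; return the bar used."""
        with self._lock:
            # Any bar sitting at iter_num - 1 could belong to the caller,
            # so take the first one. This keeps counts in decreasing order.
            slot = self.counts.index(iter_num - 1)
            self.counts[slot] = iter_num
            return slot


bars = AnonymousBars(num_chains=2)
slots = [
    bars.update(1),  # chain A finishes step 1
    bars.update(1),  # chain B finishes step 1
    bars.update(2),  # chain B finishes step 2, overtaking A
    bars.update(3),  # chain B finishes step 3
    bars.update(2),  # chain A finishes step 2
]
```

In this run chain B starts on the second bar, but once it overtakes chain A its updates land on the first bar, which is the remapping described above.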

@junpenglao
Member

> No, the order will always be in decreasing order of chain progress. As such the chain to progress bar mapping will change if one overtakes another. We'd need the equivalent of a chain/device id to avoid this - but from what I've seen so far the new callbacks don't support identifying the underlying device.

So IIUC, we cannot have an identifier like Chain 0, Chain 1, ...? I am not too bothered by the sorting of the chains, but it would be great if there were some identifier, so users are not surprised by the behavior.

@zaxtax
Contributor

zaxtax commented Aug 6, 2024 via email

@junpenglao
Member

> In my PR, I handle this by making a chain id that's carried along with the sample iteration. It's a bit inelegant but straightforward and doesn't require special machinery from the io callback code

Yeah, but then it requires an explicit user input of chain_id - I would rather not push this handling to the user, hence my preference for the current solution here.

@zaxtax
Contributor

zaxtax commented Aug 6, 2024 via email

@andrewdipper
Contributor Author

> So IIUC, we cannot have an identifier like Chain 0, Chain 1, ...? I am not too bothered by the sorting of the chains, but it would be great if there were some identifier, so users are not surprised by the behavior.

Correct. Progress is being used as a surrogate id.

> We could make a helper that hides that plumbing?

If I'm not mistaken I think this helper has to reside outside the pmap and wouldn't be completely transparent. But #655 is more straightforward from a handling perspective.

It might be possible to use the io_callback return value to set a chain id to be used in future io_callbacks. As a fallback, hashing the initial (input, carry) pair, or the whole sequence of them, could resolve into a unique id. I'm not sure how friendly JAX is to those solutions but I'll take a look.

@andrewdipper
Contributor Author

The latest commit uses the return value from the io_callback to uniquely identify the chains. The map from chains to progress bars won't change.

However, this changes the carry used in progress_bar_scan which is a problem if it is used externally. That'd have to be a slow change. But it does avoid forcing the user to do the device identification with part of the input. A wrapper over scan itself might clean things up a bit.
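The idea of letting the callback's return value identify the chain can be sketched on the host side in plain Python. This is a simulation under stated assumptions, not the PR's JAX code: `LabelledBars`, `update`, and the `-1` "not yet registered" sentinel are hypothetical, and the `chain_id` variables play the role of the extra value threaded through the scan carry.

```python
import itertools
import threading


class LabelledBars:
    """Hypothetical bars keyed by an id that the callback itself hands out."""

    def __init__(self):
        self.counts = {}
        self._next_id = itertools.count()
        self._lock = threading.Lock()

    def update(self, iter_num, chain_id):
        """Host callback: returns the chain id to carry into the next step."""
        with self._lock:
            if chain_id < 0:  # assumed sentinel: chain has no bar yet
                chain_id = next(self._next_id)
            self.counts[chain_id] = iter_num
            return chain_id


bars = LabelledBars()
a = bars.update(1, -1)  # chain A registers and receives an id
b = bars.update(1, -1)  # chain B registers and receives a different id
b = bars.update(2, b)   # chain B overtakes A but keeps its own bar
b = bars.update(3, b)
a = bars.update(2, a)
```

Because each chain passes its id back on every call, overtaking no longer moves a chain to a different bar; the cost is the extra value in the carry noted above.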

@zaxtax
Contributor

zaxtax commented Aug 7, 2024

@junpenglao I'm now ok with the PR as is and definitely think we should explore a wrapper over scan in the future.

@junpenglao
Member

Great, thank you @andrewdipper for the contribution!

@junpenglao merged commit 27dfc9e into blackjax-devs:main Aug 7, 2024
5 checks passed
@zaxtax
Contributor

zaxtax commented Aug 7, 2024

Oh should we update any examples to show how the new functionality works @andrewdipper ? Or does this not change the API?

@andrewdipper
Contributor Author

I didn't see any examples using progress_bar_scan and only the two uses within blackjax so we should be good.

@andrewdipper deleted the progbar branch August 8, 2024 16:48
@andrewdipper mentioned this pull request Aug 8, 2024
10 tasks
@zaxtax
Contributor

zaxtax commented Aug 8, 2024 via email

AdrienCorenflos added a commit to AdrienCorenflos/blackjax that referenced this pull request Aug 14, 2024
* Enable progress bar under pmap (blackjax-devs#712)

* enable pmap progbar

* fix bar creation

* add locking

* fix formatting

* switch to using chain state
