Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fixing unreasonable memory usage when creating an InferenceData object from an emcee sampler #2215

Merged
merged 4 commits into from
Feb 21, 2023

Conversation

acpetit
Copy link
Contributor

@acpetit acpetit commented Feb 21, 2023

Description

The goal of this change is to prevent large memory consumption during the creation of an InferenceData object from emcee. The change also improves efficiency.

Previously, the full chain was initialized once for each of the variables, leading to long initialization time and a spike in memory usage (my 2.5Gb chain with ~30 dimensions led to 60+Gb of allocated memory, crashing the python process).

Checklist

  • Follows official PR format
  • New features are properly documented (with an example if appropriate)?
  • Code style correct (follows pylint and black guidelines)
  • Changes are listed in changelog

📚 Documentation preview 📚: https://arviz--2215.org.readthedocs.build/en/2215/

The goal of this chang is to prevent large memory consumption
during the creation of an InferenceData object from emcee.
The change also improves efficiency.
Copy link
Member

@OriolAbril OriolAbril left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

left two minor comments about using from_emcee to keep the changelog user friendly (hopefully) as users don't need to know about io_emcee, also rename chain to samples_ary as that array contains samples from all chains.

Approving already though, that was a very good catch, thanks!

arviz/data/io_emcee.py Outdated Show resolved Hide resolved
arviz/data/io_emcee.py Outdated Show resolved Hide resolved
arviz/data/io_emcee.py Outdated Show resolved Hide resolved
arviz/data/io_emcee.py Outdated Show resolved Hide resolved
CHANGELOG.md Outdated Show resolved Hide resolved
@codecov
Copy link

codecov bot commented Feb 21, 2023

Codecov Report

Merging #2215 (03e6552) into main (18fa157) will increase coverage by 0.00%.
The diff coverage is 100.00%.

@@           Coverage Diff           @@
##             main    #2215   +/-   ##
=======================================
  Coverage   89.89%   89.90%           
=======================================
  Files         120      120           
  Lines       12423    12426    +3     
=======================================
+ Hits        11168    11171    +3     
  Misses       1255     1255           
Impacted Files Coverage Δ
arviz/data/io_emcee.py 98.33% <100.00%> (+0.04%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us. Have a feature suggestion? Share it here.

@acpetit
Copy link
Contributor Author

acpetit commented Feb 21, 2023

Thanks ! :)

@OriolAbril OriolAbril merged commit 200922e into arviz-devs:main Feb 21, 2023
bmwiedemann pushed a commit to bmwiedemann/openSUSE that referenced this pull request Mar 22, 2023
https://build.opensuse.org/request/show/1073712
by user dgarcia + dimstar_suse
- skip python 3.11 and python 3.8 because some dependencies doesn't
  support that python versions
- Update to 0.15.1
  - Fix memory usage and improve efficiency in `from_emcee`
    ([2215](arviz-devs/arviz#2215))
  - Lower pandas version needed
    ([2217](arviz-devs/arviz#2217))
- 0.15.0
  - Adds Savage-Dickey density ratio plot for Bayes factor
    approximation.
    ([2037](arviz-devs/arviz#2037),
    [2152](arviz-devs/arviz#2152))
  - Add `CmdStanPySamplingWrapper` and `PyMCSamplingWrapper` classes
    ([2158](arviz-devs/arviz#2158))
  - Changed dependency on netcdf4-python to h5netcdf
    ([2122](arviz-devs/arviz#2122))
  - Fix `reloo` outdate
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants