The problem is that the not-in-place array copying that happens in `mcmc.run` after the actual sampling may raise an out-of-memory exception even though the sampling itself succeeded. First of all, it would be nice if this could be avoided and the arrays could be transferred to the CPU before any not-in-place operations.

More generally, GPU memory usage can be controlled by sampling sequentially using `post_warmup_state` and transferring each batch of samples to the CPU before running the next one. However, this doesn't work as expected: subsequent batches require more memory than the first one (see the output of the code below).
```python
mcmc_samples = [None] * (n_samples // 1000)
# set up MCMC
self.mcmc = MCMC(kernel, num_warmup=n_warmup, num_samples=1000, num_chains=n_chains)
for i in range(n_samples // 1000):
    print(f"Batch {i + 1}")
    # run MCMC for 1000 samples
    self.mcmc.run(jax.random.PRNGKey(0), self.spliced, self.unspliced)
    # store samples transferred to CPU
    mcmc_samples[i] = jax.device_put(self.mcmc.get_samples(), jax.devices("cpu")[0])
    # reset the mcmc before running the next batch
    self.mcmc.post_warmup_state = self.mcmc.last_state
```
The code above produces the following output:
```
Running MCMC in batches of 1000 samples, 2 batches in total.
First batch will include 1000 warmup samples.
Batch 1
sample: 100%|██████████| 2000/2000 [11:18<00:00, 2.95it/s, 1023 steps of size 5.13e-06. acc. prob=0.85]
Batch 2
sample: 100%|██████████| 1000/1000 [05:48<00:00, 2.87it/s, 1023 steps of size 5.13e-06. acc. prob=0.85]
2023-11-24 14:43:23.854505: W external/tsl/tsl/framework/bfc_allocator.cc:485] Allocator (GPU_0_bfc) ran out of memory trying to allocate 2.56GiB (rounded to 2750440192) requested by op
```
To summarise:

1. Could the not-in-place operations at the end of sampling optionally be performed on the CPU (or the arrays be transferred there first)?
2. How should one sample sequentially so that memory usage does not grow in the process?
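As a possible workaround for the second point, each batch could be copied into host NumPy arrays instead of CPU-backed JAX arrays, blocking until the batch is fully materialised so no pending computation keeps device buffers alive. This is only a sketch under those assumptions; the helper name `to_host` is illustrative and not part of the numpyro API:

```python
import numpy as np
import jax

def to_host(pytree):
    """Copy every device array in a pytree into a host NumPy array."""
    # block until each array is materialised, so no async computation
    # holds on to device buffers while we copy
    pytree = jax.tree_util.tree_map(lambda x: x.block_until_ready(), pytree)
    # np.asarray pulls the data off the device into ordinary host memory
    return jax.tree_util.tree_map(np.asarray, pytree)

# toy stand-in for mcmc.get_samples(): a dict of device arrays
device_samples = {"theta": jax.numpy.ones((1000, 3))}
host_samples = to_host(device_samples)
```

In the loop above, `mcmc_samples[i] = to_host(self.mcmc.get_samples())` would replace the `jax.device_put` call, and concatenating the batches afterwards with `np.concatenate` would then happen entirely on the CPU.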
I'm opening this issue following the discussion on the forum: https://forum.pyro.ai/t/reducing-mcmc-memory-usage/5639/6.