You signed in with another tab or window. Reload to refresh your session.You signed out in another tab or window. Reload to refresh your session.You switched accounts on another tab or window. Reload to refresh your session.Dismiss alert
when running your Shapes-2D build/train functions: python train.py --dataset data/shapes_train.h5 --encoder small --name shapes
Specifically, the error says: RuntimeError: DataLoader worker is killed by signal: Killed. It seems to be coming from the fact that the dataloader is having trouble multi-processing the training set loading. But when I look through your utils files, I am not seeing why this error would exist.
This same error has occurred on both a windows machine as well as a linux machine.
Here's the full trace:
Traceback (most recent call last):
File "/home/aadharna/miniconda3/envs/cswm/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 724, in _try_get_data
data = self.data_queue.get(timeout=timeout)
File "/home/aadharna/miniconda3/envs/cswm/lib/python3.7/multiprocessing/queues.py", line 104, in get
if not self._poll(timeout):
File "/home/aadharna/miniconda3/envs/cswm/lib/python3.7/multiprocessing/connection.py", line 257, in poll
return self._poll(timeout)
File "/home/aadharna/miniconda3/envs/cswm/lib/python3.7/multiprocessing/connection.py", line 414, in _poll
r = wait([self], timeout)
File "/home/aadharna/miniconda3/envs/cswm/lib/python3.7/multiprocessing/connection.py", line 921, in wait
ready = selector.select(timeout)
File "/home/aadharna/miniconda3/envs/cswm/lib/python3.7/selectors.py", line 415, in select
fd_event_list = self._selector.poll(timeout)
File "/home/aadharna/miniconda3/envs/cswm/lib/python3.7/site-packages/torch/utils/data/_utils/signal_handling.py", line 66, in handler
_error_if_any_worker_fails()
RuntimeError: DataLoader worker (pid 19014) is killed by signal: Killed.
During handling of the above exception, another exception occurred:
Traceback (most recent call last):
File "train.py", line 104, in <module>
obs = train_loader.__iter__().next()[0]
File "/home/aadharna/miniconda3/envs/cswm/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 804, in __next__
idx, data = self._get_data()
File "/home/aadharna/miniconda3/envs/cswm/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 771, in _get_data
success, data = self._try_get_data()
File "/home/aadharna/miniconda3/envs/cswm/lib/python3.7/site-packages/torch/utils/data/dataloader.py", line 737, in _try_get_data
raise RuntimeError('DataLoader worker (pid(s) {}) exited unexpectedly'.format(pids_str))
RuntimeError: DataLoader worker (pid(s) 19012, 19014) exited unexpectedly
(cswm) aadharna@penguin:~/PycharmProjects/c-swm$
I'll continue poking around and update when I find the root cause.
The text was updated successfully, but these errors were encountered:
When I try to run the code provided here, I end up hitting a RuntimeError here:
c-swm/train.py
Line 104 in e944b24
python train.py --dataset data/shapes_train.h5 --encoder small --name shapes
Specifically, the error says:
RuntimeError: DataLoader worker is killed by signal: Killed
. It seems to be coming from the fact that the dataloader is having trouble multi-processing the training set loading. But when I look through your utils files, I am not seeing why this error would exist.This same error has occurred on both a windows machine as well as a linux machine.
Here's the full trace:
I'll continue poking around and update when I find the root cause.
The text was updated successfully, but these errors were encountered: