Hi, when I run bash scripts/dist_train.sh 1 --cfg_file cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml, the error below appears. I would really appreciate any help!
python -m torch.distributed.launch --nproc_per_node=1 train.py --launcher pytorch --cfg_file cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml
/root/miniconda3/lib/python3.8/site-packages/torch/distributed/launch.py:178: FutureWarning: The module torch.distributed.launch is deprecated
and will be removed in future. Use torchrun.
Note that --use_env is set by default in torchrun.
If your script expects --local_rank argument to be set, please
change it to read from os.environ['LOCAL_RANK'] instead. See https://pytorch.org/docs/stable/distributed.html#launch-utility for
further instructions
warnings.warn(
Traceback (most recent call last):
File "train.py", line 7, in <module>
from test import repeat_eval_ckpt
File "/root/VoxelNeXt/tools/test.py", line 14, in <module>
from eval_utils import eval_utils
File "/root/VoxelNeXt/tools/eval_utils/eval_utils.py", line 8, in <module>
from pcdet.models import load_data_to_gpu
File "/root/VoxelNeXt/tools/../pcdet/__init__.py", line 4, in <module>
from .version import __version__
ModuleNotFoundError: No module named 'pcdet.version'
ERROR:torch.distributed.elastic.multiprocessing.api:failed (exitcode: 1) local_rank: 0 (pid: 31174) of binary: /root/miniconda3/bin/python
Traceback (most recent call last):
File "/root/miniconda3/lib/python3.8/runpy.py", line 194, in _run_module_as_main
return _run_code(code, main_globals, None,
File "/root/miniconda3/lib/python3.8/runpy.py", line 87, in _run_code
exec(code, run_globals)
File "/root/miniconda3/lib/python3.8/site-packages/torch/distributed/launch.py", line 193, in <module>
main()
File "/root/miniconda3/lib/python3.8/site-packages/torch/distributed/launch.py", line 189, in main
launch(args)
File "/root/miniconda3/lib/python3.8/site-packages/torch/distributed/launch.py", line 174, in launch
run(args)
File "/root/miniconda3/lib/python3.8/site-packages/torch/distributed/run.py", line 710, in run
elastic_launch(
File "/root/miniconda3/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 131, in __call__
return launch_agent(self._config, self._entrypoint, list(args))
File "/root/miniconda3/lib/python3.8/site-packages/torch/distributed/launcher/api.py", line 259, in launch_agent
raise ChildFailedError(
torch.distributed.elastic.multiprocessing.errors.ChildFailedError:
============================================================
train.py FAILED
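The FutureWarning at the top of the log is separate from the failure, but it describes a migration: under torchrun the local rank is delivered through the LOCAL_RANK environment variable rather than a --local_rank argument. A minimal sketch of accepting both, assuming a hypothetical helper named resolve_local_rank (it is not taken from train.py):

```python
import argparse
import os


def resolve_local_rank(argv=None, environ=None):
    """Return the local rank, preferring LOCAL_RANK from the environment.

    Hypothetical helper for illustration; not part of train.py.
    """
    environ = os.environ if environ is None else environ
    parser = argparse.ArgumentParser()
    # The older torch.distributed.launch passes --local_rank on the command line.
    parser.add_argument("--local_rank", type=int, default=0)
    # Ignore every other argument (for example --cfg_file) in this sketch.
    args, _unknown = parser.parse_known_args(argv)
    # torchrun sets LOCAL_RANK in the environment instead.
    if "LOCAL_RANK" in environ:
        return int(environ["LOCAL_RANK"])
    return args.local_rank


if __name__ == "__main__":
    print(resolve_local_rank())
```

The environment variable takes precedence here so the same script works under both launchers; that ordering is a design choice of this sketch.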
Hello, I also want to reproduce VoxelNeXt, but I don't know how to verify that the configuration is correct, where to put the datasets, or whether nuScenes mini can be used for testing. Any help would be appreciated, thanks!
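One partial check of the setup is whether the modules named in the traceback can be located at all. The sketch below uses the standard library's importlib.util.find_spec; the helper name missing_modules is hypothetical. If pcdet.version is reported missing, a likely cause (an assumption, based on how OpenPCDet-style repositories usually work) is that the repository's setup script, for example python setup.py develop, has not been run from the repository root, since that step normally generates pcdet/version.py.

```python
import importlib.util


def missing_modules(names):
    """Return the dotted module names from `names` that cannot be located."""
    missing = []
    for name in names:
        try:
            spec = importlib.util.find_spec(name)
        except ImportError:
            # Raised when a parent package is absent or fails while importing,
            # e.g. pcdet/__init__.py importing a version module that is not there.
            spec = None
        if spec is None:
            missing.append(name)
    return missing


if __name__ == "__main__":
    # "pcdet.version" is the module the traceback reports as missing.
    print(missing_modules(["pcdet", "pcdet.version"]))
```

An empty list only shows that the modules can be found; it does not verify the datasets or the training configuration.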
For reference, the shell trace printed by scripts/dist_train.sh for the same command, up to the launch of train.py:
NGPUS=1
PY_ARGS='--cfg_file cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml'
true
PORT=13654
++ nc -z 127.0.0.1 13654
++ echo 127
status=127
'[' 127 '!=' 0 ']'
break
echo 13654
13654
python -m torch.distributed.launch --nproc_per_node=1 train.py --launcher pytorch --cfg_file cfgs/nuscenes_models/cbgs_voxel0075_voxelnext.yaml