
supNMT pre-train problem with multi gpus #177

Open
Andrewlesson opened this issue Sep 8, 2021 · 1 comment

Comments


Andrewlesson commented Sep 8, 2021

The pre-train script from sup-nmt only runs on a single GPU. When I use multiple GPUs to pre-train supNMT, I get the error below. Has anyone encountered the same situation?

Traceback (most recent call last):
  File "/search/odin/txguo/anaconda3/envs/mass/bin/fairseq-train", line 8, in <module>
    sys.exit(cli_main())
  File "/search/odin/txguo/anaconda3/envs/mass/lib/python3.6/site-packages/fairseq_cli/train.py", line 298, in cli_main
    nprocs=args.distributed_world_size,
  File "/search/odin/txguo/anaconda3/envs/mass/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 167, in spawn
    while not spawn_context.join():
  File "/search/odin/txguo/anaconda3/envs/mass/lib/python3.6/site-packages/torch/multiprocessing/spawn.py", line 103, in join
    (error_index, name)
Exception: process 0 terminated with signal SIGKILL
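
"process 0 terminated with signal SIGKILL" usually means the worker was killed from outside fairseq, most often by the Linux out-of-memory killer when every spawned GPU worker loads the full dataset and model into host RAM. Below is a minimal sketch of one way to lower the per-worker memory footprint; $DATA_DIR and $SUPNMT_ARGS are placeholders for the data directory and the MASS-specific arguments already used in the single-GPU script, not the exact options from the repository.

# Sketch only: smaller per-GPU batch plus gradient accumulation, so the
# effective batch size stays roughly the same while each worker needs less memory.
# $DATA_DIR and $SUPNMT_ARGS stand for the existing script's arguments.
CUDA_VISIBLE_DEVICES=0,1,2,3 \
fairseq-train $DATA_DIR $SUPNMT_ARGS \
    --distributed-world-size 4 \
    --max-tokens 2048 \
    --update-freq 2

If the job is still killed, checking dmesg or the system log for oom-killer entries can confirm whether host memory is actually the cause.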


jiaohuix commented Jul 8, 2022

How do you run the pre-training with multiple GPUs?
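
A hedged sketch for this question, assuming fairseq's default behaviour of launching one worker per visible GPU: keep the existing supNMT pre-training command and expose more devices, optionally setting the world size explicitly. $DATA_DIR and $SUPNMT_ARGS again stand in for the script's existing arguments.

# Same pre-training command as the single-GPU script, just run across 4 GPUs.
CUDA_VISIBLE_DEVICES=0,1,2,3 \
fairseq-train $DATA_DIR $SUPNMT_ARGS --distributed-world-size 4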
