
Evaluation on multiple GPUs #73

Open
ezhang7423 opened this issue Sep 22, 2024 · 1 comment

Comments

@ezhang7423

ezhang7423 commented Sep 22, 2024

Hi there! I am trying to run evaluation with multiple GPUs. I have run everything in ./scripts/minimal_example.sh. Running on 1 GPU works perfectly; however, when I pass in more than 1 GPU, all Hugging Face Generator workers still initialize only on the first GPU, leaving the rest unutilized. Unfortunately, --use-vlm does not appear to be working either, as an error is thrown that the T5 model is not supported. Any support would be greatly appreciated.
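For reference, a sketch of the kind of multi-GPU invocation in question, assuming the evaluate.py flags shown in the reply below (the data path, checkpoint, device list, and worker/GPU counts are illustrative placeholders, not a verified command):

```bash
# Hypothetical multi-GPU run: flags mirror the maintainer's command below;
# paths, checkpoint, and counts are placeholders for illustration.
# CUDA_VISIBLE_DEVICES restricts which physical GPUs the workers can see.
CUDA_VISIBLE_DEVICES=0,1 python prover/evaluate.py \
  --data-path data/leandojo_benchmark_4/random/ \
  --gen_ckpt_path kaiyuy/leandojo-lean4-tacgen-byt5-small \
  --split test \
  --num-workers 8 \
  --num-gpus 2
```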

@yangky11
Member

Hi,

I'm not able to reproduce the problem. I tried running `python prover/evaluate.py --data-path data/leandojo_benchmark_4/random/ --gen_ckpt_path kaiyuy/leandojo-lean4-tacgen-byt5-small --split test --num-workers 40 --num-gpus 8` and got the GPU utilization below. It seems all GPUs are running the evaluation.

[Screenshot: GPU utilization showing all 8 GPUs in use during evaluation]
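As a quick way to confirm the same thing locally, watching nvidia-smi while the evaluation runs shows whether every visible GPU picks up a worker (this is the standard NVIDIA tool, not part of this repo; the refresh interval is arbitrary):

```bash
# Refresh GPU utilization and memory every 2 seconds during evaluation;
# each GPU should show non-zero memory usage once the generator workers start.
watch -n 2 nvidia-smi
```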
