4bit-QLora + Qwen2 72b + 16k cutoff_len #5798

lmc8133 · 2024-10-23T14:34:00Z

How many GPUs are needed to finetune? I have tried 16 PPUs (96GB each) but got CUDA OUT OF MEMORY.

hiyouga · 2024-10-23T15:37:57Z

try --enable_liger_kernel and --use_unsloth_gc

lmc8133 · 2024-10-24T02:48:52Z

> try --enable_liger_kernel and --use_unsloth_gc

--use_unsloth_gc or --use_unsloth?

hiyouga · 2024-10-24T04:12:36Z

use_unsloth_gc
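
A minimal sketch of how the two suggested flags might be added to a training command. Only `--enable_liger_kernel` and `--use_unsloth_gc` come from this thread; the helper name `build_train_command`, the `llamafactory-cli train` entry point, and the other option names and values (4-bit quantization, 16k read as 16384 tokens) are assumptions for illustration, not a tested configuration.

```python
import shlex


def build_train_command(options):
    """Turn a dict of options into a CLI argument list (hypothetical helper)."""
    args = ["llamafactory-cli", "train"]  # assumed entry point
    for name, value in options.items():
        if value is True:
            # Boolean switches are passed as bare flags.
            args.append(f"--{name}")
        elif value is False or value is None:
            # Disabled or unset options are left out entirely.
            continue
        else:
            args.extend([f"--{name}", str(value)])
    return args


options = {
    "quantization_bit": 4,        # assumed from "4bit-QLora" in the title
    "cutoff_len": 16384,          # assumed reading of "16k cutoff_len"
    "enable_liger_kernel": True,  # suggested in this thread
    "use_unsloth_gc": True,       # suggested in this thread (not use_unsloth)
}

command = build_train_command(options)
print(shlex.join(command))
```

The model path, dataset, and LoRA settings are left out on purpose; they would be added to `options` in the same way.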

lmc8133 · 2024-10-25T03:57:08Z

> use_unsloth_gc

Thanks.

BTW, I encountered an error when using --enable_liger_kernel: Triton Error [CUDA]: device kernel image is invalid.

Here is some package info:
triton==3.1.0
transformers==4.44.2
torch==2.3.0
CUDA SDK == 12.3.2
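
For reference, a version list like the one above can be collected with a short script. This is only a sketch using the standard library; the helper name `installed_versions` is hypothetical, and it reports Python package versions only, not the CUDA SDK version.

```python
from importlib import metadata


def installed_versions(packages):
    """Return {package: version or None} for each requested package (hypothetical helper)."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # Package is not installed in the current environment.
            versions[name] = None
    return versions


report = installed_versions(["triton", "transformers", "torch"])
for name, version in report.items():
    print(f"{name}=={version}" if version else f"{name}: not installed")
```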

Any suggestions?

github-actions bot added the "pending" label (This problem is yet to be addressed) on Oct 23, 2024