Describe the bug
I am trying to SFT fine-tune the model llava-onevision-qwen2-0_5b-ov using the following command, and I received the error ImportError: cannot import name 'LlavaOnevisionForConditionalGeneration' from 'transformers'. The error message along with the complete stack trace is posted below.
(swift) m.banerjee@PHYVDGPU02PRMV:/VDIL_COREML/m.banerjee/ms-swift$ CUDA_VISIBLE_DEVICES=0,1,2,3,5 \
swift sft \
--model_type llava-onevision-qwen2-0_5b-ov \
--dataset rlaif-v#1000 \
--dataset_test_ratio 0.1 \
--num_train_epochs 5 \
--output_dir output
run sh: `/VDIL_COREML/m.banerjee/anaconda3/envs/swift/bin/python /VDIL_COREML/m.banerjee/ms-swift/swift/cli/sft.py --model_type llava-onevision-qwen2-0_5b-ov --dataset rlaif-v#1000 --dataset_test_ratio 0.1 --num_train_epochs 5 --output_dir output`
[INFO:swift] Successfully registered `/VDIL_COREML/m.banerjee/ms-swift/swift/llm/data/dataset_info.json`
[INFO:swift] No vLLM installed, if you are using vLLM, you will get `ImportError: cannot import name 'get_vllm_engine' from 'swift.llm'`
[INFO:swift] No LMDeploy installed, if you are using LMDeploy, you will get `ImportError: cannot import name 'prepare_lmdeploy_engine_template' from 'swift.llm'`
[INFO:swift] Start time of running main: 2024-08-31 01:32:33.881287
[INFO:swift] Setting template_type: llava-onevision-qwen
[INFO:swift] Setting args.lazy_tokenize: True
[INFO:swift] Setting args.dataloader_num_workers: 1
[INFO:swift] output_dir: /VDIL_COREML/m.banerjee/ms-swift/output/llava-onevision-qwen2-0_5b-ov/v0-20240831-013234
[INFO:swift] args: SftArguments(model_type='llava-onevision-qwen2-0_5b-ov', model_id_or_path='AI-ModelScope/llava-onevision-qwen2-0.5b-ov-hf', model_revision='master', full_determinism=False, sft_type='lora', freeze_parameters=0.0, additional_trainable_parameters=[], tuner_backend='peft', template_type='llava-onevision-qwen', output_dir='/VDIL_COREML/m.banerjee/ms-swift/output/llava-onevision-qwen2-0_5b-ov/v0-20240831-013234', add_output_dir_suffix=True, ddp_backend=None, ddp_find_unused_parameters=None, ddp_broadcast_buffers=None, seed=42, resume_from_checkpoint=None, resume_only_model=False, ignore_data_skip=False, dtype='bf16', packing=False, train_backend='transformers', tp=1, pp=1, min_lr=None, sequence_parallel=False, dataset=['rlaif-v#1000'], val_dataset=[], dataset_seed=42, dataset_test_ratio=0.1, use_loss_scale=False, loss_scale_config_path='/VDIL_COREML/m.banerjee/ms-swift/swift/llm/agent/default_loss_scale_config.json', system=None, tools_prompt='react_en', max_length=2048, truncation_strategy='delete', check_dataset_strategy='none', streaming=False, streaming_val_size=0, streaming_buffer_size=16384, model_name=[None, None], model_author=[None, None], quant_method=None, quantization_bit=0, hqq_axis=0, hqq_dynamic_config_path=None, bnb_4bit_comp_dtype='bf16', bnb_4bit_quant_type='nf4', bnb_4bit_use_double_quant=True, bnb_4bit_quant_storage=None, rescale_image=-1, target_modules='^(language_model|multi_modal_projector)(?!.*(lm_head|output|emb|wte|shared)).*', target_regex=None, modules_to_save=[], lora_rank=8, lora_alpha=32, lora_dropout=0.05, lora_bias_trainable='none', lora_dtype='AUTO', lora_lr_ratio=None, use_rslora=False, use_dora=False, init_lora_weights='true', fourier_n_frequency=2000, fourier_scaling=300.0, rope_scaling=None, boft_block_size=4, boft_block_num=0, boft_n_butterfly_factor=1, boft_dropout=0.0, vera_rank=256, vera_projection_prng_key=0, vera_dropout=0.0, vera_d_initial=0.1, adapter_act='gelu', adapter_length=128, use_galore=False, galore_target_modules=None, galore_rank=128, galore_update_proj_gap=50, galore_scale=1.0, galore_proj_type='std', galore_optim_per_parameter=False, galore_with_embedding=False, galore_quantization=False, galore_proj_quant=False, galore_proj_bits=4, galore_proj_group_size=256, galore_cos_threshold=0.4, galore_gamma_proj=2, galore_queue_size=5, adalora_target_r=8, adalora_init_r=12, adalora_tinit=0, adalora_tfinal=0, adalora_deltaT=1, adalora_beta1=0.85, adalora_beta2=0.85, adalora_orth_reg_weight=0.5, ia3_feedforward_modules=[], llamapro_num_new_blocks=4, llamapro_num_groups=None, neftune_noise_alpha=None, neftune_backend='transformers', lisa_activated_layers=0, lisa_step_interval=20, reft_layers=None, reft_rank=4, reft_intervention_type='LoreftIntervention', reft_args=None, gradient_checkpointing=True, deepspeed=None, batch_size=1, eval_batch_size=1, auto_find_batch_size=False, num_train_epochs=5, max_steps=-1, optim='adamw_torch', adam_beta1=0.9, adam_beta2=0.95, adam_epsilon=1e-08, learning_rate=0.0001, weight_decay=0.1, gradient_accumulation_steps=16, max_grad_norm=1, predict_with_generate=False, lr_scheduler_type='cosine', lr_scheduler_kwargs={}, warmup_ratio=0.05, warmup_steps=0, eval_steps=50, save_steps=50, save_only_model=False, save_total_limit=2, logging_steps=5, acc_steps=1, dataloader_num_workers=1, dataloader_pin_memory=True, dataloader_drop_last=False, push_to_hub=False, hub_model_id=None, hub_token=None, hub_private_repo=False, push_hub_strategy='push_best', test_oom_error=False, disable_tqdm=False, lazy_tokenize=True, 
preprocess_num_proc=1, use_flash_attn=None, ignore_args_error=False, check_model_is_latest=True, logging_dir='/VDIL_COREML/m.banerjee/ms-swift/output/llava-onevision-qwen2-0_5b-ov/v0-20240831-013234/runs', report_to=['tensorboard'], acc_strategy='token', save_on_each_node=False, evaluation_strategy='steps', save_strategy='steps', save_safetensors=True, gpu_memory_fraction=None, include_num_input_tokens_seen=False, local_repo_path=None, custom_register_path=None, custom_dataset_info=None, device_map_config_path=None, device_max_memory=[], max_new_tokens=2048, do_sample=True, temperature=0.3, top_k=20, top_p=0.7, repetition_penalty=1.0, num_beams=1, fsdp='', fsdp_config=None, sequence_parallel_size=1, model_layer_cls_name=None, metric_warmup_step=0, fsdp_num=1, per_device_train_batch_size=None, per_device_eval_batch_size=None, eval_strategy=None, self_cognition_sample=0, train_dataset_mix_ratio=0.0, train_dataset_mix_ds=['ms-bench'], train_dataset_sample=-1, val_dataset_sample=None, safe_serialization=None, only_save_model=None, neftune_alpha=None, deepspeed_config_path=None, model_cache_dir=None, lora_dropout_p=None, lora_target_modules=[], lora_target_regex=None, lora_modules_to_save=[], boft_target_modules=[], boft_modules_to_save=[], vera_target_modules=[], vera_modules_to_save=[], ia3_target_modules=[], ia3_modules_to_save=[], custom_train_dataset_path=[], custom_val_dataset_path=[])
[INFO:swift] Global seed set to 42
device_count: 5
rank: -1, local_rank: -1, world_size: 1, local_world_size: 1
[INFO:swift] Downloading the model from ModelScope Hub, model_id: AI-ModelScope/llava-onevision-qwen2-0.5b-ov-hf
[WARNING:modelscope] Using branch: master as version is unstable, use with caution
[INFO:swift] Loading the model using model_dir: /VDIL_COREML/m.banerjee/.cache/modelscope/hub/AI-ModelScope/llava-onevision-qwen2-0___5b-ov-hf
Traceback (most recent call last):
File "/VDIL_COREML/m.banerjee/ms-swift/swift/cli/sft.py", line 5, in <module>
sft_main()
File "/VDIL_COREML/m.banerjee/ms-swift/swift/utils/run_utils.py", line 32, in x_main
result = llm_x(args, **kwargs)
File "/VDIL_COREML/m.banerjee/ms-swift/swift/llm/sft.py", line 215, in llm_sft
model, tokenizer = get_model_tokenizer(
File "/VDIL_COREML/m.banerjee/ms-swift/swift/llm/utils/model.py", line 6347, in get_model_tokenizer
model, tokenizer = get_function(model_dir, torch_dtype, model_kwargs, load_model, **kwargs)
File "/VDIL_COREML/m.banerjee/ms-swift/swift/llm/utils/model.py", line 5889, in get_model_tokenizer_llava_onevision
from transformers import LlavaOnevisionForConditionalGeneration
ImportError: cannot import name 'LlavaOnevisionForConditionalGeneration' from 'transformers' (/VDIL_COREML/m.banerjee/anaconda3/envs/swift/lib/python3.9/site-packages/transformers/__init__.py)
(swift) m.banerjee@PHYVDGPU02PRMV:/VDIL_COREML/m.banerjee/ms-swift$
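To narrow this down, here is a minimal diagnostic sketch that checks whether my installed transformers snapshot actually exports the class swift tries to import (the class name is taken verbatim from the traceback above; nothing else is assumed):

# Minimal diagnostic: does the installed transformers build export the
# class that swift imports? hasattr goes through transformers' lazy
# module machinery, so it returns False instead of raising when the
# class is missing from this build.
import transformers

print("transformers version:", transformers.__version__)
print(
    "class available:",
    hasattr(transformers, "LlavaOnevisionForConditionalGeneration"),
)

If this prints False, the installed snapshot predates the commit that added LLaVA-OneVision support, and reinstalling transformers from a newer source commit (or a release that includes that model) should make the import succeed.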
Your hardware and system info
CUDA Version: 12.4
System: Ubuntu 22.04.3 LTS
GPU
torch==2.4.0
transformers==4.45.0.dev0
trl==0.9.6
peft==0.12.0
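A note on the transformers pin above: under PEP 440, a ".dev0" tag sorts before the corresponding final release, so transformers==4.45.0.dev0 is an earlier build than 4.45.0 and need not contain everything that ships in the final 4.45.0. A quick sketch of that ordering using the packaging library (the same version logic pip uses; this is an illustration, not part of the repro):

# PEP 440: developmental releases sort *before* the final release,
# so a 4.45.0.dev0 snapshot can lack features present in 4.45.0.
from packaging import version

print(version.parse("4.45.0.dev0") < version.parse("4.45.0"))  # True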