Remove JSON config mangling for Gemma ckpt #124

Merged
lsy323 merged 1 commit into AI-Hypercomputer:main from lsiyuan/update-gemma-convert on Jun 13, 2024

Conversation

@lsy323 (Collaborator) commented on Jun 12, 2024

The fix for the invalid JSON config file in the HF Gemma PyTorch checkpoints has been merged upstream, so we no longer need to patch the invalid JSON in the convert script.

https://huggingface.co/google/gemma-7b-it-pytorch/discussions/2#6667fe530ec7ed4422eb070c
https://huggingface.co/google/gemma-7b-pytorch/discussions/2#6667fdbf647001c39240a47e
https://huggingface.co/google/gemma-2b-pytorch/discussions/2#6667fdab51545a8b46c3a121

The other Gemma PyTorch checkpoints are being handled by HF staff as well.
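
For context, the mangling being removed was a workaround that patched the config on the fly before parsing it. The sketch below is hypothetical: the function name and the assumption that the breakage was Python-style single quotes are illustrative only and do not reflect the actual code in convert_checkpoints.

import json

def load_gemma_config(path: str) -> dict:
    # With the upstream HF checkpoints fixed, config.json parses directly.
    with open(path) as f:
        raw = f.read()
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # Hypothetical repair of an invalid config (e.g. single-quoted keys);
        # this is the kind of fallback the convert script no longer needs.
        return json.loads(raw.replace("'", '"'))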

Test

export input_ckpt_dir=/mnt/disks/lsiyuan/gemma_weight/gemma-7b-pytorch-it
export output_ckpt_dir=/mnt/disks/lsiyuan/gemma_weight/gemma-7b-pytorch-it-bf16
export model_name="gemma"
export quantize_weights=False
export quantize_type="int8_per_channel"
export from_hf=False
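# Convert the Gemma 7B (it) PyTorch checkpoint to bf16, with quantization disabled.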
python -m convert_checkpoints --model_name=$model_name \
    --input_checkpoint_dir=$input_ckpt_dir \
    --output_checkpoint_dir=$output_ckpt_dir \
    --quantize_weights=$quantize_weights \
    --quantize_type=$quantize_type \
    --from_hf=$from_hf
export tokenizer_path=/mnt/disks/lsiyuan/gemma_weight/gemma-7b-pytorch-it/tokenizer.model
export size="7b"
export quantize_weights=False
export quantize_activation=False
export quantize_kv_cache=False

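# Run interactive decoding against the converted checkpoint to sanity-check generation.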
python run_interactive.py --model_name=$model_name --size=$size --batch_size=2 --max_cache_length=2048 \
    --checkpoint_path=$output_ckpt_dir \
    --tokenizer_path=$tokenizer_path \
    --quantize_kv_cache=$quantize_kv_cache \
    --quantize_weights=$quantize_weights \
    --quantize_type=$quantize_type

@lsy323 requested a review from qihqi on June 12, 2024 22:26
@lsy323 changed the title from "update gemma convert" to "Remove JSON config mangling for Gemma ckpt" on Jun 12, 2024
@wang2yn84 (Collaborator) commented:

Thank you very much for raising the issue and getting it solved!

@lsy323 merged commit fe8dbde into AI-Hypercomputer:main on Jun 13, 2024
4 checks passed
@lsy323 deleted the lsiyuan/update-gemma-convert branch on June 13, 2024 17:20
@wang2yn84 requested review from wang2yn84 and removed request for wang2yn84 on June 13, 2024 17:21