
add Qwen2 #24

Open
ehartford opened this issue Jul 15, 2024 · 6 comments

Comments

@ehartford

Please add Qwen2 support

EETQ_CAUSAL_LM_MODEL_MAP = {
    "llama": LlamaEETQForCausalLM,
    "baichuan": BaichuanEETQForCausalLM,
    "gemma": GemmaEETQForCausalLM
}
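For illustration, supporting Qwen2 would mean one more entry in this map plus the class it points to. A minimal sketch of the dispatch, using stub classes since `Qwen2EETQForCausalLM` does not exist in EETQ yet (which is exactly what this issue asks for):

```python
# Stub classes stand in for the real EETQ wrappers so the dispatch
# logic can be shown end to end; Qwen2EETQForCausalLM is hypothetical.
class LlamaEETQForCausalLM: ...
class Qwen2EETQForCausalLM: ...  # the missing class this issue requests

EETQ_CAUSAL_LM_MODEL_MAP = {
    "llama": LlamaEETQForCausalLM,
    "qwen2": Qwen2EETQForCausalLM,  # new entry
}

def resolve(model_type: str):
    # An architecture missing from the map produces the
    # "not supported" failure mode reported in this issue.
    try:
        return EETQ_CAUSAL_LM_MODEL_MAP[model_type]
    except KeyError:
        raise ValueError(f"{model_type} is not supported")

print(resolve("qwen2").__name__)  # Qwen2EETQForCausalLM
```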
ehartford mentioned this issue Jul 15, 2024
@dtlzhuangz (Collaborator) commented Jul 15, 2024

Hello, thank you for your interest in EETQ. The code you modified is for vllm, whose EETQ support has not been merged yet (vllm-project/vllm#3614), so I am confused about how you plan to use it. Could you please clarify? If you want to quantize Qwen2 with EETQ under transformers or TGI, I think you can use it directly within those two frameworks.

@ehartford (Author) commented Jul 15, 2024

I am not using vllm. My change is not related to vllm.

I am trying to do this:

from eetq import AutoEETQForCausalLM
from transformers import AutoTokenizer

model_name = "/workspace/models/dolphin-2.9.2-qwen2-72b"
quant_path = "/workspace/models/dolphin-2.9.2-qwen2-72b-eetq"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoEETQForCausalLM.from_pretrained(model_name)
model.quantize(quant_path)
tokenizer.save_pretrained(quant_path)

The code changes I made here enable this code to function. Without them, I get an error that qwen2 is not supported.

qwen2 is not supported because it is not in EETQ_CAUSAL_LM_MODEL_MAP; it is not in EETQ_CAUSAL_LM_MODEL_MAP because there is no Qwen2EETQForCausalLM. I implemented that class.

@dtlzhuangz (Collaborator)

If you want to use EETQ to quantize a model and run inference in an existing framework like TGI, transformers, or vllm, you have to customize the quantization for each framework, because the cutlass kernel changes the layout of the quantized weights. The code above is customized for vllm (sorry for the unclear description in the README). If you use it in another framework, it may output wrong tokens.
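The layout issue described above can be illustrated with a toy example: if a kernel expects weights packed one way and the checkpoint was packed for a different framework, the math silently goes wrong. This is a pure-Python stand-in; the real EETQ/cutlass tile layout is far more involved:

```python
# Two hypothetical frameworks that pack the same weight matrix differently.
def pack_for_framework_a(w):      # row-major, as-is
    return [row[:] for row in w]

def pack_for_framework_b(w):      # column-major (transposed) layout
    return [list(col) for col in zip(*w)]

def matvec_expecting_a(packed, x):
    # A kernel hard-wired to framework A's layout.
    return [sum(wij * xj for wij, xj in zip(row, x)) for row in packed]

w = [[1, 2], [3, 4]]
x = [10, 1]

good = matvec_expecting_a(pack_for_framework_a(w), x)
bad = matvec_expecting_a(pack_for_framework_b(w), x)
print(good)  # [12, 34]
print(bad)   # [13, 24] -- same weights, wrong layout: "wrong tokens"
```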

@SidaZh (Collaborator) commented Jul 16, 2024

@ehartford AutoEETQForCausalLM is developed for the vllm framework. You can use EETQ in transformers like this:

from transformers import AutoModelForCausalLM, EetqConfig

path = "/workspace/models/dolphin-2.9.2-qwen2-72b"
quantization_config = EetqConfig("int8")
model = AutoModelForCausalLM.from_pretrained(path, device_map="auto", quantization_config=quantization_config)
quant_path = "/workspace/models/dolphin-2.9.2-qwen2-72b-eetq"
model.save_pretrained(quant_path)
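What makes the checkpoint saved above shareable is that the quantization settings travel with it: transformers records the quantization config inside the saved config.json, so consumers can reload the folder with a plain from_pretrained call. A minimal simulation of that mechanism using only the json module (an assumption about the serialized shape; field names are illustrative):

```python
import json
import os
import tempfile

# Simulated config.json as save_pretrained would write it for a
# quantized model; the quantization_config block is what lets
# from_pretrained reconstruct the quantization on load.
config = {
    "model_type": "qwen2",
    "quantization_config": {"quant_method": "eetq", "weights": "int8"},
}

with tempfile.TemporaryDirectory() as d:
    path = os.path.join(d, "config.json")
    with open(path, "w") as f:
        json.dump(config, f)
    with open(path) as f:
        reloaded = json.load(f)

print(reloaded["quantization_config"]["quant_method"])  # eetq
```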

@ehartford (Author)

I want to quantize my model to EETQ format and publish it, so people can download the EETQ-quantized version of my model, just like they do with GPTQ, GGUF, EXL2, etc.
