add Qwen2 #24
Hello, thank you for your interest in EETQ. The code you modified is for vllm, whose EETQ support is not yet merged (vllm-project/vllm#3614), so I'm unsure how you plan to use it. Could you please clarify? If you want to quantize Qwen2 with EETQ under transformers or TGI, I think you can use it directly within those two frameworks.
I am not using vllm. My change is not related to vllm. I am trying to do this:
The code changes I made here enable that code to function. Qwen2 is not supported because it is not in EETQ_CAUSAL_LM_MODEL_MAP.
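The support gap described above comes down to a missing registry entry. Here is a hypothetical sketch of that pattern; the real EETQ_CAUSAL_LM_MODEL_MAP entries and class names (e.g. `Qwen2EetqForCausalLM`) are illustrative assumptions, not EETQ's actual API:

```python
# Hypothetical sketch: a model type is supported only if it has an
# entry in EETQ_CAUSAL_LM_MODEL_MAP. All names here are illustrative.
EETQ_CAUSAL_LM_MODEL_MAP = {
    "llama": "LlamaEetqForCausalLM",      # example existing entries
    "mistral": "MistralEetqForCausalLM",
}

def resolve_eetq_class(model_type: str) -> str:
    """Look up the EETQ wrapper class name for a model type."""
    if model_type not in EETQ_CAUSAL_LM_MODEL_MAP:
        raise NotImplementedError(f"EETQ does not support {model_type!r}")
    return EETQ_CAUSAL_LM_MODEL_MAP[model_type]

# The proposed change boils down to registering Qwen2:
EETQ_CAUSAL_LM_MODEL_MAP["qwen2"] = "Qwen2EetqForCausalLM"
print(resolve_eetq_class("qwen2"))  # -> Qwen2EetqForCausalLM
```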
If you want to use EETQ to quantize a model and run inference in an existing inference framework such as TGI, transformers, or vllm, you have to customize the quantization for each framework, because the cutlass kernel changes the layout of the quantized weights. The code above is customized for vllm (sorry for the unclear description in the README). If you use it in another framework, it may output wrong tokens.
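The layout issue described above can be illustrated with a toy example. This is not EETQ's or cutlass's actual preprocessing; it only shows why weights packed for one kernel's layout produce wrong results when read by code expecting another layout:

```python
# Toy illustration (not real EETQ/cutlass code): the same logical
# weights can be stored in different physical layouts. A kernel fed
# weights in the wrong layout computes the wrong result.

def interleave(row, group=2):
    """Reorder elements into a hypothetical kernel-friendly layout."""
    n = len(row)
    order = [c for i in range(group) for c in range(i, n, group)]
    return [row[c] for c in order]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

weights = [1, 2, 3, 4]        # logical (row-major) quantized weights
packed = interleave(weights)  # layout a hypothetical kernel expects
acts = [10, 20, 30, 40]

correct = dot(weights, acts)  # what the layer should compute -> 300
wrong = dot(packed, acts)     # mismatched layout -> 290, wrong output
print(correct, wrong)
```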
@ehartford
I want to quantize my model to EETQ format and publish it, so people can download the EETQ-quantized version of my model, just like they do with GPTQ, GGUF, EXL2, etc.
Please add Qwen2 support