
[CORE] Add vLLM Backend for FORMAT.GPTQ #190

Merged: 17 commits merged into ModelCloud:main from add_vlm_sglang on Jul 10, 2024

Conversation

PZS-ModelCloud (Contributor)

No description provided.

Resolved review threads on:
- gptqmodel/models/base.py
- gptqmodel/utils/vllm.py
- tests/test_vllm.py
@Qubitium Qubitium changed the title Add vlm [CORE]Add vLLM Backend for FORMAT.GPTQ Jul 9, 2024
@Qubitium Qubitium changed the title [CORE]Add vLLM Backend for FORMAT.GPTQ [CORE] Add vLLM Backend for FORMAT.GPTQ Jul 9, 2024
@Qubitium Qubitium merged commit 3951416 into ModelCloud:main Jul 10, 2024
1 check passed
@PZS-ModelCloud PZS-ModelCloud deleted the add_vlm_sglang branch July 10, 2024 03:08
DeJoker pushed a commit to DeJoker/GPTQModel that referenced this pull request Jul 19, 2024
* add vllm load support

* add sglang

* fix vllm load model show kv_caches error

* revert sglang

* mod clean up

* Update base.py

* Update base.py

* Update base.py

* Update test_vllm.py

* Update vllm.py

* Update base.py

* Update vllm.py

* add convert_hf_params_to_vllm and clean up

* format code

* mod clean up

* mod clean up

---------

Co-authored-by: Qubitium-ModelCloud <[email protected]>
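
The commit "add convert_hf_params_to_vllm and clean up" suggests a helper that translates Hugging Face `generate()`-style sampling kwargs into the kwarg names vLLM's `SamplingParams` expects. The PR's actual implementation is not shown here; the following is a minimal, self-contained sketch of that kind of mapping, with all names (the mapping table and the dropped-key behavior) being illustrative assumptions rather than the PR's code.

```python
# Hypothetical sketch of an HF -> vLLM sampling-parameter translation,
# in the spirit of the "add convert_hf_params_to_vllm" commit.
# The mapping table and drop-unknown behavior are assumptions,
# not the actual GPTQModel implementation.

# HF generate() kwarg -> vLLM SamplingParams kwarg
_HF_TO_VLLM = {
    "max_new_tokens": "max_tokens",
    "num_return_sequences": "n",
    "repetition_penalty": "repetition_penalty",
    "temperature": "temperature",
    "top_p": "top_p",
    "top_k": "top_k",
}


def convert_hf_params_to_vllm(hf_params: dict) -> dict:
    """Return a kwargs dict suitable for vllm.SamplingParams(**kwargs).

    Keys with no vLLM equivalent (e.g. `do_sample`) are silently dropped
    in this sketch; a real implementation might warn instead.
    """
    vllm_params = {}
    for key, value in hf_params.items():
        if key in _HF_TO_VLLM:
            vllm_params[_HF_TO_VLLM[key]] = value
    return vllm_params
```

Keeping the translation as a pure dict-to-dict function means it can be unit-tested (as in `tests/test_vllm.py`) without importing vLLM or loading a model.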
Labels: none
Projects: none
Participants: 2