Add vLLM ARC support with OpenVINO backend #641

gavinlichn · 2024-09-09T10:15:47Z

Description

Support vLLM inference on Intel ARC GPUs, using the OpenVINO backend.

Issues

#629

Type of change

List the type of change like below. Please delete options that are not relevant.

Bug fix (non-breaking change which fixes an issue)
New feature (non-breaking change which adds new functionality)
Breaking change (fix or feature that would break existing design and interface)
Others (enhancement, documentation, validation, etc.)

Dependencies

n/a

Tests

n/a

comps/llms/text-generation/vllm/vllm_arc.sh

comps/llms/text-generation/vllm/docker/Dockerfile.arc

Support vllm inference on Intel ARC GPU

Signed-off-by: Li Gang <[email protected]>
Co-authored-by: Chen, Hu1 <[email protected]>

Uses the official vLLM repo (https://github.com/vllm-project/vllm/) with the OpenVINO backend.
The Dockerfile is based on Dockerfile.openvino (https://github.com/vllm-project/vllm/blob/main/Dockerfile.openvino), with the ARC support packages added.
Default model: meta-llama/Llama-3.2-3B-Instruct, chosen to fit ARC A770 VRAM.

Signed-off-by: Li Gang <[email protected]>
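As a rough illustration of how a client might talk to the resulting service, the sketch below builds a completion request for the default model. It assumes the container exposes vLLM's OpenAI-compatible HTTP server; the host, port, prompt, and helper names (`build_completion_request`, `complete`) are hypothetical and are not taken from this PR.

```python
import json
import urllib.request

# Assumptions (not from the PR): the service is reachable at this host/port
# and serves vLLM's OpenAI-compatible /v1/completions endpoint.
BASE_URL = "http://localhost:8000"
# Default model named in the commit message.
MODEL = "meta-llama/Llama-3.2-3B-Instruct"


def build_completion_request(prompt, max_tokens=32, base_url=BASE_URL, model=MODEL):
    """Build (but do not send) a POST request for a text completion."""
    payload = {
        "model": model,
        "prompt": prompt,
        "max_tokens": max_tokens,
        "temperature": 0,
    }
    return urllib.request.Request(
        url=f"{base_url}/v1/completions",
        data=json.dumps(payload).encode("utf-8"),
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def complete(prompt, **kwargs):
    """Send the request and return the generated text (needs a running server)."""
    request = build_completion_request(prompt, **kwargs)
    with urllib.request.urlopen(request, timeout=60) as response:
        body = json.load(response)
    return body["choices"][0]["text"]
```

Calling `complete("What is deep learning?")` only works once the service is up; `build_completion_request` can be inspected offline to check the URL and payload.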

eero-t

Noticed a couple of README typos.

comps/llms/text-generation/vllm/langchain/README.md

Co-authored-by: Eero Tamminen <[email protected]>

Signed-off-by: Li Gang <[email protected]>

.github/workflows/docker/compose/llms-compose-cd.yaml

comps/llms/text-generation/vllm/langchain/README.md

comps/llms/text-generation/vllm/langchain/dependency/build_docker_vllm_openvino.sh

comps/llms/text-generation/vllm/langchain/dependency/launch_vllm_service_openvino.sh

chensuyue · 2024-11-08T02:38:34Z

Please keep using a single formal image name for the same image.

gavinlichn · 2024-11-08T03:25:50Z

Please keep using a single formal image name for the same image.

Aligned all image names to opea/vllm-arc:latest.

gavinlichn requested a review from lvliang-intel as a code owner September 9, 2024 10:15

gavinlichn mentioned this pull request Sep 9, 2024

Enable ChatQnA with vllm Arc support opea-project/GenAIExamples#771 (closed)

lvliang-intel approved these changes Sep 9, 2024

hshen14 reviewed Sep 10, 2024

comps/llms/text-generation/vllm/vllm_arc.sh (outdated, resolved)

chensuyue reviewed Sep 11, 2024

comps/llms/text-generation/vllm/docker/Dockerfile.arc (outdated, resolved)

lkk12014402 pushed a commit that referenced this pull request Sep 19, 2024

fix tgi xeon tag (#641)

6674832

gavinlichn force-pushed the arc_vllm branch from d1b77e9 to 574fadb on November 1, 2024 09:38

gavinlichn changed the title from "Add Dockerfile for vllm Arc support" to "Add vLLM ARC support with OpenVINO backend" on Nov 5, 2024

gavinlichn and others added 5 commits November 5, 2024 08:48

Add vllm Arc Dockerfile support

83f7641

Support vllm inference on Intel ARC GPU

Signed-off-by: Li Gang <[email protected]>
Co-authored-by: Chen, Hu1 <[email protected]>

[pre-commit.ci] auto fixes from pre-commit.com hooks

e2f60d8

for more information, see https://pre-commit.ci

Add README and .github workflow for vLLM ARC support

8ace5b4

Signed-off-by: Li Gang <[email protected]>

[pre-commit.ci] auto fixes from pre-commit.com hooks

3590ccc

for more information, see https://pre-commit.ci

gavinlichn force-pushed the arc_vllm branch from 635772c to 3590ccc on November 5, 2024 00:48

lianhao mentioned this pull request Nov 6, 2024

Add Intel ARC GPU test for vllm openvino. #856 (merged)

eero-t reviewed Nov 7, 2024

comps/llms/text-generation/vllm/langchain/README.md (resolved)

comps/llms/text-generation/vllm/langchain/README.md (resolved)

comps/llms/text-generation/vllm/langchain/README.md (outdated, resolved)

gavinlichn and others added 2 commits November 8, 2024 10:01

Update comps/llms/text-generation/vllm/langchain/README.md

4d9e3ed

Co-authored-by: Eero Tamminen <[email protected]>

Rename Dockerfile to meet Contribution Guidelines

5b404bb

Signed-off-by: Li Gang <[email protected]>

chensuyue reviewed Nov 8, 2024

.github/workflows/docker/compose/llms-compose-cd.yaml (outdated, resolved)

chensuyue reviewed Nov 8, 2024

comps/llms/text-generation/vllm/langchain/README.md (outdated, resolved)

chensuyue reviewed Nov 8, 2024

comps/llms/text-generation/vllm/langchain/dependency/build_docker_vllm_openvino.sh (outdated, resolved)

chensuyue reviewed Nov 8, 2024

comps/llms/text-generation/vllm/langchain/dependency/launch_vllm_service_openvino.sh (outdated, resolved)

Align image names as opea/vllm-arc:latest

261530d

Signed-off-by: Li Gang <[email protected]>

chensuyue approved these changes Nov 8, 2024

chensuyue merged commit a2b9d95 into opea-project:main Nov 8, 2024
10 checks passed
