[CI/BUILD] enable intel queue for longer CPU tests #4113

zhouyuan · 2024-04-16T09:50:24Z


This PR enables the Intel queue for longer CPU tests. The main changes are:

fix a bug in the pos_encoding kernel
enable part of the tests under models
skip CUDA-only tests on CPU
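As a rough illustration of the last item, here is a minimal sketch of skipping a CUDA-only test when the run targets the CPU backend. It uses the standard-library unittest module to stay self-contained; the environment variable EXAMPLE_TARGET_DEVICE, the helper running_on_cpu_backend, and the test names are all hypothetical and are not taken from this PR.

```python
import os
import unittest


def running_on_cpu_backend() -> bool:
    """Hypothetical check: the run counts as CPU-only when the assumed
    EXAMPLE_TARGET_DEVICE environment variable is set to "cpu"."""
    return os.environ.get("EXAMPLE_TARGET_DEVICE", "").lower() == "cpu"


class ExampleTests(unittest.TestCase):
    def test_cuda_only_feature(self):
        # Skip at run time so the decision reflects the current environment.
        if running_on_cpu_backend():
            self.skipTest("requires CUDA; skipped on the CPU backend")
        self.assertTrue(True)  # placeholder for a real CUDA-only assertion

    def test_device_agnostic_feature(self):
        # Runs on every backend.
        self.assertEqual(sum([1, 2, 3]), 6)


def run_examples() -> unittest.TestResult:
    """Run the example tests and return the collected result."""
    suite = unittest.defaultTestLoader.loadTestsFromTestCase(ExampleTests)
    result = unittest.TestResult()
    suite.run(result)
    return result
```

Skipping inside the test body, rather than in a decorator evaluated at import time, keeps the check cheap and lets the same test file serve both the CPU and GPU queues.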

related: #3654


tests/models/test_models.py

.buildkite/run-cpu-test.sh

tests/conftest.py

zhouyuan · 2024-05-28T14:29:46Z

@simon-mo Hi Simon, I've made more progress and covered more model tests for the CPU backend. Could you please take a look and advise?

thanks
-yuan

simon-mo

This looks good to me overall, pending a fix for test_big_models.py. If the test suite grows, we can take the same approach AMD did with mirror_hardwares to run all tests on Intel CPUs.

.buildkite/run-cpu-test.sh

zhouyuan · 2024-05-29T02:17:57Z

Dockerfile.cpu

@@ -19,4 +19,6 @@ RUN VLLM_TARGET_DEVICE=cpu python3 setup.py install

 WORKDIR /workspace/

+RUN ln -s /workspace/vllm/tests  && ln -s /workspace/vllm/examples && ln -s /workspace/vllm/benchmarks
+
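With a single operand, `ln -s TARGET` creates a link in the current directory named after the last path component of TARGET, so the added RUN line places tests, examples, and benchmarks links under /workspace/. A minimal Python sketch of the same behavior follows; the helper name link_into is an illustrative assumption and is not part of the PR.

```python
import tempfile
from pathlib import Path


def link_into(workdir: Path, target: Path) -> Path:
    """Create workdir/<basename of target> as a symlink pointing at target,
    mirroring what `ln -s TARGET` does when run inside workdir."""
    link = workdir / target.name
    link.symlink_to(target, target_is_directory=True)
    return link


if __name__ == "__main__":
    # Demonstrate with throwaway directories instead of /workspace.
    with tempfile.TemporaryDirectory() as tmp:
        root = Path(tmp)
        source = root / "vllm" / "tests"
        source.mkdir(parents=True)
        workdir = root / "workspace"
        workdir.mkdir()
        created = link_into(workdir, source)
        print(created.name, "->", created.resolve())
```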


.buildkite/run-cpu-test.sh

zhouyuan · 2024-05-29T04:28:36Z

> This looks good to me overall, pending a fix for test_big_models.py. If the test suite grows, we can take the same approach AMD did with mirror_hardwares to run all tests on Intel CPUs.

@simon-mo I just fixed the test_big_models.py issue, and thanks for the input! I will look into that direction.

thanks
-yuan


zhouyuan · 2024-06-03T23:56:42Z

@simon-mo Thanks a lot for the review and merge. I will move on to covering more tests on the CPU backend.

thanks, -yuan

bigPYJ1151 mentioned this pull request Apr 16, 2024

[RFC] Initial Support for CPUs #3654


zhouyuan force-pushed the wip_ci_intel_cpu branch 6 times, most recently from 67443ba to 1227a8f Compare April 18, 2024 03:25

zhouyuan marked this pull request as ready for review April 18, 2024 04:49

simon-mo requested changes Apr 18, 2024

tests/models/test_models.py (outdated, resolved)

.buildkite/run-cpu-test.sh (outdated, resolved)

zhouyuan commented Apr 18, 2024

.buildkite/run-cpu-test.sh (outdated, resolved)

zhouyuan commented Apr 19, 2024

.buildkite/run-cpu-test.sh (outdated, resolved)

zhouyuan requested a review from simon-mo April 19, 2024 05:22

jikunshang reviewed Apr 19, 2024

tests/conftest.py (resolved)

tests/conftest.py (resolved)

zhouyuan force-pushed the wip_ci_intel_cpu branch 6 times, most recently from 733a859 to 743c2e4 Compare May 28, 2024 13:01

zhouyuan force-pushed the wip_ci_intel_cpu branch from 799d720 to 5030a7c Compare May 28, 2024 23:25

simon-mo reviewed May 29, 2024

.buildkite/run-cpu-test.sh (outdated, resolved)

zhouyuan force-pushed the wip_ci_intel_cpu branch 2 times, most recently from d5f81a5 to f3e300e Compare May 29, 2024 00:41

zhouyuan commented May 29, 2024

bigPYJ1151 reviewed May 29, 2024

.buildkite/run-cpu-test.sh (resolved)

simon-mo approved these changes May 29, 2024

simon-mo enabled auto-merge (squash) May 29, 2024 04:30

auto-merge was automatically disabled May 29, 2024 23:10
Head branch was pushed to by a user without write access

zhouyuan added 17 commits June 3, 2024 14:10

enable stabilityai model

fb00ea2

Signed-off-by: Yuan Zhou <[email protected]>

using torch.cuda.is_available() to check the device type

3ed0ede

Signed-off-by: Yuan Zhou <[email protected]>

fix format

0ec2b79

Signed-off-by: Yuan Zhou <[email protected]>

ignore failed tests firstly

9de3c52

Signed-off-by: Yuan Zhou <[email protected]>

ignore embedding test

0bda6d9

Signed-off-by: Yuan Zhou <[email protected]>

fix test dir

a93ad94

Signed-off-by: Yuan Zhou <[email protected]>

fix tests with cuda device only

774eba0

Signed-off-by: Yuan Zhou <[email protected]>

add local develop

913eb24

Signed-off-by: Yuan Zhou <[email protected]>

fix format

9f299d5

Signed-off-by: Yuan Zhou <[email protected]>

soft link tests folder

22d5228

Signed-off-by: Yuan Zhou <[email protected]>

enable big model test

3208d4e

Signed-off-by: Yuan Zhou <[email protected]>

adding hf token

1be1f1f

Signed-off-by: Yuan Zhou <[email protected]>

enable test mistral

ffb647f

Signed-off-by: Yuan Zhou <[email protected]>

use float dtype for big model tests with CPU backend

e019d21

Signed-off-by: Yuan Zhou <[email protected]>

fix failed CI on CUDA spawn issue

a7e25a3

torch.cuda.is_available() is not working well with multiprocessing case so we switch to use is_cpu() to check the device

Signed-off-by: Yuan Zhou <[email protected]>

fix rebase issue

eca1baf

Signed-off-by: Yuan Zhou <[email protected]>

fix rebase

7f0a344

Signed-off-by: Yuan Zhou <[email protected]>

zhouyuan force-pushed the wip_ci_intel_cpu branch from 0138a98 to 7f0a344 Compare June 3, 2024 06:11

Trigger CI

1eeb09d
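Commit a7e25a3 in the list above replaces the torch.cuda.is_available() device check with is_cpu(), and commit e019d21 uses the float dtype for big model tests on the CPU backend. The sketch below shows one way such a check could avoid touching the CUDA runtime at all, by inspecting installed package metadata instead. It is an illustration only: the function names, the idea of a "cpu" marker in the version string, and the "half" default for non-CPU runs are assumptions, not a description of how is_cpu() is implemented.

```python
from importlib.metadata import PackageNotFoundError, version


def is_cpu_build(dist_name: str, lookup=version) -> bool:
    """Hypothetical check: treat the installed distribution as a CPU build
    when its version string contains "cpu" (an assumed convention).
    No GPU runtime is queried, so it is safe to call in any subprocess."""
    try:
        return "cpu" in lookup(dist_name)
    except PackageNotFoundError:
        # Distribution is not installed; fall back to "not a CPU build".
        return False


def pick_test_dtype(cpu: bool) -> str:
    """Choose the dtype name for a test run: "float" on CPU, per commit
    e019d21; "half" elsewhere is an assumed default."""
    return "float" if cpu else "half"


if __name__ == "__main__":
    # "example-dist" is a placeholder distribution name.
    cpu = is_cpu_build("example-dist")
    print("dtype for this run:", pick_test_dtype(cpu))
```

The lookup parameter exists only so the check can be exercised without installing a specially versioned package.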

simon-mo merged commit cafb8e0 into vllm-project:main Jun 3, 2024
60 of 65 checks passed

blinkbear pushed a commit to blinkbear/vllm that referenced this pull request Jun 6, 2024

[CI/BUILD] enable intel queue for longer CPU tests (vllm-project#4113)

6034663

robertgshaw2-neuralmagic pushed a commit to neuralmagic/nm-vllm that referenced this pull request Jun 11, 2024

[CI/BUILD] enable intel queue for longer CPU tests (vllm-project#4113)

48e8e3f

joerunde pushed a commit to joerunde/vllm that referenced this pull request Jun 17, 2024

[CI/BUILD] enable intel queue for longer CPU tests (vllm-project#4113)

bac28b3

xjpang pushed a commit to xjpang/vllm that referenced this pull request Jun 27, 2024

[CI/BUILD] enable intel queue for longer CPU tests (vllm-project#4113)

a7b45d5

xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 8, 2024

[CI/BUILD] enable intel queue for longer CPU tests (vllm-project#4113)

b2f4b87

xjpang pushed a commit to xjpang/vllm that referenced this pull request Jul 24, 2024

[CI/BUILD] enable intel queue for longer CPU tests (vllm-project#4113)

095f38c

Temirulan pushed a commit to Temirulan/vllm-whisper that referenced this pull request Sep 6, 2024

[CI/BUILD] enable intel queue for longer CPU tests (vllm-project#4113)

e8061f4
