
Commit

add support for pipeline-parallel-size in vLLM example (ray-project#2370)

Signed-off-by: Andrew Sy Kim <[email protected]>
andrewsykim authored Sep 10, 2024
1 parent 3e68606 commit d6fbdd5
Showing 2 changed files with 2 additions and 1 deletion.
ray-operator/config/samples/vllm/ray-service.vllm.yaml (1 addition, 0 deletions)

@@ -20,6 +20,7 @@ spec:
       env_vars:
         MODEL_ID: "meta-llama/Meta-Llama-3-8B-Instruct"
         TENSOR_PARALLELISM: "2"
+        PIPELINE_PARALLELISM: "1"
   rayClusterConfig:
     headGroupSpec:
       rayStartParams:
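In vLLM, the number of GPUs a model replica occupies is the product of the tensor-parallel and pipeline-parallel degrees. A quick sanity check using the values from the sample YAML above (a sketch, not part of the repository):

```python
# GPUs per vLLM replica = tensor-parallel-size * pipeline-parallel-size.
tensor_parallel_size = 2    # TENSOR_PARALLELISM in the sample YAML
pipeline_parallel_size = 1  # PIPELINE_PARALLELISM added by this commit
gpus_per_replica = tensor_parallel_size * pipeline_parallel_size
print(gpus_per_replica)  # → 2
```

With pipeline parallelism left at 1, the sample behaves exactly as before; raising it lets a replica span more GPUs (or nodes) without increasing the tensor-parallel degree.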
ray-operator/config/samples/vllm/serve.py (1 addition, 1 deletion)

@@ -122,4 +122,4 @@ def build_app(cli_args: Dict[str, str]) -> serve.Application:


 model = build_app(
-    {"model": os.environ['MODEL_ID'], "tensor-parallel-size": os.environ['TENSOR_PARALLISM']})
+    {"model": os.environ['MODEL_ID'], "tensor-parallel-size": os.environ['TENSOR_PARALLELISM'], "pipeline-parallel-size": os.environ['PIPELINE_PARALLELISM']})
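The change wires the new option through the same env-var-to-CLI-flag pattern already used for tensor-parallel-size: hyphenated string keys passed as a dict to build_app. A minimal sketch of how such arguments might be normalized into typed engine kwargs; the helper name `parse_vllm_args` and the `env` dict are hypothetical illustrations, not code from the repository:

```python
# Environment values as set in ray-service.vllm.yaml (this commit adds
# PIPELINE_PARALLELISM), stubbed as a plain dict for the sketch.
env = {
    "MODEL_ID": "meta-llama/Meta-Llama-3-8B-Instruct",
    "TENSOR_PARALLELISM": "2",
    "PIPELINE_PARALLELISM": "1",
}

def parse_vllm_args(cli_args):
    """Hypothetical helper: convert hyphenated CLI-style keys with string
    values into snake_case kwargs, casting parallelism degrees to int."""
    int_keys = {"tensor-parallel-size", "pipeline-parallel-size"}
    return {
        key.replace("-", "_"): (int(val) if key in int_keys else val)
        for key, val in cli_args.items()
    }

args = parse_vllm_args({
    "model": env["MODEL_ID"],
    "tensor-parallel-size": env["TENSOR_PARALLELISM"],
    "pipeline-parallel-size": env["PIPELINE_PARALLELISM"],
})
print(args)
```

Because env vars are always strings, some normalization like this has to happen before the values reach the engine, which is why the YAML quotes "2" and "1".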
