Implement chat in BaseOpenAiGpuPredictor #1122
Conversation
Just spun this into a Draft PR so it is easier for me to watch commits/comment, etc.
We can use this in the GPU predictor code.
This was missing from #1109
This package is now breaking vLLM (we didn't have it pinned to a version).
This supports `None` better for `n`, `temperature`, etc.
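The `None`-handling above can be sketched roughly as follows. This is an illustrative helper, not the actual DRUM code: the idea is that optional chat parameters such as `n` and `temperature` are only forwarded when the caller explicitly sets them, so the backend server's own defaults apply otherwise.

```python
def build_completion_kwargs(n=None, temperature=None, max_tokens=None):
    """Collect only the chat parameters that were explicitly provided.

    Parameters left as None are dropped so the inference server
    (NIM/vLLM) falls back to its own defaults.
    """
    params = {"n": n, "temperature": temperature, "max_tokens": max_tokens}
    return {key: value for key, value in params.items() if value is not None}
```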
uwsgi shouldn't be needed
Older versions don't have `to_dict()` or `to_json()`.
The Needs Review labels were added based on the following file changes.

Team @datarobot/core-modeling (#core-modeling) was assigned because of changes in files:
- custom_model_runner/datarobot_drum/drum/language_predictors/base_language_predictor.py
- custom_model_runner/datarobot_drum/resource/drum_server_utils.py
- tests/functional/run_integration_tests_in_framework_container.sh
- tests/functional/test_inference_per_framework.py

Team @datarobot/custom-models (#custom-models) was assigned because of changes in files:
- custom_model_runner/datarobot_drum/drum/gpu_predictors/base.py
- custom_model_runner/datarobot_drum/drum/language_predictors/base_language_predictor.py
- custom_model_runner/datarobot_drum/resource/drum_server_utils.py
- public_dropin_gpu_environments/nim_llm/dr_requirements.in
- public_dropin_gpu_environments/nim_llm/dr_requirements.txt
- public_dropin_gpu_environments/vllm/Dockerfile
- public_dropin_gpu_environments/vllm/dr_requirements.in
- public_dropin_gpu_environments/vllm/dr_requirements.txt
- tests/functional/run_integration_tests_in_framework_container.sh
- tests/functional/test_inference_per_framework.py

Team @datarobot/tracking-agent (#tracking-agent-reviews) was assigned because of changes in files:
- custom_model_runner/datarobot_drum/drum/gpu_predictors/base.py
- public_dropin_gpu_environments/nim_llm/dr_requirements.in
- public_dropin_gpu_environments/nim_llm/dr_requirements.txt
- public_dropin_gpu_environments/vllm/Dockerfile
- public_dropin_gpu_environments/vllm/dr_requirements.in
- public_dropin_gpu_environments/vllm/dr_requirements.txt

If you think there are issues with ownership, please discuss with the C&A domain in the #core-backend-domain Slack channel and create a PR to update the DRCODEOWNERS\CODEOWNERS file.
custom_model_runner/datarobot_drum/drum/language_predictors/base_language_predictor.py
Thanks for improving the PR. I have a couple of comments.
This repository is public. Do not put any private DataRobot or customer data here: code, datasets, model artifacts, etc.
Summary
Update base GPU predictor to support Chat API. Also add test coverage for NIM and vLLM environments.
Rationale
The NIM and vLLM inference servers support the Chat API natively, so this adds a simple hook to pass DRUM chat completion requests through to the backend inference servers.
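The pass-through described above could look roughly like the sketch below. The class and method names are hypothetical (the actual implementation lives in `BaseOpenAiGpuPredictor`): the predictor holds an OpenAI-style client pointed at the local NIM/vLLM server and forwards each chat completion request to it largely unchanged.

```python
# Hypothetical sketch, not the actual DRUM code. `client` is assumed to
# expose the OpenAI SDK surface, e.g.:
#   openai.OpenAI(base_url="http://localhost:8000/v1", api_key="-")
class GpuChatPredictor:
    def __init__(self, client, model_name):
        self._client = client
        self._model_name = model_name

    def chat(self, completion_request):
        # Pin the request to the deployed model, then delegate to the
        # backend inference server (NIM or vLLM).
        completion_request["model"] = self._model_name
        return self._client.chat.completions.create(**completion_request)
```

Since the backend server implements the Chat API natively, no translation of messages or parameters is needed; the hook mainly fixes the model name and relays the call.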