Implement chat in BaseOpenAiGpuPredictor #1122
Conversation
Just spun this into a Draft PR so it is easier for me to watch commits/comment, etc.
We can use this in the GPU predictor code.
This was missing from #1109
This package is now breaking vLLM (we didn't have it pinned to a version).
This supports `None` better for `n`, `temperature`, etc.
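The `None`-handling above can be sketched roughly as follows. This is an illustrative helper, not the actual DRUM code: the idea is that optional chat parameters such as `n` and `temperature` are only forwarded when the caller explicitly sets them, so the backend server's own defaults apply otherwise.

```python
def build_completion_kwargs(n=None, temperature=None, max_tokens=None):
    """Collect only the chat parameters that were explicitly provided.

    Parameters left as None are dropped so the inference server
    (NIM/vLLM) falls back to its own defaults.
    """
    params = {"n": n, "temperature": temperature, "max_tokens": max_tokens}
    return {key: value for key, value in params.items() if value is not None}
```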
uwsgi shouldn't be needed
Older versions don't have `to_dict()` or `to_json()`.
The Needs Review labels were added based on the following file changes.

Team @datarobot/core-modeling (#core-modeling) was assigned because of changes in files:
- custom_model_runner/datarobot_drum/drum/language_predictors/base_language_predictor.py
- custom_model_runner/datarobot_drum/resource/drum_server_utils.py
- tests/functional/run_integration_tests_in_framework_container.sh
- tests/functional/test_inference_per_framework.py

Team @datarobot/custom-models (#custom-models) was assigned because of changes in files:
- custom_model_runner/datarobot_drum/drum/gpu_predictors/base.py
- custom_model_runner/datarobot_drum/drum/language_predictors/base_language_predictor.py
- custom_model_runner/datarobot_drum/resource/drum_server_utils.py
- public_dropin_gpu_environments/nim_llm/dr_requirements.in
- public_dropin_gpu_environments/nim_llm/dr_requirements.txt
- public_dropin_gpu_environments/vllm/Dockerfile
- public_dropin_gpu_environments/vllm/dr_requirements.in
- public_dropin_gpu_environments/vllm/dr_requirements.txt
- tests/functional/run_integration_tests_in_framework_container.sh
- tests/functional/test_inference_per_framework.py

Team @datarobot/tracking-agent (#tracking-agent-reviews) was assigned because of changes in files:
- custom_model_runner/datarobot_drum/drum/gpu_predictors/base.py
- public_dropin_gpu_environments/nim_llm/dr_requirements.in
- public_dropin_gpu_environments/nim_llm/dr_requirements.txt
- public_dropin_gpu_environments/vllm/Dockerfile
- public_dropin_gpu_environments/vllm/dr_requirements.in
- public_dropin_gpu_environments/vllm/dr_requirements.txt

If you think there are issues with ownership, please discuss with the C&A domain in the #core-backend-domain Slack channel and create a PR to update the DRCODEOWNERS\CODEOWNERS file.
custom_model_runner/datarobot_drum/drum/language_predictors/base_language_predictor.py
Thanks for improving the PR. I have a couple of comments.
This repository is public. Do not put any private DataRobot or customer data here: code, datasets, model artifacts, etc.
Summary
Update base GPU predictor to support Chat API. Also add test coverage for NIM and vLLM environments.
Rationale
The NIM and vLLM inference servers support the Chat API natively, so this adds a simple hook to pass DRUM chat completion requests through to the backend inference servers.
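The pass-through described above could look roughly like the sketch below. The class and method names are hypothetical (the actual implementation lives in `BaseOpenAiGpuPredictor`): the predictor holds an OpenAI-style client pointed at the local NIM/vLLM server and forwards each chat completion request to it largely unchanged.

```python
# Hypothetical sketch, not the actual DRUM code. `client` is assumed to
# expose the OpenAI SDK surface, e.g.:
#   openai.OpenAI(base_url="http://localhost:8000/v1", api_key="-")
class GpuChatPredictor:
    def __init__(self, client, model_name):
        self._client = client
        self._model_name = model_name

    def chat(self, completion_request):
        # Pin the request to the deployed model, then delegate to the
        # backend inference server (NIM or vLLM).
        completion_request["model"] = self._model_name
        return self._client.chat.completions.create(**completion_request)
```

Since the backend server implements the Chat API natively, no translation of messages or parameters is needed; the hook mainly fixes the model name and relays the call.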