Fix azure openai hanging problem #10153

serena-ruan · 2023-10-26T06:08:43Z

🛠 DevTools 🛠

Install mlflow from this PR

pip install git+https://github.com/mlflow/mlflow.git@refs/pull/10153/merge

Checkout with GitHub CLI

gh pr checkout 10153

Related Issues/PRs

#xxx

What changes are proposed in this pull request?

Add several required configs for Azure OpenAI.
Fix the hang caused by request errors not being raised.

How is this PR tested?

Existing unit/integration tests
New unit/integration tests
Manual tests

Notebook: https://e2-dogfood.staging.cloud.databricks.com/?o=6051921418418893#notebook/4169768892693977/command/4169768892694002

Does this PR require documentation update?

Release Notes

Is this a user-facing change?

No. You can skip the rest of this section.
Yes. Give a description of this change to be included in the release notes for MLflow users.

What component(s), interfaces, languages, and integrations does this PR affect?

Components

Interface

area/uiux: Front-end, user experience, plotting, JavaScript, JavaScript dev server
area/docker: Docker use across MLflow's components, such as MLflow Projects and MLflow Models
area/sqlalchemy: Use of SQLAlchemy in the Tracking Service or Model Registry
area/windows: Windows support

Language

language/r: R APIs and clients
language/java: Java APIs and clients
language/new: Proposals for new client languages

Integrations

integrations/azure: Azure and Azure ML integrations
integrations/sagemaker: SageMaker integrations
integrations/databricks: Databricks integrations

How should the PR be classified in the release notes? Choose one:

rn/none - No description will be included. The PR will be mentioned only by the PR number in the "Small Bugfixes and Documentation Updates" section
rn/breaking-change - The PR will be mentioned in the "Breaking Changes" section
rn/feature - A new user-facing feature worth mentioning in the release notes
rn/bug-fix - A user-facing bug fix worth mentioning in the release notes
rn/documentation - A user-facing documentation change worth mentioning in the release notes

Signed-off-by: Serena Ruan <[email protected]>

github-actions · 2023-10-26T06:09:03Z

Documentation preview for c7790d1 will be available here when this CircleCI job completes successfully.

More info

Ignore this comment if this PR does not change the documentation.
It takes a few minutes for the preview to be available.
The preview is updated when a new commit is pushed to this PR.
This comment was created by https://github.com/mlflow/mlflow/actions/runs/6661779709.

serena-ruan · 2023-10-26T07:35:31Z

mlflow/openai/api_request_parallel_processor.py

+            else:
+                _logger.warning(f"Request #{self.index} failed with {e!r}")
+                status_tracker.increment_num_api_errors()
+                status_tracker.complete_task(success=False)


This is the root cause of the hang.

Great find! :D
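To illustrate why the missing `else` branch could cause a hang, here is a minimal sketch. It assumes the processor waits until a counter of in-progress tasks drops to zero. `StatusTracker`, `start_task`, `run_request`, and `failing_call` are simplified stand-ins written for this example, not MLflow's actual implementation; only `increment_num_api_errors` and `complete_task` are names taken from the diff.

```python
import logging
import threading

_logger = logging.getLogger(__name__)


class StatusTracker:
    """Hypothetical, simplified tracker: counts tasks that have not finished yet."""

    def __init__(self):
        self._lock = threading.Lock()
        self.num_tasks_in_progress = 0
        self.num_tasks_failed = 0
        self.num_api_errors = 0

    def start_task(self):
        with self._lock:
            self.num_tasks_in_progress += 1

    def complete_task(self, *, success):
        with self._lock:
            self.num_tasks_in_progress -= 1
            if not success:
                self.num_tasks_failed += 1

    def increment_num_api_errors(self):
        with self._lock:
            self.num_api_errors += 1


def run_request(index, call, status_tracker):
    """Run one request and always mark it complete, even on an unexpected error."""
    status_tracker.start_task()
    try:
        call()
        status_tracker.complete_task(success=True)
    except Exception as e:
        # Without this bookkeeping the counter never returns to zero, so a
        # loop waiting on `num_tasks_in_progress == 0` would wait forever.
        _logger.warning(f"Request #{index} failed with {e!r}")
        status_tracker.increment_num_api_errors()
        status_tracker.complete_task(success=False)


def failing_call():
    raise RuntimeError("simulated API error")


tracker = StatusTracker()
run_request(0, lambda: None, tracker)
run_request(1, failing_call, tracker)
print(tracker.num_tasks_in_progress, tracker.num_tasks_failed)
```

After both calls the in-progress counter is back at zero, so a caller polling that counter can exit even though one request failed.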

BenWilson2 · 2023-10-26T15:37:07Z

mlflow/openai/utils.py

+    """
+    params passed at inference time should override envs.
+    """
+    return {k: v for k, v in envs.items() if k not in params} if params else envs


Will this erase env entries if the value submitted for the same key in params is None? Is that intended?

Would a dict unpacking + packing work here to ensure that only valid params values replace the env variable values?

def _exclude_params_from_envs(params, envs):
    """
    params passed at inference time should override envs.
    """
    return {**envs, **(params or {})}

We shouldn't expect params to set a key to None. If it does, it overrides the envs value; unless we exclude None values from params itself, {**envs, **(params or {})} still overrides the value with None.
params is something users explicitly pass at inference time, so I think they won't pass a key unless they really want to set it to None.
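The three behaviors discussed in this thread can be compared side by side. This is only a sketch: `exclude`, `merge`, and `merge_dropping_none` are hypothetical names chosen for illustration, and the sample keys are placeholders.

```python
def exclude(params, envs):
    """The PR's approach: drop env entries whose key also appears in params."""
    return {k: v for k, v in envs.items() if k not in params} if params else envs


def merge(params, envs):
    """The suggested approach: params values replace env values, None included."""
    return {**envs, **(params or {})}


def merge_dropping_none(params, envs):
    """Hypothetical variant: ignore params entries whose value is None."""
    overrides = {k: v for k, v in (params or {}).items() if v is not None}
    return {**envs, **overrides}


envs = {"key1": "value1", "key2": "value2"}
params = {"key1": None}

print(exclude(params, envs))              # {'key2': 'value2'}
print(merge(params, envs))                # {'key1': None, 'key2': 'value2'}
print(merge_dropping_none(params, envs))  # {'key1': 'value1', 'key2': 'value2'}
```

Which form fits depends on how the caller combines the result with `params` afterwards; that call site is not shown in this thread, so treat the comparison as illustrative only.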

Yeah that was what I was getting at (checking for None values that would invalidate a config). However, if that's a user's intention, that's their intention.

As for the code suggestion and my line of thinking: if a user passes None with the code in your PR, the key is deleted instead of preserving the key: None relationship. With dict packing/unpacking, the state of params is preserved regardless of the values supplied by the user.

Here's an ugly repro:

# Original function
def _exclude_params_from_envs_original(params, envs):
    """
    params passed at inference time should override envs.
    """
    return {k: v for k, v in envs.items() if k not in params} if params else envs


# Proposed function
def _exclude_params_from_envs_proposed(params, envs):
    """
    params passed at inference time should override envs.
    """
    return {**envs, **(params or {})}


# Test cases
def test_function(func):
    # Test 1: Key in params is set to None and the same key exists in envs
    envs = {"key1": "value1", "key2": "value2"}
    params = {"key1": None}
    result = func(params, envs)
    assert result["key1"] is None, f"Test 1 failed for {func.__name__}!"

    # Test 2: Key in params is set to a non-None value and the same key exists in envs
    params = {"key1": "new_value1"}
    result = func(params, envs)
    assert result["key1"] == "new_value1", f"Test 2 failed for {func.__name__}!"

    # Test 3: Key exists only in params and not in envs
    params = {"key3": "value3"}
    result = func(params, envs)
    assert result["key3"] == "value3", f"Test 3 failed for {func.__name__}!"

    # Test 4: Key exists only in envs and not in params
    params = {}
    result = func(params, envs)
    assert result["key2"] == "value2", f"Test 4 failed for {func.__name__}!"

    print(f"All tests passed for {func.__name__}!")

test_function(_exclude_params_from_envs_proposed)

All tests passed for _exclude_params_from_envs_proposed!

test_function(_exclude_params_from_envs_original)

---------------------------------------------------------------------------
KeyError                                  Traceback (most recent call last)
/var/folders/cd/n8n0rm2x53l_s0xv_j_xklb00000gp/T/ipykernel_37720/2252776925.py in <cell line: 1>()
----> 1 test_function(_exclude_params_from_envs_original)

/var/folders/cd/n8n0rm2x53l_s0xv_j_xklb00000gp/T/ipykernel_37720/2499167076.py in test_function(func)
     19     params = {"key1": None}
     20     result = func(params, envs)
---> 21     assert result["key1"] is None, f"Test 1 failed for {func.__name__}!"
     22
     23

KeyError: 'key1'

Is this the behavior you are going for?

BenWilson2 · 2023-10-26T15:39:38Z

mlflow/openai/utils.py

@@ -151,6 +151,13 @@ def _validate_model_params(task, model, params):
        )


+def _exclude_params_from_envs(params, envs):


Can we add a parametrized test for this override to ensure that param entries override env variable entries as expected? It should cover cases like None values in param overrides so that we handle those situations properly.

BenWilson2

LGTM once the small test is added! TY @serena-ruan :D

serena-ruan added 8 commits October 26, 2023 12:42

fix azure openai

1c9a98e

Signed-off-by: Serena Ruan <[email protected]>

update and add timeout

2dd43f2

Signed-off-by: Serena Ruan <[email protected]>

add check for engine & deployment_id

5b2539b

Signed-off-by: Serena Ruan <[email protected]>

test

9c78294

Signed-off-by: Serena Ruan <[email protected]>

update

02f5d8b

Signed-off-by: Serena Ruan <[email protected]>

fix

7d30ec7

Signed-off-by: Serena Ruan <[email protected]>

add log

06456aa

Signed-off-by: Serena Ruan <[email protected]>

add test

c52acef

Signed-off-by: Serena Ruan <[email protected]>

github-actions bot added the rn/none label ("List under Small Changes in Changelogs.") Oct 26, 2023

serena-ruan added 2 commits October 26, 2023 14:22

fix envs & params conflict

4ad016f

Signed-off-by: Serena Ruan <[email protected]>

fix tests

85f27ce

Signed-off-by: Serena Ruan <[email protected]>

serena-ruan requested review from harupy, BenWilson2 and sunishsheth2009 October 26, 2023 07:28

remove useless comment

bebcebc

Signed-off-by: Serena Ruan <[email protected]>

serena-ruan commented Oct 26, 2023

harupy reviewed Oct 26, 2023

mlflow/openai/utils.py (outdated)

harupy reviewed Oct 26, 2023

mlflow/openai/__init__.py (outdated)

address comments

03e5553

Signed-off-by: Serena Ruan <[email protected]>

BenWilson2 reviewed Oct 26, 2023

BenWilson2 approved these changes Oct 27, 2023

add test

c7790d1

Signed-off-by: Serena Ruan <[email protected]>

serena-ruan merged commit cf17437 into mlflow:master Oct 27, 2023
36 checks passed

serena-ruan deleted the fix_openai branch October 27, 2023 05:30

KonakanchiSwathi pushed a commit to KonakanchiSwathi/mlflow that referenced this pull request Nov 29, 2023

Fix azure openai hanging problem (mlflow#10153)

1c19345

Signed-off-by: Serena Ruan <[email protected]> Signed-off-by: swathi <[email protected]>
