refactor(runner): add InferenceError to all pipelines #188

Merged: rickstaa merged 15 commits into main from add_all_pipelines_inference_error on Oct 15, 2024

Conversation

@rickstaa (Collaborator) commented on Sep 4, 2024

This pull request adds the inference error logic from the SAM2 pipeline to all pipelines so that users get a warning when they supply wrong arguments. Please note that the 422 errors thrown by FastAPI are not handled correctly yet, and we don't do validation on the go-livepeer side yet (see https://linear.app/livepeer-ai/project/improve-error-handling-117c11ff78f3/issues). As a result, although this pull request improves the behaviour, we don't yet have an optimised solution.
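
For readers skimming the thread, the pattern being rolled out looks roughly like the sketch below. The route, helper names, and bodies are illustrative assumptions based on this description, not the repository's exact code:

```python
# Illustrative sketch only; InferenceError, http_error, and the route body are
# assumptions based on the PR description, not the repository's exact code.
import logging

from fastapi import FastAPI
from fastapi.responses import JSONResponse

logger = logging.getLogger(__name__)
app = FastAPI()


class InferenceError(Exception):
    """Raised when a pipeline receives invalid arguments or fails during inference."""


def http_error(msg: str) -> dict:
    """Shape error payloads the same way for every pipeline."""
    return {"detail": {"msg": msg}}


def run_pipeline(prompt: str) -> list:
    """Stand-in for the real pipeline call."""
    if not prompt:
        raise InferenceError("Prompt must not be empty.")
    return []


@app.post("/text-to-image")
async def text_to_image(prompt: str):
    try:
        images = run_pipeline(prompt)
    except InferenceError as e:
        # Return a 400 with a descriptive message instead of an opaque 500.
        logger.error(f"InferenceError: {e}")
        return JSONResponse(status_code=400, content=http_error(str(e)))
    return {"images": images}
```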

This commit adds the inference error logic from the SAM2 pipeline to all
pipelines so users are given a warning when they supply wrong arguments.
@rickstaa force-pushed the add_all_pipelines_inference_error branch from f08aced to a469892 on September 4, 2024 13:39
This commit ensures that users get a descriptive error message when the
GPU runs out of memory.
This commit ensures that all response errors are known by FastAPI and
therefore shown in the docs.
This commit adds some missing error handling to the pipeline worker
functions.
@rickstaa marked this pull request as draft on October 7, 2024 09:04
This commit improves the out of memory error handling by using the
native torch error.
@rickstaa force-pushed the add_all_pipelines_inference_error branch from dba6dc8 to 53d76eb on October 14, 2024 08:26
This commit ensures that errors thrown by the runner are forwarded to
the orchestrator. It applies the logic used by the SAM2 and
audio-to-text pipelines to the other pipelines.
This commit applies the black formatter to the PR files.
This commit removes the redundant str call.
@rickstaa marked this pull request as ready for review on October 14, 2024 10:22
@leszko left a comment:

Added one comment, other than that LGTM

    Returns:
        A JSONResponse with the appropriate error message and status code.
    """
    if isinstance(e, torch.cuda.OutOfMemoryError):
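
The excerpt is the tail end of a per-pipeline error handler. Filled out, such a handler looks roughly like the sketch below; the messages and status codes are illustrative assumptions, not the exact contents of the diff:

```python
# Sketch of the per-pipeline handler this review thread discusses; the exact
# messages and status codes in the repository may differ.
import torch
from fastapi.responses import JSONResponse


def handle_error(e: Exception) -> JSONResponse:
    """Map a pipeline exception to an HTTP error response.

    Returns:
        A JSONResponse with the appropriate error message and status code.
    """
    if isinstance(e, torch.cuda.OutOfMemoryError):
        # Surface a descriptive message when the GPU runs out of memory.
        return JSONResponse(
            status_code=500,
            content={"detail": {"msg": "GPU ran out of memory during inference."}},
        )
    return JSONResponse(
        status_code=500,
        content={"detail": {"msg": "Internal server error during model execution."}},
    )
```

Repeating a handler like this in every pipeline's routes module is what prompts the extraction suggestion that follows.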

I see this part of the code is repeated in every pipeline, would it maybe make sense to extract it and reuse?

@rickstaa (Collaborator, Author) replied:

@leszko thanks for the quick review. Yeah, I was planning to extract it, similar to the logic @mjh1 implemented in #226 (see #226 (comment)). However, I wanted to do this in a subsequent pull request. I can do it here and co-author @mjh1 👍🏻.

@rickstaa (Collaborator, Author) commented on Oct 14, 2024:

@leszko, as also stated on Discord, I added a variant of @gioelecerati's logic in ae8df74.

This commit introduces a global error handling configuration and function
to streamline error management across different pipelines. The new
`handle_pipeline_exception` function centralizes error handling logic,
allowing pipelines to override it if necessary. This change reduces code
duplication and improves maintainability.

Co-authored-by: rickstaa <[email protected]>
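
Based on that commit message, the centralized helper plausibly looks something like this. The function name comes from the commit, but the global configuration shape, messages, and status codes below are assumptions:

```python
# Sketch of a centralized error handler as described in the commit message above;
# the global config shape, messages, and status codes are assumptions.
import logging

import torch
from fastapi.responses import JSONResponse

logger = logging.getLogger(__name__)

# Global mapping from exception type to (message, status_code). An empty message
# means "fall back to the default message supplied by the calling pipeline"
# (the behaviour added in the follow-up commit below).
ERROR_CONFIG = {
    torch.cuda.OutOfMemoryError: ("GPU ran out of memory.", 500),
    ValueError: ("", 400),
}


def handle_pipeline_exception(
    e: Exception,
    default_error_message: str = "Internal server error during model execution.",
    default_status_code: int = 500,
) -> JSONResponse:
    """Map a pipeline exception to a JSONResponse using the global error config."""
    message, status_code = default_error_message, default_status_code
    for exc_type, (config_message, config_status) in ERROR_CONFIG.items():
        if isinstance(e, exc_type):
            message = config_message if config_message else default_error_message
            status_code = config_status
            break
    logger.error(f"{type(e).__name__}: {e}")
    return JSONResponse(status_code=status_code, content={"detail": {"msg": message}})
```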
@rickstaa force-pushed the add_all_pipelines_inference_error branch from 21a3d77 to ae8df74 on October 14, 2024 14:12
This commit ensures that pipelines can override the default error
message when the global error configuration contains an empty string.
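
As a hypothetical example of that override, using the handle_pipeline_exception and ERROR_CONFIG sketched above, a pipeline can pass its own default message, which wins whenever the global config maps the exception type to an empty string:

```python
# Hypothetical usage; the parameter values are illustrative.
try:
    raise ValueError("num_inference_steps must be a positive integer")
except Exception as e:
    # ERROR_CONFIG maps ValueError to ("", 400), so the message below is returned.
    response = handle_pipeline_exception(
        e, default_error_message="Invalid text-to-image parameters."
    )
```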
This commit adds a test for the 'handle_pipeline_exception' route
utility function. It also fixes some errors in that function.
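
A minimal sketch of what such a test might assert, assuming the handle_pipeline_exception sketch above; the repository's actual tests and import path will differ:

```python
# Assumes handle_pipeline_exception from the sketch above is in scope; the
# repository's actual test module and assertions may differ.
import torch


def test_out_of_memory_maps_to_500():
    response = handle_pipeline_exception(torch.cuda.OutOfMemoryError("CUDA out of memory"))
    assert response.status_code == 500


def test_empty_config_message_uses_pipeline_default():
    response = handle_pipeline_exception(
        ValueError("bad argument"), default_error_message="Invalid parameters."
    )
    assert response.status_code == 400
```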
@rickstaa merged commit 40fa0c2 into main on Oct 15, 2024
3 checks passed
@rickstaa deleted the add_all_pipelines_inference_error branch on October 15, 2024 08:19