[Data] Async MapBatches hangs upon exception raised from UDF #47102

Closed
scottjlee opened this issue Aug 12, 2024 · 1 comment · Fixed by #47110
Labels
bug: Something that is supposed to be working; but isn't
data: Ray Data-related issues
P1: Issue that should be fixed within a few weeks

Comments

@scottjlee
Contributor

What happened + What you expected to happen

When a UDF passed to Ray Data's async map_batches API raises an exception, the exception is not surfaced to the caller and the dataset execution hangs. See the reproduction script below for a minimal example.

Versions / Dependencies

ray master b6d4792

Reproduction script

import ray

class MyUDF:
    def __init__(self):
        pass

    async def __call__(self, batch):
        # Fail before yielding anything; this error should reach the driver.
        assert False
        yield batch

ds = ray.data.range(20)
ds = ds.map_batches(MyUDF, concurrency=2)
result = ds.materialize()
assert result.count() == 20

We should expect the script above to terminate by surfacing the AssertionError raised in the UDF, but instead it hangs forever without any indication of an exception or failure.
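To illustrate how a hang like this can arise, here is a minimal sketch that does not use Ray at all. It assumes a design in which the async UDF runs on a background event loop and hands results to a blocking consumer through a queue; that design is an assumption for illustration, not something this issue confirms about Ray Data internals. `SENTINEL`, `failing_udf`, `produce`, and `consume_naively` are hypothetical names, and a timeout stands in for the indefinite hang so the sketch terminates.

```python
import asyncio
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker


async def failing_udf(batch):
    # Stands in for the `assert False` in the reproduction script.
    raise AssertionError("UDF failed")
    yield batch  # unreachable, but makes this an async generator


async def produce(batch, out_q):
    # Naive producer: if the UDF raises, the sentinel is never enqueued.
    async for out in failing_udf(batch):
        out_q.put(out)
    out_q.put(SENTINEL)


def consume_naively(batch, timeout_s=0.5):
    out_q = queue.Queue()
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    # The exception ends up stored on this future, which a naive consumer
    # never inspects while it is blocked on the queue.
    future = asyncio.run_coroutine_threadsafe(produce(batch, out_q), loop)
    try:
        out_q.get(timeout=timeout_s)  # without a timeout this blocks forever
        hung = False
    except queue.Empty:
        hung = True
    error = future.exception(timeout=timeout_s)
    # Shut the background loop down cleanly.
    loop.call_soon_threadsafe(loop.stop)
    thread.join()
    loop.close()
    return hung, error
```

Calling `consume_naively({"id": [0, 1]})` reports that the consumer would have hung, while the `AssertionError` sits unobserved on the future. That matches the symptom described above: no output and no visible failure.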

Issue Severity

None

@scottjlee added the bug, P1, and data labels on Aug 12, 2024
@Bye-legumes
Contributor

@scottjlee I added an exception queue to fix it, and this is what it looks like now.
image
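
For readers unfamiliar with the exception-queue idea, the following is a minimal, Ray-free sketch of the general pattern. It is not the code from the linked fix. It assumes an async generator UDF running on a background event loop that passes results to a blocking consumer through a queue. All names (`SENTINEL`, `produce`, `run_udf`, `ok_udf`, `failing_udf`) are hypothetical.

```python
import asyncio
import queue
import threading

SENTINEL = object()  # hypothetical end-of-stream marker


async def failing_udf(batch):
    raise AssertionError("UDF failed")
    yield batch  # unreachable, but makes this an async generator


async def ok_udf(batch):
    yield batch


async def produce(udf, batch, out_q, err_q):
    try:
        async for out in udf(batch):
            out_q.put(out)
    except Exception as e:
        # Record the failure so the consumer thread can re-raise it.
        err_q.put(e)
    finally:
        # Always wake the consumer, whether the UDF succeeded or failed.
        out_q.put(SENTINEL)


def run_udf(udf, batch):
    out_q, err_q = queue.Queue(), queue.Queue()
    loop = asyncio.new_event_loop()
    thread = threading.Thread(target=loop.run_forever, daemon=True)
    thread.start()
    asyncio.run_coroutine_threadsafe(produce(udf, batch, out_q, err_q), loop)
    results = []
    try:
        while True:
            item = out_q.get()
            if item is SENTINEL:
                break
            results.append(item)
        # The error is enqueued before the sentinel, so it is visible here.
        if not err_q.empty():
            raise err_q.get()
        return results
    finally:
        loop.call_soon_threadsafe(loop.stop)
        thread.join()
        loop.close()
```

With this structure, `run_udf(ok_udf, batch)` returns the yielded batches, while `run_udf(failing_udf, batch)` raises the `AssertionError` in the calling thread instead of blocking. The key design choice is enqueuing the sentinel in a `finally` block, so the consumer is always released.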

simonsays1980 pushed a commit to simonsays1980/ray that referenced this issue Aug 16, 2024