Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-23754][BRANCH-2.3][PYTHON] Re-raising StopIteration in client code #21463

Closed
wants to merge 1 commit into from

Conversation

e-dorigatti
Copy link
Contributor

What changes are proposed

Make sure that StopIterations raised in users' code do not silently interrupt processing by spark, but are raised as exceptions to the users. The users' functions are wrapped in safe_iter (in shuffle.py), which re-raises StopIterations as RuntimeErrors

How were the changes tested

Unit tests, making sure that the exceptions are indeed raised. I am not sure how to check whether a Py4JJavaError contains my exception, so I simply looked for the exception message in the java exception's toString. Can you propose a better way?

This is my original work, licensed in the same way as spark


Author: e-dorigatti [email protected]

Closes #21383 from e-dorigatti/fix_spark_23754.

(cherry picked from commit 0ebb0c0)

Make sure that `StopIteration`s raised in users' code do not silently interrupt processing by spark, but are raised as exceptions to the users. The users' functions are wrapped in `safe_iter` (in `shuffle.py`), which re-raises `StopIteration`s as `RuntimeError`s

Unit tests, making sure that the exceptions are indeed raised. I am not sure how to check whether a `Py4JJavaError` contains my exception, so I simply looked for the exception message in the java exception's `toString`. Can you propose a better way?

This is my original work, licensed in the same way as spark

Author: e-dorigatti <[email protected]>

Closes apache#21383 from e-dorigatti/fix_spark_23754.

(cherry picked from commit 0ebb0c0)
@SparkQA
Copy link

SparkQA commented May 30, 2018

Test build #91306 has finished for PR 21463 at commit 7628936.

  • This patch passes all tests.
  • This patch merges cleanly.
  • This patch adds no public classes.

Copy link
Member

@HyukjinKwon HyukjinKwon left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@HyukjinKwon
Copy link
Member

Merged to branch-2.3.

@e-dorigatti, in case of other branches, it should be manually closed by you.

asfgit pushed a commit that referenced this pull request May 30, 2018
…code

## What changes are proposed
Make sure that `StopIteration`s raised in users' code do not silently interrupt processing by spark, but are raised as exceptions to the users. The users' functions are wrapped in `safe_iter` (in `shuffle.py`), which re-raises `StopIteration`s as `RuntimeError`s

## How were the changes tested
Unit tests, making sure that the exceptions are indeed raised. I am not sure how to check whether a `Py4JJavaError` contains my exception, so I simply looked for the exception message in the java exception's `toString`. Can you propose a better way?

This is my original work, licensed in the same way as spark

---

Author: e-dorigatti <emilio.dorigattigmail.com>

Closes #21383 from e-dorigatti/fix_spark_23754.

(cherry picked from commit 0ebb0c0)

Author: e-dorigatti <[email protected]>

Closes #21463 from e-dorigatti/branch-2.3.
@HyukjinKwon
Copy link
Member

@e-dorigatti, I think this should really be cleaned up in the master branch. Mind if I ask to give a try with fixing it in worker side and revert 0ebb0c0 in the master branch?

@e-dorigatti
Copy link
Contributor Author

@HyukjinKwon go ahead, no problem

@HyukjinKwon
Copy link
Member

oh, I meant I was wondering if you could have some time to give another try by fixing it in worker side. You could revert the current approach in your PR and try the fix in worker side.

@HyukjinKwon
Copy link
Member

HyukjinKwon commented May 31, 2018

Let me leave cc for you @icexelloss or @viirya. Seems we should really clean up in the master (and probably backport it again to branch-2.3) and try it in worker side - I need to take a closer look too. If you guys all happened to be busy, will take a look by myself.

I merged this to unblock the release.

@viirya
Copy link
Member

viirya commented May 31, 2018

@HyukjinKwon I'll spend some time to look at it. But not guarantee how much time. You can work on it if I don't come out a work for it. :)

@e-dorigatti
Copy link
Contributor Author

@HyukjinKwon haha no probs :) I have to open a new PR tho as yesterday I messed up my repos.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants