Support cluster mode in PySpark #2197
Conversation
Tested w/ Spark 2.1
@@ -237,8 +237,6 @@ class PySparkTask(SparkSubmitTask):

     # Path to the pyspark program passed to spark-submit
     app = os.path.join(os.path.dirname(__file__), 'pyspark_runner.py')
-    # Python only supports the client deploy mode, force it
-    deploy_mode = "client"
Was this an intentional deletion? Why not just allow overwrite of deploy_mode?
deploy_mode = "client" overrides deploy_mode in SparkSubmitTask. It was there to force the deploy mode to client, since cluster deploy mode wasn't supported previously. Now that we do support it, there is no need to pin it to client only.
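For illustration, here is a minimal sketch (not part of this PR) of what removing the pin enables: a task subclass can now choose its own deploy mode instead of being forced to client. The class name MyPySparkJob and the YARN settings below are hypothetical, and the exact way deploy_mode is resolved (class attribute vs. luigi's spark config section) may differ between luigi versions.

```python
# Hypothetical sketch: with the client-only pin removed, a PySparkTask
# subclass can pick its own deploy mode, e.g. cluster mode on YARN.
from luigi.contrib.spark import PySparkTask


class MyPySparkJob(PySparkTask):
    # Previously the base class forced deploy_mode = "client"; after this
    # change the value can be overridden here or supplied via configuration.
    deploy_mode = "cluster"  # assumption: attribute override is honored
    master = "yarn"

    def main(self, sc, *args):
        # With cluster deploy mode, this driver code runs inside the cluster
        # rather than on the machine that invoked spark-submit.
        print(sc.parallelize(range(10)).sum())
```

The same values can typically also come from luigi's spark configuration section instead of the task class; treat the attribute names above as assumptions and check the SparkSubmitTask options for your luigi version.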
I don't use this module, but the edits seem logical.
If you wouldn't mind tagging some other PySpark contributors for review, that'd be great!
Thanks!
@jthi3rry @ivannotes @ntim How do you like this PR? I tagged you here because you have contributed to the pyspark class :)
@dlstadther @Tarrasch I remember there used to be a bot that automatically tagged people for reviews. What happened to it?
👍 Looking forward to trying it!
@interskh I don't know what happened to mention-bot. :)
LGTM!
Thanks @interskh!
Description
Support cluster mode in PySpark
Motivation and Context
We want to be able to use cluster mode for PySpark tasks, just like we already can for Spark tasks.
Have you tested this? If so, how?
We have been running it in production for a couple of months.