Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-2470] PEP8 fixes to PySpark #1505

Closed
wants to merge 33 commits into from
Closed

[SPARK-2470] PEP8 fixes to PySpark #1505

wants to merge 33 commits into from

Conversation

nchammas
Copy link
Contributor

This pull request aims to resolve all outstanding PEP8 violations in PySpark.

merging upstream updates
This update gives launched EC2 instances descriptive names by using
instance tags. Launched instances now show up in the EC2 console with
these names.

I used `format()` with named parameters, which I believe is the
recommended practice for string formatting in Python, but which doesn’t
seem to be used elsewhere in the script.
pulling upstream updates
Merge upstream updates
merging upstream updates
Security groups created by spark-ec2 do not prepend “spark-“ to the
name.

Since naming the instances themselves is new to spark-ec2, it’s better
to change that pattern to match the existing naming pattern for the
security groups, rather than the other way around.
Functions in Python should be preceded by 2 blank lines, not 1.
merge upstream updates
@AmplabJenkins
Copy link

Can one of the admins verify this patch?

@rxin
Copy link
Contributor

rxin commented Jul 21, 2014

@davies can you take a look?

@@ -124,5 +125,5 @@ def sampleStdev(self):
return math.sqrt(self.sampleVariance())

def __repr__(self):
return "(count: %s, mean: %s, stdev: %s, max: %s, min: %s)" % (self.count(), self.mean(), self.stdev(), self.max(), self.min())

return "(count: %s, mean: %s, stdev: %s, max: %s, min: %s)" %
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Need ""

@nchammas
Copy link
Contributor Author

OK, I've fixed the problems you pointed out, reverted the changes to cloudpickle, and confirmed that python/run-tests passes.

@davies
Copy link
Contributor

davies commented Jul 22, 2014

Thanks, LGTM.
cc @rxin

@nchammas
Copy link
Contributor Author

Reynold, is there anything else we need to clean up before we can have pep8 checks become part of the CI cycle?

Also, it sounds like we want to make an exception for cloudpickle. So that needs to be made somewhere when we turn this on.

@rxin
Copy link
Contributor

rxin commented Jul 22, 2014

If we have fixed all the problems, let's do it. We should add pep8 check to the Jenkins scripts (in /dev/).

I'm not 100% positive whether our Jenkins instances have pep8 installed. Is there a way to run pep8 if it is not installed in the system by default?

@rxin
Copy link
Contributor

rxin commented Jul 22, 2014

Jenkins, test this please.

@rxin
Copy link
Contributor

rxin commented Jul 22, 2014

Jenkins, add to whitelist.

@SparkQA
Copy link

SparkQA commented Jul 22, 2014

QA tests have started for PR 1505. This patch merges cleanly.
View progress: https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16941/consoleFull

@nchammas
Copy link
Contributor Author

If we have to, we could probably somehow package pep8 and its dependencies as a standalone. It's doable but I think also a bit ugly and harder to update.

As for adding to the Jenkins scripts, shall I take a crack at that in a separate PR?

How are you thinking we approach that? There are at least a couple of things we could do, like:

  • update /dev/run-tests to include pep8 checks directly
  • update /dev/run-tests to call a pythonstyle script under /dev/

We should probably update step 4 under "Contributing Code" in the "Contributing to Spark Guide" to cover Python style as well.

Also, would it make sense to replace sbt/sbt scalastyle with a new check-style script in /dev/ that checks style for both Scala and Python code?

@rxin
Copy link
Contributor

rxin commented Jul 22, 2014

Let's definitely add the pep8 check in a separate PR. I'm waiting for Jenkins to come back positive before merging this pull request.

I think it'd make sense to have a linter script for scala (dev/lint-scala), and a linter script (dev/lint-python) for python so it is easier for devs to run. lint-scala can just call scalastyle, while lint-python calls pep8.

And then run-tests should call both. And yes, let's update the wiki once we have this checked in.

@SparkQA
Copy link

SparkQA commented Jul 22, 2014

QA results for PR 1505:
- This patch PASSES unit tests.
- This patch merges cleanly
- This patch adds no public classes

For more information see test ouptut:
https://amplab.cs.berkeley.edu/jenkins/job/SparkPullRequestBuilder/16941/consoleFull

@rxin
Copy link
Contributor

rxin commented Jul 22, 2014

Merging this in master!

@rxin
Copy link
Contributor

rxin commented Jul 22, 2014

Now I've merged this, do you mind submitting a pep8 checker as part of Jenkins?

@asfgit asfgit closed this in 5d16d5b Jul 22, 2014
@nchammas
Copy link
Contributor Author

do you mind submitting a pep8 checker as part of Jenkins?

Will do. I won't be able to work on this today, but I will open a separate PR for this this week hopefully. Do we need a separate JIRA issue for that btw?

@rxin
Copy link
Contributor

rxin commented Jul 22, 2014

A separate PR and JIRA would be better (since they are for different things). Thanks!

@nchammas
Copy link
Contributor Author

I've created SPARK-2627 to track the Jenkins/CI part of this work. I'll post future updates there.

@rxin rxin mentioned this pull request Jul 23, 2014
xiliu82 pushed a commit to xiliu82/spark that referenced this pull request Sep 4, 2014
This pull request aims to resolve all outstanding PEP8 violations in PySpark.

Author: Nicholas Chammas <[email protected]>
Author: nchammas <[email protected]>

Closes apache#1505 from nchammas/master and squashes the following commits:

98171af [Nicholas Chammas] [SPARK-2470] revert PEP 8 fixes to cloudpickle
cba7768 [Nicholas Chammas] [SPARK-2470] wrap expression list in parentheses
e178dbe [Nicholas Chammas] [SPARK-2470] style - change position of line break
9127d2b [Nicholas Chammas] [SPARK-2470] wrap expression lists in parentheses
22132a4 [Nicholas Chammas] [SPARK-2470] wrap conditionals in parentheses
24639bc [Nicholas Chammas] [SPARK-2470] fix whitespace for doctest
7d557b7 [Nicholas Chammas] [SPARK-2470] PEP8 fixes to tests.py
8f8e4c0 [Nicholas Chammas] [SPARK-2470] PEP8 fixes to storagelevel.py
b3b96cf [Nicholas Chammas] [SPARK-2470] PEP8 fixes to statcounter.py
d644477 [Nicholas Chammas] [SPARK-2470] PEP8 fixes to worker.py
aa3a7b6 [Nicholas Chammas] [SPARK-2470] PEP8 fixes to sql.py
1916859 [Nicholas Chammas] [SPARK-2470] PEP8 fixes to shell.py
95d1d95 [Nicholas Chammas] [SPARK-2470] PEP8 fixes to serializers.py
a0fec2e [Nicholas Chammas] [SPARK-2470] PEP8 fixes to mllib
c85e1e5 [Nicholas Chammas] [SPARK-2470] PEP8 fixes to join.py
d14f2f1 [Nicholas Chammas] [SPARK-2470] PEP8 fixes to __init__.py
81fcb20 [Nicholas Chammas] [SPARK-2470] PEP8 fixes to resultiterable.py
1bde265 [Nicholas Chammas] [SPARK-2470] PEP8 fixes to java_gateway.py
7fc849c [Nicholas Chammas] [SPARK-2470] PEP8 fixes to daemon.py
ca2d28b [Nicholas Chammas] [SPARK-2470] PEP8 fixes to context.py
f4e0039 [Nicholas Chammas] [SPARK-2470] PEP8 fixes to conf.py
a6d5e4b [Nicholas Chammas] [SPARK-2470] PEP8 fixes to cloudpickle.py
f0a7ebf [Nicholas Chammas] [SPARK-2470] PEP8 fixes to rddsampler.py
4dd148f [nchammas] Merge pull request apache#5 from apache/master
f7e4581 [Nicholas Chammas] unrelated pep8 fix
a36eed0 [Nicholas Chammas] name ec2 instances and security groups consistently
de7292a [nchammas] Merge pull request apache#4 from apache/master
2e4fe00 [nchammas] Merge pull request apache#3 from apache/master
89fde08 [nchammas] Merge pull request apache#2 from apache/master
69f6e22 [Nicholas Chammas] PEP8 fixes
2627247 [Nicholas Chammas] broke up lines before they hit 100 chars
6544b7e [Nicholas Chammas] [SPARK-2065] give launched instances names
69da6cf [nchammas] Merge pull request apache#1 from apache/master
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants