Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[AIRFLOW-5373] Super fast pre-commit check for basic python2 compatib… #5979

Conversation

potiuk
Copy link
Member

@potiuk potiuk commented Sep 2, 2019

…ility

Make sure you have checked all steps below.

Jira

Description

  • Here are some details about my PR, including screenshots of any UI changes:

Tests

  • My PR adds the following unit tests OR does not need testing for this extremely good reason:

Commits

  • My commits all reference Jira issues in their subject lines, and I have squashed multiple commits if they address the same issue. In addition, my commits follow the guidelines from "How to write a good git commit message":
    1. Subject is separated from body by a blank line
    2. Subject is limited to 50 characters (not including Jira issue reference)
    3. Subject does not end with a period
    4. Subject uses the imperative mood ("add", not "adding")
    5. Body wraps at 72 characters
    6. Body explains "what" and "why", not "how"

Documentation

  • In case of new functionality, my PR adds documentation that describes how to use it.
    • All the public functions and the classes in the PR contain docstrings that explain what it does
    • If you implement backwards incompatible changes, please leave a note in the Updating.md so we can assign it to a appropriate release

Code Quality

  • Passes flake8

@potiuk potiuk requested review from ashb and kaxil September 2, 2019 01:14
@potiuk
Copy link
Member Author

potiuk commented Sep 2, 2019

@kaxil @ashb -> that might be a life-saver for cherry-picking to v1-10-test branch. Not perfect but I think even such simple regexp can save us hours of waiting for CI.

@potiuk
Copy link
Member Author

potiuk commented Sep 2, 2019

Running pre-commit run python2-fastcheck --all-files takes milliseconds

@potiuk potiuk force-pushed the add-python2-compatibility-basic-check branch 2 times, most recently from 6067491 to 8bcd287 Compare September 2, 2019 01:32
@potiuk
Copy link
Member Author

potiuk commented Sep 2, 2019

I also figured that we can simply do py_compile on all files. It works great too.

@potiuk
Copy link
Member Author

potiuk commented Sep 2, 2019

I think both together works beautifully and can save us many hours https://travis-ci.org/apache/airflow/jobs/579595829 - if you configure them as pre-commit checks especially all those problems will not even leave your local sources (I have both pre-commit and pre-push configured and it works great as long as we merge #5972

@potiuk
Copy link
Member Author

potiuk commented Sep 2, 2019

We can even add python3 py_compile in master as well - I think it's rather nice way of catching basic syntax errors.

@potiuk potiuk force-pushed the add-python2-compatibility-basic-check branch from 8bcd287 to b6df089 Compare September 2, 2019 01:46
@potiuk
Copy link
Member Author

potiuk commented Sep 2, 2019

@kaxil @ashb -> WDYT ? I think it can be really helpful for further cherry-picking.

@@ -198,7 +198,7 @@ def compare_result(ds, **kwargs):

/** Computes an approximation to pi */
object SparkPi {
def main(args: Array[String]) {
def main(args) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This one is a false positive -- this is Scala code :D

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Aaaaah indeed. So I need to add a a possibility to exclude such false positives. Will do soon.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks like https://pre-commit.com/#regular-expressions to exclude the whole file is the only way to do this one. A bit overly broad but this file isn't changed very often so that's probably okay. That or adjust the regex to not allow ) { on the line (trickier regex to get that working)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually I thought about writing a bit more complex regex to exclude lines which have the standard # noqa in the same line - so that you can explicitly disable it if needed.

Here is the regex that will likely work :)

^(?!.*# noqa)(.*\.super\(\)|^\s*def\s*\S*\([^):]*:.*|^\sdef\s*\S*\(.*\):\s*\-\>\s*\S*).*$

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

And likely with (?x) I can write in multiple lines to make it easier to read :)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

But I also remember about "you have one problem - introduce regexp - then you will have two problems" saying, so maybe it's even worth to write our own simple hook for that purpose if you think it's too "nerdy" with the regexp.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I used to work as a Perl developer for about 5 years, so I am in favour of regeps :D You can take the programmer out of perl, but not the perl out of the programmer.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If each pygrep test is quick, might it be easier to split the super case from the type annotation case into separate rules?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually if you are a perl programmer - then you will find the final result well.. refreshing :).

I found that with (?x) extension you can not only write elegant and pretty "modular" code but also one that can contain comments (!).

Also the exclusion rules can be rather elegant :).

Surprisingly I have now regexp that I am actually proud of!

@potiuk potiuk force-pushed the add-python2-compatibility-basic-check branch from b6df089 to b748328 Compare September 3, 2019 14:45
@potiuk
Copy link
Member Author

potiuk commented Sep 3, 2019

Hold on. Just checking super() matching :)

@potiuk potiuk force-pushed the add-python2-compatibility-basic-check branch from b748328 to c11c4f8 Compare September 3, 2019 15:06
@potiuk
Copy link
Member Author

potiuk commented Sep 3, 2019

Now it's perfect. It detects 846 Python3-specific lines in master and 0 in v1-10-test :)

Copy link
Member

@kaxil kaxil left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@potiuk potiuk merged commit 00b50e9 into apache:v1-10-test Sep 3, 2019
@potiuk potiuk deleted the add-python2-compatibility-basic-check branch September 19, 2019 22:44
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants