Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Enable More PyDocStyle Checks #10742

Closed
6 of 10 tasks
kaxil opened this issue Sep 5, 2020 · 67 comments
Closed
6 of 10 tasks

Enable More PyDocStyle Checks #10742

kaxil opened this issue Sep 5, 2020 · 67 comments
Labels
contributors-workshop Issues that are good for first-time contributor's workshop good first issue kind:feature Feature Requests

Comments

@kaxil
Copy link
Member

kaxil commented Sep 5, 2020

We use PyDocStyle in pre-commit to enforce docstring style

We use Ruff to enforce docsting style

airflow/pyproject.toml

Lines 47 to 74 in 23b8e83

extend-select = [
"I", # Missing required import (auto-fixable)
"UP", # Pyupgrade
"RUF100", # Unused noqa (auto-fixable)
# implicit single-line string concatenation
"ISC001",
# We ignore more pydocstyle than we enable, so be more selective at what we enable
"D101",
"D106",
"D2",
"D3",
"D400",
# "D401", # Not enabled by ruff, but we don't want it
"D402",
"D403",
"D412",
"D419"
]
extend-ignore = [
"D203",
"D205",
"D212",
"D213",
"D214",
"D215",
"E731",
]

We follow pep257 style (http://www.pydocstyle.org/en/stable/error_codes.html) for checks.

Currently, we ignore the following rules:

The task is to complete the following missing rules:

Missing Docstrings

Whitespace Issues

Docstring Content Issues

It would be good if we can enable them one by one -- separate PRs are ok

Let me know if you need any help

@pcandoalmeida
Copy link
Contributor

I've taken:

  • D204

Might be an idea @kaxil to add a checklist for all rules needing updating in the initial issue comment. This way potential contributors can know which ones are done and which are still open to do? Just a quality of life idea so we don't have to look at the PRs individually.

@kaxil
Copy link
Member Author

kaxil commented Sep 19, 2020

@pcandoalmeida That's a good idea, I updated the description and added a check-box.

@pcandoalmeida
Copy link
Contributor

Hi @kaxil I'm working on D202 now.

@pcandoalmeida
Copy link
Contributor

Morning @kaxil what do you think about adding the Hacktoberfest label to this issue?

@kaxil
Copy link
Member Author

kaxil commented Sep 25, 2020

Morning @kaxil what do you think about adding the Hacktoberfest label to this issue?

Good idea again, just added :)

@mik-laj mik-laj added the kind:feature Feature Requests label Sep 30, 2020
@ktrueda
Copy link

ktrueda commented Oct 10, 2020

Hi I wanna try D205. Can I try this ?

@ktrueda
Copy link

ktrueda commented Oct 11, 2020

I started to fix D205 (1 blank line required between summary line and description).
It's difficult because a lot of docs has not summary line.

For example,

def remove_option(self, section, option, remove_default=True):
"""
Remove an option if it exists in config from a file or
default config. If both of config have the same option, this removes
the option in both configs unless remove_default=False.
"""

This docs has no summary line and there are a lot of docs like this.
I cannot make summary line for all these docs.
If I deal all docs as summary line, the doc obey D205. But it is not easy to read.

How can I do that? If someone has any idea, please help me.

@kaxil
Copy link
Member Author

kaxil commented Oct 11, 2020

@ktrueda You can do 2 things:

  1. Add a summary line:
Remove an option if it exists in config from a file or 
default config. 

If both of config have the same option, this removes 
the option in both configs unless remove_default=False.
  1. Only update blank line for which docstrings exists. In this case, we won't be able to enable D205 for all of them but when in future we have summary lines, it would make the word easier.

I would prefer (1) though

@ktrueda
Copy link

ktrueda commented Oct 12, 2020

@kaxil Thank you for your help.
I would prefer (1) too. So I'll try that way.

Since I found thousands of docs to fix, I need much times.
Leave it to me.

@potix2
Copy link
Contributor

potix2 commented Oct 20, 2020

@kaxil Can I take D200?

@pcandoalmeida
Copy link
Contributor

Hiya 👋🏼 @potix2 I believe D200 is currently being worked on 😃

@potix2
Copy link
Contributor

potix2 commented Oct 20, 2020

@pcandoalmeida Yeah, I'm working on it. I'll send a pull request soon.

@potix2
Copy link
Contributor

potix2 commented Oct 23, 2020

I'm working on D400.

@potiuk potiuk added contributors-workshop Issues that are good for first-time contributor's workshop and removed contributors-workshop Issues that are good for first-time contributor's workshop labels Jul 6, 2021
@peter0083
Copy link

peter0083 commented Sep 26, 2021

I would like to tackle this issue if it hasn't been assigned recently. Thanks! @uranusjr :)

@uranusjr
Copy link
Member

Go ahead 👍 Note that the ignore list above is out of date; please find the correct list in the pre-commit config file on main.

@ferruzzi
Copy link
Contributor

ferruzzi commented Feb 5, 2024

No, holidays came and distracted me, I never finished the changes.

@Taragolis
Copy link
Contributor

Just finish calculation other violations of D100-D107

Rule Total Errors Uniq Core Modules Uniq Providers Modules Uniq Other Modules
D100 767 303 463 1
D102 2202 117 474 1
D104 104 53 402 0
D105 237 55 21 0
D107 1557 127 402 0

Tip

D104 for providers packages should be resolved by change PROVIDER__INIT__PY_TEMPLATE.py.jinja2 IN the dev/breeze/src/airflow_breeze/templates DIRECTORY

If someone interested to look on possible exclusion lists, see attached archive ruff-exclusions.zip

@Taragolis
Copy link
Contributor

In theory we could enable all rules #10742 (comment) but place exclusions into the separate file, e.g. in ruff.toml

extend = "pyproject.toml"

[lint.extend-per-file-ignores]
"airflow/api/auth/backend/kerberos_auth.py" = ["D100", "D107"]
"airflow/api/common/airflow_health.py" = ["D100"]
"airflow/api_connexion/endpoints/config_endpoint.py" = ["D100"]

# Another 1500 lines of exclusions

"airflow/providers/teradata/hooks/teradata.py" = ["D107"]
"airflow/providers/yandex/secrets/lockbox.py" = ["D107"]

"hatch_build.py" = ["D100", "D102"]

@rawwar
Copy link
Collaborator

rawwar commented Feb 16, 2024

In theory we could enable all rules #10742 (comment) but place exclusions into the separate file, e.g. in ruff.toml

extend = "pyproject.toml"

[lint.extend-per-file-ignores]
"airflow/api/auth/backend/kerberos_auth.py" = ["D100", "D107"]
"airflow/api/common/airflow_health.py" = ["D100"]
"airflow/api_connexion/endpoints/config_endpoint.py" = ["D100"]

# Another 1500 lines of exclusions

"airflow/providers/teradata/hooks/teradata.py" = ["D107"]
"airflow/providers/yandex/secrets/lockbox.py" = ["D107"]

"hatch_build.py" = ["D100", "D102"]

Can we just update it in the current pyproject.toml? I would like to contribute and I'll finish them off one by one

@potiuk
Copy link
Member

potiuk commented Feb 16, 2024

Can we just update it in the current pyproject.toml? I would like to contribute and I'll finish them off one by one

Oh absolutley. Moving it out is just theorethically possible, but we do not want to do it (or at least we have not decided about it yet. I personally think there is a value in keeping all project settings in a single pyproject.toml.

@rawwar
Copy link
Collaborator

rawwar commented Feb 16, 2024

Oh absolutley. Moving it out is just theorethically possible, but we do not want to do it (or at least we have not decided about it yet. I personally think there is a value in keeping all project settings in a single pyproject.toml.

I was just trying to do this. But, I see an issue with updating pyproject.tom

one of the module "airflow/api/auth/backend/kerberos_auth.py" has D100 issue and also E402. Since, we are adding separate sections with module as a key in pyproject.toml, that's causing a conflict.

As of now, I only see three other modules causing above conflict:

airflow/security/kerberos.py
airflow/security/permissions.py
airflow/security/utils.py

@rawwar

This comment was marked as outdated.

@potiuk
Copy link
Member

potiuk commented Feb 16, 2024

We could combine several D*s if we want in one row. I don't think there is a particular reason why we want to keep them in separate sections. @Taragolis ?

@rawwar
Copy link
Collaborator

rawwar commented Feb 16, 2024

We could combine several D*s if we want in one row. I don't think there is a particular reason why we want to keep them in separate sections. @Taragolis ?

Issue is that same module have D*s and also E*s.

@potiuk
Copy link
Member

potiuk commented Feb 16, 2024

Issue is that same module have Ds and also Es.

We could combine D*'s and E*s together as well.

@Taragolis
Copy link
Contributor

We could combine several D*s if we want in one row. I don't think there is a particular reason why we want to keep them in separate sections. @Taragolis ?

I don’t think it is even possible not to combine them if we want to enable all of them. Only problem to create issues with all of them

@rawwar
Copy link
Collaborator

rawwar commented Feb 16, 2024

I don’t think it is even possible not to combine them if we want to enable all of them. Only problem to create issues with all of them

I just raised this draft PR - #37469 . can you please take a look. ruff did ignore multiple lint errors in a single line.

@Taragolis
Copy link
Contributor

And personally not sure about D107

ahidalgob pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue May 15, 2024
Part of apache/airflow#10742

D205 asserts that all docstrings must have a one-line summary ending in a period.  If there is more than one sentence then there must be a blank line before the rest of the docstring.  Meeting these requirements could be as simple as adding a newline, or might require some rephrasing.

There are almost a thousand violations in the repo so we're going to have to take this in bites.

### PLEASE NOTE

There should be zero logic changes in this PR, only changes to docstrings and whitespace.  If you see otherwise, please call it out.

### Included in this chunk

All files in the `airflow/jobs` module.

### To test

If you comment out [this line](https://github.com/apache/airflow/blob/main/pyproject.toml#L68) and run pre-commit in main you will get 284 errors.  After these changes, "only" 269 remain and no files in the dag_processing folder should be on the list.  After uncommenting that line and rerunning pre-commits, there should be zero regressions.

GitOrigin-RevId: 156b85ad85ded6131e0902a4e00c27506eee62f3
@ferruzzi
Copy link
Contributor

I know this is ancient, but checking back in on it. It appears that the remaining rules on the checklist are no longer ignored. Searching the current pyproject.toml doesn't find D100, D102, D104, and D107. I think maybe we can call this done, or are there new rules we'd like to target now?

@potiuk
Copy link
Member

potiuk commented Jul 2, 2024

Agree.

@potiuk potiuk closed this as completed Jul 2, 2024
@ferruzzi
Copy link
Contributor

ferruzzi commented Jul 2, 2024

@potiuk Sorry, I started a slack thread and did a bit of a deep dive on this but I never followed up here.

Since the "D1" prefix is not enabled, the D100, D102, D104, and D107 don't need to be explicitly excluded. So currently, they are not actually being checked, but it isn't obvious that they aren't and I mistook the fact they aren't excluded to mean they were running.

TLDR: If we want those rules enabled, then this is not actually done.

Proposal:
A new issue (since this one is N years old) and either explicitly exclude those or add a comment in the code indicating we'd like to enable them so people can see them as tasks needing to be done and not make the same mistake I did.

@potiuk
Copy link
Member

potiuk commented Jul 2, 2024

A new issue (since this one is N years old) and either explicitly exclude those or add a comment in the code indicating we'd like to enable them so people can see them as tasks needing to be done and not make the same mistake I did.

+1

kosteev pushed a commit to GoogleCloudPlatform/composer-airflow that referenced this issue Sep 19, 2024
Part of apache/airflow#10742

D205 asserts that all docstrings must have a one-line summary ending in a period.  If there is more than one sentence then there must be a blank line before the rest of the docstring.  Meeting these requirements could be as simple as adding a newline, or might require some rephrasing.

There are almost a thousand violations in the repo so we're going to have to take this in bites.

### PLEASE NOTE

There should be zero logic changes in this PR, only changes to docstrings and whitespace.  If you see otherwise, please call it out.

### Included in this chunk

All files in the `airflow/jobs` module.

### To test

If you comment out [this line](https://github.com/apache/airflow/blob/main/pyproject.toml#L68) and run pre-commit in main you will get 284 errors.  After these changes, "only" 269 remain and no files in the dag_processing folder should be on the list.  After uncommenting that line and rerunning pre-commits, there should be zero regressions.

GitOrigin-RevId: 156b85ad85ded6131e0902a4e00c27506eee62f3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
contributors-workshop Issues that are good for first-time contributor's workshop good first issue kind:feature Feature Requests
Projects
None yet
Development

No branches or pull requests