Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

BUG: doesn't recognize docstring if closing bracket of function is black-style #27

Closed
peterprescott opened this issue Mar 16, 2022 · 11 comments
Assignees

Comments

@peterprescott
Copy link
Contributor

Hi, I really like the package you've made, and our team is looking at using it to enforce well-written docstrings before new merges are made to the master branch of our codebase.

But we also use black to format all our code, and there's a frustrating incompatibility between the way black formats a function with too many parameters to fit on a line, and the way numdoclint reads that function and its docstring.

(Or at least for a class method, which is the specific example I'm looking at).

Specifically, we have this method:

    def decompress(
        raw_string: str, base64_decode: bool = True, json_load: bool = True
    ) -> str:
        """
        Decompress raw string...

and black makes the closing bracket of the function's parameters line up with the initial def keyword, while numdoclint doesn't seem to recognize the docstring unless the closing bracket is indented like parameters on the line above, eg.

    def decompress(
        raw_string: str, base64_decode: bool = True, json_load: bool = True
        ) -> str:
        """
        Decompress raw string...

Full example here: https://asciinema.org/a/477219?speed=3

If this could be fixed that would be much appreciated!

@peterprescott
Copy link
Contributor Author

The problem is with numdoclint.helper.get_func_overall_docstring, which fails to detect any docstring after black has made it so "closing brackets are always dedented".

@peterprescott
Copy link
Contributor Author

the problem is this bit:

    if (line_indent_num < indent_num and line_str != ''
            and line_str.strip() != '):'
            and not is_docstring_line
            and not is_docstring_last_line):
        break

@peterprescott
Copy link
Contributor Author

instead of just and line_str.strip() != '):' we need a more general test for whether the line_str is the function's closing bracket and colon, allowing for the possibility that there might be a type hint, eg. ') -> str:'

@peterprescott
Copy link
Contributor Author

peterprescott commented Mar 16, 2022

actually, the easiest thing is probably just to use inspect.getdoc() from the standard library instead of numdoclint.helper.get_func_overall_docstring(), although that is sufficiently different from your approach of working with the text of the module as a string, that a simple adjustment is not immediately obvious...

@peterprescott
Copy link
Contributor Author

instead of and line_str.strip() != '):', we need to do something like and not is_end_of_signature(line_str), with

def is_end_of_signature(line_str: str) -> bool:
    try:
        line_str = line_str.strip()
        assert line_str[0] == ')'
        assert line_str[-1] == ':'

        if len(line_str)>2:
            line_str = line_str.replace(' ','')
            assert line_str[1:3] == '->'
        return True
    except Exception as e:
        return False

and then we also need to adjust helper.return_val_exists_in_func()...

@peterprescott
Copy link
Contributor Author

peterprescott commented Mar 17, 2022

Okay, I've used that in the .get_func_str() and the get_func_overall_docstring() methods, and it seems to fix my problems. So I've made a PR (#28).

@peterprescott
Copy link
Contributor Author

peterprescott commented Mar 17, 2022

Ah, there's also a problem with helper.get_arg_name_list()... Given this signature:

    def __init__(
        self,
        in_units: int,
        out_units: int,
        hidden_layers: List[HiddenLayerInfo] = None,
        num_epochs: int = 10000,
        validation_split: float = 0.2,
        optimizer: Union[str, Optimizer] = Adam(learning_rate=0.001),
        loss: Union[str, Loss] = "mse",
        train_batch_size: int = 1,
        scaler: Optional[TransformerMixin] = None,
    ):

it loses the last three parameters.

@peterprescott
Copy link
Contributor Author

... I think because the regex of _get_args_str() ends at the closing bracket, assuming that the first closing bracket must be the function's bracket, but ignoring the possibility (as here), that it is part of another function call present as a default value...

@peterprescott
Copy link
Contributor Author

peterprescott commented Mar 21, 2022

I started work using the Python abstract syntax tree (ast) module to get information from the function signature, which I think should be more robust than the current approach, but unfortunately I don't have the time to get it properly integrated with your code, so I'll have to leave this to you now!

@simon-ritchie
Copy link
Owner

Hi, @peterprescott

Sorry, for some reason my GitHub's watch setting was off and notifications didn't fly out and I was very late to notice. Will check the details when I get home today!

@simon-ritchie simon-ritchie self-assigned this Jun 1, 2022
simon-ritchie added a commit that referenced this issue Jun 1, 2022
@simon-ritchie
Copy link
Owner

I've merged and deployed v0.1.9!
Thx lots for posting the issue and pull requests!

Possibly I will rewrite this lib with the ast package that you proposed. However, I cannot say for sure as I'm currently focusing on work on another lib

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants