Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Functions yielding wrongly report errors about Return: being missing #15

Closed
leandro-lucarella-frequenz opened this issue Jun 1, 2023 · 12 comments · Fixed by #23
Closed

Comments

@leandro-lucarella-frequenz

When functions yield instead of return, a Yield: should be used in the docstring (at least in google-style) instead of Return:, but pydoclint complains about a missing Return:.

For example this code: https://github.com/frequenz-floss/frequenz-sdk-python/blob/776352c0e9383506a778751560a3e21e91e2d6fd/src/frequenz/sdk/timeseries/_base_types.py#L52-L60

    def __iter__(self) -> Iterator[float | None]:
        """Return an iterator that yields values from each of the phases.

        Yields:
            Per-phase measurements one-by-one.
        """
        yield self.value_p1
        yield self.value_p2
        yield self.value_p3

Fails with:

src/frequenz/sdk/timeseries/_base_types.py
    52: DOC201: Method `Sample3Phase.__iter__` does not have a return section in docstring

When running: pydoclint --style google -aid True -th False src/ (for pydoclint 0.0.6)

@jsh9
Copy link
Owner

jsh9 commented Jun 1, 2023

Hi @leandro-lucarella-frequenz , I think the more accurate return type of this __iter__ method is Generator[float | None, None, None], rather than an Iterator. A Generator is always an Iterator, but not the other way around:

>>> from typing import Iterator, Generator
>>> issubclass(Generator, Iterator)
True
>>> issubclass(Iterator, Generator)
False

pydoclint checks the return annotation of methods/functions, and if the return annotation is Generator, the docstring doesn't need a "Returns" section, but rather a "Yields" section.

@leandro-lucarella-frequenz
Copy link
Author

This goes against the google style. The google-style says Yields: must be used for iterators (and it makes sense, having the Generator there is just noise really.

Fine. Use “Yields:” rather than “Returns:” in the docstring for generator functions.

https://google.github.io/styleguide/pyguide.html#294-decision

@leandro-lucarella-frequenz
Copy link
Author

Also Python typing documentation says that Iterator can be used for generators:

Alternatively, annotate your generator as having a return type of either Iterable[YieldType] or Iterator[YieldType]:

def infinite_stream(start: int) -> Iterator[int]:
    while True:
        yield start
        start += 1

https://docs.python.org/3/library/typing.html#typing.Generator

@jsh9
Copy link
Owner

jsh9 commented Jun 2, 2023

Hi @leandro-lucarella-frequenz , let me clarify this a bit more. I might have been unclear before.

The reason why we are seeing this violation ("DOC201: Method Sample3Phase.__iter__ does not have a return section in docstring") is: pydoclint sees a return type annotation of Iterator[...], so it thinks that the function should be returning something (rather than yielding something).

Currently, only when pydoclint sees a return annotation of Generator[.......] does it not expect a "Returns:" section in the docstring.

I could make a small code change so that when pydoclint sees Iterator[...] or Generator[....] and it doesn't expect a "Returns:" section, but this would introduce a new risk:

from typing import Any, Iterator, Tuple, List

def zip_lists(list1: List[Any], list2: List[Any]) -> Iterator[Tuple[Any, Any]]:
    return zip(list1, list2)

Look at this function zip_lists(). It's returning an iterator rather than a generator, and it's not yielding anything. So the return type annotation should be Iterator[...].

If I implemented the change I suggested above, this function would fall through the crack and become unchecked by pydoclint.

Do you have any suggestions on how to solve this dilemma?

@leandro-lucarella-frequenz
Copy link
Author

Hi @jsh9 , thanks for the clarification.

Yes! I do have a suggestion for your dilemma :)

I think what pydoclint should look at is also the yield statement. If a function returns an Iterator or Iterable and have yield instead of return, then and only then it should be considered a generator instead of a function returning an Iterator or Iterable, and the documentation should be required to use Yield: instead of Return:. To be honest, I'm not even sure if you need to look at the types at all, any function yielding is really a generator, no matter how you decided to type-hint it.

That said, I'm not sure how complicated implementing this could be, but darglint does this correctly, so it should be possible.

BTW, same applies to AsyncIterator/ables and AsyncGenerators.

@jsh9
Copy link
Owner

jsh9 commented Jun 2, 2023

Thanks! I don't think this would be very difficult to implement.

Btw, how do you suggest that I deal with functions that both yield something and return something? Shall we require a "Returns:" section in the docstring if there's a return statement?

It's probably not the best practice but I've seen code that does this.

Btw I asked ChatGPT about this, and it gave me some concrete examples:


Question

In Python, can I technically both return something and yield something in the same function?

Answer

Yes, in Python, you can technically include both return and yield in the same function. However, they serve very different purposes, and their combined use in the same function is typically seen as a bad practice.

yield is used to create a generator, which is a special kind of iterator that only generates its values on-the-fly as you request them, rather than all at once. This can be very efficient for large data sets, as it allows the generator to produce values one at a time, using only a small amount of memory.

return, on the other hand, is used to send a value back to the caller and end the execution of the function.

When you use yield in a function, that function becomes a generator function, and when it's called, it returns a generator object, rather than running the function's code. The function's code is only run when you iterate over the generator object.

When a return statement is executed in a generator function, it raises a StopIteration exception, signaling that the generator is exhausted. If the return statement includes a value (as in return x), that value becomes the argument to the StopIteration exception.

Here's an example:

def func():
    yield 1
    yield 2
    return 3

gen = func()  # This returns a generator object, it doesn't run the function's code

print(next(gen))  # This prints 1, the first value yielded by the generator
print(next(gen))  # This prints 2, the second value yielded by the generator
print(next(gen))  # This raises a StopIteration exception with 3 as its argument

In the last line, trying to get the next value from the generator causes the return 3 statement to be executed, which raises a StopIteration exception. You can catch this exception to get the value returned by the function:

try:
    print(next(gen))
except StopIteration as e:
    print(e.value)  # This prints 3, the value returned by the function

In general, it's best to only use yield in a function if you intend for that function to be a generator function. If you want a function to compute a value and return it, use return. Mixing the two can be confusing and make your code harder to understand.

@leandro-lucarella-frequenz
Copy link
Author

Btw, how do you suggest that I deal with functions that both yield something and return something? Shall we require a "Returns:" section in the docstring if there's a return statement?

Require both Yields: and Return:?

@leandro-lucarella-frequenz
Copy link
Author

I think you should document what the code does, if it does both, document both.

@jsh9
Copy link
Owner

jsh9 commented Jun 6, 2023

Hi @leandro-lucarella-frequenz , I fixed this issue. It's now included in the newest release (v0.0.8). Please feel free to take a look!

@llucax
Copy link
Contributor

llucax commented Jun 8, 2023

Thanks! I'm a bit backed up, so I might not be able to try it this week. BTW, I'm using a new gh handle but I'm the same guy :D

@llucax
Copy link
Contributor

llucax commented Aug 7, 2023

Hi @jsh9, sorry for the radio silence for so long, your support didn't go unnoticed at all, I just really needed to take on some other high-priority topics, but I think I will have time this week to resume my efforts to migrate from darglint to pydoclint.

@llucax
Copy link
Contributor

llucax commented Aug 7, 2023

OK, I was able to test this again with some of our (open source) projects and it works great! Thanks a lot, it looks like the project had a lot of improvements.

We are still hitting 2 blockers, hopefully that's all and we can finally switch:

We are very eager for the switch, pydoclint already found quite a few errors that were undetected by darglint!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants