Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

generators should lower to Iterators.map, not Base.Generator #34368

Open
bramtayl opened this issue Jan 14, 2020 · 12 comments
Open

generators should lower to Iterators.map, not Base.Generator #34368

bramtayl opened this issue Jan 14, 2020 · 12 comments
Labels
compiler:lowering Syntax lowering (compiler front end, 2nd stage) help wanted Indicates that a maintainer wants help on an issue or pull request

Comments

@bramtayl
Copy link
Contributor

This will allow dispatching to types that aren't always Base.Generator

@tkf
Copy link
Member

tkf commented Jan 14, 2020

Thanks for opening this. I think the same holds for Iterators.filter, Iterators.flatten, and Iterators.product.

@JeffBezanson
Copy link
Sponsor Member

Why?

@bramtayl
Copy link
Contributor Author

It would be nice to be able to define function Iterators.map(a_function, thing::SpecialTypeOfArray) to return something other than a Generator. Of course, we can currently overload the Generator constructor to make something that's not a Generator, but it seems to violate the principle of least surprise. @tkf what was your reasoning?

@JeffBezanson
Copy link
Sponsor Member

Why do you need it to return something other than a Generator? A generator just contains the function and iterator, so no information is lost.

@tkf
Copy link
Member

tkf commented Jan 14, 2020

@bramtayl That's my thought too.

More specifically, for example, Iterators.map(very_special_function, ::SpecialTypeOfArray) can return a lazy AbstractArray if you have a compiler-agnostic way of computing output type of the very_special_function. Likewise, Iterators.product can easily return a lazy array, too. Iterators.flatten can also return an array if the input is an array of static arrays. OTOH, Generator cannot be an array subtype as it has to be able wrap other things (e.g., Iterators.filter)

Another usecase is to overload Iterators.map(f, ::MyTable) :: MyLazyTable. You can then overload show of MyLazyTable to preview computed result like Query is doing.

(But maybe the first point should be resolved using traits, as discussed in the ArrayLike PR #34196)

@rapus95
Copy link
Contributor

rapus95 commented Feb 5, 2020

As I understand it right now, that'll easen the implementation of all types of operators which intend to propagate something. For example a caching iterator which automatically caches the outermost iteration call. Even if it's used as the innermost starter. This would work by dispatching on

Iterators.map(f, c::Caching) = cache(f(x) for x in c.iter)

where cache(iter) creates the caching layer.

Now, (f(x) for x in cache(1:10)) would be the same as cache(f(x) for x in 1:10)

Also, it'd allow custom return types which in turn could implement arbitrary traits. Currently, as I understand it, you can only implement that by either not using the generator syntax or by specializing the trait for ::Generator{myf, myiter}

Edit: We'd even be able to explicitly fuse multiple Generator layers (instead of nesting them):

Iterators.map(f, c::Generator) = ((fc.f)(x) for x in c.iter)

(Sure that already could be done aswell but it'd fit into the same generic style very well!)

@andyferris
Copy link
Member

andyferris commented Jun 18, 2020

I support multiple dispatch for construction of generic generators based on lowering to Iterators.map.

A container has a larger API surface than iteration, and if Base.map can conserve these features I don't know why generators and Iterators.map wouldn't also try to conserve indexing and other rich and useful behaviors on the lazily-mapped container. The first example that comes to mind is that a generator of arrays or dictionaries could return something indexable - ideally closely adhering to interfaces like AbstractArray or AbstractDict or, in my case, Dictionaries.AbstractDictionary. (There is the issue of it being ElTypeUnknown, of course...)

@tkf
Copy link
Member

tkf commented Jun 19, 2020

Here is another argument: If you want to define a special method for (f(x) for x in xs), it is kind of possible to do this with dispatching on Generator{<:MyContainer}. However, it does not scale well because Generator can be nested; e.g., dispatching on (g(y) for y in (f(x) for x in xs)) requires Generator{<:Generator{<:MyContainer}}. Lowering to Iterators.map helps because it lets you use arbitrary container type.

@rapus95
Copy link
Contributor

rapus95 commented Jun 19, 2020

Why is that:

julia> Iterators.map(x->x^2, 1:10) isa Base.Generator
false

julia> Iterators.map==map
true

I mean, shouldn't Iterators.map be the lazy variant in the first place? Otherwise lowering the generator syntax to Iterators.map would be very game changing in the first place since it wouldn't be lazy anymore. So either way, that should be changed even before changing the lowering IMO.

@tkf
Copy link
Member

tkf commented Jun 19, 2020

This is because Iterators.map is implemented in #34352 which just got merged. So, Iterators.map === Base.map is false in Julia >= 1.6 and true in Julia <= 1.5. This PR is what to do after #34352. As you said, (f(x) for x in xs) should always be lazy.

@rapus95
Copy link
Contributor

rapus95 commented Aug 14, 2020

What are the chances that this will make it into LTS (=1.6 feature freeze)? I think it would be very important to have a generic style and meaning of the Iterators submodule and to be able to hook into its functions in a way that uses context information to eliminate unnecessary work. I.e. the Iterators.<fun> should only promise that whatever will be returned will be lazy. The current implementation which just points towards the lazy variants of the eager algorithms will become the default/fallback.implementation.

For example, having Iterators.product(CartesianIndices, CartesianIndices) return something of type CartesianIndices rather than just a generic ProductIterator (EDIT: that concrete example might be a bad one since a tuple of CartesianIndex is something else than a single flattened CartesianIndex). Encourage the specialization of functions in the Iterators submodule.

@StefanKarpinski StefanKarpinski added the triage This should be discussed on a triage call label Aug 17, 2020
@JeffBezanson
Copy link
Sponsor Member

This came up in the triage channel on slack, and I guess we might as well do it. Anybody want to make a PR?

@vtjnash vtjnash removed the triage This should be discussed on a triage call label Jun 3, 2021
@mbauman mbauman added help wanted Indicates that a maintainer wants help on an issue or pull request compiler:lowering Syntax lowering (compiler front end, 2nd stage) labels Jun 4, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler:lowering Syntax lowering (compiler front end, 2nd stage) help wanted Indicates that a maintainer wants help on an issue or pull request
Projects
None yet
Development

No branches or pull requests

8 participants