Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Specialise any and all for tuples #44063

Merged
merged 6 commits into from
May 9, 2022

Conversation

jipolanco
Copy link
Contributor

This avoids a small allocation (and improves performance) when iterating over a ProductIterator constructed using mixed iterator types. See #40283 for details.

I couldn't really identify where the allocation comes from, and there's likely an underlying issue to be still fixed. But this does solve the ProductIterator issue.

When used with homogeneous iterator types, this has no additional cost compared to the original short-circuiting version, as seen in the example below.

Also note that Base.map (used in this PR) was already being used in isdone(z::Zip).

Closes #40283.

Example:

using BenchmarkTools

# 1. Mixing different iterator types (heterogeneous iterator)
xs = (1:5, 1:0.5:4)
iter = Iterators.product(xs...)

# Julia master:
@btime iterate($iter);  # 37.649 ns (2 allocations: 144 bytes)

# With this PR:
@btime iterate($iter);  # 8.586 ns (0 allocations: 0 bytes)

# 2. Using the same iterator type (homogeneous iterator)
xs = (1:5, 2:4)
iter = Iterators.product(xs...)

# Julia master:
@btime iterate($iter);  # 2.181 ns (0 allocations: 0 bytes)

# With this PR:
@btime iterate($iter);  # 2.124 ns (0 allocations: 0 bytes)

@N5N3
Copy link
Member

N5N3 commented Feb 7, 2022

any(f, iter::Tuple) use loop-based fallback thus any mixed types might do harm to performance.
Maybe we should unroll the loop for short iter (similar to Base._tuple_any)

base/iterators.jl Outdated Show resolved Hide resolved
@jipolanco
Copy link
Contributor Author

Seems reasonable to add an optimised version of any (and also all?) for tuples.

Here is one possible implementation:

# any(f, itr) = _any(f, itr, :)  # definition in base/reduce.jl

_any(f, itr::Tuple, ::Colon) = _any_tuple(f, false, itr...)

@inline function _any_tuple(f, anymissing, x, rest...)
    v = f(x)
    if ismissing(v)
        anymissing = true
    elseif v
        return true
    end
    return _any_tuple(f, anymissing, rest...)
end

@inline _any_tuple(f, anymissing) = anymissing ? missing : false

Should this be limited to small tuples?

@jipolanco jipolanco changed the title Avoid allocation in iterate(::ProductIterator) Specialise any and all for tuples Feb 8, 2022
@N5N3
Copy link
Member

N5N3 commented Feb 9, 2022

I think it's OK to put these optimization in reduce.jl to fix the build error.
Another suggestion: if eltype(x::Tuple) is concrete, it seems good to use for loop and Let LLVM unroll for us.

# If eltype(x) is concrete. Use for loop
any(f, x::NTuple{N}) where {N} = _any(f, x, :)
# Otherwise force unroll to avoid union-split. (Only for small Tuple)
function any(f, x::NTuple{N,Any}) where {N}
    N >= 32 && return _any(f, x, :) # Not tuned, just follow Any32
    _any_tuple(f, false, x...)
end

@jipolanco
Copy link
Contributor Author

jipolanco commented Feb 9, 2022

@N5N3 Thanks for the suggestions! This should be taken care of now.

Fall back to original for-loop implementation if the tuple either:

- is a homogeneous tuple (all elements have the same type);
- has length > 32.
@KristofferC
Copy link
Sponsor Member

@jipolanco Is this good to go?

@jipolanco
Copy link
Contributor Author

@KristofferC Yes I think it's ready to be merged.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Allocation in iterating over a ProductIterator constructed using mixed iterator types
4 participants