Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Question on eduction #313

Closed
juliohm opened this issue Jun 24, 2020 · 13 comments
Closed

Question on eduction #313

juliohm opened this issue Jun 24, 2020 · 13 comments
Labels
question Further information is requested

Comments

@juliohm
Copy link

juliohm commented Jun 24, 2020

Is there any major difference in performance between:

reduce(+, eduction(sin(x) for x in xs if abs(x) < 1))

and

reduce(+, collect(sin(x) for x in xs if abs(x) < 1))

in general? When is it recommended to collect the iterator?

@tkf
Copy link
Member

tkf commented Jun 24, 2020

Here is a list of similar snippets what happens with them:

  • reduce(+, eduction(sin(x) for x in xs if abs(x) < 1)) uses threaded reduction implemented in Transducers.jl
  • ThreadsX.reduce(+, (sin(x) for x in xs if abs(x) < 1)) ditto
  • ThreadsX.sum(sin(x) for x in xs if abs(x) < 1) ditto
  • sum(eduction(sin(x) for x in xs if abs(x) < 1)) uses single-thread reduction implemented in Transducers.jl (requires the latest version 0.4.36)
  • reduce(+, collect(sin(x) for x in xs if abs(x) < 1)) uses single-thread reduction implemented in Base. It also allocates an array.
  • reduce(+, (sin(x) for x in xs if abs(x) < 1)) also uses transducers, but implemented in Base Transducer as an optimization: map, filter and flatten JuliaLang/julia#33526

If the mapping, filtering, etc. are slow and/or the input is huge, I'd recommend one of the threaded versions. Note that the functions used in the iterator should be "thread-safe" (not modifying shared state etc.). If you are not using transducers directly, I think ThreadsX.jl interface is simpler to use and easier to get rid of.

When is it recommended to collect the iterator?

If you are going to reduce/foldl it, I'm tempted to say "never" (it's the primary goal of Transducers.jl). But collecting the iterator could be beneficial if you are going to use it multiple times.

@tkf tkf added the question Further information is requested label Jun 24, 2020
@juliohm
Copy link
Author

juliohm commented Jun 24, 2020

So ThreadsX.jl is a wrapper over Transducers.jl to provide similar syntax to Base, right? I will think about it carefully, but since the package is "heavier" than Transducers.jl that is already a good reason to stick with Transducers.jl for now.

ThreadsX.jl provides any performance benefit compared to Transducers.jl?

@juliohm
Copy link
Author

juliohm commented Jun 24, 2020

There is also ThreadPools.jl, I wonder if you had a chance to compare.

@juliohm
Copy link
Author

juliohm commented Jun 24, 2020

Regarding the eduction, I asked because I think it didn't work when I tried it here.

@tkf
Copy link
Member

tkf commented Jun 24, 2020

Yeah, Threadsx.jl is mostly a wrapper around Transducers.jl. It makes sense to stick with Transducers.jl if you don't need it. Some reducers are non-trivial to implement (e.g., ThreadsX.unique) so that's why I wrote it. But I guess sum is too easy.

ThreadsX.jl provides any performance benefit compared to Transducers.jl?

Not really. It has a bit of heuristics for basesize but it's not a big deal (at least for now).

There is also ThreadPools.jl, I wonder if you had a chance to compare.

I haven't done any performance comparison yet. But my impression is that there is no threading tool that is as extensible and "correct" as Transducers.jl. For example, many threaded map implementation uses Core.Compiler.return_type to pre-allocate the output array. This introduces undefined behavior and subtle bugs in the user code.

I think the current performance of Transducers.jl is reasonably OK. So I want to cleanup/consolidate the API first. After that, there is plenty of room for optimizations and a lot to learn from other threading tools like ThreadPools.jl.

it didn't work when I tried it here.

Did it error out or was it just slow? The latest release (0.4.36) tweaked this part so it may be worth trying it again. See also #306.

As usual, a bug report (preferably with a MWE) is very welcome :)

@juliohm
Copy link
Author

juliohm commented Jun 27, 2020

It errors with the following message:

Varioplane: Error During Test at /home/juliohm/.julia/dev/Variography/test/varioplane.jl:1
  Got exception outside of a @test
  MethodError: no method matching halve(::SpatialPartition{RegularGridData{Float64,2}})

Could you please confirm that the method only works with iterators that provide a halve implementation?

@juliohm
Copy link
Author

juliohm commented Jun 27, 2020

Do I understand correctly that I need to implement __foldl__ on my type to have it as a reducible iterator? https://juliafolds.github.io/Transducers.jl/stable/examples/reducibles

I am a bit confused given that I couldn't find the halve function defined anywhere to specialize in my package.

To clarify a bit, I've just tried removing the collect from this line, and then the halve error appeared.

@tkf
Copy link
Member

tkf commented Jun 27, 2020

__foldl__ is defined in terms of iterate so you don't need to define it manually (although it's possible to define a faster __foldl__ manually). This is only for single-threaded reduction.

If you want multi-threaded reduction on a custom container, you need to define halve (at least). See https://juliafolds.github.io/SplittablesBase.jl/dev/#SplittablesBase.halve

(I guess the word "reducible" is confusing at this moment. It was written before I added parallel reductions.)

@juliohm
Copy link
Author

juliohm commented Jun 27, 2020

Can I implement the halve function by importing it from Transducers.jl or I need to add SplittablesBase.jl as a dependency as well?

@tkf
Copy link
Member

tkf commented Jun 27, 2020

Transducers.jl import SplittablsBase.jl always anyway so there is no point to avoid adding it as a dependency.

SplittablsBase.jl is meant to be a light-weight package to let packages support Transducers.jl etc. without importing Transducers.jl. FWIW, I think I'll move __foldl__ etc. out from Transducers.jl at some point.

(You can do Transducers.halve(::CustomType) = ... instead of SplittablsBase.halve(::CustomType) = ... because Transducers.halve === SplittablsBase.halve. However, I'm not sure if it is a bug in julia that it's possible.)

@juliohm
Copy link
Author

juliohm commented Jun 27, 2020

I will try to implement the function tomorrow. My iterator stores a vector of vectors that could be halved. In general, should I expect a major performance gain by avoiding the collect?

@tkf
Copy link
Member

tkf commented Jun 28, 2020

Looking at SpatialPartition, it looks like it's "almost" a vector because you already define Base.getindex. If you are OK with making it a subtype of vector, halve would work out-of-the-box.

Otherwise, defining halve should also be easy:

struct SpatialPartition{O<:AbstractSpatialObject,V<:AbstractVector{Vector{Int}}}
  object::O
  subsets::V  # so that we can use view
  metadata::Dict
end

function SplittablesBase.halve(partition::SpatialPartition)
    left, right = SplittablesBase.halve(partition.subsets)
    return (
        SpatialPartition(partition.object, left, partition.metadata),
        SpatialPartition(partition.object, right, partition.metadata),
    )
end

In general, should I expect a major performance gain by avoiding the collect?

I guess it depends. If the computation for each element is very expensive, it's probably hard to observe the difference.

@juliohm
Copy link
Author

juliohm commented Aug 20, 2020

Thank you for the clarifications @tkf , I will close the issue for now knowing the the way forward is to implement halve for the types.

@juliohm juliohm closed this as completed Aug 20, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants