Remove weight method and introduce sequential-threshold #81

Closed
wants to merge 10 commits into from

Conversation

nikomatsakis
Member

As I wrote in #49: "Currently parallel iterators assume cheap operations. I am thinking that should be changed to assume expensive operations (i.e., fine-grained parallel splits), and have people opt-in to the current behavior by manually adjusting the weights or calling weight_min."

This branch implements that idea. The weight method is deprecated and a new sequential_threshold(N) method is added. Calling it with a value N means that, if Rayon has fewer than N items remaining, it will not attempt to spawn another thread. So if you set the threshold to 222 and had 400 items total, Rayon would first split into two threads, each processing 200 items, and then stop. (The docs point out that this is not a guarantee, however, and your code cannot rely on this for correctness.)
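
To make the splitting rule concrete, here is a minimal sketch that models only the arithmetic described above. It does not use Rayon or spawn any threads; the function name `leaf_sizes` and the halving strategy are assumptions made for illustration, not code from this branch.

```rust
/// Model of the splitting rule: keep halving `len` items until fewer than
/// `threshold` remain, then treat the remainder as one sequential leaf.
/// Returns the size of every leaf, left to right.
fn leaf_sizes(len: usize, threshold: usize) -> Vec<usize> {
    // Fewer than `threshold` items (or a single item): do not split further.
    if len < threshold || len < 2 {
        return vec![len];
    }
    // Otherwise split roughly in half and apply the same rule to each side.
    let left = len / 2;
    let right = len - left;
    let mut leaves = leaf_sizes(left, threshold);
    leaves.extend(leaf_sizes(right, threshold));
    leaves
}
```

Under these assumptions, `leaf_sizes(400, 222)` yields two leaves of 200 items each, matching the example above, while a threshold of 1 splits all the way down to single items.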

I still think this is the right thing -- but I was surprised by how drastically it affected some fine-grained benchmarks, which really suffered unless the threshold is added in. (I expected them to go slow, but not as slow as they did.) This is probably just highlighting optimization that needs to be done.

I'm curious to hear what people think. cc @cuviper, who always has good advice, and @dirvine, who participated on #49. =)

Fixes #49

@cuviper
Member

cuviper commented Aug 14, 2016

In general, I like it. The weight was a fairly abstract concept, but with this it should be easier to understand how the code will execute, since you're almost directly controlling that.

My only hesitation is that this threshold can't be stacked the same way weights were multiplied. That means the caller has to understand their whole chain in one place. Maybe that's fine.

In the recommendation for setting this right before the final action, I think it might be better if this were actively enforced. I'm imagining moving those final actions into a new ParallelFinalize trait, with the general requirement of ParallelIterator: ParallelFinalize, but then Threshold would only implement the latter. This should also simplify a lot of the other types so they don't have to reason about base thresholds at all -- it's only either an implicit 1 or directly overridden from Threshold.
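
A rough sketch of how that trait split might look. Only the names `ParallelFinalize`, `ParallelIterator`, and `Threshold` come from the comment; `VecIter`, `Map`, `into_items`, and `collect_vec` are hypothetical, and everything runs sequentially so the example stays small. The point is purely the type structure: adaptors live on `ParallelIterator`, final actions on `ParallelFinalize`, and `Threshold` implements only the latter.

```rust
/// Final (consuming) actions. Methods here are hypothetical stand-ins.
trait ParallelFinalize: Sized {
    type Item;

    /// Threshold in effect for the final action: an implicit 1 unless overridden.
    fn threshold(&self) -> usize {
        1
    }

    /// Hypothetical stand-in for actually driving the iterator.
    fn into_items(self) -> Vec<Self::Item>;

    /// Example final action.
    fn collect_vec(self) -> Vec<Self::Item> {
        self.into_items()
    }
}

/// Adaptors live here, so they are only reachable before a threshold is set.
trait ParallelIterator: ParallelFinalize {
    fn map<F, R>(self, f: F) -> Map<Self, F>
    where
        F: Fn(Self::Item) -> R,
    {
        Map { base: self, f }
    }

    fn sequential_threshold(self, n: usize) -> Threshold<Self> {
        Threshold { base: self, n }
    }
}

struct VecIter<T>(Vec<T>);
struct Map<I, F> {
    base: I,
    f: F,
}
struct Threshold<I> {
    base: I,
    n: usize,
}

impl<T> ParallelFinalize for VecIter<T> {
    type Item = T;
    fn into_items(self) -> Vec<T> {
        self.0
    }
}
impl<T> ParallelIterator for VecIter<T> {}

impl<I, F, R> ParallelFinalize for Map<I, F>
where
    I: ParallelIterator,
    F: Fn(I::Item) -> R,
{
    type Item = R;
    fn into_items(self) -> Vec<R> {
        self.base.into_items().into_iter().map(self.f).collect()
    }
}
impl<I, F, R> ParallelIterator for Map<I, F>
where
    I: ParallelIterator,
    F: Fn(I::Item) -> R,
{
}

// `Threshold` is deliberately NOT a `ParallelIterator`: only final actions remain.
impl<I: ParallelIterator> ParallelFinalize for Threshold<I> {
    type Item = I::Item;
    fn threshold(&self) -> usize {
        self.n
    }
    fn into_items(self) -> Vec<I::Item> {
        self.base.into_items()
    }
}
```

With this shape, `VecIter(vec![1, 2, 3]).map(|x| x * 2).sequential_threshold(2).collect_vec()` type-checks, whereas calling `.map(...)` after `.sequential_threshold(2)` would be rejected by the compiler, which is the enforcement being suggested.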

@nikomatsakis
Member Author

> My only hesitation is that this threshold can't be stacked the same way weights were multiplied. That means the caller has to understand their whole chain in one place. Maybe that's fine.

Yeah, that's awkward. I liked how in theory if you had something that yielded like a fn foo() -> impl ParallelIterator, it could (internally) apply some weights, that would compose nicely.

Is there a way to make the "threshold" compose better? It is sort of counterintuitive to me that it doesn't work...

@nikomatsakis
Member Author

> In the recommendation for setting this right before the final action, I think it might be better if this were actively enforced.

Yeah I guess this might be better, though I'd prefer to find something more composable.

I still think "heavy" is the right default though.

@nikomatsakis
Member Author

OTOH what I really want is to make the scheduler more adaptive and/or efficient so that it adjusts weights automatically. =) But I'm nervous about this, I don't know of any project that truly claims "success" in this area.

@iqualfragile

Maybe the weight problem could be helped by providing some constants, called for example cheap, medium, heavy, that would make people tag their tasks correctly, and you could still give more precise values.

@nikomatsakis
Member Author

I like the idea of a simplified "cheap vs default vs expensive" API. I will play with that.

FWIW, I tried playing around with some simple heuristics for "auto-thresholding". In particular, I experimented with checking, after a split, whether a steal has occurred by the time you go to do the RHS. If no steal occurred, we would forgo further splitting. This definitely affected performance, but generally made things worse. I have to do more experimentation, but as I said, I think for the short term at least (if not forever) having some hints from the user has to be helpful.

@cuviper
Member

cuviper commented Sep 15, 2016 via email

@cuviper cuviper mentioned this pull request Oct 2, 2016
@nikomatsakis
Member Author

After having let this sit for a bit, I think I've come to a few conclusions:

  • changing the default still feels right;
  • but I'd like to simplify the interface to something like weight(CHEAP | EXPENSIVE | DEFAULT), where DEFAULT is equivalent to EXPENSIVE for now but may become something like "adaptive".
    • not sure yet if this should be a "threshold" or not, but the key point is that the main API ought to be very simple to use without having to enter numbers.
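
One possible shape for such an interface, sketched under assumptions: the `Weight` enum, its method name, and the numeric threshold for `Cheap` are all invented here for illustration. The only parts taken from the discussion are the three variants, `Default` behaving like `Expensive` for now, and the implicit threshold of 1 mentioned earlier in the thread.

```rust
/// Hypothetical three-level hint, so callers never have to pick a number.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
enum Weight {
    Cheap,
    Expensive,
    Default,
}

impl Weight {
    /// Map the hint to a sequential threshold.
    /// The value used for `Cheap` is a placeholder, not a tuned constant.
    fn sequential_threshold(self) -> usize {
        match self {
            // Cheap per-item work: stop splitting early to amortize overhead.
            Weight::Cheap => 1024,
            // Expensive work splits all the way down; `Default` matches it
            // for now but could later become something adaptive.
            Weight::Expensive | Weight::Default => 1,
        }
    }
}
```

Keeping `Default` as its own variant, rather than an alias for `Expensive`, leaves room to change its behavior later without touching call sites.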

@nikomatsakis
Member Author

closing in favor of #106

@nikomatsakis nikomatsakis deleted the no-more-weight branch May 23, 2017 09:35
3 participants