Elastic scaling: runtime dependency tracking and enactment #3479
Signed-off-by: alindima <[email protected]>
also no need to sort by core index any more
…. optimise process_candidates to be O(N)
Logic looks good at first pass, but readability can certainly be improved.
I think we should also see if we can adjust `apply_weight_limit` candidate selection to account for elastic scaling.
Overall looking good, left a couple of questions. Happy to approve once this is tested/burned-in.
// In `Enter` context (invoked during execution) there should be no backing votes from
// disabled validators because they should have been filtered out during inherent data
// preparation (`ProvideInherent` context). Abort in such cases.
if context == ProcessInherentDataContext::Enter {
    ensure!(!votes_from_disabled_were_dropped, Error::<T>::BackedByDisabled);
}
why was this error removed? is it because it was merged into a generic `CandidatesFilteredDuringExecution` error? i liked the specificity of the previous errors more
I believe the original intention was to trade the specific errors for simplicity by using a catch-all approach. I will look and see if we can keep it simple while keeping these errors specific, or whether logging the errors instead of returning them might achieve the same.
that's right. I generally added debug logs to the filtering functions called in `sanitize_backed_candidates`, emitted whenever a candidate is filtered, with the reason why it was dropped.
Indeed, checking at the outermost level that filtering dropped nothing is the most robust way to check.
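The approach discussed in this thread (log the reason inside each filter, then check once at the outermost level that nothing was dropped) could be sketched roughly as below. This is a minimal illustration, not the code from the PR: `Candidate`, `sanitize_and_check`, and the simplified `Error` enum are hypothetical stand-ins, and `eprintln!` stands in for a runtime debug log.

```rust
/// The two contexts named in the review thread.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum ProcessInherentDataContext {
    ProvideInherent,
    Enter,
}

/// Simplified stand-in for the generic error mentioned above.
#[derive(Debug, PartialEq, Eq)]
pub enum Error {
    CandidatesFilteredDuringExecution,
}

/// Hypothetical candidate type; it carries only what this sketch needs.
#[derive(Clone, Debug, PartialEq, Eq)]
pub struct Candidate {
    pub para_id: u32,
    pub backed_by_disabled: bool,
}

/// Runs the filters, logging why each candidate is dropped, and then
/// checks a single time whether anything was filtered at all.
pub fn sanitize_and_check(
    candidates: Vec<Candidate>,
    context: ProcessInherentDataContext,
) -> Result<Vec<Candidate>, Error> {
    let count_before = candidates.len();

    let kept: Vec<Candidate> = candidates
        .into_iter()
        .filter(|c| {
            if c.backed_by_disabled {
                // The specific reason survives in the log, not in the error.
                eprintln!("dropping candidate of para {}: backed by disabled validators", c.para_id);
                false
            } else {
                true
            }
        })
        .collect();

    // During execution (`Enter`) nothing should be left to filter, because
    // the block author already sanitized the data in `ProvideInherent`.
    if context == ProcessInherentDataContext::Enter && kept.len() != count_before {
        return Err(Error::CandidatesFilteredDuringExecution);
    }
    Ok(kept)
}
```

The trade-off is the one raised above: a single outer check catches every filter, including ones added later, at the price of a less specific error value.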
Fixed the runtime API panic caused by #64 and reran benchmarks for westend and rococo
Rococo weights seem off. Westend look good.
yeah, the weights for rococo are way off when comparing to the previous values. I'm betting that's because they were last updated in 2021. The difference for westend shows that there isn't a significant change
Great work @alindima ! I couldn't help it and still had a few nits, but it is good to go!
let freed = freed_concluded
    .into_iter()
    .map(|(c, _hash)| (c, FreedReason::Concluded))
    .chain(freed_disputed.into_iter().map(|core| (core, FreedReason::Concluded)))
Not introduced here, but a third enum variant `Disputed` would have done no harm 😶🌫️ (and also no need to fix it here)
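For illustration only, the suggested variant might look like the sketch below. `Disputed` is hypothetical and was not added in this PR; `TimedOut` is assumed to be the second existing variant, and `CoreIndex` plus the 32-byte hash are simplified stand-ins for the real types.

```rust
/// Simplified stand-in for the runtime's core index type.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct CoreIndex(pub u32);

#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub enum FreedReason {
    /// The candidate on the core became available and was enacted.
    Concluded,
    /// Assumed to be the other existing variant.
    TimedOut,
    /// Hypothetical third variant from the review comment: the core was
    /// freed because its candidate was disputed.
    Disputed,
}

/// Same shape as the snippet under review, but disputed cores carry their
/// own reason instead of being reported as `Concluded`.
pub fn collect_freed(
    freed_concluded: Vec<(CoreIndex, [u8; 32])>,
    freed_disputed: Vec<CoreIndex>,
) -> Vec<(CoreIndex, FreedReason)> {
    freed_concluded
        .into_iter()
        .map(|(core, _hash)| (core, FreedReason::Concluded))
        .chain(freed_disputed.into_iter().map(|core| (core, FreedReason::Disputed)))
        .collect()
}
```

Whether downstream code would treat the two reasons differently is a separate question; the sketch only shows that the call site would no longer conflate them.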
// Cores 1, 2 and 3 are being made available in this block. Propose 6 more candidates (one
// for each core) and check that the right ones are successfully backed and the old ones
// enacted.
let config = default_config();
Given that we are not even sharing initialization, why is this not a separate test case?
to avoid long and dubious test names :D that's arguably a bad reason but I didn't think too much about it
fair.
Amazing work @alindima
…h#3479)

Changes needed to implement the runtime part of elastic scaling: paritytech#3131, paritytech#3132, paritytech#3202

Also fixes paritytech#3675

TODOs:
- [x] storage migration
- [x] optimise process_candidates from O(N^2)
- [x] drop backable candidates which form cycles
- [x] fix unit tests
- [x] add more unit tests
- [x] check the runtime APIs which use the pending availability storage. We need to expose all of them, see paritytech#3576
- [x] optimise the candidate selection. we're currently picking randomly until we satisfy the weight limit. we need to be smart about not breaking candidate chains while being fair to all paras - paritytech#3573

Relies on the changes made in paritytech#3233 in terms of the inclusion policy and the candidate ordering

---------

Signed-off-by: alindima <[email protected]>
Co-authored-by: command-bot <>
Co-authored-by: eskimor <[email protected]>
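The candidate-selection item in the TODO list above asks for a weight-limited selection that does not break candidate chains and stays fair to all paras. One way to picture that requirement is the round-robin sketch below. It is an assumption-laden illustration, not the `apply_weight_limit` implementation from this PR: `select_chain_prefixes`, `ParaId`, and `Weight` are hypothetical names, weights are plain integers, and the selection is deterministic rather than random.

```rust
use std::collections::BTreeMap;

pub type ParaId = u32;
pub type Weight = u64;

/// For every para, returns how many candidates were taken from the FRONT of
/// its chain. Taking only prefixes means a chain is never broken: a candidate
/// is selected only if all of its ancestors in the same block are selected.
pub fn select_chain_prefixes(
    chains: &BTreeMap<ParaId, Vec<Weight>>,
    max_weight: Weight,
) -> BTreeMap<ParaId, usize> {
    let mut taken: BTreeMap<ParaId, usize> = chains.keys().map(|para| (*para, 0)).collect();
    let mut active: Vec<ParaId> = chains.keys().copied().collect();
    let mut used: Weight = 0;

    // Round-robin: each pass offers every still-active para one more slot,
    // which keeps the selection fair across paras.
    while !active.is_empty() {
        let mut still_active = Vec::with_capacity(active.len());
        for para in active {
            let next = taken[&para];
            match chains[&para].get(next) {
                Some(&weight) if used.saturating_add(weight) <= max_weight => {
                    used += weight;
                    *taken.get_mut(&para).expect("initialised for every para") += 1;
                    still_active.push(para);
                }
                // Chain exhausted, or its next candidate does not fit: stop
                // taking from this para so later candidates are not orphaned.
                _ => {}
            }
        }
        active = still_active;
    }
    taken
}
```

With chains `{1: [4, 4, 4], 2: [5, 5]}` and a limit of 14, para 1 gets two candidates and para 2 gets one; the third candidate of para 1 is skipped because the budget is exhausted, never because an ancestor was left out.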
On top of #5082.

## Background

Previously, before #3479, we would [include](https://github.com/paritytech/polkadot-sdk/blame/75074952a859f90213ea25257b71ec2189dbcfc1/polkadot/runtime/parachains/src/builder.rs#L508C12-L508C44) the cost of enacting the candidate in the cost of processing a single bitfield. [Now](https://github.com/paritytech/polkadot-sdk/blame/dd48544a573dd02da2082cec1dda7ce735e2e719/polkadot/runtime/parachains/src/builder.rs#L529) it is different, although the benchmarks seem to be out of date. Including the cost of enacting a candidate in the cost of processing a single bitfield was incorrect, since we multiply that by the number of bitfields we have. Instead, we should separately calculate the cost of processing a single bitfield without enactment, and multiply the cost of enactment by the actual number of processed candidates (which is limited by the number of cores, not validators).

## Bench

Previously, the weight of `enact_candidate` was calculated manually (without a benchmark) and then neglected: https://github.com/paritytech/polkadot-sdk/blob/dd48544a573dd02da2082cec1dda7ce735e2e719/polkadot/runtime/parachains/src/inclusion/mod.rs#L584

In this PR, we have a benchmark for it and it's based on the number of ump and sent hrmp messages as well as whether the candidate has a runtime upgrade (new_validation_code). The differences from the previous attempt paritytech/polkadot#6929 are that
* we don't include the cost of enactment in the cost of processing a backed candidate. The reason for it is that enactment happens not in the same block as backing (typically the next one), since we process bitfields before backing votes.
* we don't take into account the size of the runtime upgrade; the benchmark weight doesn't seem to depend much on it, but rather on whether there was one or not.

Similarly to the previous attempt, we don't account for dmp messages (fixed cost). Also we don't account properly for received hrmp messages (hrmp_watermark) because the cost of it depends on the runtime state and can't be statically deduced in the benchmark (unless we pass the information about channels as benchmark u32 arguments).

The total weight cost of processing a parainherent now includes the cost of enactment of each candidate, but we don't do filtering based on that (because we enact after processing bitfields and making other changes to the storage).

## Numbers

```
Reads = 7 + (0 * u) + (3 * h) + (8 * c)
Writes = 10 + (1 * u) + (3 * h) + (7 * c)
```

In addition, there is a fixed cost of a few ms (!) per candidate. This might result in a full block slightly overflowing its weight with 200 enacted candidates, which in turn could prevent non-mandatory transactions from being included in a block. Given our modest limits on max ump and hrmp messages:

```
maxUpwardMessageNumPerCandidate: 16
hrmpMaxMessageNumPerCandidate: 10
```

and the fact that runtime upgrades can't happen very frequently (`validation_upgrade_cooldown`), we might only go over the limits in case of many disputes.

TODOs:
- [x] Fix the overweight test
- [x] Generate the weights for Westend and Rococo
- [x] PRDoc

---------

Co-authored-by: command-bot <>
Co-authored-by: Alin Dima <[email protected]>
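To get a feel for the read/write formulas quoted under Numbers above, the sketch below simply evaluates them. It assumes `u` is the number of ump messages, `h` the number of sent hrmp messages, and `c` is 1 when the candidate carries a runtime upgrade and 0 otherwise; the function and type names are hypothetical, and the fixed per-candidate time cost mentioned in the text is not modelled.

```rust
/// Limits quoted in the commit message.
pub const MAX_UMP_PER_CANDIDATE: u64 = 16;
pub const MAX_HRMP_PER_CANDIDATE: u64 = 10;

/// Storage accesses attributed to enacting candidates.
#[derive(Clone, Copy, Debug, PartialEq, Eq)]
pub struct Accesses {
    pub reads: u64,
    pub writes: u64,
}

/// Evaluates the quoted formulas for a single candidate:
///   Reads  = 7 + (0 * u) + (3 * h) + (8 * c)
///   Writes = 10 + (1 * u) + (3 * h) + (7 * c)
pub fn enact_candidate_accesses(u: u64, h: u64, has_upgrade: bool) -> Accesses {
    let c = has_upgrade as u64;
    Accesses {
        // The coefficient of `u` is 0 for reads, so `u` drops out here.
        reads: 7 + 3 * h + 8 * c,
        writes: 10 + u + 3 * h + 7 * c,
    }
}

/// Enactment is charged once per enacted candidate (bounded by the number
/// of cores), not once per bitfield.
pub fn enactment_accesses(per_candidate: Accesses, enacted_candidates: u64) -> Accesses {
    Accesses {
        reads: per_candidate.reads * enacted_candidates,
        writes: per_candidate.writes * enacted_candidates,
    }
}
```

Under these assumptions, a candidate at both message limits that also carries an upgrade costs 45 reads and 63 writes, so 200 such candidates would account for 9000 reads and 12600 writes. That is a pessimistic bound, since upgrades are rate-limited by `validation_upgrade_cooldown`.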
Changes needed to implement the runtime part of elastic scaling: #3131, #3132, #3202
Also fixes #3675
TODOs:
- `candidates_pending_availability` #3576
- `apply_weight_limit` wrt elastic scaling #3573

Relies on the changes made in #3233 in terms of the inclusion policy and the candidate ordering