Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

feat: value quota based idl decoder limiting #4657

Merged
merged 51 commits into from
Aug 16, 2024

Conversation

crusso
Copy link
Contributor

@crusso crusso commented Aug 9, 2024

Simplifies #4624 to a simple linear limit on the number of decoded values as a function of decoded payload size,
instead of using two linear functions on perfcounter (simulated or real) and allocation counter.

The function is:

value_quota(blob) : Nat64 = blob.size() * (numerator/denominator) + bias

where blob is the candid blob to be decoded, and numerator (default 1), denominator (default 1) and bias (default 1024) are Nat32s.

Much simpler than #4624 and doesn't depend on vagaries of instruction metering and byte allocation which varies with gc and compiler options, but is it good enough?

The constants can be (globally) modified/inspected using prims (Prim.getCandidLimits/Prim.setCandidLimits) which will need to get exposed in base eventually.

The quota is decremented on every call to deserialise or skip a value in vanilla candid mode (destabilization is not metered).
The quota is eagerly checked before deserializing or skipping arrays.

One possible refinement would be to combine the value quota with a memory quota (though the latter would still vary with gc flavour and perhaps word-size unless we count logical words)

  • Disable for destabilization (iff Registers.get_rel_buf_opt is zero)
  • Port new candid spacebomb test suite to drun-tests, to test against real perf counter provided by drun.
  • Bump candid dependency to most recent
  • Pass new spacebomb tests, both in candid test suite on wasmtime using value counter.

src/mo_values/prim.ml Outdated Show resolved Hide resolved
@luc-blaeser
Copy link
Contributor

luc-blaeser commented Aug 13, 2024

One thing that I am unsure or I have probably missed (sorry for the potentially stupid question): Does the metering limit also need to be checked on ordinary decoding of Candid when it is not skipped and not recursively called?

Copy link
Contributor

@luc-blaeser luc-blaeser left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Very nice PR!
I have only some minor comments and one question (probably not relevant).

@crusso
Copy link
Contributor Author

crusso commented Aug 13, 2024

One thing that I am unsure or I have probably missed (sorry for the potentially stupid question): Does the metering limit also need to be checked on ordinary decoding of Candid when it is not skipped and not recursively called?

Actually, I'm not sure I understand the question (which worries me). I think all calls, initial, skip and recursive should be doing the check, but I maybe you've seen something I haven't.

@crusso
Copy link
Contributor Author

crusso commented Aug 13, 2024

@luc-blaeser PTAL (and thanks for the review!)

Copy link
Contributor

@chenyan-dfinity chenyan-dfinity left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we document the cost model somewhere, so that people can roughly know how to tune the parameters?

@@ -314,6 +319,7 @@ unsafe fn skip_any_vec(buf: *mut Buf, typtbl: *mut *mut u8, t: i32, count: u32)
// makes no progress. No point in calling it over and over again.
// (This is easier to detect this way than by analyzing the type table,
// where we’d have to chase single-field-records.)
idl_limit_check((count - 1) as u64);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can count be 0?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, because we return on previous line 311 when count == 0

@luc-blaeser
Copy link
Contributor

luc-blaeser commented Aug 14, 2024

@luc-blaeser PTAL (and thanks for the review!)

Not that I have any concrete concern, I was just curious whether I got all aspects.
If I understood correctly, the metering covers skipping and recursive calls. Also, the recursive call is done for each decoded Candid value.
I was wondering about the following cases of whether metering could be relevant there or would already be implicitly handled (by the recursion metering).

  • If a value would have dynamic encoded length such as an array of listed elements, I guess the recursive function would be called for each sub-element. This would be fine.
  • Would there be a need for size-dependent metering of blobs? I guess there is no recursive call involved per blob byte.
  • IIRC, some constant Candid values can inflate to variable-sized objects (arrays, maybe also blob). This would be a non-linear effort (initialization) that is maybe not metered.
    These are just thoughts of mine, probably not relevant.

@crusso
Copy link
Contributor Author

crusso commented Aug 14, 2024

@luc-blaeser PTAL (and thanks for the review!)

Not that I have any concrete concern, I was just curious whether I got all aspects. If I understood correctly, the metering covers skipping and recursive calls. Also, the recursive call is done for each decoded Candid value. I was wondering about the following cases of whether metering could be relevant there or would already be implicitly handled (by the recursion metering).

  • If a value would have dynamic encoded length such as an array of listed elements, I guess the recursive function would be called for each sub-element. This would be fine.

Yes, that's the idea.

  • Would there be a need for size-dependent metering of blobs? I guess there is no recursive call involved per blob byte.

Fortunately, the blob decoder first checks that the blob does not exceed the length of the candid payload, which is limited to 10MB by the IC (in the worst case). The Nat and Int decoders also check they don't decode past the candid payload, so I was hoping that would limit the allocation.

  • IIRC, some constant Candid values can inflate to variable-sized objects (arrays, maybe also blob). This would be a non-linear effort (initialization) that is maybe not metered.

Hmm, for arrays, maybe one could decrement the quota by number of elements before allocation to trigger then check, and then bulk-increment the quota before deserializing the element (to compensate for the earlier bulk check). Not sure.

These are just thoughts of mine, probably not relevant.

Copy link

github-actions bot commented Aug 14, 2024

Comparing from 367145d to 8e07a63:
In terms of gas, 5 tests regressed and the mean change is +0.0%.
In terms of size, 5 tests regressed and the mean change is +0.2%.

@crusso crusso added the automerge-squash When ready, merge (using squash) label Aug 16, 2024
@crusso
Copy link
Contributor Author

crusso commented Aug 16, 2024

Can we document the cost model somewhere, so that people can roughly know how to tune the parameters?
Done in Changelog for now. Will add to base eventually...

@mergify mergify bot merged commit b03a4d6 into master Aug 16, 2024
10 checks passed
@crusso crusso changed the title experiment: value quota based idl decoder limiting feat: value quota based idl decoder limiting Aug 16, 2024
@mergify mergify bot deleted the claudio/candid-metering-count-values branch August 16, 2024 16:07
@mergify mergify bot removed the automerge-squash When ready, merge (using squash) label Aug 16, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants