
MaxEncodedLen tracking issue #8719

Closed
13 of 20 tasks

coriolinus opened this issue May 3, 2021 · 4 comments

Comments

@coriolinus
Contributor

coriolinus commented May 3, 2021

Rationale

For PoV benchmarking, we need some way to estimate the maximum encoded size of an arbitrary storage item. We've settled on a new trait, MaxEncodedLen.

While Encode::size_hint already exists, we can't rely on it. One fatal flaw: it's an optional, best-effort method whose default implementation simply returns 0, and many types never override it. Another: Encode is implemented for types such as VecDeque<T> which we intentionally want to exclude from storage, because they aren't bounded. We need a separate trait so that such types can be excluded.
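To make the first flaw concrete, here is a minimal sketch (Opaque is a made-up type; the trait and its default method are from parity-scale-codec): a hand-written Encode impl that doesn't override size_hint inherits the default, which returns 0.

use parity_scale_codec::{Encode, Output};

// Hypothetical type with a hand-written Encode impl.
struct Opaque(Vec<u8>);

impl Encode for Opaque {
    fn encode_to<O: Output + ?Sized>(&self, dest: &mut O) {
        self.0.encode_to(dest)
    }
    // `size_hint` is not overridden, so the trait default applies.
}

fn main() {
    let big = Opaque(vec![0u8; 1024]);
    // The default `size_hint` returns 0: useless as an upper bound.
    assert_eq!(big.size_hint(), 0);
}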

Design

pub trait MaxEncodedLen: Encode {
  fn max_encoded_len() -> usize;
}
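To give a feel for how implementations compose, here is a sketch (Header is a made-up example type, and the derive shown is the one proposed for the roadmap below):

impl MaxEncodedLen for u32 {
    fn max_encoded_len() -> usize {
        // Fixed-width integers encode to exactly their size.
        core::mem::size_of::<u32>()
    }
}

// The proposed derive sums the bounds of all fields.
#[derive(Encode, MaxEncodedLen)]
struct Header {
    number: u32,       // 4 bytes
    parent: [u8; 32],  // 32 bytes
}
// Header::max_encoded_len() == 4 + 32 == 36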

Roadmap

Notes

This always explicitly computes an upper bound. While a u128 can potentially be compact-encoded in only one byte, Compact<u128>::max_encoded_len() will always return 17.
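Both halves of that claim can be checked directly (a sketch using parity-scale-codec's Compact wrapper):

use parity_scale_codec::{Compact, Encode};

fn main() {
    // Small values really do compact-encode to a single byte...
    assert_eq!(Compact(1u128).encode().len(), 1);
    // ...but the bound must cover the worst case: one mode/length byte
    // followed by all 16 payload bytes of the u128.
    assert_eq!(Compact(u128::MAX).encode().len(), 17);
}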

While BoundedVec<T> and friends impose a hard cap on their length, we do not use MaxEncodedLen to impose a hard cap on the size of StorageMap and friends. A parachain author could potentially misrepresent the bounds on their maps to pack a parachain block with more transactions, but that behavior is self-defeating: eventually they will pack too many transactions into a block, and the block will become invalid because its actual size is too large. User documentation for the storage::map_bound attribute should emphasize that it is in the chain author's best interest to choose a reasonable value here.

We'd like to offer logging warnings that could be picked up by Prometheus etc. when the size of a StorageMap exceeds certain thresholds (e.g. info at 80%, warn at 100%), but that feature is out of scope for this tracking issue. It requires additional design work: currently the only way to determine a map's size is to iterate over its members, which is not performant.

max_encoded_len is a function instead of an associated constant for pragmatic rather than theoretical reasons: currently, BoundedVec's limit is not a constant, so we can't use its value in a const context.
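To illustrate, a rough sketch of what the BoundedVec impl might look like (not final code): the bound S is a Get<u32> type parameter, and S::get() is an ordinary function call, so it can't be evaluated in a const context today.

use frame_support::traits::Get;
use parity_scale_codec::{Compact, Encode, MaxEncodedLen};

impl<T: MaxEncodedLen, S: Get<u32>> MaxEncodedLen for BoundedVec<T, S> {
    fn max_encoded_len() -> usize {
        // Encodes like a Vec<T>: a compact length prefix, then up to
        // S::get() elements. The runtime call is why this is a function.
        Compact(S::get())
            .encoded_size()
            .saturating_add((S::get() as usize).saturating_mul(T::max_encoded_len()))
    }
}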

@bkchr
Member

bkchr commented May 3, 2021

> * Introduce the `BoundedEncodedLen` trait to SCALE

Why? Why should this exist in scale-codec?

What I still don't understand is how this should help to calculate the proof size. The proof size is not only composed of the sizes of the individual elements; it also comes from the trie structure itself, which is encoded in the proof. Then there are also values included in the proof which may not be accessed by an extrinsic and would still increase the proof size.

@shawntabrizi
Member

shawntabrizi commented May 3, 2021

After our chat in Element, I agree this probably can go in Substrate instead of SCALE.

@coriolinus you can transfer the issue.

@coriolinus coriolinus transferred this issue from paritytech/parity-scale-codec May 4, 2021
@kianenigma
Contributor

I can help with:

> • Extend Frame macros to..
> • Implement Derive macro for...

> Backwards-compatibility: existing macro declarations should use a fixed bound of 300,000

Where is this number coming from?

> New requirement on frame::storage: all storage items must implement BoundedEncodedLen.

Given all of this, in the context of a parachain, we pretty much want to ban storing anything unbounded like Vec<_> in storage, right? Be it a map or a simple value, it shall never be unbounded.

@coriolinus
Contributor Author

coriolinus commented May 4, 2021

I'm working on the derive macro right now, but assistance on the Frame side would be much appreciated.

> Where is this number coming from?

Just a SWAG; it's high enough that it'll probably suffice for most cases, without being so high that it makes the weight predictions unreasonable.

> be it a map or simple value, it shall never be unbounded.

Yes, we're specifically excluding Vec<T> with that requirement, in favor of BoundedVec. Bounded maps etc., which can fit inside other arbitrary storage items, will follow.
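A sketch of the resulting pallet pattern (names like MaxMembers are illustrative, and the map-bound attribute is still being designed):

// The runtime supplies the bound through the pallet's Config.
#[pallet::config]
pub trait Config: frame_system::Config {
    type MaxMembers: Get<u32>;
}

// Bounded at the type level, so the stored value implements MaxEncodedLen.
#[pallet::storage]
pub type Members<T: Config> =
    StorageValue<_, BoundedVec<T::AccountId, T::MaxMembers>, ValueQuery>;

// A plain Vec<T::AccountId> here would be rejected under the new requirement.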

@coriolinus coriolinus changed the title BoundedEncodedLen tracking issue MaxEncodedLen tracking issue May 4, 2021
coriolinus added a commit that referenced this issue May 6, 2021
ghost pushed a commit that referenced this issue May 6, 2021
* Add `BoundedBTreeMap` to `frame_support::storage`

Part of #8719.

* max_encoded_len will never encode length > bound

* requiring users to maintain an unchecked invariant is unsafe

* only impl debug when std

* add some marker traits

* add tests
coriolinus added a commit that referenced this issue May 6, 2021
ghost pushed a commit that referenced this issue May 10, 2021
* Add `BoundedBTreeSet`

Part of #8719

* fix copy-pasta errors

Co-authored-by: Kian Paimani <[email protected]>

Co-authored-by: Kian Paimani <[email protected]>
nazar-pc pushed a commit to autonomys/substrate that referenced this issue Aug 8, 2021
nazar-pc pushed a commit to autonomys/substrate that referenced this issue Aug 8, 2021