Feat(standalone): Storage backend #375

birchmd · 2021-11-23T03:20:11Z

Lays out the necessary logic for the rocksdb backend. This implementation follows @mfornet 's suggestion (using labelled keys instead of snapshots + diffs). Diffs are still persisted because they might be useful for other things.

~~I've added some basic tests, and I'll write more before this PR gets merged. But it's a relatively large PR so I wanted to get it out there to give others time to review it.~~ Edit: I'm happy with the test coverage now.

I also need to fix some clippy warnings that popped up because I wanted to import test utils into the standalone crate. Though maybe the better way to do it would be to put standalone tests into the engine-tests crate. Edit: moved standalone-storage into its own crate and put the tests for it in engine-tests; this will also enable correctness checks by running the same transaction under both the wasm contract and the standalone backed by the new storage layer.

joshuajbouw

Just a couple of changes. All in all, looks correct as far as I can tell right now. Good start!

engine-standalone-storage/Cargo.toml

engine-standalone-storage/src/engine_state.rs

engine-standalone-storage/src/lib.rs

joshuajbouw · 2021-11-24T06:27:05Z

The three storage lifetimes, thats more of a optimisation to prevent cloning of large storage states I take it?

mfornet

The current implementation doesn't allow to iterate through ranges in the storage. One of the suggestions made in the notes were:

Maintain the current state indexed by an empty key. This can provide range queries for the latest storage only.

Do we plan to support this as well?

engine-standalone-storage/src/lib.rs

mfornet · 2021-11-24T13:01:55Z

engine-standalone-storage/src/engine_state.rs

+    fn write_storage(&mut self, key: &[u8], value: &[u8]) -> Option<Self::StorageValue> {
+        let original_value = self.read_storage(key);
+
+        self.transaction_diff
+            .borrow_mut()
+            .modify(key.to_vec(), value.to_vec());
+
+        original_value
+    }


Should we allow some option to write to storage without reading evicted value! I think this feature is not always necessary, and reading from storage can be slow if the value is not in the diff right?

Same with remove_storage.

I see this is part of IO trait, and probably that involve a bigger change.

Don't take this comment as blocking, just something that passed through my mind while reading the code.

The interface for the IO trait is based on the one from NEAR itself where write also takes a register to store the old value.

I agree the current implementation is more overhead when the old value is not used. An alternative implementation is the following:

We add another variant to EngineStorageValue called LazyDBRead which holds the key. And its AsRef implementation actually performs the read to return the value. Then for write_storage and remove_storage we check if the key is in the diff, if so we capture the value as we do now, but if not this LazyDBRead is returned insteasd. This means no DB read is done unless the returned value is used. It works because we do not modify the DB during the lifetime of this IO instance (writes only change the Diff), so we can technically perform the write before reading the old value from the DB.

What do you think about that? It is more efficient (writes only require allocating memory to store the key, not DB reads unless the value is used), but more complex also. I'm not sure what level of optimization we need for the standalone engine. It probably has an inherent advantage in terms of performance because it does not require wasm interpretation.

This is an excellent suggestion, though not strictly necessary for now. I'm ok skipping it.

engine-standalone-storage/src/engine_state.rs

birchmd · 2021-11-24T18:01:38Z

The current implementation doesn't allow to iterate through ranges in the storage. One of the suggestions made in the notes were:

Maintain the current state indexed by an empty key. This can provide range queries for the latest storage only.

Do we plan to support this as well?

I don't think I will support this. Having a "hot" state that is updated with every block seems like it will make it hard to get a consistent view of the state. I think if we need to do range queries it should be possible the get the relevant data from diffs alone. It might be a high latency operation, but it seems like use cases requiring range queries will be infrequent and not require instantaneous responses.

birchmd · 2021-11-24T18:05:23Z

The three storage lifetimes, thats more of a optimisation to prevent cloning of large storage states I take it?

The lifetimes are needed because the engine currently requires IO + Copy on its storage access trait bound. None of the rust smart pointers (Box, Rc, Arc, etc) implement Copy, so regular references needed to be used. Semantically there are three different references involved, with potentially distinct origins, and therefore three lifetimes. That said, in practice the implementation ends up with all three lifetimes being the same so we could simplify to a single lifetime. I kept the lifetimes independent though because I think it is an implementation detail that they all happen to come out the same and in the future we may wish to change the implementation such that the lifetimes are different.

birchmd · 2021-11-24T20:15:47Z

@joshuajbouw @mfornet Thanks for the reviews! I've addressed all the comments. I've also added the last test I wanted to include for this feature. I think this PR is ready to merge once I have your approvals.

joshuajbouw

Just note, I believe there are a few optimisations here. But, we need to get this over the edge and its fine for now.

Specifically with write_storage, we require .to_vec(), and taking in owned values as a reference even though we are taking ownership of them. Likewise, write_storage_direct shouldn't exist.

engine-standalone-storage/src/diff.rs

birchmd · 2021-11-25T18:39:58Z

Just note, I believe there are a few optimisations here. But, we need to get this over the edge and its fine for now.

Specifically with write_storage, we require .to_vec(), and taking in owned values as a reference even though we are taking ownership of them. Likewise, write_storage_direct shouldn't exist.

write_storage_direct makes sense in the NEAR-based IO implementation as a performance optimization because it means we don't need to copy values from a register just to send it back to the register again (the StorageValue type is a register ID). In this implementation I agree it doesn't add any value. But I also think it's harm is pretty minimal, so I doubt it'll cause any performance issue. If it does then the implementation can be adjusted to reduce cloning bytes.

* Feat(engine): London hard fork support (#244) * Fix(exit precompile): Address to refund in case of error is an argument (#311) * Feat(engine): Make engine parametric in storage access (#314) * Test verifying the EVM log returns the correct address (#341) * Remove sdk::current_account_id usage from engine-precompiles (#346) * Remove Default trait bound from engine IO (#342) * Remove some sdk usage from core logic (#347) * Factor out blockchain environment variable access as a trait (#349) * Factor out NEAR promise host functions into a trait (#353) * Borsh deserialized value field for call args (#351) * Refactor(eth-connector): Use Result return values instead of panicking (#355) * Gate all NEAR host functions behind the contract feature (#356) * Bump @openzeppelin/contracts from 4.3.2 to 4.3.3 in /etc/eth-contracts * Chore: Newtypes for gas (#344) * Feat(standalone): Standalone (#345) * Minor fixes to sdk refactor (#359) * Refactor(engine): Move submit logic into engine module (#366) * Feat(standalone): Storage backend (#375) * NEAR random numbers from solidity contract (#368) * Feat(standalone): EVM tracing via SputnikVM (#383) * Feat(standalone): Bootstrap storage from relayer and state snapshots (#379) * Feat(standalone): Structures and logic for keeping storage in sync with the blockchain (#382) * Feat(standalone): Capture geth-like tracing from SputnikVM events (#384) * Connector cleanup (#374) * Remove betanet * Fix(engine): original_storage bug fix; more tracing tests (#390) * Increase NEAR Gas for ft_on_transfer (#389) Co-authored-by: Andrew Bednoff <[email protected]> Co-authored-by: dependabot[bot] <49699333+dependabot[bot]@users.noreply.github.com> Co-authored-by: Evgeny Ukhanov <[email protected]> Co-authored-by: Marcelo Fornet <[email protected]> Co-authored-by: Michael Birch <[email protected]>

birchmd added A-storage Area: Any changes that affect storage. A-standalone Area: the standalone engine EVM labels Nov 23, 2021

birchmd requested review from joshuajbouw and mfornet November 23, 2021 03:20

birchmd linked an issue Nov 23, 2021 that may be closed by this pull request

Standalone storage #370

Closed

joshuajbouw suggested changes Nov 24, 2021

View reviewed changes

mfornet reviewed Nov 24, 2021

View reviewed changes

mfornet self-requested a review November 25, 2021 11:14

mfornet approved these changes Nov 25, 2021

View reviewed changes

joshuajbouw approved these changes Nov 25, 2021

View reviewed changes

engine-standalone-storage/src/diff.rs Outdated Show resolved Hide resolved

Feat(standalone): Storage backend (aurora-is-near#375)

5bc72a4

birchmd force-pushed the standalone-storage branch from 01d8026 to 5bc72a4 Compare November 25, 2021 18:36

birchmd merged commit 5bc72a4 into aurora-is-near:develop Nov 25, 2021

birchmd deleted the standalone-storage branch November 25, 2021 18:47

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Feat(standalone): Storage backend #375

Feat(standalone): Storage backend #375

birchmd commented Nov 23, 2021 •

edited

Loading

joshuajbouw left a comment

joshuajbouw commented Nov 24, 2021

mfornet left a comment

mfornet Nov 24, 2021

mfornet Nov 24, 2021

mfornet Nov 24, 2021

birchmd Nov 24, 2021

mfornet Nov 25, 2021

birchmd commented Nov 24, 2021

birchmd commented Nov 24, 2021

birchmd commented Nov 24, 2021

joshuajbouw left a comment •

edited

Loading

birchmd commented Nov 25, 2021

Feat(standalone): Storage backend #375

Feat(standalone): Storage backend #375

Conversation

birchmd commented Nov 23, 2021 • edited Loading

joshuajbouw left a comment

Choose a reason for hiding this comment

joshuajbouw commented Nov 24, 2021

mfornet left a comment

Choose a reason for hiding this comment

mfornet Nov 24, 2021

Choose a reason for hiding this comment

mfornet Nov 24, 2021

Choose a reason for hiding this comment

mfornet Nov 24, 2021

Choose a reason for hiding this comment

birchmd Nov 24, 2021

Choose a reason for hiding this comment

mfornet Nov 25, 2021

Choose a reason for hiding this comment

birchmd commented Nov 24, 2021

birchmd commented Nov 24, 2021

birchmd commented Nov 24, 2021

joshuajbouw left a comment • edited Loading

Choose a reason for hiding this comment

birchmd commented Nov 25, 2021

birchmd commented Nov 23, 2021 •

edited

Loading

joshuajbouw left a comment •

edited

Loading