Skip to content
This repository has been archived by the owner on Apr 4, 2024. It is now read-only.

adr: draft ADR for stateful precompiled contracts #1131

Closed
wants to merge 5 commits into from

Conversation

yihuang
Copy link
Contributor

@yihuang yihuang commented Jun 16, 2022

ref: #1116

Description

Example implementation: https://github.com/yihuang/ethermint/tree/precompiled

IBC example:


For contributor use:

  • Targeted PR against correct branch (see CONTRIBUTING.md)
  • Linked to Github issue with discussion and accepted design OR link to spec that describes this work.
  • Code follows the module structure standards.
  • Wrote unit and integration tests
  • Updated relevant documentation (docs/) or specification (x/<module>/spec/)
  • Added relevant godoc comments.
  • Added a relevant changelog entry to the Unreleased section in CHANGELOG.md
  • Re-reviewed Files changed in the Github PR explorer

For admin use:

  • Added appropriate labels to PR (ex. WIP, R4R, docs, etc)
  • Reviewers assigned
  • Squashed all commits, uses message "Merge pull request #XYZ: [title]" (coding standards)

@yihuang yihuang marked this pull request as draft June 16, 2022 10:09
@yihuang yihuang marked this pull request as ready for review July 6, 2022 03:28
@facs95
Copy link
Contributor

facs95 commented Sep 1, 2022

This is awesome! But collides a bit with the designed proposed in #1272 with a more modular proposition. We should try to decide towards one model and work together towards that model. If we agree the other proposal is the way to go we could either close this or adapt it to meet the modular design. Open for discussion 🙏

@yihuang
Copy link
Contributor Author

yihuang commented Sep 2, 2022

This is awesome! But collides a bit with the designed proposed in #1272 with a more modular proposition. We should try to decide towards one model and work together towards that model. If we agree the other proposal is the way to go we could either close this or adapt it to meet the modular design. Open for discussion 🙏

I think the difference is mainly stateful vs stateless, we can rework this one after the stateless part is done.

@loredanacirstea
Copy link
Contributor

This draft ADR is similar to the work I did back in February 2022:

go-ethereum

loredanacirstea/go-ethereum@abd62c3

  1. Changes for making precompiles dynamic:
  • add a func getPrecompiles(chainRules params.Rules) map[common.Address]PrecompiledContract function, used to instantiate the precompiles in NewEVM
  1. Changes for making precompiles stateful
  • adding evm *EVM, caller ContractRef information when running a precompile - see RunPrecompiledContract

Changes 1 and 2 were mostly isolated in loredanacirstea/go-ethereum@abd62c3.

  1. Other changes for advanced precompiles that are original work
  • CallWithState, RunWithInitialState - for the interpreter precompile
  • giving access to the internal chain state, for introspection precompile

ethermint

Then, ethermint could just use the NewEVM constructor with custom precompiles loredanacirstea@5a8c3a7

But I think @fedekunze's approach is superior because it does not require go-ethereum changes. So, we don't need to expend effort in supporting a geth fork or in convincing the geth team to change their source.

@yihuang
Copy link
Contributor Author

yihuang commented Sep 15, 2022

This draft ADR is similar to the work I did back in February 2022:

Yeah, It's directly inspired by your work, the main development is to make stateful precompiles work with statedb snapshot and revert.

@loredanacirstea
Copy link
Contributor

loredanacirstea commented Sep 15, 2022

@yihuang We should define what stateless and stateful precompiles are. Because I see several categories:

  1. Stateless - as in they do not need to keep any precompile-specific state, nor do they need other state than what they receive as direct input. The equivalent of pure functions in Solidity.
  2. State-viewing - precompiles do not have their own state, but they use internal state (e.g. introspection precompile, where you can get info about transactions/blocks/events). E.g. view functions in Solidity
  3. Stateful - precompiles that need to keep their own internal state (even if temporary)
  4. State-changing - precompiles that change the global state (EVM sstore, etc.). E.g. mutable/non-payable/payable functions in Solidity
  5. [update] transaction initiator - precompiles that initiate transactions (e.g. EVM transactions through the account abstraction precompile that I developed)

@yihuang
Copy link
Contributor Author

yihuang commented Sep 15, 2022

@yihuang We should define what stateless and stateful precompiles are. Because I see several categories:

  1. Stateless - as in they do not need to keep any precompile-specific state, nor do they need other state than what they receive as direct input. The equivalent of pure functions in Solidity.
  2. State-viewing - precompiles do not have their own state, but they use internal state (e.g. introspection precompile, where you can get info about transactions/blocks/events). E.g. view functions in Solidity
  3. Stateful - precompiles that need to keep their own internal state (even if temporary)
  4. State-changing - precompiles that change the global state (EVM sstore, etc.). E.g. mutable/non-payable/payable functions in Solidity

Hmm, what I mean here is the contract modifies the global state, specifically, modifies the cosmos-sdk states, like bank transfers, or sending ibc packets.

@itsdevbear
Copy link
Contributor

@yihuang @loredanacirstea. All of the above makes sense, I've been doing some research into getting state changing precompile working and ultimately am realizing that the StateDB is the problem.

IMO we need to find a way to make it so that an EVM txn reverting reverts the Cosmos txn as well. The design decision to make a reverting EVM transaction a successful cosmos transaction is a huge antipattern in my opinion, and is the root of the StateDB commit() revert() issue.

@yihuang
Copy link
Contributor Author

yihuang commented Sep 22, 2022

@yihuang @loredanacirstea. All of the above makes sense, I've been doing some research into getting state changing precompile working and ultimately am realizing that the StateDB is the problem.

IMO we need to find a way to make it so that an EVM txn reverting reverts the Cosmos txn as well. The design decision to make a reverting EVM transaction a successful cosmos transaction is a huge antipattern in my opinion, and is the root of the StateDB commit() revert() issue.

This ADR proposed a solution to this problem, with examples.

@itsdevbear
Copy link
Contributor

@yihuang correct. But the issue is that we can't call arbitrary module logic, in the way that we can pass CosmosMsg's from CosmWasm VM to the module directly. Going this route, would required duplicating logic that is possibly already defined in it's keeper.

@yihuang
Copy link
Contributor Author

yihuang commented Sep 22, 2022

@yihuang correct. But the issue is that we can't call arbitrary module logic, in the way that we can pass CosmosMsg's from CosmWasm VM to the module directly. Going this route, would required duplicating logic that is possibly already defined in it's keeper.

yes, we have to carefully manage temporary states in memory.
cosmos-sdk do have a CacheContext, which I don't think it can play well with StateDB, and deeply nested CacheContext perform very bad.

@itsdevbear
Copy link
Contributor

Seems like bad developer UX, there has to be a way to get it to work like how CosmWasm works. That should be the end goal for Stateful Precompiles IMO.

@itsdevbear
Copy link
Contributor

Having an EthTx revert, revert the whole Cosmos Txn would resolve this issue, no?

@yihuang
Copy link
Contributor Author

yihuang commented Sep 22, 2022

Seems like bad developer UX, there has to be a way to get it to work like how CosmWasm works. That should be the end goal for Stateful Precompiles IMO.

I think CosmWasm call native functionalities by passing async messages, similar to how we do the log hooks currently.

@yihuang
Copy link
Contributor Author

yihuang commented Sep 22, 2022

Having an EthTx revert, revert the whole Cosmos Txn would resolve this issue, no?

The exception revert can happen in nested way.

@itsdevbear
Copy link
Contributor

Needless to say, I think the approach should be attempt to get synchronous ability to call Cosmos SDK modules from the evm. Having to duplicate all the logic seems like a massive way to introduce issues.

@dreamer-zq
Copy link

dreamer-zq commented Oct 31, 2022

@yihuang @dreamer-zq I've proposed an approach to use CacheContext and still handle nested reverts. Essentially, by implementing the cache contexts in the statedb journals, cache contexts can follow the snapshot/revert logic properly, even in nested contract calls/reverts! Please take a look at this doc, which provides further explanation on this approach.

A first attempt Proof of Concept implementation can be found in this PR, notable files include: x/evm/statedb/interfaces.go, x/evm/statedb/statedb.go, x/evm/vm/geth/geth.go, and x/evm/types/precompiles.go. Note: this PR works on the latest version of geth without modifications, but requires shadowing many methods.

@calbera I have two questions,

  1. evm When executing opcode, the injected evm object in EVMInterpreter is still go-etherum Implementation, then how to jump to the call method implemented in ethermint? My understanding is that only by injecting the evm implemented in ethermint into the EVMInterpreter will the execution logic of the opcode be overwritten. I may have misunderstood the execution chain of EVMInterpreter
  2. statedb When executing the commit method, why only the last ExtJournalEntry is committed? Why does the state generated in the middle do not need to manually execute commit?

@itsdevbear
Copy link
Contributor

@yihuang personally I'm not against a very mild fork with an extremely small diff. Though I know @fedekunze is quite against.

Why would we need nonce?

An invariant that is checked post call to ensure nonce hasn't been altered by a native call seems reasonable.

@calbera
Copy link

calbera commented Oct 31, 2022

@yihuang @dreamer-zq I've proposed an approach to use CacheContext and still handle nested reverts. Essentially, by implementing the cache contexts in the statedb journals, cache contexts can follow the snapshot/revert logic properly, even in nested contract calls/reverts! Please take a look at this doc, which provides further explanation on this approach.
A first attempt Proof of Concept implementation can be found in this PR, notable files include: x/evm/statedb/interfaces.go, x/evm/statedb/statedb.go, x/evm/vm/geth/geth.go, and x/evm/types/precompiles.go. Note: this PR works on the latest version of geth without modifications, but requires shadowing many methods.

@calbera I have two questions,

  1. evm When executing opcode, the injected evm object in EVMInterpreter is still go-etherum Implementation, then how to jump to the call method implemented in ethermint? My understanding is that only by injecting the evm implemented in ethermint into the EVMInterpreter will the execution logic of the opcode be overwritten. I may have misunderstood the execution chain of EVMInterpreter
  2. statedb When executing the commit method, why only the last ExtJournalEntry is committed? Why does the state generated in the middle do not need to manually execute commit?
  1. That's right. The EVMInterpreter is currently go-eth so it will not correctly call the modified version of Call. This will require significant amounts of shadowing, which includes rewriting the opcode functions (opCall, opCallCode, etc.) and the jump table to point to these new functions.
  2. Essentially, each ExtJournalEntry holds a cache context. A subsequent ExtJournalEntry will add state by simply appending to the previous cache context. So in effect each cache context holds all state changes before it and the current state changes, and therefore only the last cache context needs to be committed.

@yihuang
Copy link
Contributor Author

yihuang commented Nov 1, 2022

Essentially, each ExtJournalEntry holds a cache context. A subsequent ExtJournalEntry will add state by simply appending to the previous cache context. So in effect each cache context holds all state changes before it and the current state changes, and therefore only the last cache context needs to be committed.

But you'll need to hold the whole stack of cache contexts to be able to commit all the way back to the top level, commit once only commit to the upper layer.

@dreamer-zq
Copy link

@yihuang @dreamer-zq I've proposed an approach to use CacheContext and still handle nested reverts. Essentially, by implementing the cache contexts in the statedb journals, cache contexts can follow the snapshot/revert logic properly, even in nested contract calls/reverts! Please take a look at this doc, which provides further explanation on this approach.
A first attempt Proof of Concept implementation can be found in this PR, notable files include: x/evm/statedb/interfaces.go, x/evm/statedb/statedb.go, x/evm/vm/geth/geth.go, and x/evm/types/precompiles.go. Note: this PR works on the latest version of geth without modifications, but requires shadowing many methods.

@calbera I have two questions,

  1. evm When executing opcode, the injected evm object in EVMInterpreter is still go-etherum Implementation, then how to jump to the call method implemented in ethermint? My understanding is that only by injecting the evm implemented in ethermint into the EVMInterpreter will the execution logic of the opcode be overwritten. I may have misunderstood the execution chain of EVMInterpreter
  2. statedb When executing the commit method, why only the last ExtJournalEntry is committed? Why does the state generated in the middle do not need to manually execute commit?
  1. That's right. The EVMInterpreter is currently go-eth so it will not correctly call the modified version of Call. This will require significant amounts of shadowing, which includes rewriting the opcode functions (opCall, opCallCode, etc.) and the jump table to point to these new functions.
  2. Essentially, each ExtJournalEntry holds a cache context. A subsequent ExtJournalEntry will add state by simply appending to the previous cache context. So in effect each cache context holds all state changes before it and the current state changes, and therefore only the last cache context needs to be committed.

Thank you for your reply

@loredanacirstea
Copy link
Contributor

I added Cosmos sdk.Context and vm.EVM as context to precompiles: #1433
And if this PR ethereum/go-ethereum#26119 is merged, it will also provide current call/subcall context.

@itsdevbear
Copy link
Contributor

itsdevbear commented Nov 6, 2022

@loredanacirstea very similar to the arch @calbera and I are starting to come to, this is awsome.

The geth PR is sweet too, one issue we faced when getting this going without a fork was that we ended up having to pull much the evm out into ethermint to shadow.

@loredanacirstea
Copy link
Contributor

loredanacirstea commented Nov 6, 2022

@itsdevbear
Yes, I know. This is based on work I did back in February and I returned to simplify it.
Would be good to thumbs up the go-ethereum PR.

@yihuang
Copy link
Contributor Author

yihuang commented Nov 16, 2022

cosmos/cosmos-sdk#13881

If we fix the slowness of nested cache context, we might reconsider the nested cache context solution, which would be perfect for precompiles.

@dreamer-zq
Copy link

@loredanacirstea I was thinking that if we customize an opCode, then define the evm in the interpreter as an interface, and then suggest that the etherum team export the interpreter as a public variable, so that the modification may be less and easier for them to accept? Of course, this is my simple idea, there may be something I haven't considered

@loredanacirstea
Copy link
Contributor

I have a full working quasar precompile (cosmos sdk messages & queries from inside the EVM). This means stateful precompiles, solved call context nesting, efficient state sync between EVM StateDB & Cosmos context, gas metering of the mixed context! It is publicly verifiable on the Mythos chain. And demoed here, with more details: https://youtu.be/COu5Olszhtg. If you want to see the example contract used for testing, see min 7:09.

@yihuang
Copy link
Contributor Author

yihuang commented Dec 10, 2022

I have a full working quasar precompile (cosmos sdk messages & queries from inside the EVM). This means stateful precompiles, solved call context nesting, efficient state sync between EVM StateDB & Cosmos context, gas metering of the mixed context! It is publicly verifiable on the Mythos chain. And demoed here, with more details: https://youtu.be/COu5Olszhtg. If you want to see the example contract used for testing, see min 7:09.

I'm curious how you do the context nesting and statedb thing.

@loredanacirstea
Copy link
Contributor

I'm curious how you do the context nesting and statedb thing.

Of course.
But I hope you know that my effort until now has been volunteer effort (under The Laurel Project, unpaid). Recent events have forced me to change my approach. I will be continuing my innovations on Mythos and if any chain is interested to also have them, I invite them there (link to Discord in the video description).

@itsdevbear
Copy link
Contributor

Sad to see Ethermint going down this path, but @loredanacirstea I totally understand your rationale, given the state of the project. We recently had to go this route as well and it totally sucks. If you ever want to collaborate on this topic please reach out to either myself or @calbera.

Godspeed 🙏

@fedekunze
Copy link
Contributor

Sad to see Ethermint going down this path, but @loredanacirstea I totally understand your rationale, given the state of the project. We recently had to go this route as well and it totally sucks.

@itsdevbear care to elaborate on why you wrote this comment?

@yihuang
Copy link
Contributor Author

yihuang commented Dec 16, 2022

I still only see one way to support precompiles with CacheContext, which is revert back to the context stack solution before the StateDB refactoring, that's the only way the state observed by precompile and evm contract are consistent.
With recent merge of PR cosmos/cosmos-sdk#13881, I think we should give it a try, WDYT @fedekunze ? Some degree of performance regression is expected, but hopefully not too much.

@loredanacirstea
Copy link
Contributor

@fedekunze please clarify if this is the official stance of Tharsis, as cited in my comment here https://commonwealth.im/evmos/discussion/8496-bring-quasar-to-evmos-execute-cosmos-transactions-queries-from-the-evm-from-any-smart-contract?comment=41558

As the Evmos Core Development team, we won’t be using the Quasar codebase that you have ended up writing. The Evmos Core Development team will be proposing its own solution for this.

@github-actions
Copy link

github-actions bot commented Feb 1, 2023

This pull request has been automatically marked as stale because it has not had recent activity. It will be closed in 7 days-before-close if no further activity occurs.

@yihuang
Copy link
Contributor Author

yihuang commented Feb 1, 2023

This PR is not relevant now, @mmsqe is taking a different approach on this issue, based on the cache store refactoring here: cosmos/cosmos-sdk#14444, we can expect an implementation PR directly soon.

@yihuang yihuang closed this Feb 1, 2023
@0xadu
Copy link

0xadu commented Mar 1, 2023

Ohh @yihuang I see what you mean, specifically the evm_denom balances, this should be easily solvable as well thankfully!

evm_denom balances, account nonces, I don't know how to do that, would be exciting to know how you guys did it. If you wrote those into ctx, it basically go back to the approach before journal logs, we'll have one nested cache layer for each message call. If we can modify go-ethereum to call statedb on message call return, we can flatten the cache layer there.

@yihuang thanks for your insightful discussions.
Does this mean that we need to access the dirty data that's not committed when we make a .call?

@yihuang
Copy link
Contributor Author

yihuang commented Mar 1, 2023

@0xadu

the ideas in this doc are out dated, you should checkout the new implementation here: #1650

@0xadu
Copy link

0xadu commented Mar 1, 2023

@0xadu

the ideas in this doc is out dated, you can checkout the new implementation here: #1650

thnaks, I see.
I'm just curious why the original journal-based statedb approach does not work.

@yihuang
Copy link
Contributor Author

yihuang commented Mar 1, 2023

@0xadu
the ideas in this doc is out dated, you can checkout the new implementation here: #1650

thnaks, I see. I'm just curious why the original journal-based statedb approach does not work.

do you mean wrap existing cache store into a journal entry? the performance is very bad with deeply nested cache store.

@0xadu
Copy link

0xadu commented Mar 1, 2023

I mean AFAIK the statedb has been refactored to replace nested cachestore with journals, but you seem to imply that we need to go back to the original cachestore approach to support stateful precompiled contracts. I'm not clear about the motivation. Correct me if I'm wrong.

@yihuang
Copy link
Contributor Author

yihuang commented Mar 1, 2023

I mean AFAIK the statedb has been refactored to replace nested cachestore with journals, but you seem to imply that we need to go back to the original cachestore approach to support stateful precompiled contracts. I'm not clear about the motivation. Correct me if I'm wrong.

because native code don't have access to the dirty cache in statedb, for example the current balance of evm denom.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants