EVM <> FVM mapping #39
@@ -42,7 +42,71 @@ However, keep in mind that execution is ultimately controlled by FVM gas and not

## Account model

## Address semantics
## Addressing scheme
Ethereum uses 160-bit (20-byte) addresses. Addresses are the keccak-256 hash of the public key of an account, truncated to preserve the 20 rightmost bytes. Solidity and the [Contract ABI spec](https://docs.soliditylang.org/en/v0.5.3/abi-spec.html) represent addresses with the `address` type, equivalent to `uint160`.

There's an active, yet informal proposal to [increase the address width to 32 bytes](https://ethereum-magicians.org/t/increasing-address-size-from-20-to-32-bytes/5485).

In Filecoin, addresses are multi-class, and there are currently four recognized classes. Sidenote: they're actually called _protocols_ in the spec, but we'll refrain from using that term here because it's hopelessly overloaded.

The address byte representation is as follows:

```
class (1 byte) || payload (n bytes)
```

Thus, the total length of the address varies depending on the address class.

- Class `0` (ID addresses): payload is [multiformats-style uvarint](https://github.com/multiformats/unsigned-varint). Maximum 9 bytes.
- Class `1` (Secp256k1 key): payload is a blake2b-160 hash of the secp256k1 pubkey. Fixed 20 bytes.
- Class `2` (actor addresses): payload is a blake2b-160 hash of some payload generated by the init actor. Fixed 20 bytes.
- Class `3` (BLS key): payload is an inlined BLS public key. Fixed 48 bytes.
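The byte layout above can be illustrated with a minimal Python sketch. The function names and constants are hypothetical, chosen only for this example, and the public keys in any usage are dummy bytes; this is not the reference implementation of Filecoin addresses.

```python
import hashlib

# Address classes ("protocols" in the Filecoin spec). Names are illustrative.
CLASS_ID, CLASS_SECP256K1, CLASS_ACTOR, CLASS_BLS = 0, 1, 2, 3


def encode_uvarint(value: int) -> bytes:
    """Encode a non-negative integer as a multiformats-style unsigned varint."""
    if value < 0:
        raise ValueError("uvarint cannot encode negative numbers")
    out = bytearray()
    while True:
        byte = value & 0x7F
        value >>= 7
        if value:
            out.append(byte | 0x80)  # continuation bit: more bytes follow
        else:
            out.append(byte)
            return bytes(out)


def blake2b_160(data: bytes) -> bytes:
    """20-byte BLAKE2b digest, the payload size of class 1 and class 2."""
    return hashlib.blake2b(data, digest_size=20).digest()


def encode_address(addr_class: int, payload: bytes) -> bytes:
    """Byte representation: class (1 byte) || payload (n bytes)."""
    return bytes([addr_class]) + payload


def id_address(actor_id: int) -> bytes:
    return encode_address(CLASS_ID, encode_uvarint(actor_id))


def secp256k1_address(pubkey: bytes) -> bytes:
    return encode_address(CLASS_SECP256K1, blake2b_160(pubkey))


def bls_address(pubkey: bytes) -> bytes:
    if len(pubkey) != 48:
        raise ValueError("BLS public keys are 48 bytes")
    return encode_address(CLASS_BLS, pubkey)
```

With this sketch, `len(secp256k1_address(key))` is always 21 and `len(bls_address(key))` is always 49, which are the lengths that matter when comparing against the 20-byte Ethereum address type.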
|
||
In conclusion, the maximum address length in Filecoin is 49 bytes or 392 bits (class 3 address). This creates two problems:

1. The worst-case length is larger than the width of the Ethereum address type. Even if BLS addresses were prohibited in combination with EVM actors, class 1 and class 2 addresses would still exceed the 20-byte limit by 1 byte (due to the class prefix).
2. It exceeds the EVM's 256-bit architecture.
|
||
Problem 1 renders Solidity smart contracts instantly incompatible with the Filecoin addressing scheme, as well as EVM opcodes that take addresses as arguments or return them, e.g. CALLER, CALL, CALLCODE, DELEGATECALL, COINBASE, etc. This problem is hard to work around. It would require either a fork of the EVM that modifies existing opcodes for semantic awareness of addresses (which is really hard to get right), or the introduction of a Filecoin-specific opcode family to deal with Filecoin addresses (e.g. FCALL, FCALLCODE, etc.). The latter would break as-is deployability of existing smart contracts.

> Note: CALLER and COINBASE (and likely others) won't have this issue. All runtime APIs (in the current VM) return ID addresses, but accept (and resolve) other address types.

Problem 2 can be worked around by spilling an address over multiple stack values and recombining them, through Filecoin-specific Solidity libraries.
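To make the spill-over idea concrete, here is a small Python sketch that packs an address into 256-bit words and recovers it. The layout (right-padding with zeros and carrying the length separately) is an assumption made for illustration; the text does not prescribe one, and `split_into_words` and `join_words` are hypothetical names.

```python
from typing import List, Tuple

WORD_SIZE = 32  # EVM word size in bytes (256 bits)


def split_into_words(addr: bytes) -> Tuple[int, List[int]]:
    """Return (original length, 256-bit words holding the address).

    The address is right-padded with zero bytes to a multiple of 32 bytes.
    The length is returned separately so the padding can be removed again.
    """
    padded_len = -(-len(addr) // WORD_SIZE) * WORD_SIZE  # round up
    padded = addr.ljust(padded_len, b"\x00")
    words = [
        int.from_bytes(padded[i:i + WORD_SIZE], "big")
        for i in range(0, padded_len, WORD_SIZE)
    ]
    return len(addr), words


def join_words(length: int, words: List[int]) -> bytes:
    """Inverse of split_into_words: concatenate the words and drop the padding."""
    raw = b"".join(w.to_bytes(WORD_SIZE, "big") for w in words)
    return raw[:length]
```

Under this layout a 49-byte class 3 address occupies two words, while a 21-byte class 1 or class 2 address fits in one.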
|
||
**Solution A: using ID addresses**

There is a simpler solution: use Filecoin ID addresses (max. 10 bytes, including the class byte) everywhere inside EVM execution. However, this comes with drawbacks:

1. EVM smart contracts can't send to nonexistent, stable account addresses and rely on account actor auto-creation, as those addresses can't be used with EVM opcodes (see problem 1). Potential solution: have the caller create the account on chain prior to invoking the EVM smart contract.
2. ID addresses are vulnerable to reorgs within the current finality window, so submitting EVM transactions involving recently created actors (within the last 900 epochs, or 7.5 hours) would be unsafe. Potential solution: have the runtime detect and fail calls involving recently created actors.
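The "detect and fail" idea for recently created actors could look roughly like the following Python sketch. It assumes the runtime can look up each actor's creation epoch (modelled here as a plain dict) and treats the boundary epoch as final; both are assumptions for illustration, and all names are hypothetical.

```python
from typing import Dict, List

EPOCH_DURATION_SECONDS = 30  # Filecoin epoch duration
FINALITY_EPOCHS = 900        # finality window cited in the text


def finality_window_hours() -> float:
    """900 epochs at 30 seconds each, expressed in hours."""
    return FINALITY_EPOCHS * EPOCH_DURATION_SECONDS / 3600


def is_fresh(created_epoch: int, current_epoch: int) -> bool:
    """True if the actor was created within the current finality window."""
    return current_epoch - created_epoch < FINALITY_EPOCHS


class FreshActorError(Exception):
    """Raised when a call involves an actor whose ID address may still be reorged."""


def check_call_targets(created_epochs: Dict[int, int],
                       targets: List[int],
                       current_epoch: int) -> None:
    """Fail the call if any target actor ID was created too recently."""
    for actor_id in targets:
        if is_fresh(created_epochs[actor_id], current_epoch):
            raise FreshActorError(f"actor {actor_id} is inside the finality window")
```

Here `finality_window_hours()` evaluates to 7.5, matching the figure quoted for the finality window.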
|
||
> So, it would be possible to assign every actor a stable address (including account actors) and use that everywhere. That would mean addresses would be 20 bytes and unambiguous. However, we'd have to make a few changes:
>
> But this should be doable. But there's a whole other can of worms...
>
> Basically:
>
> All of this is leading me to believe that we're going to need a bit of an indirection layer. Possibly a registry mapping "EVM" addresses to the rest of the FVM address space.

> This is all a bit confusing so I'll try to explain more in the standup. Unfortunately, documentation is scattered and almost universally of the "here's how to take your first steps in Ethereum" form, not the "this is how this thing actually works" form.

> Aren't pubkey addresses stable addresses? What's the nuance here? Related: account actors are also bound to an ID at creation, so every actor is guaranteed to have an ID address, which is volatile during the current finality window.

> Notes:
>
> Relevant references (in addition to the yellow paper).

> If the pubkey address exists ahead of time, the contract can use a reorg-stable ID address (I'll post a proposal shortly). If the address doesn't exist ahead of time, this becomes harder because the CALL opcode consumes a single word for the recipient address (and probably truncates it to 160 bits), yet our pubkey addresses can span up to 2 Ethereum words.
>
> Yes, 100% agreed.
>
> Yes, but this should be straightforward IMO; we'd generate an f2 address using the user-provided inputs to assemble the preimage passed to

> Sorry, f2 address. You're right, "stable" just means "not f0".
||
**Solution B: using address handles**

If these tradeoffs are unacceptable, we can consider using _address references/handles_ in the FVM EVM calling convention. Input parameters would be enveloped in a tuple:

```
(1) ABI-encoded parameters (using uint160 addr handles) || (2) { addr handle: actual addr }
```

Where:

1. ABI-encoded parameters, with each address position replaced by an indexed uint160 handle.
2. Mapping of handles to real Filecoin addresses.

On an incoming call, the EVM <> FVM shim would unpack the envelope and pass only (1) as input parameters to the smart contract. It would use (2) to resolve the address whenever the smart contract called a relevant opcode. When returning, the EVM <> FVM shim would perform the inverse operation.
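The envelope described above might be modelled as in this Python sketch. It only covers the handle bookkeeping, not real ABI encoding, and every name in it (`to_handles`, `handle_to_word`, `resolve`) is hypothetical. Assigning handles sequentially from 1 is an arbitrary choice made for the example.

```python
from typing import Dict, List, Tuple


def to_handles(addresses: List[bytes]) -> Tuple[List[int], Dict[int, bytes]]:
    """Replace each Filecoin address with a small integer handle.

    Returns (handles in argument order, mapping of handle to actual address).
    The same address always receives the same handle.
    """
    handle_of: Dict[bytes, int] = {}
    table: Dict[int, bytes] = {}
    handles: List[int] = []
    for addr in addresses:
        if addr not in handle_of:
            handle = len(table) + 1  # sequential handles, starting at 1
            handle_of[addr] = handle
            table[handle] = addr
        handles.append(handle_of[addr])
    return handles, table


def handle_to_word(handle: int) -> bytes:
    """Lay a handle out as a 32-byte ABI word, as a uint160 value would be."""
    if not 0 <= handle < 2**160:
        raise ValueError("handle does not fit in uint160")
    return handle.to_bytes(32, "big")


def resolve(table: Dict[int, bytes], handle: int) -> bytes:
    """What the shim would do when the contract uses a handle in an opcode."""
    if handle not in table:
        raise KeyError(f"unknown address handle {handle}")
    return table[handle]
```

In this model, part (1) of the tuple would carry the `handle_to_word` values in place of addresses, and part (2) is the `table` that the shim keeps for the duration of the call.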
|
||
However, address-returning opcodes are still unsolved (e.g. CREATE, CREATE2, COINBASE, CALLER). The contract may want to persist these addresses, so making them return address handles is not an option, as handles aren't safe to persist.

> Actually, this applies more generally. If I pass a handle

> Yeah, this solution seems brittle.

Finally, this approach alters the calling convention, which in turn breaks compatibility with existing Ethereum tooling like wallets (e.g. MetaMask).
|
||
**Solution C: using address guards**

Another alternative consists of adopting ID addresses (as proposed in Solution A) but, when those addresses are "fresh" (i.e. created within the finality window), allowing the caller to pack a stable address guard/assertion into a data structure similar to that of Solution B.

The EVM <> FVM shim would apply assertions prior to invoking the contract.

This solution imposes extra complexity on the caller (so as to determine address freshness). It may require extending the InitActor's state object to inline the creation epoch for ease of query.

This solution also suffers from the ecosystem tooling compatibility drawbacks, just like Solution B.
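The guard check that the shim would run for Solution C can be sketched as follows. The state lookup is modelled as a plain dict from actor ID to stable address, which is an assumption for illustration only; `apply_guards` and `GuardViolation` are hypothetical names.

```python
from typing import Dict


class GuardViolation(Exception):
    """Raised when an ID address no longer maps to the asserted stable address."""


def apply_guards(guards: Dict[int, bytes], id_to_stable: Dict[int, bytes]) -> None:
    """Check every guard before the contract is invoked.

    guards:        actor ID -> stable address the caller expects behind it
    id_to_stable:  current chain state, actor ID -> stable address
    """
    for actor_id, expected in guards.items():
        actual = id_to_stable.get(actor_id)
        if actual != expected:
            raise GuardViolation(
                f"actor {actor_id} does not resolve to the asserted stable address"
            )
```

If a reorg reassigns a fresh ID to a different actor, the lookup no longer matches the guard and the call fails before any contract code runs.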
|
||
## Gas accounting and execution halt semantics

> This covers account addresses. But see https://ethereum.stackexchange.com/questions/760/how-is-the-address-of-an-ethereum-contract-computed for contract addresses.

> Yep, for contract addresses, it's the keccak-256 hash of the RLP encoding of sender + nonce (rightmost 20 bytes). I'll add that for reference.