Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

EIP 1186 eth_getProof support #1185

Merged
merged 8 commits into from
May 15, 2019
Merged

Conversation

paouvrard
Copy link
Contributor

Hi everyone,
I've been working on patricia merkle proofs verification in python so i though i'd contribute the code here.
The eth_getProof api was just added to geth and parity a few months ago and it's a really cool tool to make state merkle proofs.
specification : https://github.com/ethereum/EIPs/blob/master/EIPS/eip-1186.md

What was wrong?

eth_getProof wasn't available in web3py

How was it fixed?

  • getProof was added to the eth api
  • verify_eth_getProof was added to _utils so that a proof can be verified for a given state root
    For testing verify_eth_proof() i simply hardcoded storage proofs for different keys making sure each node type is tested.

Usage

Example solidity contract:

contract Proof {
        string public greeting;
        mapping (address => uint) public my_map;
        constructor() public {
                greeting = 'Hello';
                my_map[msg.sender] = 33333;
        }
}

Verifying storage proofs for 'greeting' and 'my_map':

from web3 import Web3, IPCProvider
from web3.middleware import geth_poa_middleware
from web3._utils.proof import verify_eth_getProof, storage_position

w3 = Web3(IPCProvider(...))
w3.middleware_stack.inject(geth_poa_middleware, layer=0)
w3.eth.defaultAccount = Web3.toChecksumAddress('0x...')

block = w3.eth.getBlock('latest')
greeting = "0x0"
my_map_sender = storage_position(w3.eth.defaultAccount, "0x1")
proof = w3.eth.getProof(contract_addr, [greeting, my_map], block.number)
is_valid_proof = verify_eth_getProof(proof, block.stateRoot)

You might want to organize things differently so feedback is very welcome.

Cute Animal Picture

screen shot 2019-01-07 at 3 04 00 pm

nibble_to_number[c] = i


class _Account(rlp.Serializable):
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Side note, it'd be nice if we could re-use the same data structure across py-evm and web3.py. I believe that eth-account also has at least one copy of a serializable. New library, something like:

  • eth-datatypes

Copy link
Contributor Author

@paouvrard paouvrard Jan 8, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wasn't able to find something in eth-account that could be used instead of _Account(rlp.Serializable). Btw i got this from pyethereum

]


def verify_eth_getProof(proof, root):
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Since the getProof API doesnt return a bool for whether or not a value is included (inclusion or exclusion should be verified for each key being proved), verifying the getProof result means verifying all values are included or all are not included. But we can't verify a mixture of inclusion and exclusion storage proofs. In this case verify_eth_getProof verifies all keys are included. possible rename to verify_getProof_inclusion ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this should be a publicly-used API? If so, it shouldn't be in _utils. If not, it's not as important to find an ideal name. inclusion does seem a bit better, at first glance.

@paouvrard
Copy link
Contributor Author

@pipermerriam, @carver, what else can be done to move this forward ?

Copy link
Collaborator

@carver carver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the submission! Sorry about the long delay. It's totally welcome to re-ping us if you don't hear for a week or two. Sometimes things get busy and buried.

It would be awesome to test against parity and geth clients in the integration tests. I also made a few in-line comments, and have questions about which parts you think out to be public APIs.

def test_verify_eth_getProof():
block_stateRoot = "0xe8cbf0ef6814a55f071be90cdc0b699a3795208a0792ca18bc2a3b7947a594b7"
stateRoot = bytes.fromhex(block_stateRoot[2:])
assert verify_eth_getProof(exp, stateRoot)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is intended to be a public API, then it should be able to understand hex-encoded state roots, also.

]


def verify_eth_getProof(proof, root):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you think this should be a publicly-used API? If so, it shouldn't be in _utils. If not, it's not as important to find an ideal name. inclusion does seem a bit better, at first glance.

@@ -265,6 +284,7 @@ def to_hexbytes(num_bytes, val, variable_length=False):
),
'eth_getCode': apply_formatter_at_index(block_number_formatter, 1),
'eth_getStorageAt': apply_formatter_at_index(block_number_formatter, 2),
'eth_getProof': apply_formatter_at_index(block_number_formatter, 2),
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's tricky to navigate all the formatters, especially split ones like this and the rpc_abi ones. Nice work tracking all these down!

@@ -178,6 +178,7 @@ def personal_send_transaction(eth_tester, params):
),
'getBalance': call_eth_tester('get_balance'),
'getStorageAt': not_implemented,
'getProof': not_implemented,
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Of course, it would be ideal to get eth-tester to support this. Seems out of scope, but once it's back in, it will enable more straightforward testing options back here in web3.py.

return True


def _verify(expected_root, key, proof, key_index, proof_index, expected_value):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We definitely don't want to maintain another copy of a trie navigator here, py-trie is the canonical one. I think we can accomplish the goal of this method using trie.HexaryTrie.get_from_proof().

We probably don't want to add py-trie as a core dependency, but adding it to the tests extra seems reasonable.

I think this module is only being used for testing, in which case we can change to something like web3._utils.test_proofs, plus a comment at the top of the file that it's only intended to be imported in testing contexts (because the py-trie dependency is only satisfied during testing).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for reviewing, i didn't know about py-trie :-)
My idea for adding verify_eth_getProof() is that a user should trust only the state root and after verifying the proofs (with web3py) returned by getProof() he can safely use the values in the proof object.

So it depends if you want the proof verification to be supported by web3py or not :

  • if yes the proof verification should be a public API : i can refactor and use py-trie as core dependency
  • if not, web3py is just used to get the getProof() result and users need to write their own code to verify it (testing for proof verification is not even needed in this case i think)

Which do you think is suitable for web3py ?

Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Cool, happy to help!

My instinct is that for now web3.py just returns the proof. The docs can demonstrate the couple of lines it takes to use py-trie to get a value from the proof.

It's possible that a project would want to get the proof only to record to a database or re-communicate over the network. So I'd say let's avoid adding the py-trie dependency until it becomes obvious that it's necessary.

)


def storage_position(map_key_hex, position_hex):
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this method used anywhere? Looks like it can be dropped.

in the contract
'''
key = pad_hex(map_key_hex, 256)[2:] + pad_hex(position_hex, 256)[2:]
return keccak(bytes.fromhex(key)).hex()
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this method is to be kept, I generally expect internal APIs to be manipulating bytes instead of hex-encoded strings. The manipulation just looks so much cleaner like:

    key = map_key.rjust(32, b'\0') + position.rjust(32, b'\0')
    return keccak(key)

And then we can convert in and out of hex when we need to for user interaction.

@paouvrard
Copy link
Contributor Author

@carver
As we talked in previous comments, i removed the proof verification from the scope.
This is an example code that verifies proofs are valid for a state root using py-trie.
Please advise if and where to include it in the docs.

from eth_utils import (
    keccak,
)
import rlp
from rlp.sedes import (
    Binary,
    big_endian_int,
)
from trie import (
    HexaryTrie,
)
from web3._utils.encoding import (
    pad_bytes,
)


def format_proof_nodes(proof):
    trie_proof = []
    for rlp_node in proof:
        trie_proof.append(rlp.decode(bytes(rlp_node)))
    return trie_proof


def verify_eth_getProof(proof, root):
    trie_root = Binary.fixed_length(32, allow_empty=True)
    hash32 = Binary.fixed_length(32)

    class _Account(rlp.Serializable):
        fields = [
                    ('nonce', big_endian_int),
                    ('balance', big_endian_int),
                    ('storage', trie_root),
                    ('code_hash', hash32)
                ]
    acc = _Account(
        proof.nonce, proof.balance, proof.storageHash, proof.codeHash
    )
    rlp_account = rlp.encode(acc)
    trie_key = keccak(bytes.fromhex(proof.address[2:]))
   
    assert rlp_account == HexaryTrie.get_from_proof(
        block.stateRoot, trie_key, format_proof_nodes(proof.accountProof)
    ), "Failed to verify account proof {}".format(proof.address)

    for storage_proof in proof.storageProof:
        trie_key = keccak(pad_bytes(b'\x00', 32, storage_proof.key))
        root = proof.storageHash
        if storage_proof.value == b'\x00':
            rlp_value = b''
        else:
            rlp_value = rlp.encode(storage_proof.value)

        assert rlp_value == HexaryTrie.get_from_proof(
            root, trie_key, format_proof_nodes(storage_proof.proof)
        ), "Failed to verify storage proof {}".format(storage_proof.key)

    return True

Copy link
Collaborator

@carver carver left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I think it's fine to insert the example proof validation in the eth.getProof section of the docs. Thanks!

@paouvrard
Copy link
Contributor Author

I think that's all from my side, tell me if it needs anything else :-)

@carver
Copy link
Collaborator

carver commented May 15, 2019

Looks good, thanks for the contribution. Welcome to the repo!

I'll squash it down and merge.

Feel free to do your own rebasing next time to squash away commits like "sort imports" before the final check, but it's not a problem for us to do a squash commit.

@carver carver merged commit 11ef9df into ethereum:master May 15, 2019
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants