Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add test: check if decoding is one to one by generating collisions #24

Merged

Conversation

paulperegud
Copy link
Contributor

@paulperegud paulperegud commented Nov 12, 2019

Two encodings that produce identical term: <<47>> and <<129,47>>. This is a violation of "choose optimal length encoding" rule in from protocol description here: https://github.com/ethereum/wiki/wiki/RLP

Note that the decode_length function rejects invalid encodings that have "non-optimal"
lengths, namely (1) singleton strings whose only byte is below 128 that are encoded
with a short (i.e. one-byte) length of 1 instead of as the strings themselves and (2) 
strings and lists with long (i.e. multi-byte) lengths with leading zeros (which must be
absent) or below 56 (which should be encoded using short lengths).

This PR contains test detecting the error only, no fix.

@paulperegud
Copy link
Contributor Author

@pdobacz PR to ExRLP - can't add you as an reviewer, unfortunately.


forall l <- list(gen) do
mapsto = for item <- :lists.usort(l), do: ExRLP.decode(item)
:lists.sort(mapsto) == :lists.usort(mapsto)
Copy link
Contributor

@pdobacz pdobacz Nov 20, 2019

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure what this does, can you throw in some comments/expand the description of the test/make the assertion more explicit, without the use of usorts/make variable names more readable? (ideally some mix of these 4).

I suspect that were checking that two sets are the same, but not sure how this proves the "one to one"...

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Done, please see again :-)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

On one-to-one proof. It looks for two binaries that violate one-to-one property of decoder. It is obviously not a proof since it is not exhaustive. Also it is a mapping in one direction only and no claims about the state of encoder are made in this particular test.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

perfect, thanks for dumbing it down ;). I'm wondering; how sure are we, that https://github.com/mana-ethereum/ex_rlp/pull/24/files#diff-0dbbe73b6149be5f2b00785f37f44760R27 doesn't yield a very small testing set? I'm assuming it doesn't and probably propcheck is smart enough to not feed us duplicates, then in such case maybe let's put a sanity precondition check there, that usort==sort (I can do it later, if I finally get to taking this over).

On one-to-one proof

ah, sorry I used "proves" too liberally, as in "attempts to prove a point/verify a property". All is understood

@ayrat555
Copy link
Member

ayrat555 commented Nov 24, 2019

@paulperegud @pdobacz good catch. Currently, we don't handle errors from https://github.com/ethereum/tests/blame/develop/RLPTests/invalidRLPTest.json It'd be great if you implement it

Related issue - #26

@InoMurko
Copy link
Member

hey @pdobacz @paulperegud are we bringing this one home or was it mitigated differently?

@pdobacz
Copy link
Contributor

pdobacz commented Jan 29, 2020

I'll take a stab at this (and #26)

@pdobacz pdobacz force-pushed the decoding-must-be-a-bijection branch from c0e6914 to b91d2a8 Compare February 5, 2020 13:19
@pdobacz
Copy link
Contributor

pdobacz commented Feb 5, 2020

@paulperegud hey, I rebased this on top of the #27's master and the prop test passes 🎉

Now, I also added a commit, with an alternative formulation of the generator, can you take a look if this makes sense? b91d2a8

  • I dropped the binary() case since it is covered in the other generator already
  • went with 1/ list of binaries 2/ list of binaries or lists of binaries 3/ list of binaries, lists or binaries or lists of (binaries or lists of binaries) 😂. So 1-, 2-, 3- levels of nesting resp.

@ayrat555 ayrat555 merged commit c5a87e2 into mana-ethereum:master Feb 5, 2020
@pdobacz pdobacz deleted the decoding-must-be-a-bijection branch February 5, 2020 13:29
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants