This repository has been archived by the owner on Jan 11, 2022. It is now read-only.

Bug. Decoding 0x80 / Encoding and decoding of 0 #28

Closed
hamdiallam opened this issue Jul 26, 2018 · 8 comments

Comments

@hamdiallam

hamdiallam commented Jul 26, 2018

The rlp encoding of 0 is 0x80.

Decoding 0x80 should return 0x0. The source returns an empty buffer, which is incorrect.

rlp.decode(rlp.encode(0)) returns Buffer.from([]) instead of 0.

@hamdiallam
Author

It seems like in the tests, 0 is treated as a null value. This goes against the spec from what I understand.

Can I submit a PR fixing this?

@axic
Member

axic commented Jul 28, 2018

Yes please! Can you also give a reference to the relevant part of the spec (Yellow Paper) to ease the reviewers' job?

@hamdiallam
Author

Awesome, will do.

@holgerd77
Member

Just as a note for reference: there are also 0-related issues in ethereumjs-tx (https://github.com/ethereumjs/ethereumjs-tx/issues/112), which in my view are urgent to address and which trickle down to ethereumjs-util (https://github.com/ethereumjs/ethereumjs-util/issues/141; this is not the direct issue but one that eventually tackles the problem, and one has to look into the code to see the connection).

These have to be addressed and might be connected to the issue here.

@holgerd77
Member

(As an on-top note: the issues in ethereumjs-tx seem so grave to me that I am totally puzzled why they haven't been discovered before.)

@sc0Vu

sc0Vu commented Mar 11, 2019

Found wiki here: https://github.com/ethereum/wiki/wiki/RLP
After trying it, 0x80 decodes to an empty buffer. Should it be up to application logic to judge the type (empty string or number 0)?

@gzm55

gzm55 commented Mar 11, 2021

When using RLP, we need a two-level ser/de protocol:

  • high-level objects <---> RLP objects (byte string or nested list)
  • RLP objects <---> byte string

The RLP spec mostly describes the latter, leaving it to the high-level protocol to decide how to convert high-level objects to RLP objects. The exception is how an unsigned integer, as a high-level object, should be converted to an RLP object, since this is used to encode the list length. According to the implementation, the unsigned integer is converted to a byte string by removing all leading zero bytes from its big-endian representation. So:

| hl objects | rlp objects | byte string |
| --- | --- | --- |
| integer 0 | '' (a zero-length byte string) | 0x80 (using rule 2) |
| integer 1 | '\x01' | 0x01 (using rule 1) |
| byte list [ ] | '' (a zero-length byte string) | 0x80 (using rule 2) |
| byte list [ 0x00 ] | '\x00' | 0x00 (using rule 1) |
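
The two levels in the table above can be sketched in a few lines of Node.js. This is only an illustration, not the code of the rlp package: the helper names (`intToBytes`, `bytesToInt`, `encodeShortString`, `decodeShortString`) are hypothetical, and the sketch assumes byte strings of at most 55 bytes and does not reject non-canonical encodings.

```javascript
// Hypothetical helpers for illustration; these names are not part of the rlp package.

// High level -> RLP object: unsigned integer to big-endian bytes without leading zeros.
function intToBytes(n) {
  if (!Number.isSafeInteger(n) || n < 0) {
    throw new RangeError('expected a non-negative safe integer');
  }
  const bytes = [];
  while (n > 0) {
    bytes.unshift(n % 256);
    n = Math.floor(n / 256);
  }
  return Buffer.from(bytes); // integer 0 becomes the empty byte string
}

// RLP object -> high level: the inverse, chosen by the application.
function bytesToInt(buf) {
  let n = 0;
  for (const b of buf) n = n * 256 + b;
  return n; // the empty byte string becomes 0
}

// RLP object -> byte string, for byte strings of at most 55 bytes only.
function encodeShortString(buf) {
  if (buf.length === 1 && buf[0] < 0x80) return Buffer.from(buf); // rule 1
  if (buf.length > 55) throw new RangeError('sketch handles at most 55 bytes');
  return Buffer.concat([Buffer.from([0x80 + buf.length]), buf]); // rule 2
}

// Byte string -> RLP object, for the same restricted range.
function decodeShortString(enc) {
  if (enc.length === 1 && enc[0] < 0x80) return Buffer.from(enc); // rule 1
  const len = enc[0] - 0x80;
  if (len < 0 || len > 55 || enc.length !== len + 1) {
    throw new RangeError('not a short byte string');
  }
  return enc.subarray(1); // rule 2: strip the length prefix
}
```

With these helpers, `decodeShortString(Buffer.from([0x80]))` yields an empty buffer, and only the explicit `bytesToInt` step turns it into the number 0. That split is the point of the table: the RLP layer never sees the integer.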

When doing RLP decoding, the output should be in the scope of RLP objects, so Buffer.from([]) seems reasonable. The high-level decoding is left to the protocol implementation.

The real problem, imo, is that RLP and all its implementations do not distinguish null from common default values (0, empty list). Outside the Ethereum context, the missing null can cause some minor problems.

In our own RLP implementation, each high-level object type is defined as either a byte-string type (integer, string, float, etc.) or a nest-list type (list, map, struct), and null is encoded as follows:

| hl objects | rlp objects | byte string |
| --- | --- | --- |
| null (a byte-string type) | [] (an empty nest list) | 0xC0 |
| null (a nest-list type) | '' (an empty byte string) | 0x80 |
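
A minimal sketch of this null convention, assuming the expected kind of each field is known from the schema. The names `encodeNull` and `isEncodedNull` and the `'bytes'` / `'list'` labels are hypothetical and only illustrate the table above; they are not taken from any published implementation.

```javascript
// Hypothetical helpers illustrating the null convention from the table above.
const EMPTY_STRING = 0x80; // RLP encoding of '' (empty byte string)
const EMPTY_LIST = 0xc0;   // RLP encoding of [] (empty list)

// kind is 'bytes' for byte-string types and 'list' for nest-list types.
// Null borrows the empty value of the *other* kind, so it cannot collide
// with the type's own default (0 / '' for bytes, [] for lists).
function encodeNull(kind) {
  if (kind === 'bytes') return Buffer.from([EMPTY_LIST]);
  if (kind === 'list') return Buffer.from([EMPTY_STRING]);
  throw new TypeError(`unknown kind: ${kind}`);
}

// True when enc is the null marker for a field of the given kind.
function isEncodedNull(kind, enc) {
  return enc.length === 1 && enc[0] === encodeNull(kind)[0];
}
```

Under this scheme, a byte-string field holding 0x80 still decodes to the default (integer 0 or the empty string), while 0xC0 in the same position is read as null.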

@ryanio
Contributor

ryanio commented Jan 3, 2022

Copying my comment from #32:

Since https://eth.wiki/en/fundamentals/rlp now has these examples I will consider this conversation resolved:

The integer 0 = [ 0x80 ]
The encoded integer 0 ('\x00') = [ 0x00 ]

Thanks to everyone for participating


6 participants