Homogenise types to int64 #36

JustinDrake · 2018-10-03T20:33:01Z

See discussion here #34 (comment)

djrtwo · 2018-10-03T20:44:15Z

aw man, I liked our compact int types!
I'm going to marinate on this until the morning before merging 😆

JustinDrake · 2018-10-03T20:58:51Z

Do marinate :)

Don't you think it's nice to have a definition/optimisation separation of concerns? Designers need simple, clear, homogeneous definitions. Implementers who need the speed can go ahead and optimise.

mkalinin · 2018-10-05T11:23:49Z

> Implementers who need the speed can go ahead and optimise.

Wouldn't it be convenient to make those suffixes reflect the real sizes used by SSZ encoding?
It would make the spec clearer. On the other hand, it would still allow implementers to decide on the types they use in their structures. Going further, it might make sense to replace int type with uint, as int is not supported by SSZ.

Maybe using int8 for SpecialObject.type doesn't make much sense; int32 would even be fine there. But it should be considered that changing committee type from [int24] to [int64] significantly increases size of its byte representation.

What do you think about using int32/64/128 in all cases where it doesn't significantly affect the size of the structure? And keep more precise types like int24 where size matters.
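
To put a rough number on the size concern, here is a minimal sketch. It assumes a plain fixed-width encoding with no length prefix or other framing, picks big-endian arbitrarily, and uses an illustrative committee of 1,000 indices; `encode_indices` is a hypothetical helper, not the SSZ implementation.

```python
def encode_indices(indices, byte_width):
    """Concatenate each index as a fixed-width big-endian unsigned integer."""
    return b"".join(i.to_bytes(byte_width, "big") for i in indices)


committee = list(range(1000))  # illustrative committee of 1,000 validator indices

as_int24 = encode_indices(committee, 3)  # 3 bytes per index
as_int64 = encode_indices(committee, 8)  # 8 bytes per index

print(len(as_int24), len(as_int64))  # 3000 8000
```

Under these assumptions the int64 form is 8/3 the size of the int24 form, and the gap grows linearly with the number of indices.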

JustinDrake · 2018-10-05T11:40:26Z

> it might make sense to replace int type with uint

I agree that uint makes semantic sense pretty much everywhere (a possible exception is balance where penalties could make the balance negative).

> it should be considered that changing committee type from [int24] to [int64] significantly increases size of its byte representation

Right. I'm wondering if storing shard_and_committee_for_slots in the crystallised state is overkill and should be avoided. Indeed, shard_and_committee_for_slots is redundant information once you have the required inputs to get_new_shuffling such as randao_mix. cc @djrtwo @vbuterin
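
The redundancy argument can be sketched as follows: if committees are a pure function of a seed and the validator set, anyone holding those inputs can recompute them instead of reading them from state. This is only an illustration. `shuffle_into_committees` is a hypothetical stand-in, `random.Random` is not the algorithm used by get_new_shuffling, and the seed below is a placeholder for a value such as randao_mix.

```python
import random


def shuffle_into_committees(validator_indices, seed: bytes, committee_count: int):
    """Deterministically shuffle indices and split them into committees.

    Stand-in only: random.Random is NOT the spec's shuffling algorithm.
    """
    shuffled = list(validator_indices)
    random.Random(seed).shuffle(shuffled)  # same seed -> same permutation
    return [shuffled[i::committee_count] for i in range(committee_count)]


seed = b"\x42" * 32  # placeholder for a value such as randao_mix

first = shuffle_into_committees(range(12), seed, 3)
second = shuffle_into_committees(range(12), seed, 3)

print(first == second)  # True: the result is fully determined by the inputs
```

The trade-off is that every consumer of the committees then has to run the shuffle itself rather than read a stored result.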

> What do you think about using int32/64/128 in all cases where it doesn't significantly affect the size of the structure?

Seems reasonable. Note that int128 is currently only used for balance, and this pull request makes it an int64.

JustinDrake · 2018-10-08T09:33:31Z

I've reverted the committee indices back to int24, addressing @mkalinin's remark. This should hopefully make the pull request less controversial.

djrtwo · 2018-10-09T13:44:15Z

@JustinDrake regarding storing the shuffling in state. Might we want to be able to serve the ShardAndCommittees to clients without having to force them to run the shuffling alg?

mkalinin · 2018-10-09T15:45:21Z

@JustinDrake wouldn't int64 be overkill for the shard_id, type and status fields?

arnetheduck · 2018-10-09T17:20:25Z

From an implementation point of view, I believe there's little room to optimize, should the spec spell out a concrete size for an integer that is "too large" without explicitly specifying the range of valid values.

Implementations are limited in their optimizations by how other clients are guaranteed to behave, and a more precise integer type places practical bounds on what other clients can send.

In the situation of loosely specified sizes, implementers are left with two unsavory choices:

- "Interpret" the spec and put their own, arbitrary constraints on accepted values - essentially what happens is that the "politically strongest" client gets to decide what is acceptable. A similar situation exists with gas in the EVM - everything is 256-bit, but practically, "sensible" gas limits are smaller and implementations simply ignore the problem (see "stTransactionTest/OverflowGasRequire.json // gasLimit > 256 bits", tests#227).
- Forgo optimizations

From a simplicity point of view, as soon as there is any integer of a different size, it is equivalent to there being many of them - the machinery to support multiple integer sizes must be put in place, regardless. For example, keeping everything as int64 helps mitigate alignment issues. If int24 is introduced, we will still need to deal with alignment - likewise, the existence of hash32 means that implementers need to tread with care - as such, I see no noteworthy simplicity gain for implementations from having some fields larger than they need to be.

Another point is that the encoding of fields and their constraints / specification are two orthogonal concerns - for example, if the shard_id is guaranteed to be within the range of an int16, the spec can spell that out, so as to create a safe space for implementers to optimize. What the serialization looks like can at that point be discussed separately - including concerns over uint vs int etc.
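
As a sketch of what "spelling out the range" could look like, separate from any wire format: the spec states an inclusive range, and an implementation derives whatever storage width it likes from it. `IntRange` and the `SHARD_ID` bound are hypothetical; the bound simply follows the int16 example above and is not taken from the spec.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class IntRange:
    """Inclusive range of values a spec field may take (hypothetical helper)."""
    lo: int
    hi: int

    def check(self, value: int) -> int:
        """Return value unchanged if it is in range, otherwise raise ValueError."""
        if not self.lo <= value <= self.hi:
            raise ValueError(f"{value} outside [{self.lo}, {self.hi}]")
        return value

    def min_bytes(self) -> int:
        """Smallest whole number of bytes that holds any value of a non-negative range."""
        if self.lo < 0:
            raise ValueError("this sketch only handles non-negative ranges")
        return max(1, (self.hi.bit_length() + 7) // 8)


SHARD_ID = IntRange(0, 2**15 - 1)  # assumed bound: the non-negative part of an int16

print(SHARD_ID.check(1024))   # 1024
print(SHARD_ID.min_bytes())   # 2
```

A client could store such a field in two bytes or in a native 64-bit integer; either choice is safe because the accepted values are fixed by the range and not by the type name.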

If keeping integers simple for designers in the spec is an explicit goal, an alternative is to simply call them integers, spelling out their valid ranges or minimum constraints on what values they must support, instead of bit sizes. This would have a couple of benefits:

- Implementations are free to use homogeneous or optimized types as they like (i.e. int in Python vs sized types in typed languages)
- More freedom in choosing / revisiting the serialization format as we progress with practical implementation - for example, we could investigate the use of varint encodings without requiring the rest of the spec to be updated - which has several benefits from a network and storage efficiency point of view - a key area that client implementers are in a good spot to experiment with as the spec firms up.
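
For readers unfamiliar with varints, here is a minimal sketch of one common scheme, unsigned LEB128 (7 payload bits per byte, with the high bit marking continuation). The comment above does not name a specific varint format, so this choice is an assumption made purely for illustration.

```python
def encode_varint(value: int) -> bytes:
    """Encode a non-negative integer as unsigned LEB128."""
    if value < 0:
        raise ValueError("unsigned varint cannot encode negative values")
    out = bytearray()
    while True:
        byte = value & 0x7F          # low 7 bits
        value >>= 7
        if value:
            out.append(byte | 0x80)  # more bytes follow
        else:
            out.append(byte)         # last byte: continuation bit clear
            return bytes(out)


def decode_varint(data: bytes) -> int:
    """Decode a single unsigned LEB128 integer from the start of data."""
    result = 0
    shift = 0
    for byte in data:
        result |= (byte & 0x7F) << shift
        if not byte & 0x80:
            return result
        shift += 7
    raise ValueError("truncated varint")


print(len(encode_varint(5)))      # 1
print(len(encode_varint(2**63)))  # 10
```

Small values take a single byte regardless of the field's nominal width, while values near the top of a 64-bit range take ten bytes, so the benefit depends on the actual distribution of values.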

paulhauner · 2018-10-10T00:33:29Z

To echo @mkalinin, I understand the spec still needs to be concerned about integer sizing when it comes to serialization. If this is the case, it seems confusing to call some variable a uint64 and then later declare it's a uint16 if you hash it or send it over the wire.
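
A small sketch of the mismatch, assuming fixed-width unsigned encoding: a value that is valid for the declared type can still be unrepresentable at the serialized width. `fits_unsigned` and the example value are hypothetical.

```python
def fits_unsigned(value: int, byte_width: int) -> bool:
    """Return True if value is representable as an unsigned integer of byte_width bytes."""
    return 0 <= value < 1 << (8 * byte_width)


example_value = 70_000  # hypothetical value of a field declared as uint64

print(fits_unsigned(example_value, 8))  # True: fine for the declared uint64
print(fits_unsigned(example_value, 2))  # False: cannot be written as a uint16
```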

@arnetheduck's comments around either supplying exact uint sizes or replacing them with mini-specs also make sense.

djrtwo · 2018-10-30T20:13:39Z

This seems at least contentious enough to not merge. @JustinDrake Can you bring it up on our next call for discussion if it is something you would still like to integrate?


Homogenise types to int64

0c0b7b7

See discussion here #34 (comment)

hwwhww added the general:RFC Request for Comments label Oct 5, 2018

Revert to int24 for committee indices

7ded107

djrtwo closed this Oct 30, 2018

hwwhww mentioned this pull request Dec 5, 2018

Rework helper functions - part 1 ethereum/py-evm#1521

Merged

hwwhww deleted the homogenised-types branch January 31, 2019 15:54

hwwhww pushed a commit that referenced this pull request Aug 17, 2020

Remove duplicated Natspec from the implementation (#36)

2739718

Also update binary output due to metadata change.

sauliusgrigaitis mentioned this pull request Jan 4, 2023

Proper EL hashes in from_syncing_to_invalid test #3172

Closed
