Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Exceptions for the "first block" #101

Open
ebuchman opened this issue Dec 20, 2019 · 7 comments
Open

Exceptions for the "first block" #101

ebuchman opened this issue Dec 20, 2019 · 7 comments
Labels
domain-types Anything relating to the creation, modification or removal of domain types needs-dod Needs a "definition of done" for clarity structure High level repo-wide structural concerns

Comments

@ebuchman
Copy link
Member

This first block in Tendermint is unique in that all of the last_xxx fields are actually empty, while in normal operation they are required to not be empty. On the other hand, some fields can be empty in normal operation (eg. app_hash, data_hash, evidence_hash). Recent PRs like #66 have been dealing with this by making all fields that can be empty Options, but it may be clearer if there is actually an explicit FirstBlock type that doesn't have the empty fields and and we reserve Option only for the fields that can be empty in normal operation. See the discussion in #66.

Another possibility is to clean this up by making a chance in the underlying tendermint protocol, see tendermint/tendermint#4241

@tarcieri
Copy link
Contributor

Sorry for waffling, but given the set of tradeoffs here, I'm now wondering if Hash::Null might've actually been a good idea. I felt a little dirty introducing it in the first place, but after looking at the Option-based alternatives empirically, it seems like it might actually be a fairly good fit for the protocol as it exists today.

@ebuchman
Copy link
Member Author

ebuchman commented Dec 21, 2019

Yes, except we might also need something for BlockID and Commit, which aren't already enums ... I guess those could be Option still

@ebuchman
Copy link
Member Author

@tarcieri @yihuang @liamsi what do we think about making block an enum with a FirstBlock variant that just doesn't have all the fields that would be empty? Would that encode properly? Of course it means every time we handle a block we have to account for both variants, but that might actually be net positive since it's already the case that we'll have to special logic the first block, so this would just reflect it in the type system.

Then we'd only have to consider a few remaining fields that can be empty in normal operation, and for those we either use Option like we do now or bring back the Hash::Null variant. @yihuang mentioned the downside of Hash::Null being that we have to consider the null variant every time we use Hash, but I'm not sure I understand, since we're mostly just checking for equality or trying to get the bytes, and we can implement that functionality once for both variants and then never have to worry about them again, unless I'm missing something ...

We might also be able to make Hash::Null go away if we can get Tendermint to use the hash of empty values instead of empty hashes for instance when there's no txs or evidence. Then we'd always have full hashes (except the first block).

@ebuchman
Copy link
Member Author

ebuchman commented Dec 31, 2019

I'm leaning in favour of being explicit with a FirstBlock variant but not using Hash::Null.

FirstBlock is a meaningful concept - It has no precursor block and is derived from genesis. It makes descriptions of normal blocks cleaner since fields that are absent from FirstBlock are required in NormalBlock, so we don't need to worry about eg. parse_non_empty_block_id. I think we could also get rid of parse_non_empty_commit and maybe more, and it would simplify #98. And there's a bunch of special case logic in Tendermint in Go for handling the first block that would then be more explicit. The downside is that any API returning blocks would need to expose this, but given the 1st block often requires special case logic anyways, that might be ok.

An empty hash is also a meaningful concept in tendermint, since it indicates the corresponding set is empty, and having empty hashes is probably nicer than using the hash of the empty set since it saves space. However, if we use Hash::Null instead of Option, then we lose the ability to denote via types which hash fields are actually required and which can be empty. While we might want Hash::Null for other purposes, the Hash type we use in eg. header fields should signal if the field is actually required or not.

This was referenced Mar 29, 2020
@ebuchman ebuchman added the structure High level repo-wide structural concerns label Jun 3, 2020
@thanethomson thanethomson added the domain-types Anything relating to the creation, modification or removal of domain types label Mar 22, 2021
@thanethomson
Copy link
Contributor

So concretely, towards trying to figure out a "definition of done" for this issue, would this imply we should be refactoring Block to become an enum of the form:

pub enum Block {
    Initial {
        // ... fields exclusively relevant to the first block
    },
    NonInitial {
        // ... fields relevant to blocks with heights above the first block
    }
}

I was looking at this link for examples of terms we could use to refer to the two "categories" of blocks. "Normal" doesn't quite resonate, as a first block is also technically quite normal in a blockchain 😁

@xla
Copy link
Contributor

xla commented Mar 22, 2021

One thing we should account for is that there is shared shape between the two variants, which incurs overhead when shared fields change/evolve. It could be elegant to actually encode if a block is associated/chained to another block, has a LastBlock. Is there a good overview of all the fields in both variants, have it comparable might help distill more of the distinction in shapes.

@thanethomson thanethomson added the needs-dod Needs a "definition of done" for clarity label Mar 22, 2021
@tony-iqlusion
Copy link
Collaborator

tony-iqlusion commented Mar 22, 2021

One way you could handle the shared shape is via a trait.

Putting aside what to call the enum variants and using @thanethomson's names for now, if you had:

pub struct Initial { ... }
pub struct NonInitial { ... }

pub enum Block {
    Initial(Initial),
    NonInitial(NonInitial)
}

you could (again, sidestepping what specifically to name this) define a trait and add impls as follows:

pub trait SharedFields {
    fn shared_field1(&self) -> X;
    fn shared_field2(&self) -> Y;
    fn shared_field3(&self) -> Z;
}

impl SharedFields for Initial { ... }

impl SharedFields for NonInitial { ... }

impl Block {
    fn shared_fields(&self) -> &dyn SharedFields { // may be able to use `impl Trait` here and monomorphize?
        match self {
            Self::Initial(block) => block, // haven't tested this but I think the compiler can infer
            Self::NonInitial(block) => block
        }
    }
}

impl SharedFields for Block {
    fn shared_field1(&self) -> X {
        self.shared_fields().shared_field1()
    }

    fn shared_field1(&self) -> Y {
        self.shared_fields().shared_field2()
    }

   ...
}

That's quite a bit of boilerplate that may perhaps be more succinctly represented by just having a struct that contains the shared fields and enum variants the discrepancies. But that said, it does have the benefit that the SharedFields trait would work for all three types, and functions that need only those fields can accept fields: impl SharedFields as an argument.


Another approach would be to make an enum "view" of the existing struct which borrows its values. This would allow you to coalesce the Option-al fields without forcing you to go through an enum if you're only concerned with shared fields that always exist.

Here's an example of this idea from some of my code:

https://docs.rs/elliptic-curve/0.9.6/elliptic_curve/sec1/enum.Coordinates.html

pub enum Coordinates<'a, C: Curve> {
    Identity,
    Compressed {
        x: &'a FieldBytes<C>,
        y_is_odd: bool,
    },
    Uncompressed {
        x: &'a FieldBytes<C>,
        y: &'a FieldBytes<C>,
    },
}

This enum is constructed on request and borrows its contents from another struct. There are corresponding Option-based accessors for each of these fields, but when you use the borrowed enum form, you can then just match on the variant you want and avoid dealing with Option-based access.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
domain-types Anything relating to the creation, modification or removal of domain types needs-dod Needs a "definition of done" for clarity structure High level repo-wide structural concerns
Projects
None yet
Development

No branches or pull requests

5 participants