Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd,internal/era: implement export-history subcommand #26621

Merged
merged 28 commits into from
Feb 7, 2024
Merged
Show file tree
Hide file tree
Changes from 27 commits
Commits
Show all changes
28 commits
Select commit Hold shift + click to select a range
d017030
all: implement era format, add history importer/export
lightclient Feb 3, 2023
f3ec256
internal/era/e2store: refactor e2store to provide ReadAt interface
lightclient May 19, 2023
a9304e8
internal/era/e2store: export HeaderSize
lightclient May 19, 2023
545f0b1
internal/era: refactor era to use ReadAt interface
lightclient May 19, 2023
d58c913
internal/era: elevate anonymous func to named
lightclient May 22, 2023
76e08e0
cmd/utils: don't store entire era file in-memory during import / export
lightclient May 22, 2023
d6cc601
internal/era: better abstraction between era and e2store
lightclient May 23, 2023
6c4747f
cmd/era: properly close era files
lightclient May 23, 2023
97b86f4
cmd/era: don't let defers stack
lightclient May 23, 2023
b65db9d
cmd/geth: add description for import-history
lightclient May 23, 2023
ca3d8bf
cmd/utils: better bytes buffer
lightclient May 23, 2023
cccf47b
internal/era: error if accumulator has more records than max allowed
lightclient May 23, 2023
7b9bda2
internal/era: better doc comment
lightclient May 23, 2023
de45fca
internal/era/e2store: rm superfluous reader, rm superfluous testcases…
holiman May 29, 2023
235d413
internal/era: avoid some repetition
holiman May 29, 2023
1000755
internal/era: simplify clauses
holiman May 29, 2023
008ed82
internal/era: unexport things
holiman May 29, 2023
3fa17e7
internal/era,cmd/utils,cmd/era: change to iterator interface for read…
lightclient Jun 2, 2023
02b3f19
cmd/utils: better defer handling in history test
lightclient Jun 2, 2023
b98b6d6
internal/era,cmd: add number method to era iterator to get the curren…
lightclient Jun 5, 2023
af6837f
internal/era/e2store: avoid double allocation during write
lightclient Jun 5, 2023
64513d7
internal/era,cmd/utils: fix lint issues
lightclient Jun 5, 2023
b8fc70e
internal/era: add ReaderAt func so entry value can be read lazily
lightclient Jun 5, 2023
67defc4
internal/era: improve iterator interface
lightclient Jun 6, 2023
e9cb207
internal/era: fix rlp decode of header and correctly read total diffi…
lightclient Jun 8, 2023
2a4cbda
cmd/era: fix rebase errors
lightclient Jan 29, 2024
0fa8947
cmd/era: clearer comments
lightclient Jan 30, 2024
beb6505
cmd,internal: fix comment typos
lightclient Feb 1, 2024
File filter

Filter by extension

Filter by extension


Conversations
Failed to load comments.
Loading
Jump to
Jump to file
Failed to load files.
Loading
Diff view
Diff view
324 changes: 324 additions & 0 deletions cmd/era/main.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,324 @@
// Copyright 2023 The go-ethereum Authors
// This file is part of go-ethereum.
//
// go-ethereum is free software: you can redistribute it and/or modify
// it under the terms of the GNU General Public License as published by
// the Free Software Foundation, either version 3 of the License, or
// (at your option) any later version.
//
// go-ethereum is distributed in the hope that it will be useful,
// but WITHOUT ANY WARRANTY; without even the implied warranty of
// MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
// GNU General Public License for more details.
//
// You should have received a copy of the GNU General Public License
// along with go-ethereum. If not, see <http://www.gnu.org/licenses/>.

package main

import (
"encoding/json"
"fmt"
"math/big"
"os"
"path"
"strconv"
"strings"
"time"

"github.com/ethereum/go-ethereum/common"
"github.com/ethereum/go-ethereum/core/types"
"github.com/ethereum/go-ethereum/internal/era"
"github.com/ethereum/go-ethereum/internal/ethapi"
"github.com/ethereum/go-ethereum/internal/flags"
"github.com/ethereum/go-ethereum/params"
"github.com/ethereum/go-ethereum/trie"
"github.com/urfave/cli/v2"
)

var app = flags.NewApp("go-ethereum era tool")

var (
dirFlag = &cli.StringFlag{
Name: "dir",
Usage: "directory storing all relevant era1 files",
Value: "eras",
}
networkFlag = &cli.StringFlag{
Name: "network",
Usage: "network name associated with era1 files",
Value: "mainnet",
}
eraSizeFlag = &cli.IntFlag{
Name: "size",
Usage: "number of blocks per era",
Value: era.MaxEra1Size,
}
txsFlag = &cli.BoolFlag{
Name: "txs",
Usage: "print full transaction values",
}
)

var (
blockCommand = &cli.Command{
Name: "block",
Usage: "get block data",
ArgsUsage: "<number>",
Action: block,
Flags: []cli.Flag{
txsFlag,
},
}
infoCommand = &cli.Command{
Name: "info",
ArgsUsage: "<epoch>",
Usage: "get epoch information",
Action: info,
}
verifyCommand = &cli.Command{
Name: "verify",
ArgsUsage: "<expected>",
Usage: "verifies each era1 against expected accumulator root",
Action: verify,
}
)

func init() {
app.Commands = []*cli.Command{
blockCommand,
infoCommand,
verifyCommand,
}
app.Flags = []cli.Flag{
dirFlag,
networkFlag,
eraSizeFlag,
}
}

func main() {
if err := app.Run(os.Args); err != nil {
fmt.Fprintf(os.Stderr, "%v\n", err)
os.Exit(1)
}
}

// block prints the specified block from an era1 store.
func block(ctx *cli.Context) error {
num, err := strconv.ParseUint(ctx.Args().First(), 10, 64)
if err != nil {
return fmt.Errorf("invalid block number: %w", err)
}
e, err := open(ctx, num/uint64(ctx.Int(eraSizeFlag.Name)))
if err != nil {
return fmt.Errorf("error opening era1: %w", err)
}
defer e.Close()
// Read block with number.
block, err := e.GetBlockByNumber(num)
if err != nil {
return fmt.Errorf("error reading block %d: %w", num, err)
}
// Convert block to JSON and print.
val := ethapi.RPCMarshalBlock(block, ctx.Bool(txsFlag.Name), ctx.Bool(txsFlag.Name), params.MainnetChainConfig)
b, err := json.MarshalIndent(val, "", " ")
if err != nil {
return fmt.Errorf("error marshaling json: %w", err)
}
fmt.Println(string(b))
return nil
}

// info prints some high-level information about the era1 file.
func info(ctx *cli.Context) error {
epoch, err := strconv.ParseUint(ctx.Args().First(), 10, 64)
if err != nil {
return fmt.Errorf("invalid epoch number: %w", err)
}
e, err := open(ctx, epoch)
if err != nil {
return err
}
defer e.Close()
acc, err := e.Accumulator()
if err != nil {
return fmt.Errorf("error reading accumulator: %w", err)
}
td, err := e.InitialTD()
if err != nil {
return fmt.Errorf("error reading total difficulty: %w", err)
}
info := struct {
Accumulator common.Hash `json:"accumulator"`
TotalDifficulty *big.Int `json:"totalDifficulty"`
StartBlock uint64 `json:"startBlock"`
Count uint64 `json:"count"`
}{
acc, td, e.Start(), e.Count(),
}
b, _ := json.MarshalIndent(info, "", " ")
fmt.Println(string(b))
return nil
}

// open opens an era1 file at a certain epoch.
func open(ctx *cli.Context, epoch uint64) (*era.Era, error) {
var (
dir = ctx.String(dirFlag.Name)
network = ctx.String(networkFlag.Name)
)
entries, err := era.ReadDir(dir, network)
if err != nil {
return nil, fmt.Errorf("error reading era dir: %w", err)
}
if epoch >= uint64(len(entries)) {
return nil, fmt.Errorf("epoch out-of-bounds: last %d, want %d", len(entries)-1, epoch)
}
return era.Open(path.Join(dir, entries[epoch]))
}

// verify checks each era1 file in a directory to ensure it is well-formed and
// that the accumulator matches the expected value.
func verify(ctx *cli.Context) error {
if ctx.Args().Len() != 1 {
return fmt.Errorf("missing accumulators file")
}

roots, err := readHashes(ctx.Args().First())
if err != nil {
return fmt.Errorf("unable to read expected roots file: %w", err)
}

var (
dir = ctx.String(dirFlag.Name)
network = ctx.String(networkFlag.Name)
start = time.Now()
reported = time.Now()
)

entries, err := era.ReadDir(dir, network)
if err != nil {
return fmt.Errorf("error reading %s: %w", dir, err)
}

if len(entries) != len(roots) {
return fmt.Errorf("number of era1 files should match the number of accumulator hashes")
}

// Verify each epoch matches the expected root.
for i, want := range roots {
// Wrap in function so defers don't stack.
err := func() error {
name := entries[i]
e, err := era.Open(path.Join(dir, name))
if err != nil {
return fmt.Errorf("error opening era1 file %s: %w", name, err)
}
defer e.Close()
// Read accumulator and check against expected.
if got, err := e.Accumulator(); err != nil {
return fmt.Errorf("error retrieving accumulator for %s: %w", name, err)
} else if got != want {
return fmt.Errorf("invalid root %s: got %s, want %s", name, got, want)
}
// Recompute accumulator.
if err := checkAccumulator(e); err != nil {
return fmt.Errorf("error verify era1 file %s: %w", name, err)
}
// Give the user some feedback that something is happening.
if time.Since(reported) >= 8*time.Second {
fmt.Printf("Verifying Era1 files \t\t verified=%d,\t elapsed=%s\n", i, common.PrettyDuration(time.Since(start)))
reported = time.Now()
}
return nil
}()
if err != nil {
return err
}
}

return nil
}

// checkAccumulator verifies the accumulator matches the data in the Era.
func checkAccumulator(e *era.Era) error {
var (
err error
want common.Hash
td *big.Int
tds = make([]*big.Int, 0)
hashes = make([]common.Hash, 0)
)
if want, err = e.Accumulator(); err != nil {
return fmt.Errorf("error reading accumulator: %w", err)
}
if td, err = e.InitialTD(); err != nil {
return fmt.Errorf("error reading total difficulty: %w", err)
}
it, err := era.NewIterator(e)
if err != nil {
return fmt.Errorf("error making era iterator: %w", err)
}
// To fully verify an era the following attributes must be checked:
// 1) the block index is constructed correctly
// 2) the tx root matches the value in the block
// 3) the receipts root matches the value in the block
// 4) the starting total difficulty value is correct
// 5) the accumulator is correct by recomputing it locally, which verifies
// the blocks are all correct (via hash)
//
// The attributes 1), 2), and 3) are checked for each block. 4) and 5) require
// accumulation accross the entire set and are verified at the end.
lightclient marked this conversation as resolved.
Show resolved Hide resolved
for it.Next() {
// 1) next() walks the block index, so we're able to implicitly verify it.
if it.Error() != nil {
return fmt.Errorf("error reading block %d: %w", it.Number(), err)
}
block, receipts, err := it.BlockAndReceipts()
if it.Error() != nil {
return fmt.Errorf("error reading block %d: %w", it.Number(), err)
}
// 2) recompute tx root and verify against header.
tr := types.DeriveSha(block.Transactions(), trie.NewStackTrie(nil))
if tr != block.TxHash() {
return fmt.Errorf("tx root in block %d mismatch: want %s, got %s", block.NumberU64(), block.TxHash(), tr)
}
// 3) recompute receipt root and check value against block.
rr := types.DeriveSha(receipts, trie.NewStackTrie(nil))
if rr != block.ReceiptHash() {
return fmt.Errorf("receipt root in block %d mismatch: want %s, got %s", block.NumberU64(), block.ReceiptHash(), rr)
}
hashes = append(hashes, block.Hash())
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you want to verify the withdrawals root in the future as well?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The era1 format is not intended to ever support withdrawals. We'll want to use the standard era format currently exported by nimbus.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so if I want to restore an archive node using these files in a post-4444 world, would I use .era1 files up to the merge and then .era files for merge up some post-weak subjectivity period checkpoint? Would this all be done on the EL, or would I import that .era files in my CL?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It hasn't really been determined how the post-merge era files will be ingested. Two obvious paths:

  1. CL imports era files and pushes payloads directly to EL.
  2. CL imports beacon chain data from era files and EL imports execution chain data from same era files.

Obvious 1) seems preferred from a UX standpoint, but not sure it will be performant enough. Plus, some clients may opt to use era in-place as an ancient store. In which case you could just point both clients to the directory containing the era files.

td.Add(td, block.Difficulty())
tds = append(tds, new(big.Int).Set(td))
}
// 4+5) Verify accumulator and total difficulty.
got, err := era.ComputeAccumulator(hashes, tds)
if err != nil {
return fmt.Errorf("error computing accumulator: %w", err)
}
if got != want {
return fmt.Errorf("expected accumulator root does not match calculated: got %s, want %s", got, want)
}
return nil
}

// readHashes reads a file of newline-delimited hashes.
func readHashes(f string) ([]common.Hash, error) {
b, err := os.ReadFile(f)
if err != nil {
return nil, fmt.Errorf("unable to open accumulators file")
}
s := strings.Split(string(b), "\n")
// Remove empty last element, if present.
if s[len(s)-1] == "" {
s = s[:len(s)-1]
}
// Convert to hashes.
r := make([]common.Hash, len(s))
for i := range s {
r[i] = common.HexToHash(s[i])
}
return r, nil
}
Loading
Loading