tools to export/import swing-store to/from genesis block #6527

Closed
warner opened this issue Nov 2, 2022 · 10 comments · Fixed by #8152
Labels
cosmic-swingset package: cosmic-swingset enhancement New feature or request SwingSet package: SwingSet vaults_triage DO NOT USE

Comments

@warner
Member

warner commented Nov 2, 2022

What is the Problem Being Solved?

A subset of the state-sync problem is to support emergency chain upgrades in which an old validator (which has halted) performs a "genesis block export", basically copying the contents of the IAVL tree (only the most recent blockHeight/version) into a big JSON file. People can then edit the file manually (e.g. removing validators which are known to be wedged), then start a new chain from this modified genesis data. The new chain has a different name, and starts a few blocks higher than the old one (to guard against accidental double-signs by validators who didn't stop at quite the right time). The cosmos operations for this are just agd export and agd import (or something similar).

Our chain keeps a lot of state outside the IAVL tree, so agd export won't automatically get enough state to launch a new validator.

Description of the Design

The idea would be to add a swing-store API to dump the entire state of the swing-store out as key-value pairs. Basically we'd iterate through the kvStore and emit kv-${key}: ${value} for each entry, then walk the streamStore (transcript entries) and emit stream-${vatID}-${transcriptNum}: ${transcriptEntry} for each, then walk the snapStore (XS heap snapshots) and emit base64-encoded compressed snapshot files as strings.

This iteration must be in-consensus, which means:

  • keys must be sorted in some canonical manner
  • the snapStore might have extra snapshots that aren't being used by any vats (which will be deleted soon, but there are race conditions), so maybe the tool should be aware of the kvStore entries that indicate which snapshots are used by which vats, and only emit the within-consensus used ones
  • the kvStore also contains local.* keys, which are defined to be outside consensus; these should be omitted
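
A minimal sketch of the export walk described above, under loud assumptions: plain Maps stand in for the kvStore, streamStore, and snapStore (this is not the real swing-store API), the `snapshot-` key prefix is made up for illustration, and the `${vatID}.lastSnapshot` kvStore key used to decide which snapshots are in use is a hypothetical layout.

```javascript
// Sketch only: plain Maps stand in for the real kvStore, streamStore, and snapStore.
//   kvStore:     Map<string, string>
//   transcripts: Map<vatID, string[]>        (index = transcriptNum)
//   snapshots:   Map<snapshotID, Buffer>     (compressed heap snapshot bytes)
function exportSwingStore({ kvStore, transcripts, snapshots }) {
  const entries = [];
  const usedSnapshots = new Set();

  for (const [key, value] of kvStore) {
    // local.* keys are outside consensus and are omitted from the export.
    if (key.startsWith('local.')) continue;
    entries.push([`kv-${key}`, value]);
    // Hypothetical key layout: `${vatID}.lastSnapshot` names the snapshot a vat uses.
    if (key.endsWith('.lastSnapshot')) usedSnapshots.add(value);
  }

  for (const [vatID, items] of transcripts) {
    items.forEach((transcriptEntry, transcriptNum) => {
      entries.push([`stream-${vatID}-${transcriptNum}`, transcriptEntry]);
    });
  }

  for (const [snapshotID, compressed] of snapshots) {
    // Skip snapshots that no vat references; they are not part of consensus state.
    if (!usedSnapshots.has(snapshotID)) continue;
    // `snapshot-` is an illustrative prefix, not one named in this issue.
    entries.push([`snapshot-${snapshotID}`, compressed.toString('base64')]);
  }

  // Canonical order: compare keys by UTF-16 code unit, which does not depend on locale.
  entries.sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  return entries;
}
```

The sort uses plain `<` comparison rather than `localeCompare`, since locale-dependent collation could give different orders on different validators.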

Then we'd have a second API which can take these key-value pairs and initialize a new swing-store from them.
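
The import side could look roughly like the following. This is a sketch, not the real API: it rebuilds plain Maps rather than a swing-store, it assumes the `kv-` and `stream-${vatID}-${transcriptNum}` key shapes from the description, it uses a hypothetical `snapshot-` prefix for snapshot entries, and it assumes vatIDs contain no `-` after the last separator.

```javascript
// Sketch only: rebuilds in-memory Maps from exported [key, value] pairs.
function importSwingStore(entries) {
  const kvStore = new Map();
  const transcripts = new Map(); // Map<vatID, string[]>
  const snapshots = new Map(); // Map<snapshotID, Buffer>

  for (const [name, value] of entries) {
    if (name.startsWith('kv-')) {
      kvStore.set(name.slice('kv-'.length), value);
    } else if (name.startsWith('stream-')) {
      // Split at the last '-' so the trailing number is the transcriptNum.
      const rest = name.slice('stream-'.length);
      const cut = rest.lastIndexOf('-');
      const vatID = rest.slice(0, cut);
      const transcriptNum = Number(rest.slice(cut + 1));
      if (cut < 1 || !Number.isInteger(transcriptNum)) {
        throw Error(`malformed stream entry ${name}`);
      }
      if (!transcripts.has(vatID)) transcripts.set(vatID, []);
      transcripts.get(vatID)[transcriptNum] = value;
    } else if (name.startsWith('snapshot-')) {
      // `snapshot-` is an illustrative prefix, not one named in this issue.
      snapshots.set(name.slice('snapshot-'.length), Buffer.from(value, 'base64'));
    } else {
      // Refuse unknown entries instead of silently dropping edited or corrupt data.
      throw Error(`unrecognized export entry ${name}`);
    }
  }
  return { kvStore, transcripts, snapshots };
}
```

Rejecting unrecognized keys is a deliberate choice here: a hand-edited genesis file with a typo in a key should fail loudly at import time.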

Then we modify agd export and import to use these APIs when building/consuming the genesis block.

This is like what we need for state-sync, but easier, because:

  • export only happens while the node is halted, so we aren't trying to ignore ongoing mutations
  • we can take as long as we need to generate the data; there's no block waiting for its hash
  • this is only performed in an emergency, so even less frequently than daily-ish state-sync generation
  • everybody agrees upon the same blockHeight to perform the export

Security Considerations

Genesis export is a manual consensus operation: everybody can perform their own export and compare hashes, but there's no automatic staking/evidence/slashing taking place. On the plus side, any validators who launch from a different genesis block than the majority will fall out of consensus immediately.
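
For the "compare hashes" step, each validator needs to reduce their export to a single digest. One possible way to do that is sketched below; the length-prefixed encoding and the use of SHA-256 are assumptions for illustration, not something this issue specifies.

```javascript
const { createHash } = require('node:crypto');

// Sketch only: digest a list of [key, value] string pairs in canonical key order.
function digestExport(entries) {
  // Sort a copy by UTF-16 code unit so every validator hashes the same sequence.
  const sorted = [...entries].sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
  const hash = createHash('sha256');
  for (const pair of sorted) {
    for (const field of pair) {
      // Length-prefix each field so ("ab", "c") and ("a", "bc") hash differently.
      const bytes = Buffer.from(field, 'utf8');
      hash.update(`${bytes.length}:`);
      hash.update(bytes);
    }
  }
  return hash.digest('hex');
}
```

Two validators holding the same entries in a different iteration order would get the same hex digest, and any edited value would change it.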

Test Plan

Unit tests in packages/swing-store, probably some unit tests in cosmic-swingset if we can come up with a good scheme.

cc @arirubinstein @JimLarson @michaelfig @FUDCo @mhofman

@warner warner added enhancement New feature or request SwingSet package: SwingSet labels Nov 2, 2022
@warner
Member Author

warner commented Nov 30, 2022

If our "plan A" for state-sync works out, this should be nearly trivial.

@ivanlei ivanlei added vaults_triage DO NOT USE and removed vaults-release labels Jan 4, 2023
@dckc dckc added the cosmic-swingset package: cosmic-swingset label Jan 17, 2023
@warner
Member Author

warner commented May 12, 2023

@FUDCo and I think that the existing swing-store import/export support (added for state-sync) should be sufficient. So the task is to somehow glue it to the cosmos genesis import/export code. We'll definitely need @JimLarson's help here.

The swing-store export process generates two chunks of data: the "export data" records, and the binary "artifacts" blobs. To support state-sync, we continually mirror the export-data records into IAVL, so those are already covered by genesis export. The artifact blobs are only generated on-demand, and must be requested before the swing-store has committed any data beyond the block of interest (a condition that is trivial to meet if the node has halted).

During agd export, we need the cosmos exporter to make a swing-store exporter and request the same artifacts that would be included in a state-sync. These artifacts should be base64-encoded and included in the genesis record (maybe as swingset.artifact.${name} keys), in a canonically-sorted order.
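
As a rough illustration of the encoding step (a sketch only): the function below takes a Map of artifact name to bytes and produces sorted genesis entries. How the artifacts are obtained from the swing-store exporter is not shown, and the function name and artifact names are hypothetical; only the `swingset.artifact.${name}` key shape comes from the suggestion above.

```javascript
// Sketch only: turn artifact blobs into canonically sorted genesis entries.
//   artifacts: Map<artifactName, Buffer>
function artifactsToGenesisEntries(artifacts) {
  return [...artifacts]
    .map(([name, data]) => [`swingset.artifact.${name}`, data.toString('base64')])
    // Compare keys by UTF-16 code unit so the order is the same on every machine.
    .sort(([a], [b]) => (a < b ? -1 : a > b ? 1 : 0));
}

// Usage with made-up artifact names:
const genesisEntries = artifactsToGenesisEntries(
  new Map([
    ['transcript.v1.0.2', Buffer.from('span-bytes')],
    ['snapshot.v1.2', Buffer.from('heap-bytes')],
  ]),
);
console.log(genesisEntries.map(([key]) => key));
```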

During agd import, we need to store the decoded artifacts into a temporary directory, populate the IAVL tree, create a swing-store importer, find the subset of the IAVL records which are swing-store export data records, feed those records into the importer, then feed all the artifact blobs into the importer. At the end of this process, the swing-store importer should declare a successful import, and both IAVL and swingstore should be ready to resume into a running system that behaves just like the original.
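
The JS half of that import sequence might be sketched as follows. Everything named here is an assumption: the `importer` object and its `addExportData`/`addArtifact` methods are hypothetical stand-ins for the swing-store importer, `swingStore.` is a made-up prefix for the export-data records in IAVL, and populating IAVL itself (the cosmos side) is not shown.

```javascript
const fs = require('node:fs');
const os = require('node:os');
const path = require('node:path');

const ARTIFACT_PREFIX = 'swingset.artifact.';
const EXPORT_DATA_PREFIX = 'swingStore.'; // hypothetical prefix for export-data records

// Sketch only: `importer` is a hypothetical object, not the real swing-store importer.
function importFromGenesis(genesisEntries, importer) {
  // 1. Decode the artifacts into a temporary directory.
  const dir = fs.mkdtempSync(path.join(os.tmpdir(), 'swingstore-import-'));
  const artifactFiles = [];
  for (const [key, value] of genesisEntries) {
    if (!key.startsWith(ARTIFACT_PREFIX)) continue;
    const name = key.slice(ARTIFACT_PREFIX.length);
    // Encode the name so it cannot escape the directory via '/' or similar.
    const file = path.join(dir, encodeURIComponent(name));
    fs.writeFileSync(file, Buffer.from(value, 'base64'));
    artifactFiles.push([name, file]);
  }

  // 2. Feed the export-data records to the importer first.
  for (const [key, value] of genesisEntries) {
    if (key.startsWith(EXPORT_DATA_PREFIX)) {
      importer.addExportData(key.slice(EXPORT_DATA_PREFIX.length), value);
    }
  }

  // 3. Then feed every artifact blob.
  for (const [name, file] of artifactFiles) {
    importer.addArtifact(name, fs.readFileSync(file));
  }
  return dir; // the caller removes the directory once the import has succeeded
}
```

Feeding export data before artifacts mirrors the order given above, so an importer can know which artifacts to expect before it receives them.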

BTW #7225 is the PR where the golang code for state-sync import/export landed; it might provide some clues to how we should proceed for genesis import/export. And I think https://github.com/agoric-labs/cosmos-sdk/blob/Agoric/server/export.go is the entry point for the agd export command.

If we were sufficiently motivated, we could make the genesis export somewhat easier to edit, by breaking out the transcript spans into separate entries, each in their own section of the export JSON. If we did this, we'd want to remove the span hashes from the export data records, as well as the span artifact blobs. We'd need to change the importer to reconstruct those artifacts (from whatever items were in the genesis export JSON, which could then be edited just like you might edit the IAVL balance or delegation data).

I think it's most important to be able to do a genesis export-edit-import that edits IAVL keys. Second in importance is editing swingstore kvStore keys (removing something from the run-queue, maybe resurrecting a dead vat, manually adding a refcount to some object to prevent it from being GCed, editing a c-list). Being able to edit transcript entries is at least third on the list, if not lower.

@michaelfig
Member

I don't like the idea of trying to shoehorn the agd export into a single file of JSON.

Why not have entirely new agd export and agd import commands that are implemented entirely in JS, and manipulate an "archive" instead? Then, the genesis.json from the original agd export would just be one entry in that archive. You could agd import the archive (an actual CLI command, not implicit as part of agd start which would just call some JS code) and it would explode out all the artifacts to disk to .agoric/config/genesis.json and .agoric/data/agoric/..., including rebuilding any indexed caches we have (like the sqlite3 database). That's where we could consistency-check things for safety.

Then the next agd start doesn't need to be aware of anything except for its normal job of creating IAVL from genesis.json.

@warner
Member Author

warner commented May 13, 2023

Huh, I didn't realize that was an option. I figured a single file of JSON was necessary so there'd be something canonical and easy to hash, so everybody could discuss the edits and agree on the new contents.

We could dump the SQL to its own file, but ensuring that it's in a canonical order (and only contains in-consensus fields, excluding things like local no-consensus stats) would be required to get a consistent hash, and we're already doing most of that work for the state-sync export machinery (and the kvStore contents are already in IAVL, also for that support).

I definitely agree that agd export should be an explicit command, and agd start should not use it. My preference, of course, would be an explicit agd init to make the initial state, such that you must either do agd init or agd export, and then agd start would refuse to run unless the DB was already in place. But that became moot once we launched the chain: we're never going to run an agd init again, no point in inventing it now.

@FUDCo
Contributor

FUDCo commented May 13, 2023

If everything can be done from the JS side, it would be extremely easy to write a standalone utility that would pull the data out using the export API and write it to one or more files.

@mhofman
Member

mhofman commented May 13, 2023

The cosmic-swingset import / export JS files are already CLI apps that can do just that. No need to implement anything new. At worst you can have the agd app forward to them.

@warner
Member Author

warner commented Jun 6, 2023

@mhofman is going to write up the recipe for validators / emergency state editors:

  1. commands to run to export everything into a set of files
  2. describe what kinds of edits are safe to make (i.e. the keys that contain hashes of swingstore data must match the swingstore data itself)
  3. files whose hashes need to be published for comparison with others
  4. files which need to be copied to the new machine
  5. hash checks which must be performed to ensure all the files you received match the social consensus
  6. commands to run to import the files into genesis/IAVL and a new swingstore

The process should somewhat resemble the original plain-cosmos-chain gaiad export ; edit ; gaiad import sequence, even though it will involve more files and more manual hash checks.
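
The hash checks in steps 2 and 5 could be automated along these lines. This is a sketch under assumptions: the expected hashes are taken to be hex SHA-256 digests keyed by artifact name, which is a hypothetical layout, and the function name is made up.

```javascript
const { createHash } = require('node:crypto');

// Sketch only: report every artifact whose bytes disagree with its recorded hash.
//   expectedHashes: Map<artifactName, hex sha256>  (hypothetical record layout)
//   artifacts:      Map<artifactName, Buffer>
function findHashMismatches(expectedHashes, artifacts) {
  const problems = [];
  for (const [name, expected] of expectedHashes) {
    const data = artifacts.get(name);
    if (data === undefined) {
      problems.push(`${name}: missing artifact`);
      continue;
    }
    const actual = createHash('sha256').update(data).digest('hex');
    if (actual !== expected) problems.push(`${name}: hash mismatch`);
  }
  // Artifacts with no recorded hash are suspicious too.
  for (const name of artifacts.keys()) {
    if (!expectedHashes.has(name)) problems.push(`${name}: unexpected artifact`);
  }
  return problems;
}
```

An empty result means the files and the hash records agree; anything else should stop the import before a new swingstore is built.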

@mhofman
Member

mhofman commented Jun 6, 2023

As discussed in the kernel meeting and detailed above, I will write a script / recipe that allows doing an export and import based off the tools that currently exist in the repo/vault release. In particular the steps would likely be:

I believe one suggestion for keeping the chain id unchanged was to change the block height in the genesis file. We will likely need to also patch the swingstore export manifest to keep the block height in sync. In general I don't know much about genesis export, so testing this will be interesting, and I'll need help.

@michaelfig
Member

> change the block height in the genesis file

In my experience, agd export does the right thing wrt the block height. I would just try that first and only look for other approaches if the attempt fails.

@mhofman
Member

mhofman commented Jun 7, 2023

> In my experience, agd export does the right thing wrt the block height

I think @arirubinstein mentioned something like having the block height a hundred blocks or so in the future compared to the export height, to avoid potential double-signing issues.

7 participants