fix(swing-store): add 'replay' artifactMode, make export more strict
Previously, `makeSwingStoreExporter()` took a positional argument
named `exportMode`, with values of 'current', 'archival', or
'debug'. This controlled how many artifacts were included in the
export, on a best-effort basis (e.g. a DB whose old spans were pruned
would emit the same artifacts with either 'current' or 'archival').

`importSwingStore()` took an options bag with both the
`makeSwingStore` options (like `keepSnapshots` and `keepTranscripts`),
and an import-specific `includeHistorical` boolean, which controlled
which artifacts were processed by the import. This was also on a
best-effort basis: `includeHistorical: true` on an export dataset that
lacked old spans would produce the same (pruned) DB as `false`.

This commit changes both APIs to take an options bag with a common
`artifactMode` option, with values of `operational`, `replay`,
`archival`, or `debug`. The `operational` choice replaces `current`
and behaves the same way: just enough data for normal operations. The
new `replay` choice sits between 'operational' and 'archival', and selects all
transcript spans for the current incarnation of each vat, but omits
transcript spans for old incarnations: enough to perform a full
vat-replay of the latest incarnation.
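
As a rough illustration of how the four modes nest, here is a minimal sketch. It is a hypothetical model, not swing-store code: `selectedKinds` and the kind labels are invented for this example, and bundles (which every mode includes) are left out.

```javascript
// Hypothetical model of the artifactMode ordering; not the swing-store code.
const artifactModes = ['operational', 'replay', 'archival', 'debug'];

// Return illustrative labels for the kinds of artifact a mode selects.
// Each mode selects everything the previous mode does, plus one more kind.
function selectedKinds(artifactMode) {
  const rank = artifactModes.indexOf(artifactMode);
  if (rank < 0) {
    throw Error(`invalid artifactMode ${artifactMode}`);
  }
  const kinds = ['current span', 'current snapshot'];
  if (rank >= 1) kinds.push('old spans of current incarnation');
  if (rank >= 2) kinds.push('spans of old incarnations');
  if (rank >= 3) kinds.push('old snapshots');
  return kinds;
}
```

Under this model, `selectedKinds('replay')` contains everything in `selectedKinds('operational')` plus the older spans of the current incarnation.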

Note: `makeSwingStoreExporter` was changed from a positional argument
to an options bag, and no attempt was made to be compatible with
old-style callers.
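
Because old-style callers are not supported, call sites need updating. The following is a minimal sketch of one way to translate the old positional value into the new options bag; `toExporterOptions` is a hypothetical helper written for this example and is not part of swing-store.

```javascript
// Hypothetical migration helper; not part of swing-store.
// Maps the old positional `exportMode` value to the new options bag.
const renamedModes = new Map([
  ['current', 'operational'], // renamed by this commit
  ['archival', 'archival'],
  ['debug', 'debug'],
]);

function toExporterOptions(exportMode = 'current') {
  if (!renamedModes.has(exportMode)) {
    throw Error(`invalid exportMode ${exportMode}`);
  }
  return { artifactMode: renamedModes.get(exportMode) };
}

// Old call: makeSwingStoreExporter(dirPath, 'archival')
// New call: makeSwingStoreExporter(dirPath, toExporterOptions('archival'))
```

Note that a mechanical translation keeps the mode name but not the old best-effort behavior: an `archival` export of a pruned DB now throws instead of emitting fewer artifacts.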

During export, the mode is now strict: if the DB lacks the artifacts
requested by the given mode, `makeSwingStoreExporter()` will throw an
error, rather than emit fewer artifacts than desired. This means
`artifactMode: 'replay'` will fail unless the DB being exported has
all those old (current-incarnation) transcript items. And `archival`
will fail unless the DB has the old incarnation spans too. The `debug`
mode is best-effort, and emits everything available without the
additional completeness checks.
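
The strictness rule can be sketched as follows. This is a hypothetical model, not the swing-store implementation: `assertExportable` and the two `db` flags are invented for illustration, and the checks that `operational` itself performs (bundles, current spans, current snapshots) are omitted.

```javascript
// Hypothetical model of the strict export check; not the swing-store code.
// `db` describes which historical data survived pruning (illustrative fields).
function assertExportable(db, artifactMode) {
  if (artifactMode === 'debug') {
    return; // best-effort: emit whatever is available, no completeness check
  }
  const needsCurrentIncarnation =
    artifactMode === 'replay' || artifactMode === 'archival';
  if (needsCurrentIncarnation && !db.hasAllCurrentIncarnationSpans) {
    throw Error(`${artifactMode}: missing current-incarnation spans`);
  }
  if (artifactMode === 'archival' && !db.hasAllOldIncarnationSpans) {
    throw Error('archival: missing old-incarnation spans');
  }
}
```

For a DB whose old spans were pruned (both flags `false`), this model accepts `operational` and `debug` but throws for `replay` and `archival`.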

During import, the mode applies both an import filter and a
completeness check. So exporting with `archival` but importing with
`operational` will get you a pruned DB, lacking anything
historical. Exporting with `operational` and importing with `replay`
or `archival` will fail, because the newly-populated DB does not
contain any historical artifacts.
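
The interaction between export mode and import mode can be modeled roughly as below. This is a hypothetical sketch, not swing-store code: `importOutcome` is invented for this example, it covers only the three strict modes, and it assumes the export dataset holds exactly what its mode requires while the vats have history beyond that.

```javascript
// Hypothetical model of export/import mode compatibility; not swing-store code.
// Assumes the export dataset holds exactly what its (non-debug) mode requires.
const strictModes = ['operational', 'replay', 'archival'];

// Returns the completeness level of the newly imported DB, or throws if the
// import's completeness check could not be satisfied.
function importOutcome(exportMode, importMode) {
  const have = strictModes.indexOf(exportMode);
  const want = strictModes.indexOf(importMode);
  if (have < 0 || want < 0) {
    throw Error('this sketch only models operational, replay, and archival');
  }
  if (want > have) {
    throw Error(`import as ${importMode} fails: export was only ${exportMode}`);
  }
  // The import filter discards anything beyond what importMode selects.
  return importMode;
}
```

In this model, `importOutcome('archival', 'operational')` yields a pruned `operational` DB, while `importOutcome('operational', 'replay')` throws.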

closes #8105
warner committed Aug 15, 2023
1 parent 7bfadd3 commit 9939ea6
Showing 14 changed files with 441 additions and 240 deletions.
31 changes: 20 additions & 11 deletions packages/swing-store/docs/data-export.md
@@ -179,22 +179,31 @@ As a result, for each active vat, the first-stage Export Data contains a record
The `openSwingStore()` function has an option named `keepTranscripts` (which defaults to `true`), which causes the transcriptStore to retain the old transcript items. A second option named `keepSnapshots` (which defaults to `false`) causes the snapStore to retain the old heap snapshots. Opening the swingStore with a `false` option does not necessarily delete the old items immediately, but they'll probably get deleted the next time the kernel triggers a heap snapshot or transcript-span rollover. Validators who care about minimizing their disk usage will want to set both to `false`. In the future, we will arrange the SwingStore SQLite tables to provide easy `sqlite3` CLI commands that will delete the old data, so validators can also periodically use the CLI command to prune it.
When exporting, the `makeSwingStoreExporter()` function takes an `exportMode=` argument. This serves to limit the set of artifacts that will be provided in the export. The defined values of `exportMode` are:
* `current`: include only the current transcript span and current snapshot for each vat: just the minimum set necessary for current operations
* `archival`: include all available transcript spans
* `debug`: include all available transcript spans *and* all available snapshots. The old snapshots are never necessary for normal operations, nor are they likely to be usefor for extreme upgrade scenarios, but they might be useful for some unusual debugging operation
When exporting, the `makeSwingStoreExporter()` function takes an `artifactMode` option (in an options bag). This serves to both limit, and provide some minimal guarantees about, the set of artifacts that will be provided in the export. The defined values of `artifactMode` each build upon the previous one:
Note that `exportMode` does not affect the Export Data generated by the exporter (if we *ever* want to validate this optional data, the hashes are mandatory). It only affects the names returned by `getArtifactNames()`: the list will be smaller for `current` than for `archival`. Re-exporting from a pruned copy will lack the old data, even if the re-export uses `archival`, because the second SwingStore cannot magically reconstruct the missing data.
* `operational`: include only the current transcript span and current snapshot for each vat: just the minimum set necessary for current operations
* `replay`: add all transcript spans for the current incarnation
* `archival`: add all available transcript spans, even for old incarnations
* `debug`: add all available snapshots, giving you everything. The old snapshots are never necessary for normal operations, nor are they likely to be useful for extreme upgrade scenarios, but they might be useful for some unusual debugging operations or investigations
Note that when a vat is terminated, we delete all information about it, including transcript items and snapshots, both current and old. This will remove all the Export Data records, and well as the matching artifacts from `getArtifactNames`.
For each mode, the export will fail if the data necessary for those artifacts is not available (e.g. it was previously pruned). For example, an export with `artifactMode: 'replay'` will fail unless every vat has all transcript entries for each one's current incarnation. The `archival` mode will fail to export unless every vat has *every* transcript entry, back to the very first incarnation.
When importing, the `importSwingStore()` function takes an options bag, which has property named `includeHistorical`. This property defaults to `false`, which makes the importer ignore any historical artifacts present in the export dataset. To import the historical transcript spans (and snapshots), you must set it to `true`.
However the `debug` export mode will never fail: it merely dumps everything in the swingstore, without limits or completeness checks.
So, to convey historical transcript spans from one swingstore to another, you must set three options along the way:
Note that `artifactMode` does not affect the Export Data generated by the exporter (because if we *ever* want to validate this optional data, the hashes are mandatory). It only affects the names returned by `getArtifactNames()`: `operational` returns a subset of `replay`, which returns a subset of `archival`. And re-exporting from a previously-pruned copy under `archival` mode will fail, because the second SwingStore cannot magically reconstruct the missing data.
* the original swingstore must be opened with `{ includeHistorical: true }`, otherwise the old spans will be pruned immediately
* the export must use `makeSwingStoreExporter(dirpath, 'archival')`, otherwise the export will omit the old spans
* the import must use `importSwingStore(exporter, dirPath, { includeHistorical: true })`, otherwide teh import will ignore the old spans
Also note that when a vat is terminated, we delete all information about it, including transcript items and snapshots, both current and old. This will remove all the Export Data records, and well as the matching artifacts from `getArtifactNames`.
When importing, the `importSwingStore()` function's options bag takes a property named `artifactMode`, with the same meanings as for export. Importing with the `operational` mode will ignore any artifacts other than those needed for current operations, and will fail unless all such artifacts were available. Importing with `replay` will ignore spans from old incarnations, but will fail unless all spans from current incarnations are present. Importing with `archival` will fail unless all spans from all incarnations are present. There is no `debug` option during import.
`importSwingStore()` returns a swingstore, which means its options bag also contains the same options as `openSwingStore()`, including the `keepTranscripts` option. This defaults to `true`, but if it were overridden to `false`, then the new swingstore will delete transcript spans as soon as they are no longer needed for operational purposes (e.g. when `transcriptStore.rolloverSpan()` is called).
So, to avoid pruning current-incarnation historical transcript spans when exporting from one swingstore to another, you must set (or avoid overriding) the following options along the way:
* the original swingstore must not be opened with `{ keepTranscripts: false }`, otherwise the old spans will be pruned immediately
* the export must use `makeSwingStoreExporter(dirpath, { artifactMode: 'replay'})`, otherwise the export will omit the old spans
* the import must use `importSwingStore(exporter, dirPath, { artifactMode: 'replay'})`, otherwise the import will ignore the old spans
* the `importSwingStore` call (and all subsequent `openSwingStore` calls) must not use `keepTranscripts: false`, otherwise the new swingstore will prune historical spans as new ones are created (during `rolloverSpan`).
## Implementation Details
11 changes: 5 additions & 6 deletions packages/swing-store/src/assertComplete.js
@@ -1,20 +1,19 @@
/**
* @param {import('./internal.js').SwingStoreInternal} internal
* @param {'operational'} level
* @param {Omit<import('./internal.js').ArtifactMode, 'debug'>} checkMode
* @returns {void}
*/
export function assertComplete(internal, level) {
assert.equal(level, 'operational'); // only option for now
export function assertComplete(internal, checkMode) {
// every bundle must be populated
internal.bundleStore.assertComplete(level);
internal.bundleStore.assertComplete(checkMode);

// every 'isCurrent' transcript span must have all items
// TODO: every vat with any data must have a isCurrent transcript
// span
internal.transcriptStore.assertComplete(level);
internal.transcriptStore.assertComplete(checkMode);

// every 'inUse' snapshot must be populated
internal.snapStore.assertComplete(level);
internal.snapStore.assertComplete(checkMode);

// TODO: every isCurrent span that starts with load-snapshot has a
// matching snapshot (counter-argument: swing-store should not know
7 changes: 4 additions & 3 deletions packages/swing-store/src/bundleStore.js
@@ -16,6 +16,7 @@ import { createSHA256 } from './hasher.js';
*/
/**
* @typedef { import('./exporter').SwingStoreExporter } SwingStoreExporter
* @typedef { import('./internal.js').ArtifactMode } ArtifactMode
*
* @typedef {{
* addBundle: (bundleID: string, bundle: Bundle) => void;
@@ -29,7 +30,7 @@ import { createSHA256 } from './hasher.js';
* repairBundleRecord: (key: string, value: string) => void,
* importBundleRecord: (key: string, value: string) => void,
* importBundle: (name: string, dataProvider: () => Promise<Buffer>) => Promise<void>,
* assertComplete: (level: 'operational') => void,
* assertComplete: (checkMode: Omit<ArtifactMode, 'debug'>) => void,
* getExportRecords: () => IterableIterator<readonly [key: string, value: string]>,
* getArtifactNames: () => AsyncIterableIterator<string>,
* getBundleIDs: () => IterableIterator<string>,
@@ -162,8 +163,8 @@ export function makeBundleStore(db, ensureTxn, noteExport = () => {}) {
return sqlGetPrunedBundles.all();
}

function assertComplete(level) {
assert.equal(level, 'operational'); // for now
function assertComplete(checkMode) {
assert(checkMode !== 'debug', checkMode);
const pruned = getPrunedBundles();
if (pruned.length) {
throw Fail`missing bundles for: ${pruned.join(',')}`;
30 changes: 18 additions & 12 deletions packages/swing-store/src/exporter.js
@@ -8,6 +8,8 @@ import { makeBundleStore } from './bundleStore.js';
import { makeSnapStore } from './snapStore.js';
import { makeSnapStoreIO } from './snapStoreIO.js';
import { makeTranscriptStore } from './transcriptStore.js';
import { assertComplete } from './assertComplete.js';
import { validateArtifactMode } from './internal.js';

/**
* @template T
@@ -53,7 +55,7 @@ import { makeTranscriptStore } from './transcriptStore.js';
*
* Get a list of name of artifacts available from the swingStore. A name
* returned by this method guarantees that a call to `getArtifact` on the same
* exporter instance will succeed. The `exportMode` option to
* exporter instance will succeed. The `artifactMode` option to
* `makeSwingStoreExporter` controls the filtering of the artifact names
* yielded.
*
@@ -75,22 +77,20 @@ import { makeTranscriptStore } from './transcriptStore.js';
*/

/**
* @typedef {'current' | 'archival' | 'debug'} ExportMode
* @typedef { object } ExportSwingStoreOptions
* @property { import('./internal.js').ArtifactMode } [artifactMode] What artifacts should/must the exporter provide?
*/

/**
* @param {string} dirPath
* @param { ExportMode } exportMode
* @param { ExportSwingStoreOptions } [options]
* @returns {SwingStoreExporter}
*/
export function makeSwingStoreExporter(dirPath, exportMode = 'current') {
export function makeSwingStoreExporter(dirPath, options = {}) {
typeof dirPath === 'string' || Fail`dirPath must be a string`;
exportMode === 'current' ||
exportMode === 'archival' ||
exportMode === 'debug' ||
Fail`invalid exportMode ${q(exportMode)}`;
const exportHistoricalSnapshots = exportMode === 'debug';
const exportHistoricalTranscripts = exportMode !== 'current';
const { artifactMode = 'operational' } = options;
validateArtifactMode(artifactMode);

const filePath = dbFileInDirectory(dirPath);
const db = sqlite3(filePath);

@@ -106,6 +106,12 @@ export function makeSwingStoreExporter(dirPath, exportMode = 'current') {
const bundleStore = makeBundleStore(db, ensureTxn);
const transcriptStore = makeTranscriptStore(db, ensureTxn, () => {});

if (artifactMode !== 'debug') {
// throw early if this DB will not be able to create all the desired artifacts
const internal = { snapStore, bundleStore, transcriptStore };
assertComplete(internal, artifactMode);
}

const sqlGetAllKVData = db.prepare(`
SELECT key, value
FROM kvStore
@@ -132,8 +138,8 @@ export function makeSwingStoreExporter(dirPath, exportMode = 'current') {
* @yields {string}
*/
async function* getArtifactNames() {
yield* snapStore.getArtifactNames(exportHistoricalSnapshots);
yield* transcriptStore.getArtifactNames(exportHistoricalTranscripts);
yield* snapStore.getArtifactNames(artifactMode);
yield* transcriptStore.getArtifactNames(artifactMode);
yield* bundleStore.getArtifactNames();
}

30 changes: 17 additions & 13 deletions packages/swing-store/src/importer.js
@@ -2,11 +2,12 @@ import { Fail, q } from '@agoric/assert';

import { makeSwingStore } from './swingStore.js';
import { buffer } from './util.js';
import { validateArtifactMode } from './internal.js';
import { assertComplete } from './assertComplete.js';

/**
* @typedef { object } ImportSwingStoreOptions
* @property { boolean } [includeHistorical] Should the importer pay attention to historical artifacts?
* @property { import('./internal.js').ArtifactMode } [artifactMode] What artifacts should the importer use and require?
*/

/**
@@ -16,15 +17,17 @@ import { assertComplete } from './assertComplete.js';
*
* @param {import('./exporter').SwingStoreExporter} exporter
* @param {string | null} [dirPath]
* @param {ImportSwingStoreOptions} options
* @param {ImportSwingStoreOptions} [options]
* @returns {Promise<import('./swingStore').SwingStore>}
*/
export async function importSwingStore(exporter, dirPath = null, options = {}) {
if (dirPath && typeof dirPath !== 'string') {
Fail`dirPath must be a string`;
}
const { includeHistorical = false } = options;
const store = makeSwingStore(dirPath, true, options);
const { artifactMode = 'operational', ...makeSwingStoreOptions } = options;
validateArtifactMode(artifactMode);

const store = makeSwingStore(dirPath, true, makeSwingStoreOptions);
const { kernelStorage, internal } = store;

// For every exportData entry, we add a DB record. 'kv' entries are
@@ -81,11 +84,10 @@ export async function importSwingStore(exporter, dirPath = null, options = {}) {

// All the metadata is now installed, and we're prepared for
// artifacts. We walk `getArtifactNames()` and offer each one to the
// submodule, which ignores historical ones (unless
// 'includeHistorical' is true), and validates+accepts the
// rest. This is an initial import, so we don't need to check if we
// already have the data, but the submodule function is free to do
// that check if they want.
// submodule, which may ignore it according to `artifactMode`, but
// otherwise validates and accepts it. This is an initial import, so
// we don't need to check if we already have the data, but the
// submodule function is free to do such checks.

for await (const name of exporter.getArtifactNames()) {
const makeChunkIterator = () => exporter.getArtifact(name);
@@ -98,23 +100,25 @@ export async function importSwingStore(exporter, dirPath = null, options = {}) {
await internal.bundleStore.importBundle(name, dataProvider);
} else if (tag === 'snapshot') {
await internal.snapStore.populateSnapshot(name, makeChunkIterator, {
includeHistorical,
artifactMode,
});
} else if (tag === 'transcript') {
await internal.transcriptStore.populateTranscriptSpan(
name,
makeChunkIterator,
{ includeHistorical },
{ artifactMode },
);
} else {
Fail`unknown artifact type ${q(tag)} on import`;
}
}

// We've installed all the artifacts that we could, now do a
// completeness check.
// completeness check. Enforce at least 'operational' completeness,
// even if the given mode was 'debug'.

assertComplete(internal, 'operational');
const checkMode = artifactMode === 'debug' ? 'operational' : artifactMode;
assertComplete(internal, checkMode);

await exporter.close();
return store;
12 changes: 10 additions & 2 deletions packages/swing-store/src/internal.js
@@ -1,3 +1,5 @@
import { Fail, q } from '@agoric/assert';

/**
* @typedef { import('./snapStore').SnapStoreInternal } SnapStoreInternal
* @typedef { import('./transcriptStore').TranscriptStoreInternal } TranscriptStoreInternal
@@ -8,7 +10,13 @@
* snapStore: SnapStoreInternal,
* bundleStore: BundleStoreInternal,
* }} SwingStoreInternal
*
* @typedef {'operational' | 'replay' | 'archival' | 'debug'} ArtifactMode
*/

// Ensure this is a module.
export {};
export const artifactModes = ['operational', 'replay', 'archival', 'debug'];
export function validateArtifactMode(artifactMode) {
if (!artifactModes.includes(artifactMode)) {
Fail`invalid artifactMode ${q(artifactMode)}`;
}
}
4 changes: 3 additions & 1 deletion packages/swing-store/src/repairMetadata.js
@@ -60,6 +60,8 @@ export async function doRepairMetadata(internal, exporter) {
}

// and do a completeness check
assertComplete(internal, 'operational');
/** @type { import('./internal.js').ArtifactMode } */
const artifactMode = 'operational';
assertComplete(internal, artifactMode);
await exporter.close();
}
21 changes: 11 additions & 10 deletions packages/swing-store/src/snapStore.js
@@ -31,6 +31,7 @@ import { buffer } from './util.js';

/**
* @typedef { import('./exporter').SwingStoreExporter } SwingStoreExporter
* @typedef { import('./internal.js').ArtifactMode } ArtifactMode
*
* @typedef {{
* loadSnapshot: (vatID: string) => AsyncIterableIterator<Uint8Array>,
@@ -44,10 +45,10 @@ import { buffer } from './util.js';
* @typedef {{
* exportSnapshot: (name: string) => AsyncIterableIterator<Uint8Array>,
* getExportRecords: (includeHistorical: boolean) => IterableIterator<readonly [key: string, value: string]>,
* getArtifactNames: (includeHistorical: boolean) => AsyncIterableIterator<string>,
* getArtifactNames: (artifactMode: ArtifactMode) => AsyncIterableIterator<string>,
* importSnapshotRecord: (key: string, value: string) => void,
* populateSnapshot: (name: string, makeChunkIterator: () => AnyIterableIterator<Uint8Array>, options: { includeHistorical: boolean }) => Promise<void>,
* assertComplete: (level: 'operational') => void,
* populateSnapshot: (name: string, makeChunkIterator: () => AnyIterableIterator<Uint8Array>, options: { artifactMode: ArtifactMode }) => Promise<void>,
* assertComplete: (checkMode: Omit<ArtifactMode, 'debug'>) => void,
* repairSnapshotRecord: (key: string, value: string) => void,
* }} SnapStoreInternal
*
@@ -481,11 +482,11 @@ export function makeSnapStore(
}
}

async function* getArtifactNames(includeHistorical) {
async function* getArtifactNames(artifactMode) {
for (const rec of sqlGetAvailableSnapshots.iterate(1)) {
yield snapshotArtifactName(rec);
}
if (includeHistorical) {
if (artifactMode === 'debug') {
for (const rec of sqlGetAvailableSnapshots.iterate(null)) {
yield snapshotArtifactName(rec);
}
@@ -564,12 +565,12 @@ export function makeSnapStore(
* @param {string} name Artifact name of the snapshot
* @param {() => AnyIterableIterator<Uint8Array>} makeChunkIterator get an iterator of snapshot byte chunks
* @param {object} options
* @param {boolean} options.includeHistorical
* @param {ArtifactMode} options.artifactMode
* @returns {Promise<void>}
*/
async function populateSnapshot(name, makeChunkIterator, options) {
ensureTxn();
const { includeHistorical } = options;
const { artifactMode } = options;
const parts = name.split('.');
const [type, vatID, rawEndPos] = parts;
// prettier-ignore
@@ -580,7 +581,7 @@ export function makeSnapStore(
sqlGetSnapshotHashFor.get(vatID, snapPos) ||
Fail`no metadata for snapshot ${name}`;

if (!metadata.inUse && !includeHistorical) {
if (!metadata.inUse && artifactMode !== 'debug') {
return; // ignore old snapshots
}

@@ -617,8 +618,8 @@ export function makeSnapStore(
`);
sqlListPrunedCurrentSnapshots.pluck();

function assertComplete(level) {
assert.equal(level, 'operational'); // for now
function assertComplete(checkMode) {
assert(checkMode !== 'debug', checkMode);
// every 'inUse' snapshot must be populated
const vatIDs = sqlListPrunedCurrentSnapshots.all();
if (vatIDs.length) {