[docs] Add documentation on fileset migrations under operational guides #2630

Merged: 1 commit, Sep 17, 2020
43 changes: 43 additions & 0 deletions docs/operational_guide/fileset_migrations.md
@@ -0,0 +1,43 @@
# Fileset Migrations

Occasionally, the on-disk format of fileset files changes. When such a change needs to be applied to existing filesets, a fileset migration is required. Migrating existing filesets lets improvements made in newer releases apply to all filesets, not just newly created ones.

## Migration Process
Migrations run during the initial stages of the bootstrap. When enabled, the filesystem bootstrapper scans for filesets that should be migrated and migrates any it finds. Whether a fileset needs migrating is determined by the `MajorVersion` and `MinorVersion` in its info file: if `MajorVersion.MinorVersion` is less than the target migration version, the fileset is scheduled for migration.
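
The version check described above can be sketched in Go roughly as follows. This is a minimal illustration, not M3's actual implementation: the `version` type and the `needsMigration` function are hypothetical names, and the sketch assumes versions compare by major component first, then minor.

```go
package main

import "fmt"

// version is a hypothetical stand-in for the MajorVersion and MinorVersion
// values read from a fileset's info file.
type version struct {
	Major int
	Minor int
}

// less reports whether v is strictly older than other, comparing the major
// component first and the minor component second (an assumed ordering).
func (v version) less(other version) bool {
	if v.Major != other.Major {
		return v.Major < other.Major
	}
	return v.Minor < other.Minor
}

// needsMigration reports whether a fileset at version current should be
// scheduled for migration to target.
func needsMigration(current, target version) bool {
	return current.less(target)
}

func main() {
	target := version{Major: 1, Minor: 1}
	for _, v := range []version{{1, 0}, {1, 1}, {2, 0}} {
		fmt.Printf("%d.%d needs migration: %v\n", v.Major, v.Minor, needsMigration(v, target))
	}
}
```

Under these assumptions, only filesets strictly older than the target are scheduled; filesets already at or beyond the target are left alone.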

If migrations are deemed necessary, the bootstrap process pauses until the migrations complete. If a failure occurs while migrating, an error is logged and the process continues. If a fileset is not successfully migrated, the non-migrated version of the fileset is used going forward. In other words, whether they succeed or fail, migrations should leave filesets in a good state.
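
The failure behavior described above (block until done, log errors, fall back to the non-migrated fileset) might look something like the following Go sketch. The `fileset` type, `migrateAll`, and `demoMigrate` are hypothetical names invented for illustration; M3's real bootstrapper is structured differently.

```go
package main

import (
	"errors"
	"fmt"
	"log"
	"sync"
)

// fileset is a hypothetical, simplified handle for one fileset on disk.
type fileset struct {
	ID      string
	Version string
}

// migrateAll runs migrate over every fileset using up to concurrency workers
// and blocks until all of them finish. A failed migration is logged and the
// original (non-migrated) fileset is kept in the result.
func migrateAll(filesets []fileset, concurrency int, migrate func(fileset) (fileset, error)) []fileset {
	if concurrency < 1 {
		concurrency = 1 // assumption: always run at least one worker
	}
	result := make([]fileset, len(filesets))
	copy(result, filesets) // default to the non-migrated versions

	jobs := make(chan int)
	var wg sync.WaitGroup
	for w := 0; w < concurrency; w++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := range jobs {
				migrated, err := migrate(filesets[i])
				if err != nil {
					log.Printf("migration failed for fileset %s: %v", filesets[i].ID, err)
					continue // keep the original fileset
				}
				result[i] = migrated // each index is written by one worker only
			}
		}()
	}
	for i := range filesets {
		jobs <- i
	}
	close(jobs)
	wg.Wait() // the caller (bootstrap, in this sketch) pauses here
	return result
}

// demoMigrate is a hypothetical migration that fails for fileset "b".
func demoMigrate(f fileset) (fileset, error) {
	if f.ID == "b" {
		return fileset{}, errors.New("simulated failure")
	}
	f.Version = "1.1"
	return f, nil
}

func main() {
	in := []fileset{{"a", "1.0"}, {"b", "1.0"}, {"c", "1.0"}}
	fmt.Println(migrateAll(in, 2, demoMigrate))
}
```

In this sketch the input slice is never modified, so a failed migration cannot leave a fileset half-updated; that mirrors the stated goal that filesets end up in a good state either way.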

Collaborator: It seems unfortunate that migrations slow down the bootstrapping process, especially since the migration isn't required for boot-up (i.e., if it fails, the old fileset is used). Curious if there is any future plan to change this.

Collaborator (Author): The concurrency knobs help a bit to make them happen pretty quickly, but, yeah, not ideal for sure. This is something we can revisit if it starts to become a problem and we have more migrations. Given that M3 has been around for a while now and we're only just implementing something like this, I'm inclined to wait and see before investing more time in optimizations.


## Enabling Migrations
Migrations are enabled by setting the following fields in the M3 configuration (`m3dbnode.yml`):

```
db:
  bootstrap:
    fs:
      migration:
        targetMigrationVersion: "1.1"
        # Optional. Defaults to the number of available CPUs / 2.
        concurrency: <# of concurrent workers>
```
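
As a rough illustration of the `concurrency` default noted in the comment above, the worker count could be resolved along these lines. `effectiveConcurrency` is a hypothetical helper, and treating non-positive values as "unset" and flooring the result at 1 are assumptions of this sketch, not documented M3 behavior.

```go
package main

import (
	"fmt"
	"runtime"
)

// effectiveConcurrency returns the configured worker count when it is set
// (greater than zero); otherwise it falls back to half the available CPUs.
// The floor of 1 on single-CPU machines is an assumption of this sketch.
func effectiveConcurrency(configured, numCPU int) int {
	if configured > 0 {
		return configured
	}
	if half := numCPU / 2; half > 0 {
		return half
	}
	return 1
}

func main() {
	// With no explicit setting, use half of the CPUs visible to the process.
	fmt.Println(effectiveConcurrency(0, runtime.NumCPU()))
}
```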

## Valid Target Migration Versions

<table>
<thead>
<tr>
<th>Version</th>
<th>Description</th>
</tr>
</thead>

<tbody>
<tr>
<td><code>&quot;none&quot;</code></td>
<td>Disables migrations. Effectively the same as not including the migration option in the M3 configuration.</td>
</tr>
<tr>
<td><code>&quot;1.1&quot;</code></td>
<td>Migrates to version 1.1. Version 1.1 adds checksum values to individual entries in the index file of data filesets. This speeds up bootstrapping because validating the index file no longer requires loading the entire file, calculating its checksum, and comparing it against the value in the digests file.</td>
</tr>
</tbody>
</table>
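
To make the table concrete, here is a minimal Go sketch of how the two valid values might be interpreted. `targetVersion` and `parseTarget` are hypothetical names, and treating an empty string like `"none"` is an assumption drawn from the note that `"none"` is effectively the same as omitting the option.

```go
package main

import "fmt"

// targetVersion is a hypothetical parsed form of targetMigrationVersion.
type targetVersion struct {
	Enabled bool
	Major   int
	Minor   int
}

// parseTarget maps the values listed in the table to a targetVersion.
// An empty string is treated like "none" (an assumption of this sketch).
// Anything else is rejected.
func parseTarget(s string) (targetVersion, error) {
	switch s {
	case "", "none":
		return targetVersion{}, nil
	case "1.1":
		return targetVersion{Enabled: true, Major: 1, Minor: 1}, nil
	default:
		return targetVersion{}, fmt.Errorf("unknown target migration version %q", s)
	}
}

func main() {
	for _, s := range []string{"none", "1.1", "2.0"} {
		v, err := parseTarget(s)
		fmt.Printf("%q -> %+v, err=%v\n", s, v, err)
	}
}
```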