[POC] Kernel and session rehydration and syncing #752
Closed
I'm opening this as a proof of concept; it's not ready for thorough review.
Many enterprise deployments of Jupyter Server manage kernels remotely, through a service separate from Jupyter Server (e.g., Kernel Gateway or Enterprise Gateway). The advantage of this configuration is that kernels can persist past the lifetime of the Jupyter Server. This is particularly helpful if you need to upgrade the server/frontend but don't want to shut down kernels.
A major challenge of this type of system is that it introduces distributed state for kernels. Today, we don't have a good, general way to synchronize the state of kernels (and client sessions) across such a system.
This PR proposes a solution: a Synchronizer class that watches, syncs, and persists kernels and sessions within and beyond the lifetime of a Jupyter Server. Kernels and sessions can be "rehydrated" from a database at any time. This is mostly useful in a remote kernel situation, where kernels are provided by a service separate from Jupyter Server (e.g., Kernel Gateway or Enterprise Gateway). If the Jupyter Server goes down or is restarted (say, for an upgrade), this synchronizer can be used to repopulate the state. This works with pending, local, and remote kernels.
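To make the idea concrete, here is a rough, standalone sketch of the persist/rehydrate/sync loop. It is not the code in this PR: `KernelRecord`, `KernelStore`, the `fetch_remote_kernel_ids` callable, and the table layout are all hypothetical names chosen for illustration, and it uses only the standard library (`sqlite3`) rather than any Jupyter Server API.

```python
import sqlite3
from dataclasses import dataclass
from typing import Callable, Dict, Iterable, List, Optional


@dataclass(frozen=True)
class KernelRecord:
    """Hypothetical record tying a kernel to an optional client session."""
    kernel_id: str
    session_id: Optional[str] = None
    remote: bool = False


class KernelStore:
    """Hypothetical persistence layer; a real server would pick its own schema."""

    def __init__(self, path: str = ":memory:") -> None:
        self._conn = sqlite3.connect(path)
        self._conn.execute(
            "CREATE TABLE IF NOT EXISTS kernels ("
            "kernel_id TEXT PRIMARY KEY, session_id TEXT, remote INTEGER NOT NULL)"
        )

    def save(self, record: KernelRecord) -> None:
        self._conn.execute(
            "INSERT OR REPLACE INTO kernels VALUES (?, ?, ?)",
            (record.kernel_id, record.session_id, int(record.remote)),
        )
        self._conn.commit()

    def delete(self, kernel_id: str) -> None:
        self._conn.execute("DELETE FROM kernels WHERE kernel_id = ?", (kernel_id,))
        self._conn.commit()

    def list(self) -> List[KernelRecord]:
        rows = self._conn.execute(
            "SELECT kernel_id, session_id, remote FROM kernels"
        ).fetchall()
        return [KernelRecord(k, s, bool(r)) for k, s, r in rows]


class Synchronizer:
    """Sketch only: keeps in-memory state, the store, and the remote service aligned."""

    def __init__(
        self,
        store: KernelStore,
        fetch_remote_kernel_ids: Callable[[], Iterable[str]],
    ) -> None:
        self.store = store
        # Assumed to return the IDs of kernels currently alive on the remote service.
        self.fetch_remote_kernel_ids = fetch_remote_kernel_ids
        self.kernels: Dict[str, KernelRecord] = {}

    def rehydrate(self) -> None:
        """Repopulate in-memory state from the database, e.g. after a restart."""
        self.kernels = {r.kernel_id: r for r in self.store.list()}

    def sync(self) -> None:
        """Reconcile known kernels against what the remote service reports."""
        remote_ids = set(self.fetch_remote_kernel_ids())
        # Forget remote kernels that no longer exist; local kernels are left alone.
        for kernel_id, record in list(self.kernels.items()):
            if record.remote and kernel_id not in remote_ids:
                del self.kernels[kernel_id]
                self.store.delete(kernel_id)
        # Adopt remote kernels this server has not seen yet.
        for kernel_id in remote_ids - set(self.kernels):
            record = KernelRecord(kernel_id, remote=True)
            self.kernels[kernel_id] = record
            self.store.save(record)
```

In this sketch, a restarted server would build a new `Synchronizer` over the same store, call `rehydrate()` to recover what it knew before going down, and then call `sync()` to pick up anything that changed on the gateway in the meantime. Pending kernels and session bookkeeping are left out to keep the example short.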
There is still a lot of discussion and work to do here (including adding unit tests and documentation).
I'll share at our Jupyter Server meeting tomorrow.
Depends on #751
Pinging @kevin-bates, since he is likely interested.