Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Lazy Creation of WebSocketClient within PolykeyClient #771

Open
amydevs opened this issue Jul 12, 2024 · 4 comments
Open

Lazy Creation of WebSocketClient within PolykeyClient #771

amydevs opened this issue Jul 12, 2024 · 4 comments
Labels
development Standard development

Comments

@amydevs
Copy link
Member

amydevs commented Jul 12, 2024

Specification

After experimenting with PolykeyClient in Polykey Network Status. I've realized that the error handling is really tedious.

The main issues happen when a connection fails, the error you will receive from the PolykeyClient will be inconsistent depending on the point at which the code throws during the the stopping process.

This is because the Stopping signal propagation of WebSocketClient to the PolykeyClient is entirely asynchronous through EventListeners. This makes for a huge pain when attempting connection retries like so:

https://github.com/MatrixAI/Polykey-Network-Status/blob/staging/src/plugins/polykeyClientPlugin.ts

This initially gave me the impression that error handling with RPCClient + WebSocketClient was really messy.

But after playing around with it in React, I managed to come up with a small React context provider component that handles it all really elegantly: https://github.com/MatrixAI/Polykey-Enterprise/blob/feature-ws/src/app/contexts/RPCProvider.tsx

The gist is that instead of the lifecycle of PolykeyClient being flakily tied to WebSocketClient, the PolykeyClient will instead always be started, creating WebSocketConnections lazily when an RPCClient call is being attempted.

This has several benefits:

  • Errors from PolykeyClient RPC calls are more predictable, as making an RPC call can no longer error from either
    1. The connection error from the RPC call
    2. The connection error from the WebSocketClient
    3. The WebSocketClientStopped error
  • Reconnection + reinitate RPC call logic can be entirely encapsulated within a single loop of the RPC call, rather than 2 independent loops. This is because WebSocketConnection errors will now always occur on the RPC call, rather than possibly receiving three different errors in this circumstance.

Implementation

A lazy param will be simply added to the PolykeyClient constructor. There will be no API changes.

There WILL however be behaviour changes that need to be documented:

  • PolykeyClient will never stop unless .stop is called.
  • RPC calls will throw WebSocketConnection errors if the lazily created WebSocketConnection is unsuccessful in connecting

Additional context

Tasks

  1. Add lazy option to PolykeyClient
@amydevs amydevs added the development Standard development label Jul 12, 2024
Copy link

linear bot commented Jul 12, 2024

Copy link
Contributor

tegefaulkes commented Aug 6, 2024

There is some stuff the be worked out with differences in behaviour. With the network state being lazily created we can end up with not actually active networking state while we're runing. This means getters for host and port will return undefined. This changes the API somewhat.

@amydevs Mentioned we can branch the behaviour into two separate classes to avoid the behaviour conflict. but this is something to be addressed later.

@CMCDragonkai
Copy link
Member

I was discussing this with @amydevs. We don't actually want this. The problem is more about decomposing the problem dimensions between a structured hierarchical concurrent state machines, and using locks to prevent non-deterministic behaviour when an underlying state machine or the overarching state machine is making transitions.

I think the idea that a "connection" is blocked while "streams" are transitioning is now incorrect. The lock priority should be the other way around. "Streams" should be blocked while the "connection" is transitioning. This is because a stream is a more abstract thing, and the connection transition is higher priority.

@CMCDragonkai
Copy link
Member

Also #772 is now cancelled.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
development Standard development
Development

Successfully merging a pull request may close this issue.

3 participants