Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

"Restart Extension Host" or any similar command hangs all active interpreter sessions #2683

Closed
jmcphers opened this issue Apr 5, 2024 · 6 comments
Assignees
Labels
area: core Issues related to Core category. area: runtimes Issues related to Language Runtimes bug Something isn't working support

Comments

@jmcphers
Copy link
Collaborator

jmcphers commented Apr 5, 2024

Positron Version:

Positron Version: 2024.04.0 (Universal) build 29
Code - OSS Version: 1.87.0
Commit: 6c41dddd853aa3f36b47296a42ec2672c5a523de
Date: 2024-04-04T03:32:13.676Z
Electron: 27.3.2
Chromium: 118.0.5993.159
Node.js: 18.17.1
V8: 11.8.172.18-electron.0
OS: Darwin arm64 23.4.0

Steps to reproduce the issue:

  1. Open any folder in Positron (do not open a workspace file; the folder must not contain a Workspace file)
  2. Start R and/or Python
  3. From the File menu, choose Save Workspace As
  4. Save the workspace to any file name
  5. Attempt to execute an R and/or Python command in the Console

Nothing happens; the console hangs and doesn't run the commands.

From a cursory investigation, it appears that the issue here is that the extensions get soft-restarted when we switch context to the newly saved workspace file. This soft-restart causes them to lose all their state and forget about the active runtime sessions, even though the UI still shows the sessions as active.

Originally reported by a beta user: https://github.com/posit-dev/positron-beta/discussions/74; a video demo is available there.

What did you expect to happen?

Interpreters should be usable after running this command.

@jmcphers jmcphers added the bug Something isn't working label Apr 5, 2024
@petetronic petetronic added this to the Public Beta 2024 Q2 milestone Apr 8, 2024
@juliasilge juliasilge added area: core Issues related to Core category. area: runtimes Issues related to Language Runtimes labels Apr 26, 2024
@juliasilge
Copy link
Contributor

I looked at this a little bit today, seeing where we were doing extensionService.stopExtensionHosts() and then extensionService.startExtensionHosts(). There wasn't much different about this particular instance ("Save Workspace As") compare to all the other places, and then I realized another way to get the same problem is to run the "Restart Extension Host" command. The command is pretty simple:

async run(accessor: ServicesAccessor): Promise<void> {
const extensionService = accessor.get(IExtensionService);
const stopped = await extensionService.stopExtensionHosts(nls.localize('restartExtensionHost.reason', "Restarting extension host on explicit request."));
if (stopped) {
extensionService.startExtensionHosts();
}
}
}

I'm pretty sure from my initial poke around that all instances where Positron might stop and then start the extension host are currently broken:
https://github.com/search?q=repo:posit-dev/positron+%22stopExtensionHosts(localize%22&type=code

@juliasilge
Copy link
Contributor

Can we shut down a ILanguageRuntimeSessionManager when we do stopExtensionHosts()? The extension service does have a onWillStop event but it seems to mostly be used for vetoing the extension hosts stopping.

@juliasilge juliasilge changed the title "Save workspace as..." command hangs all active interpreter sessions "Restart Extension Host" or any similar command hangs all active interpreter sessions Apr 30, 2024
@jmcphers
Copy link
Collaborator Author

Yes. I have a change that does a better job handling session managers as it's also needed to make things work correctly with a remote extension host. (not yet merged)

We basically need to detect that this has happened and do a reconnect to recreate all of the state in the extensions that gets lost when the host restarts.

@juliasilge
Copy link
Contributor

Sounds great! I'm going to assign you @jmcphers so we know this issue is being addressed.

@jmcphers
Copy link
Collaborator Author

Did some exploratory work on this today and the results were not great. Because the extensions hold a lot of state w/r/t the registered and active runtimes, a forced restart of all extensions results in a lot of lost data. For example:

  • the extension host no longer knows which runtimes are registered
  • all the zeromq connections between the extension host and the ptyhost are severed
  • all comms are offline
  • the language runtimes are in a unique state: the backend and frontend are still alive but the middleware is gone

To fix this, we need to:

  • put the runtimes in a sort of limbo state when the extension host stops (which is worth doing any way since the extension host can crash)
  • same deal for each active comm
  • detect the extension host's reappearance
  • reactivate all the extensions that had active runtimes
  • wait for that activation to complete
  • replay runtime registration records back into the extension host, OR just re-register the active runtimes and then defer a second discovery pass (argument for doing this is that you could then restart the extension host to rescan/retrigger discovery as a debugging tool/reset button)
  • reconnect the runtimes that were in limbo state
  • reconnect the comms

@juliasilge
Copy link
Contributor

In Positron 2024.05.0 (Universal) build 1313, I now see a great experience for "Restart Extension Host":

restart-ext-host.mov

I see an intermittent remaining problem with the original reported workflow around "Save Workspace As" that I have outlined in #3300.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Aug 8, 2024
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
area: core Issues related to Core category. area: runtimes Issues related to Language Runtimes bug Something isn't working support
Projects
None yet
Development

No branches or pull requests

3 participants