Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Revamp pipeline thread handling #1295

Closed
SeeminglyScience opened this issue May 21, 2020 · 119 comments · Fixed by #1459
Closed

Revamp pipeline thread handling #1295

SeeminglyScience opened this issue May 21, 2020 · 119 comments · Fixed by #1459
Assignees
Labels

Comments

@SeeminglyScience
Copy link
Collaborator

SeeminglyScience commented May 21, 2020

A lot of the problems we face is based around handling of the pipeline thread. In order to invoke something on it, we need to interact with the PowerShell class, making us invoke at least a small bit of script in most cases. The reason for this is that the actual thread used by PowerShell is internal to the runspace by default. The only way to access it is to invoke a command of some sort.

This is the default experience, but we can change it. Runspace has a property called ThreadOptions. One of the choices for that enum is UseCurrentThread. So what we can do is start our own thread, create the runspace there with that option, and never give up that thread.

One of the biggest wins here would be that we could call PSConsoleReadLine.ReadLine directly without having to invoke something. We could also ditch using the thread pool to wait for PowerShell.Invoke to finish (which probably causes some dead locks). We could get rid of a lot of the more complicated code in PowerShellContext.

I'm pretty confident that if this doesn't outright solve a lot of the sluggishness and dead locks, it'll at the very least be significantly easier to debug.

The rest of this post is taken from #980 since the idea is the same: Nvm, just look at @rjmholt's code and the rest of conversation. The linked post is pretty outdated.

@ghost ghost added the Needs: Triage Maintainer attention needed! label May 21, 2020
@rjmholt
Copy link
Contributor

rjmholt commented May 21, 2020

I've been thinking about this too, and wrote a simple consumer that allows asynchronous dispatch of PowerShell commands like this:

    public class AsyncPowerShell
    {
        private readonly PowerShell _pwsh;

        private readonly Runspace _runspace;

        private readonly BlockingCollection<Tuple<PSCommand, TaskCompletionSource<IReadOnlyCollection<PSObject>>>> _commandQueue;

        private readonly Thread _pwshThread;

        private readonly CancellationTokenSource _cancellationSource;

        public AsyncPowerShell()
        {
            _commandQueue = new BlockingCollection<Tuple<PSCommand, TaskCompletionSource<IReadOnlyCollection<PSObject>>>>();
            _cancellationSource = new CancellationTokenSource();
            _runspace = RunspaceFactory.CreateRunspace();
            _runspace.Open();
            _pwsh = PowerShell.Create();
            _pwsh.Runspace = _runspace;
            _runspace.Debugger.DebuggerStop += OnDebuggerStop;
            _pwshThread = new Thread(RunPwshConsumer);
            _pwshThread.Start();
        }

        public Task<IReadOnlyCollection<PSObject>> ExecutePowerShellAsync(string command)
        {
            var completionSource = new TaskCompletionSource<IReadOnlyCollection<PSObject>>();
            _commandQueue.Add(new Tuple<PSCommand, TaskCompletionSource<IReadOnlyCollection<PSObject>>>(new PSCommand().AddCommand(command), completionSource));
            return completionSource.Task;
        }

        public void Stop()
        {
            _commandQueue.CompleteAdding();
            _cancellationSource.Cancel();
            _pwshThread.Join();
        }

        private void RunPwshConsumer()
        {
            try
            {
                foreach (Tuple<PSCommand, TaskCompletionSource<IReadOnlyCollection<PSObject>>> executionRequest in _commandQueue.GetConsumingEnumerable(_cancellationSource.Token))
                {
                    try
                    {
                        _pwsh.Commands = executionRequest.Item1;
                        executionRequest.Item2.SetResult(_pwsh.Invoke());
                    }
                    catch (OperationCanceledException)
                    {
                        executionRequest.Item2.SetCanceled();
                    }
                    catch (Exception e)
                    {
                        executionRequest.Item2.SetException(e);
                    }
                    finally
                    {
                        _pwsh.Commands.Clear();
                    }
                }
            }
            catch (OperationCanceledException)
            {
                // End nicely
            }
        }

        private void OnDebuggerStop(object sender, DebuggerStopEventArgs args)
        {
            Console.WriteLine("Debugger stopped");
        }
    }

I think with some added sophistication, we could build off such a model to add:

  • Nested prompt and debugging support
  • Per-execution cancellations, allowing us to skip particular requests
  • An implementation based on PowerShell's BeginInvoke and EndInvoke methods
  • Cancellation that works into PowerShell's own pipeline cancellation mechanisms

The only part I don't quite understand in your description @SeeminglyScience is why we need another thread for message dispatch.

@daxian-dbw might also be interested in this discussion

@ghost ghost added the Needs: Maintainer Attention Maintainer attention needed! label May 21, 2020
@SeeminglyScience
Copy link
Collaborator Author

The only part I don't quite understand in your description @SeeminglyScience is why we need another thread for message dispatch.

Oops that made sense in the context of the original thread, but OmniSharp already does that. I'll rip that out

@SeeminglyScience
Copy link
Collaborator Author

SeeminglyScience commented May 21, 2020

Love the code btw ❤️. Basically the same thing I was picturing, except include an option to queue a Action<EngineIntrinsics, CancellationToken> that sets Runspace.DefaultRunspace before invocation (maybe needed maybe not). That way we can maybe run ReadLine, CommandInvocationIntrinsics.GetCommand, and/or CommandCompletion.CompleteInput (we might already do this?) without a command processor.

Would also maybe make accessing some internals via reflection a bit safer. I think there have been a few situations where it would have been worth it to use an internal API, but threading concerns made it really risky.

@rjmholt
Copy link
Contributor

rjmholt commented May 21, 2020

The other thought I had was to also include a method to queue a non-PowerShell action, so we could run something on the pipeline thread without having the overhead of executing PowerShell

This is exactly what you're saying...

@daxian-dbw
Copy link
Member

daxian-dbw commented May 21, 2020

One of the choices for that enum is UseCurrentThread.

I think setting the Runspace.DefaultRunspace should be enough to directly calling ReadLine without requiring __Invoke-ReadLineForEditorService.
But we will need to make sure the Runspace is not used by anything else when PSReadLine is doing its work.

Setting UseCurrentThread in ThreadOptions only indicates that when invoking a script/command with that PowerShell instance (say ps), the actual invocation happens on the calling thread.
So if you later call ps.Invoke() to a different thread, then that thread will be used instead.
Also, a gotcha is that PowerShell's BeginInvoke and EndInvoke methods will cause a deadlock when using UseCurrentThread. It's more like a bug to me.

@rjmholt @SeeminglyScience Do you mean the Dispatch pool section in the PR description is not applicable anymore?

@SeeminglyScience
Copy link
Collaborator Author

One of the choices for that enum is UseCurrentThread.

I think setting the Runspace.DefaultRunspace should be enough to directly calling ReadLine without requiring __Invoke-ReadLineForEditorService.
But we will need to make sure the Runspace is not used by anything else when PSReadLine is doing its work.

I know there are a few other things that are thread static though. Drawing a blank on them, but iirc one has to do with DSC cache. Also I do think it would easier to manage access to the thread than it is to manage what is using the runspace (that's more or less what we do today).

Setting UseCurrentThread in ThreadOptions only indicates that when invoking a script/command with that PowerShell instance (say ps), the actual invocation happens on the calling thread.
So if you later call ps.Invoke() to a different thread, then that thread will be used instead.

Yeah, the idea being is that ps.Invoke() will only be called on the thread we create.

Also, a gotcha is that PowerShell's BeginInvoke and EndInvoke methods will cause a deadlock when using UseCurrentThread. It's more like a bug to me.

That's interesting... I kind of expected that would throw. Similar to the "cannot invoke async on nested pipeline" exception.

@rjmholt @SeeminglyScience Do you mean the Dispatch pool section in the PR description is not applicable anymore?

He was talking about a different part I already removed, but yeah that one kind of isn't either. That's also already mostly handled by omnisharp. I went ahead and just removed all that since @rjmholt's code shows what I'm talking about better than I was explaining it.

@rjmholt
Copy link
Contributor

rjmholt commented May 21, 2020

The parts I'm not quite clear on here are:

  • How do we handle when the debugger gets dropped into?
  • Can we use this design to fix other debugger issues like Wait-Debugger in an argument completer?
  • How do we programmatically trigger a cancellation in the current PowerShell execution and what do we do if we are unable to cancel it? Do we try to abort it, clean it up and start a new one, or is that going to usually leave things in a bad state anyway?
  • Is there a simple way we can traverse the queue and sift out things that have been cancelled ahead of time? Or perhaps it's enough to just evaluate if something's been cancelled before we evaluate it? If so, do we set an exception or return some indicative value?

@daxian-dbw
Copy link
Member

daxian-dbw commented May 21, 2020

fix other debugger issues like Wait-Debugger in an argument completer?

That sounds like an unexpected use to me. What is the scenario?

I looked into the PSES source code recently, and get the idea of how script execution for intellisense is done via the OnIdle event.
The logic might not work if PSES starts to call ReadLine directly instead of from a pipeline. Or to be more precise, the OnIdle event might not work with PSReadLine, because the Runspace will be idle for most of the time (PSReadLine spinning in ReadKey), so a PulsePipeline will be automatically started by the Runspace from time to time, but ReadLine doesn't expect that something is running in the pipeline, and may invoke script/command in a nested pipeline (PowerShell.Create(CurrentRunspace)), and result in conflicts and failures. Never mind, PipelineBase.DoConcurrentCheck handles the case where a pulse pipeline is running. So this is probably not a problem.

What does the existing ThreadController do in PSES?

@rjmholt
Copy link
Contributor

rjmholt commented May 21, 2020

@daxian-dbw
Copy link
Member

For PowerShell/vscode-powershell#2448, PSReadLine doesn't handle breaking into debugger or the nested debugger prompt. The debugger prompt is handled by the PowerShell host. With ConsoleHost, it will call into PSConsoleHostReadLine function again from the nested prompt.

For PowerShell/vscode-powershell#2246, the script block is converted to a delegate which calls ScriptBlock.InvokeAsDelegateHelper to execute the script in the DefaultRunspace. BTW, I cannot reproduce the hang.

@TylerLeonhardt TylerLeonhardt added Area-Engine Issue-Discussion Let's talk about it. and removed Needs: Maintainer Attention Maintainer attention needed! Needs: Triage Maintainer attention needed! labels May 21, 2020
@TylerLeonhardt
Copy link
Member

TylerLeonhardt commented May 21, 2020

I like all of this. It sounds like we could really simplify how we manage running (PowerShell or not) on the pipeline thread.

My thinking is that this is something that can be totally standalone and not specific to PSES... That way it can be self-contained and easy to test. (I'm not saying it should be its own assembly - just that it can be in its own folder and is clear what the purpose is for)

We tackled handling async requests with Omnisharp... the next big hurdle is our management of the PowerShell runspace... and this discussion should address the fear and uncertainty I have with the existing code which is great.

@ghost ghost added the Needs: Maintainer Attention Maintainer attention needed! label May 21, 2020
@TylerLeonhardt TylerLeonhardt removed the Needs: Maintainer Attention Maintainer attention needed! label May 21, 2020
@rjmholt
Copy link
Contributor

rjmholt commented May 21, 2020

My thinking is that this is something that can be totally standalone and not specific to PSES... That way it can be self-contained and easy to test. (I'm not saying it should be its own assembly - just that it can be in its own folder and is clear what the purpose is for)

Yeah that's my feeling too.

Because of how PowerShell is set up, this execution model is coupled to the host implementation, which is something I didn't appreciate until reading through the implementation; I believe debugging and nested prompts only work if the host attached to the code implements them (or something along those lines).

Basically, today we do a bunch of complicated stuff with the host implementation and instead my hope is to have the PowerShell executor and host implementation together in its own folder. Possibly for testing a dummy host will be needed.

@ghost ghost added the Needs: Maintainer Attention Maintainer attention needed! label May 21, 2020
@rjmholt
Copy link
Contributor

rjmholt commented May 21, 2020

Also thanks for looking at those @daxian-dbw -- have been meaning to ask you how those get handled when run in pwsh.exe. Down the line we might be able to do something similar

@TylerLeonhardt TylerLeonhardt removed the Needs: Maintainer Attention Maintainer attention needed! label May 21, 2020
@SeeminglyScience
Copy link
Collaborator Author

SeeminglyScience commented May 22, 2020

  • How do we handle when the debugger gets dropped into?

ScriptDebugger.DebuggerStop would give control back to the thread owner.

  • Can we use this design to fix other debugger issues like Wait-Debugger in an argument completer?

Yeah probably. Though this would be way easier to do if we stopped using OnIdle. I don't know if I ever recorded it anywhere, but I regret using that. Instead, PSRL should have a private contract delegate field (or a public API) that it calls on idle if not null. Using PowerShell's eventing is too flimsy and prone to dead locks. Especially when two things try to marshal a call back to the pipeline thread using it, it gets real messy.

Edit: I should say that's all a guess. I haven't actually tried it, but I'd guess it's getting hung up when we try to retake PSRL with eventing while the event manager is already processing our first tab completion request. ConsoleHost doesn't have this problem because it has no reason to retake PSRL, any completion requests are going to come from PSRL itself.

  • How do we programmatically trigger a cancellation in the current PowerShell execution and what do we do if we are unable to cancel it? Do we try to abort it, clean it up and start a new one, or is that going to usually leave things in a bad state anyway?

I say we record the current "task" that's processing. I'm thinking the queue will take an interface like IJob or something that has a void Execute() method and a void Cancel() method. For the Action<> job it would trigger a cancellation token, for commands it would hit the appropriate cancel mechanism (StopProcessCommand/Stop).

If we can't stop, then there's not much we can do. We can't pretend that we cancelled because it'll still be executing on the pipeline thread.

  • Is there a simple way we can traverse the queue and sift out things that have been cancelled ahead of time? Or perhaps it's enough to just evaluate if something's been cancelled before we evaluate it? If so, do we set an exception or return some indicative value?

I think you just check for cancelled at the start of item processing, if it's cancelled move on to the next one.

@SeeminglyScience
Copy link
Collaborator Author

  • How do we programmatically trigger a cancellation in the current PowerShell execution and what do we do if we are unable to cancel it? Do we try to abort it, clean it up and start a new one, or is that going to usually leave things in a bad state anyway?

I say we record the current "task" that's processing. I'm thinking the queue will take an interface like IJob or something that has a void Execute() method and a void Cancel() method. For the Action<> job it would trigger a cancellation token, for commands it would hit the appropriate cancel mechanism (StopProcessCommand/Stop).

If we can't stop, then there's not much we can do. We can't pretend that we cancelled because it'll still be executing on the pipeline thread.

Or if you mean how do we cancel something that may or may not be currently processing (like might be the current item or might be further in the queue) then: We'd probably also want to store a lookup of job ID to job. The job itself would know if it's currently processing or not, if it is it'd call the appropriate cancel method, if not it'd just mark itself as cancelled.

@SeeminglyScience
Copy link
Collaborator Author

Because of how PowerShell is set up, this execution model is coupled to the host implementation, which is something I didn't appreciate until reading through the implementation; I believe debugging and nested prompts only work if the host attached to the code implements them (or something along those lines).

That's right. If no one subscribes to Debugger.DebuggerStop, nothing happens. You can see that with DefaultHost on a background runspace. Same with nested prompts, if the host doesn't implement PSHost.EnterNestedPrompt() it just throws.

@SeeminglyScience
Copy link
Collaborator Author

What does the existing ThreadController do in PSES?

It's kind of like what this issue is discussing, but only for nested prompts/debug stops. The current execution flow is sort of like this:

  1. Is something running?
    1. No - queue PowerShell.Invoke on the thread pool
    2. Yes - Is ReadLine what is running?
      1. No - Is request a nested prompt or debugger stop?
        1. No - Wait until finished
        2. Yes - We're on the pipeline thread, process requests via ThreadController
      2. Yes - Is command request interactive (e.g. F8)?
        1. No - Take control from PSRL via OnIdle
        2. Yes - Cancel PSRL, run command request, restart PSRL

Part of the problem with the above is that so much of the implementation of ExecutionCommand pretends to be (and typically is) asynchronous. However, if it's called on the pipeline thread, it's forced to be synchronous because as soon as you await something you could end up on another thread. If everything goes through a thread controller like class that doesn't even pretend to be async (but instead facilitate async like @rjmholt's example) then we'll have less dead locks.

@daxian-dbw
Copy link
Member

  • Can we use this design to fix other debugger issues like Wait-Debugger in an argument completer?

Though this would be way easier to do if we stopped using OnIdle. I don't know if I ever recorded it anywhere, but I regret using that. Instead, PSRL should have a private contract delegate field (or a public API) that it calls on idle if not null. Using PowerShell's eventing is too flimsy and prone to dead locks. Especially when two things try to marshal a call back to the pipeline thread using it, it gets real messy.

A private delegate field can be considered, but not a public API, at least initially. It allows you to avoid the OnIdle event but the execution model is the same -- requests get executed 1-by-1, so we still need to handle the case where 2 things trying to marshal a call back to the pipeline thread.
I'm not very clear about the problem of depending on the OnIdle event to marshal script execution to a pipeline thread. Could you please elaborate a little more on it?

The challenge I see is that PSReadLine will hold the pipeline thread (the thread with Runspace.DefaultRunspace set to the PSES Runspace) indefinitely until debug/F8 cancels it. This execution model won't support Wait-Debugger in an argument completer by design, because PSReadLine will be blocked on the CommandCompletion.CompleteInput call when the debugger is triggered, and by that point PSReadLine cannot be cancelled.

The current execution flow is sort of like this:

@SeeminglyScience this summary is helpful for me to read the code, thanks!

@ghost ghost added the Needs: Maintainer Attention Maintainer attention needed! label May 22, 2020
@SeeminglyScience
Copy link
Collaborator Author

SeeminglyScience commented May 22, 2020

A private delegate field can be considered, but not a public API, at least initially. It allows you to avoid the OnIdle event but the execution model is the same -- requests get executed 1-by-1, so we still need to handle the case where 2 things trying to marshal a call back to the pipeline thread.

The problem isn't necessarily two things trying to marshal back at the same time, it's the nesting of them (see below).

I'm not very clear about the problem of depending on the OnIdle event to marshal script execution to a pipeline thread. Could you please elaborate a little more on it?

Yeah for sure, lets take this series of events:

  1. PSRL starts
  2. Intellisense is requested
  3. PSES uses OnIdle to take over PSRL and run TabExpansion2
  4. An argument completer or GetDynamicParameters method starts a new thread to process in
  5. In that thread, a call to a PowerShell API that needs to run on the pipeline thread is called
  6. That PowerShell API (like CommandInfo.get_Parameters for instance) queues it's invocation in the pipeline thread via PSEventManager and blocks until completed
  7. PSEventManager is still processing the OnIdle event from step 3. Step 3 is blocking on step 6 completing, which it never will because that event queued in step 6 won't start processing until step 3 is done.

(this is a real example btw, PackageManagement was doing this before @rjmholt fixed it)

The challenge I see is that PSReadLine will hold the pipeline thread (the thread with Runspace.DefaultRunspace set to the PSES Runspace) indefinitely until debug/F8 cancels it. This execution model won't support Wait-Debugger in an argument completer by design, because PSReadLine will be blocked on the CommandCompletion.CompleteInput call when the debugger is triggered, and by that point PSReadLine cannot be cancelled.

Well Wait-Debugger will synchronously call our DebugStop event handler no? It shouldn't matter that PSRL is still running because it gave pipeline thread control to the argument completer, which then gave control to the debug stop event, which then gave it to us.

@SeeminglyScience this summary is helpful for me to read the code, thanks!

❤️

@SeeminglyScience
Copy link
Collaborator Author

Just a reminder if anyone is still working on this currently, if y'all get stuck, want advise, or would like me to work on something for this, let me know. DM's are open on the PS discord same name there (there's also a #vscode-contributors channel). Always down to chat PSES! 🙂

@andyleejordan
Copy link
Member

There's a Discord too??? @rjmholt is currently working on this, and when I'm done getting my feet wet (read: setting up PSES so I can actually debug it all and get the OmniSharp update done) he's going to transition it over to me (read: work on finishing it together).

@JustinGrote
Copy link
Collaborator

@andschwa yes the PS discord is quite popular :)

@fMichaleczek
Copy link

@andschwa welcome back !

@SydneyhSmith SydneyhSmith removed the Needs: Maintainer Attention Maintainer attention needed! label Mar 29, 2021
@ghost ghost added the Status: In PR label May 10, 2021
@ghost ghost added Status: Fixed and removed Status: In PR labels Oct 28, 2021
@JustinGrote
Copy link
Collaborator

Big day! Congrats and thank you for all the hard word to get this done.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet