Official API to create a third-party console host (like pty in Unix) #57

Closed
be5invis opened this issue Feb 1, 2018 · 69 comments
Labels
Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. Product-Conhost For issues in the Console codebase Resolution-Fix-Available It's available in an Insiders build or a release

Comments

@be5invis

be5invis commented Feb 1, 2018

Yes, a full, comprehensive pty API. It should be able to handle all kinds of console applications that run in Conhost (WSL, Win32, UWP, even 16-bit DOS programs).

@zadjii-msft zadjii-msft added Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. Product-Conhost For issues in the Console codebase labels Feb 2, 2018
@zadjii-msft
Member

I've definitely heard this ask more than a few times before.

"Official pty API for Conhost" does seem like a bit of a broad feature request. I'm not saying that it's totally impossible, but let's entertain the hypothetical for a bit here.

If I were to go about implementing this, what would be the most important things for me to implement? What kind of surface would this API have? How would you expect it to behave?

@be5invis
Author

be5invis commented Feb 2, 2018

Well you know, CONHOST internally has a char matrix that places all the characters. So the API should output such things without opening a CONHOST window.

@parkovski

parkovski commented Feb 2, 2018

My wishlist for this feature:

  • Provide a way to create a pair of console handles without going through conhost.
  • Document "implement" a function, let's call it ConsoleControl, that lets you generate the events a console client would expect to receive.
  • Make WSL tty interop handles translate to these.
  • Document what kind of black magic you have to use to make cmd.exe respond to a ^C.

Bonus for 3rd party console authors: an official way to set these handles as "this process' console" so when someone calls AttachConsole it works (and similar for GetConsoleWindow).

@bgshacklett

It would be really good to get some input from the authors of terminal emulators like Maximus5/ConEmu and zeit/hyper.

@parkovski

Cc: @Maximus5, not sure of a good contact from hyper.

@Maximus5

Maximus5 commented Feb 8, 2018

It's a real pain to implement ConEmu via the official console API. I have to use detours to hook almost all available functions. And what for? Linux has provided pty functions and signals for decades, and it has many terminals for any taste.

@bgshacklett

bgshacklett commented Feb 9, 2018

@zeit @leo and @chabou appear to be the most active on Hyper. Also, @Eugeny has the Terminus project.

@Eugeny

Eugeny commented Feb 9, 2018

Both Hyper and Terminus use node-pty as a PTY backend, which in turn uses the winpty project to provide a pty API on Windows, which is basically a ton of black magic that runs a hidden conhost window and reads its contents.
Having an API identical to the POSIX pty controls would allow cutting the amount of code in node-pty in half and dropping winpty.
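
For readers unfamiliar with the POSIX model referred to here, the following is a minimal sketch of what "pty controls" look like on a Unix-like system, using Python's standard `pty` module. It is not the Windows API under discussion, and the helper name `run_in_pty` is made up for illustration. The terminal side holds the master descriptor, the client program gets the slave descriptor as its stdin/stdout/stderr, and everything the client prints can be read back as a plain byte stream.

```python
import os
import pty
import subprocess


def run_in_pty(argv):
    """Run argv attached to a fresh pseudo-terminal and return its raw output.

    Hypothetical helper for illustration; POSIX only.
    """
    master_fd, slave_fd = pty.openpty()
    try:
        proc = subprocess.Popen(
            argv, stdin=slave_fd, stdout=slave_fd, stderr=slave_fd, close_fds=True
        )
    finally:
        # The parent only needs the master end; the child has its own copy
        # of the slave end.
        os.close(slave_fd)

    chunks = []
    while True:
        try:
            data = os.read(master_fd, 1024)
        except OSError:
            # Linux reports an I/O error once every slave handle is closed.
            break
        if not data:
            break
        chunks.append(data)

    os.close(master_fd)
    proc.wait()
    return b"".join(chunks)


output = run_in_pty(["echo", "hello from a pty"])
print(output)
```

The point of the sketch is how little the terminal side has to do: no hidden window and no screen scraping, just two file descriptors.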

@parkovski

parkovski commented Feb 11, 2018

The hooking issues are why I think this is important. I don't think it's even possible to create console handles without opening a conhost window. You can't even make a custom driver that acts like a console without hooking all the APIs.

I think it's great that the built-in console window is improving, but it shouldn't be this hard and hacky to implement a third party terminal. I don't think the exact shape of the API is that important, because if the functionality is there somebody will make a pty wrapper API.

This (for tmux) and a sudo command that doesn't go through UAC are the last pieces I need to make the Windows console the dev environment I want. I'm not familiar with the headless Windows server, but I'd imagine these things would be pretty important there too.

@be5invis
Author

be5invis commented Feb 12, 2018

@parkovski
Historically the CONHOST has a deep link with the Kernel, and it is simulating a 1980s video adapter's text mode. I remember that in Windows Vista, all windows are in the glass-style material, but only the console is still in the old look and feel. You can even draw text outside the console window.
Perhaps we could move all the legacy compatibility into the "old" console, and provide the PTY API only for the "new" console.

@yatli

yatli commented Feb 12, 2018

@be5invis do the pty features conflict with the legacy console? If not, we can regard the features as an "extension". For example, report the terminal type as "VT-MSCMD", which allows things like actively changing the size of the window. In this way both legacy apps (which ignore the PTY stuff) and the terminal bits will be happy.

If PTY mode is activated, behaviors outside the range of PTY feature set can be translated into escape codes specific to VT-MSCMD, just like how SIXEL cooperates with xterm.
And of course, a PTY program can ignore these escape codes or just write them to STDOUT.

@zadjii-msft zadjii-msft added this to the Backlog milestone Feb 12, 2018
@parkovski

The implementation has changed a lot since then - the console window used to live in csrss, which is why it missed out on theming. There's a good overview of this here, but basically whenever a console subsystem program is launched or you call AllocConsole, csrss creates a new connection to the console driver and passes the handle to conhost.exe.

Conhost is now a regular user-land program, it's just blessed with the ability to be given new console handles and an API that we don't have. When I run WinAPIOverride on it, I see a bunch of calls to DeviceIoControl (undocumented flags) and ConsoleControl (undocumented function). The rest of the console APIs look like they call NtDeviceIoControlFile directly. If we could create new console handles and use these APIs without going through conhost, I think we'd have what we need to make a Unix pty wrapper. Basically, I think the functionality is there, it's just not exposed to anyone other than conhost right now.

@zadjii-msft
Member

Hey all,

I just want you to know that we're definitely going to be CLOSELY following this discussion. A "PTY Implementation" is one of the highest items on the backlog, and it will likely be one of the most important things to get right the first time.

What I'm getting from this discussion:

  • Be able to create a "headless console" to host client commandline applications (cmd, bash, etc)
  • Be able to specify an input and output handle to be able to write input to this console, and read buffer contents
    • Important to be able to send Ctrl+C as a SIGINT to the client application
    • be able to send resizes to this console and receive them via some mechanism (another handle? in band with the input/output?)

Would these things satisfy most people? What am I missing, what could be improved, etc.
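
As a point of comparison for the last two bullets, here is a rough sketch of how resizes and Ctrl+C work on a POSIX pty, again in Python and again not the Windows API being designed. On POSIX the resize travels out of band (an ioctl on the master), while Ctrl+C travels in band (the byte 0x03 written to the master, which the terminal line discipline turns into SIGINT for the foreground process). The function names are hypothetical, and the `sh` and `sleep` commands are assumed to be available.

```python
import fcntl
import os
import pty
import signal
import struct
import termios


def set_size(fd, rows, cols):
    """Out-of-band resize: store the window size on the pty."""
    fcntl.ioctl(fd, termios.TIOCSWINSZ, struct.pack("HHHH", rows, cols, 0, 0))


def get_size(fd):
    """Read the window size back, as a client application would."""
    packed = fcntl.ioctl(fd, termios.TIOCGWINSZ, b"\0" * 8)
    rows, cols, _, _ = struct.unpack("HHHH", packed)
    return rows, cols


def interrupt_child():
    """Start a client on a new pty, send Ctrl+C in band, report how it died."""
    pid, master_fd = pty.fork()
    if pid == 0:
        # Child: the pty is now its controlling terminal.
        try:
            # Make sure SIGINT is not inherited as "ignored".
            signal.signal(signal.SIGINT, signal.SIG_DFL)
            os.execvp("sh", ["sh", "-c", "echo ready; exec sleep 30"])
        finally:
            os._exit(127)  # only reached if exec failed

    # Parent: wait until the client is running, then "press" Ctrl+C.
    seen = b""
    while b"ready" not in seen:
        seen += os.read(master_fd, 1024)
    os.write(master_fd, b"\x03")

    _, status = os.waitpid(pid, 0)
    os.close(master_fd)
    return os.WTERMSIG(status) if os.WIFSIGNALED(status) else None


master_fd, slave_fd = pty.openpty()
set_size(master_fd, 40, 120)
size_seen_by_client = get_size(slave_fd)

exit_signal = interrupt_child()
print(size_seen_by_client, exit_signal)
```

Whether a Windows equivalent should carry resizes in band or through a separate call is exactly the open question raised in the list above; the sketch only shows the POSIX precedent.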

@be5invis
Author

It would be good if we could provide a callback as a "layout function," i.e., the API would call this callback function to fill a string into the character matrix. It would be useful for multi-language support since the layout is a hard task.

@be5invis
Author

be5invis commented Feb 12, 2018

My requirements:

  1. Creating a headless console
  2. Read the char matrix -- including character and properties.
  3. Send user interactions (key and mouse) to the headless console.
  4. Custom layout function (maybe HRESULT layout(ConsoleManipulator *manipulator, LPCWSTR str, DWORD x, DWORD y, ......)) for advanced i18n.

@Eugeny

Eugeny commented Feb 12, 2018

@be5invis I think it's important to differentiate between a Pty, which is terminal/layout agnostic and only handles data and extra messages/signals and a terminal, which handles the visual layout.

@Maximus5

@zadjii-msft I think you described proper requirements.
All console API functions (like ReadConsoleInput) shall be able to work with these handles. And Read/WriteConsoleInput shall be able to process all known event types of course.

There is also some craziness with ReadFile called on a console handle. In some cases it behaves like ReadConsoleA (returns a line, processing Enter, arrows for history, Tab for completion...), but in other cases it just returns ANSI characters one by one. It's hardly possible to find documentation and requirements for the console modes ReadFile depends on...

@zadjii-msft
Member

@Maximus5 How would you feel if the input "API" was sane on how it was handled, but didn't support certain keystrokes?

For example, Ctrl-M, Ctrl-J and Enter. Do we really care about differentiating between Ctrl-M and Enter, or would it be okay to map Ctrl-M to Enter (As long as it's documented as such)? They're different keystrokes on the keyboard, sure, but for any *nix style terminals, we can't differentiate those keys.
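
To make the collision concrete: in the conventional ASCII/VT encoding, Ctrl plus a letter is sent as that letter's code with the upper bits cleared, so Ctrl-M becomes byte 0x0D, which is the same carriage return that Enter usually sends. A small Python sketch (the `ctrl` helper is only for illustration):

```python
def ctrl(key: str) -> bytes:
    """Byte conventionally sent for Ctrl+<key>: the ASCII code masked to 5 bits."""
    return bytes([ord(key.upper()) & 0x1F])


ENTER = b"\r"  # what a VT-style terminal typically sends for the Enter key

print(ctrl("M") == ENTER)    # Ctrl-M and Enter are indistinguishable on the wire
print(ctrl("J") == b"\n")    # Ctrl-J is line feed
print(ctrl("C") == b"\x03")  # Ctrl-C is ETX, which the tty maps to SIGINT
```

Once the input is reduced to a byte stream like this, the information about which physical key was pressed is gone, which is the trade-off being asked about.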

@Maximus5

@zadjii-msft If we are talking about future POSIX PTY API it's OK, because indeed in Unix many keystrokes are mapped to one sequence. Personally I consider this as lack of interface, but compatibility makes the rules.

But the API proposed in this topic is supposed to give us the ability to implement "third-party conhost" windows. And I believe it must not be constrained by either Unix or Windows limitations!

By Windows limitations I mean features like

  • true color support through ANSI (it's not possible to retrieve info from conhost)
  • ability to place cursor after the end of line without going to the next line
  • ability to understand where the console is "dirty" (was real output from APP even with spaces)

And so on...

If we (third-party devs) may create real console handles which may be used with native console API functions I believe we would be able to implement any emulation type - POSIX or Windows.
I don't want to lose ability to run native console applications...

@be5invis be5invis changed the title Official pty API for Conhost Official API to create a third-party console host (like pty in Unix) Feb 13, 2018
@be5invis
Author

Clarify the title.
What we need is a real abstraction layer, an API set to let us implement something like a console host, including third party consoles or something like tmux. It does not need to look exactly like Unix pty. Actually it should be stronger than Unix pty, since there is more black magic in Conhost.

@yatli

yatli commented Feb 13, 2018

@be5invis agree with the title change, that this should be a PTY-like layer. Additionally, I feel that we should separate these two things:

  1. implementing such a layer that 3rd party terminal can allocate real console handles (which is then used in the API surface of this layer)
  2. VT emulation, and PTY interface above this layer -- this should be just like how the console is used for WSL.

In this way there is no interference between existing windows console apps and the *nix things because the former leverages only 1) and the latter leverages 1) and 2).

@parkovski

A few thoughts on complications:

  • If we're supporting "legacy" Windows mode, which I agree is important, how does input get exposed? It seems like ReadFile isn't well-equipped to handle that mode. Do we have to switch to Read/WriteConsoleInput in win32 mode? Have a second handle for non-text and/or non-ANSI input, and if so is overlapped IO necessary to get input order correct? Do we map things that can be mapped (like colors) to ANSI escapes? What about things that can't? Do we make new escapes for things like ^M or maybe one escape that says "the next n bytes are an INPUT_RECORD"?
  • Do we support mapping between the two modes? E.g. when we get tmux on Windows, how do we handle tmux being in ANSI mode, communicating with a child process in Windows mode? This also comes into play with WSL interop - what happens if we get an interop handle and then try to change the mode?
  • What about signals other than SIGINT? Thinking mainly about suspend/resume right now which is already supported by conhost, but also later on with interop, how do we translate WSL signals?
  • We don't have terminfo, so we probably need to be really clear on how a conforming implementation behaves.

@zadjii-msft
Member

I do want to make a note: whatever we end up implementing is going to allow existing client applications to continue using the full Win32 console API, and ideally it's going to do so without forcing 3rd party terminals to implement the entire console API surface themselves. We'd love to hide the "black magic", and expose only a sane surface to implement on top of.

P.S. This discussion is my favorite of all of them, a lot of good thoughts in here.

@Tyriar
Member

Tyriar commented Feb 18, 2018

Just echoing @Eugeny's thoughts; this would both greatly simplify and improve the stability of the terminal within VS Code, or wherever node-pty/winpty is used. Currently there's no way forward for many of the VS Code terminal Windows issues due to the way it's implemented. A proper pty API (with legacy emulation on conhost's side) would have massive impact.

@parkovski

@mintty This would take the place of winpty or wslbridge - allows you to act as conhost without any hacks or workarounds. The console apps on the other end don't have to be modified, just the terminal app taking the place of conhost. Presumably CreatePseudoConsole will give you an input and output handle; pass those to a console process and whatever you write to the input it will see in its stdin, and when it writes to the output, you can interpret & display however you want.

@zadjii-msft
Member

Well, we don't have real, official documentation on it ready quite yet, but I'll share some details:

  • No, existing console apps do not have to be modified. They will run just as they always have, attached to a full-featured conhost who is acting as the console API server.
    • case in point: WSL's interop in insider builds uses an early version of this feature, and I use that as my daily driver. Being able to run cmd.exe inside of tmux is bliss
  • Terminal applications will be able to create a pseudoconsole by passing a read/write pair of handles to CreatePseudoConsole. This function will give you back a handle you can then use to attach a client commandline application to. Stay tuned for documentation on how to do this, but essentially, the terminal app is now capable of launching commandline apps attached to a pseudoconsole instead of a windowed conhost.
  • With the other side of the read/write handles supplied to the pseudoconsole, callers will be able to provide input to the commandline application and read the output of the terminal. These work as streams of utf-8 characters and embedded VT sequences.
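
To give a feel for what "a stream of UTF-8 characters and embedded VT sequences" means for a terminal author, here is a minimal Python sketch that separates printable text from CSI escape sequences (ESC `[` ... final byte). It is an illustration only: the function name `split_vt` is made up, it recognizes CSI sequences and nothing else (no OSC, no single-character escapes), and it assumes the whole stream is available at once rather than arriving in chunks that may split a sequence or a multi-byte character.

```python
import re

# CSI sequence: ESC [ , parameter bytes 0x30-0x3F, intermediate bytes
# 0x20-0x2F, then one final byte 0x40-0x7E.
CSI = re.compile(r"\x1b\[[0-?]*[ -/]*[@-~]")


def split_vt(data: bytes):
    """Split output bytes into ("text", ...) and ("csi", ...) pieces."""
    text = data.decode("utf-8", errors="replace")
    parts = []
    pos = 0
    for match in CSI.finditer(text):
        if match.start() > pos:
            parts.append(("text", text[pos:match.start()]))
        parts.append(("csi", match.group()))
        pos = match.end()
    if pos < len(text):
        parts.append(("text", text[pos:]))
    return parts


# "red" in red, attributes reset, then some non-ASCII text.
sample = "\x1b[31mred\x1b[0m caf\u00e9".encode("utf-8")
parts = split_vt(sample)
for kind, value in parts:
    print(kind, repr(value))
```

A real terminal would feed the "csi" pieces to a state machine that updates colors and cursor position, and render the "text" pieces into its own buffer.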

In conclusion:

  • @parkovski was almost exactly correct
  • yes we are hoping this might be able to replace wslbridge/winpty
  • it is backwards compatible with existing console apps
  • yes there will still be weirdness in RS5; hopefully with feedback from partners and the community we will be able to resolve any strange quirks you might see

@WSLUser
Contributor

WSLUser commented Jul 11, 2018

How much easier does this make it for native conhost to handle VT sequences properly and display UTF-8 chars (vice the legacy UTF)? It sounds like it would ensure far faster procurement of new features (such as emojis and Sixel support). If this is the case, then I would think terminal apps like Konsole (which I highly desire to be available on Windows) would be even more easily ported to Windows. Though that would be a bit strange to see a cmd.exe running inside Konsole (not that I'd complain). What would be funnier is if somebody ported iTerm2 to Windows using the API. Sounds like everything that's needed is in place after all (minus the critical bugs you're squashing).

@miniksa
Member

miniksa commented Jul 11, 2018

@DarthSpock, the latter half of your post there about being able to port other terminals that natively speak VT to Windows is something that should be very possible here when we round all the sharp edges off the feature.

As for conhost.exe itself displaying new things.... that's not super changing here. It's technically capable internally now of holding a lot more stuff within the Unicode planes than it was before to help enable this functionality (big points to @adiviness for working on this part!), but we haven't been able to make conhost display all of them correctly on its own surface yet. We need a renderer overhaul for that.... which is on our backlog. That's when you'd see emojis for sure and maybe... MAYBE sixels... show up on conhost's window itself.

Our major focus here has been to engineer a platform that is capable of servicing the content needed by others to "make terminals great again" on Windows. We're the only ones who can change the OS to enable others to succeed in this way, so that's been our focus this release. Making our inbox terminal surface great in conhost is something we'd love to do too.... but there are a lot of great terminals out there and we have limited resources. So we're trying to unblock the world first in a way that only we can and we'll assuredly turn our eyes back to our own terminal surface again in the future.

There's a pending blog post from us to explain what we've done here for this release that should include some of these details and what it is capable of and what it is not capable of. @bitcrazed already has a draft of the post in our inboxes for review. We're currently sassing him about the diagrams and some of the wording, so it will take a little more time before it rolls out.

I recommend you watch https://blogs.msdn.microsoft.com/commandline/ for our posts on the Windows Command-Line to find out more. Rich has really started to get the flywheel moving on pushing blog posts out. The devs are also chomping at the bit to start sharing some more of our workings with you via that platform as soon as we get the boilerplate series of articles out.

And I promise you we'll get to filling out docs for this at https://docs.microsoft.com/en-us/windows/console/ in the next few months. We're just at the peak of bug bashing time right now. When bugs slow down, we will spend more time on writing up the detailed documentation. We wish we had more hours in every day! Thanks for your patience and interest!

@zadjii-msft
Member

@DarthSpock

Though that would be a bit strange to see a cmd.exe running inside Konsole

xterm close enough for you?
image

No? Maybe you want to set your default shell in gnome-terminal to cmd.exe?
image

@WSLUser
Contributor

WSLUser commented Jul 11, 2018

That's totally understandable and I think it's great what you've done! I'm just thinking there will be environments where even with this API, ported terminals may not be available (authorized) for download so if it can be leveraged for native conhost, that would be great news as well.

I always keep an eye on the commandline blog for new posts and have been following along with the console series (which is great btw). Looking forward to seeing the docs updated. Enjoy the bug bashing!

@WSLUser
Contributor

WSLUser commented Jul 11, 2018

@zadjii-msft Nice! Definitely strange to see but not in a bad way.

@bitcrazed
Contributor

I have two new posts in this series in progress as I type :) The first, discussing Console Internals, is nearing completion. The next talks about PTYs. Stand by ;)

@Tyriar
Member

Tyriar commented Aug 15, 2018

The blog post announcing "ConPTY" went out 🎉 https://blogs.msdn.microsoft.com/commandline/2018/08/02/windows-command-line-introducing-the-windows-pseudo-console-conpty/

@miniksa miniksa added the Resolution-Fix-Available It's available in an Insiders build or a release label Aug 15, 2018
@bitcrazed
Contributor

Thanks @Tyriar - AT LAST, right? ;)

And with the public announcement of ConPTY API in current Insiders builds, and coming to this fall's release of Windows 10, I am closing this issue.

Thanks for your patience everyone ;)

str4d added a commit to str4d/zcash that referenced this issue Aug 17, 2018
Ctrl+C is not configured for Windows, as it does not work (yet):
microsoft/vscode#9347
microsoft/terminal#57