Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support renderer fast presents #7147

Closed
mrange opened this issue Aug 1, 2020 · 12 comments
Closed

Support renderer fast presents #7147

mrange opened this issue Aug 1, 2020 · 12 comments
Labels
Area-Performance Performance-related issue Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. Needs-Tag-Fix Doesn't match tag requirements Product-Terminal The new Windows Terminal.

Comments

@mrange
Copy link
Contributor

mrange commented Aug 1, 2020

Description of the new feature/enhancement

The current renderer has a complex workflow in order to render a frame. If one are looking into providing animated backgrounds or terminal effects that require high FPS rates the process is too slow especially since terminal effects disables partial (faster) repaints.

In addition; in DxRenderer.cpp there is lot of copying back-n-forth between forward and back buffer to enable partial updates and terminal effects.

This proposal is to:

  1. Allow fast presents (60+ FPS) for renderers where it makes sense.
  2. Enable partial updates even if terminal effects are enabled.
  3. Simplify and speed up DxRenderer.

I realize this is probably not high on the list of "need to" stuff but if the retro terminal effects is a thing I think it makes sense to make it render fast enough to support other cool scenarios.

Proposed technical implementation details (optional)

Proposed design

@mrange mrange added the Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. label Aug 1, 2020
@ghost ghost added Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting Needs-Tag-Fix Doesn't match tag requirements labels Aug 1, 2020
mrange added a commit to mrange/terminal that referenced this issue Aug 1, 2020
@zadjii-msft
Copy link
Member

Huh, I like the idea. I know that we've been tossing around ideas for making the retro effect work with differential drawing for a bit now, but this might be cleaner. I'm gonna summon @miniksa to get his thoughts on your spec (and fork), since he's our rendering expert

@zadjii-msft zadjii-msft added Area-Performance Performance-related issue Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Product-Terminal The new Windows Terminal. labels Aug 3, 2020
@ghost ghost removed the Needs-Tag-Fix Doesn't match tag requirements label Aug 3, 2020
@zadjii-msft zadjii-msft removed the Needs-Triage It's a new issue that the core contributor team needs to triage at the next triage meeting label Aug 12, 2020
@mrange
Copy link
Contributor Author

mrange commented Aug 16, 2020

Hi. Anyone had a chance to look at the proposal?

@miniksa
Copy link
Member

miniksa commented Aug 17, 2020

I'll read this today. Sorry for the delay.

@miniksa
Copy link
Member

miniksa commented Aug 17, 2020

I like the initial proposal.

I'm not sure where the assertion that the existing slow workflow is too slow for 60fps comes from. Is that specifically in the context of when a shader is applied? I don't think I've seen the partial rendering stalling us below 60fps on my machine for simple text...

I don't mind the texture idea. It's not a problem if we use MORE of DX11 than before. The whole point of moving to DX was that we'd use more and more of it over time anyway. Also, having a sort of backing buffer that is drawn to and then composing it on the front is fine. I think I did something like that in the GDI renderer.

Regarding the proposed impl:

  1. I definitely don't want a full invalidate on scroll. That's a big deal for outputting big.txt.
  2. Pixelshader always applied. I agree with your theoretical proposal to skip the shader route completely if no shader is applied. NOP shader is a creative idea, but I'd rather save all the memory overhead of not instantiating those things if no one is using them.

For the fast present impl:
A. For the renderers that don't support it, we generally return S_FALSE not S_OK. S_FALSE is still a SUCCEEDED() value, but it says that nothing was done or nothing needed to be done.
B. I feel like in lieu of using the frame limiting number for the FastPresent function if this shouldn't just be tied up with the WaitUntilCanRender method which would allow a FastPresent to be called every time the Waitable swap chain says it's ready until another the rest of the terminal is ready for a repaint and regular Present. Additionally... I wonder if we can't just have FastPresent and Present be the same thing with each renderer realizing it either can re-present the previous frame or that it should just S_FALSE until someone paints again. That saves adding another interface method and might simplify the thread.cpp loop/conditions.

Finally, we should continue to use Present1() where possible. There are significant performance advantages realized in the rest of the system rendering pipeline (specifically DirectComposition, the DWM, and any aware application that can use the additional metadata to do partial presentation such as Remote Desktop) by passing that data. I don't mind using Present for shaders enabled, but the common high-performance case must continue to use Present1.

@mrange
Copy link
Contributor Author

mrange commented Aug 17, 2020

  1. my intention was to restore scroll partial invalidate but it was too much work for the prototype.
  2. Yes, the initial idea was the NOP shader but I am back tracking from that idea in favour of CopySubResource or something similar.

WRT to Present1. I don't think Present1 is compatible with full screen shader effects but it could be possible to retain the current workflow and then have a separate workflow when shaders enabled. However, the drawback would be increased code-complexity.

With fullscreen refresh needed for the current implementation of how shaders are enabled it feels that repainting the full frame and then apply a destructive shader effect on the full frame feels like it unnecessary wasteful compared to just applying a pixel shader on a pre existing texture.

With that said I only have Debug built on my machine (disk space issues) and Debug is kind of sluggish at least for me.

Retaining the existing workflow and having a separate flow for pixel shaders is a significant change and it will take me some time to produce a working prototype.

@miniksa
Copy link
Member

miniksa commented Aug 18, 2020

It's okay if Present1 isn't compatible with full screen shader effects. This is my rationale:

  1. When using Remote Desktop with a Terminal, you're not likely to be using effects because you are sensitive to latency and bandwidth needs. Applications that use Present1 can be further compressed by Remote Desktop as the composition system is only notified of changes, therefore Remote Desktop only needs to carry those deltas to the remote machine. If we only used Present, Remote Desktop wouldn't get those nice notifications and would have to carry through a lot more data.
  2. When using effects, you're likely using a beefy machine locally and you value it looking pretty/flashy over raw performance either text-to-screen or over a network. If composition has to do more work at some level because it's given less information with Present over Present1, it's localized between a strong CPU, GPU, and large capacity bus bandwidth. You also don't have to contend with the bandwidth limitations of a remote scenario.

So having it bifurcated to support both user personas is likely what we need.

Also, we do not consider a Debug build as a valid measure of performance characteristics of the Terminal. There are a TON of debug-only logging, tracing, and guard mechanisms that are stripped out on Release. Additionally, the optimizer does quite a bit of work on our codebase that just isn't present in Debug builds. I would very much recommend that you use Release for performance profiling and Debug just for diagnostics, debugging, and trace purposes.

Finally, I agree with you. The destructiveness of the shader effect isn't something I was ever in favor of, but it was a code contribution from outside our core team. And I wasn't willing to hold up going from nothing to something over those details at the time. This is much of why we classified the feature in the experimental namespace: because we knew it would have to be further refined by someone at some future point. And if that someone is YOU and the time is NOW, that's perfectly okay with us!

@mrange
Copy link
Contributor Author

mrange commented Aug 18, 2020

I aware that DEBUG in C++ is very sluggish compared to RELEASE and I agree it's not a good measurement for good reasons. However, as this is at a conceptual stage currently I haven't bothered to do proper measurement. I guess I shouldn't have mentioned the performance reason without facts.

@mrange
Copy link
Contributor Author

mrange commented Aug 18, 2020

When it comes to Present1 the remote desktop is a good argument and a good test case where I can see the existing code flow being beneficial in terms over the approach in this proposal.

@zadjii-msft
Copy link
Member

We don't really have an issue open already for "re-enable differential drawing when shaders are enabled", do we? I suppose this is the closest one.

I probably should have read this thread closer before I tried messing with the shaders yesterday (dev/migrie/dol/messing-with-shaders-take-1). Whoops. Well, I guess I know what I'll be looking into next month 😉

zadjii-msft added a commit that referenced this issue Jul 27, 2021
  As mentioned in #7147, there's a better way of doing shader presenting.

  This seems about right for when the shader effect is enabled. It's hard to tell if this is actually working well or not though. Maybe need to try another shader.

  When the shader is disabled, this absolutely doesn't work.
@zadjii-msft
Copy link
Member

I'm trying this out this week, and it's really close to working. I'm getting caught up in the scrolling bit though - I definitely don't want to do the InvalidateAll thing - that's the whole point of the optimization here, to not do that.

@mrange do you have any pointers for how to do the

my intention was to restore scroll partial invalidate but it was too much work for the prototype.

thing you mentioned? Any pointers would be immensely helpful 😄

My WIP branch is dev/migrie/fhl-2021/differential-pixel-shading, which is branched from dev/migrie/fhl-2021/more-shader-variables

@zadjii-msft
Copy link
Member

Nevermind, I figured it out. I made two incredibly embarrassing mistakes

  • I forgot to pass some ComPtrs by ref into a function. Instead I was copy-constructing new ones, then creating objects into those copies, then assuming the value would persist to the caller. 🤦
  • I actually wasn't swapping the variables correctly. Thanks std::swap 🤦

That's what I get for trying to write code with a head cold. It works now, and works well. I'm pretty confident you could leave shaders on all the time and not feel the pain with these changes. It even works acceptably to have cmatrix running with retro effects full screen on my slaptop (bless it's little heart)

So now dev/migrie/fhl-2021/differential-pixel-shading looks a lot better. I gotta untie those branches but #shipit

@lhecker
Copy link
Member

lhecker commented Sep 1, 2022

With #13885 in place, our new text renderer can handle this situation now. Since shading on the GPU is fast, it simply redraws the entire screen on every frame. Sounds slow? With the retro shader enabled rendering my entire 4K monitor worth of content 144 times per second takes up about 2-5% of my GPU load.
It'll ship, enabled by default, in Windows Terminal Preview 1.16 sometime soon. 🙂

@lhecker lhecker closed this as completed Sep 1, 2022
@ghost ghost added the Needs-Tag-Fix Doesn't match tag requirements label Sep 1, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Area-Performance Performance-related issue Area-Rendering Text rendering, emoji, complex glyph & font-fallback issues Issue-Feature Complex enough to require an in depth planning process and actual budgeted, scheduled work. Needs-Tag-Fix Doesn't match tag requirements Product-Terminal The new Windows Terminal.
Projects
None yet
Development

No branches or pull requests

4 participants