Integrate Godot Super Scaling (Viewport Render Scaling) Shader Into Godot #3825

Closed
cybereality opened this issue Jan 19, 2022 · 35 comments

@cybereality

Describe the project you are working on

Godot Super Scaling is a render scaling add-on that uses an upscaling and supersampling algorithm to allow games to scale to arbitrary resolutions and either gain performance or increase picture quality beyond native resolution. You could compare it to AMD FSR, though I think my algorithm is quite competitive in terms of quality and performance. It is a different look, but as an artistic choice, some users may prefer it to FSR.

This would be applicable to any 3D game (it does work in 2D as well, but the usefulness is low) and can potentially allow games to gain 200% or up to 300% of native performance (at the lowest setting). At more moderate settings, you can gain between 25% and 50% over native performance with almost no loss in picture quality. The feature was used in both the Decay and Ella demos and tested with thousands of users, and not a single person has reported a problem.

To see the kind of quality and performance that can be achieved look at the below screenshots from Decay (taken using an Nvidia RTX 3060 graphics card).

Native 4K / 47 fps
Decay_Art_4K

1080p upscaled to 4k / 154 fps
Decay_Art_1080p

Native 4K / 44 fps
Decay_Back_4K

1080p upscaled to 4K / 145 fps
Decay_Back_1080p

Native 1080p / 209 fps
Decay_Front_1080p

540p upscaled to 1080p / 497 fps
Decay_Front_540p

Native 1080p / 204 fps
Decay_Mail_1080p

540p upscaled to 1080p / 475 fps
Decay_Mail_540p

Note that this is at the lowest acceptable render scale, 50%, which is one quarter of the pixels. At less extreme scaling, in the 80% to 90% range, the image (while slightly less defined) still looks very good and the performance increase far outweighs the image quality loss. If you want to see exactly how this looks in real time, please download the Decay demo.
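
As a rough illustration of the arithmetic above (not part of the add-on): the render scale applies to both axes, so the pixel count falls with its square. The function names below are made up for this sketch, and the fps pairs are simply the figures quoted with the Decay screenshots.

```python
def pixel_fraction(render_scale):
    """Fraction of native pixels rendered when both axes use render_scale."""
    return render_scale * render_scale


def scaled_resolution(width, height, render_scale):
    """Internal render resolution for a given output size and render scale."""
    return round(width * render_scale), round(height * render_scale)


def speedup(native_fps, scaled_fps):
    """Ratio of the upscaled frame rate to the native frame rate."""
    return scaled_fps / native_fps


# (native fps, upscaled fps) pairs quoted with the Decay screenshots.
measurements = {
    "Decay_Art, 1080p to 4K": (47, 154),
    "Decay_Back, 1080p to 4K": (44, 145),
    "Decay_Front, 540p to 1080p": (209, 497),
    "Decay_Mail, 540p to 1080p": (204, 475),
}

for scale in (0.5, 0.8, 0.9):
    w, h = scaled_resolution(3840, 2160, scale)
    print(f"{scale:.0%} of 4K -> {w}x{h}, {pixel_fraction(scale):.0%} of the pixels")

for name, (native, scaled) in measurements.items():
    print(f"{name}: {speedup(native, scaled):.2f}x")
```

The measured speedups are smaller than the 4x reduction in pixels, which is what you would expect when part of the frame cost does not depend on resolution.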

https://cybereality.itch.io/decay-a-real-time-experience

Describe the problem or limitation you are having in your project

Performance was an issue as I was working on highly realistic real time 3D demos. To get performance to scale to older GPUs and some laptops or integrated graphics, a good render scaling algorithm was necessary, so I developed this myself to meet the needs of my work.

Describe the feature / enhancement and how it helps to overcome the problem or limitation

This scaling shader was crucial in allowing people with older GPUs to view my work, and I don't think there would have been any other way to get the demos functional on older computers while still maintaining a high visual quality.

You can watch this (rather long) video describing how it works.
https://www.youtube.com/watch?v=B70VOP80EHA

Describe how your proposal will work, with code, pseudo-code, mock-ups, and/or diagrams

I have not looked at the Godot source code in relation to this add-on yet, as I wanted to make sure this proposal is accepted before investing any more time. The shader code is already open source under the MIT license on GitHub. You can take a look if you want. It is very small and elegant compared to something like FSR.

https://github.com/cybereality/godot-super-scaling

I can do the integration myself; I don't believe it would be difficult. I'll just have to get up to speed with the Godot source code, but I have extensive experience in C++ and I don't think it would take more than a week to complete.

If this enhancement will not be used often, can it be worked around with a few lines of script?

This enhancement could be used in potentially any 3D game of any genre or type. The performance improvements are substantial and while you can do something similar yourself in GDScript, it is a lot more than a few lines of code and the vast majority of users could not figure this out on their own.

Is there a reason why this should be core and not an add-on in the asset library?

So the add-on is already in the Asset Library and available on Github for manual install.
https://godotengine.org/asset-library/asset/1141

However, the setup can be complex depending on your game. It is fairly easy to do in a new project, or with a small test. But if you have a large game deep in development, adding this at the end may be too difficult or error prone. In addition, many users do not know about it, so having it integrated into Godot would allow more users to benefit that would otherwise not have the capability.

In addition, the way I am doing it in the add-on only works at run time. If it were integrated into the engine, it could potentially also work in the editor viewport, allowing users on slower or older computers to create more advanced and realistic graphics that would otherwise not be possible on say an old laptop or integrated graphics.

Finally, this solution works right now in Godot 3.4, and I tested as far back as 3.3 and it still works. So unlike something like FSR, there are no requirements for advanced new API features; it works fine in Godot 3 across all platforms in both OpenGL ES2 and ES3. So while it is up for debate whether my shader is better than FSR (I think it is), Godot Super Scaling can give users who cannot upgrade to Godot 4.0 (for whatever reason) a viable solution.

@clayjohn
Member

I'm very interested in this for Godot 3.x. Godot 4.0 already uses AMD FSR for high quality upscaling and bilinear for fast but poor quality upscaling. I'm not sure that Godot Super Scaling is able to match the quality/performance of AMD's FSR. That being said, it looks like a great option for lower end devices that can't support some of the advanced features needed to run FSR.

@mrjustaguy

mrjustaguy commented Jan 19, 2022

I've attempted a comparison between the two, but I was forced to use Godot 3.4.2 and a pre-alpha 4.0, as I haven't yet been able to port the GDScript to 4.0 (the viewport hooking part has been an issue). From my testing, the performance drop is very similar. In terms of quality I'd say GSS actually beats FSR at lower resolutions (540p upscaled to 1080p), but I'd have to get it working in 4.0 to be able to give good performance metrics and a good quality comparison for both.

Also, considering its very small amount of code compared to FSR (roughly 200 lines for the GDScript and under 200 for the shader), I'd say it's simpler than FSR and worth a shot at porting, to at least compare the two directly.

Also, I don't see why we shouldn't have and expose several upscaling methods. In the future Intel's XeSS will be here and should be open source, possibly under a license compatible with Godot, and it also runs on most hardware (albeit more modern hardware than FSR needs, as its fallback path uses instructions that are only available on RDNA2, Turing and later).

@cybereality
Author

cybereality commented Jan 19, 2022

Thanks. Yes, FSR can look sharper and does look better at higher resolution (say at 90% scale), because they are doing a smart sharpening pass. In my solution I opt to do super sampling, which avoids the pixelation but results in a slightly blurry image. It wouldn't be hard to add sharpening, but the issue I was getting into is that my algorithm just runs based on the single viewport texture (the raw render) and I could not figure out how to pass the result of one full screen iteration to another shader. If I could do that, then I think I could get similar or better quality to FSR.

@Calinou
Member

Calinou commented Jan 19, 2022

I feel FSR (and FSR Lite) are more suited to be used in core. If you want built-in support for nearest-neighbor resolution scaling, it's planned too 🙂
Temporal antialiasing could also be used as a temporal upscaling solution. With a good TAAU implementation, a render scale of 70% will look almost as good as 100% in (mostly) still scenes.

It is possible to implement FSR purely as a fragment shader (including FSR Lite), so it could be made to work in 3.x's GLES3 renderer too. That said, we may want to only implement FSR Lite in 3.x given the overall lower-end focus and implementation simplicity.

@ArseniyMirniy

ArseniyMirniy commented Jan 20, 2022

Is there a way to make it slightly sharper? Seems like FSR is gonna be in the core and provides slightly better results.

@Calinou
Member

Calinou commented Jan 20, 2022

> Is there a way to make it slightly sharper? Seems like FSR is gonna be in the core and provides slightly better results.

FSR is already implemented in core in master, but its sharpening filter currently has no effect due to a bug.

In 3.x's GLES3 renderer, you can also use contrast-adaptive sharpening (Sharpen Intensity in the Project Settings). CAS could be added separately from FSR in master, but it has an added maintenance cost that may not be worth it (as we can just reuse FSR's RCAS).

@cybereality
Author

Yes, I can update it. I still have to fix the multi-pass rendering and I can add a sharpening filter. It will probably take a few days to a week, but I believe I can get similar quality to FSR if I do that.

@ghost

ghost commented Jan 20, 2022

@Calinou FSR will never get backported to 3.x therefore I would say this feature could be implemented for 3.x as core for GLES3 only to parallel FSR feature in master. It seems pretty trivial, as it is simply a shader and a few project settings options. It also doesn't break compat.

Also this is incredible! It looks great, and I can see this potentially improving performance on low-end systems by using a smaller surface for rendering.

@Calinou
Member

Calinou commented Jan 20, 2022

> @Calinou FSR will never get backported to 3.x therefore I would say this feature could be implemented for 3.x as core for GLES3 only to parallel FSR feature in master.

I'm still not convinced with this super-scaling shader. To me, it looks like a mix of nearest-neighbor and linear scaling. It doesn't seem to do much better than either, especially at 50% resolution scale where you can just use nearest-neighbor scaling without artifacts. (Lanczos scaling can also be implemented if you want better-than-linear scaling.)

If you're looking for a project setting to reduce the 3D rendering resolution without affecting the 2D rendering resolution in a given viewport, this should be implemented separately. (This is already available in master, even without FSR.)
Such a project setting won't be too difficult to backport to 3.x's GLES3 renderer, but it will be much more challenging in GLES2:

> Retrofitting this to GLES2 in Godot 3 will be challenging simply because we're rendering everything into one buffer there and we'd almost be rewriting it to the same approach as in the GLES3 renderer.
> It should in theory be easy enough to apply to the GLES3 renderer but there are too many mobile unfriendly techniques applied there. Wonder if its worth the trouble though I guess Godot 4 is still some months away before its viable...

godotengine/godot#51870 (comment)

@ghost

ghost commented Jan 20, 2022

@Calinou That's a fair assessment. It would require a bit of testing. I will implement this locally in my custom Godot build and try it with my game. Since it is technically a pixelated, retro-looking 3D game, it should make no difference in quality. I'm mostly looking for artifacts and performance gain. I will report back here if I get the chance to use this.

If it does produce the claimed performance benefits, I would say it is still worth considering for the 3.x branch, which is not well optimized for lower-end systems.

The problem, as far as I understand it, is that the renderer is not very efficient, making the CPU wait for the GPU to render and causing massive slowdowns on slower systems.

Based on my understanding, this would in fact draw on a smaller surface (for example 50% of native resolution) and upscale the result to produce the illusion that the game is higher resolution than it actually is. This would mean less work for the GPU, making the CPU wait less and potentially increasing performance.

@cybereality
Author

So I was able to get multi-pass rendering working and adjusted the super sampling parameters. It now looks noticeably more clear and higher quality at all resolution scales, but especially helps at low res. Here are some shots of the latest version with 540p -> 1080p 50% scaling. I will update Github later today so you all can check it out.

Decay_GSS_1080p

Decay_GSS_540p

You can see on the left wall, the floor, the cafe window, and the back door that the quality loss is not that bad at all, considering this is rendering 1/4 the number of pixels. On the phone booth, the text and graffiti suffer, as there is obviously not enough information at 540p, and so does the SSR reflection on the metal walkway. Even so, we are getting over 50% increase in performance, which would be much needed for older laptops and integrated graphics.

@cybereality
Author

I pushed the update live on GitHub. @Calinou I would recommend testing it yourself on one of your projects. The quality is substantially better than nearest neighbor or bilinear, and IMO beats FSR. But you can be the judge.
https://github.com/cybereality/godot-super-scaling/releases/tag/v1.1.0

@cybereality
Author

And so you can see what it does on the high end. Here is Ella running at 4K native ultra settings getting 25 fps. With Godot Super Scaling at 50% (1080p upscaled) the performance almost triples to 70 fps and it still looks great.

Ella_GSS_New_1

Ella_GSS_New_2

But you don't have to take my word for it: the patch is live now, and you can download it yourself.
https://cybereality.itch.io/ella-a-study-in-realism

@cybereality
Author

I found a somewhat serious bug that was causing some blurring at native resolution. I should have tested more content before pushing, but my demos looked good. In any case, this fixes the sampling radius so that native (or close to native) render scales are not blurred.
https://github.com/cybereality/godot-super-scaling/releases/tag/v1.1.1

The only downside is that smaller scales (such as 50%) have more aliasing, because the upscaling is only really handling inner surfaces (like faces, textures on floors or walls, etc.). So at 540p, the edges of polygons will look exactly like a 540p render with no filtering. However, if you enable MSAA and/or FXAA it can help, and still look overall better than before and it is clear (not blurry like bilinear filtering).

@cybereality
Author

Actually, you might be right @Calinou . I did some direct comparisons between the super scaling and nearest and linear, and the difference is not as huge as I thought. There was a big difference in the beginning when I was developing it, but I have rewritten things and tweaked a lot since then and haven't done a direct head-to-head in a while. It for sure looks better than nearest, but there is only a slight improvement from linear. I still do think it looks better, but it is a very small difference and maybe not worth the effort. Anyhow, I will post the shots here for historical purposes.

Ella_GSS_540p_Nearest

Ella_GSS_540p_Linear

Ella_GSS_540p_Super

@ghost

ghost commented Jan 21, 2022

I'm most interested in increasing performance for lower end systems with acceptable quality loss as long as quality is better than just doing a regular scale up.

The output seems too blurry though. There needs to be more sharpness in the output, even if it's exaggerated.

When I do AI upscaling it actually helps to scale down a bit, sharpen to remove all the noise but keep the most important details before scaling up. Maybe smoothing then sharpening on native resolution before upscaling could produce better results?

@cybereality
Author

Yes, I could add sharpening, but this I believe is already there via CAS and FSR. My original proposal was with the understanding that AMD FSR would not be possible in Godot 3.x, as I thought it required Vulkan/DX12. If that is not the case, and there is a way to back port it to Godot 3.x, then I think that would be the best way forward.

@ghost

ghost commented Jan 21, 2022

I don't think FSR requires Vulkan. It is just a compute shader. Compute shaders have been available since OpenGL 4.3 and GLES 3.1, I think, but I highly doubt anyone will ever implement compute shaders for 3.x. The Vulkan requirement is because it was not tested on OpenGL, and possibly more of a "branding" thing to create demand around Godot 4/Vulkan. RetroArch implemented FSR in OpenGL and it works unmodified. The feature seems to be surrounded by an aura of mystique, with everyone saying different things. This article explains better what it is and isn't: https://jntesteves.github.io/shadesofnoice/graphics/shaders/upscaling/2021/09/11/amd-fsr-demystified.html. It is good, but it is not DLSS. Also see https://twitter.com/libretro/status/1433511745641922572

If you read the article, it mentions you can in fact implement FSR as a fragment or vertex shader with some modifications. It primarily does a lot of math and then outputs a fragment color, and that is pretty much it.

Anyway, I was trying to test your super scaler, but I ran into some issues. It seems I need to make extensive modifications to fit it in. I'll have a look at it tomorrow. I also discovered a few other issues: my Remote is no longer working, lol. I need to look into it.

@mrjustaguy

The benefit of GSS is the fact you can pick a pixel smoothness level, while normally you get one or the other (nearest or linear scaling) but no values in between.

@cybereality
Author

Actually, there was a major bug with the last patch I pushed, basically eliminating the upscaling. While that patch looked more clear, and did appear better at native (or higher than native) it ended up looking much worse at sub-native resolution. I spent the day fixing it, and did get it working, however I realized the multi pass method is not a good solution. Though it did look nice in the end, the performance hit was too great and essentially worse than just not scaling at all. I won't get into the details, but using multiple viewports has disadvantages, and I don't think it will work with this algorithm, at least not without major changes.

In any case, I reverted back to the single pass method, but added back in all the tweaks from the last 2 days, so it looks a lot better and more clear. This is definitely the best version of the shader, and performance is at the original level (which is 200% to 300% at the lowest setting). I've taken some screenshots of Decay doing 1080p to 4K upscaling, since it will be easier to see what is happening. Note, these are on the lowest settings with everything disabled (no anti-aliasing or post-processing) so it will be more clear what the algorithm is doing. If you want to see how the graphics look with all the effects, download the demo, it has been updated on Itch.

Decay_GSS_Fixed_Nearest

Decay_GSS_Fixed_Linear

Decay_GSS_Fixed_Super_1080p

Decay_GSS_Fixed_Super_4K

@cybereality
Author

@filipworksdev Yes, I am very familiar with FSR. I have been testing it on my PC since day one, and it works pretty well. Obviously not as good as DLSS, but for a fast spatial algorithm it is very good. Originally I was looking into porting it to Godot 3.x and read through the source code but it is very long and convoluted. I'm sure I could have got it working in a few days, but I decided to just write my own algorithm that would be smaller and easier to understand. FSR does have a more clear image, and can generate detail, which works well at higher resolutions (for example 1080p to 1440p or 1440p to 4K).

However, in my testing I found that it does not do a good job with large jumps, such as 1080p to 4K and is unacceptably bad at or below 720p. So it is of most use for high end PCs, people that have a 4K monitor, for example, to run the game at 1440p, which looks great. However, the vast majority of the market (especially people that buy indie games) have 1080p monitors and probably a GTX 1060 / RX 580 or worse. Many are on older laptops or integrated graphics. And if you are on a 1080p monitor and use FSR at 720p it looks pretty bad, you can forget going from 540p to 1080p.

I think I have some ideas to make the GSS setup easier, because I realize it can be involved with a larger project. There still has to be some setup, and I'm not sure it can be automatic, but I will look into it in the next few days.

@mrjustaguy Yes, the smoothness slider is an advantage. This is because there are essentially two shaders running. An upscaler and a super sampler (those are the two sliders). They are both always running, even at native res, so you can move them without additional performance cost. I did this because the upscaler alone was too pixelated but super sampling only works well at higher than native resolution. However, by mixing them, you can get a more pleasing image. When running at lower than native res, the super sampler is essentially similar to traditional scaling (e.g. linear) so it is just there to soften the image. At 100% smoothness it would be very similar to just using linear interpolation, though I am using a rotated box sampling pattern, so it's still a little more advanced. In general, though, the best results will be at between 25% and 75% smoothness, though I left the full slider in for artistic taste.
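
To make the idea of the smoothness slider concrete, here is a deliberately simplified sketch: it mixes a nearest-neighbor sample with a bilinear sample using a single weight. This is not the GSS shader (which uses a rotated box sampling pattern and its own upscaler); the function names, the grayscale list-of-lists image, and the choice of nearest and bilinear as the two endpoints are all assumptions made for illustration.

```python
import math


def sample_nearest(img, u, v):
    """Nearest-neighbor lookup at normalized coordinates (u, v) in [0, 1)."""
    h, w = len(img), len(img[0])
    x = min(int(u * w), w - 1)
    y = min(int(v * h), h - 1)
    return img[y][x]


def sample_bilinear(img, u, v):
    """Bilinear lookup with texel centers at (i + 0.5) / size, clamped at edges."""
    h, w = len(img), len(img[0])
    fx, fy = u * w - 0.5, v * h - 0.5
    x0, y0 = math.floor(fx), math.floor(fy)
    tx, ty = fx - x0, fy - y0

    def texel(x, y):
        return img[min(max(y, 0), h - 1)][min(max(x, 0), w - 1)]

    top = texel(x0, y0) * (1 - tx) + texel(x0 + 1, y0) * tx
    bottom = texel(x0, y0 + 1) * (1 - tx) + texel(x0 + 1, y0 + 1) * tx
    return top * (1 - ty) + bottom * ty


def upscale(img, out_w, out_h, smoothness):
    """Upscale img, blending sharp (0.0) and smooth (1.0) sampling per pixel."""
    result = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            u, v = (x + 0.5) / out_w, (y + 0.5) / out_h
            sharp = sample_nearest(img, u, v)
            smooth = sample_bilinear(img, u, v)
            row.append((1 - smoothness) * sharp + smoothness * smooth)
        result.append(row)
    return result


# A hard vertical edge, upscaled 2x horizontally at three smoothness levels.
edge = [[0.0, 1.0], [0.0, 1.0]]
for s in (0.0, 0.5, 1.0):
    print(s, upscale(edge, 4, 2, s)[0])
```

Because both samples are computed for every pixel regardless of the weight, moving the slider does not change the cost, which matches the behavior described above.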

@Calinou
Member

Calinou commented Jan 21, 2022

> In any case, I reverted back to the single pass method, but added back in all the tweaks from the last 2 days, so it looks a lot better and more clear. This is definitely the best version of the shader, and performance is at the original level (which is 200% to 300% at the lowest setting).

GSS seems to look much better now 🙂

It now looks pretty close to what Lanczos would look like, which seems more logical to me.

@ghost

ghost commented Jan 21, 2022

Looks much better in the last example. GSS has a bit more detail and sharper edges. Really sharp edges do look a bit lower res though, but perhaps FXAA antialiasing can help smooth out the jaggies.

@ArseniyMirniy

@cybereality very impressive upgrade!

@cybereality
Author

@filipworksdev Yes, because the upscaling only really deals with inner surfaces (it tries to blend similar colors to avoid blurring separate objects). This means it does almost nothing for AA, and it looks best with 4x MSAA. At higher resolutions FXAA can help, but at lower resolutions (like 75% or below) you actually want to disable FXAA because it ruins the image clarity. MSAA still works at all scales.

@cybereality
Author

I've pushed a new release for GSS. This doesn't change the visuals, but makes installation a lot easier. You don't need any scene tree changes, so this can be added without extensive modification. However, some code changes are still needed when using get_node() and absolute paths; there are more details on my page.

https://github.com/cybereality/godot-super-scaling/releases/tag/v1.2.0

Decay_GSS_4K

Decay_GSS_1080p

Probably next week I will attempt to port this to Godot 4.0, so we can do a direct comparison against FSR and see where it stands. My feeling is that FSR will be better at moderate scales, but my solution may look better at lower scales like 50%. However it would be good to quantify this.

@cybereality
Author

Here's a video of the latest version.

https://www.youtube.com/watch?v=AW0A-G5mMcw

I am thinking about porting this to GDNative to see if that gives me enough control. I want to get that working first, and then I will look into Godot 4.0, as I think as a native plug-in I can avoid the kind of hacked setup I have now.

@ArseniyMirniy

By the way, 4.0 alpha is very nice in comparison to nightly builds. Check it out!

@MossFrog

MossFrog commented Feb 3, 2022

This can probably also be achieved by rendering everything in a viewport with double the resolution and then streaming it to a viewport container the same resolution as the display window. But this should be an option in the project settings such as render scale (for example 200% being double the render resolution downscaled to the window resolution)

@Calinou
Member

Calinou commented Feb 3, 2022

> This can probably also be achieved by rendering everything in a viewport with double the resolution and then streaming it to a viewport container the same resolution as the display window. But this should be an option in the project settings such as render scale (for example 200% being double the render resolution downscaled to the window resolution)

This is already feasible in the master branch with the scaling_3d_scale property on Viewport (and in the project settings). However, supersampling is very expensive and is generally best left to running old games on modern high-end GPUs.
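
To make the cost concrete, here is a small sketch of supersampling as described above: render at an integer multiple of the window resolution, then average each block of pixels down to one. The box filter and the function names are assumptions for illustration; this says nothing about which filter the engine actually uses for scaling_3d_scale.

```python
def downsample_box(img, factor):
    """Average each factor x factor block of a grayscale image into one pixel."""
    out_h = len(img) // factor
    out_w = len(img[0]) // factor
    out = []
    for y in range(out_h):
        row = []
        for x in range(out_w):
            total = 0.0
            for dy in range(factor):
                for dx in range(factor):
                    total += img[y * factor + dy][x * factor + dx]
            row.append(total / (factor * factor))
        out.append(row)
    return out


def shaded_pixel_ratio(render_scale):
    """How many times more pixels are shaded compared to native resolution."""
    return render_scale * render_scale


# A 4x4 "render" with a diagonal edge, reduced to a 2x2 "window".
render = [
    [1.0, 1.0, 1.0, 0.0],
    [1.0, 1.0, 0.0, 0.0],
    [1.0, 0.0, 0.0, 0.0],
    [0.0, 0.0, 0.0, 0.0],
]
print(downsample_box(render, 2))
print(shaded_pixel_ratio(2.0))
```

At a 200% render scale the GPU shades four times as many pixels, which is why this is usually only practical when there is a lot of GPU headroom.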

@cybereality
Author

It took a day, but I was able to port GSS to Godot 4.0 alpha. It wasn't too hard. I was able to compare it directly against the FSR and Bilinear implementations. It looks pretty close to Bilinear, though I think GSS does still look slightly better. However, FSR definitely looks better: sharper and more detailed (particularly at far camera angles with smaller objects). So there is no question for me now that FSR is better, though I still think GSS looks decent considering I developed it myself in a few weeks. Anyhow, I'll still be using GSS for my next demo, as FSR has some bad bugs with SSIL, so it's not usable currently. But I plan to add a command line switch, so you all can test it and compare for yourself.

Also, I was able to integrate FSR and Bilinear support into my front end. So GSS can now be used to select different scaling methods and use the same sliders. It might take some work to switch in real time, I will look into that tomorrow. But even if we don't want to add it to Godot, I think GSS can still be a useful add-on as it makes the scaling code easier (basically a front end) so people can just move sliders and not have to code anything.

@Calinou
Member

Calinou commented Feb 4, 2022

Note that if you want something in between linear and FSR scaling, bicubic (or Lanczos) scaling could be integrated into the engine. It will require a dedicated shader to perform the scaling (like FSR), but it's a bit cheaper than FSR (while looking sharper than linear filtering).
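
For reference, a minimal 1D sketch of Lanczos resampling, written in Python rather than as a shader. It uses the standard Lanczos kernel, sinc(x) * sinc(x / a) for |x| < a; the function names, the default a = 2, and the edge clamping are choices made for this illustration, not anything taken from the engine.

```python
import math


def lanczos_kernel(x, a=2):
    """Lanczos window: sinc(x) * sinc(x / a) for |x| < a, otherwise 0."""
    if x == 0:
        return 1.0
    if abs(x) >= a:
        return 0.0
    px = math.pi * x
    return a * math.sin(px) * math.sin(px / a) / (px * px)


def resample_1d(samples, out_len, a=2):
    """Resample a 1D signal to out_len points using 2 * a taps per output."""
    n = len(samples)
    out = []
    for i in range(out_len):
        # Position of this output sample in source-sample coordinates.
        center = (i + 0.5) * n / out_len - 0.5
        first = math.floor(center) - a + 1
        total = 0.0
        weight_sum = 0.0
        for j in range(first, first + 2 * a):
            w = lanczos_kernel(center - j, a)
            total += w * samples[min(max(j, 0), n - 1)]  # clamp at the edges
            weight_sum += w
        out.append(total / weight_sum)  # normalize so flat areas stay flat
    return out


# Upscale a step edge from 4 to 8 samples.
print(resample_1d([0.0, 0.0, 1.0, 1.0], 8))
```

A 2D version applies the same weights along each axis, so it needs more texture reads per pixel than bilinear filtering. The kernel width here is suited to upscaling; downscaling would need the kernel stretched by the scale factor.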

@cybereality
Author

Just wanted to update, on my latest Godot 4.0 demo, Aniela, I ended up going with Bilinear scaling.
https://cybereality.itch.io/aniela

After testing all the options, I realized that the Bilinear mode looks about as good as my scaler (better in some ways and worse in others). Overall they are about equal, depending on whether users want a crisper or softer image. And it is very fast and has better compatibility. So I think in terms of Godot 4.0, the existing options are good, and when FSR is finished that will be even better.

However, I do wonder about Godot 3.x. Is it possible to port the Bilinear scaling to 3.x? Otherwise I could continue supporting GSS as an add-on for 3.x as a stop gap until 4.0 is stable.

@Calinou
Member

Calinou commented Feb 9, 2022

> Is it possible to port the Bilinear scaling to 3.x?

This is already achievable when using GLES2 (or GLES3 with a custom viewport), but not GLES3 when scaling the root viewport. However, you'll probably want to use a custom viewport in real world scenarios to keep 2D elements crisp at lower 3D rendering resolutions. There's an official demo showcasing how to do this: https://github.com/godotengine/godot-demo-projects/tree/master/viewport/3d_scaling

godotengine/godot#51870 wasn't backported to 3.x and while it can be done for GLES3, it's not exactly trivial. Quoting Bastiaan from that PR:

> Retrofitting this to GLES2 in Godot 3 will be challenging simply because we're rendering everything into one buffer there and we'd almost be rewriting it to the same approach as in the GLES3 renderer.
> It should in theory be easy enough to apply to the GLES3 renderer but there are too many mobile unfriendly techniques applied there.

@cybereality
Author

Okay. I understand. Thanks for the information.

@aaronfranke closed this as not planned on Sep 18, 2022