Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix shader uniform storage conversions and crash #74937

Merged
merged 1 commit into from
Jun 9, 2023

Conversation

bitsawer
Copy link
Member

@bitsawer bitsawer commented Mar 15, 2023

Also fixes an unreported (?) crash that can happen when passing too little data in for certain types using set_shader_parameter(). It was caused by incorrect bounds checking which would read and write garbage memory or rarely crash if the read memory was not mapped in the process.

This PR also allows more types to be used with GDScript arrays when calling set_shader_parameter(). It relies on the normal Variant behavior to figure out the conversion. Previously, the only thing that worked worked for setting many shader type arrays was creating a raw PackedXXArray.

Array auto-conversion already works for many classes (Vector2, Vector3 etc.) that have their own PackedXXXArray. For example there is PackedVector3Array. But some classes like Vector4, Transform2D, Transform3D, Basis, Projection etc. don't have these and thus did not receive the Variant array auto-conversion goodies. This PR adds some of that conversion behavior without adding new PackedXXXArray's for each class.

I created a new file for the conversion functions that is shared between Vulkan and OpenGL renderer. There are other files in there that are shared like that, but let me know if I should put the functions somewhere else.

I'm not 100% sure if this PR breaks compatibility if user has previously saved some global shader values in the project settings. After a some quick testing I think not, but someone more familiar with that area might need to confirm.

edit 2:

I also updated some more uniform/global conversions, some of them haven't worked at all in a long time, but I guess nobody either used them or bothered to make a bug report. Also, here's a simple test project that can be used to test this functionality:

UniformTest.zip (OLD test project)

UniformTester.zip

UniformTester3.zip (NEWEST test project)

This test project was generated in master branch (d6dde81) and contains all incorrect global uniform and shader uniform types and data saved to main.tscn and project.godot. Opening and running the project will show a black box in the master, as the data is not converted properly. After applying this PR, the box should be green in editor (this test saved resource data conversion) and dark blue when run pressing F5 (this test runtime Shader.set_shader_parameter() conversion). The cube also moves once per second to test some x, y and z values. The cube also rotates when demo is run (not in editor mode).

NOTE: When using Compatiblity renderer, remove or comment out line #define GLOBALS_CHECK in uniforms.gdshader, seems like all global uniforms are not yet supported there (or they are buggy).

clayjohn
clayjohn previously approved these changes Apr 14, 2023
Copy link
Member

@clayjohn clayjohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm happy with the changes.

it would be nice to have @Chaosus take a look and confirm that the array stuff is correct

@LaneSun
Copy link

LaneSun commented Apr 16, 2023

Thanks for the work, it helps a lot.

Chaosus
Chaosus previously approved these changes Apr 17, 2023
Copy link
Member

@Chaosus Chaosus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seems fine for me, and the code looks much cleaner now.

@bitsawer
Copy link
Member Author

bitsawer commented Apr 17, 2023

Fixed a few more types in shader_language.cpp and also added a compatibility Color conversion to convert_array_std140() for Color types which fixes #76157. Note that only the Vulkan renderer needs the srgb-to-linear conversion part. However, the real fix is probably to investigate why the PackedColorArray is serialized as a variant Array containing Color-variants instead of PackedColorArray (see #76157 (comment)). Still, adding that Color handling is also generally useful and also adds some backwards compatibility.

It's pretty clear by now that this PR breaks compatibility to some extent, but I think its worth it. Previously too many things were just broken in some way and there is no easy way to add backwards conversion compatibility everywhere. However, most issues should be pretty easy to fix as long as we tell users to where to look during the next release.

If we want to be super careful about maintaining compatibility, the only reliable way is to pretty much to add more conversions to Variant. For example, currently Quaternion or Plane does not automatically convert to Vector4 etc.

@YuriSizov
Copy link
Contributor

@clayjohn @Chaosus Can you take a look again?

However, most issues should be pretty easy to fix as long as we tell users to where to look during the next release.

Can you provide a list of what the incompatible changes are and how do you propose to address them in user projects?

@bitsawer
Copy link
Member Author

@YuriSizov A quick list where we could expect some breakage. The real type/conversion case combination matrix is huge, so I might have missed some.

  • Setting certain global (non-global are a bit different) shader uniform types in code. For example setting a (non-array) global uniform vec4 previously required using a Plane, which didn't always work in all cases. Now the type is correctly Vector4 and should work correctly everywhere. So users have to change certain data types they pass to RenderingServer.global_shader_parameter_set(), although most should continue to work just fine.

  • The editor UI for setting global shader values. Most should work without breakage, but some types (like mat4 etc.) never worked at all and simply resulted in garbage in the shader uniform values. If people have existing and incorrect global uniforms, simply deleting the old global value in the editor and re-creating it with the same name and type (which should then use correct type) should be enough.

  • Setting shader uniform parameters in editor inspector. When calling Shader.set_shader_parameter() directly in code there should be no breakage, but editor UI does save uniform type and data into resource files which could be affected. Mostly affects arrays for types that don't have their own PackedXXXVector. Like Vector4, Basis, Projection etc. I think most should be 100% compatible, but if the worst happens deleting, saving and re-creating the value should fix the type.

If we want to be careful, we could merge this into master and see if our power users report any issues. If nothing major comes up, we could cherry pick for 4.0.x Some of the conversion issues can be currently worked around by users by for example passing a raw PackedFloat32Array etc. instead of objects or variant arrays (this workaround also continues to work with this PR). Then again, the more we delay the more breakage there will be when we finally fix this.

And if we want to ensure maximum compatibility, we could add more conversions to Variant like I mentioned in my previous message. For example, we could make it possible to convert a Variant Quaternion or Plane into a Variant Vector4, which would remove a lot of potential issues.

@YuriSizov
Copy link
Contributor

If we want to be careful, we could merge this into master and see if our power users report any issues. If nothing major comes up, we could cherry pick for 4.0.x

We can't really consider 4.0.x for this if changes aren't backwards compatible. Even for 4.x BC should be considered seriously.

@bitsawer
Copy link
Member Author

True enough, better not break things if we can help it. Then again #75701 was a breaking change and that was done just to improve performace. But I might be able to remove some compatibility issues. The only one I know that for sure can cause issues is data coming from RenderingServer.global_shader_parameter_set(), others are more theoretical, because there is Array auto-conversion for much of the data. I can try to test some more this week. Adding some explicit conversion logic to a few places might be enough. Everyone else is free to test and suggest potential issues, too.

@YuriSizov
Copy link
Contributor

Then again #75701 was a breaking change and that was done just to improve performace.

Ultimately, it wasn't, the old notification value was kept in the engine specifically to avoid breaking compat. The behavior changed though, that is true.

@clayjohn
Copy link
Member

I think most of the changes that could be considered breaking are also places where the current behaviour is totally broken. Like changing the type on the global shader vec4 makes sense because currently it is totally broken and leaks garbage

@bitsawer bitsawer force-pushed the fix_uniform_storage branch 2 times, most recently from 2d2c8ba to 9368423 Compare April 20, 2023 11:00
@bitsawer
Copy link
Member Author

bitsawer commented Apr 20, 2023

Rebased and updated the PR. It should now be 100% compatible with old projects. I also updated the example project and its description in the first post, it tests uniform combination values and types that are affected by this PR (including saved global and Shader resource uniform data). Also tested bug report MRP projects, this PR seems to fix them all.

As a result of allowing much better type conversion with good backwards compatibility, I had to clean up the code a bit more. The templated conversion code is more complex than the current one, but it's either that or adding a few thousand lines of copy-paste Variant cross-type conversions. Changes also make it trivial to add support for new Variant or uniform types in the future and they should automatically receive full conversion support.

Previously, there were also several indexing errors in those uniform conversion loops as mentioned before (BVECX heap over-reads, wrong loop step counter in UVEC4 resulting in reading heap garbage etc.). I also made a few minor empty line additions and removals to make the two material_storage.cpp-files look more similar and consistent, easier to compare side-by-side.

Also noticed and fixed a subtle (unreported?) bug between Vulkan and OpengGL storage, for example passing Vector2/3/4(i), Quaternion or Plane would sometimes lead to different or bad values in vec4 uniforms. The material_storage.cpp files had gotten out-of-sync and the order of type checks in vec4 branch was different, which caused several incorrect type conversion combinations in OpenGL compatibility renderer.

Also fixes #75191 (added to first post). The vec4 values remain the same in resource file, but now in the editor there is a revert button which will fix the vec4 values to correct defaults (as the editor notices that old incorrect default value is not currently default). Newly added uniforms will also have the correct value now.

Also fixes #66619 (added to first post).

While I tested this a lot, there might still be some subtle bugs around. Still, this PR should improve the current situation significantly.

@bitsawer
Copy link
Member Author

Also fixes #63064 and #74680 (added to first post). I made (and reverted) a separate PR for those (#76438), but as some uniforms have a wrong type currently that fix should be incorporated into this PR for better backwards compatibility so that all type updates happen at the same commit.

I'll need to test and fix a few small type compatibility issues so that all this stuff works together. Converted to draft until then.

@bitsawer bitsawer marked this pull request as ready for review May 29, 2023 13:46
@bitsawer
Copy link
Member Author

bitsawer commented May 29, 2023

Finally had some time to go through all issues, MRP projects and demos to verify everything works together. I added a few compatibility tweaks and rebased this PR, it should be fully backwards compatible and I didn't notice any regressions during my testing. Still, there are a lot of changes so something might have slipped past me, but any issues should be easily fixable once someone reports them.

Ready for a review. Also updated the test project in the first post.

edit:

Added another issue to the fixed pile: #64832 (also added to first post).

Comment on lines +55 to +57
PackedInt32Array ba = value;
for (int i = 0; i < ba.size(); i++) {
ba.set(i, ba[i] ? 1 : 0);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this necessary? The code seems highly suspicious. It looks like it is copying over the entire value array, then iterating over it to clean the values, then passing the whole array to a new function to copy each value one-by-one into another array. Could we not convert, clean, and copy all at once as we do now?

Its nice that this templated approach cuts the amount of code in half, but I have a feeling it is forcing us to do way more copies and way more operations to do the same amount of work.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Note that old code does convert+copy+iterate passes too, they are just harder to see. In the old way, these lines:

const PackedInt32Array &ba = value;

create a new temporary: a PackedInt32Array from a value variant object using Variant conversion. It just binds it as a const reference, but a temporary is still made and allocated. So a full conversion and array allocation is done here, anyway, it's just harder to see. The following manual for-loop is replaced by write_array_std140(), but they do the same thing. The amount of work done should be roughly the same between the old and new code. And when passed the same values, the new code does not really do any more conversions than the old one, there is a fast path in convert_array_std140() for many packed arrays. The only extra copying comes when passing more complex object arrays which have to allocate some temporary extra space, but these didn't work at all in the old code. I suspect because doing these conversions for arbitrary arrays and objects properly is actually pretty hard, as evident. I though about special casing all types and use internal variant accessors in variant_internal.h to handle these conversions without temporary copies, but it turned out that kind of code is even harder to write, read and understand.

The only exception here is indeed the boolean uniform handling (bool, bvec2, bvec3, bvec4), which has to do extra work for the "force to 1 or 0"-loop. However, the only extra cost is the loop (which is typically much less than 100 elements), because the temporary COW packed array is already made and its CowData reference count is 1, which means it doesn't have to copy anything when written into because it is already a non-shared copy which had to be made anyway (both in old and new versions).

In practice, it shouldn't have much effect on performance, but uniform boolean conversions are probably is a bit slower than the original version for the same arguments. Of course, I doubt many people actually use uniform boolean in the first place. However, to match the original performance and not make copy of the COW vector, I could write:

const PackedInt32Array &ba = convert_array_std140<Vector2i, int32_t>(value);
write_array_std140<Vector2i>(ba, gui, p_array_size, 4);
for (int = 0; i < p_array_size * 4) {
    gui[i] = gui[i] ? 1 : 0;
}

It still makes one extra loop over the data, but as uniform arrays are not usually very large in practice it shouldn't matter too much. Alternatively, we could add an optional lambda function converter to inline the conversion into the copy loop, which the compiler can optimize nicely:

const PackedInt32Array &ba = convert_array_std140<Vector2i, int32_t>(value);
write_array_std140<Vector2i>(ba, gui, p_array_size, 4, [](int32_t value) { return value ? 1: 0 });

With this version, the performance should be pretty much identical. Or we could leave the new code as-is, it's pretty simple to understand this way with a pretty minimal performance penalty.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for the detailed response. I didn't realize that const PackedInt32Array &ba = value; was copying the entire array.

Indeed the boolean arrays are the worst offenders in terms of copying necessarily, but they are also the cheapest to copy and likely won't be that large.

I'm happy with the current state of things in that case. I would avoid making it any more complicated. Already this PR adds significant complexity by adding layers of templated functions. I understand that was done in order to make handling all the edge cases manageable, but adding more complexity will make this impossible to debug six months from now.

Copy link
Member

@clayjohn clayjohn left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Seeing as my only concern turned out to be my misunderstanding, I am happy to go ahead with this PR. I trust that @bitsawer has properly handled things and this is a significant improvement over the status quo.

I would go ahead and merge this so we can get wide testing before releasing 4.1

@akien-mga akien-mga merged commit 166643d into godotengine:master Jun 9, 2023
@akien-mga
Copy link
Member

Thanks a ton!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment