-
-
Notifications
You must be signed in to change notification settings - Fork 21.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix shader uniform storage conversions and crash #74937
Conversation
9fc364b
to
bc48bcd
Compare
bc48bcd
to
32b33de
Compare
32b33de
to
ac41823
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm happy with the changes.
it would be nice to have @Chaosus take a look and confirm that the array stuff is correct
Thanks for the work, it helps a lot. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seems fine for me, and the code looks much cleaner now.
ac41823
to
32b209e
Compare
Fixed a few more types in It's pretty clear by now that this PR breaks compatibility to some extent, but I think its worth it. Previously too many things were just broken in some way and there is no easy way to add backwards conversion compatibility everywhere. However, most issues should be pretty easy to fix as long as we tell users to where to look during the next release. If we want to be super careful about maintaining compatibility, the only reliable way is to pretty much to add more conversions to Variant. For example, currently Quaternion or Plane does not automatically convert to Vector4 etc. |
@YuriSizov A quick list where we could expect some breakage. The real type/conversion case combination matrix is huge, so I might have missed some.
If we want to be careful, we could merge this into master and see if our power users report any issues. If nothing major comes up, we could cherry pick for 4.0.x Some of the conversion issues can be currently worked around by users by for example passing a raw PackedFloat32Array etc. instead of objects or variant arrays (this workaround also continues to work with this PR). Then again, the more we delay the more breakage there will be when we finally fix this. And if we want to ensure maximum compatibility, we could add more conversions to Variant like I mentioned in my previous message. For example, we could make it possible to convert a Variant Quaternion or Plane into a Variant Vector4, which would remove a lot of potential issues. |
We can't really consider 4.0.x for this if changes aren't backwards compatible. Even for 4.x BC should be considered seriously. |
True enough, better not break things if we can help it. Then again #75701 was a breaking change and that was done just to improve performace. But I might be able to remove some compatibility issues. The only one I know that for sure can cause issues is data coming from |
Ultimately, it wasn't, the old notification value was kept in the engine specifically to avoid breaking compat. The behavior changed though, that is true. |
I think most of the changes that could be considered breaking are also places where the current behaviour is totally broken. Like changing the type on the global shader vec4 makes sense because currently it is totally broken and leaks garbage |
2d2c8ba
to
9368423
Compare
Rebased and updated the PR. It should now be 100% compatible with old projects. I also updated the example project and its description in the first post, it tests uniform combination values and types that are affected by this PR (including saved global and Shader resource uniform data). Also tested bug report MRP projects, this PR seems to fix them all. As a result of allowing much better type conversion with good backwards compatibility, I had to clean up the code a bit more. The templated conversion code is more complex than the current one, but it's either that or adding a few thousand lines of copy-paste Variant cross-type conversions. Changes also make it trivial to add support for new Variant or uniform types in the future and they should automatically receive full conversion support. Previously, there were also several indexing errors in those uniform conversion loops as mentioned before (BVECX heap over-reads, wrong loop step counter in UVEC4 resulting in reading heap garbage etc.). I also made a few minor empty line additions and removals to make the two Also noticed and fixed a subtle (unreported?) bug between Vulkan and OpengGL storage, for example passing Vector2/3/4(i), Quaternion or Plane would sometimes lead to different or bad values in vec4 uniforms. The Also fixes #75191 (added to first post). The vec4 values remain the same in resource file, but now in the editor there is a revert button which will fix the vec4 values to correct defaults (as the editor notices that old incorrect default value is not currently default). Newly added uniforms will also have the correct value now. Also fixes #66619 (added to first post). While I tested this a lot, there might still be some subtle bugs around. Still, this PR should improve the current situation significantly. |
9368423
to
a889d22
Compare
a889d22
to
49a9868
Compare
Also fixes #63064 and #74680 (added to first post). I made (and reverted) a separate PR for those (#76438), but as some uniforms have a wrong type currently that fix should be incorporated into this PR for better backwards compatibility so that all type updates happen at the same commit. I'll need to test and fix a few small type compatibility issues so that all this stuff works together. Converted to draft until then. |
49a9868
to
23c375d
Compare
Finally had some time to go through all issues, MRP projects and demos to verify everything works together. I added a few compatibility tweaks and rebased this PR, it should be fully backwards compatible and I didn't notice any regressions during my testing. Still, there are a lot of changes so something might have slipped past me, but any issues should be easily fixable once someone reports them. Ready for a review. Also updated the test project in the first post. edit: Added another issue to the fixed pile: #64832 (also added to first post). |
PackedInt32Array ba = value; | ||
for (int i = 0; i < ba.size(); i++) { | ||
ba.set(i, ba[i] ? 1 : 0); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is this necessary? The code seems highly suspicious. It looks like it is copying over the entire value
array, then iterating over it to clean the values, then passing the whole array to a new function to copy each value one-by-one into another array. Could we not convert, clean, and copy all at once as we do now?
Its nice that this templated approach cuts the amount of code in half, but I have a feeling it is forcing us to do way more copies and way more operations to do the same amount of work.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Note that old code does convert+copy+iterate passes too, they are just harder to see. In the old way, these lines:
const PackedInt32Array &ba = value;
create a new temporary: a PackedInt32Array from a value variant object using Variant conversion. It just binds it as a const reference, but a temporary is still made and allocated. So a full conversion and array allocation is done here, anyway, it's just harder to see. The following manual for-loop is replaced by write_array_std140(
), but they do the same thing. The amount of work done should be roughly the same between the old and new code. And when passed the same values, the new code does not really do any more conversions than the old one, there is a fast path in convert_array_std140()
for many packed arrays. The only extra copying comes when passing more complex object arrays which have to allocate some temporary extra space, but these didn't work at all in the old code. I suspect because doing these conversions for arbitrary arrays and objects properly is actually pretty hard, as evident. I though about special casing all types and use internal variant accessors in variant_internal.h to handle these conversions without temporary copies, but it turned out that kind of code is even harder to write, read and understand.
The only exception here is indeed the boolean uniform handling (bool, bvec2, bvec3, bvec4), which has to do extra work for the "force to 1 or 0"-loop. However, the only extra cost is the loop (which is typically much less than 100 elements), because the temporary COW packed array is already made and its CowData reference count is 1, which means it doesn't have to copy anything when written into because it is already a non-shared copy which had to be made anyway (both in old and new versions).
In practice, it shouldn't have much effect on performance, but uniform boolean conversions are probably is a bit slower than the original version for the same arguments. Of course, I doubt many people actually use uniform boolean in the first place. However, to match the original performance and not make copy of the COW vector, I could write:
const PackedInt32Array &ba = convert_array_std140<Vector2i, int32_t>(value);
write_array_std140<Vector2i>(ba, gui, p_array_size, 4);
for (int = 0; i < p_array_size * 4) {
gui[i] = gui[i] ? 1 : 0;
}
It still makes one extra loop over the data, but as uniform arrays are not usually very large in practice it shouldn't matter too much. Alternatively, we could add an optional lambda function converter to inline the conversion into the copy loop, which the compiler can optimize nicely:
const PackedInt32Array &ba = convert_array_std140<Vector2i, int32_t>(value);
write_array_std140<Vector2i>(ba, gui, p_array_size, 4, [](int32_t value) { return value ? 1: 0 });
With this version, the performance should be pretty much identical. Or we could leave the new code as-is, it's pretty simple to understand this way with a pretty minimal performance penalty.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for the detailed response. I didn't realize that const PackedInt32Array &ba = value;
was copying the entire array.
Indeed the boolean arrays are the worst offenders in terms of copying necessarily, but they are also the cheapest to copy and likely won't be that large.
I'm happy with the current state of things in that case. I would avoid making it any more complicated. Already this PR adds significant complexity by adding layers of templated functions. I understand that was done in order to make handling all the edge cases manageable, but adding more complexity will make this impossible to debug six months from now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Seeing as my only concern turned out to be my misunderstanding, I am happy to go ahead with this PR. I trust that @bitsawer has properly handled things and this is a significant improvement over the status quo.
I would go ahead and merge this so we can get wide testing before releasing 4.1
Thanks a ton! |
mat4
type are set incorrectly when an Array of Transform3D is supplied #64051Also fixes an unreported (?) crash that can happen when passing too little data in for certain types using
set_shader_parameter()
. It was caused by incorrect bounds checking which would read and write garbage memory or rarely crash if the read memory was not mapped in the process.This PR also allows more types to be used with GDScript arrays when calling
set_shader_parameter()
. It relies on the normal Variant behavior to figure out the conversion. Previously, the only thing that worked worked for setting many shader type arrays was creating a raw PackedXXArray.Array auto-conversion already works for many classes (Vector2, Vector3 etc.) that have their own PackedXXXArray. For example there is PackedVector3Array. But some classes like Vector4, Transform2D, Transform3D, Basis, Projection etc. don't have these and thus did not receive the Variant array auto-conversion goodies. This PR adds some of that conversion behavior without adding new PackedXXXArray's for each class.
I created a new file for the conversion functions that is shared between Vulkan and OpenGL renderer. There are other files in there that are shared like that, but let me know if I should put the functions somewhere else.
I'm not 100% sure if this PR breaks compatibility if user has previously saved some global shader values in the project settings. After a some quick testing I think not, but someone more familiar with that area might need to confirm.
edit 2:
I also updated some more uniform/global conversions, some of them haven't worked at all in a long time, but I guess nobody either used them or bothered to make a bug report. Also, here's a simple test project that can be used to test this functionality:
UniformTest.zip (OLD test project)UniformTester.zipUniformTester3.zip (NEWEST test project)
This test project was generated in master branch (d6dde81) and contains all incorrect global uniform and shader uniform types and data saved to main.tscn and project.godot. Opening and running the project will show a black box in the master, as the data is not converted properly. After applying this PR, the box should be green in editor (this test saved resource data conversion) and dark blue when run pressing F5 (this test runtime
Shader.set_shader_parameter()
conversion). The cube also moves once per second to test some x, y and z values. The cube also rotates when demo is run (not in editor mode).NOTE: When using Compatiblity renderer, remove or comment out line
#define GLOBALS_CHECK
in uniforms.gdshader, seems like all global uniforms are not yet supported there (or they are buggy).