Implement Vulkan pipeline caching #76348
e70f83c
to
195bbe8
Compare
@clayjohn @RandomShaper I asked around about driverABI's purpose, and it seems it is used to check whether the OS is 32-bit or 64-bit. There aren't any plans to support 32-bit, right? If so, I can remove the variable from PipelineCacheHeader.
I've also added a pipeline cache save interval to the project settings, based on one of @reduz's suggestions.
195bbe8
to
1ae1004
Compare
1ae1004
to
6f14b27
Compare
Could you test this PR with a larger project like the TPS-demo and check what the size of the pipeline cache ends up being? I know reduz has stated earlier that he only wants to cache certain pipelines, to limit the file size and the load time from the cache. I think he specifically wanted to cache only the main specialization constant variants, as ideally all the other variants should be compiled on a background thread.
6f14b27
to
486b033
Compare
I wish I could, but at the moment I'm having a hard time getting the project to open, since it gets stuck importing level.exr. Edit: deleting the file fixed the issue.
Ah, ya, compressing exr files is super slow in debug builds. If you run it with a debug build, you should open up level.exr.import and change compress/mode to 3. That way it will import as VRAM uncompressed and will be much faster.
@clayjohn I did a quick test with the TPS Demo and GDQuest's third-person platforming demo. The results are: And I assume that these graphical artifacts are due to me using VRAM uncompressed with level.exr?
One moment
1a60b33
to
4483576
Compare
Done
So I took some time and built this PR (rebased on the current master) locally. The code looks good to me, and some quick debugging suggests that Vulkan is correctly loading stuff from the cache (instead of compiling it). Creating a pipeline without the cache takes 30,000 usec on average. This PR cuts that down to ~260 usec, while my driver's built-in cache averages 80 usec.
@Ansraer I disagree, there are cases on desktop as well where a pipeline cache would help. In the TPS Demo, shooting a bullet causes a stutter on multiple boots of the game before the driver builds up a cache. With the pipeline cache, it only happens on the first run.
I'm a bit late to the party, but I made a few comments about the code. Feel free to ignore them if there's a reason behind the issues I pointed out.
float save_interval = GLOBAL_GET("rendering/rendering_device/pipeline_cache/save_interval_mb");
VkResult vr = vkGetPipelineCacheData(device, pipelines_cache.cache_object, &pso_blob_size, nullptr);
ERR_FAIL_COND(vr);
size_t difference = (pso_blob_size - pipelines_cache.current_size) / (1024.0f * 1024.0f);
Should probably use integer division here.
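A minimal sketch of what an integer-only version of this check could look like. This is not the PR's code: the function name and the whole-MiB threshold are assumptions made for illustration (the project setting in the PR is a float). The sketch also guards against the unsigned subtraction wrapping around if the reported cache size ever shrinks.

```cpp
#include <cstddef>

// Hypothetical helper: returns true when the cache has grown by at least
// `threshold_mib` whole mebibytes since the last save.
bool cache_grew_enough(size_t current_size, size_t last_saved_size, size_t threshold_mib) {
	// Unsigned subtraction would wrap around if the cache shrank, so check first.
	if (current_size <= last_saved_size) {
		return false;
	}
	const size_t mib = size_t(1024) * 1024;
	// Integer division truncates toward zero; no float conversion is involved.
	const size_t grown_mib = (current_size - last_saved_size) / mib;
	return grown_mib >= threshold_mib;
}
```

Keeping the whole computation in size_t avoids the implicit size_t to float to size_t round trip in the reviewed line.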
break;
}
}
if (header.data_hash != hash_murmur3_one_64(pipelines_cache.buffer.size()) || header.data_size != (uint32_t)pipelines_cache.buffer.size() || header.vendor_id != props.vendorID || header.device_id != props.deviceID || header.driver_abi != sizeof(void *) || invalid_uuid) {
Should this use hash_murmur3_buffer() instead to hash the full contents of pipelines_cache.buffer? Now we are just hashing the size integer which doesn't seem all that useful as we could just store the integer directly.
Of course, hashing the full buffer and megabytes of data might cause a small slowdown, but hopefully it shouldn't be too bad.
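To illustrate the difference the reviewer is pointing at, here is a small self-contained sketch. It uses FNV-1a as a stand-in hash (not Godot's murmur3 functions), and the helper names are hypothetical. Hashing only the size gives the same value for any two buffers of equal length, so it cannot detect corrupted bytes, while hashing the contents can.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// FNV-1a, 64-bit: a simple stand-in for a real content hash such as murmur3.
uint64_t fnv1a_64(const uint8_t *data, size_t len) {
	uint64_t hash = 0xcbf29ce484222325ull; // FNV offset basis.
	for (size_t i = 0; i < len; i++) {
		hash ^= data[i];
		hash *= 0x100000001b3ull; // FNV prime.
	}
	return hash;
}

// Hash of the byte contents: changes when a byte in the buffer changes.
uint64_t hash_contents(const std::vector<uint8_t> &buf) {
	return fnv1a_64(buf.data(), buf.size());
}

// Hash of the size only: identical for any two buffers of equal length.
uint64_t hash_size_only(const std::vector<uint8_t> &buf) {
	const uint64_t size = buf.size();
	return fnv1a_64(reinterpret_cast<const uint8_t *>(&size), sizeof(size));
}
```

For two four-byte buffers that differ only in the last byte, hash_size_only() returns the same value for both, whereas hash_contents() returns different values.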
I might have interpreted the "robust pipeline cache" article incorrectly, so this could indeed be the correct way to hash.
}

if (FileAccess::exists("user://vulkan/pipelines.cache")) {
	Vector<uint8_t> file_data = FileAccess::get_file_as_bytes("user://vulkan/pipelines.cache");
This part could use some basic error and data bounds checking, just in case. For example, if the file exists but is corrupted, truncated, or could not be read, this will crash soon afterwards.
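A rough sketch of the kind of bounds checking meant here, written against plain standard C++ rather than Godot's types. The two-field CacheHeader and the function name are simplified, hypothetical stand-ins; the real PipelineCacheHeader carries more fields. The idea is to verify that enough bytes were read before copying the header out, and that the recorded payload size matches what is actually in the file.

```cpp
#include <cstdint>
#include <cstring>
#include <vector>

// Hypothetical, simplified header; the real PipelineCacheHeader has more fields.
struct CacheHeader {
	uint32_t magic;
	uint32_t data_size;
};

// Returns true and fills `r_header` only if `file_data` holds a full header
// followed by exactly `data_size` payload bytes.
bool read_cache_header(const std::vector<uint8_t> &file_data, uint32_t expected_magic, CacheHeader &r_header) {
	// Reject empty or truncated files before touching the bytes.
	if (file_data.size() <= sizeof(CacheHeader)) {
		return false;
	}
	CacheHeader header;
	std::memcpy(&header, file_data.data(), sizeof(CacheHeader));
	if (header.magic != expected_magic) {
		return false;
	}
	// The payload length recorded in the header must match what was read.
	if (header.data_size != file_data.size() - sizeof(CacheHeader)) {
		return false;
	}
	r_header = header;
	return true;
}
```

On failure the caller can simply start with an empty cache, since a pipeline cache is only an optimization.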
PipelineCacheHeader header = {};
header.magic = 868 + VK_PIPELINE_CACHE_HEADER_VERSION_ONE;
header.data_size = pipelines_cache.buffer.size();
header.data_hash = hash_murmur3_one_64(pipelines_cache.buffer.size());
Same as above, should this use hash_murmur3_buffer()?
@bitsawer thanks in general for good feedback :) I'll look at it tomorrow asap
@@ -8957,6 +8966,102 @@ void RenderingDeviceVulkan::initialize(VulkanContext *p_context, bool p_local_de
draw_list_split = false;

compute_list = nullptr;
_load_pipeline_cache();
print_line(vformat("Startup PSO cache (%.1f MiB)", pipelines_cache.buffer.size() / (1024.0f * 1024.0f)));
Might be better to use print_verbose instead of print_line for all logging like this (including the one in _update_pipeline_cache).
4483576
to
c7edfc0
Compare
@bitsawer @clayjohn @RandomShaper file size has shrunk a bit after changing to Let me know if the file error checks that I have added are good enough. |
Changes look mostly good. However, I would write the file check something like this (not tested or compiled):
Error file_error;
Vector<uint8_t> file_data = FileAccess::get_file_as_bytes("user://vulkan/pipelines.cache", &file_error);
if (file_error != OK || file_data.size() <= sizeof(PipelineCacheHeader)) {
	WARN_PRINT("Invalid/corrupt pipelines cache.");
	return;
}
PipelineCacheHeader header = {};
...
Instead of checking multiple possible error codes, just check whether the read was not OK. This also checks that we actually read enough data, so that the following memcpy() and header check don't blow up. For example, reading an empty or partially written file would currently crash, because the read would succeed but would not return enough data. As a super minor nitpick I didn't notice earlier, you could also tweak the comment formatting like this (space after // and capitalized):
After changing those, looks good to me.
c7edfc0
to
2d3d92c
Compare
@bitsawer Done, hopefully this is good to go for merging
Looks good to me after the changes.
@clayjohn @RandomShaper Please make sure you're okay with the changes, when you have time :)
I'm finally reviewing this! Sorry for not doing it earlier. It looks very good! Now, I'm wondering if it wouldn't be nice to save the cache from another thread; namely, a low priority task in the
doc/classes/ProjectSettings.xml
@@ -2305,6 +2305,9 @@
<member name="rendering/rendering_device/driver.windows" type="String" setter="" getter="" default="&quot;vulkan&quot;">
Windows override for [member rendering/rendering_device/driver].
</member>
<member name="rendering/rendering_device/pipeline_cache/save_interval_mb" type="float" setter="" getter="" default="3.0">
This wording makes it sound as if this was about time instead of size. Let me suggest save_increment_mb, for instance.
One of the meanings of interval is a gap between two points. The setting describes a gap between the last time we saved and the next time. It could maybe be save_gap_mb or save_delta_mb, if not interval?
Could be save_chunk_size_mb? Or even just chunk_size_mb. (I didn't check implementation details to see what this is used for exactly so TIWAGOS.)
I could implement one quickly. I'm not experienced with multi-threading though, so I'd like to know, should I call
Nope, because by doing so you would be making the operation synchronous (locking, stalling) regardless of the use of threads. You should make the call to add the task and just remember the task id. Then, when it's time to save the cache again, if the task is still in flight, skip. Upon closing the engine, you would wait for any in-flight cache save task to complete and then call it normally to ensure the latest state is saved. Does that make sense to you?
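The pattern described above can be sketched in plain standard C++. This is only an illustration under stated assumptions: std::async and std::future stand in for the engine's thread pool and task id, and the BackgroundSaver class with its method names is hypothetical, not part of the PR.

```cpp
#include <chrono>
#include <functional>
#include <future>
#include <utility>

// Hypothetical wrapper; std::async stands in for the engine's thread pool.
class BackgroundSaver {
	std::future<void> in_flight; // Plays the role of the remembered task id.

public:
	// Starts `save_fn` on another thread unless a previous save is still running.
	// Returns true if a new save was started, false if it was skipped.
	bool request_save(std::function<void()> save_fn) {
		if (in_flight.valid()) {
			if (in_flight.wait_for(std::chrono::seconds(0)) != std::future_status::ready) {
				return false; // Still in flight: skip instead of blocking.
			}
			in_flight.get(); // Finished: release the old task.
		}
		in_flight = std::async(std::launch::async, std::move(save_fn));
		return true;
	}

	// Call on shutdown: wait for any pending save, then save synchronously
	// so that the latest state ends up on disk.
	void finish(const std::function<void()> &save_fn) {
		if (in_flight.valid()) {
			in_flight.get();
		}
		save_fn();
	}
};
```

The key point is that request_save() never waits: it only polls the previous task and returns immediately, so the render thread does not stall. Waiting happens once, in finish(), when the engine is closing.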
It does, yeah
08d73a4
to
21bfd7d
Compare
@RandomShaper added work threading and updated project setting name.
21bfd7d
to
dded713
Compare
Thanks! |
An implementation of Vulkan pipeline caching with cache validation when reading the file. Looking for feedback before merging this PR :)