Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

gpu: Check index buffer contents in pre-draw validation #8489

Open
wants to merge 2 commits into
base: main
Choose a base branch
from

Conversation

arno-lunarg
Copy link
Contributor

@arno-lunarg arno-lunarg commented Sep 3, 2024

Fixes #2492

Completely rework gpuav_draw, to make it easier to add validation, and debug existing one.
When adding validation for some VkBuffer, the 2 more importants parts are 1) creating the validation shader and 2) have an informative error message. gpuav_draw.cpp has been redesigned around that idea. Each shader is in charge of its small validation, no uber shader trying to cater to all needs. Doing that ends up being a mess, with the need to correctly pipe data to the validation shader.
Validation shaders are now more fine grained.
The C++ has been reworked around some template to streamline the process of adding validation. The general idea is the following: 1) Declase a new "shader" struct representing validation shader bindings and push contants
2) fill that struct
3) bind this struct and the validation pipeline
4) dispatch the appropriate number of draws

About 4): we used to dispatch one draw, and do for loop in the shader to scan buffer elements. GPU are not good at that, instead a proper amount of draws should be dispatched, each in charge of doing it's small validation logic.

Example of out of bounds Vertex Index error message:

Validation Error: [ VUID-vkCmdDrawIndexed-None-02721 ] Object 0: handle = 0x29bdd8f9570, type = VK_OBJECT_TYPE_QUEUE; Object 1: handle = 0x29bdd8ad0c0, type = VK_OBJECT_TYPE_COMMAND_BUFFER; | MessageID = 0x24afafc5 | vkCmdDrawIndexed():  Vertex index 666 is not within the smallest bound vertex buffer.
index_buffer[1] (666) + vertexOffset (0) = Vertex index 666
Smallest vertex buffer binding info:
- Buffer: VkBuffer 0xa21a4e0000000030[]
- Binding: 0
- Binding offset: 0
- Binding size: 36 bytes
Index buffer binding info:
- Buffer: VkBuffer 0x980b0000000002e[]
- Index type: VK_INDEX_TYPE_UINT32
- Binding offset: 0
- Binding size: 12 bytes (or 3 VK_INDEX_TYPE_UINT32)
. The Vulkan spec states: If robustBufferAccess is not enabled, and that pipeline was created without enabling VK_PIPELINE_ROBUSTNESS_BUFFER_BEHAVIOR_ROBUST_BUFFER_ACCESS_EXT for vertexInputs, then for a given vertex buffer binding, any attribute data fetched must be entirely contained within the corresponding vertex buffer binding, as described in Vertex Input Description (https://www.khronos.org/registry/vulkan/specs/1.3-extensions/html/vkspec.html#VUID-vkCmdDrawIndexed-None-02721)

Out of bounds index:

Validation Error: [ VUID-VkDrawIndexedIndirectCommand-robustBufferAccess2-08798 ] Object 0: handle = 0x13f72c903a0, type = VK_OBJECT_TYPE_QUEUE; Object 1: handle = 0x13f72c9aee0, type = VK_OBJECT_TYPE_COMMAND_BUFFER; | MessageID = 0x294f0c80 | vkCmdDrawIndexedIndirect():  Index 4 is not within the bound index buffer. Computed from VkDrawIndexedIndirectCommand[0] (.firstIndex = 1, .indexCount = 3), stored in VkBuffer 0xa43473000000002d[]
Index buffer binding info:
- Buffer: VkBuffer 0x4b7df1000000002f[]
- Index type: VK_INDEX_TYPE_UINT32
- Binding offset: 0
- Binding size: 12 bytes (or 3 VK_INDEX_TYPE_UINT32)
Supplied buffer parameters in indirect command: offset = 0, stride = 20 bytes. The Vulkan spec states: If robustBufferAccess2 is not enabled, (indexSize × (firstIndex + indexCount) + offset) must be less than or equal to the size of the bound index buffer, with indexSize being based on the type specified by indexType, where the index buffer, indexType, and offset are specified via vkCmdBindIndexBuffer or vkCmdBindIndexBuffer2KHR. If vkCmdBindIndexBuffer2KHR is used to bind the index buffer, the size of the bound index buffer is vkCmdBindIndexBuffer2KHR::size (https://www.khronos.org/registry/vulkan/specs/1.3-extensions/html/vkspec.html#VUID-VkDrawIndexedIndirectCommand-robustBufferAccess2-08798)

Still missing (planned for followup PRs):

  • Index out of bounds of index buffer for non indirect indexed draw (indirect draw done)
  • Need to redo DestroyRenderPassMappedResources
  • Need to add last used pipeline/shader object to error messages

@arno-lunarg arno-lunarg requested a review from a team as a code owner September 3, 2024 15:15
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 248617.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 17366 running.

@arno-lunarg arno-lunarg force-pushed the jeremyg-issue-2492-arno-takeover branch from b92946b to 2a98b80 Compare September 3, 2024 15:28
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 248652.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 17368 running.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 17368 failed.

}

bufval::CountBuffer(*this, *cb_state, record_obj.location, buffer, offset, sizeof(VkDrawIndirectCommand),
"VkDrawIndirectCommand", stride, countBuffer, countBufferOffset,
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
"VkDrawIndirectCommand", stride, countBuffer, countBufferOffset,
vvl::Struct::VkDrawIndirectCommand, stride, countBuffer, countBufferOffset,

tests/CMakeLists.txt Outdated Show resolved Hide resolved
m_errorMonitor->VerifyFound();
}

#if 0
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Use DISABLED_ with a comment on top

Copy link
Contributor

@jeremyg-lunarg jeremyg-lunarg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any way to measure the perf difference of the new multi-shader approach vs what was happening before?

// Flag values for all pre draw types

// Set if the count buffer is bound
const uint kPreDrawValidationFlags_CountBuffer = (1 << 0);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

are these flags still used now that you split up the shaders even further?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nope they were not, got rid of them!

uint get_vertex_index(uint i) {
if (pc.index_width == 32) {
return index_buffer[i];
} else if (pc.index_width == 16) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

indentation in all of these shaders is inconsistent. Looks like a mix of tabs and spaces.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah yes my editor was not correctly configured... will go by hand and change that

@@ -213,8 +213,18 @@ target_sources(vvl PRIVATE
${API_TYPE}/generated/cmd_validation_copy_buffer_to_image_comp.cpp
${API_TYPE}/generated/cmd_validation_dispatch_comp.h
${API_TYPE}/generated/cmd_validation_dispatch_comp.cpp
${API_TYPE}/generated/cmd_validation_draw_vert.h
${API_TYPE}/generated/cmd_validation_draw_vert.cpp
${API_TYPE}/generated/cmd_validation_count_buffer_vert.h
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
${API_TYPE}/generated/cmd_validation_count_buffer_vert.h
${API_TYPE}/generated/cmd_validation_count_buffer_vert.h

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@arno-lunarg still need to fix this

Copy link
Contributor

@spencer-lunarg spencer-lunarg left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There is a LOT going on here, overall really really happy how well this turned out, it is all well localized to gpuav_draw.cpp (thanks to our work with the refactor for the last months!)

Some things to cleanup, but hopefully we can get this in soon and push from here as a new basepoint, this PR is really pushing things in a good direction for GPU-AV!

layers/gpu/cmd_validation/gpuav_draw.cpp Outdated Show resolved Hide resolved
layers/gpu/cmd_validation/gpuav_draw.cpp Outdated Show resolved Hide resolved
@@ -507,7 +563,18 @@ void Validator::PreCallRecordCmdDrawMeshTasksIndirectEXT(VkCommandBuffer command
InternalError(commandBuffer, record_obj.location, "Unrecognized command buffer.");
return;
}
InsertIndirectDrawValidation(*this, record_obj.location, *cb_state, buffer, offset, drawCount, VK_NULL_HANDLE, 0, stride);

bufval::DrawMeshIndirectVuids vuids;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

setting these every PreCallRecordCmdDrawMeshTasksIndirectEXT seems like a waste, but honestly also not highly motivated to fix this now, so maybe just a // TODO for now to move this to something like the vuid_maps.h files we have in the future

layers/gpu/resources/gpu_resources.cpp Outdated Show resolved Hide resolved
tests/CMakeLists.txt Outdated Show resolved Hide resolved
layers/state_tracker/state_tracker.cpp Outdated Show resolved Hide resolved
layers/state_tracker/pipeline_state.h Outdated Show resolved Hide resolved
const uint32_t vertex_index = error_record[kPreActionParamOffset_3];
const uint32_t index_buffer_value = static_cast<uint32_t>(int32_t(vertex_index) - vertex_offset);

gpuav.LogError(
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just want to leave a 💯 for these error messages, this is more of the quality I wish all messages were like, an extra 💯 for making the code here also readable!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

for big printf we should try to pack parameters like this more often, it makes figuring the mapping a bit easier

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

example number 1345 of why we need log message streams

layers/gpu/cmd_validation/gpuav_draw.cpp Show resolved Hide resolved

struct CountBufferPushData {
uint draw_cmds_byte_stride;
uint64_t draw_buffer_offset;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This will get bad if they don't support shaderInt64

My advise is leave this, and look up if (!supported_features.shaderInt64) { and just do that check and turn off validate_indirect_draws_buffers if they don't support it

I am more inclined to just disable GPU-AV features then do crazy uvec2 hacks here in the short term

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added logic in Validator::PreCallRecordCreateDevice to force feature on if available

@arno-lunarg
Copy link
Contributor Author

Is there any way to measure the perf difference of the new multi-shader approach vs what was happening before?

I probably can come up with some small stress test to figure this out. And will also have a look at Doom Eternal and the lightweight VK sample

@arno-lunarg arno-lunarg force-pushed the jeremyg-issue-2492-arno-takeover branch from 2a98b80 to a745204 Compare September 4, 2024 14:32
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 249357.

@arno-lunarg arno-lunarg force-pushed the jeremyg-issue-2492-arno-takeover branch from a745204 to ec752ae Compare September 11, 2024 14:40
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 254123.

@arno-lunarg arno-lunarg force-pushed the jeremyg-issue-2492-arno-takeover branch from ec752ae to 3879aec Compare September 11, 2024 14:43
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 254136.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 17437 running.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 17437 passed.

@arno-lunarg
Copy link
Contributor Author

arno-lunarg commented Sep 11, 2024

Running lightweight VK showed problems in handling of dynamic states. Since we bind our own pipelines, we have to restore them. I have yet to add testing in VVL, but at least lightweight VK is cleared now. I pushed my new changes to a separate commit.

So @jeremyg-lunarg for some perf stats:

  • Index buffer validation costs around 4 ms on the Tiny mesh large lightweight VK sample. On my AMD laptop, I go from a frame time of 24 ms to 28 ms.
  • On Doom Eternal, performance is.... better. Actually on main, there seem to be some kind of leak in GPU-AV buffer validation code. Maybe something happening in the GPU. Frame time keeps increasing over time. It does not happen on this branch. And I checked with some custom logging that validation was actually performed. Since this branch fixes this "leak", I did not investigate it

@jeremyg-lunarg
Copy link
Contributor

  • On Doom Eternal, performance is.... better. Actually on main, there seem to be some kind of leak in GPU-AV buffer validation code. Maybe something happening in the GPU. Frame time keeps increasing over time. It does not happen on this branch. And I checked with some custom logging that validation was actually performed. Since this branch fixes this "leak", I did not investigate it

That's disturbing! But yeah since you've fixed it not worth figuring out what was happening in the old code.

layers/gpu/resources/gpuav_subclasses.cpp Outdated Show resolved Hide resolved
tests/unit/gpu_av_indirect_buffer.cpp Outdated Show resolved Hide resolved
tests/unit/gpu_av_indirect_buffer.cpp Show resolved Hide resolved
char const *vsSource = R"glsl(
#version 450

layout(location=0) in vec3 pos;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

actually these are all the same shader. can it be factored out to a single string constant?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also want that, but then not having shader code inline makes it not obvious what location=0 is, and thus how to properly fill the corresponding Vulkan vertex binding description. And having a helper a function that would basically spit out a valid graphics pipeline based on this single shader seems overkill?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

you could always name is something that makes it obvious, like kVertexShaderVec3Location0. But not a big deal either way

@arno-lunarg arno-lunarg force-pushed the jeremyg-issue-2492-arno-takeover branch from 3879aec to 217afab Compare September 12, 2024 09:59
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 254572.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 17445 running.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 17445 failed.

@arno-lunarg arno-lunarg force-pushed the jeremyg-issue-2492-arno-takeover branch from 217afab to af79e52 Compare September 12, 2024 15:51
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 254858.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 17449 running.

@arno-lunarg arno-lunarg force-pushed the jeremyg-issue-2492-arno-takeover branch from af79e52 to 95a8be8 Compare September 12, 2024 16:01
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 254886.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 17451 running.

jeremyg-lunarg and others added 2 commits September 12, 2024 18:13
Fixes KhronosGroup#2492

Completely rework gpuav_draw, to make it easier to add validation,
and debug existing one.
When adding validation for some VkBuffer, the 2 more importants parts
are 1) creating the validation shader and 2) have an informative error
message. gpuav_draw.cpp has been redesigned around that idea.
Each shader is in charge of its small validation, no uber shader trying
to cater to all needs. Doing that ends up being a mess, with the need
to correctly pipe data to the validation shader.
Validation shaders are now more fine grained.
The C++ has been reworked around some template to streamline the
process of adding validation. The general idea is the following:
1) Declase a new "shader" struct representing validation shader
bindings and push contants
2) fill that struct
3) bind this struct and the validation pipeline
4) dispatch the appropriate number of draws

About 4): we used to dispatch one draw, and do for loop in the shader
to scan buffer elements. GPU are not good at that, instead a proper
amount of draws should be dispatched, each in charge of doing it's
small validation logic.
@arno-lunarg arno-lunarg force-pushed the jeremyg-issue-2492-arno-takeover branch from 95a8be8 to e566158 Compare September 12, 2024 16:13
@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build queued with queue ID 254915.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 17452 running.

@ci-tester-lunarg
Copy link
Collaborator

CI Vulkan-ValidationLayers build # 17452 passed.

DispatchGetPhysicalDeviceFeatures(physicalDevice, &physical_device_features);
if (physical_device_features.shaderInt64) {
if (auto enabled_features = const_cast<VkPhysicalDeviceFeatures*>(modified_create_info->pEnabledFeatures)) {
enabled_features->shaderInt64 = VK_TRUE;
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

will mark as a TODO (For me) after, but we need to cleanup how we do this, like we are not warning users if we are setting things to true under them

DispatchCmdSetFrontFace(VkHandle(), dynamic_state_value.front_face);
}
} break;
default:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there are a LOT of dynamic states missing here, like VK_DYNAMIC_STATE_STENCIL_OP

@@ -216,6 +216,10 @@ class CommandBuffer : public RefcountedStateObject {
bool depth_test_enable;
// VK_DYNAMIC_STATE_DEPTH_BOUNDS_TEST_ENABLE
bool depth_bounds_test_enable;
// VK_DYNAMIC_STATE_DEPTH_COMPARE_OP
VkCompareOp depth_compare_op;
Copy link
Contributor

@spencer-lunarg spencer-lunarg Sep 12, 2024

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

so CommandBuffer is already by far the largest memory object we have, this is going to add a lot of memory overhead

I knowing we are still missing 50+ states for the RestoreDynamicState and my worry is we are going to really bloat this up

we either should

  1. Move these to the GPU-AV CommandBuffer state object
  2. We just serialize the vkCmdSet calls and replay them back

... please note there is a LOT of nasty dynamic state you have not touched yet

}

void RestorablePipelineState::Restore() const {
if (rendering_info_) {
DispatchCmdEndRendering(cb_state_.VkHandle());
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

discussing with driver people

  1. This will trash runtime speed on some drivers if we are going
vkCmdBeginRendering();
vkCmdDrawIndexed();
vkCmdDrawIndexed();
vkCmdDrawIndexed();
vkCmdDrawIndexed();
vkCmdDrawIndexed();
vkCmdEndRendering():

too

vkCmdBeginRendering();
vkCmdDrawIndexed();
vkCmdEndRendering():
vkCmdBeginRendering();
vkCmdDrawIndexed();
vkCmdEndRendering():
vkCmdBeginRendering();
vkCmdDrawIndexed();
vkCmdEndRendering():
vkCmdBeginRendering();
vkCmdDrawIndexed();
vkCmdEndRendering():
vkCmdBeginRendering();
vkCmdDrawIndexed();
vkCmdEndRendering():
  1. There might be dependency (via VK_KHR_dynamic_rendering_local_read) that are going to just blow up if we don't chain it for them

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

GPU-AV check index buffer out-of-bounds
4 participants