Crash (segfault) when opening/creating project due to Vulkan debug utils [Mesa Haswell] #51955
I only noticed the difference in the Vulkan API version returned when I started writing up this issue but I had previously noticed the following recent commits that I thought could be connected due to the area of the code affected:
More debug info
I attempted to get some more useful Vulkan debug information but mostly just got annoyed with Vulkan debugging. :D :/ However, running under
Then:
Note: While there is a message that says |
Notes:
Broken:
Works:
Related:
|
AFAIK Haswell Vulkan support in MESA has never been completed, so I think the warning is still accurate. It's unlikely that Godot's Vulkan renderer would be usable in production on Haswell, though it's still worth investigating what change triggered this issue. |
As mentioned on twitter I was able to get the editor to run (
break VulkanContext::command_begin_label
Then, when it was hit the first time, I ran this:
set this->enabled_debug_utils = 0
disable 1
continue
The editor then ran without issue. I was able to add a mesh, light & camera & then play the scene[0] (after repeating the same steps for the new process). From a quick look at the source history it seems this code was introduced in 7323cba ("Add named resources and debug labels in RenderDoc"). Will investigate further... |
My curiosity is piqued by this change: 7323cba#diff-3c95ffea6d81c05adea361b3206db7c6f12ff45f6876e25395eab944afcef2f9L254
@@ -251,9 +252,8 @@ Error VulkanContext::_initialize_extensions() {
}
}
if (!strcmp(VK_EXT_DEBUG_UTILS_EXTENSION_NAME, instance_extensions[i].extensionName)) {
- if (use_validation_layers) {
- extension_names[enabled_extension_count++] = VK_EXT_DEBUG_UTILS_EXTENSION_NAME;
- }
+ extension_names[enabled_extension_count++] = VK_EXT_DEBUG_UTILS_EXTENSION_NAME;
+ enabled_debug_utils = true;
}
if (enabled_extension_count >= MAX_EXTENSIONS) {
free(instance_extensions);
@@ -436,7 +436,7 @@ Error VulkanContext::_create_physical_device() {
" extension.\n\nDo you have a compatible Vulkan installable client driver (ICD) installed?\n"
"vkCreateInstance Failure");
- if (use_validation_layers) {
+ if (enabled_debug_utils) {
// Setup VK_EXT_debug_utils function pointers always (we use them for
// debug labels and names).
CreateDebugUtilsMessengerEXT =
I think this is the specific commit ( https://github.com/godotengine/godot/tree/770a1d00a3b14c4abd1cd57c9621de93b24478aa
Potentially related: #45656
Unfortunately I've never been able to enable |
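To make the effect of the diff above concrete, here is a minimal sketch of the gating condition before and after 7323cba. It is only a model and does not use the Vulkan SDK: `Policy` and `debug_utils_requested` are hypothetical names invented for illustration. Only the extension name string and the two conditions are taken from the diff.

```cpp
#include <algorithm>
#include <string>
#include <vector>

// Hypothetical names; this only models the condition shown in the diff.
enum class Policy {
    OnlyWithValidationLayers, // behaviour before 7323cba
    WheneverAdvertised        // behaviour after 7323cba
};

// Returns true if VK_EXT_debug_utils would be added to the list of
// instance extensions to enable.
bool debug_utils_requested(const std::vector<std::string> &available,
                           bool use_validation_layers,
                           Policy policy) {
    // The string is the value of VK_EXT_DEBUG_UTILS_EXTENSION_NAME.
    const bool advertised =
        std::find(available.begin(), available.end(),
                  "VK_EXT_debug_utils") != available.end();
    if (!advertised) {
        return false; // never requested if the instance does not list it
    }
    if (policy == Policy::OnlyWithValidationLayers) {
        return use_validation_layers;
    }
    return true; // after the change: requested whenever it is listed
}
```

With validation layers off, the old policy never requests the extension, while the new policy requests it on any system that lists it. That would be consistent with the crash appearing only after this commit on a normal (non-validation) run.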
Additional notes...
Backtrace from immediately before crash:
gef> bt
#0 0x00007ffff2c2aaa3 in ?? () from /usr/lib/x86_64-linux-gnu/libvulkan.so.1
#1 0x00000000035e7439 in VulkanContext::command_begin_label (this=0xa1be640, p_command_buffer=0xa52c140, p_label_name=..., p_color=...) at drivers/vulkan/vulkan_context.cpp:2144
Which is within
gef> print (this->CmdBeginDebugUtilsLabelEXT)
$27 = (PFN_vkCmdBeginDebugUtilsLabelEXT) 0x7ffff2c2aaa0
gef> print *(this->CmdBeginDebugUtilsLabelEXT)
$26 = {void (VkCommandBuffer, const VkDebugUtilsLabelEXT *)} 0x7ffff2c2aaa0
gef> xinfo 0x00007ffff2c2aaa0
───────────────────────────────────────────────────────────────── xinfo: 0x7ffff2c2aaa0 ─────────────────────────────────────────────────────────────────
Page: 0x00007ffff2c1c000 → 0x00007ffff2c6a000 (size=0x4e000)
Permissions: r-x
Pathname: /usr/lib/x86_64-linux-gnu/libvulkan.so.1.1.70
Offset (from page): 0xeaa0
Inode: 10095548
Segment: .text (0x00007ffff2c29fd0-0x00007ffff2c596f9)
Offset (from segment): 0xad0
Crash occurs after retrieving value at
gef> disassemble 0x7ffff2c2aaa0,+32
Dump of assembler code from 0x7ffff2c2aaa0 to 0x7ffff2c2aac0:
0x00007ffff2c2aaa0: mov rax,QWORD PTR [rdi]
0x00007ffff2c2aaa3: jmp QWORD PTR [rax+0x640]
0x00007ffff2c2aaa9: nop DWORD PTR [rax+0x0]
0x00007ffff2c2aab0: mov rax,QWORD PTR [rdi]
0x00007ffff2c2aab3: jmp QWORD PTR [rax+0x648]
0x00007ffff2c2aab9: nop DWORD PTR [rax+0x0]
End of assembler dump.
Which, if I remember the ABI correctly, is derived from the command buffer argument:
[...]
gef> frame 1
[...]
gef> print p_command_buffer
$28 = (VkCommandBuffer) 0xa52c140
[TODO]
[...]
gef> frame 0
#0 0x00007ffff2c2aaa3 in ?? () from /usr/lib/x86_64-linux-gnu/libvulkan.so.1
gef> print $rdi
$33 = 0xa52c140
gef> disassemble $rip,+4
Dump of assembler code from 0x7ffff2c2aaa3 to 0x7ffff2c2aaa7:
=> 0x00007ffff2c2aaa3: jmp QWORD PTR [rax+0x640]
End of assembler dump.
gef> print $rax+0x640
$37 = 0xa415d60
gef> x/xg $rax+0x640
0xa415d60: 0x0000000000000000 |
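A minimal model of what the gdb session above suggests may help: the engine's `CmdBeginDebugUtilsLabelEXT` pointer is non-null (it points into libvulkan.so.1), but the code there loads a table pointer from the command buffer and jumps through a slot that holds zero. The sketch below reproduces that shape without the Vulkan SDK. All type and function names are hypothetical, and treating the `+0x640` slot as the debug-label entry is an assumption based on the backtrace.

```cpp
// Hypothetical stand-ins for the loader's internals; not real Vulkan types.
struct Label {
    const char *name;
};

struct CommandBuffer;
using BeginLabelFn = void (*)(CommandBuffer *, const Label *);

// Models the per-device dispatch table; the real slot sat at offset 0x640.
struct DispatchTable {
    BeginLabelFn begin_label;
};

// Models "mov rax, QWORD PTR [rdi]": the first word of the command buffer
// is a pointer to the dispatch table.
struct CommandBuffer {
    DispatchTable *dispatch;
};

enum class CallResult { Forwarded, SlotEmpty };

// The trampoline in the disassembly does an unconditional
// "jmp QWORD PTR [rax+0x640]". This checked variant shows the condition
// that would have had to hold for that jump to be safe.
CallResult trampoline_checked(CommandBuffer *cb, const Label *label) {
    BeginLabelFn target = cb->dispatch->begin_label;
    if (target == nullptr) {
        return CallResult::SlotEmpty; // the unchecked jump segfaults here
    }
    target(cb, label);
    return CallResult::Forwarded;
}

// Example target used to show the non-empty case.
int labels_seen = 0;
void count_label(CommandBuffer *, const Label *) {
    ++labels_seen;
}
```

The point of the model is that a null check on the engine-side function pointer would not catch this case, because the empty slot lives behind the loader's trampoline rather than in the pointer the engine holds.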
Looks like it may be a combination of 7323cba and our recent move to the Volk loader. My guess is the way we initialize the Relevant issue in Volk can probably help us find a solution: zeux/volk#59 |
Full backtrace from the binary with debug symbols...
gef> bt
#0 0x00007ffff2c2aaa3 in ?? () from /usr/lib/x86_64-linux-gnu/libvulkan.so.1
#1 0x00000000035e7439 in VulkanContext::command_begin_label (this=0xa1be640, p_command_buffer=0xa52c140, p_label_name=..., p_color=...) at drivers/vulkan/vulkan_context.cpp:2144
#2 0x000000000358e39b in RenderingDeviceVulkan::draw_command_begin_label (this=0xa413d00, p_label_name=..., p_color=...) at drivers/vulkan/rendering_device_vulkan.cpp:8493
#3 0x00000000057ef60e in RendererSceneRenderImplementation::RenderForwardClustered::_render_scene (this=0xa8e9fd0, p_render_data=0x7fffffffc4e0, p_default_bg_color=...) at servers/rendering/renderer_rd/forward_clustered/render_forward_clustered.cpp:1266
#4 0x0000000005741539 in RendererSceneRenderRD::render_scene (this=0xa8e9fd0, p_render_buffers=..., p_camera_data=0x7fffffffcc60, p_instances=..., p_lights=..., p_reflection_probes=..., p_voxel_gi_instances=..., p_decals=..., p_lightmaps=..., p_environment=..., p_camera_effects=..., p_shadow_atlas=..., p_occluder_debug_tex=..., p_reflection_atlas=..., p_reflection_probe=..., p_reflection_probe_pass=0xffffffff, p_screen_lod_threshold=0.0009765625, p_render_shadows=0xa5a0b08, p_render_shadow_count=0x0, p_render_sdfgi_regions=0xa5a8b10, p_render_sdfgi_region_count=0x0, p_sdfgi_update_data=0xa5a9050, r_render_info=0xbc6e090) at servers/rendering/renderer_rd/renderer_scene_render_rd.cpp:4172
#5 0x00000000058f5184 in RendererSceneCull::_render_scene (this=0xa59faf0, p_camera_data=0x7fffffffcc60, p_render_buffers=..., p_environment=..., p_force_camera_effects=..., p_visible_layers=0xfffff, p_scenario=..., p_viewport=..., p_shadow_atlas=..., p_reflection_probe=..., p_reflection_probe_pass=0xffffffff, p_screen_lod_threshold=0.0009765625, p_using_shadows=0x1, r_render_info=0xbc6e090) at servers/rendering/renderer_scene_cull.cpp:3091
#6 0x00000000058f0c52 in RendererSceneCull::render_camera (this=0xa59faf0, p_render_buffers=..., p_camera=..., p_scenario=..., p_viewport=..., p_viewport_size=..., p_screen_lod_threshold=0.0009765625, p_shadow_atlas=..., p_xr_interface=..., r_render_info=0xbc6e090) at servers/rendering/renderer_scene_cull.cpp:2436
#7 0x0000000005918d7b in RendererViewport::_draw_3d (this=0xa59fa50, p_viewport=0xbc6df90) at servers/rendering/renderer_viewport.cpp:98
#8 0x0000000005919359 in RendererViewport::_draw_viewport (this=0xa59fa50, p_viewport=0xbc6df90, p_view_count=0x1) at servers/rendering/renderer_viewport.cpp:151
#9 0x000000000591bc54 in RendererViewport::draw_viewports (this=0xa59fa50) at servers/rendering/renderer_viewport.cpp:583
#10 0x0000000005681ece in RenderingServerDefault::_draw (this=0xa56eca0, p_swap_buffers=0x1, frame_step=0.86606300000000003) at servers/rendering/rendering_server_default.cpp:94
#11 0x0000000005683b38 in RenderingServerDefault::draw (this=0xa56eca0, p_swap_buffers=0x1, frame_step=0.86606300000000003) at servers/rendering/rendering_server_default.cpp:376
#12 0x0000000002172a29 in Main::iteration () at main/main.cpp:2559
#13 0x0000000002132d30 in OS_LinuxBSD::run (this=0x7fffffffd770) at platform/linuxbsd/os_linuxbsd.cpp:342
#14 0x000000000212f434 in main (argc=0x5, argv=0x7fffffffdc58) at platform/linuxbsd/godot_linuxbsd.cpp:58
The command buffer appears to be via:
godot/drivers/vulkan/rendering_device_vulkan.cpp Lines 8492 to 8494 in 770a1d0
Where
godot/drivers/vulkan/rendering_device_vulkan.h Lines 1009 to 1014 in 770a1d0
(Side note: I noticed this PR (f20999f) also changed code around the area: f20999f#diff-3c95ffea6d81c05adea361b3206db7c6f12ff45f6876e25395eab944afcef2f9R1628 / #45672) (Unfortunately all the PR test builds from around this time have expired...) |
@clayjohn Thanks for taking a look. That timing seems consistent with the PR builds that work/don't work. [Edit: Add below...] Works:
Doesn't work:
|
Additional observations (based on
|
So, based on what I've observed, it seems that the issue does arise (at least in part) from not calling godot/drivers/vulkan/vulkan_context.cpp Lines 214 to 264 in 77721b3
It would be good to know why the equivalent of (Stab in the dark here, but could it be related to configuration of implicit layers: https://vulkan.lunarg.com/doc/view/1.2.189.0/linux/layer_configuration.html? Also, looking at that page suggests some possibilities for reducing the log spam which I might look into.)
Where to from here?
Potential future actions to take from here:
|
In terms of the content of the half gigabyte
[Edit: The link (with a (semi-)working id fragment without a closing parenthesis) is: https://www.khronos.org/registry/vulkan/specs/1.0-extensions/xhtml/vkspec.html#resources-bufferimagegranularity] |
FWIW, after setting a breakpoint on
Also, at one point I noticed the following warning:
|
Device vs driver Vulkan version support
Okay, this is interesting, it turns out the warning message is via the parameter validation layer and
On this laptop the output of
And
So, as I understand it, the GPU is capable of supporting 1.2.131 but the validation layers are only from 1.1.70 so don't recognize the higher version number.
Results with Vulkan SDK version
|
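For reference, the version comparison discussed above (device at 1.2.131, validation layers at 1.1.70) can be sketched as follows. The bit layout mirrors the classic `VK_MAKE_VERSION` / `VK_VERSION_MAJOR` / `VK_VERSION_MINOR` / `VK_VERSION_PATCH` macros, but the functions are re-implemented here so the example compiles without the Vulkan headers. `layers_older_than_device` is a hypothetical helper, not something from Godot or the SDK.

```cpp
#include <cstdint>

// Same packing as VK_MAKE_VERSION: major in the top bits, then 10 bits of
// minor, then 12 bits of patch.
constexpr std::uint32_t make_version(std::uint32_t major,
                                     std::uint32_t minor,
                                     std::uint32_t patch) {
    return (major << 22) | (minor << 12) | patch;
}

constexpr std::uint32_t version_major(std::uint32_t v) { return v >> 22; }
constexpr std::uint32_t version_minor(std::uint32_t v) { return (v >> 12) & 0x3ffu; }
constexpr std::uint32_t version_patch(std::uint32_t v) { return v & 0xfffu; }

// Hypothetical helper: true when the layers predate the API version the
// device reports, which is the mismatch described in the comment above.
constexpr bool layers_older_than_device(std::uint32_t layer_version,
                                        std::uint32_t device_version) {
    return layer_version < device_version;
}

// Packed versions compare correctly as plain integers.
static_assert(make_version(1, 1, 70) < make_version(1, 2, 131),
              "1.1.70 sorts below 1.2.131");
```

Because the fields are packed most-significant first, an ordinary integer comparison is enough to detect that layers built for 1.1.70 are older than a device reporting 1.2.131.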
In the interest of "completeness" I note that the output with
$ ./godot-v4-linux-nightly-debug-symbols-2021-10-06 --vk-layers --path /<path>/vulk-play-gg
Results in output that includes:
With the error message repeated multiple times but AFAICT no other errors mentioned. |
After testing that the project ran okay, I then tested the more recent SDK setup with the editor (as there seems to be a difference between those two scenarios in general). Running the editor via this command:
$ ./godot-v4-linux-nightly-debug-symbols-2021-10-06 --path /<path>/vulk-play-gg --editor
Resulted in quite a long pause but then the editor eventually opened. Partial output included:
Editor as it appears after opening: |
Subsequent testing suggests that the |
And, as a final data point (for now :) ) if I rerun these tests (using the more recent 1.2.182 SDK driver) with the pre-alpha release this all started with ( (Lol, oh dear, but when I hovered over the "Scene" menu title & it gets highlighted the pre-alpha version outputs a bunch of validation errors which the nightly version with debug symbols doesn't--so hopefully it was just a temporary issue which has already been fixed, right..? :D ) |
It turns out that the 1.2.182 SDK driver also enables the win64 Godot v4 pre-alpha to (~somewhat/mostly[0]) run (both play & editor) under Wine again now too:
So, that's handy...
[0] The original 3D test scene I was using did crash but a simplified 3D scene worked okay.
$ wine --version
wine-6.0.1
$ wine /<path>/Godot_v4.0-dev.20211004_win64.exe --path "/<path>/.wine/dosdevices/z:/<path>/vulk-play-gg" --editor --verbose Node3DTest_recovered.tscn
INTEL-MESA: warning: Haswell Vulkan support is incomplete
TextServer: Added interface "ICU / HarfBuzz / Graphite"
Godot Engine v4.0.dev.20211004.official.2e8cba0bd - https://godotengine.org
INTEL-MESA: warning: Haswell Vulkan support is incomplete
Vulkan API 1.2.0 - Using Vulkan Device #0: Intel - Intel(R) HD Graphics 4400 (HSW GT2)
[...] |
@follower Can you (or anyone else) still reproduce this bug in Godot 4.0.beta5 or any later release? |
Godot version
4.0.dev.20210820.official.75697c0df
(via https://twitter.com/Akien/status/1428808742989148168 / https://downloads.tuxfamily.org/godotengine/testing/4.0/4.0-dev.20210820/)
System information
Linux, Elementary OS 5.1, Intel(R) HD Graphics 4400 (HSW GT2), Vulkan API 1.1.0 [0]
Issue description
Crash (segfault) when opening/creating any project.
Steps to reproduce
Opening just the project manager works okay:
Previous pre-built binary version that worked (via https://downloads.tuxfamily.org/godotengine/testing/4.0/4.0-dev.20210811/):
[0] Note: Previous working version reports Vulkan API 1.2.162 but latest version reports Vulkan API 1.1.0.
Minimal reproduction project
N/A