Game crashing on swap_buffers - Seemingly random #67404

Open
Tracked by #71929
SnaveSutit opened this issue Oct 14, 2022 · 19 comments

Comments

@SnaveSutit

Godot version

4.0 Beta2

System information

OS: Windows 11 | CPU: AMD Ryzen 5 3600X 6-Core | GPU: AMD Radeon RX 5700 XT (Driver: 22.20.19.16-221003a-384125E-AMD-Software-Adrenalin-Edition) | Rendering Backend: Vulkan

Issue description

From what I can tell from my search, this exact issue has not been reported. Very similar issues exist, but none that describe this particular crash.

Description
Every so often while my game is running it will crash my GPU with this error:
image
Error source link

And AMD's crash detection will show up with this message:
image

However, it doesn't seem to be any fault of mine. Even if there's nothing going on in-game, or it's just a blank window containing a scene with a single node, it will randomly crash my GPU and show that error.
It does seem to be a 3D issue: I've been unable to reproduce it in a project without 3D rendering.

I've tried out a few fixes from issues similar to this one:

But so far none of them have worked. 😢

Steps to reproduce

As far as I can tell, this is an AMD GPU-specific issue, so reproducing it on other hardware will probably be impossible.

  • Create an empty project, and make a simple 3D main scene with one Node3D in it.
  • Run the project, and wait. After anywhere between 10 seconds and 8 minutes it should cause your GPU to throw an error and the project to freeze/crash.

Minimal reproduction project

No response

@SnaveSutit
Author

SnaveSutit commented Oct 14, 2022

I am willing to join a voice call on Discord to show off the error and share more information if needed.
SnaveSutit#0042

@SnaveSutit
Author

Recently got this in the log files as well, not sure if it helps

USER ERROR: Vulkan: Did not create swapchain successfully.
   at: prepare_buffers (drivers/vulkan/vulkan_context.cpp:2056) - Condition "err != VK_SUCCESS" is true. Breaking.
USER ERROR: Condition "err" is true. Returning: ERR_CANT_CREATE
   at: swap_buffers (drivers/vulkan/vulkan_context.cpp:2133) - Condition "err" is true. Returning: ERR_CANT_CREATE
USER ERROR: Condition "err" is true. Returning: ERR_CANT_CREATE
   at: _update_swap_chain (drivers/vulkan/vulkan_context.cpp:1746) - Condition "err" is true. Returning: ERR_CANT_CREATE
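
For anyone comparing logs across builds, it can help to tally which call sites report errors. The following is a minimal sketch, not a Godot tool: the helper name `count_error_sites` and the regular expression are hypothetical, and it assumes the `at:` lines always look like the ones quoted above.

```python
import re
from collections import Counter

# Matches the "at:" lines Godot prints, for example:
#   at: swap_buffers (drivers/vulkan/vulkan_context.cpp:2133)
AT_LINE = re.compile(r"at:\s+(?P<func>\w+)\s+\((?P<file>[^:()]+):(?P<line>\d+)\)")


def count_error_sites(log_text):
    """Count how often each (function, file, line) appears in 'at:' lines."""
    sites = Counter()
    for match in AT_LINE.finditer(log_text):
        sites[(match["func"], match["file"], int(match["line"]))] += 1
    return sites


if __name__ == "__main__":
    sample = (
        "USER ERROR: Vulkan: Did not create swapchain successfully.\n"
        "   at: prepare_buffers (drivers/vulkan/vulkan_context.cpp:2056)\n"
        'USER ERROR: Condition "err" is true. Returning: ERR_CANT_CREATE\n'
        "   at: swap_buffers (drivers/vulkan/vulkan_context.cpp:2133)\n"
    )
    for (func, path, line), count in count_error_sites(sample).most_common():
        print(f"{count}x {func} ({path}:{line})")
```

Grouping by call site makes it easier to see whether a crash starts in `prepare_buffers` or only surfaces in `swap_buffers`.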

@Flavelius
Contributor

Flavelius commented Jan 13, 2023

Same for me on beta11, nvidia gtx1060. It began happening after i played around with environment settings and skies. The editor even crashes when opening the scene containing that world environment.
Here's a stripped down project that also crashes on startup (this even locks up my whole pc for around 20s)
CrashProject.zip
(DefaultWorldMap.tscn under Shared/Art/Environments/Test is the offending scene, crashes/freezes reliably for me after having opened the project the second time)

@Calinou
Member

Calinou commented Jan 13, 2023

> Same for me on beta11, nvidia gtx1060. It began happening after i played around with environment settings and skies. The editor even crashes when opening the scene containing that world environment. Here's a stripped down project that also crashes on startup (this even locks up my whole pc for around 20s) CrashProject.zip (DefaultWorldMap.tscn under Shared/Art/Environments/Test is the offending scene, crashes/freezes reliably for me after having opened it the second time)

I can't reproduce this on 4.0.beta12 with the project you linked. I've tried opening all 3 scenes, closing them and opening them a second time:

image

Specs: Fedora 37, GeForce RTX 4090 (NVIDIA 525.60.11)

@Flavelius
Contributor

Flavelius commented Jan 14, 2023

It freezes reliably on my pc, even after rebooting (4.0 beta 11, Win 10, 16gb ram, nvidia gtx1060 6gb (driver version 516.59)).
Here's a phone capture (screen capture didn't work obviously):

out.mp4

This is right after opening the project from the project manager (the offending scene is set as main scene). It only started happening after i opened the project the second time though (not sure if that's relevant or just coincidence)

Edit: on beta12 it's the same, only that there are fewer errors printed to the console (these are all):
ERROR: Condition "err" is true. Returning: ERR_CANT_CREATE
at: swap_buffers (drivers/vulkan/vulkan_context.cpp:2299)
ERROR: Condition "err" is true. Returning: ERR_CANT_CREATE
at: swap_buffers (drivers/vulkan/vulkan_context.cpp:2299)

@Calinou
Member

Calinou commented Jan 14, 2023

While the Godot editor is closed, can you try editing the .tscn/.tres files with a text editor and remove entries such as ssao_enabled = true until it opens successfully?
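
If removing entries by hand gets tedious, the same idea can be scripted. This is only a sketch under assumptions: `strip_properties` and `strip_file` are hypothetical helpers, they handle single-line `name = value` assignments only, and `strip_file` copies the file to a `.bak` backup first. Run it only while the editor is closed.

```python
import re
import shutil
from pathlib import Path


def strip_properties(text, names):
    """Return text without lines that assign any of the given property names."""
    if not names:
        return text
    alternatives = "|".join(re.escape(name) for name in names)
    pattern = re.compile(r"^\s*(?:%s)\s*=" % alternatives)
    kept = [line for line in text.splitlines(keepends=True) if not pattern.match(line)]
    return "".join(kept)


def strip_file(path, names):
    """Back up a .tscn/.tres file, then rewrite it without the named properties."""
    path = Path(path)
    shutil.copy2(path, path.with_name(path.name + ".bak"))
    original = path.read_text(encoding="utf-8")
    path.write_text(strip_properties(original, names), encoding="utf-8")
```

For example, `strip_file("environment.tres", ["ssao_enabled"])` would drop the `ssao_enabled = true` line; the file name here is a placeholder. Removing one property at a time narrows down which setting triggers the freeze.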

@Flavelius
Contributor

Flavelius commented Jan 14, 2023

I'll try that later when I'm at my PC again. I already edited the environment resource directly (in-editor inspector) earlier with the scene closed though, where it didn't seem to have any effect, but that could also have been the result of some caching

@Flavelius
Contributor

Flavelius commented Jan 14, 2023

Ok, it happens when I add a Sky to the environment settings (I had removed all references to it from the .tres beforehand). Even when I try to set it in the editor, it freezes.
I'm now also using the newest NVIDIA drivers for my card (528.02), but that doesn't make a difference.

Edit: it does not freeze when I edit the environment settings (sky) while no WorldEnvironment is in the scene. When I then add one to the scene, it works just fine. But when I add the sky while a WorldEnvironment with that .tres assigned is active, it reliably freezes (and keeps freezing on every project reload from then on, just like before).

@Zireael07
Contributor

Have you changed any Sky settings from the default?

@Flavelius
Contributor

It happens right when I select a new sky in the dropdown for the sky field inside the environment settings; I can't even get to its sub-settings.

@Zireael07
Contributor

Zireael07 commented Jan 14, 2023

Once you've added a sky (maybe in a copy of the scene/project so that it doesn't freeze your main project) you should IIRC be able to edit its values in the scene file in any text editor - I'm wondering whether it's a general "fail to work with the shader" situation or if one of the settings is to blame (maybe the radiance size)

@Flavelius
Contributor

Flavelius commented Jan 14, 2023

I was able to create a sky using the steps mentioned above (deleting the WorldEnvironment, editing the environment resource, then creating the corresponding node). It seems the freeze happens when the sky is created with no sky material.
If I assign one (all of them work the same) and restart the project, it doesn't freeze, but when I delete the assigned material and restart the project, it freezes again. Interestingly, before restarting in this case, while a WorldEnvironment is still in the scene, I can freely delete and reassign the sky material without it freezing.
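
To check a project for the condition described here (a Sky with no sky material), the text resource files can be scanned. This is a rough sketch, not part of Godot: `skies_without_material` is a hypothetical helper, and it assumes the Godot 4 text format where a sky appears as a `[sub_resource type="Sky" id="..."]` section followed by a `sky_material = ...` line. Multi-line values that happen to start a line with `[` would confuse it.

```python
import re

# Section headers such as: [sub_resource type="Sky" id="Sky_abc"]
SECTION = re.compile(r"^\[(?P<kind>\w+)(?P<attrs>[^\]]*)\]\s*$")
TYPE_ATTR = re.compile(r'\btype="(?P<type>[^"]+)"')
ID_ATTR = re.compile(r'\bid="(?P<id>[^"]+)"')
MATERIAL_LINE = re.compile(r"^\s*sky_material\s*=")


def skies_without_material(text):
    """Return ids of Sky sub-resources that have no sky_material assignment."""
    missing = []
    current_id = None      # id of the Sky section being scanned, if any
    has_material = False
    for line in text.splitlines():
        header = SECTION.match(line)
        if header:
            # A new section starts: close out the previous Sky section.
            if current_id is not None and not has_material:
                missing.append(current_id)
            current_id, has_material = None, False
            type_match = TYPE_ATTR.search(header["attrs"])
            if header["kind"] == "sub_resource" and type_match and type_match["type"] == "Sky":
                id_match = ID_ATTR.search(header["attrs"])
                current_id = id_match["id"] if id_match else "<no id>"
        elif current_id is not None and MATERIAL_LINE.match(line):
            has_material = True
    # The file may end while still inside a Sky section.
    if current_id is not None and not has_material:
        missing.append(current_id)
    return missing
```

Calling it on the contents of each `.tres` and `.tscn` file in a project lists the sky ids worth inspecting before reopening the project in the editor.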

@Flavelius
Contributor

Flavelius commented Jan 16, 2023

It also freezes reliably with PanoramaSkyMaterial (with or without texture) assigned to the sky (but not with physical or procedural sky, nor with a physical sky that has a night sky texture assigned)

@Flavelius
Contributor

My main project and the example do not crash anymore in beta16 under the same scenarios.

@RenaKunisaki

Having the same problem with AMD on Artix Linux. Just having the editor open is enough; it likes to happen when I'm not even at the computer. It happens about once per day. Takes out the entire desktop session.

Apr 13 21:06:12 greymon kernel: traps: gdbus[10754] general protection fault ip:7fe1989db537 sp:7fe196ffc420 error:0 in libc.so.6[7fe198969000+15a000]
Apr 13 21:06:41 greymon kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -22!

@Calinou
Member

Calinou commented Apr 14, 2023

@RenaKunisaki Which graphics card model do you have, and which Vulkan driver are you using (RADV or AMDVLK)?

@RenaKunisaki

My system is ASUS ROG Strix G513QY, which has two GPUs: Radeon integrated with 512MB (its name seems to be just "AMD Radeon Graphics"), and Radeon RX 6800M with 12GB. By running DRI_PRIME=1 godot I can force it to run on the latter and that's when it's been crashing. So far using it without that option it hasn't crashed, but I'll report back if it does. I'm using AMDVLK.
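
For repeated comparisons between the two GPUs, the `DRI_PRIME=1 godot` invocation can be wrapped in a small launcher. This is a sketch only: `prime_env` and `launch` are hypothetical helpers, and it assumes a `godot` executable on `PATH` as in the command above.

```python
import os
import subprocess


def prime_env(gpu_index, base_env=None):
    """Copy an environment and set DRI_PRIME to the requested GPU index."""
    env = dict(os.environ if base_env is None else base_env)
    env["DRI_PRIME"] = str(gpu_index)
    return env


def launch(command, gpu_index):
    """Start a command (list of arguments) with DRI_PRIME set; returns the Popen handle."""
    return subprocess.Popen(command, env=prime_env(gpu_index))


if __name__ == "__main__":
    # Show the value that would be passed to the child process.
    print(prime_env(1)["DRI_PRIME"])
```

`launch(["godot"], 1)` would correspond to `DRI_PRIME=1 godot`, and `launch(["godot"], 0)` to the other GPU, which keeps the two test runs otherwise identical.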

@RenaKunisaki

RenaKunisaki commented Apr 18, 2023

Just had it happen again with the integrated GPU.

Apr 18 12:24:00 greymon kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* amdgpu_vm_validate_pt_bos() failed.
Apr 18 12:24:00 greymon kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -22!
Apr 18 12:24:08 greymon root[28496]: ACPI group/action undefined: button/up / UP
Apr 18 12:24:12 greymon kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* amdgpu_vm_validate_pt_bos() failed.
Apr 18 12:24:12 greymon kernel: [drm:amdgpu_cs_ioctl [amdgpu]] *ERROR* Failed to process the buffer list -22!
Apr 18 12:24:12 greymon kernel: pagefault_out_of_memory: 1309 callbacks suppressed
Apr 18 12:24:12 greymon kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Apr 18 12:24:12 greymon kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Apr 18 12:24:12 greymon kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Apr 18 12:24:12 greymon kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Apr 18 12:24:12 greymon kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Apr 18 12:24:12 greymon kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Apr 18 12:24:12 greymon kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Apr 18 12:24:12 greymon kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Apr 18 12:24:12 greymon kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Apr 18 12:24:12 greymon kernel: Huh VM_FAULT_OOM leaked out to the #PF handler. Retrying PF
Apr 18 12:30:07 greymon root[29265]: ACPI group/action undefined: button/up / UP
Apr 18 12:35:11 greymon root[29717]: ACPI group/action undefined: button/up / UP
Apr 18 12:36:22 greymon root[29835]: ACPI group/action undefined: button/up / UP
Apr 18 12:36:38 greymon kernel: godot.linuxbsd.[29853]: segfault at 1ab000001f9 ip 000001ab000001f9 sp 00007ffd40029f18 error 14 in godot.linuxbsd.template_release.x86_64[55fccb97f000+2a3000] likely on CPU 9 (core 4, socket 0)
Apr 18 12:36:38 greymon kernel: Code: Unable to access opcode bytes at 0x1ab000001cf.

Since my RAM usage is normally around 50% and there are no logs mentioning oom-killer, I assume either it was a VRAM allocation failure or godot somehow leaked 32GB within a few minutes. This crash actually seems to be accidentally exploiting CVE-2023-0047, at least in this instance. I guess this is probably a driver bug, but only godot seems to trigger it; I've never had it happen before I started using godot.

@darksylinc
Contributor

Hi!

Could you test this version to see if you can still repro the problem? Thanks.
