Upgrade Vulkan Memory Allocator, use Volk on Android #51524

Merged (2 commits) on Aug 13, 2021

RandomShaper
Member

@RandomShaper RandomShaper commented Aug 11, 2021

This is about #48978.

My profiling has led me to the Vulkan Memory Allocator (VMA) as the culprit. Then I found this: GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator#178

What this PR does is take the new experimental, faster implementation from https://github.com/GPUOpen-LibrariesAndSDKs/VulkanMemoryAllocator/tree/feature-small-buffers and make a couple of minor changes needed to adapt to it.

I've done my measurements by using slightly modified versions of the MRP provided in the original issue, adapted to work in the current Godot 4 and to be editor scripts: ClassProfiling.zip
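
As a rough illustration of the measurement approach, here is a minimal Python sketch of a timing loop. It is not the GDScript from ClassProfiling.zip; it assumes the scripts time repeated instantiation of each class, and `profile_classes`, `format_report`, `Light` and `Heavy` are hypothetical names used only for illustration.

```python
import time


def profile_classes(classes, iterations=1000):
    """Time `iterations` constructions of each class.

    Returns a list of (class_name, elapsed_ms) pairs, slowest first.
    """
    results = []
    for cls in classes:
        start = time.perf_counter()
        for _ in range(iterations):
            cls()  # construct the object and discard it right away
        elapsed_ms = int((time.perf_counter() - start) * 1000)
        results.append((cls.__name__, elapsed_ms))
    results.sort(key=lambda item: item[1], reverse=True)
    return results


def format_report(results):
    """Render one aligned 'Name: N ms' line per measured class."""
    return [f"{(name + ':'):<40} {ms} ms" for name, ms in results]


class Light:
    """Hypothetical cheap class, stands in for a lightweight engine class."""


class Heavy:
    """Hypothetical costly class, stands in for one that allocates on creation."""

    def __init__(self):
        self.data = [0] * 10_000


if __name__ == "__main__":
    for line in format_report(profile_classes([Light, Heavy])):
        print(line)
```

Sorting slowest first mirrors the ordering of the listings in this comment, which makes the heaviest classes easy to spot.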

Now, the numbers (tested on Windows 10 x64 with locally built release binaries of the Godot editor, doing 1000 iterations):

World:                                   3424 miliseconds
ScriptCreateDialog:                      2540 miliseconds
EditorFileDialog:                        1224 miliseconds
FileDialog:                              844 miliseconds
EditorSpinSlider:                        835 miliseconds
ColorPicker:                             544 miliseconds
World2D:                                 506 miliseconds
EditorSettings:                          385 miliseconds
GraphEdit:                               165 miliseconds
EditorInspector:                         123 miliseconds
_Directory:                              111 miliseconds
ConfirmationDialog:                      89 miliseconds
CPUParticles2D:                          89 miliseconds
Tree:                                    82 miliseconds
TextEdit:                                77 miliseconds
AcceptDialog:                            61 miliseconds
Viewport:                                58 miliseconds
Sprite3D:                                53 miliseconds
AnimatedSprite3D:                        49 miliseconds
EditorFileSystemDirectory:               47 miliseconds
EditorResourcePicker:                    47 miliseconds
EditorImportPlugin:                      46 miliseconds
EditorScriptPicker:                      45 miliseconds
Particles:                               44 miliseconds
Particles2D:                             42 miliseconds
SpinBox:                                 41 miliseconds
AnimatedTexture:                         40 miliseconds
EditorSpatialGizmo:                      35 miliseconds
LineEdit:                                29 miliseconds
Animation:                               28 miliseconds
ScrollContainer:                         28 miliseconds
OpenSimplexNoise:                        27 miliseconds
AudioEffectEQ21:                         24 miliseconds
CPUParticles:                            24 miliseconds
PacketPeerUDP:                           23 miliseconds
Light2D:                                 22 miliseconds
AnimationNodeBlendSpace2D:               19 miliseconds
ARVRCamera:                              18 miliseconds
PacketPeerStream:                        17 miliseconds
ExternalTexture:                         17 miliseconds
  • Godot 4.0 (62047e4)
    • BEFORE THIS PR
ScriptCreateDialog:                      22653 miliseconds
FileDialog:                              6304 miliseconds
EditorFileDialog:                        5099 miliseconds
Sprite3D:                                4477 miliseconds
Tree:                                    2736 miliseconds
EditorPaths:                             2092 miliseconds
World3D:                                 1654 miliseconds
Window:                                  1270 miliseconds
SubViewport:                             1205 miliseconds
EditorCommandPalette:                    1050 miliseconds
PopupMenu:                               766 miliseconds
OptionButton:                            751 miliseconds
MenuButton:                              730 miliseconds
PopupPanel:                              726 miliseconds
Popup:                                   686 miliseconds
ColorPicker:                             617 miliseconds
EditorSettings:                          417 miliseconds
CPUParticles2D:                          392 miliseconds
ConfirmationDialog:                      285 miliseconds
AnimatedSprite3D:                        203 miliseconds
AcceptDialog:                            187 miliseconds
CodeEdit:                                151 miliseconds
GraphEdit:                               145 miliseconds
_Directory:                              105 miliseconds
AnimatedTexture:                         92 miliseconds
WorldEnvironment:                        91 miliseconds
TextEdit:                                76 miliseconds
Animation:                               64 miliseconds
EditorSpinSlider:                        39 miliseconds
SpinBox:                                 31 miliseconds
World2D:                                 28 miliseconds
EditorScriptPicker:                      26 miliseconds
EditorInspector:                         26 miliseconds
EditorResourcePicker:                    25 miliseconds
OpenSimplexNoise:                        23 miliseconds
AudioEffectEQ21:                         22 miliseconds
RichTextLabel:                           19 miliseconds
ScrollContainer:                         18 miliseconds
CheckBox:                                17 miliseconds
AnimationNodeBlendSpace2D:               14 miliseconds
  • AFTER THIS PR
ScriptCreateDialog:                      16423 miliseconds
FileDialog:                              4712 miliseconds
EditorFileDialog:                        4318 miliseconds
EditorPaths:                             2046 miliseconds
Tree:                                    1897 miliseconds
EditorCommandPalette:                    914 miliseconds
Window:                                  845 miliseconds
SubViewport:                             804 miliseconds
ColorPicker:                             617 miliseconds
PopupMenu:                               532 miliseconds
OptionButton:                            518 miliseconds
MenuButton:                              508 miliseconds
PopupPanel:                              507 miliseconds
Popup:                                   452 miliseconds
EditorSettings:                          369 miliseconds
ConfirmationDialog:                      251 miliseconds
AcceptDialog:                            178 miliseconds
CodeEdit:                                147 miliseconds
GraphEdit:                               143 miliseconds
_Directory:                              122 miliseconds
WorldEnvironment:                        103 miliseconds
TextEdit:                                70 miliseconds
AnimatedTexture:                         66 miliseconds
World3D:                                 63 miliseconds
EditorResourcePicker:                    43 miliseconds
EditorSpinSlider:                        39 miliseconds
OpenSimplexNoise:                        36 miliseconds
Animation:                               30 miliseconds
World2D:                                 30 miliseconds
SpinBox:                                 30 miliseconds
EditorScriptPicker:                      25 miliseconds
EditorInspector:                         25 miliseconds
AudioEffectEQ21:                         25 miliseconds
CheckBox:                                25 miliseconds
AnimationNodeBlendSpace2D:               24 miliseconds
AudioEffectEQ:                           20 miliseconds
ORMMaterial3D:                           20 miliseconds
ScrollContainer:                         18 miliseconds
AnimationNodeBlendSpace1D:               18 miliseconds
RichTextLabel:                           18 miliseconds

Let's see it in a more convenient way, comparing before and after between a set of heavy cases:

Class                   Before (ms)   After (ms)   Gain (Before - After)
AcceptDialog                    187          178                       9
AnimatedSprite3D                203           15                     188
AnimatedTexture                  92           66                      26
Animation                        64           30                      34
CPUParticles2D                  392           13                     379
ConfirmationDialog              285          251                      34
EditorFileDialog               5099         4318                     781
EditorPaths                    2092         2046                      46
FileDialog                     6304         4712                    1592
GraphEdit                       145          143                       2
MenuButton                      730          508                     222
ORMMaterial3D                    10           20                     -10
OptionButton                    751          518                     233
ParticlesMaterial                10            8                       2
PhysicalSkyMaterial               4           10                      -6
Popup                           686          452                     234
PopupMenu                       766          532                     234
PopupPanel                      726          507                     219
RichTextLabel                    19           18                       1
ScriptCreateDialog            22653        16423                    6230
ScrollContainer                  18           18                       0
Window                         1270          845                     425
World2D                          28           30                      -2
World3D                        1654           63                    1591
WorldEnvironment                 91          103                     -12
_Directory                      105          122                     -17

As you can see, some are slightly worsened, but I'd say that can be considered noise.
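
To put the absolute gains in proportion, a small Python sketch can recompute the Gain column and a relative gain for a few rows. The numbers are copied from the comparison above; `gain` and `relative_gain` are hypothetical helper names, and the relative figure is an extra way of reading the data rather than something the measurements report directly.

```python
# (class, before_ms, after_ms) rows copied from the before/after comparison
ROWS = [
    ("ScriptCreateDialog", 22653, 16423),
    ("FileDialog", 6304, 4712),
    ("World3D", 1654, 63),
    ("Window", 1270, 845),
    ("_Directory", 105, 122),
]


def gain(before_ms, after_ms):
    """Absolute gain in milliseconds (positive means faster after the PR)."""
    return before_ms - after_ms


def relative_gain(before_ms, after_ms):
    """Gain as a percentage of the 'before' time."""
    return 100.0 * (before_ms - after_ms) / before_ms


if __name__ == "__main__":
    for name, before_ms, after_ms in ROWS:
        print(f"{name:<20} {gain(before_ms, after_ms):>6} ms "
              f"{relative_gain(before_ms, after_ms):>7.1f} %")
```

Read this way, a large absolute gain on an already slow class and a near-total gain on a mid-sized one show up as different kinds of improvement, while the small negative entries stay within a few tens of milliseconds.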

If the work-in-progress new VMA is considered stable enough for the current development stage of Godot 4, this PR could be merged, knowing that we'll need to update the VMA when 3.0.0 is stable; or we can just wait until that happens, keeping the current worse performance in the meantime.

@mhilbrunner
Member

Needs to be rebased on master to fix CI (after #51523).

@RandomShaper RandomShaper force-pushed the faster_vma branch 2 times, most recently from 3d7bf5e to 9753e5e Compare August 11, 2021 21:37
@RandomShaper RandomShaper requested review from a team as code owners August 11, 2021 21:37
@akien-mga
Member

akien-mga commented Aug 11, 2021

Android fails:

drivers/vulkan/rendering_device_vulkan.cpp:8797:17: error: no member named 'instance' in 'VmaAllocatorCreateInfo'
                allocatorInfo.instance = p_context->get_instance();
                ~~~~~~~~~~~~~ ^

We use the Vulkan headers from the NDK, which might be too old :/

And getting newer headers would require updating the NDK and thus fixing #44055.

#51516 may help with the Android issue though, we'll see.

@RandomShaper RandomShaper force-pushed the faster_vma branch 2 times, most recently from a65574c to c1ce1fd Compare August 11, 2021 21:52
@bruvzg
Member

bruvzg commented Aug 12, 2021

> We use the Vulkan headers from the NDK, which might be too old :/

It's not just the NDK Vulkan headers; Android is using the VMA implementation for the NDK as well (in #51516, it does so only if volk is disabled).

@akien-mga
Member

akien-mga commented Aug 12, 2021

So based on #51516, actually Android uses its own version of VMA in thirdparty/vulkan/android/vk_mem_alloc.cpp. Maybe we just need to update that one to fix the build issue here, instead of the #ifdef.

See #37745 for context, CC @pouleyKetchoupp.

Review thread on thirdparty/README.md (outdated, resolved)
@RandomShaper
Copy link
Member Author

@akien-mga, I've addressed your feedback. Also, note that I've changed the commit message. I didn't quite like "VMA", which sounded to me more like Virtual Memory Address, and the word "Vulkan" may be useful for potential grepping.

@akien-mga akien-mga changed the title Use faster version of VMA Upgrade Vulkan Memory Allocator Aug 12, 2021
@akien-mga
Member

akien-mga commented Aug 12, 2021

So Android is still not happy, probably because we're still using the old NDK Vulkan headers by default. I don't remember why we do that, would need to check Git history. But maybe we can get away with not doing that and using our vendored copy of the headers?

i.e. something like this:

diff --git a/drivers/vulkan/SCsub b/drivers/vulkan/SCsub
index 3e0f5788c3..5db3f4a5b4 100644
--- a/drivers/vulkan/SCsub
+++ b/drivers/vulkan/SCsub
@@ -10,19 +10,8 @@ if env["use_volk"]:
     env.AppendUnique(CPPDEFINES=["USE_VOLK"])
     env.Prepend(CPPPATH=[thirdparty_volk_dir])
 
-if env["platform"] == "android" and not env["use_volk"]:
-    # Use NDK Vulkan headers
-    ndk_vulkan_dir = env["ANDROID_NDK_ROOT"] + "/sources/third_party/vulkan/src"
-    thirdparty_includes = [
-        ndk_vulkan_dir,
-        ndk_vulkan_dir + "/include",
-        ndk_vulkan_dir + "/layers",
-        ndk_vulkan_dir + "/layers/generated",
-    ]
-    env.Prepend(CPPPATH=thirdparty_includes)
-else:
-    # Use bundled Vulkan headers
-    env.Prepend(CPPPATH=[thirdparty_dir, thirdparty_dir + "/include"])
+# Use bundled Vulkan headers
+env.Prepend(CPPPATH=[thirdparty_dir, thirdparty_dir + "/include"])
 
 if env["platform"] == "android":
     env.AppendUnique(CPPDEFINES=["VK_USE_PLATFORM_ANDROID_KHR"])
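
The effect of that diff on the include path can be sketched as two plain Python functions, outside of SCons. This is only an illustration of the selection logic: the function names are hypothetical, and the `"thirdparty/vulkan"` value used for `thirdparty_dir` in the example is an assumed placeholder, since the diff does not show where that variable is set.

```python
def vulkan_include_paths_old(platform, use_volk, ndk_root, thirdparty_dir):
    """Include paths before the change: Android without volk uses NDK headers."""
    if platform == "android" and not use_volk:
        ndk_vulkan_dir = ndk_root + "/sources/third_party/vulkan/src"
        return [
            ndk_vulkan_dir,
            ndk_vulkan_dir + "/include",
            ndk_vulkan_dir + "/layers",
            ndk_vulkan_dir + "/layers/generated",
        ]
    return [thirdparty_dir, thirdparty_dir + "/include"]


def vulkan_include_paths_new(platform, use_volk, ndk_root, thirdparty_dir):
    """Include paths after the change: always the bundled headers."""
    return [thirdparty_dir, thirdparty_dir + "/include"]


if __name__ == "__main__":
    # "thirdparty/vulkan" and "/opt/ndk" are placeholders for illustration only.
    args = ("android", False, "/opt/ndk", "thirdparty/vulkan")
    print(vulkan_include_paths_old(*args))
    print(vulkan_include_paths_new(*args))
```

Only the Android build without volk sees a different result between the two functions; every other configuration already used the bundled headers.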

@bruvzg
Member

bruvzg commented Aug 12, 2021

> i.e. something like this:

It seems to build fine with the system loader and our headers. Not sure why it was done this way (Android Vulkan support was added like this); Android uses a custom loader, but the headers should be standard.

@akien-mga
Member

Apparently we were using the NDK files because upstream Vulkan Loader doesn't support Android formally: KhronosGroup/Vulkan-Loader#96.

But now that we switched to volk, this should no longer be needed, so we can remove that code.

@RandomShaper RandomShaper force-pushed the faster_vma branch 2 times, most recently from 0aeb212 to 68aa484 Compare August 12, 2021 19:17
@RandomShaper
Member Author

image
image

@RandomShaper
Member Author

Is it OK that without volk the VMA is restricted to Vulkan 1.0? Maybe that's not needed if targeting a more recent API level or using a newer NDK. Not familiar with the status of these things at this point.

@akien-mga
Member

akien-mga commented Aug 12, 2021

> Is it OK that without volk the VMA is restricted to Vulkan 1.0? Maybe that's not needed if targeting a more recent API level or using a newer NDK. Not familiar with the status of these things at this point.

Why is this needed? Also as it's implemented now it will affect all platforms, e.g. iOS (which defaults to use_volk=no), macOS with statically linked MoltenVK, or Linux if linking against system Vulkan (though there's little incentive for it now that we use volk).

FWIW, we currently target Vulkan 1.1.

The current PR builds fine for me for Android with env_thirdparty_vma.AppendUnique(CPPDEFINES=["VMA_VULKAN_VERSION=1000000"]) removed and use_volk=no.
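
For reference, the value in that define packs the Vulkan version as major * 1000000 + minor * 1000, so 1000000 means Vulkan 1.0 and 1001000 would mean Vulkan 1.1. That is my reading of the convention in the VMA header and is worth checking against the vendored `vk_mem_alloc.h`; the helper names below are hypothetical and not part of VMA.

```python
def encode_vma_vulkan_version(major, minor):
    """Pack a Vulkan major.minor version the way VMA_VULKAN_VERSION is written.

    Assumed convention: major * 1_000_000 + minor * 1_000.
    """
    return major * 1_000_000 + minor * 1_000


def decode_vma_vulkan_version(value):
    """Unpack a VMA_VULKAN_VERSION-style integer into (major, minor)."""
    major = value // 1_000_000
    minor = (value // 1_000) % 1_000
    return major, minor


if __name__ == "__main__":
    print(decode_vma_vulkan_version(1000000))  # the value used in this PR
    print(encode_vma_vulkan_version(1, 1))     # the version Godot targets
```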

@RandomShaper
Member Author

I've added that because otherwise I was getting linking errors on the CI about missing symbols, corresponding to Vulkan functions ending with 2. I checked and they belong to Vulkan 1.1.
However, you're right in that my change affects too many platforms and also it seems not to be needed even for Android.

I don't know what to do...

We no longer build the Vulkan loader, and volk lets us load it dynamically.
Roblox uses volk on Android so it should work well for us too.
@akien-mga
Member

We can use volk by default on Android, I'll make a separate PR to fix this up and you can rebase afterwards.

@akien-mga
Member

Well, slow CI will make this an overnight project, but once #51592 is merged, you can rebase this PR and add:

diff --git a/COPYRIGHT.txt b/COPYRIGHT.txt
index 5bd67960da..ef444721b2 100644
--- a/COPYRIGHT.txt
+++ b/COPYRIGHT.txt
@@ -453,7 +453,7 @@ License: Apache-2.0
 
 Files: ./thirdparty/vulkan/vk_mem_alloc.h
 Comment: Vulkan Memory Allocator
-Copyright: 2017-2019, Advanced Micro Devices, Inc.
+Copyright: 2017-2021, Advanced Micro Devices, Inc.
 License: Expat
 
 Files: ./thirdparty/wslay/
diff --git a/drivers/vulkan/SCsub b/drivers/vulkan/SCsub
index ab45863f5b..8fe75367a8 100644
--- a/drivers/vulkan/SCsub
+++ b/drivers/vulkan/SCsub
@@ -36,6 +36,10 @@ if env["use_volk"]:
 
     thirdparty_sources_volk = [thirdparty_volk_dir + "/volk.c"]
     env_thirdparty_volk.add_source_files(thirdparty_obj, thirdparty_sources_volk)
+elif env["platform"] == "android":
+    # Our current NDK version only provides old Vulkan headers,
+    # so we have to limit VMA.
+    env_thirdparty_vma.AppendUnique(CPPDEFINES=["VMA_VULKAN_VERSION=1000000"])
 
 env_thirdparty_vma.add_source_files(thirdparty_obj, thirdparty_sources_vma)
 

And this should be good to merge.
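
The branch added to `drivers/vulkan/SCsub` in that diff can be sketched as a plain Python function to show which configurations end up with the restricted VMA. `vma_extra_defines` is a hypothetical name; the real script calls `AppendUnique` on `env_thirdparty_vma` instead of returning a list.

```python
def vma_extra_defines(platform, use_volk):
    """Extra CPPDEFINES for the VMA build, mirroring the if/elif in the diff."""
    if use_volk:
        # volk loads Vulkan dynamically; no version limit is applied.
        return []
    if platform == "android":
        # Old NDK Vulkan headers: limit VMA to Vulkan 1.0.
        return ["VMA_VULKAN_VERSION=1000000"]
    return []


if __name__ == "__main__":
    for platform, use_volk in [("android", True), ("android", False), ("ios", False)]:
        print(platform, use_volk, vma_extra_defines(platform, use_volk))
```

Compared with defining `VMA_VULKAN_VERSION` unconditionally, this keeps the limit away from iOS, macOS with statically linked MoltenVK, and Linux builds that link against the system Vulkan.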

@RandomShaper
Member Author

I've rebased on top of your PR, which should save a little time if everything goes well.

@akien-mga akien-mga changed the title Upgrade Vulkan Memory Allocator Upgrade Vulkan Memory Allocator, use Volk on Android Aug 12, 2021
@RandomShaper
Member Author

Now this is finally relevant: #51524 (comment)

@akien-mga akien-mga merged commit 4c53669 into godotengine:master Aug 13, 2021
@akien-mga
Member

Thanks!
