GetQueryPoolResults in core validation is slow #4
Comments
Comment by chrisforbes (MIGRATED) I can see ways to make the query tracking substantially faster. It's going to require significant rework, though. Since this has been sitting around since last August, how important is it?
Comment by karl-lunarg (MIGRATED) It is very important. The users reporting the problem originally have reiterated its importance pretty recently. We've been trying to work towards a workaround by implementing switches that could be used to turn off expensive checks like this one, but that effort may not be getting the priority it needs. These checks can slow down the app enough to change its behavior. So, yes, help needed here.
Issue by karl-lunarg (MIGRATED)
Friday Aug 05, 2016 at 20:24 GMT
Originally opened as KhronosGroup/Vulkan-LoaderAndValidationLayers#829
Just logging this as the result of some analysis from LunarXchange issue 305.
Some perf analysis shows that GetQueryPoolResults spends about 80% of its time creating the temporary unordered_map when the query pool is very large (e.g., thousands of queries). Yes, this isn't an issue for smaller pools, and creating the map certainly speeds up the subsequent validation operations. But this overhead can be too costly for some applications (see LunarXchange issue 305). We wonder if there is a way to avoid creating this map. Also, this general problem might be addressed in a future layers architecture where expensive tests can be disabled.
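To make the trade-off concrete, here is a minimal sketch of the two shapes the check could take. The issue doesn't show the layer's actual code, so everything here is an assumption for illustration: `InFlightCommandBuffer`, `CountInFlightViaTempMap`, and `CountInFlightViaRangeScan` are hypothetical names, and the real layer state is certainly more involved. The first function builds a temporary map of every in-flight query before looking anything up; the second skips the map and marks hits in a flat bitmap sized to the requested range.

```cpp
#include <cstddef>
#include <cstdint>
#include <unordered_map>
#include <vector>

// Hypothetical stand-ins for layer state; not the validation layers' real types.
using CommandBufferId = std::uint64_t;

struct InFlightCommandBuffer {
    CommandBufferId id;
    std::vector<std::uint32_t> queries;  // query indices this command buffer touches
};

// Shape A: build a temporary map of every in-flight query, then probe it for
// each query in [firstQuery, firstQuery + queryCount). The map costs one node
// allocation per distinct query, which is what hurts for very large pools.
std::size_t CountInFlightViaTempMap(const std::vector<InFlightCommandBuffer>& cbs,
                                    std::uint32_t firstQuery,
                                    std::uint32_t queryCount) {
    std::unordered_map<std::uint32_t, std::vector<CommandBufferId>> inFlight;
    for (const auto& cb : cbs) {
        for (std::uint32_t q : cb.queries) {
            inFlight[q].push_back(cb.id);
        }
    }
    std::size_t hits = 0;
    for (std::uint32_t i = 0; i < queryCount; ++i) {
        if (inFlight.count(firstQuery + i) != 0) {
            ++hits;
        }
    }
    return hits;
}

// Shape B: no temporary map. Walk the in-flight state once and record hits in
// a bitmap covering only the requested range (a single contiguous allocation).
std::size_t CountInFlightViaRangeScan(const std::vector<InFlightCommandBuffer>& cbs,
                                      std::uint32_t firstQuery,
                                      std::uint32_t queryCount) {
    std::vector<bool> seen(queryCount, false);
    std::size_t hits = 0;
    for (const auto& cb : cbs) {
        for (std::uint32_t q : cb.queries) {
            if (q < firstQuery) {
                continue;  // below the requested range
            }
            const std::uint32_t offset = q - firstQuery;
            if (offset < queryCount && !seen[offset]) {
                seen[offset] = true;
                ++hits;
            }
        }
    }
    return hits;
}
```

Both functions return the number of distinct in-flight queries inside the requested range, so they can be swapped for one another when comparing timings. Whether something like shape B is enough for the real check depends on what the layer does with the per-query command buffer lists afterwards, which this sketch does not model.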
The issue can be reproduced fairly easily by modifying the occlusion_query sample app from the LunarG/VulkanSamples repo. Change the queryCount argument in vkCmdResetQueryPool to a large number like 16,000. (This is a missing validation check, btw.) And comment out the call to vkQueueWaitIdle so that the CB with the queries is still in flight. The subsequent call to vkGetQueryPoolResults then runs slowly.
Note also that the implementation of std::unordered_map in VS 2013 has serious performance issues, which made the (implicit) free of the unordered_map take MUCH longer than desired. See http://stackoverflow.com/questions/21014822/very-slow-unordered-map-clearing. For one particular test, moving from VS 2013 to VS 2015 cut the cost of the free from 850 ms to 7 ms. Since unordered_map is used a lot in the layers, this implies that using VS 2015 might help the perf of the validation layers in a more general way.
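For anyone who wants to compare toolchains on their own machine, here is a rough sketch of how the build cost and the implicit free could be timed separately. It is not the test that produced the numbers above; `MapTimings` and `TimeUnorderedMapLifetime` are hypothetical names, and the key/value types are placeholders rather than what the layers store.

```cpp
#include <chrono>
#include <cstddef>
#include <cstdint>
#include <unordered_map>

// Hypothetical result type for this sketch.
struct MapTimings {
    double build_ms;      // time spent inserting entries
    double destroy_ms;    // time spent in the map's destructor (the implicit free)
    std::size_t entries;  // number of entries actually stored
};

// Fill an unordered_map with entryCount entries, then let it go out of scope,
// timing the two phases separately with a monotonic clock.
MapTimings TimeUnorderedMapLifetime(std::size_t entryCount) {
    using Clock = std::chrono::steady_clock;
    using Ms = std::chrono::duration<double, std::milli>;

    MapTimings t{};
    Clock::time_point destroyStart;
    const Clock::time_point buildStart = Clock::now();
    {
        std::unordered_map<std::uint32_t, std::uint64_t> m;
        for (std::size_t i = 0; i < entryCount; ++i) {
            m.emplace(static_cast<std::uint32_t>(i), static_cast<std::uint64_t>(i));
        }
        t.entries = m.size();
        t.build_ms = Ms(Clock::now() - buildStart).count();
        destroyStart = Clock::now();
    }  // m is destroyed here
    t.destroy_ms = Ms(Clock::now() - destroyStart).count();
    return t;
}
```

Calling `TimeUnorderedMapLifetime(16000)` under each compiler and build configuration gives a quick way to see whether the destructor dominates. Absolute numbers will vary a lot with debug vs. release settings and allocator behavior, so treat them as relative only.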