Performance profiling #900
Replies: 5 comments
-
Also, do we want to do regression/performance tests only on rendering?
-
Do we want to monitor the binary size? We could also publish transformed data/graphs every time a new data set is committed to this repo.
-
Thanks for starting this discussion @CedricGuillemet! Some thoughts:
As far as where the data is stored, we could possibly log it to PowerBI and then create all kinds of charts and dashboards from that data. I'm sure there are other options too.
-
We can test shader performance. I don't think it's tested on the JS side, but it might be interesting to check for regressions there as well.
-
Looking at the Chrome profiling JSON/flame graph made me realize that two types of profiling data are available, and they are not of the same nature:
Because of that, resource checks can happen in ValidationTests with a reference JSON containing info per scene.
-
Starting the discussion here. An issue will be opened once rough specs are determined.
The idea is to measure performance (CPU, GPU, CPU RAM, GPU RAM, latencies, I/O, ...) between two PRs to avoid regressions and help find bottlenecks.
Related issue: #518
Use of Validation Tests
Validation tests provide a good set of scenes to test, and we can add more. But most tests render only a few frames, or just one: not enough for CPU/GPU profiling. We can use validation tests as a base, but the profiling script will be different, with shared parts.
Data output
Write a JSON file with metrics and compare it against a reference stored in the repo (just like VT reference images).
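As a rough sketch of what that comparison could look like (the metric names and the 10% tolerance below are made up for illustration, not an actual schema):

```python
def compare_metrics(current: dict, reference: dict, tolerance: float = 0.10):
    """Return the metrics that regressed beyond `tolerance` (relative increase
    over the reference value). Missing metrics are skipped rather than failed."""
    regressions = []
    for name, ref_value in reference.items():
        cur_value = current.get(name)
        if cur_value is None:
            continue  # metric not measured in this run
        if ref_value != 0 and (cur_value - ref_value) / ref_value > tolerance:
            regressions.append((name, ref_value, cur_value))
    return regressions

# Hypothetical metrics: only cpuTimeMs exceeds the 10% tolerance here.
reference = {"gpuTimeMs": 4.0, "cpuTimeMs": 2.0, "gpuMemoryMB": 128}
current = {"gpuTimeMs": 4.1, "cpuTimeMs": 2.6, "gpuMemoryMB": 128}
print(compare_metrics(current, reference))  # [('cpuTimeMs', 2.0, 2.6)]
```

A relative tolerance rather than exact equality seems necessary here, since timings will never reproduce bit-for-bit the way VT reference images do.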
bgfx provides statistics:
It exposes many things, like GPU timings and an estimate of the memory used:
https://bkaradzic.github.io/bgfx/bgfx.html#_CPPv4N4bgfx5StatsE
We can expose that from NativeEngine, the TestUtils plugin, or somewhere else?
Frame statistics
The number of frames rendered must be high enough to get solid data, and the first frames must not be taken into account (the driver may actually commit texture uploads on first use with D3D; not sure about other platforms).
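A minimal sketch of that aggregation, assuming per-frame timings in milliseconds and a hypothetical warm-up count of 10 frames (both numbers are placeholders, to be tuned):

```python
from statistics import mean, median

def summarize_frame_times(frame_times_ms, warmup_frames=10):
    """Aggregate per-frame timings, skipping the first frames where the driver
    may still be committing texture uploads, compiling shaders, etc."""
    stable = frame_times_ms[warmup_frames:]
    if not stable:
        raise ValueError("not enough frames after discarding warm-up")
    return {
        "frames": len(stable),
        "mean_ms": mean(stable),
        "median_ms": median(stable),
        "max_ms": max(stable),
    }

# The first frames are much slower and would skew the average if kept.
times = [30.0, 25.0] + [4.0] * 98
print(summarize_frame_times(times, warmup_frames=10))
```

Reporting the median alongside the mean also helps spot occasional frame spikes that an average would hide.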
Use of CI agents
It's not possible to get reliable timings on the CI: agents are different. Just check the build time delta between two PRs to see how volatile timings are.
A self-hosted station can be used to build and run the tests. We will have to do something similar for Apple platforms, as it's not possible to test them on CI.
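One way to illustrate the agent-variance problem (this is only an idea, not part of the proposal): run a small fixed calibration benchmark on each agent, and scale measured timings by how that agent compares to a reference machine. The function and numbers below are hypothetical.

```python
def normalized_timing(measured_ms, calibration_ms, reference_calibration_ms):
    """Scale a measured timing by the ratio between a reference machine's
    calibration-benchmark time and this agent's calibration-benchmark time."""
    return measured_ms * (reference_calibration_ms / calibration_ms)

# An agent that runs the calibration benchmark 2x slower than the reference
# machine has its raw 8 ms frame time normalized back to 4 ms.
print(normalized_timing(8.0, calibration_ms=200.0,
                        reference_calibration_ms=100.0))  # 4.0
```

This only corrects for gross throughput differences, not for different GPUs or drivers, which is why a dedicated self-hosted station is still the more reliable option.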
Code introspection and/or tools
We can hook performance checks in various places (number of calls, CPU timings), on the native or JS side. I'm not a big fan of adding macros or calls behind #ifdef everywhere in the code.
I'd prefer external tools like Event Tracing for Windows (ETW):
https://docs.microsoft.com/en-us/windows-hardware/test/wpt/windows-performance-recorder
When to test?
If 1,000 frames are rendered for each test, on one or more platforms, CI build time will skyrocket. On the other hand, being able to get statistics and catch regressions on a specific branch can be invaluable. Maybe a nightly GitHub Action, plus the ability to check any branch on demand, is the better option.
Proposal
=> enough to test performance between two branches locally