The `document-policy: js-profiling` header adds 3% overhead (#79)
@paulirish It would be helpful if you could share how the overhead was measured, so that others can attempt to replicate it and share their findings. AFAIK the overhead varies between applications, as it depends on the number of heap objects allocated in the isolate at that point in time (@nornagon optimized), but having more measurements here would be nice.
There was a similar idea (can't recall by whom) to enable profiling based on a longtask-occurrence heuristic.
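That heuristic could look something like the following sketch. This is purely illustrative wiring (the `armProfilerOnLongTask` name and the `sampleInterval`/`maxBufferSize` values are my own assumptions); `PerformanceObserver` with the `longtask` entry type and the `Profiler` constructor are the real APIs involved:

```javascript
// Hypothetical sketch: start the sampling profiler only once the page's first
// long task (>50ms) is observed, instead of pre-warming unconditionally.
function armProfilerOnLongTask() {
  let profiler = null;
  const observer = new PerformanceObserver(() => {
    // A long task occurred; the page is doing heavy JS work, so the
    // sampling overhead is presumably worth paying from here on.
    profiler = new Profiler({ sampleInterval: 10, maxBufferSize: 10000 });
    observer.disconnect();
    // Later: const trace = await profiler.stop();
  });
  observer.observe({ entryTypes: ['longtask'] });
  return observer;
}
```

The obvious downside, per the discussion above, is that a cold profiler start at that moment may itself be a long task.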
Having a dual mode makes sense to me. ^^ @acomminos - I'd love your thoughts.
Dual mode is an interesting idea, and I like the framings you've provided @paulirish! While interactivity is a big concern with enabling on-demand profiler usage, I'm also a bit worried about this potentially being a side channel for reading the overall size of the heap. If we could perform the warmup work asynchronously (and make it awaitable), I think that would make a dent in some of the cases where we might've otherwise incurred wasteful overhead through conditional invocation, while still granting us levers for side channel mitigation. UAs might also opt to perform that task at a time when the event loop is likely to be idle, helping interactivity as well. Curious about folks' thoughts on an API with these semantics (maybe not necessarily this shape):

```js
await Profiler.warmup(); // Engine kicks off EagerLogging on a (potentially lower priority) task.
const profiler = new Profiler(); // Does not block, we've warmed up! \o/
```
This would also be helpful for a 5th case that we've heard has been useful: random/idle tracing, with the goal of obtaining an aggregate view of overall page execution. This seems like it has similar considerations to cases 3 and 4 (in that the long task execution is less of a concern), but is a little less latency sensitive (i.e. we care even less if the profiler is slow to start up, since we don't have a specific interaction to report on). Thoughts?
Sure. I tried replicating my Speedometer results and now can't measure a significant difference between baseline and with the header set. However, I also realize that the actual work in Speedometer happens within an iframe, so it wasn't a great test case anyhow. I repeated things with the Octane 2.0 benchmark. (Admittedly, I can't expect a case that would demonstrate overhead more, given that it's full-throttle, non-stop JS.) Repro:
Results
Yes that matches up with what I've seen. I'd expect both the EagerLogging ongoing overhead and (cold) profiler startup cost to scale with heap size. From the CL description:
Huh. This landed well after the header for EagerLogging was introduced. I wonder if this work mitigates the need for it. EDIT (Jan 2024): There's a PR up to remove the response header requirement: #80
(Admittedly, this is implementation-specific, but.. this implementation quirk is the reason for this spec'd header..)
I heard some concerns about the overhead when `document-policy: js-profiling` is set (before `Profiler.start()` has been called). I confirmed ~3% overhead during the Speedometer benchmark.

Background
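For reference, this is the surface under discussion: the `Document-Policy: js-profiling` response header gates access, and the page then constructs a `Profiler`. The sketch below uses the real constructor and `stop()` from the JS Self-Profiling API, but the function name and the option values are just illustrative:

```javascript
// Sketch of the API flow being discussed. The page must be served with:
//   Document-Policy: js-profiling
// otherwise the Profiler constructor throws.
async function profileSomething(work) {
  // sampleInterval is in ms; values here are illustrative, not recommendations.
  const profiler = new Profiler({ sampleInterval: 10, maxBufferSize: 10000 });
  await work();
  const trace = await profiler.stop(); // trace: { samples, stacks, frames, resources }
  return trace;
}
```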
The CL that adds the header in Chromium says:
This pre-warming flips on V8 EagerLogging mode, which tracks execution contexts and scripts, initializes V8 profiler instances, and does quite a few other things. (Later, once `Profiler.start()` is called, these V8 profiler instances call `CpuProfiler::StartProfiling` and properly collect samples.)

I do know that a fresh, non-warmed `CpuProfiler::StartProfiling` cost can be kinda big. (DevTools now shows this as 'Profiling Overhead', but 'V8 Sampling Profiler Startup' would be a more accurate name.) The cost is ~50ms on a loaded theverge.com page. Meanwhile, if it's started at the beginning of pageload, it's <1ms.

I'm assuming that pre-warming with `EagerLogging` was selected to avoid the risk of this long task occurring at a sensitive time.

Overhead Measurement
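As a side note, the cold-start cost described above can be observed directly by timing profiler construction. This is a hypothetical sketch (my own helper name; not how the numbers in this issue were gathered):

```javascript
// Hypothetical sketch: measure how long starting the sampling profiler takes.
// On a pre-warmed (EagerLogging) page this should be ~1ms; cold, it can be
// tens of ms (e.g. ~50ms on a loaded page, per the observation above).
function timeProfilerStartup() {
  const t0 = performance.now();
  const profiler = new Profiler({ sampleInterval: 10, maxBufferSize: 10000 });
  const startupMs = performance.now() - t0;
  return { profiler, startupMs };
}
```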
After some quick benchmarking environment setup, I grabbed some numbers.

I ran Speedometer 2.1 a couple of times, in 3 different setups. One run's results look like this:
Baseline (no policy):
Arithmetic Mean: 127.8 ± 1.1 (0.83%)
Arithmetic Mean: 128 ± 1.7 (1.3%)

Policy enabled (yeah, just the response header):
Arithmetic Mean: 124 ± 1.5 (1.2%)
Arithmetic Mean: 125 ± 1.5 (1.2%)

Policy enabled + Profiler started:
Arithmetic Mean: 114 ± 1.2 (1.1%)
Arithmetic Mean: 117 ± 2.3 (2.0%)
This looks like ~3% overhead with EagerLogging, and ~10% overhead with the profiler started.
(Of course I'd love to see if someone's run similar tests with more rigor)
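For what it's worth, the relative overhead implied by the means above works out as follows (simple arithmetic over the numbers reported in this comment, averaging the two runs per setup):

```javascript
// Speedometer reports a score (higher is better), so overhead is the
// relative drop in the mean score versus baseline.
const baseline = (127.8 + 128) / 2; // 127.9
const headerOnly = (124 + 125) / 2; // 124.5
const started = (114 + 117) / 2;    // 115.5

const headerOverheadPct = (1 - headerOnly / baseline) * 100; // ≈ 2.7%
const startedOverheadPct = (1 - started / baseline) * 100;   // ≈ 9.7%

console.log(headerOverheadPct.toFixed(1), startedOverheadPct.toFixed(1)); // "2.7" "9.7"
```

i.e. roughly the ~3% and ~10% figures quoted above.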
Developer use cases
I see four styles of use of the Profiler:

`start()`. Makes sense.

EagerLogging pre-warm is moot for 1, a win for 2, and a loss for 3 and 4.
Do folks have thoughts on how we can help the case 3 & 4 developers?
@yoavweiss floated a possibility: we have two modes, the eager pre-warm mode and a lazy on-demand one. Seems attractive. Though I also wonder if there's another resolution to case 2 above.
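Purely as an illustration of the dual-mode idea (this is not proposed or spec'd syntax; Document Policy does allow parameterized values, but these particular values are hypothetical), the two modes could conceivably be distinguished at the header level:

```
Document-Policy: js-profiling=eager   # hypothetical: pre-warm EagerLogging at load
Document-Policy: js-profiling=lazy    # hypothetical: allow Profiler, no pre-warm
```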