runtime: count globals toward GC trigger #19839
Comments
Maybe our global scanning code is suboptimal somehow? |
Found an (utterly obvious in retrospect) workaround. Instead of

    var cache [10000]obj.Prog

do

    var cache *[10000]obj.Prog

    func init() {
    	cache = new([10000]obj.Prog)
    }

With that change, performance returns to previous levels. We can revert that workaround once this issue is fixed.
Or because globals are roots, they are always (expensively) rescanned? I'll leave this for people who know the GC. |
Scanning globals is probably slightly more efficient than scanning the heap (the process is basically identical, but globals use a 1 bit bitmap instead of a 2 bit bitmap). But I don't think that's what's going on here. GC is triggered by the size of the heap, so if you move a large block of memory from the heap to a global, GC is going to be triggered more often, but it still has just as much work to do, so you spend more aggregate time in GC. |
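To put rough numbers on that argument, here is a back-of-the-envelope model in Go. It assumes the classic rule that the next cycle starts once the heap has grown by GOGC percent over the live heap, and it ignores the pacer starting a little early. The sizes (10 MB cache, 10 MB of other live heap, 1000 MB of transient allocation) and the helper names `nextTrigger` and `cyclesFor` are made up for illustration; they are not taken from the runtime or from the CL.

```go
package main

import "fmt"

// nextTrigger models the simplified GOGC rule: the next cycle starts once
// the heap reaches the live heap plus gogc percent of it.
// Assumption: globals and stacks are not counted at all.
func nextTrigger(liveHeapMB, gogc float64) float64 {
	return liveHeapMB * (1 + gogc/100)
}

// cyclesFor estimates how many GC cycles run while allocMB of short-lived
// data is allocated and liveHeapMB of heap stays reachable throughout.
func cyclesFor(allocMB, liveHeapMB, gogc float64) float64 {
	headroom := nextTrigger(liveHeapMB, gogc) - liveHeapMB
	return allocMB / headroom
}

func main() {
	const (
		gogc    = 100.0
		otherMB = 10.0   // hypothetical live heap besides the cache
		cacheMB = 10.0   // hypothetical size of the Prog cache
		allocMB = 1000.0 // hypothetical transient allocation
	)

	// Cache on the heap: it counts toward the live heap.
	onHeap := cyclesFor(allocMB, otherMB+cacheMB, gogc)
	// Cache in a global: still scanned every cycle, but not counted.
	inGlobal := cyclesFor(allocMB, otherMB, gogc)

	fmt.Printf("cache on heap:   %.0f cycles, each scanning %.0f MB\n", onHeap, otherMB+cacheMB)
	fmt.Printf("cache in global: %.0f cycles, each scanning %.0f MB\n", inGlobal, otherMB+cacheMB)
}
```

Under these assumptions the global variant runs 100 cycles instead of 50 while every cycle still scans the same 20 MB, so aggregate GC work roughly doubles.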
Add heap ballast to the compiler? :P |
Sigh. This is a pretty silly situation. |
Silliness aside, this also seems like the sort of thing that encourages abuse. Should globals perhaps be included in the trigger calculation? |
That's not a bad idea. In fact, part of the point of GOGC is to amortize the cost of scanning and if we don't count all of the scannable stuff, we're not getting the full amortization. Some stuff is just hard to count (e.g., stacks), but scannable globals would be easy. @RLH, thoughts? |
Counting bss, data, and stacks as part of GOGC is probably dictated by the
doctrine of least surprise. I am concerned about a situation where a
program has tens of gigabytes in a scannable global while using the heap for
transient data.
A partial solution would be to only count the globals that need scanning
and round up stacks to span granularity.
|
CL https://golang.org/cl/39713 mentions this issue. |
There's a surprising amount of variety in the drivers of compilation speed. It's helpful to have a variety of packages here. For example, archive/tar exhibits golang/go#19839 much more than the others.

Change-Id: If66b332d63427fb246305cb14cfee9ef450bcdcf
Reviewed-on: https://go-review.googlesource.com/39713
Reviewed-by: Matthew Dempsky
Yup, this is done. |
BTW, is the GC pacer redesign mentioned in #44167 still only available in experimental mode? |
It's been enabled by default since Go 1.18 was released (and the internal flag was removed in 1.19). |
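For reference, the Go GC guide gives the heap goal under the redesigned pacer as live heap + (live heap + GC roots) * GOGC / 100, where GC roots cover scannable globals and stacks. The sketch below just evaluates that formula. The function name `heapGoal` and the 10 MB figures are illustrative assumptions, not runtime internals.

```go
package main

import "fmt"

// heapGoal evaluates the formula from the Go GC guide (Go 1.18 and later):
//
//	goal = live heap + (live heap + GC roots) * GOGC / 100
//
// All sizes are in MB. This is a model of the documented formula, not the
// runtime's actual pacer code.
func heapGoal(liveHeapMB, rootsMB, gogc float64) float64 {
	return liveHeapMB + (liveHeapMB+rootsMB)*gogc/100
}

func main() {
	const gogc = 100.0

	// Hypothetical sizes: 10 MB cache plus 10 MB of other live heap.
	// Cache on the heap: 20 MB live heap, no large roots.
	onHeap := heapGoal(20, 0, gogc) - 20
	// Cache in a global: 10 MB live heap, 10 MB of scannable roots.
	inGlobal := heapGoal(10, 10, gogc) - 10

	fmt.Printf("headroom with cache on heap:   %.0f MB\n", onHeap)
	fmt.Printf("headroom with cache in global: %.0f MB\n", inGlobal)
}
```

With these numbers both layouts get 20 MB of allocation headroom per cycle, so in this model moving the cache into a global no longer makes collections more frequent.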
CL 39471 moves a large cache from the heap to a global. (A single Ctxt object is allocated at the beginning of compilation, so there's exactly one of these Prog caches, both before and after the CL.) The CL is just a simplified demo, but it mimics something real I want to do.
The CL causes GC to use lots more CPU when compiling package archive/tar. (It affects other packages as well, but archive/tar shows it most prominently.)
Note that the real time is about the same--the compiler itself is doing the same amount of work--but the CPU consumed goes up considerably. I'd expect it to be unchanged.
@aclements @RLH