Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

cmd/trace: the "GC pause" column in the user-defined regions table doesn't make sense #62443

Closed
win5do opened this issue Sep 4, 2023 · 2 comments
Assignees
Labels
compiler/runtime Issues related to the Go compiler and/or runtime.
Milestone

Comments

@win5do
Copy link

win5do commented Sep 4, 2023

What version of Go are you using (go version)?

$ go version
go version go1.21.0 darwin/amd64

Does this issue reproduce with the latest release?

Yes.

What operating system and processor architecture are you using (go env)?

go env Output
$ go env
GO111MODULE='on'
GOARCH='amd64'
GOBIN=''
GOCACHE='/Users/admin/Library/Caches/go-build'
GOENV='/Users/admin/Library/Application Support/go/env'
GOEXE=''
GOEXPERIMENT=''
GOFLAGS=''
GOHOSTARCH='amd64'
GOHOSTOS='darwin'
GOINSECURE=''
GOMODCACHE='/Users/admin/code/gopath/pkg/mod'
GONOPROXY=''
GONOSUMDB=''
GOOS='darwin'
GOPATH='/Users/admin/code/gopath'
GOPRIVATE='*.poizon.com'
GOPROXY='https://goproxy.cn,direct'
GOROOT='/usr/local/opt/go/libexec'
GOSUMDB='sum.golang.org'
GOTMPDIR=''
GOTOOLCHAIN='auto'
GOTOOLDIR='/usr/local/opt/go/libexec/pkg/tool/darwin_amd64'
GOVCS=''
GOVERSION='go1.21.0'
GCCGO='gccgo'
GOAMD64='v1'
AR='ar'
CC='cc'
CXX='c++'
CGO_ENABLED='1'
GOMOD='/dev/null'
GOWORK=''
CGO_CFLAGS='-O2 -g'
CGO_CPPFLAGS=''
CGO_CXXFLAGS='-O2 -g'
CGO_FFLAGS='-O2 -g'
CGO_LDFLAGS='-O2 -g'
PKG_CONFIG='pkg-config'
GOGCCFLAGS='-fPIC -arch x86_64 -m64 -pthread -fno-caret-diagnostics -Qunused-arguments -fmessage-length=0 -ffile-prefix-map=/var/folders/yq/ybt11j913mq3ppqx5tdl2phm0000gn/T/go-build920501220=/tmp/go-build -gno-record-gcc-switches -fno-common'

What did you do?

Use go tool trace's custom region to track function execution time.

Test Code: https://go.dev/play/p/idO-4ycK41K

$ GODEBUG='gctrace=1' go test -bench=BenchmarkGoTrace -benchtime=1000x -run=NONE -trace=trace.out

cpu: Intel(R) Core(TM) i5-1038NG7 CPU @ 2.00GHz
BenchmarkGoTrace-8          1000      22500692 ns/op

gc 2259 @26.566s 5%: 0.24+3.9+0.44 ms clock, 1.9+0.35/1.3/0+3.5 ms cpu, 128->156->47 MB, 145 MB goal, 0 MB stacks, 0 MB globals, 8 P
gc 2260 @26.574s 5%: 0.21+8.1+0.088 ms clock, 1.6+4.3/6.7/0+0.71 ms cpu, 85->100->85 MB, 95 MB goal, 0 MB stacks, 0 MB globals, 8 P
gc 2261 @26.595s 5%: 0.19+2.5+0.094 ms clock, 1.5+0.032/0.85/0+0.75 ms cpu, 149->163->47 MB, 171 MB goal, 0 MB stacks, 0 MB globals, 8 P


$ go tool trace trace.out

image
image
image

What did you expect to see?

The actual pause time of goroutine due to GC.

What did you see instead?

The GC pause time is extremely large and does not match the time output by GODEBUG='gctrace=1'.

"GC pasue" uses the GCTime field, which corresponds to the GCDone event. It seems that this is the total GC time, and it is calculated for each goroutine.

cmd/trace/goroutines.go#L298

<table class="details">
<tr>
<th onclick="reloadTable('sortby', 'SweepTime')" class="sweep-time"> GC sweeping</th>
<th onclick="reloadTable('sortby', 'GCTime')" class="pause-time"> GC pause</th>
</tr>
{{range .GList}}
  <tr>
    <td> {{prettyDuration .SweepTime}} {{percent .SweepTime .TotalTime}}</td>
    <td> {{prettyDuration .GCTime}} {{percent .GCTime .TotalTime}}</td>
  </tr>
{{end}}
</table>

internal/trace/goroutines.go#L275

const (
	EvGCStart           = 7  // GC start [timestamp, seq, stack id]
	EvGCDone            = 8  // GC done [timestamp]
	EvSTWStart          = 9  // GC mark termination start [timestamp, kind]
	EvSTWDone           = 10 // GC mark termination done [timestamp]
	EvGCSweepStart      = 11 // GC sweep start [timestamp, stack id]
	EvGCSweepDone       = 12 // GC sweep done [timestamp, swept, reclaimed]
)

func GoroutineStats() {
	// ...

	case EvGCStart:
		gcStartTime = ev.Ts
	case EvGCDone:
		for _, g := range gs {
			if g.EndTime != 0 {
				continue
			}
			if gcStartTime < g.CreationTime {
				g.GCTime += ev.Ts - g.CreationTime
			} else {
				g.GCTime += ev.Ts - gcStartTime
			}
		}
		gcStartTime = 0 // indicates gc is inactive.
}
@gopherbot gopherbot added the compiler/runtime Issues related to the Go compiler and/or runtime. label Sep 4, 2023
@rhysh
Copy link
Contributor

rhysh commented Sep 5, 2023

I agree that the title of "GC pause" doesn't entirely match the calculation for that column.

The commit message of 61f92ee, when the calculation was last edited (during the Go 1.11 cycle), says "Goroutine analysis reports the sum of all overlapping GC intervals as the GCTime of a goroutine."

It looks like the calculation would make sense when running with a STW / non-concurrent garbage collector — but both the execution tracer and the concurrent GC were introduced in Go 1.5.

The calculation for the "GC sweeping" column does seem to be focused on the actual slowdowns to this particular goroutine: https://github.com/golang/go/blob/go1.21.0/src/internal/trace/goroutines.go#L262

		case EvGCSweepStart:
			g := gs[ev.G]
			if g != nil {
				// Sweep can happen during GC on system goroutine.
				g.blockSweepTime = ev.Ts
			}
		case EvGCSweepDone:
			g := gs[ev.G]
			if g != nil && g.blockSweepTime != 0 {
				g.SweepTime += ev.Ts - g.blockSweepTime
				g.blockSweepTime = 0
			}

Maybe the calculation for the "GC pause" column should instead be the sum of the GC-related STW pauses, plus any time in GCAssist? Plus time in the assist queue, for unsatisfied assists?

There's already a column for "Scheduler wait". In my experience, high levels of GC assist can lead to scheduler wait time for other goroutines — but maybe listing a non-zero value in the "GC pause" column would be enough to hint at that additional contributor to CPU starvation.

CC @mknyszek (and @golang/runtime ?)

@mknyszek mknyszek changed the title cmd/trace: The time of "GC pause" column in the User-defined regions table is too large cmd/trace: the "GC pause" column in the user-defined regions table doesn't make sense Sep 6, 2023
@mknyszek mknyszek self-assigned this Sep 6, 2023
@mknyszek mknyszek added this to the Backlog milestone Sep 6, 2023
@gopherbot
Copy link
Contributor

Change https://go.dev/cl/541257 mentions this issue: cmd/trace/v2: add goroutine analysis pages

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
compiler/runtime Issues related to the Go compiler and/or runtime.
Projects
None yet
Development

No branches or pull requests

4 participants