
Receiver crashes occasionally with panic #5626

Closed
fpetkovski opened this issue Aug 22, 2022 · 5 comments · Fixed by #5655


@fpetkovski
Contributor

Thanos, Prometheus and Golang version used:
Thanos version main-2022-08-08-66700939

What happened:
A Thanos receiver can occasionally crash with the following error:

panic: duplicate metrics collector registration attempted

goroutine 3776247271 [running]:
github.com/prometheus/client_golang/prometheus.(*wrappingRegisterer).MustRegister(0xc2d4ad4b70, {0xc3203ac0d0?, 0x1, 0x0?})
	/go/pkg/mod/github.com/prometheus/[email protected]/prometheus/wrap.go:104 +0x151
github.com/prometheus/client_golang/prometheus/promauto.Factory.NewCounter({{0x2534930?, 0xc2d4ad4b70?}}, {{0x0, 0x0}, {0x0, 0x0}, {0x20c166b, 0x1e}, {0x20b06dd, 0x19}, ...})
	/go/pkg/mod/github.com/prometheus/[email protected]/prometheus/promauto/auto.go:265 +0xfd
github.com/thanos-io/thanos/pkg/shipper.newMetrics({0x2534930, 0xc2d4ad4b70}, 0x0)
	/app/pkg/shipper/shipper.go:45 +0x85
github.com/thanos-io/thanos/pkg/shipper.New({0x2522cc0, 0xc3c351a230}, {0x2534930?, 0xc2d4ad4b70?}, {0xc48d6e1590, 0x30}, {0x254cbe8?, 0xc00048cf30}, 0xc4256f08a0, {0x2087cd5, ...}, ...)
	/app/pkg/shipper/shipper.go:114 +0x69
github.com/thanos-io/thanos/pkg/receive.(*MultiTSDB).startTSDB(0xc0002af540, {0x2522cc0, 0xc3c351a230}, {0xc47156a4c0, 0x1c}, 0x0?)
	/app/pkg/receive/multitsdb.go:461 +0x507
github.com/thanos-io/thanos/pkg/receive.(*MultiTSDB).getOrLoadTenant.func1()
	/app/pkg/receive/multitsdb.go:508 +0x4d
created by github.com/thanos-io/thanos/pkg/receive.(*MultiTSDB).getOrLoadTenant
	/app/pkg/receive/multitsdb.go:507 +0x32b

What you expected to happen:
Registering duplicate metric collectors should either be handled gracefully, or should not happen at all.

How to reproduce it (as minimally and precisely as possible):
Haven't been able to reproduce this locally yet.

@GiedriusS
Member

Maybe somehow related to tenant pruning? 🤔

@fpetkovski
Contributor Author

Yes, possibly. I will look into it this week.

@fpetkovski
Contributor Author

fpetkovski commented Aug 22, 2022

I think you might be right @GiedriusS. We had a tenant on this receiver that was pruned. It then ended up on the receiver again, which caused the shipper to try to register metrics for that tenant a second time.

We use promauto everywhere, and that relies on MustRegister. So I wonder how this can be solved 🤔

CC @bwplotka or @kakkoyun you might have some ideas as client_golang maintainers.
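For context on why this panics at all: promauto.Factory.NewCounter hands the new collector to MustRegister on the underlying registerer, and client_golang's MustRegister panics as soon as a collector with the same fully qualified name already exists in that registry. A minimal standalone sketch of that behaviour (not the Thanos code path; the metric name is purely illustrative):

package main

import (
	"github.com/prometheus/client_golang/prometheus"
	"github.com/prometheus/client_golang/prometheus/promauto"
)

func main() {
	reg := prometheus.NewRegistry()

	opts := prometheus.CounterOpts{
		Name: "thanos_shipper_uploads_total", // illustrative name only
		Help: "Total number of uploaded blocks.",
	}

	// First registration succeeds.
	promauto.With(reg).NewCounter(opts)

	// Registering the same name on the same registry again panics with
	// "duplicate metrics collector registration attempted".
	promauto.With(reg).NewCounter(opts)
}

That second call is effectively what happens when shipper.newMetrics runs again for a tenant that was pruned and then came back on the same registerer.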

@kakkoyun
Member

I might be out of context here, but the way to work around this is to either wrap the registerer with a tenant label, or to unregister the metrics when the tenant goes away.
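A rough sketch of both workarounds (imports as in the sketch above), assuming a hypothetical baseReg registerer, tenantID string, and tenantCollectors slice that tracks what was registered for the tenant; none of these names are actual Thanos identifiers:

// Option 1: wrap the registerer with a tenant label so collectors registered
// for different tenants never collide on the shared registry.
func registererFor(baseReg prometheus.Registerer, tenantID string) prometheus.Registerer {
	return prometheus.WrapRegistererWith(prometheus.Labels{"tenant": tenantID}, baseReg)
}

// Option 2: unregister a pruned tenant's collectors so that a later
// re-registration does not trip the duplicate check.
func unregisterTenant(baseReg prometheus.Registerer, tenantCollectors []prometheus.Collector) {
	for _, c := range tenantCollectors {
		baseReg.Unregister(c)
	}
}

Note that the label wrap alone does not cover the case in this issue, where the same tenant comes back and produces identical wrapped descriptors again; that is the gap the UnRegisterer idea below is meant to close.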

@fpetkovski
Contributor Author

fpetkovski commented Aug 22, 2022

Wrapping the registerer is a cool idea! Maybe something like this already exists in Thanos?

type UnRegisterer struct {
	prometheus.Registerer
}

func (u *UnRegisterer) MustRegister(cs ...prometheus.Collector) {
	for _, c := range cs {
		if err := u.Register(c); err != nil {
			if _, ok := err.(prometheus.AlreadyRegisteredError); ok {
				// A collector with the same descriptors is already registered:
				// drop it and register the new instance in its place.
				if ok = u.Unregister(c); !ok {
					panic("unable to unregister existing collector")
				}
				u.Registerer.MustRegister(c)
				continue
			}
			panic(err)
		}
	}
}
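Since UnRegisterer still satisfies prometheus.Registerer, it can be handed straight to promauto. A hypothetical wiring sketch (baseReg, tenantID and the metric factory function are placeholders, not the actual Thanos call sites; imports as in the earlier sketch):

// newShipperMetricsFactory mirrors the per-tenant wrapping seen in the panic
// trace, with UnRegisterer layered on top so that re-registration for a
// returning tenant replaces the old collector instead of panicking.
func newShipperMetricsFactory(baseReg prometheus.Registerer, tenantID string) promauto.Factory {
	reg := prometheus.WrapRegistererWith(prometheus.Labels{"tenant": tenantID}, baseReg)
	return promauto.With(&UnRegisterer{Registerer: reg})
}

The shipper metrics would then be created through the returned factory, e.g. factory.NewCounter(...), so a second shipper.New for the same tenant becomes a replace instead of a panic.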
