Liveness and readiness probes are failing with 503 in compactor and prometheus after upgrading to 0.17.0 #3466

dimm0 · 2020-11-18T23:11:30Z

Thanos, Prometheus and Golang version used:

Thanos 0.17.0, Prometheus 2.22.2

Object Storage Provider:
ceph S3

What happened:
After upgrading Thanos to 0.17.0, the liveness probes in the prometheus and thanos-compactor pods start failing with a 503 error, which causes the pods to restart.

What you expected to happen:
Pods keep running

How to reproduce it (as minimally and precisely as possible):
Not sure what triggered it; I simply upgraded a working installation.

Full logs to relevant components:
I couldn't find any errors in the logs: the container simply restarts without logging anything.

dimm0 · 2020-11-18T23:11:44Z

The problem goes away after reverting to 0.16.0.

kesor · 2020-11-24T11:10:58Z

When running thanos compact --http-address=0.0.0.0:10902, nothing listens on that port. I checked with netstat -na | grep LISTEN after exec'ing into the container running compact.
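For anyone who wants to repeat this check without netstat, here is a minimal sketch in Python. It assumes the probes target the /-/healthy and /-/ready paths on the HTTP address, and that the script runs somewhere that can reach the container (for example through a port-forward). The function names and the 127.0.0.1 host are illustrative, not part of Thanos.

```python
import socket
import urllib.error
import urllib.request


def port_is_listening(host, port, timeout=2.0):
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Connection refused, unreachable host, or timeout.
        return False


def probe_status(url, timeout=2.0):
    """Return the HTTP status code for url, or None if the connection fails."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return resp.status
    except urllib.error.HTTPError as err:
        # Non-2xx responses (such as 503) are raised as HTTPError.
        return err.code
    except (urllib.error.URLError, OSError):
        # Nothing is listening, or the request timed out.
        return None


if __name__ == "__main__":
    # Assumed target: the port from --http-address, reached via localhost.
    host, port = "127.0.0.1", 10902
    print("listening:", port_is_listening(host, port))
    for path in ("/-/healthy", "/-/ready"):
        print(path, probe_status(f"http://{host}:{port}{path}"))
```

A None result means the socket is not accepting connections at all, which matches the netstat observation, while a 503 means the server is up but reporting itself as not healthy or not ready.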

Kampe · 2020-11-30T22:11:21Z

I see the same issue with thanos store in v0.16; I did not see it in v0.15.

ahurtaud · 2020-12-02T18:48:29Z

We have the same issue, I think.
What is weird is that it works fine for some buckets, while for others the pod gets killed by the liveness probe.

bucket config: Scality S3

Fails on 0.17.1 and works on 0.16.0

ahurtaud · 2020-12-02T18:52:54Z

Dupe of #3395?

bwplotka · 2020-12-02T19:02:52Z

Yup, that looks like the same issue.

Thank you all for reporting this, let's fix and release 0.17.2 (:

Also commented on the mentioned issue. Closing this one as dup 🤗

Kampe · 2020-12-03T17:28:05Z

@bwplotka I am seeing the same behavior in thanos-store after upgrading to 0.17 as well. It is not the same component, but the pods end up redeploying after liveness checks fail for no apparent reason.

bwplotka closed this as completed Dec 2, 2020
