Ruler: err="no query peer reachable" with no details. #2020
Comments
Hi, this error message can mean two things:
Do you have any preceding error log lines?
Sorry, my mistake, I overlooked some filters in use (hiding other lines).
Well, the question
I'm deeply sorry, I accidentally picked an example from when the ruler got OOM-killed. :-/
Hi,
So, the log lines I'd like to present are as follows (just snippets to make it as concise as possible):
Please note the
My questions:
Hi, regarding the coupling of the logs: I'd recommend using some kind of distributed tracing. Thanos supports most of the providers, so that would definitely make things easier for you. I believe there is a plan to log the trace IDs, which would make it possible to couple those related logs.
Regarding the issue itself, I personally ran into the exact same problem, reported in #2022. It is a known issue; the interesting part is that it only started occurring now. There is an open PR which should mitigate this. Can you share what changes led to this? In my case it was an upgrade from Thanos 0.7.0 to 0.9.0, together with an upgrade of the Minio it uses.
Hi,
Well, Jaeger is in use here at the moment and it's not good (tweaking the sampling could probably help).
Good, that would be nice.
According to our remaining "historical" logs we experienced the
I'm having the same problem. When I configure Ruler to communicate with Query, it gives me:
I'm running Thanos 0.10.1 on Kubernetes version 1.16.3. Query DNS discovery is enabled on Ruler and Rule DNS discovery is enabled on Query. Query has gRPC client TLS data (cert, key, and CA) and Ruler is TLS-secured via an nginx ingress. Communication works perfectly fine if I tell Ruler to hit the Query HTTP endpoints, but universally fails if I try to get Ruler to communicate with it over gRPC.
Hi @AlexDHoffer,
Thanks for the clarity. For some reason I thought we could query that data over the gRPC port. Will fix.
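For anyone hitting the same confusion: as far as I understand it, Ruler evaluates rules against Query's Prometheus-compatible HTTP API, not its gRPC StoreAPI, so the address given to Ruler has to be the HTTP one. A minimal sketch of the URL that ends up being requested is below. The host name is hypothetical, `instantQueryURL` is an illustrative helper and not a Thanos function, and 10902 is assumed to be the HTTP port (the component default unless overridden).

```go
package main

import (
	"fmt"
	"net/url"
)

// instantQueryURL builds the HTTP instant-query URL for a querier reachable
// at hostPort. Illustrative helper only; it is not part of Thanos.
func instantQueryURL(hostPort, expr string) string {
	u := url.URL{
		Scheme:   "http",
		Host:     hostPort,
		Path:     "/api/v1/query",
		RawQuery: url.Values{"query": {expr}}.Encode(),
	}
	return u.String()
}

func main() {
	// Hypothetical in-cluster address; note the HTTP port, not the gRPC one.
	fmt.Println(instantQueryURL("thanos-query.example.svc:10902", "up == 0"))
}
```

If a plain HTTP GET against such a URL works from the Ruler pod, the address is a reasonable candidate for Ruler's `--query` setting.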
This issue/PR has been automatically marked as stale because it has not had recent activity. Please comment on status otherwise the issue will be closed in a week. Thank you for your contributions.
Thanos, Prometheus and Golang version used:
Thanos: 0.10.0
Prometheus: 2.15.2
Golang: 1.13.1
Object Storage Provider:
What happened:
From time to time we're experiencing a bunch of error messages from ruler as follows:
What you expected to happen:
Would it be possible to include more details in such an error?
At the very least: what was the reason, and which peer was involved?
I've noticed the commit 1a419c2#diff-8b6a7e1ae18dc0ef5f768a537dff89f5L772, however I'm not sure whether it solves my problem...
Thank you,
hawran
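To illustrate the kind of detail being asked for, here is a rough sketch of an error that names each peer and the reason it failed. This is not the Thanos implementation: `queryFirstReachable`, `peerError`, and the addresses are hypothetical, and the split between "no peers known" and "all peers failed" is an assumption about how such an error could be reported.

```go
package main

import (
	"errors"
	"fmt"
	"strings"
)

// peerError records why a single (hypothetical) query peer failed.
type peerError struct {
	Addr string
	Err  error
}

// queryFirstReachable tries each address in order and returns the first
// successful result. If every peer fails, the returned error lists each
// address together with its failure reason.
func queryFirstReachable(addrs []string, do func(addr string) (string, error)) (string, error) {
	if len(addrs) == 0 {
		return "", errors.New("no query peer reachable: no peers configured or discovered")
	}
	var failed []peerError
	for _, addr := range addrs {
		res, err := do(addr)
		if err == nil {
			return res, nil
		}
		failed = append(failed, peerError{Addr: addr, Err: err})
	}
	parts := make([]string, 0, len(failed))
	for _, f := range failed {
		parts = append(parts, fmt.Sprintf("%s: %v", f.Addr, f.Err))
	}
	return "", fmt.Errorf("no query peer reachable (%d tried): %s",
		len(failed), strings.Join(parts, "; "))
}

func main() {
	// Stand-in for an HTTP call to a querier; it always fails here.
	do := func(addr string) (string, error) {
		return "", fmt.Errorf("dial %s: connection refused", addr)
	}
	_, err := queryFirstReachable([]string{"query-0:10902", "query-1:10902"}, do)
	fmt.Println(err)
}
```

With a message shaped like this, a single log line would already answer "what reason, which peer?" without having to correlate it with preceding lines.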
How to reproduce it (as minimally and precisely as possible):
Full logs to relevant components:
Anything else we need to know: