metrics, observer: purge backend metrics when backend is down for too long #585
What problem does this PR solve?
Issue Number: close #582
Problem Summary:
Backend metrics are never cleared after a backend goes down. In auto-scaling workloads, backends change frequently, so the number of metric series keeps growing.
What is changed and how it works:
status label for BackendStatusGauge. BackendStatusGauge is not used in Grafana, so it's fine.
Check List
Tests
Steps:
1. tiup playground v8.1.0 --db=2 --tiproxy=1 --tiproxy.version=v1.1.0 --tiflash=0
2. 127.0.0.1:4001 existed: curl -L 127.1:3080/metrics | grep -c 127.0.0.1:4001 returned 204
3. 127.0.0.1:4001 disappeared: curl -L 127.1:3080/metrics | grep -c 127.0.0.1:4001 returned 0
4. 127.0.0.1:4001 existed when it was up but disappeared after it was down.
Notable changes
Release note
Please refer to Release Notes Language Style Guide to write a quality release note.