gaby: monitor timed.Watcher progress #3

Open
rsc opened this issue Aug 15, 2024 · 3 comments

rsc commented Aug 15, 2024

Known timed.Watchers should advance their "latest" time in the database.
We should have a health check endpoint that reports whether the known
watchers are too old. This will help diagnose cron-based jobs getting stuck,
like when GitHub got stuck due to the non-issue issue event.

jba self-assigned this Aug 15, 2024
gopherbot pushed a commit that referenced this issue Aug 19, 2024
Add a package that sets up and wraps OpenTelemetry metrics.

Define one metric, a counter for the number of cron requests.
It's not very interesting, but it is useful for testing that
metrics are being exported properly.

For #3.

Change-Id: Ia99b618e97df25af10bdda3027fb0bce2bdcf701
Reviewed-on: https://go-review.googlesource.com/c/oscar/+/606815
LUCI-TryBot-Result: Go LUCI
Reviewed-by: Tatiana Bradley

jba commented Aug 21, 2024

I set up a metric for watcher latest times.

To be done: an alert based on the metric.

hyangah commented Oct 24, 2024

The current metric exports DBTime, which looks like a logical time.
What would be a reasonable alerting condition? (Staying on the same value for N hours of wall-clock time?)

jba commented Oct 24, 2024

Not making progress is certainly a necessary condition. But we don't know if that's because something's stuck, or because there's nothing new. With the Go issue tracker, no progress in an hour would almost certainly be a problem. But with discussions, for example, there might be no progress for days. So it's tricky.
