-
Notifications
You must be signed in to change notification settings - Fork 3.8k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
ui: update changefeed metrics page #101790
Conversation
So great! Should we take the opportunity to add some explanatory text next to each of the graphs? |
do we want to include number of currently running/paused changefeeds? |
We can close this one too! #97931 |
a041ee6
to
193d124
Compare
@shermanCRL I have tooltip text for "Max Checkpoint Latency", "Protected Timestamp Max Age", and "Backfill Pending Ranges", and I think the others are pretty self explanatory. @miretskiy Added at the very top |
193d124
to
c36c082
Compare
How difficult is it to change the scale on those graphs? Max protected age -- would be nice not to divide by 1e9; |
Age gets shown as Seconds right now, I looked into adding minutes/hours and it's doable just needs new frontend changes since they currently only support up to seconds. I'll make it show "1h 2m 3s" and see what they think about it. |
I think that is a good idea, there is a public facing runbook that details why we care about certain changefeed metrics we can either copy from or link to: https://github.com/cockroachlabs/cockroachdb-runbook-template/blob/main/monitoring-alerts/monitoring-dashboard-custom.md#essential-monitoring-metrics-for-changefeeds |
Added P50/P90/P99 histogram for Commit Latency as well as the 2nd graph right below "Changefeed Status" |
fabe1e3
to
661ced3
Compare
661ced3
to
191d065
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice! The latest screenshot example looks great. my comments are just around some docs/tests.
Reviewed 6 of 7 files at r1, all commit messages.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @HonoreDB and @samiskin)
pkg/ui/workspaces/cluster-ui/src/graphs/utils/domain.ts
line 224 at r1 (raw file):
} function ComputeDurationAxisDomain(extent: Extent): AxisDomain {
can you write a few tests for this?
pkg/ui/workspaces/db-console/src/views/shared/components/metricQuery/index.tsx
line 92 at r1 (raw file):
title?: string; rate?: boolean; scale?: number;
I know the existing code doesn't do this but would you mind adding documentation for this parameter? Seems like others might be motivated to use it.
191d065
to
42bb93d
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dhartunian and @HonoreDB)
pkg/ui/workspaces/cluster-ui/src/graphs/utils/domain.ts
line 224 at r1 (raw file):
Previously, dhartunian (David Hartunian) wrote…
can you write a few tests for this?
Added tests in a new pkg/ui/workspaces/cluster-ui/src/graphs/utils/domain.spec.ts
file 👍
pkg/ui/workspaces/db-console/src/views/shared/components/metricQuery/index.tsx
line 92 at r1 (raw file):
Previously, dhartunian (David Hartunian) wrote…
I know the existing code doesn't do this but would you mind adding documentation for this parameter? Seems like others might be motivated to use it.
Added comment 👍
3197d11
to
f018344
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Reviewable status: complete! 0 of 0 LGTMs obtained (waiting on @dhartunian, @HonoreDB, and @samiskin)
a discussion (no related file):
This is really great work :)
pkg/ccl/changefeedccl/metrics.go
line 386 at r3 (raw file):
metaChangefeedMaxBehindNanos = metric.Metadata{ Name: "changefeed.max_behind_nanos", Help: "Time since the changefeed last checkpointed its progress",
The comment implies that it's the time since the most recent checkpoint. It shows the time since the oldest checkpoint. It's also calculated across all changefeeds.
edit: I just saw the tooltip text. We can probably use the first sentence of that here.
pkg/ui/workspaces/db-console/src/views/cluster/util/graphs.ts
line 62 at r3 (raw file):
const scaledValues = result.datapoints.map(v => ({ ...v, value: v.value && v.value * (s.props.scale ?? 1),
The &&
is confusing. Is that right?
Previously, jayshrivastava (Jayant) wrote…
It’s kinda idiomatic truthy javascript. Maybe some parens would make it clearer. Or make it more clearly booleans instead of truthy if it’s confusing. |
Remind me @samiskin, is Commit Latency the commit-to-emit latency? Asking because, now that we have reduced latency, where is the best place to observe it? |
f018344
to
bf52b84
Compare
@shermanCRL Yep, its the time that we confirmed emitting of the message minus the MVCC timestamp. We should be able to view the reduction with this metric, most directly by running without batching (which would add its own latency) |
bors r+ |
🕐 Waiting for PR status (GitHub check) to be set, probably by CI. Bors will automatically try to run when all required PR statuses are set. |
bf52b84
to
474680b
Compare
Looks like this needs follow-up? |
474680b
to
169ed79
Compare
This PR does the following changes: - Added a `scale` parameter to `Metrics` so that I could support a duration metric that's being emitted in Seconds rather than Nanoseconds. - Added support for minutes and hours in Duration graphs - There is now a "Changefeed Status" graph to show counts of Running/Paused/Failed - There is now a "Commit Latency" graph to show P50,P90, and P99 commit latencies - Sink Byte Traffic is now Emitted Bytes - Sink Timings has been removed because I don't believe either of the metrics exist anymore - Max Changefeed Latency is now Max Checkpoint Latency - There is now a Backfill Pending Ranges graph - There is now a Oldest Protected Timestamp graph - There is now a Schema Registry Registrations graph Release note (ui change): The metrics page for changefeeds has been updated with new graphs to track backfill progress, protected timestamps age, and number of schema registry registrations.
169ed79
to
e3c9c3e
Compare
bors r+ |
Build succeeded: |
Resolves #98085
Resolves #98088
Resolves #99409
Resolves #100640
Resolves #97931
Resolves #98086
This PR does the following changes:
scale
parameter toMetrics
so that I could support a duration metric that's being emitted in Seconds rather than Nanoseconds. Would like frontend feedback on whether this is ok.Release note (ui change): The metrics page for changefeeds has been updated with new graphs to track backfill progress, protected timestamps age, and number of schema registry registrations.