Add Cassandra dashboard #63

Merged
merged 5 commits into from
Jan 18, 2022

Conversation

JonathanWamsley
Contributor

This pull request is contingent upon the corresponding metrics receiver pull request.

@draffensperger
Contributor

Hi Jonathan! I'm an engineer at Google who works more on the frontend UI side of Cloud Monitoring and have been looking at some of these sample dashboards as they come by.

As a quick question - what does the Latency Count metric mean? Is that the same as an operation count?

And would it make sense to specify units for the latency charts? The numbers in the sample dashboard looked large (maybe microseconds or something?). There is a unitOverride field that you can put in the chart JSON (https://cloud.google.com/monitoring/api/ref_v3/rest/v1/projects.dashboards#timeseriesquery) that could be e.g. "us" (for microseconds) or "ns" (for nanoseconds), per https://cloud.google.com/monitoring/mql/reference#units-of-measure
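
For illustration, here is a rough sketch of where `unitOverride` would sit inside a chart's `timeSeriesQuery`, built as a Python dict and serialized to JSON. The metric type, chart title, and the `latency_chart` helper are hypothetical placeholders and are not taken from this PR's dashboard.

```python
import json

# Hypothetical metric type, used only for illustration.
METRIC_TYPE = "workload.googleapis.com/example.cassandra.read_latency"


def latency_chart(title, metric_type, unit):
    """Return an xyChart widget dict whose query carries a unitOverride."""
    return {
        "title": title,
        "xyChart": {
            "dataSets": [
                {
                    "timeSeriesQuery": {
                        "timeSeriesFilter": {
                            "filter": f'metric.type="{metric_type}"',
                            "aggregation": {
                                "alignmentPeriod": "60s",
                                "perSeriesAligner": "ALIGN_MEAN",
                            },
                        },
                        # Display unit for the chart: "us" = microseconds.
                        "unitOverride": unit,
                    }
                }
            ]
        },
    }


chart = latency_chart("Read Latency", METRIC_TYPE, "us")
print(json.dumps(chart, indent=2))
```

Whether "us" or "ns" is right depends on what the receiver actually reports, so the unit string should be checked against the metric before it is hard-coded.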

@djaglowski
Contributor

@draffensperger, good catch. You're right about those latency counts actually being operation counts. I can see why the original author of the data model made the mistake though, as Cassandra defines these counts as part of a "Latency" MBean. My team has submitted a PR to fix the receiver.

Regarding units on the other latency charts, we'll plan on overriding them in this case. Apparently there is a known issue with Google's exporter that is causing the loss of units.

@draffensperger
Contributor

OK, makes sense on the bug in the exporter - I'm glad we have identified why the units weren't coming in correctly to begin with, as those are helpful when users create custom charts/alerts. Yeah, unitOverride is really just a bandaid for the dashboard...

Thanks for fixing the operation count thing! For those charts, I see that they are using ALIGN_DELTA, which would effectively give a per-sampled-period rate (like a per-minute rate). Would doing ALIGN_RATE make sense so that it is standardized to a per-second rate? And then would it make sense to name the chart something like "Operation Rate Avg." as it is showing the mean across the VMs?
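
To make the difference concrete, here is a small sketch of the arithmetic and of an aggregation block using ALIGN_RATE with a mean across series. The helper names and the 60s alignment period are assumptions for illustration, not values from this PR.

```python
def delta_to_rate(delta, alignment_period_seconds):
    """Convert a per-period delta (what ALIGN_DELTA shows) to a per-second rate."""
    return delta / alignment_period_seconds


def rate_aggregation(alignment_period="60s"):
    """Aggregation dict: per-second rate per series, then the mean across series."""
    return {
        "alignmentPeriod": alignment_period,
        "perSeriesAligner": "ALIGN_RATE",
        "crossSeriesReducer": "REDUCE_MEAN",
    }


# 1200 operations counted in a 60-second window is 20 operations per second.
print(delta_to_rate(1200, 60))  # 20.0
print(rate_aggregation())
```

With ALIGN_DELTA the plotted value scales with the alignment period, while the ALIGN_RATE value stays comparable if the period changes.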

@djaglowski
Contributor

Using ALIGN_RATE makes sense, and updating the chart names accordingly too. There are some pending upstream changes which are close to finalized, so these dashboard updates should wait until those are merged.

@JonathanWamsley marked this pull request as ready for review January 14, 2022 19:49
@xiangshen-dk merged commit 2f25298 into GoogleCloudPlatform:master Jan 18, 2022
@cpheps deleted the cassandra branch January 18, 2022 19:07
4 participants