This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

Correlate Synapse version under each Grafana graph #15662

Closed
MadLittleMods opened this issue May 23, 2023 · 2 comments · Fixed by #15674
Labels
A-Metrics metrics, measures, stuff we put in Prometheus T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. Z-Dev-Wishlist Makes developers' lives better, but doesn't have direct user impact

Comments

@MadLittleMods
Contributor

MadLittleMods commented May 23, 2023

Spawning from #15657 (comment),

What's possible:

We've rediscovered the synapse_build_info metric (introduced in #6005) which tracks the deployed Synapse version over time. I've added a graph that simply displays this which is already a big win:

But this can also be mixed into existing time-series for easier visual correlation (following this guide):

We can also add annotations, which can be switched on or off across all panels from a single toggle at the top of the dashboard. Go to Dashboard settings -> Annotations -> New query, set the Title to Deployed {{version}}, and use the following Prometheus query:

changes(process_start_time_seconds{instance="matrix.org",job=~"synapse"}[$bucket_size]) * on (instance, job) group_left(version) synapse_build_info{instance="matrix.org",job="synapse"}

(thanks to https://www.robustperception.io/exposing-the-software-version-to-prometheus/ for the join trick)
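
For anyone unfamiliar with the join trick, here is a rough Python sketch of what the query above computes. It is a simplified model, not how Prometheus evaluates it: series are keyed only by (instance, job), the start times are made up, and the version string "1.84.0" is a hypothetical placeholder.

```python
from typing import Dict, List, Tuple

Key = Tuple[str, str]  # (instance, job), the labels named in on(...)


def changes(samples: List[float]) -> int:
    """Count how often consecutive samples differ, like PromQL changes()."""
    return sum(1 for prev, cur in zip(samples, samples[1:]) if cur != prev)


def join_with_version(
    restart_counts: Dict[Key, int], build_info: Dict[Key, str]
) -> Dict[Tuple[str, str, str], int]:
    """Model `* on (instance, job) group_left(version)`.

    Each left-hand series is multiplied by the matching build-info series
    (whose value is always 1) and picks up its `version` label.
    """
    joined: Dict[Tuple[str, str, str], int] = {}
    for (instance, job), count in restart_counts.items():
        version = build_info.get((instance, job))
        if version is None:
            continue  # no match on the right-hand side, so the series is dropped
        joined[(instance, job, version)] = count * 1
    return joined


# Made-up samples of process_start_time_seconds inside one $bucket_size window:
# the value changes once, meaning the process restarted once.
start_times = {("matrix.org", "synapse"): [1684800000.0, 1684800000.0, 1684886400.0]}
# Hypothetical synapse_build_info series: the version lives in a label.
build_info = {("matrix.org", "synapse"): "1.84.0"}

restart_counts = {key: changes(samples) for key, samples in start_times.items()}
result = join_with_version(restart_counts, build_info)
print(result)
```

The multiplication leaves the restart count unchanged; the only purpose of the join is to carry the `version` label across so that it can be used in the annotation title.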

What do we want to do?

The question is, do we want this added to all graphs? A select few?

The annotations are toggle-able and will show across all panels which means we probably don't have to add the full thick bottom color to many graphs.

The matrix.org dashboard already has some of these changes if you want to see how it all feels: annotations, a new "Deployed Synapse versions over time" graph, and an updated "Up" graph.

Limitations

Unable to share query

Ideally, we would be able to share the query or even the time-series with other graphs, but it's not possible to use both our normal datasource and the -- Dashboard -- datasource to share the query at the same time, even when using -- Mixed --. This is a limitation of Grafana which is tracked by grafana/grafana#63866.

So we would need to manually/semi-manually add this to every graph (query and series override). That can be done; I just want to gauge interest before I plow ahead.
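
As a rough idea of what the semi-manual route could look like, here is a Python sketch that adds the version query to every time-series panel of an exported dashboard JSON. It assumes the usual exported layout (`panels`, `type`, `targets`, with rows nesting their own `panels`); the `refId` value, the legend text and the example dashboard are made up for illustration, and the series override would still need to be added separately.

```python
import copy
from typing import Any, Dict, Iterator

# Extra query to mix into each panel. The refId and legend are hypothetical.
VERSION_TARGET = {
    "expr": 'synapse_build_info{instance="matrix.org",job="synapse"}',
    "legendFormat": "version {{version}}",
    "refId": "SynapseVersion",
}
# Panel types treated as time series; heatmaps are deliberately left alone.
TIME_SERIES_TYPES = {"timeseries", "graph"}


def iter_panels(container: Dict[str, Any]) -> Iterator[Dict[str, Any]]:
    """Yield every panel, including panels nested inside row panels."""
    for panel in container.get("panels", []):
        yield panel
        yield from iter_panels(panel)


def add_version_target(dashboard: Dict[str, Any]) -> Dict[str, Any]:
    """Return a copy of the dashboard with the version query added where missing."""
    updated = copy.deepcopy(dashboard)
    for panel in iter_panels(updated):
        if panel.get("type") not in TIME_SERIES_TYPES:
            continue
        targets = panel.setdefault("targets", [])
        if any(t.get("refId") == VERSION_TARGET["refId"] for t in targets):
            continue  # already added on a previous run
        targets.append(dict(VERSION_TARGET))
    return updated


# Tiny made-up dashboard: one time-series panel, one heatmap, one row with a nested panel.
example = {
    "panels": [
        {"type": "timeseries", "title": "Up", "targets": [{"expr": "up", "refId": "A"}]},
        {"type": "heatmap", "title": "Some heatmap", "targets": []},
        {"type": "row", "panels": [{"type": "graph", "title": "Nested", "targets": []}]},
    ]
}
patched = add_version_target(example)
```

Running it a second time changes nothing, so it could be re-applied after new panels are added.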

Related docs: https://grafana.com/docs/grafana/latest/datasources/

Doesn't work with every visualization

Another limitation is that I can't figure out how to add this to certain visualizations like heatmaps. There doesn't seem to be a way to have multiple visualizations in the same panel or overlay things. Luckily, most of our things are time-series based anyway so we're mostly good.

Annotations don't show on heatmaps either -> grafana/grafana#13895

As an ugly workaround, we could place a slim panel showing the versions underneath the heatmap:

Dev notes

@MadLittleMods MadLittleMods added T-Enhancement New features, changes in functionality, improvements in performance, or user-facing enhancements. A-Metrics metrics, measures, stuff we put in Prometheus Z-Dev-Wishlist Makes developers' lives better, but doesn't have direct user impact labels May 23, 2023
@reivilibre
Contributor

I quite like the annotations since they're not too distracting but also align the grid vertically, which helps to see it easily.
How does this work for graphs with multiple workers? I use those a lot and I note we restart them in a staggered fashion so this could, if implemented poorly, add a lot of clutter to the graphs (but maybe the toggle is good enough?).

I wouldn't worry about the heatmaps personally; they are usually near another graph and you can correlate pretty easily if needed. Plus, I typically use the heatmaps to get an idea of where to home in and then look at a more traditional graph, which sounds like it would benefit from the annotations here.

@MadLittleMods
Contributor Author

> How does this work for graphs with multiple workers? I use those a lot and I note we restart them in a staggered fashion so this could, if implemented poorly, add a lot of clutter to the graphs (but maybe the toggle is good enough?).

Right now I just have it looking at the job="synapse" main process which should detect deploys pretty well?

If we want to track restarts of any worker, we could add a separate "restarts" annotation probably based off of job="$job" which would pick up the variable in our dashboard and show whatever you have selected.
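
To make the two annotations concrete, here is a hedged Python sketch that builds both definitions for the dashboard JSON. The field names (`name`, `enable`, `iconColor`, `expr`, `titleFormat`) are my assumption of what a Prometheus annotation looks like in an export and should be checked against a real one. The "Restarted {{job}}" title and the restarts query are hypothetical, derived from the deploy query by dropping the join and swapping in `job="$job"`.

```python
import json
from typing import Any, Dict


def changes_annotation(name: str, title_format: str, expr: str) -> Dict[str, Any]:
    """Build one annotation entry (assumed field names, see the note above)."""
    return {
        "name": name,
        "enable": True,
        "iconColor": "purple",
        "expr": expr,
        "titleFormat": title_format,
    }


# Deploy annotation: the query quoted earlier in this issue.
DEPLOY_EXPR = (
    'changes(process_start_time_seconds{instance="matrix.org",job=~"synapse"}[$bucket_size])'
    " * on (instance, job) group_left(version) "
    'synapse_build_info{instance="matrix.org",job="synapse"}'
)

# Hypothetical restarts annotation: follows whatever $job is selected in the dashboard.
RESTART_EXPR = (
    'changes(process_start_time_seconds{instance="matrix.org",job="$job"}[$bucket_size])'
)

annotations = {
    "list": [
        changes_annotation("Deploys", "Deployed {{version}}", DEPLOY_EXPR),
        changes_annotation("Restarts", "Restarted {{job}}", RESTART_EXPR),
    ]
}
print(json.dumps(annotations, indent=2))
```

Keeping these as two separate annotations means the noisier restarts one can be toggled off on its own when workers are restarted in a staggered fashion.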

MadLittleMods added a commit that referenced this issue May 25, 2023
Fix #15662

This manifests as purple lines that show up on all time series panels
that you can hover and see what version was deployed.

Also added a new "Deployed Synapse versions over time" panel
where the color block changes with each version. And mixed this
color block into the "Up" time series panel.
@MadLittleMods MadLittleMods self-assigned this May 25, 2023
MadLittleMods added a commit that referenced this issue May 31, 2023
Fix #15662

This manifests as purple lines that show up on all time series panels
that you can hover and see what version was deployed.

Also added a new "Deployed Synapse versions over time" panel
where the color block changes with each version. And mixed this
color block into the "Up" time series panel.

To get the Grafana dashboard JSON to copy here: use the **Share** icon at the top -> **Export** -> check the **Export for sharing externally** option -> **View JSON** or **Save to file**