Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add check state for mdadm arrays via node_md_state metric #1810

Merged
merged 2 commits into from
Sep 27, 2020

Conversation

frittentheke
Copy link
Contributor

@frittentheke frittentheke commented Aug 12, 2020

I added parsing of the checking state for mdstat from procfs (prometheus/procfs#321) and this PR will now expose this as node_md_state metric with label state=check.
This metric is useful to first of all see arrays which are currently in this state, but also to use as condition in i.e. in alerts on disk read rate which is certainly very high when doing a full read of all blocks.

The tests likely will fail until the PR for procfs (prometheus/procfs#321) is merged as there is no proper parsing of the md201 test case util then.

Signed-off-by: Christian Rohmann [email protected]

@SuperQ PTAL

@frittentheke frittentheke changed the title Expose metric for state=check for node_md_state Add check status for mdadm arrays via node_md_state metric Aug 12, 2020
@frittentheke frittentheke changed the title Add check status for mdadm arrays via node_md_state metric Add check state for mdadm arrays via node_md_state metric Aug 12, 2020
@frittentheke
Copy link
Contributor Author

@SuperQ @discordianfish with prometheus/procfs#321 merged already this PR now makes sense / is functional.

Could you maybe give it a glimpse and tell me if I need to make changes for this to be merged at some point?

@SuperQ
Copy link
Member

SuperQ commented Sep 21, 2020

There's another proposal to enhance mdstat parsing: prometheus/procfs#329

Maybe we should review that?

@frittentheke
Copy link
Contributor Author

There's another proposal to enhance mdstat parsing: prometheus/procfs#329

Maybe we should review that?

@SuperQ Do you see any reason to hold back on merging this PR in the meantime or is there something I should add to this PR as a result of the further additions of prometheus/procfs#329 then?

This PR here is currently only about adding another state an md can be in and there will certainly always be more things/metrics that could be added with yet another PR.

@SuperQ
Copy link
Member

SuperQ commented Sep 23, 2020

No, I guess this is fine without any more updates. But it looks like we need to cut a procfs release anyway to pull in the other change.

@SuperQ
Copy link
Member

SuperQ commented Sep 23, 2020

I've cut a new procfs release, and #1852 to merge it into node_exporter.

@SuperQ
Copy link
Member

SuperQ commented Sep 23, 2020

Would you please rebase this against head?

…te and a the new state=check labeled metric for all other md

Signed-off-by: Christian Rohmann <[email protected]>
@frittentheke
Copy link
Contributor Author

Would you please rebase this against head?

Sorry for the delay @SuperQ .. just did the rebase.

Copy link
Member

@SuperQ SuperQ left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@SuperQ SuperQ merged commit a3aaf63 into prometheus:master Sep 27, 2020
@SuperQ SuperQ mentioned this pull request Feb 5, 2021
SuperQ added a commit that referenced this pull request Feb 5, 2021
* Update Build
  - Update CircleCI orb.
  - Update CIrcleCI Machine image.
  - Use golang-builder 1.15.
* Update Go modules.
* Fixup fixtures for XFS bug.

Changes:
* [CHANGE] Improve filter flag names. #1743
* [CHANGE] Add btrfs and powersupplyclass to list of exporters enabled by default #1897
* [FEATURE] Add fibre channel collector #1786
* [FEATURE] Expose cpu bugs and flags as info metrics. #1788
* [FEATURE] Add network_route collector #1811
* [FEATURE] Add zoneinfo collector #1922
* [ENHANCEMENT] Add more InfiniBand counters #1694
* [ENHANCEMENT] Add flag to aggr ipvs metrics to avoid high cardinality metrics #1709
* [ENHANCEMENT] Adding backlog/current queue length to qdisc collector #1732
* [ENHANCEMENT] Include TCP OutRsts in netstat metrics #1733
* [ENHANCEMENT] Add pool size to entropy collector #1753
* [ENHANCEMENT] Remove CGO dependencies for OpenBSD amd64 #1774
* [ENHANCEMENT] bcache: add writeback_rate_debug stats #1658
* [ENHANCEMENT] Add check state for mdadm arrays via node_md_state metric #1810
* [ENHANCEMENT] Expose XFS inode statistics #1870
* [ENHANCEMENT] Expose zfs zpool state #1878
* [ENHANCEMENT] Added an ability to pass collector.supervisord.url via SUPERVISORD_URL environment variable #1947
* [BUGFIX] filesystem_freebsd: Fix label values #1728
* [BUGFIX] Fix various procfs parsing errors #1735
* [BUGFIX] Handle no data from powersupplyclass #1747
* [BUGFIX] udp_queues_linux.go: s/upd/udp/ in two error strings #1769
* [BUGFIX] Fix node_scrape_collector_success behaviour #1816
* [BUGFIX] Fix NodeRAIDDegraded to not use a string rule expressions #1827
* [BUGFIX] fix: node_md_disks state label from fail to failed #1862
* [BUGFIX] Handle EPERM for syscall in timex collector #1938
* [BUGFIX] bcache: fix typo #1943
* [BUGFIX] Fix XFS read/write stats (prometheus/procfs#343)

Signed-off-by: Ben Kochie <[email protected]>
SuperQ added a commit that referenced this pull request Feb 5, 2021
* Update Build
  - Update CircleCI orb.
  - Update CIrcleCI Machine image.
  - Use golang-builder 1.15.
* Update Go modules.
* Fixup fixtures for XFS bug.

NOTE: We have improved some of the flag naming conventions (PR #1743). The old names are
      deprecated and will be removed in 2.0. They will continue to work for backwards
      compatibility.

* [CHANGE] Improve filter flag names #1743
* [CHANGE] Add btrfs and powersupplyclass to list of exporters enabled by default #1897
* [FEATURE] Add fibre channel collector #1786
* [FEATURE] Expose cpu bugs and flags as info metrics. #1788
* [FEATURE] Add network_route collector #1811
* [FEATURE] Add zoneinfo collector #1922
* [ENHANCEMENT] Add more InfiniBand counters #1694
* [ENHANCEMENT] Add flag to aggr ipvs metrics to avoid high cardinality metrics #1709
* [ENHANCEMENT] Adding backlog/current queue length to qdisc collector #1732
* [ENHANCEMENT] Include TCP OutRsts in netstat metrics #1733
* [ENHANCEMENT] Add pool size to entropy collector #1753
* [ENHANCEMENT] Remove CGO dependencies for OpenBSD amd64 #1774
* [ENHANCEMENT] bcache: add writeback_rate_debug stats #1658
* [ENHANCEMENT] Add check state for mdadm arrays via node_md_state metric #1810
* [ENHANCEMENT] Expose XFS inode statistics #1870
* [ENHANCEMENT] Expose zfs zpool state #1878
* [ENHANCEMENT] Added an ability to pass collector.supervisord.url via SUPERVISORD_URL environment variable #1947
* [BUGFIX] filesystem_freebsd: Fix label values #1728
* [BUGFIX] Fix various procfs parsing errors #1735
* [BUGFIX] Handle no data from powersupplyclass #1747
* [BUGFIX] udp_queues_linux.go: change upd to udp in two error strings #1769
* [BUGFIX] Fix node_scrape_collector_success behaviour #1816
* [BUGFIX] Fix NodeRAIDDegraded to not use a string rule expressions #1827
* [BUGFIX] Fix node_md_disks state label from fail to failed #1862
* [BUGFIX] Handle EPERM for syscall in timex collector #1938
* [BUGFIX] bcache: fix typo in a metric name #1943
* [BUGFIX] Fix XFS read/write stats (prometheus/procfs#343)

Signed-off-by: Ben Kochie <[email protected]>
oblitorum pushed a commit to shatteredsilicon/node_exporter that referenced this pull request Apr 9, 2024
…#1810)

* Expose metric for state=check for node_md_state
* Added new e2e output fixture including md201 which is in checking state and a the new state=check labeled metric for all other md

Signed-off-by: Christian Rohmann <[email protected]>
oblitorum pushed a commit to shatteredsilicon/node_exporter that referenced this pull request Apr 9, 2024
* Update Build
  - Update CircleCI orb.
  - Update CIrcleCI Machine image.
  - Use golang-builder 1.15.
* Update Go modules.
* Fixup fixtures for XFS bug.

NOTE: We have improved some of the flag naming conventions (PR prometheus#1743). The old names are
      deprecated and will be removed in 2.0. They will continue to work for backwards
      compatibility.

* [CHANGE] Improve filter flag names prometheus#1743
* [CHANGE] Add btrfs and powersupplyclass to list of exporters enabled by default prometheus#1897
* [FEATURE] Add fibre channel collector prometheus#1786
* [FEATURE] Expose cpu bugs and flags as info metrics. prometheus#1788
* [FEATURE] Add network_route collector prometheus#1811
* [FEATURE] Add zoneinfo collector prometheus#1922
* [ENHANCEMENT] Add more InfiniBand counters prometheus#1694
* [ENHANCEMENT] Add flag to aggr ipvs metrics to avoid high cardinality metrics prometheus#1709
* [ENHANCEMENT] Adding backlog/current queue length to qdisc collector prometheus#1732
* [ENHANCEMENT] Include TCP OutRsts in netstat metrics prometheus#1733
* [ENHANCEMENT] Add pool size to entropy collector prometheus#1753
* [ENHANCEMENT] Remove CGO dependencies for OpenBSD amd64 prometheus#1774
* [ENHANCEMENT] bcache: add writeback_rate_debug stats prometheus#1658
* [ENHANCEMENT] Add check state for mdadm arrays via node_md_state metric prometheus#1810
* [ENHANCEMENT] Expose XFS inode statistics prometheus#1870
* [ENHANCEMENT] Expose zfs zpool state prometheus#1878
* [ENHANCEMENT] Added an ability to pass collector.supervisord.url via SUPERVISORD_URL environment variable prometheus#1947
* [BUGFIX] filesystem_freebsd: Fix label values prometheus#1728
* [BUGFIX] Fix various procfs parsing errors prometheus#1735
* [BUGFIX] Handle no data from powersupplyclass prometheus#1747
* [BUGFIX] udp_queues_linux.go: change upd to udp in two error strings prometheus#1769
* [BUGFIX] Fix node_scrape_collector_success behaviour prometheus#1816
* [BUGFIX] Fix NodeRAIDDegraded to not use a string rule expressions prometheus#1827
* [BUGFIX] Fix node_md_disks state label from fail to failed prometheus#1862
* [BUGFIX] Handle EPERM for syscall in timex collector prometheus#1938
* [BUGFIX] bcache: fix typo in a metric name prometheus#1943
* [BUGFIX] Fix XFS read/write stats (prometheus/procfs#343)

Signed-off-by: Ben Kochie <[email protected]>
oblitorum pushed a commit to shatteredsilicon/node_exporter that referenced this pull request Apr 9, 2024
…#1810)

* Expose metric for state=check for node_md_state
* Added new e2e output fixture including md201 which is in checking state and a the new state=check labeled metric for all other md

Signed-off-by: Christian Rohmann <[email protected]>
oblitorum pushed a commit to shatteredsilicon/node_exporter that referenced this pull request Apr 9, 2024
* Update Build
  - Update CircleCI orb.
  - Update CIrcleCI Machine image.
  - Use golang-builder 1.15.
* Update Go modules.
* Fixup fixtures for XFS bug.

NOTE: We have improved some of the flag naming conventions (PR prometheus#1743). The old names are
      deprecated and will be removed in 2.0. They will continue to work for backwards
      compatibility.

* [CHANGE] Improve filter flag names prometheus#1743
* [CHANGE] Add btrfs and powersupplyclass to list of exporters enabled by default prometheus#1897
* [FEATURE] Add fibre channel collector prometheus#1786
* [FEATURE] Expose cpu bugs and flags as info metrics. prometheus#1788
* [FEATURE] Add network_route collector prometheus#1811
* [FEATURE] Add zoneinfo collector prometheus#1922
* [ENHANCEMENT] Add more InfiniBand counters prometheus#1694
* [ENHANCEMENT] Add flag to aggr ipvs metrics to avoid high cardinality metrics prometheus#1709
* [ENHANCEMENT] Adding backlog/current queue length to qdisc collector prometheus#1732
* [ENHANCEMENT] Include TCP OutRsts in netstat metrics prometheus#1733
* [ENHANCEMENT] Add pool size to entropy collector prometheus#1753
* [ENHANCEMENT] Remove CGO dependencies for OpenBSD amd64 prometheus#1774
* [ENHANCEMENT] bcache: add writeback_rate_debug stats prometheus#1658
* [ENHANCEMENT] Add check state for mdadm arrays via node_md_state metric prometheus#1810
* [ENHANCEMENT] Expose XFS inode statistics prometheus#1870
* [ENHANCEMENT] Expose zfs zpool state prometheus#1878
* [ENHANCEMENT] Added an ability to pass collector.supervisord.url via SUPERVISORD_URL environment variable prometheus#1947
* [BUGFIX] filesystem_freebsd: Fix label values prometheus#1728
* [BUGFIX] Fix various procfs parsing errors prometheus#1735
* [BUGFIX] Handle no data from powersupplyclass prometheus#1747
* [BUGFIX] udp_queues_linux.go: change upd to udp in two error strings prometheus#1769
* [BUGFIX] Fix node_scrape_collector_success behaviour prometheus#1816
* [BUGFIX] Fix NodeRAIDDegraded to not use a string rule expressions prometheus#1827
* [BUGFIX] Fix node_md_disks state label from fail to failed prometheus#1862
* [BUGFIX] Handle EPERM for syscall in timex collector prometheus#1938
* [BUGFIX] bcache: fix typo in a metric name prometheus#1943
* [BUGFIX] Fix XFS read/write stats (prometheus/procfs#343)

Signed-off-by: Ben Kochie <[email protected]>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants