Harvest Health != ONTAP GUI #3149

Open
db-wally007 opened this issue Sep 13, 2024 · 4 comments

Comments

@db-wally007

A note for the community

  • Please vote on this issue by adding a 👍 reaction to the original issue to help the community and maintainers prioritize this request
  • If you are interested in working on this issue or have submitted a pull request, please let us know in a comment

Problem

Hi,

A couple of years ago I opened a GitHub issue because the ONTAP Web UI showed alerts that Harvest did not. I was told a new collector would be introduced to collect that data (I think it's called EMS).

We've had the EMS collector enabled for a while now, and we just noticed that the ONTAP Web UI shows an alert that does not appear in the Grafana Health dashboard.

Our collector:

  agora:
    datacenter: EQX
    addr: agora-cluster.deutsche-boerse.de
    auth_style: basic_auth
    username: $__env{NETAPP_HARVEST_READONLY_USERNAME}
    password: $__env{NETAPP_HARVEST_READONLY_PASSWORD}
    use_insecure_tls: true
    exporters:
      - agora
    collectors:
      - Rest
      - RestPerf
      - Ems

ONTAP UI:

image

image

Grafana Dashboard:

image

In the Grafana panel popup it reads

"
The EMS collector gathers EMS events as defined in your ems.yml file. This panel displays events with emergency severity that occurred within the selected time range.
"

As I understand it, "recreating" ONTAP Web UI alerting would require the user to recreate 1400+ definitions in ems.yml?

Essentially, what we are trying to achieve is to use Harvest as the ONLY source of metrics and alerts. However, the suggested approach is maintenance overkill. We simply want to be alerted when ONTAP has an error without having to look at the Web UI.

We don't need to see the description of the event as in the ONTAP Web UI, but we do need to be made aware that there is an alert (i.e. the Grafana dashboard should not show "0" issues).
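To make the request concrete, here is a minimal sketch of the kind of per-severity count being asked for. It is not how Harvest works today. It assumes an ONTAP-style EMS events payload, such as one returned by `GET /api/support/ems/events`, in which each record carries `message.name` and `message.severity`. The function name, the sample payload, and the event names in it are hypothetical.

```python
# Severities that the thread describes System Manager as surfacing.
ALERTING_SEVERITIES = ("emergency", "alert", "error")


def count_by_severity(payload, severities=ALERTING_SEVERITIES):
    """Count EMS events per severity.

    Assumption: `payload` looks like
    {"records": [{"message": {"name": "...", "severity": "..."}}, ...]}.
    Records with a missing or unlisted severity are ignored.
    """
    counts = {severity: 0 for severity in severities}
    for record in payload.get("records", []):
        severity = record.get("message", {}).get("severity")
        if severity in counts:
            counts[severity] += 1
    return counts


# Hypothetical payload; the event names are made up for illustration.
sample_payload = {
    "records": [
        {"message": {"name": "example.event.a", "severity": "emergency"}},
        {"message": {"name": "example.event.b", "severity": "error"}},
        {"message": {"name": "example.event.c", "severity": "notice"}},
        {"message": {"name": "example.event.d", "severity": "error"}},
    ]
}

print(count_by_severity(sample_payload))
```

A count like this carries no event description, but a non-zero value would be enough to keep a dashboard from showing "0" issues while the cluster reports a problem.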

Configuration

No response

Poller

agora poller

Version

latest

Poller logs

No response

OS and platform

docker

ONTAP or StorageGRID version

Netapp 9.13.1P8

Additional Context

No response

References

No response

@rahulguptajss
Contributor

@db-wally007 That's correct. The Emergency panel in the Grafana Health dashboard only displays EMS events with a severity of "emergency" if they are defined in the ems.yaml file. The idea behind the EMS collector was to list only the relevant EMS events in ems.yaml, to avoid the noise of collecting every event purely by severity. I noticed that System Manager (SM) shows EMS events with severities of "emergency," "alert," and "error."

Currently, the only option is to add those EMS to the ems.yaml file. We will review this approach and update you.

@rahulguptajss
Contributor

@db-wally007 SM displays emergency events in the header and shows all alert, error, and emergency events in the table within the UI. As mentioned earlier, we don't intend to collect all events since some may be just noise. The idea is to selectively pick and choose events as needed. If we focus on emergency events, there are approximately 300 emergency events in ONTAP. Therefore, the suggestion is to list these specific events as needed in the ems.yaml file.
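One way to reduce the maintenance burden of listing roughly 300 events by hand would be to generate the list from the EMS message catalog. The sketch below assumes catalog records shaped like those of ONTAP's `GET /api/support/ems/messages` (a `name`, a `severity`, and an optional `deprecated` flag), and assumes ems.yaml accepts entries of the form `- name: <event>` under an `events:` key. Both shapes should be checked against your ONTAP and Harvest versions. The function names and sample records are hypothetical.

```python
def emergency_event_names(catalog_records):
    """Return sorted, de-duplicated names of non-deprecated emergency events.

    Assumption: each record is a dict with "name" and "severity" keys and
    an optional boolean "deprecated" key.
    """
    names = {
        record["name"]
        for record in catalog_records
        if record.get("severity") == "emergency"
        and not record.get("deprecated", False)
    }
    return sorted(names)


def render_events_fragment(names):
    """Render the names as a YAML fragment (assumed ems.yaml layout)."""
    lines = ["events:"]
    lines += [f"  - name: {name}" for name in names]
    return "\n".join(lines)


# Hypothetical catalog records; the event names are made up.
sample_catalog = [
    {"name": "example.disk.failed", "severity": "emergency"},
    {"name": "example.fan.warning", "severity": "alert"},
    {"name": "example.old.event", "severity": "emergency", "deprecated": True},
    {"name": "example.cluster.down", "severity": "emergency"},
]

print(render_events_fragment(emergency_event_names(sample_catalog)))
```

The generated fragment lists names only. Any per-event exports or resolution rules would still need to be written by hand.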

@db-wally007
Author

This makes no sense to me. Aren't all "emergency" events needed?
Essentially, this makes alerting in Harvest a guess at best. Maybe I documented all alerts that might hit the storage appliance, maybe I didn't.

At the very least, Harvest should collect the number of alerts. Showing "0" alerts in the dashboard while storage is degraded is definitely not something anyone should rely on.

@rahulguptajss
Contributor

@db-wally007 We'll get back to you on this. Thanks.
