fix: correct metrics path for MetricsEndpointProvider #236

Merged: 2 commits, Feb 13, 2024

Commits on Feb 13, 2024

  1. fix: correctly configure one scrape job to avoid firing alerts

    The metrics endpoint configuration had two scrape jobs: one for the
    regular metrics endpoint, and a second one based on a dynamic list of
    targets. The latter caused the Prometheus scraper to try to scrape
    metrics from *:80/metrics, which is not a valid endpoint. As a result,
    the UnitsUnavailable alert fired constantly, because that job kept
    reporting that the endpoint was unavailable.
    The second job was introduced by #94 with no apparent justification.
    Because the seldon charm has changed since that PR, and the endpoint
    the job configures is not valid, this commit removes the extra job.
    
    This commit also refactors the MetricsEndpointProvider instantiation and
    removes the metrics-port config option as this value should not change.
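
As a rough illustration of the single-job setup described above, the sketch below builds a scrape job list with fixed values instead of reading a config option. The port and path constants and the `build_scrape_jobs` helper are assumptions made for this example, not values taken from the charm; in a charm, a list of this shape would typically be passed as the `jobs` argument of `MetricsEndpointProvider`.

```python
# Hypothetical constants: the actual values used by the charm are not
# stated in this commit message.
METRICS_PATH = "/metrics"
METRICS_PORT = "8080"


def build_scrape_jobs(port: str = METRICS_PORT, path: str = METRICS_PATH) -> list:
    """Return exactly one scrape job targeting the unit's metrics endpoint."""
    return [
        {
            "metrics_path": path,
            # "*" is a wildcard host; the port is fixed rather than
            # read from a charm config option.
            "static_configs": [{"targets": [f"*:{port}"]}],
        }
    ]


jobs = build_scrape_jobs()
```

Keeping the port as a module-level constant mirrors the commit's reasoning that the value should not change at runtime.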
    
    Finally, this commit changes the alert rule interval from 0m to 5m, as
    this interval is more appropriate for production environments.
    
    Part of canonical/bundle-kubeflow#564
    DnPlas committed Feb 13, 2024
    128c4da
  2. tests: add an assertion for checking unit is available

    The test_prometheus_grafana_integration test case queried Prometheus and
    checked that the request returned successfully and that the application name and
    model were listed correctly. To make this test case more accurate, add an assertion
    that also checks that the unit is available; this helps avoid issues like the one described in
    canonical/bundle-kubeflow#564.
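
One way such an availability check could look is sketched below. It assumes the test queries the Prometheus HTTP API for the `up` metric and inspects the JSON response; the `unit_is_available` helper, the sample payload, and the application name are hypothetical and only illustrate the idea, not the assertion actually added in this commit.

```python
def unit_is_available(response: dict, application: str) -> bool:
    """Check an `up` query response for a healthy target of `application`."""
    if response.get("status") != "success":
        return False
    for sample in response["data"]["result"]:
        if sample["metric"].get("juju_application") == application:
            # An instant-vector sample is [timestamp, "<value>"];
            # for `up`, "1" means the last scrape succeeded.
            return sample["value"][1] == "1"
    return False


# Hypothetical response shaped like GET /api/v1/query?query=up
sample_response = {
    "status": "success",
    "data": {
        "resultType": "vector",
        "result": [
            {
                "metric": {
                    "__name__": "up",
                    "juju_application": "seldon-controller-manager",
                    "juju_model": "kubeflow",
                },
                "value": [1707800000.0, "1"],
            }
        ],
    },
}

assert unit_is_available(sample_response, "seldon-controller-manager")
```

A check of this kind would fail when a scrape job points at an invalid endpoint, which is the situation the first commit fixes.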
    
    Part of canonical/bundle-kubeflow#564
    
    skip: fix test
    DnPlas committed Feb 13, 2024
    d8041a5