[Grafana] Update Grafana dashboard and Resolve Legacy Query Failures in Data Source #2432

Open
wants to merge 1 commit into
base: master

win5923 (Contributor) commented Oct 9, 2024

Why are these changes needed?

  1. Update the following dashboard plugins:
    • KubeRay-ApiServer-1650105351221.json
    • KubeRay-Controller-Runtime-Controllers-1650108080992.json
  2. Resolve the legacy query failures in the data source that occur in the KubeRay-ApiServer and KubeRay-Controller-Runtime-Controllers dashboards.

before:
[screenshot: 2024-10-09 191615]

after:

  • KubeRay-ApiServer:

[screenshot: KubeRay-ApiServer]

  • KubeRay-controller:

[screenshot]

Since KubeRay-controller does not have a ServiceMonitor, I created one myself to scrape metrics.

apiVersion: monitoring.coreos.com/v1
kind: ServiceMonitor
metadata:
  labels:
    release: prometheus
  name: kuberay-operator-metrics-monitor
  namespace: prometheus-system
spec:
  endpoints:
    - path: /metrics
      targetPort: http
  namespaceSelector:
    matchNames:
      - default
  selector:
    matchLabels:
      app.kubernetes.io/name: kuberay-operator
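
As a rough illustration of how this ServiceMonitor picks its target: the Prometheus Operator looks at Services in the namespaces listed under namespaceSelector.matchNames and keeps those whose labels contain every pair in selector.matchLabels. The sketch below mimics that matching in plain Python. The `service_monitor_selects` helper and the two sample Services are hypothetical and are not part of KubeRay or the Prometheus Operator; the labels on a real kuberay-operator Service depend on how the chart was installed.

```python
# Minimal sketch of ServiceMonitor-to-Service matching (hypothetical helper,
# not Prometheus Operator code). Only matchNames and matchLabels are modeled.

SERVICE_MONITOR_SPEC = {
    "namespaceSelector": {"matchNames": ["default"]},
    "selector": {"matchLabels": {"app.kubernetes.io/name": "kuberay-operator"}},
}


def service_monitor_selects(spec, service):
    """Return True if the Service's namespace and labels satisfy the spec."""
    namespaces = spec.get("namespaceSelector", {}).get("matchNames", [])
    if service["namespace"] not in namespaces:
        return False
    wanted = spec.get("selector", {}).get("matchLabels", {})
    labels = service.get("labels", {})
    # Every wanted pair must be present; extra labels on the Service are fine.
    return all(labels.get(key) == value for key, value in wanted.items())


# Assumed example Services, for illustration only.
operator_svc = {
    "namespace": "default",
    "labels": {"app.kubernetes.io/name": "kuberay-operator"},
}
other_svc = {
    "namespace": "default",
    "labels": {"app.kubernetes.io/name": "kuberay-apiserver"},
}

if __name__ == "__main__":
    print(service_monitor_selects(SERVICE_MONITOR_SPEC, operator_svc))
    print(service_monitor_selects(SERVICE_MONITOR_SPEC, other_svc))
```

If the operator Service were installed in a namespace other than default, the namespace check above would reject it, which is one thing to verify when the target does not show up in Prometheus.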

I deployed Prometheus and Grafana according to the official documentation, with the following versions:

  • kube-prometheus-stack: v48.2.1
  • kind: v1.31.0
  • kuberay-operator: v1.2.2
  • kuberay-apiserver: v1.2.2

Related issue number

Closes #2400

Checks

  • I've made sure the tests are passing.
  • Testing Strategy
    • Unit tests
    • Manual tests
    • This PR is not tested :(

win5923 (Contributor, Author) commented Oct 9, 2024

While testing the KubeRay-ApiServer dashboard, I found that the following metrics are all 0:

  • grpc_server_handled_total
  • grpc_server_started_total
  • grpc_server_msg_received_total
  • grpc_server_msg_sent_total
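
One way to double-check the zero values independently of Grafana is to inspect the raw Prometheus text output directly. The sketch below sums the samples per metric name and reports which of the watched metrics are present but total zero. The `sum_by_metric` and `zero_metrics` helpers, the label names, and the sample text are all made up for illustration; where the API server exposes its metrics is not stated in this PR and is left to the reader.

```python
# Sketch: find metric names whose samples are all zero in Prometheus
# text-format output. The sample text below is invented for illustration.

from collections import defaultdict

WATCHED = [
    "grpc_server_handled_total",
    "grpc_server_started_total",
    "grpc_server_msg_received_total",
    "grpc_server_msg_sent_total",
]

SAMPLE = """\
# HELP grpc_server_started_total Example help text.
# TYPE grpc_server_started_total counter
grpc_server_started_total{grpc_method="Example",grpc_type="unary"} 0
grpc_server_handled_total{grpc_code="OK",grpc_method="Example"} 0
grpc_server_msg_received_total{grpc_method="Example"} 2
"""


def sum_by_metric(exposition_text):
    """Sum sample values per metric name, skipping comments and blank lines."""
    totals = defaultdict(float)
    for line in exposition_text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        # The name ends at the first "{" (labels) or the first space (no labels).
        name = line.split("{", 1)[0].split(" ", 1)[0]
        # The value follows the label set, or the name when there are no labels;
        # an optional timestamp may trail the value.
        rest = line.rsplit("}", 1)[-1] if "}" in line else line[len(name):]
        totals[name] += float(rest.split()[0])
    return dict(totals)


def zero_metrics(exposition_text, watched=WATCHED):
    """Return watched metrics that are present and whose samples sum to zero."""
    totals = sum_by_metric(exposition_text)
    return [name for name in watched if name in totals and totals[name] == 0.0]


if __name__ == "__main__":
    for name in zero_metrics(SAMPLE):
        print(name)
```

Metrics that are absent from the output are deliberately not reported as zero, since a missing series and a zero-valued series point to different problems.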

The following logs appear for the KubeRay-ApiServer:

I1009 13:31:34.630892       1 client_manager.go:52] Initializing client manager
I1009 13:31:34.631265       1 client_manager.go:71] Client manager initialized successfully
I1009 13:31:34.631294       1 main.go:112] Starting Http Proxy
I1009 13:31:34.631426       1 main.go:74] Starting gRPC server
2024/10/09 13:32:35 ERROR: Failed to extract ServerMetadata from context
2024/10/09 13:32:35 ERROR: Failed to extract ServerMetadata from context

Successfully merging this pull request may close these issues.

[Bug] Grafana DashBoard is too old