diff --git a/config/prometheus/podMonitor.yaml b/config/prometheus/podMonitor.yaml index 316a0337dd..e20f1367d2 100644 --- a/config/prometheus/podMonitor.yaml +++ b/config/prometheus/podMonitor.yaml @@ -3,21 +3,19 @@ kind: PodMonitor metadata: name: ray-workers-monitor namespace: prometheus-system + labels: + # `release: $HELM_RELEASE`: Prometheus can only detect PodMonitor with this label. + release: prometheus spec: jobLabel: ray-workers + # Only select Kubernetes Pods in the "default" namespace. namespaceSelector: matchNames: - default - - ray-system + # Only select Kubernetes Pods with "matchLabels". selector: matchLabels: ray.io/node-type: worker - ray.io/is-ray-node: "yes" + # A list of endpoints allowed as part of this PodMonitor. podMetricsEndpoints: - port: metrics - interval: 1m - scrapeTimeout: 10s - # - targetPort: 90001 - # interval: 1m - # scrapeTimeout: 10s - diff --git a/config/prometheus/serviceMonitor.yaml b/config/prometheus/serviceMonitor.yaml index f0f19fce45..93349e93c9 100644 --- a/config/prometheus/serviceMonitor.yaml +++ b/config/prometheus/serviceMonitor.yaml @@ -4,17 +4,19 @@ metadata: name: ray-head-monitor namespace: prometheus-system labels: - release: prometheus-operator - ray.io/node-type: head + # `release: $HELM_RELEASE`: Prometheus can only detect ServiceMonitor with this label. + release: prometheus spec: jobLabel: ray-head + # Only select Kubernetes Services in the "default" namespace. namespaceSelector: matchNames: - default - - ray-system + # Only select Kubernetes Services with "matchLabels". selector: matchLabels: ray.io/node-type: head + # A list of endpoints allowed as part of this ServiceMonitor. endpoints: - port: metrics targetLabels: diff --git a/docs/guidance/observability.md b/docs/guidance/observability.md index 246391bbd9..a52bb0d966 100644 --- a/docs/guidance/observability.md +++ b/docs/guidance/observability.md @@ -51,248 +51,4 @@ curl --request GET '/apis/v1alpha2/namespaces//clusters/ 9001:9001 -``` - -From a second terminal issue - -```bash -$> curl localhost:9001 -# TYPE ray_pull_manager_object_request_time_ms histogram -... -ray_pull_manager_object_request_time_ms_sum{Component="raylet",... -... -``` - -Before we move on, first ensure that the required metrics port is also defined in the Ray's cluster Kubernetes service. This is done automatically via the Ray Operator if you define the metrics port `containerPort: 9001` along with the name and protocol. - -```bash -$> kubectl get svc -head-svc -o yaml -NAME TYPE ... PORT(S) ... -... ClusterIP ... 6379/TCP,9001/TCP,10001/TCP,8265/TCP,8000/TCP ... -``` - -We are now ready to create the required Prometheus CRDs to collect metrics - -### Collect Head Node metrics with ServiceMonitors - -Prometheus provides a CRD that targets Kubernetes services to collect metrics. The idea is that we will define a CRD that will have selectors that match the Ray Cluster Kubernetes service labels and ports, the metrics port. 
- -```yaml -apiVersion: monitoring.coreos.com/v1 -kind: ServiceMonitor -metadata: - name: -head-monitor <-- Replace with the actual Ray Cluster name - namespace: <-- Add the namespace of your ray cluster -spec: - endpoints: - - interval: 1m - path: /metrics - scrapeTimeout: 10s - port: metrics - jobLabel: -ray-head <-- Replace with the actual Ray Cluster name - namespaceSelector: - matchNames: - - <-- Add the namespace of your ray cluster - selector: - matchLabels: - ray.io/cluster: <-- Replace with the actual Ray Cluster name - ray.io/identifier: -head <-- Replace with the actual Ray Cluster name - ray.io/node-type: head - targetLabels: - - ray.io/cluster -``` - -A notes for the `targetLabels`. We added `spec.targetLabels[0].ray.io/cluster` because we want to include the name of the ray cluster in the metrics that will be generated by this service monitor. The `ray.io/cluster` label is part of the Ray head node service and it will be transformed to a `ray_io_cluster` metric label. That is, any metric that will be imported, will also container the following label `ray_io_cluster=`. This may seem like optional but it becomes mandatory if you deploy multiple ray clusters. - -Create the above service monitor by issuing - -```bash -k apply -f serviceMonitor.yaml -``` - -After a while, Prometheus should start scraping metrics from the head node. You can confirm that by visiting the Prometheus web ui and start typing `ray_`. Prometheus should create a dropdown list with suggested Ray metrics. - -```bash -curl 'https:///api/v1/query?query=ray_object_store_available_memory' -H 'Accept: */*' -``` - -### Collect Worker Node metrics with PodMonitors - -Ray operator does not create a Kubernetes service for the ray workers, therefore we can not use a Prometheus ServiceMonitors to scrape the metrics from our workers. - -**Note**: We could create a Kubernetes service with selectors a common label subset from our worker pods, however this is not ideal because our workers are independent from each other, that is, they are not a collection of replicas spawned by replicaset controller. Due to that, we should avoid using a Kubernetes service for grouping them together. - -To collect worker metrics, we can use `Prometheus PodMonitros CRD`. - -```yaml -apiVersion: monitoring.coreos.com/v1 -kind: PodMonitor -metadata: - labels: - ray.io/cluster: <-- Replace with the actual Ray Cluster name - name: -workers-monitor <-- Replace with the actual Ray Cluster name - namespace: <-- Add the namespace of your ray cluster -spec: - jobLabel: -ray-workers <-- Replace with the actual Ray Cluster name - namespaceSelector: - matchNames: - - <-- Add the namespace of your ray cluster - podMetricsEndpoints: - - interval: 30s - port: metrics - scrapeTimeout: 10s - podTargetLabels: - - ray.io/cluster - selector: - matchLabels: - ray.io/is-ray-node: "yes" - ray.io/node-type: worker -``` - -Since we are not selecting a Kubernetes service but pods, our `matchLabels` now define a set of labels that is common on all Ray workers. - -We also define `metadata.labels` by manually adding `ray.io/cluster: ` and then instructing the PodMonitors resource to add that label in the scraped metrics via `spec.podTargetLabels[0].ray.io/cluster`. 
- -Apply the above PodMonitor manifest - -```bash -k apply -f podMonitor.yaml -``` - -Last, wait a bit and then ensure that you can see Ray worker metrics in Prometheus - -```bash -curl 'https:///api/v1/query?query=ray_object_store_available_memory' -H 'Accept: */*' -``` - -The above http query should yield metrics from the head node and your worker nodes - -We have everything we need now and we can use Grafana to create some panels and visualize the scrapped metrics - -### Grafana: Visualize ingested Ray metrics - -You can use the json in `config/grafana` to import in Grafana the Ray dashboards. - -### Custom Metrics & Alerting - -We can also define custom metrics, and create alerts by using `prometheusrules.monitoring.coreos.com` CRD. Because custom metrics, and alerting is different for each team and setup, we have included an example under `$kuberay/config/prometheus/rules` that you can use to build custom metrics and alerts - - +See [prometheus-grafana.md](./prometheus-grafana.md) for more details. \ No newline at end of file diff --git a/docs/guidance/prometheus-grafana.md b/docs/guidance/prometheus-grafana.md new file mode 100644 index 0000000000..148e20d5f6 --- /dev/null +++ b/docs/guidance/prometheus-grafana.md @@ -0,0 +1,198 @@ +# Ray Cluster: Monitoring with Prometheus & Grafana + +This section will describe how to monitor Ray Clusters in Kubernetes using Prometheus & Grafana. + +If you do not have any experience with Prometheus and Grafana on Kubernetes, I strongly recommend you watch this [YouTube playlist](https://youtube.com/playlist?list=PLy7NrYWoggjxCF3av5JKwyG7FFF9eLeL4). + +## Step 1: Create a Kubernetes cluster with Kind. + +```sh +kind create cluster +``` + +## Step 2: Install Kubernetes Prometheus Stack via Helm chart + +```sh +# Path: kuberay/ +./install/prometheus/install.sh + +# Check the installation +kubectl get all -n prometheus-system + +# (part of the output) +# NAME READY UP-TO-DATE AVAILABLE AGE +# deployment.apps/prometheus-grafana 1/1 1 1 46s +# deployment.apps/prometheus-kube-prometheus-operator 1/1 1 1 46s +# deployment.apps/prometheus-kube-state-metrics 1/1 1 1 46s +``` +* KubeRay provides an [install.sh script](../../install/prometheus/install.sh) to install the [kube-prometheus-stack](https://github.com/prometheus-community/helm-charts/tree/main/charts/kube-prometheus-stack) chart and related custom resources, including **ServiceMonitor** and **PodMonitor**, in the namespace `prometheus-system` automatically. + +## Step 3: Install a KubeRay operator + +* Follow this [document](../../helm-chart/kuberay-operator/README.md) to install the latest stable KubeRay operator via Helm repository. + +## Step 4: Install a RayCluster + +```sh +helm install raycluster kuberay/ray-cluster --version 0.4.0 + +# Check ${RAYCLUSTER_HEAD_POD} +kubectl get pod -l ray.io/node-type=head + +# Example output: +# NAME READY STATUS RESTARTS AGE +# raycluster-kuberay-head-btwc2 1/1 Running 0 63s + +# Wait until all Ray Pods are running and forward the port of the Prometheus metrics endpoint in a new terminal. +kubectl port-forward --address 0.0.0.0 ${RAYCLUSTER_HEAD_POD} 8080:8080 +curl localhost:8080 + +# Example output (Prometheus metrics format): +# # HELP ray_spill_manager_request_total Number of {spill, restore} requests. 
+# # TYPE ray_spill_manager_request_total gauge
+# ray_spill_manager_request_total{Component="raylet",NodeAddress="10.244.0.13",Type="Restored",Version="2.0.0"} 0.0
+
+# Ensure that the port (8080) for the metrics endpoint is also defined in the head's Kubernetes service.
+kubectl get service
+
+# NAME                          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                         AGE
+# raycluster-kuberay-head-svc   ClusterIP   10.96.201.142   <none>        6379/TCP,8265/TCP,8080/TCP,8000/TCP,10001/TCP   106m
+```
+
+* Based on [kuberay/#230](https://github.com/ray-project/kuberay/pull/230), KubeRay exposes a Prometheus metrics endpoint on port **8080** via a built-in exporter by default. Hence, we do not need to install any external exporter.
+* If you want to configure the metrics endpoint to use a different port, see [kuberay/#230](https://github.com/ray-project/kuberay/pull/230) for more details.
+* Prometheus metrics format:
+  * `# HELP`: Describes the meaning of this metric.
+  * `# TYPE`: See [this document](https://prometheus.io/docs/concepts/metric_types/) for more details.
+
+## Step 5: Collect Head Node metrics with a ServiceMonitor
+
+```yaml
+apiVersion: monitoring.coreos.com/v1
+kind: ServiceMonitor
+metadata:
+  name: ray-head-monitor
+  namespace: prometheus-system
+  labels:
+    # `release: $HELM_RELEASE`: Prometheus can only detect ServiceMonitor with this label.
+    release: prometheus
+spec:
+  jobLabel: ray-head
+  # Only select Kubernetes Services in the "default" namespace.
+  namespaceSelector:
+    matchNames:
+      - default
+  # Only select Kubernetes Services with "matchLabels".
+  selector:
+    matchLabels:
+      ray.io/node-type: head
+  # A list of endpoints allowed as part of this ServiceMonitor.
+  endpoints:
+  - port: metrics
+  targetLabels:
+  - ray.io/cluster
+```
+* The YAML example above is [serviceMonitor.yaml](../../config/prometheus/serviceMonitor.yaml), and it is created by **install.sh**. Hence, there is no need to create anything here.
+* See the [ServiceMonitor official document](https://github.com/prometheus-operator/prometheus-operator/blob/main/Documentation/api.md#servicemonitor) for more details about the configuration.
+
+* `release: $HELM_RELEASE`: Prometheus can only detect ServiceMonitor with this label.
+  ```sh
+  helm ls -n prometheus-system
+  # ($HELM_RELEASE is "prometheus".)
+  # NAME         NAMESPACE           REVISION   UPDATED                                   STATUS     CHART                          APP VERSION
+  # prometheus   prometheus-system   1          2023-02-06 06:27:05.530950815 +0000 UTC   deployed   kube-prometheus-stack-44.3.1   v0.62.0
+
+  kubectl get prometheuses.monitoring.coreos.com -n prometheus-system -oyaml
+  # podMonitorSelector:
+  #   matchLabels:
+  #     release: prometheus
+  # ...
+  # serviceMonitorSelector:
+  #   matchLabels:
+  #     release: prometheus
+  ```
+
+* `namespaceSelector` and `selector` are used to select the exporter's Kubernetes Service. Because Ray uses a built-in exporter, the **ServiceMonitor** selects Ray's head Service, which exposes the metrics endpoint (i.e. port 8080 here).
+  ```sh
+  kubectl get service -n default -l ray.io/node-type=head
+  # NAME                          TYPE        CLUSTER-IP      EXTERNAL-IP   PORT(S)                                         AGE
+  # raycluster-kuberay-head-svc   ClusterIP   10.96.201.142   <none>        6379/TCP,8265/TCP,8080/TCP,8000/TCP,10001/TCP   153m
+  ```
+
+* `targetLabels`: We added `spec.targetLabels[0].ray.io/cluster` because we want to include the name of the RayCluster in the metrics generated by this ServiceMonitor. The `ray.io/cluster` label is part of the Ray head node service, and it will be transformed into a `ray_io_cluster` metric label. That is, any imported metric will also contain the label `ray_io_cluster=$RAY_CLUSTER_NAME`. This may seem optional, but it becomes mandatory if you deploy multiple RayClusters.
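+
+  To double-check that the `ray_io_cluster` label is attached to the scraped series, you can query the Prometheus HTTP API directly. The query below is only a sketch: it assumes the Prometheus server has been port-forwarded to `localhost:9090` (see Step 7) and that the RayCluster is named `raycluster-kuberay`.
+  ```sh
+  # Query a head-node metric and keep only the series that belong to our RayCluster.
+  curl -s -G 'http://localhost:9090/api/v1/query' \
+    --data-urlencode 'query=ray_object_store_available_memory{ray_io_cluster="raycluster-kuberay"}'
+  ```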
+
+## Step 6: Collect Worker Node metrics with PodMonitors
+
+The KubeRay operator does not create a Kubernetes service for the Ray worker Pods; therefore, we cannot use a Prometheus ServiceMonitor to scrape the metrics from the worker Pods. To collect worker metrics, we can use the Prometheus `PodMonitor` CRD instead.
+
+**Note**: We could create a Kubernetes service with selectors matching a common label subset of our worker Pods; however, this is not ideal because the workers are independent of each other, that is, they are not a collection of replicas spawned by a ReplicaSet controller. Because of that, we should avoid using a Kubernetes service to group them together.
+
+```yaml
+apiVersion: monitoring.coreos.com/v1
+kind: PodMonitor
+metadata:
+  name: ray-workers-monitor
+  namespace: prometheus-system
+  labels:
+    # `release: $HELM_RELEASE`: Prometheus can only detect PodMonitor with this label.
+    release: prometheus
+    ray.io/cluster: raycluster-kuberay # $RAY_CLUSTER_NAME: "kubectl get rayclusters.ray.io"
+spec:
+  jobLabel: ray-workers
+  # Only select Kubernetes Pods in the "default" namespace.
+  namespaceSelector:
+    matchNames:
+      - default
+  # Only select Kubernetes Pods with "matchLabels".
+  selector:
+    matchLabels:
+      ray.io/node-type: worker
+  # A list of endpoints allowed as part of this PodMonitor.
+  podMetricsEndpoints:
+  - port: metrics
+```
+* The PodMonitor's `namespaceSelector` and `selector` are used to select Kubernetes Pods.
+  ```sh
+  kubectl get pod -n default -l ray.io/node-type=worker
+  # NAME                                          READY   STATUS    RESTARTS   AGE
+  # raycluster-kuberay-worker-workergroup-5stpm   1/1     Running   0          3h16m
+  ```
+
+* `ray.io/cluster: $RAY_CLUSTER_NAME`: We also define `metadata.labels` by manually adding `ray.io/cluster: $RAY_CLUSTER_NAME`, and then instruct the PodMonitor resource to add that label to the scraped metrics via `spec.podTargetLabels[0].ray.io/cluster`.
+
+## Step 7: Access Prometheus Web UI
+
+```sh
+# Forward the port of the Prometheus Web UI in the Prometheus server Pod.
+kubectl port-forward --address 0.0.0.0 prometheus-prometheus-kube-prometheus-prometheus-0 -n prometheus-system 9090:9090
+
+# Check ${YOUR_IP}:9090/targets for the Web UI (e.g. 127.0.0.1:9090/targets).
+# You should be able to see "podMonitor/prometheus-system/ray-workers-monitor/0 (1/1 up)"
+# and "serviceMonitor/prometheus-system/ray-head-monitor/0 (1/1 up)" on the page.
+```
+
+![Prometheus Web UI](../images/prometheus_web_ui.png)
+
+## Step 8: Access Grafana
+
+```sh
+# Forward the port of Grafana
+kubectl port-forward --address 0.0.0.0 deployment/prometheus-grafana -n prometheus-system 3000:3000
+
+# Check ${YOUR_IP}:3000 for the Grafana login page (e.g. 127.0.0.1:3000).
+# The default username is "admin" and the password is "prom-operator".
+```
+
+* The default password is defined by `grafana.adminPassword` in the [values.yaml](https://github.com/prometheus-community/helm-charts/blob/main/charts/kube-prometheus-stack/values.yaml) of the kube-prometheus-stack chart.
+
+* After logging in to Grafana successfully, we can import the Ray Dashboard into Grafana via **dashboard_default.json**.
+  * Click the "Dashboards" icon in the left panel.
+  * Click "Import".
+  * Click "Upload JSON file".
+  * Choose [config/grafana/dashboard_default.json](../../config/grafana/dashboard_default.json).
+  * Click "Import".
+
+![Grafana Ray Dashboard](../images/grafana_ray_dashboard.png)
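+
+If the default credentials do not work, or you prefer not to rely on them, you can read the admin credentials back from the Secret that the chart manages. The commands below are only a sketch: they assume the Helm release is named `prometheus`, so the chart creates a Secret named `prometheus-grafana` in the `prometheus-system` namespace.
+
+```sh
+# Read the Grafana admin user and password from the chart-managed Kubernetes Secret.
+kubectl get secret prometheus-grafana -n prometheus-system -o jsonpath='{.data.admin-user}' | base64 --decode; echo
+kubectl get secret prometheus-grafana -n prometheus-system -o jsonpath='{.data.admin-password}' | base64 --decode; echo
+```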
+
+### Custom Metrics & Alerting
+
+We can also define custom metrics and create alerts by using the `prometheusrules.monitoring.coreos.com` CRD. Because custom metrics and alerting differ for each team and setup, we have included an example under [config/prometheus/rules](../../config/prometheus/rules/) that you can use as a starting point for building custom metrics and alerts.
diff --git a/docs/images/grafana_ray_dashboard.png b/docs/images/grafana_ray_dashboard.png
new file mode 100644
index 0000000000..84e7a09be8
Binary files /dev/null and b/docs/images/grafana_ray_dashboard.png differ
diff --git a/docs/images/prometheus_web_ui.png b/docs/images/prometheus_web_ui.png
new file mode 100644
index 0000000000..9225cfd3a7
Binary files /dev/null and b/docs/images/prometheus_web_ui.png differ
diff --git a/install/prometheus/install.sh b/install/prometheus/install.sh
index 705f811f33..d2dd75420b 100755
--- a/install/prometheus/install.sh
+++ b/install/prometheus/install.sh
@@ -5,7 +5,7 @@ set errexit
 helm repo add prometheus-community https://prometheus-community.github.io/helm-charts
 helm repo update
-helm --namespace prometheus-system install prometheus-operator prometheus-community/kube-prometheus-stack --create-namespace
+helm --namespace prometheus-system install prometheus prometheus-community/kube-prometheus-stack --create-namespace
 
 # set the place of monitor files
 DIR="$(cd "$(dirname "${BASH_SOURCE[0]}")" > /dev/null && pwd)"
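
Since `install.sh` now installs the chart under the Helm release name `prometheus`, which is also the `release` label that the ServiceMonitor and PodMonitor must carry, a quick way to verify the wiring after rerunning the script is to check the release name and the monitor labels together. A minimal sketch, assuming the defaults used throughout this change:

```sh
# Confirm the Helm release name that the `release` label on the monitors must match.
helm ls -n prometheus-system

# Confirm that both monitors exist and carry the `release: prometheus` label.
kubectl get servicemonitors,podmonitors -n prometheus-system --show-labels
```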