Ideally we would be able to monitor the NFS servers we rely on directly in the Grafana instances, but if we can't do that, we need at least some way to tell whether the NFS servers are overloaded.
As I understand it, we rely on the cloud-provided NFS services GCP Filestore and AWS EFS. If we can't give the Grafana instances access to the data sources and import pre-defined dashboards for this, we should at least learn how to monitor them using the cloud console.
I just encountered this kind of issue on EFS, and it took a lot of digging to understand what was going on.
EFS has 3 different throughput modes. Bursting is the default, and AWS does some sneaky stuff to make sure it's fast initially, but if you don't put enough data on it right away you can hit a wall and get highly variable, hard-to-diagnose performance.
The key metrics for EFS to look at are Burst Credit Balance, Permitted Throughput, and Throughput Utilization.
If that's what you are encountering, I'd be happy to pull together some of the resources that I found while trying to diagnose it.
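To make the three metrics above a bit more concrete, here is a minimal sketch of how they could be combined once the numbers have been read off CloudWatch or the console. It assumes the `AWS/EFS` CloudWatch metrics `MeteredIOBytes`, `PermittedThroughput`, and `BurstCreditBalance`, and it assumes throughput utilization is the metered bytes per second divided by the permitted throughput. `EfsSample` and both helper functions are hypothetical names for illustration, not part of any AWS SDK.

```python
from dataclasses import dataclass
from typing import Optional, Sequence


@dataclass
class EfsSample:
    """One CloudWatch period of EFS data (hypothetical container, not an AWS type)."""
    metered_io_bytes: float       # Sum of MeteredIOBytes over the period, in bytes
    permitted_throughput: float   # PermittedThroughput, in bytes per second
    burst_credit_balance: float   # BurstCreditBalance, in bytes
    period_seconds: int = 60      # Length of the CloudWatch period


def throughput_utilization(sample: EfsSample) -> float:
    """Return the percentage of permitted throughput used during the period.

    Assumption: utilization = (MeteredIOBytes / period) / PermittedThroughput.
    """
    if sample.permitted_throughput <= 0:
        raise ValueError("permitted_throughput must be positive")
    actual_bytes_per_second = sample.metered_io_bytes / sample.period_seconds
    return 100.0 * actual_bytes_per_second / sample.permitted_throughput


def seconds_until_credits_exhausted(samples: Sequence[EfsSample]) -> Optional[float]:
    """Linearly extrapolate when burst credits reach zero.

    Expects samples ordered oldest first with equal periods. Returns None
    when there are fewer than two samples or the balance is not draining.
    """
    if len(samples) < 2:
        return None
    first, last = samples[0], samples[-1]
    elapsed = (len(samples) - 1) * last.period_seconds
    drain_rate = (first.burst_credit_balance - last.burst_credit_balance) / elapsed
    if drain_rate <= 0:
        return None
    return last.burst_credit_balance / drain_rate
```

A utilization that stays near 100% while the credit balance keeps draining would be consistent with the "hit a wall" behaviour described above, though the thresholds worth alerting on depend on the workload.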
Cloud services
Action points
Related
/tmp for anything temp, as that could help reduce load on the NFS server: Move uwhackweeks to a faster EFS server #1236 (comment)