Include volume type in volume metricset data from Kubernetes module #39524

Open
Tracked by #9859
eedugon opened this issue May 11, 2024 · 3 comments
Labels
enhancement, Metricbeat, Team:obs-ds-hosted-services (Label for the Observability Hosted Services team)

Comments

@eedugon
Contributor

eedugon commented May 11, 2024

Describe the enhancement:

Currently the volume metricset of our Metricbeat Kubernetes module does not collect volume types.

Kubernetes volumes can be of many different types, and many of them might be unwanted by the users in certain visualizations or even during data collection. Examples of volume types are: Secret, ConfigMap, HostPath, EmptyDir, PersistentVolumeClaim.

It would be very useful to include the volume type in a field such as kubernetes.volume.type, which is currently not offered.

If we support this, ideally the metricset should also offer a way to filter out incoming data, in the same way as we do in the system-->filesystem metricset with filesystem.ignore_types. We could offer this with something like volume.ignore_types.
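A minimal sketch of how such a filter could behave, written in Python only for illustration. Both the `kubernetes.volume.type` field and the `ignore_types` list are the proposals from this issue, not existing Metricbeat settings, and `drop_ignored_volumes` is a hypothetical helper name.

```python
def drop_ignored_volumes(events, ignore_types):
    """Keep only the events whose volume type is not listed in ignore_types.

    Assumption: each event is a dict that may carry the proposed
    kubernetes.volume.type field. Events without a type are kept.
    """
    ignored = {t.lower() for t in ignore_types}
    kept = []
    for event in events:
        volume_type = event.get("kubernetes", {}).get("volume", {}).get("type", "")
        if volume_type.lower() in ignored:
            continue  # this type was explicitly ignored by the user
        kept.append(event)
    return kept


events = [
    {"kubernetes": {"volume": {"name": "data", "type": "PersistentVolumeClaim"}}},
    {"kubernetes": {"volume": {"name": "certs", "type": "Secret"}}},
    {"kubernetes": {"volume": {"name": "settings", "type": "ConfigMap"}}},
]
for event in drop_ignored_volumes(events, ["Secret", "ConfigMap"]):
    print(event["kubernetes"]["volume"]["name"])  # data
```

Matching is case-insensitive here so that `secret` and `Secret` behave the same; that is a design choice of this sketch, not something the issue specifies.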

I also believe implementing this should be easy as the type is part of the volume metadata. --> Update: the previous is incorrect. A volume doesn't have an explicit type setting, it just has different settings depending on its type.
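To illustrate the update above: in a pod spec, each entry under `spec.volumes` has a `name` plus one source key (`emptyDir`, `secret`, `configMap`, `hostPath`, `persistentVolumeClaim`, ...), and the "type" can only be inferred from which key is present. The following is a rough sketch under that assumption; `infer_volume_type` is a hypothetical helper, not part of Metricbeat.

```python
def infer_volume_type(volume: dict) -> str:
    """Return the volume source key of a spec.volumes entry, e.g. 'emptyDir'.

    Assumption: every key other than "name" is a volume source and exactly
    one is set. Anything else is reported as "unknown".
    """
    source_keys = [key for key in volume if key != "name"]
    if len(source_keys) != 1:
        return "unknown"
    return source_keys[0]


print(infer_volume_type({"name": "data", "emptyDir": {}}))  # emptyDir
print(infer_volume_type({"name": "certs", "secret": {"secretName": "my-certs"}}))  # secret
```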

The same should be applied to the Elastic Agent Kubernetes integration.

Describe a specific use case for the enhancement or feature:

Optimizing data collection when monitoring Kubernetes.

@elasticmachine
Collaborator

Pinging @elastic/obs-ds-hosted-services (Team:obs-ds-hosted-services)

@botelastic (bot) removed the needs_team label (Indicates that the issue/PR needs a Team:* label) on May 15, 2024
@gizas
Contributor

gizas commented Jun 26, 2024

Hello @eedugon, I started looking into this.

For my testing I have something as simple as:

Redis manifest with a volume with type `EmptyDir`
apiVersion: v1
kind: Pod
metadata:
  name: redis
spec:
  containers:
  - name: redis
    image: redis
    volumeMounts:
    - name: redis-storage
      mountPath: /data/redis
  volumes:
  - name: redis-storage
    emptyDir: {}

Once installed, I exec into the elastic-agent pod and retrieve the kubelet /stats/summary info:

How to retrieve kubelet info
TOKEN=$(cat /var/run/secrets/kubernetes.io/serviceaccount/token)
curl  https://kind-control-plane:10250/stats/summary --header "Authorization: Bearer $TOKEN" --insecure

See the volume section below. I cannot see any volume type information in it for our case.

"volume": [
    {
     "time": "2024-06-26T10:03:31Z",
     "availableBytes": 16463179776,
     "capacityBytes": 109647159296,
     "usedBytes": 1130033152,
     "inodesFree": 6016189,
     "inodes": 6815744,
     "inodesUsed": 33204,
     "name": "redis-storage"
    },
    {
     "time": "2024-06-26T10:03:31Z",
     "availableBytes": 11221483520,
     "capacityBytes": 11221495808,
     "usedBytes": 12288,
     "inodesFree": 1369802,
     "inodes": 1369811,
     "inodesUsed": 9,
     "name": "kube-api-access-6h6pj"
    }
   ],

For clarity, I have also attached the full output of the /stats/summary endpoint as the file summary.json.
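As a quick way to double-check which fields the endpoint exposes per volume, something like the following Python sketch can be run against the saved output. It assumes the usual layout of the kubelet summary (a top-level `pods` list, each entry with a `podRef` and a `volume` list); the `default` namespace in the sample is an assumption, and `list_volume_stats` is a hypothetical helper.

```python
def list_volume_stats(summary: dict) -> list:
    """Flatten the per-pod volume stats of a /stats/summary document."""
    rows = []
    for pod in summary.get("pods", []):
        pod_ref = pod.get("podRef", {})
        for volume in pod.get("volume", []):
            rows.append({
                "namespace": pod_ref.get("namespace"),
                "pod": pod_ref.get("name"),
                "volume": volume.get("name"),
                "used_bytes": volume.get("usedBytes"),
                # Every key the kubelet reported for this volume.
                "fields": sorted(volume),
            })
    return rows


# Trimmed sample based on the output above (namespace is assumed).
sample = {
    "pods": [{
        "podRef": {"name": "redis", "namespace": "default"},
        "volume": [
            {"name": "redis-storage", "usedBytes": 1130033152, "capacityBytes": 109647159296},
            {"name": "kube-api-access-6h6pj", "usedBytes": 12288, "capacityBytes": 11221495808},
        ],
    }]
}
for row in list_volume_stats(sample):
    print(row["pod"], row["volume"], row["fields"])
```

With the real file, `json.load(open("summary.json"))` could be passed in place of `sample`.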

Also, the only metadata we add in the metricset is related to pod.name.

> I also believe implementing this should be easy as the type is part of the volume metadata.

Can you please verify your comment? Maybe I am missing something here.

The same applies to #39525

@eedugon
Contributor Author

eedugon commented Jun 27, 2024

@gizas, I'm sorry if my sentence about this being easy caused confusion :) It could definitely be wrong!

I don't think you are missing anything. The thing is that I didn't really know where exactly the beat takes the information about volumes from, and I was only considering what a PodSpec or a pod describe looks like, where the information about the volume mounts and type is more or less available (although a volume doesn't have an explicit type parameter, which could also be an issue).

It's true that this stats API doesn't return anything like that, and trying to enrich the data might be too expensive.

Sorry for the noise here!

The reason for wanting this data (volume type and mount_point) to be available is that the volume stats produce a massive number of documents, and in many use cases the majority of them are useless in terms of stats. For example, an Elasticsearch pod owned by ECK has a lot of configmaps and secrets mounted, and each of them is reported as a separate volume whose stats are largely irrelevant.

I still think this would be very useful, but might be difficult to accomplish.

From a describe pod command we can see something like (sharing only interesting parts):

Containers:
  elasticsearch:
...
...
    Mounts:
...
      /usr/share/elasticsearch/data from elasticsearch-data (rw)
      /usr/share/elasticsearch/logs from elasticsearch-logs (rw)
      /usr/share/elasticsearch/config/http-certs from elastic-internal-http-certificates (ro)
...
...
Volumes:
  elasticsearch-data:
    Type:       PersistentVolumeClaim (a reference to a PersistentVolumeClaim in the same namespace)
    ClaimName:  elasticsearch-data-elastic-k8s-es-default500g-0
    ReadOnly:   false
  elasticsearch-logs:
    Type:       EmptyDir (a temporary directory that shares a pod's lifetime)
    Medium:
    SizeLimit:  <unset>
  elastic-internal-http-certificates:
    Type:        Secret (a volume populated by a Secret)
    SecretName:  elastic-k8s-es-http-certs-internal
    Optional:    false

But it's true that the stats API you provided would only show some stats about the elasticsearch-data volume, without any information about the Type (which is not an attribute of the volume), or the Mount point.
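For discussion, here is a rough Python sketch of what enriching the stats could look like: build an index from pod objects (as returned by the Kubernetes API) and join it with the stats volumes by namespace, pod name and volume name. This is only an illustration of the idea, not how the beat works today; both function names are hypothetical, and the cost of obtaining and caching the pod objects is exactly the expensive part mentioned above.

```python
def volume_types_by_pod(pods: list) -> dict:
    """Map (namespace, pod name, volume name) to the inferred volume type.

    Assumption: each spec.volumes entry has "name" plus one source key,
    and that key is used as the type.
    """
    index = {}
    for pod in pods:
        meta = pod.get("metadata", {})
        for volume in pod.get("spec", {}).get("volumes", []):
            source_keys = [key for key in volume if key != "name"]
            vtype = source_keys[0] if len(source_keys) == 1 else "unknown"
            index[(meta.get("namespace"), meta.get("name"), volume["name"])] = vtype
    return index


def enrich_volume_stats(summary: dict, index: dict) -> list:
    """Attach a type to every volume reported in a /stats/summary document."""
    enriched = []
    for pod in summary.get("pods", []):
        ref = pod.get("podRef", {})
        for volume in pod.get("volume", []):
            key = (ref.get("namespace"), ref.get("name"), volume.get("name"))
            # Volumes with no matching pod spec entry stay "unknown".
            enriched.append({**volume, "type": index.get(key, "unknown")})
    return enriched


pods = [{
    "metadata": {"name": "redis", "namespace": "default"},
    "spec": {"volumes": [{"name": "redis-storage", "emptyDir": {}}]},
}]
summary = {"pods": [{
    "podRef": {"name": "redis", "namespace": "default"},
    "volume": [{"name": "redis-storage", "usedBytes": 1130033152}],
}]}
print(enrich_volume_stats(summary, volume_types_by_pod(pods)))
```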
