Add datastream to collect pod logs in kubernetes integration #1523

Merged
7 changes: 6 additions & 1 deletion packages/kubernetes/_dev/build/docs/README.md
@@ -1,6 +1,6 @@
# Kubernetes integration

-This integration is used to collect metrics from
+This integration is used to collect logs and metrics from
[Kubernetes clusters](https://kubernetes.io/).

As one of the main pieces provided for Kubernetes monitoring, this integration is capable of fetching metrics from several components:
@@ -78,6 +78,11 @@ These datasets are not enabled by default.
Note: In some "As a Service" Kubernetes implementations, like `GKE`, the master nodes or even the pods running on
the masters won't be visible. In these cases it won't be possible to use `scheduler` and `controllermanager` metricsets.

#### kube-pod-logs
Review comment (Member): this seems outdated

The kube-pod-logs dataset requires access to the log files in each Kubernetes node where the pod logs are stored.
This defaults to `/var/log/containers/*${kubernetes.container.id}.log`.

## Compatibility

The Kubernetes package is tested with Kubernetes 1.13.x, 1.14.x, 1.15.x, 1.16.x, 1.17.x, and 1.18.x
6 changes: 6 additions & 0 deletions packages/kubernetes/_dev/build/docs/kube-pod-logs.md
@@ -0,0 +1,6 @@
# kube-pod-logs

kube-pod-logs integration collects and parses logs of Kubernetes pods.

It requires access to the log files in each Kubernetes node where the pod logs are stored.
This defaults to `/var/log/containers/*${kubernetes.container.id}.log`.
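
One way to provide that access, sketched here as an assumption rather than taken from this PR, is to mount the node's log directories read-only into the agent pod. The fragment below is a partial DaemonSet pod spec; the container name, image, and volume names are hypothetical. Entries under `/var/log/containers` are symlinks into `/var/log/pods` (and, on Docker-based nodes, on into `/var/lib/docker/containers`), so the link targets are mounted as well.

```yaml
# Partial DaemonSet pod spec (sketch). Names and image are placeholders.
spec:
  template:
    spec:
      containers:
        - name: elastic-agent            # hypothetical container name
          image: example/elastic-agent   # placeholder image reference
          volumeMounts:
            - name: varlog
              mountPath: /var/log
              readOnly: true
            - name: varlibdockercontainers
              mountPath: /var/lib/docker/containers
              readOnly: true
      volumes:
        - name: varlog
          hostPath:
            path: /var/log
        - name: varlibdockercontainers   # only relevant on Docker-based nodes
          hostPath:
            path: /var/lib/docker/containers
```

Mounting all of `/var/log` rather than only `/var/log/containers` keeps the symlink targets under `/var/log/pods` resolvable from inside the container.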
6 changes: 6 additions & 0 deletions packages/kubernetes/_dev/deploy/k8s/example-redis-config.yaml
@@ -0,0 +1,6 @@
apiVersion: v1
kind: ConfigMap
metadata:
  name: example-redis-config
data:
  redis-config: ""
5 changes: 5 additions & 0 deletions packages/kubernetes/changelog.yml
@@ -1,4 +1,9 @@
# newer versions go on top
- version: "0.14.0"
  changes:
    - description: Add new pod logs data stream in kubernetes integration
      type: enhancement
      link: https://github.com/elastic/integrations/pull/1324
- version: "0.13.0"
  changes:
    - description: Leverage dynamic kubernetes provider for controller and scheduler datastream
@@ -0,0 +1,6 @@
paths:
{{#each paths}}
- {{this}}
{{/each}}

symlinks: {{symlinks}}
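
As a rough illustration, assuming both variables are left at the defaults declared in this data stream's `manifest.yml` (a single `paths` entry and `symlinks: true`), the template above would be expected to render to something like the following. This is worked out by hand from the template, not captured from a running agent. The `${kubernetes.container.id}` placeholder is left untouched at this stage and is presumably resolved later by the agent's Kubernetes provider.

```yaml
# Hand-rendered output of the stream template with manifest defaults (sketch)
paths:
- /var/log/containers/*${kubernetes.container.id}.log

symlinks: true
```

`symlinks: true` matters here because the files under `/var/log/containers` are symlinks rather than regular files.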
198 changes: 198 additions & 0 deletions packages/kubernetes/data_stream/pod_logs/fields/agent.yml
@@ -0,0 +1,198 @@
- name: cloud
  title: Cloud
  group: 2
  description: Fields related to the cloud or infrastructure the events are coming from.
  footnote: 'Examples: If Metricbeat is running on an EC2 host and fetches data from its host, the cloud info contains the data about this machine. If Metricbeat runs on a remote machine outside the cloud and fetches data from a service running in the cloud, the field contains cloud data from the machine the service is running on.'
  type: group
  fields:
    - name: account.id
      level: extended
      type: keyword
      ignore_above: 1024
      description: 'The cloud account or organization id used to identify different entities in a multi-tenant environment.

        Examples: AWS account id, Google Cloud ORG Id, or other unique identifier.'
      example: 666777888999
    - name: availability_zone
      level: extended
      type: keyword
      ignore_above: 1024
      description: Availability zone in which this host is running.
      example: us-east-1c
    - name: instance.id
      level: extended
      type: keyword
      ignore_above: 1024
      description: Instance ID of the host machine.
      example: i-1234567890abcdef0
    - name: instance.name
      level: extended
      type: keyword
      ignore_above: 1024
      description: Instance name of the host machine.
    - name: machine.type
      level: extended
      type: keyword
      ignore_above: 1024
      description: Machine type of the host machine.
      example: t2.medium
    - name: provider
      level: extended
      type: keyword
      ignore_above: 1024
      description: Name of the cloud provider. Example values are aws, azure, gcp, or digitalocean.
      example: aws
    - name: region
      level: extended
      type: keyword
      ignore_above: 1024
      description: Region in which this host is running.
      example: us-east-1
    - name: project.id
      type: keyword
      description: Name of the project in Google Cloud.
    - name: image.id
      type: keyword
      description: Image ID for the cloud instance.
- name: container
  title: Container
  group: 2
  description: 'Container fields are used for meta information about the specific container that is the source of information.

    These fields help correlate data based containers from any runtime.'
  type: group
  fields:
    - name: id
      level: core
      type: keyword
      ignore_above: 1024
      description: Unique container id.
    - name: image.name
      level: extended
      type: keyword
      ignore_above: 1024
      description: Name of the image the container was built on.
    - name: labels
      level: extended
      type: object
      object_type: keyword
      description: Image labels.
    - name: name
      level: extended
      type: keyword
      ignore_above: 1024
      description: Container name.
- name: host
  title: Host
  group: 2
  description: 'A host is defined as a general computing instance.

    ECS host.* fields should be populated with details about the host on which the event happened, or from which the measurement was taken. Host types include hardware, virtual machines, Docker containers, and Kubernetes nodes.'
  type: group
  fields:
    - name: architecture
      level: core
      type: keyword
      ignore_above: 1024
      description: Operating system architecture.
      example: x86_64
    - name: domain
      level: extended
      type: keyword
      ignore_above: 1024
      description: 'Name of the domain of which the host is a member.

        For example, on Windows this could be the host''s Active Directory domain or NetBIOS domain name. For Linux this could be the domain of the host''s LDAP provider.'
      example: CONTOSO
      default_field: false
    - name: hostname
      level: core
      type: keyword
      ignore_above: 1024
      description: 'Hostname of the host.

        It normally contains what the `hostname` command returns on the host machine.'
    - name: id
      level: core
      type: keyword
      ignore_above: 1024
      description: 'Unique host id.

        As hostname is not always unique, use values that are meaningful in your environment.

        Example: The current usage of `beat.name`.'
    - name: ip
      level: core
      type: ip
      description: Host ip addresses.
    - name: mac
      level: core
      type: keyword
      ignore_above: 1024
      description: Host mac addresses.
    - name: name
      level: core
      type: keyword
      ignore_above: 1024
      description: 'Name of the host.

        It can contain what `hostname` returns on Unix systems, the fully qualified domain name, or a name specified by the user. The sender decides which value to use.'
    - name: os.family
      level: extended
      type: keyword
      ignore_above: 1024
      description: OS family (such as redhat, debian, freebsd, windows).
      example: debian
    - name: os.kernel
      level: extended
      type: keyword
      ignore_above: 1024
      description: Operating system kernel version as a raw string.
      example: 4.4.0-112-generic
    - name: os.name
      level: extended
      type: keyword
      ignore_above: 1024
      multi_fields:
        - name: text
          type: text
          norms: false
          default_field: false
      description: Operating system name, without the version.
      example: Mac OS X
    - name: os.platform
      level: extended
      type: keyword
      ignore_above: 1024
      description: Operating system platform (such centos, ubuntu, windows).
      example: darwin
    - name: os.version
      level: extended
      type: keyword
      ignore_above: 1024
      description: Operating system version as a raw string.
      example: 10.14.1
    - name: type
      level: core
      type: keyword
      ignore_above: 1024
      description: 'Type of host.

        For Cloud providers this can be the machine type like `t2.medium`. If vm, this could be the container, for example, or other information meaningful in your environment.'
    - name: containerized
      type: boolean
      description: >
        If the host is a container.

    - name: os.build
      type: keyword
      example: "18D109"
      description: >
        OS build information.

    - name: os.codename
      type: keyword
      example: "stretch"
      description: >
        OS codename, if any.

95 changes: 95 additions & 0 deletions packages/kubernetes/data_stream/pod_logs/fields/base-fields.yml
@@ -0,0 +1,95 @@
- name: data_stream.type
  type: constant_keyword
  description: Data stream type.
- name: data_stream.dataset
  type: constant_keyword
  description: Data stream dataset.
- name: data_stream.namespace
  type: constant_keyword
  description: Data stream namespace.
- name: '@timestamp'
  type: date
  description: Event timestamp.
- name: kubernetes
  type: group
  fields:
    - name: pod.name
      type: keyword
      description: >
        Kubernetes pod name

    - name: pod.uid
      type: keyword
      description: >
        Kubernetes pod UID

    - name: pod.ip
      type: ip
      description: >
        Kubernetes pod IP

    - name: namespace
      type: keyword
      description: >
        Kubernetes namespace

    - name: node.name
      type: keyword
      description: >
        Kubernetes node name

    - name: node.hostname
      type: keyword
      description: >
        Kubernetes hostname as reported by the node’s kernel

    - name: labels.*
      type: object
      object_type: keyword
      object_type_mapping_type: "*"
      description: >
        Kubernetes labels map

    - name: annotations.*
      type: object
      object_type: keyword
      object_type_mapping_type: "*"
      description: >
        Kubernetes annotations map

    - name: selectors.*
      type: object
      object_type: keyword
      object_type_mapping_type: "*"
      description: >
        Kubernetes Service selectors map

    - name: replicaset.name
      type: keyword
      description: >
        Kubernetes replicaset name

    - name: deployment.name
      type: keyword
      description: >
        Kubernetes deployment name

    - name: daemonset.name
      type: keyword
      description: >
        Kubernetes daemonset name

    - name: statefulset.name
      type: keyword
      description: >
        Kubernetes statefulset name

    - name: container.name
      type: keyword
      description: >
        Kubernetes container name

    - name: container.image
      type: keyword
      description: >-
        Kubernetes container image
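
To make the nesting of these fields concrete, here is a hypothetical event that uses a few of them, written as YAML for readability. Every value is invented for illustration, and the dataset name `kubernetes.pod_logs` is an assumption based on the package and data stream directory names, not something stated in this diff.

```yaml
# Hypothetical event; all values are made up for illustration.
"@timestamp": "2021-01-01T00:00:00.000Z"
data_stream:
  type: logs
  dataset: kubernetes.pod_logs   # assumed dataset name
  namespace: default
kubernetes:
  namespace: default
  pod:
    name: example-redis
    uid: 00000000-0000-0000-0000-000000000000
    ip: 10.0.0.1
  node:
    name: node-1
  labels:                        # free-form map, covered by labels.*
    app: redis
  container:
    name: redis
    image: redis:latest
```

The `labels.*`, `annotations.*`, and `selectors.*` entries are declared as keyword objects, so arbitrary keys such as `app` above map to keyword values.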
21 changes: 21 additions & 0 deletions packages/kubernetes/data_stream/pod_logs/manifest.yml
@@ -0,0 +1,21 @@
title: "Kubernetes pod logs"
type: logs
streams:
  - input: logfile
    title: Collect Kubernetes pod logs
    description: Collect Kubernetes pod logs
    vars:
      - name: paths
        type: text
        required: true
        title: Kubernetes pod log path
        multi: true
        default:
          - /var/log/containers/*${kubernetes.container.id}.log
      - name: symlinks
        type: bool
        title: Use Symlinks
        multi: false
        required: true
        show_user: true
        default: true
7 changes: 6 additions & 1 deletion packages/kubernetes/docs/README.md
@@ -1,6 +1,6 @@
# Kubernetes integration

-This integration is used to collect metrics from
+This integration is used to collect logs and metrics from
[Kubernetes clusters](https://kubernetes.io/).

As one of the main pieces provided for Kubernetes monitoring, this integration is capable of fetching metrics from several components:
@@ -78,6 +78,11 @@ These datasets are not enabled by default.
Note: In some "As a Service" Kubernetes implementations, like `GKE`, the master nodes or even the pods running on
the masters won't be visible. In these cases it won't be possible to use `scheduler` and `controllermanager` metricsets.

#### kube-pod-logs
Review comment (Member): this seems outdated too

The kube-pod-logs dataset requires access to the log files in each Kubernetes node where the pod logs are stored.
This defaults to `/var/log/containers/*${kubernetes.container.id}.log`.

## Compatibility

The Kubernetes package is tested with Kubernetes 1.13.x, 1.14.x, 1.15.x, 1.16.x, 1.17.x, and 1.18.x
6 changes: 6 additions & 0 deletions packages/kubernetes/docs/kube-pod-logs.md
@@ -0,0 +1,6 @@
# kube-pod-logs

kube-pod-logs integration collects and parses logs of Kubernetes pods.

It requires access to the log files in each Kubernetes node where the pod logs are stored.
This defaults to `/var/log/containers/*${kubernetes.container.id}.log`.