feat(outputs.opensearch): opensearch output plugin #11948

Closed
wants to merge 1 commit into from
5 changes: 5 additions & 0 deletions plugins/outputs/all/opensearch.go
Original file line number Diff line number Diff line change
@@ -0,0 +1,5 @@
//go:build !custom || outputs || outputs.opensearch

package all

import _ "github.com/influxdata/telegraf/plugins/outputs/opensearch" // register plugin
364 changes: 364 additions & 0 deletions plugins/outputs/opensearch/README.md
Original file line number Diff line number Diff line change
@@ -0,0 +1,364 @@
# Opensearch Output Plugin

This plugin writes to [Opensearch](https://opensearch.org/) via HTTP using the
Elastic client API (<http://olivere.github.io/elastic/>).

It supports Opensearch releases from 1.x up to 2.x.

## Opensearch indexes and templates

### Indexes per time-frame

This plugin can manage indexes per time-frame, as commonly done in other tools
with Opensearch.

The timestamp of the metric collected will be used to decide the index
destination.

For more information about this usage on Opensearch, check [the
docs][1].

[1]: https://opensearch.org/docs/latest/

### Template management

Index templates are used in Opensearch to define settings and mappings for
the indexes and how the fields should be analyzed. For more information on how
this works, see [the docs][2].

This plugin can create a working template for use with telegraf metrics. It uses
the Opensearch dynamic templates feature to set proper types for the tags and
metrics fields. If the specified template already exists, it will not be
overwritten unless you configure this plugin to do so. Thus you can customize
this template after its creation if necessary.
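
The create-or-keep behaviour described above can be sketched in Go as follows.
This is only an illustration, not the plugin's code: `templateStore`,
`ensureTemplate` and `memoryStore` are hypothetical names, and the in-memory
store stands in for the cluster's template API.

```go
package main

import "fmt"

// templateStore is a hypothetical abstraction over the cluster's template API.
type templateStore interface {
	Exists(name string) (bool, error)
	Put(name, body string) error
}

// ensureTemplate writes the template unless one with the same name already
// exists and overwrite is false. It reports whether a template was written.
func ensureTemplate(s templateStore, name, body string, overwrite bool) (bool, error) {
	exists, err := s.Exists(name)
	if err != nil {
		return false, fmt.Errorf("checking template %q: %w", name, err)
	}
	if exists && !overwrite {
		return false, nil // keep the existing, possibly customized, template
	}
	if err := s.Put(name, body); err != nil {
		return false, fmt.Errorf("writing template %q: %w", name, err)
	}
	return true, nil
}

// memoryStore is an in-memory stand-in used only for this example.
type memoryStore map[string]string

func (m memoryStore) Exists(name string) (bool, error) { _, ok := m[name]; return ok, nil }
func (m memoryStore) Put(name, body string) error      { m[name] = body; return nil }

func main() {
	store := memoryStore{"telegraf": "customized"}
	written, err := ensureTemplate(store, "telegraf", "default", false)
	if err != nil {
		panic(err)
	}
	fmt.Println(written, store["telegraf"]) // false customized
}
```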

Example of an index template created by telegraf on Opensearch 2.x:

```json
{
  "telegraf-2022.10.02" : {
    "aliases" : { },
    "mappings" : {
      "properties" : {
        "@timestamp" : {
          "type" : "date"
        },
        "disk" : {
          "properties" : {
            "free" : {
              "type" : "long"
            },
            "inodes_free" : {
              "type" : "long"
            },
            "inodes_total" : {
              "type" : "long"
            },
            "inodes_used" : {
              "type" : "long"
            },
            "total" : {
              "type" : "long"
            },
            "used" : {
              "type" : "long"
            },
            "used_percent" : {
              "type" : "float"
            }
          }
        },
        "measurement_name" : {
          "type" : "text",
          "fields" : {
            "keyword" : {
              "type" : "keyword",
              "ignore_above" : 256
            }
          }
        },
        "tag" : {
          "properties" : {
            "cpu" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "device" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "host" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "mode" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            },
            "path" : {
              "type" : "text",
              "fields" : {
                "keyword" : {
                  "type" : "keyword",
                  "ignore_above" : 256
                }
              }
            }
          }
        }
      }
    },
    "settings" : {
      "index" : {
        "creation_date" : "1664693522789",
        "number_of_shards" : "1",
        "number_of_replicas" : "1",
        "uuid" : "TYugdmvsQfmxjzbGRJ8FIw",
        "version" : {
          "created" : "136247827"
        },
        "provided_name" : "telegraf-2022.10.02"
      }
    }
  }
}
```

[2]: https://opensearch.org/docs/latest/opensearch/index-templates/

### Example events

This plugin will format the events in the following way:

```json
{
  "@timestamp": "2017-01-01T00:00:00+00:00",
  "measurement_name": "cpu",
  "cpu": {
    "usage_guest": 0,
    "usage_guest_nice": 0,
    "usage_idle": 71.85413456197966,
    "usage_iowait": 0.256805341656516,
    "usage_irq": 0,
    "usage_nice": 0,
    "usage_softirq": 0.2054442732579466,
    "usage_steal": 0,
    "usage_system": 15.04879301548127,
    "usage_user": 12.634822807288275
  },
  "tag": {
    "cpu": "cpu-total",
    "host": "opensearhhost",
    "dc": "datacenter1"
  }
}
```

```json
{
  "@timestamp": "2017-01-01T00:00:00+00:00",
  "measurement_name": "system",
  "system": {
    "load1": 0.78,
    "load15": 0.8,
    "load5": 0.8,
    "n_cpus": 2,
    "n_users": 2
  },
  "tag": {
    "host": "opensearhhost",
    "dc": "datacenter1"
  }
}
```

## Configuration

```toml @sample.conf
# Configuration for Opensearch to send metrics to.
[[outputs.opensearch]]
## The full HTTP endpoint URL for your Opensearch instance
## Multiple urls can be specified as part of the same cluster,
## this means that only ONE of the urls will be written to each interval
urls = [ "http://node1.es.example.com:9200" ] # required.
## Opensearch client timeout, defaults to "5s" if not set.
timeout = "5s"
## Set to true to ask Opensearch for a list of all cluster nodes,
## so that it is not necessary to list all nodes in the urls config option
enable_sniffer = false
## Set to true to enable gzip compression
enable_gzip = false
## Set the interval to check if the Opensearch nodes are available
## Setting to "0s" will disable the health check (not recommended in production)
health_check_interval = "10s"
## Set the timeout for periodic health checks.
# health_check_timeout = "1s"
## HTTP basic authentication details
# username = "telegraf"
# password = "mypassword"
## HTTP bearer token authentication details
# auth_bearer_token = "eyJhbGciOiJIUzI1NiIsInR5cCI6IkpXVCJ9"

## Index Config
## The target index for metrics (Opensearch will create it if it does not exist).
## You can use the date specifiers below to create indexes per time frame.
## The metric timestamp will be used to decide the destination index name
# %Y - year (2016)
# %y - last two digits of year (00..99)
# %m - month (01..12)
# %d - day of month (e.g., 01)
# %H - hour (00..23)
# %V - week of the year (ISO week) (01..53)
## Additionally, you can specify a tag name using the notation {{tag_name}}
## which will be used as part of the index name. If the tag does not exist,
## the default tag value will be used.
# index_name = "telegraf-{{host}}-%Y.%m.%d"
# default_tag_value = "none"
index_name = "telegraf-%Y.%m.%d" # required.

## Optional TLS Config
# tls_ca = "/etc/telegraf/ca.pem"
# tls_cert = "/etc/telegraf/cert.pem"
# tls_key = "/etc/telegraf/key.pem"
## Use TLS but skip chain & host verification
# insecure_skip_verify = false

## Template Config
## Set to true if you want telegraf to manage its index template.
## If enabled it will create a recommended index template for telegraf indexes
manage_template = true
## The template name used for telegraf indexes
template_name = "telegraf"
## Set to true if you want telegraf to overwrite an existing template
overwrite_template = false
## If set to true, a unique document ID is sent as the sha256(concat(timestamp,measurement,series-hash)) string.
## This allows data to be resent and metric points to be updated without creating duplicate documents with different IDs.
force_document_id = false

## Specifies the handling of NaN and Inf values.
## This option can have the following values:
## none -- do not modify field-values (default); will produce an error if NaNs or infs are encountered
## drop -- drop fields containing NaNs or infs
## replace -- replace with the value in "float_replacement_value" (default: 0.0)
## NaNs and inf will be replaced with the given number, -inf with the negative of that number
# float_handling = "none"
# float_replacement_value = 0.0

## Pipeline Config
## To use an ingest pipeline, set this to the name of the pipeline you want to use.
# use_pipeline = "my_pipeline"
## Additionally, you can specify a tag name using the notation {{tag_name}}
## which will be used as part of the pipeline name. If the tag does not exist,
## the default pipeline will be used as the pipeline. If no default pipeline is set,
## no pipeline is used for the metric.
# use_pipeline = "{{es_pipeline}}"
# default_pipeline = "my_pipeline"
```
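
The three `float_handling` modes from the sample configuration can be sketched
in Go as follows. `sanitizeFloat` and its return values are hypothetical and
only illustrate the documented behaviour; the plugin's own code may be
structured differently.

```go
package main

import (
	"fmt"
	"math"
)

// sanitizeFloat returns the value to store and whether the field is kept.
func sanitizeFloat(v float64, handling string, replacement float64) (float64, bool, error) {
	if !math.IsNaN(v) && !math.IsInf(v, 0) {
		return v, true, nil // ordinary values are never modified
	}
	switch handling {
	case "none":
		// Left untouched; per the option's description an error is expected later.
		return v, true, nil
	case "drop":
		return 0, false, nil
	case "replace":
		if math.IsInf(v, -1) {
			return -replacement, true, nil // -inf keeps its sign
		}
		return replacement, true, nil
	default:
		return 0, false, fmt.Errorf("unknown float_handling %q", handling)
	}
}

func main() {
	for _, v := range []float64{1.5, math.NaN(), math.Inf(1), math.Inf(-1)} {
		out, keep, err := sanitizeFloat(v, "replace", 5)
		fmt.Println(v, "->", out, keep, err)
	}
}
```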

### Permissions

If you are using authentication within your Opensearch cluster, you need to
create an account and create a role with at least the manage role in the Cluster
Privileges category. Otherwise, your account will not be able to connect to your
Opensearch cluster and send metrics to it. After that, you need to add the
"create_index" and "write" permissions to your specific index pattern.

### Required parameters

* `urls`: A list containing the full HTTP URL of one or more nodes from your
Opensearch instance.
* `index_name`: The target index for metrics. You can use the date specifiers
below to create indexes per time frame.

```text
%Y - year (2017)
%y - last two digits of year (00..99)
%m - month (01..12)
%d - day of month (e.g., 01)
%H - hour (00..23)
%V - week of the year (ISO week) (01..53)
```

Additionally, you can specify dynamic index names by using tags with the
notation `{{tag_name}}`. This will store the metrics with different tag
values in different indices. If the tag does not exist in a particular metric,
the `default_tag_value` will be used instead.

### Optional parameters

* `timeout`: Opensearch client timeout, defaults to "5s" if not set.
* `enable_sniffer`: Set to true to ask Opensearch for a list of all cluster
nodes, so that it is not necessary to list all nodes in the `urls` config option.
* `health_check_interval`: The interval at which to check whether the nodes are
available, given as a duration such as "10s". Setting it to "0s" disables the
health check (not recommended in production).
* `username`: The username for HTTP basic authentication (e.g. when using the
Opensearch Security plugin).
* `password`: The password for HTTP basic authentication (e.g. when using the
Opensearch Security plugin).
* `manage_template`: Set to true if you want telegraf to manage its index
template. If enabled it will create a recommended index template for telegraf
indexes.
* `template_name`: The template name used for telegraf indexes.
* `overwrite_template`: Set to true if you want telegraf to overwrite an
existing template.
* `force_document_id`: If set to true, a unique document ID is computed as
sha256(concat(timestamp,measurement,series-hash)). This allows data to be resent
or updated without creating duplicate documents in Opensearch.
* `float_handling`: Specifies how to handle `NaN` and infinite field
values. `"none"` (default) will do nothing, `"drop"` will drop the field and
`"replace"` will replace the field value by the number in
`float_replacement_value`.
* `float_replacement_value`: Value (defaulting to `0.0`) to replace `NaN`s and
`inf`s if `float_handling` is set to `"replace"`. Negative `inf` will be
replaced by the negative of this number to respect the sign of the
field's original value.
* `use_pipeline`: If set, this value will be used as the pipeline to call
when sending events to Opensearch. Additionally, you can specify dynamic
pipeline names by using tags with the notation `{{tag_name}}`. If the tag
does not exist in a particular metric, the `default_pipeline` will be used
instead.
* `default_pipeline`: If dynamic pipeline names are used and the tag does not
exist in a particular metric, this value will be used instead.
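
The document ID described for `force_document_id` can be sketched in Go as
follows. `documentID` is a hypothetical helper: passing the timestamp as Unix
nanoseconds, rendering both numbers in decimal before concatenation, and
hex-encoding the digest are assumptions made for the example.

```go
package main

import (
	"crypto/sha256"
	"encoding/hex"
	"fmt"
	"strconv"
)

// documentID hashes the concatenation of timestamp, measurement name and
// series hash, so the same point always maps to the same document ID.
func documentID(unixNano int64, measurement string, seriesHash uint64) string {
	input := strconv.FormatInt(unixNano, 10) + measurement + strconv.FormatUint(seriesHash, 10)
	sum := sha256.Sum256([]byte(input))
	return hex.EncodeToString(sum[:])
}

func main() {
	// The series hash value 42 is a placeholder for this example.
	fmt.Println(documentID(1664693522789000000, "cpu", 42))
}
```

Because the ID depends only on the point's timestamp, measurement and series,
writing the same point again targets the same document instead of creating a
new one.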

## Known issues

Collected integer values that are bigger than 2^63 and smaller than 1e21 (or in
the same window for their negative counterparts) are encoded by the Go JSON
encoder in decimal format, which is not fully supported by Opensearch dynamic
field mapping. This causes metrics with such values to be dropped if a field
mapping has not yet been created on the telegraf index. In that case you will
see an exception on the Opensearch side like this:

```json
{"error":{"root_cause":[{"type":"mapper_parsing_exception","reason":"failed to parse"}],"type":"mapper_parsing_exception","reason":"failed to parse","caused_by":{"type":"illegal_state_exception","reason":"No matching token for number_type [BIG_INTEGER]"}},"status":400}
```

The correct field mapping will be created on the telegraf index as soon as a
supported JSON value is received by Opensearch, and subsequent insertions
will work because the field mapping will already exist.

This issue is caused by the way Opensearch tries to detect integer fields,
and by how Go encodes numbers in JSON. There is no clear workaround for this
at the moment.
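
The encoding behaviour can be reproduced with Go's standard `encoding/json`
package. The sketch below only shows how such values are serialized; `encode`
is a hypothetical helper and nothing here talks to Opensearch.

```go
package main

import (
	"encoding/json"
	"fmt"
	"math"
)

// encode returns the JSON text produced for a single value.
func encode(v interface{}) string {
	b, err := json.Marshal(v)
	if err != nil {
		return "error: " + err.Error()
	}
	return string(b)
}

func main() {
	fmt.Println(encode(int64(math.MaxInt64)))   // 9223372036854775807
	fmt.Println(encode(uint64(math.MaxUint64))) // 18446744073709551615
	fmt.Println(encode(float64(1e20)))          // 100000000000000000000
	fmt.Println(encode(float64(1e21)))          // 1e+21
}
```

The second and third values are written as plain digit strings that exceed the
range of a signed 64-bit integer, which is presumably what the `BIG_INTEGER`
token in the error message refers to. From 1e21 upward the encoder switches to
exponent notation.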