chore: cleanup readmes for processor, aggregator, and parser plugins
reimda committed Jun 6, 2022
1 parent d133143 commit 428b738
Showing 39 changed files with 592 additions and 339 deletions.
5 changes: 3 additions & 2 deletions plugins/aggregators/basicstats/README.md
@@ -1,7 +1,8 @@
# BasicStats Aggregator Plugin

The BasicStats aggregator plugin gives us count, diff, max, min, mean,
non_negative_diff, sum, s2 (variance), and stdev for a set of values, emitting
the aggregate every `period` seconds.

## Configuration
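The configuration block is collapsed in this diff view; for orientation, a
typical setup might look like the following sketch (the `stats` option list is
an assumption about the plugin's available options, so verify it against your
Telegraf version):

```toml
[[aggregators.basicstats]]
  ## The period on which to flush & clear the aggregator.
  period = "30s"

  ## If true, the original metric will be dropped by the aggregator
  ## and will not get sent to the output plugins.
  drop_original = false

  ## Which stats to emit; omit to use the default set.
  # stats = ["count", "diff", "min", "max", "mean", "stdev", "s2", "sum"]
```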

67 changes: 41 additions & 26 deletions plugins/aggregators/derivative/README.md
@@ -31,31 +31,36 @@ derivative = --------------------------------
variable_last - variable_first
```

**Make sure the specified variable is not filtered and exists in the metrics
passed to this aggregator!**

When using a custom derivation variable, you should change the `suffix` of the
derivative name. See the next section on [customizing the derivative
name](#customize-the-derivative-name) for details.

## Customize the Derivative Name

The derivatives generated by the aggregator are named `<fieldname>_rate`,
i.e. they are composed of the field name and a suffix `_rate`. You can
configure the suffix to be used by changing the `suffix` parameter.
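For illustration, a configuration using a custom derivation variable together
with a matching suffix might look like this sketch (the `suffix` and `variable`
option names follow this README's discussion; check them against the
Configuration section below):

```toml
[[aggregators.derivative]]
  ## The period in which to flush the aggregator.
  period = "30s"

  ## Suffix appended to the field name of the derivative,
  ## e.g. "requests" becomes "requests_per_uptime".
  suffix = "_per_uptime"

  ## Field to derive by instead of the metric timestamp.
  variable = "uptime"
```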

## Roll-Over to next Period

Calculating the derivative for a period requires at least two distinct
measurements during that period. Whether those are available depends on the
configuration of the aggregator `period` and the agent `interval`. By default
the last measurement is used as the first measurement in the next aggregation
period, which enables a continuous calculation of the derivative. If an earlier
timestamp is encountered within the next period, that measurement replaces the
roll-over metric. A main benefit of this roll-over is the ability to cope with
multiple "quiet" periods, where no new measurement is pushed to the aggregator.
The roll-over will take place at most `max_roll_over` times.
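The roll-over behavior can be sketched in Python as follows (a simplified model
of the aggregator, not the plugin's actual Go implementation; out-of-order
metrics and the `_rate` field naming are ignored):

```python
def derivatives(events, period, max_roll_over):
    """Compute one derivative per aggregation period with roll-over.

    events: list of (timestamp, value) pairs sorted by timestamp.
    Periods are inclusive at the start and exclusive at the end.
    """
    results = []
    carry, carry_rolls = None, 0  # last event of the previous period
    start, last_ts = events[0][0], events[-1][0]
    while start <= last_ts:
        window = [e for e in events if start <= e[0] < start + period]
        if carry is not None and carry_rolls < max_roll_over:
            window.insert(0, carry)  # roll the previous measurement over
        if len(window) >= 2:
            (t0, v0), (t1, v1) = window[0], window[-1]
            results.append((v1 - v0) / (t1 - t0))
        if window:
            # a fresh measurement resets the roll count
            carry_rolls = carry_rolls + 1 if window[-1] is carry else 0
            carry = window[-1]
        else:
            carry_rolls += 1
        start += period
    return results
```

With the example series from the tables below (values rising from 0.0 to 10.0
and falling back), `derivatives(events, 10, 0)` yields `[1.0, -1.0]`, while
`derivatives(events, 2, 1)` yields five `1.0` aggregates followed by five
`-1.0` aggregates.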

### Example of Roll-Over

Let us assume we have an input plugin that generates a measurement with a
single metric "test" every 2 seconds. Let this metric increase the first 10
seconds from 0.0 to 10.0 and then decrease the next 10 seconds from 10.0 to
0.0:

| timestamp | value |
|-----------|-------|
@@ -71,8 +76,9 @@ Let this metric increase the first 10 seconds from 0.0 to 10.0 and then decrease
| 18 | 2.0 |
| 20 | 0.0 |

To avoid thinking about border values, we consider periods to be inclusive at
the start but exclusive in the end. Using `period = "10s"` and `max_roll_over =
0` we would get the following aggregates:

| timestamp | value | aggregate | explanation |
|-----------|-------|-----------|--------------|
@@ -90,9 +96,11 @@ Using `period = "10s"` and `max_roll_over = 0` we would get the following aggreg
||| -1.0 | (2.0 - 10.0) / (18 - 10)
| 20 | 0.0 |

If we now decrease the period with `period = "2s"`, no derivative could be
calculated since there would be only one measurement for each period. The
aggregator will emit the log message `Same first and last event for "test",
skipping.`. This changes if we use `max_roll_over = 1`, since now the end
measurements of a period are taken as the start for the next period.

| timestamp | value | aggregate | explanation |
|-----------|-------|-----------|--------------|
@@ -108,10 +116,12 @@ This changes, if we use `max_roll_over = 1`, since now end measurements of a per
| 18 | 2.0 | -1.0 | (2.0 - 4.0) / (18 - 16) |
| 20 | 0.0 | -1.0 | (0.0 - 2.0) / (20 - 18) |

The default `max_roll_over = 10` allows for multiple periods without
measurements either due to configuration or missing input.

There may be a slight difference in the calculation when using `max_roll_over`
compared to running without. To illustrate this, let us compare the derivatives
for `period = "7s"`.

| timestamp | value | `max_roll_over = 0` | `max_roll_over = 1` |
|-----------|-------|-----------|--------------|
@@ -130,10 +140,15 @@ To illustrate this, let us compare the derivatives for `period = "7s"`.
| 20 | 0.0 |
||| -1.0 | -1.0 |

The difference stems from the change of the value between periods, e.g. from
6.0 to 8.0 between the first and second period. Those changes are omitted with
`max_roll_over = 0` but are respected with `max_roll_over = 1`. That there are
no further differences in the calculated derivatives is due to the example
data, which has constant derivatives during the first and last period, even
when including the gap between the periods. Using `max_roll_over` with a value
greater than 0 may be important if you need to detect changes between periods,
e.g. when you have very few measurements in a period or quasi-constant metrics
with only occasional changes.

## Configuration

31 changes: 18 additions & 13 deletions plugins/aggregators/histogram/README.md
@@ -4,25 +4,29 @@ The histogram aggregator plugin creates histograms containing the counts of
field values within a range.

If `cumulative` is set to true, values added to a bucket are also added to the
larger buckets in the distribution. This creates a [cumulative histogram][1].
Otherwise, values are added to only one bucket, which creates an [ordinary
histogram][1].

Like other Telegraf aggregators, the metric is emitted every `period` seconds.
By default bucket counts are not reset between periods and will be non-strictly
increasing while Telegraf is running. This behavior can be changed by setting
the `reset` parameter to true.

[1]: https://en.wikipedia.org/wiki/Histogram#/media/File:Cumulative_vs_normal_histogram.svg

## Design

Each metric is passed to the aggregator, which searches the histogram buckets
for the fields specified in the config. If a matching bucket is found, the
aggregator increments its count by one; otherwise, the value is counted in the
`+Inf` bucket. Every `period` seconds this data will be forwarded to the
outputs.

The bucket hit-counting algorithm is based on the one implemented in the
Prometheus [client][2].

[2]: https://github.com/prometheus/client_golang/blob/master/prometheus/histogram.go

## Configuration

Expand Down Expand Up @@ -77,9 +81,10 @@ option. Optionally, if `fields` is set only the fields listed will be
aggregated. If `fields` is not set all fields are aggregated.

The `buckets` option contains a list of floats which specify the bucket
boundaries. Each float value defines the inclusive upper (right) bound of the
bucket. The `+Inf` bucket is added automatically and does not need to be
defined. (For left boundaries, these specified bucket borders and `-Inf` will
be used).
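The bucketing rules above can be sketched in Python (a simplified model, not
the plugin's Go code; the final slot of the counts list stands in for the
`+Inf` bucket):

```python
def bucket_counts(values, buckets, cumulative=False):
    """Count values into histogram buckets.

    Each float in `buckets` is the inclusive upper (right) bound of its
    bucket; anything larger than the last bound lands in the +Inf bucket.
    """
    counts = [0] * (len(buckets) + 1)  # last slot is the +Inf bucket
    for v in values:
        for i, bound in enumerate(buckets):
            if v <= bound:
                counts[i] += 1
                break
        else:
            counts[-1] += 1  # larger than every configured bound
    if cumulative:
        # each bucket also includes the counts of all smaller buckets
        for i in range(1, len(counts)):
            counts[i] += counts[i - 1]
    return counts
```

For example, with `buckets = [0.0, 10.0, 20.0]` the values
`[1.0, 5.0, 15.0, 100.0]` produce the ordinary counts `[0, 2, 1, 1]` and the
cumulative counts `[0, 2, 3, 4]`.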

## Measurements & Fields

2 changes: 1 addition & 1 deletion plugins/aggregators/merge/README.md
@@ -1,4 +1,4 @@
# Merge Aggregator Plugin

Merge metrics together into a single metric with multiple fields, the most
memory- and network-transfer-efficient form.
24 changes: 13 additions & 11 deletions plugins/aggregators/quantile/README.md
@@ -1,7 +1,7 @@
# Quantile Aggregator Plugin

The quantile aggregator plugin aggregates specified quantiles for each numeric
field per metric it sees and emits the quantiles every `period`.

## Configuration

@@ -52,14 +52,15 @@ For implementation details see the underlying [golang library][tdigest_lib].

### exact R7 and R8

These algorithms compute quantiles as described in [Hyndman & Fan
(1996)][hyndman_fan]. The R7 variant is used in Excel and NumPy. The R8
variant is recommended by Hyndman & Fan due to its independence of the
underlying sample distribution.
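For reference, the R7 variant can be sketched in a few lines of Python (the
aggregator itself is written in Go; this merely mirrors the linear
interpolation of Hyndman & Fan's definition 7, which keeps all samples in
memory, as noted below):

```python
import math

def quantile_r7(data, q):
    """R7 quantile of `data` for q in [0, 1]."""
    xs = sorted(data)
    n = len(xs)
    h = (n - 1) * q          # fractional index into the sorted sample
    lo = math.floor(h)
    if lo + 1 >= n:
        return xs[-1]        # q at or beyond the last sample
    # linear interpolation between the two neighboring order statistics
    return xs[lo] + (h - lo) * (xs[lo + 1] - xs[lo])
```

For example, the R7 median of `[1.0, 2.0, 3.0, 4.0]` is `2.5`, matching the
default behavior of Excel and NumPy.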

These algorithms save all data for the aggregation `period`. They require a lot
of memory when used with a large number of series or a large number of
samples. They are slower than the `t-digest` algorithm and are recommended only
to be used with a small number of samples and series.

## Benchmark (linux/amd64)

@@ -108,8 +109,9 @@ and the default setting for `quantiles` you get the following *output*
- maximum_response_ms_050 (float64)
- maximum_response_ms_075 (float64)

The `status` and `ok` fields are dropped because they are not numeric. Note
that the number of resulting fields scales with the number of `quantiles`
specified.

### Tags

55 changes: 35 additions & 20 deletions plugins/aggregators/starlark/README.md
@@ -1,16 +1,22 @@
# Starlark Aggregator Plugin

The `starlark` aggregator allows you to implement a custom aggregator plugin
with a Starlark script. The Starlark script needs to implement the three
methods defined in the Aggregator plugin interface: `add`, `push` and `reset`.

The Starlark Aggregator plugin calls the Starlark function `add` to add the
metrics to the aggregator, then calls the Starlark function `push` to push the
resulting metrics into the accumulator and finally calls the Starlark function
`reset` to reset the entire state of the plugin.

The Starlark functions can use the global `state` variable to temporarily keep
the metrics to aggregate.

The Starlark language is a dialect of Python, and will be familiar to those who
have experience with the Python language. However, there are major
[differences](#python-differences). Existing
Python code is unlikely to work unmodified. The execution environment is
sandboxed, and it is not possible to do I/O operations such as reading from
files or sockets.

The **[Starlark specification][]** has details about the syntax and available
@@ -52,24 +58,27 @@ def reset():

## Usage

The Starlark code should contain a function called `add` that takes a metric as
an argument. The function will be called with each metric to add, and doesn't
return anything.

```python
def add(metric):
state["last"] = metric
```

The Starlark code should also contain a function called `push` that doesn't take
any argument. The function will be called to compute the aggregation, and
returns the metrics to push to the accumulator.

```python
def push():
return state.get("last")
```

The Starlark code should also contain a function called `reset` that doesn't
take any argument. The function will be called to reset the plugin, and doesn't
return anything.

```python
def reset():
    state.clear()
```

@@ -81,22 +90,28 @@ the [Starlark specification][].

## Python Differences

Refer to the section [Python
Differences](../../processors/starlark/README.md#python-differences) of the
documentation about the Starlark processor.

## Libraries available

Refer to the section [Libraries
available](../../processors/starlark/README.md#libraries-available) of the
documentation about the Starlark processor.

## Common Questions

Refer to the section [Common
Questions](../../processors/starlark/README.md#common-questions) of the
documentation about the Starlark processor.

## Examples

- [minmax](testdata/min_max.star) - A minmax aggregator implemented with a Starlark script.
- [merge](testdata/merge.star) - A merge aggregator implemented with a Starlark script.

[All examples](testdata) are in the testdata folder.

Open a Pull Request to add any other useful Starlark examples.

7 changes: 4 additions & 3 deletions plugins/parsers/collectd/README.md
@@ -1,4 +1,4 @@
# Collectd Parser Plugin

The collectd format parses the collectd binary network protocol. Tags are
created for host, instance, type, and type instance. All collectd values are
@@ -11,12 +11,13 @@ You can control the cryptographic settings with parser options. Create an
authentication file and set `collectd_auth_file` to the path of the file, then
set the desired security level in `collectd_security_level`.

Additional information including client setup can be found [here][1].

You can also change the path to the typesdb or add additional typesdb using
`collectd_typesdb`.

[1]: https://collectd.org/wiki/index.php/Networking_introduction#Cryptographic_setup

## Configuration

```toml
10 changes: 5 additions & 5 deletions plugins/parsers/csv/README.md
@@ -1,4 +1,4 @@
# CSV Parser Plugin

The `csv` parser creates metrics from a document containing comma separated
values.
@@ -107,10 +107,10 @@ time using the JSON document you can use the `csv_timestamp_column` and
`csv_timestamp_format` options together to set the time to a value in the parsed
document.

The `csv_timestamp_column` option specifies the key containing the time value
and `csv_timestamp_format` must be set to `unix`, `unix_ms`, `unix_us`,
`unix_ns`, or a format string using the Go "reference time", which is defined
to be the **specific time**: `Mon Jan 2 15:04:05 MST 2006`.

Consult the Go [time][time parse] package for details and additional examples
on how to set the time format.
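Putting these options together, a configuration might look like the following
sketch (the `inputs.file` wrapper, the file path, the `time` column name, and
the `csv_header_row_count` value are illustrative assumptions, not part of this
README):

```toml
[[inputs.file]]
  files = ["/path/to/metrics.csv"]
  data_format = "csv"
  csv_header_row_count = 1

  ## Column holding the timestamp and the layout it is written in,
  ## here an RFC 3339-style layout in Go reference-time notation.
  csv_timestamp_column = "time"
  csv_timestamp_format = "2006-01-02T15:04:05Z07:00"
```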