Apply a fixed window before writing row metrics #590
Merged
What this PR does / why we need it:
Apply a fixed window and send aggregated Feature Row metrics instead of sending every Feature Row metric directly. This keeps the metrics collector from being overwhelmed and starting to drop metrics.
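The idea can be illustrated with a minimal, standalone sketch. This is not the code in this PR: the function names, the `(event_time_ms, lag_ms)` sample shape, the 30-second window size and the nearest-rank percentile method are all assumptions made for illustration.

```python
from collections import defaultdict
from statistics import mean


def window_start(ts_ms, size_ms):
    """Return the start of the fixed window that contains ts_ms."""
    return ts_ms - ts_ms % size_ms


def percentile(sorted_values, p):
    """Nearest-rank percentile of a non-empty sorted list (p is an integer 1..100)."""
    n = len(sorted_values)
    rank = max(1, (p * n + 99) // 100)  # integer ceil(p * n / 100)
    return sorted_values[rank - 1]


def aggregate_by_window(samples, size_ms):
    """Group (event_time_ms, lag_ms) samples into fixed windows.

    Returns one aggregate per window instead of one metric per row.
    """
    buckets = defaultdict(list)
    for ts_ms, lag_ms in samples:
        buckets[window_start(ts_ms, size_ms)].append(lag_ms)

    aggregates = {}
    for start, lags in sorted(buckets.items()):
        lags.sort()
        aggregates[start] = {
            "count": len(lags),
            "min": lags[0],
            "max": lags[-1],
            "mean": mean(lags),
            "percentile_90": percentile(lags, 90),
            "percentile_99": percentile(lags, 99),
        }
    return aggregates


# Hypothetical samples: four rows collapse into two windowed aggregates.
samples = [(0, 10), (5_000, 20), (29_999, 30), (30_000, 40)]
result = aggregate_by_window(samples, size_ms=30_000)
```

With this approach the collector receives a fixed number of values per window, regardless of how many rows arrive in it.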
Which issue(s) this PR fixes:
Fixes #528
Does this PR introduce a user-facing change?:
If Telegraf is currently used to export the StatsD metrics to Prometheus, the names of the generated Prometheus metrics change:
This makes them consistent with the metric names for the feature value, feature_value_percentile_90 and feature_value_percentile_99, i.e. percentile_x rather than x_percentile.
The feature_row_event_time_epoch_ms metric is no longer written to StatsD. In our experience this metric is rarely used, and the lag metrics seem to suffice. Dropping it also helps reduce the number of metrics sent.
In summary, these are the Feature Row StatsD metrics written at every fixed window:
Gauge:
Count: