-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix timestamp handling in remote_write #21166
Conversation
Signed-off-by: chrismark <[email protected]>
Pinging @elastic/integrations-platforms (Team:Platforms) |
Signed-off-by: chrismark <[email protected]>
// join metrics with same labels and same timestamp in a single event | ||
timeStampedLabels := labels.Clone() | ||
timeStampedLabels.Put("timestamp", metric.Timestamp.Time()) | ||
labelsHash := timeStampedLabels.String() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I would try to avoid cloning to save memory and cycles. How about appending the timestamp to the labelsHash?
Something like sampleHash := labels.String() + metric.Timestamp.Time()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for taking on this! did you happen to reproduce the issue?
Not with a real Prometheus setup but the testcase https://github.com/elastic/beats/pull/21166/files#diff-08a773efe40f622118ed7a6596ea3adaR45 reproduces exactly a scenario where a metric sent twice in a chunk could be saved only once from us due to the bad timestamp handling. |
Signed-off-by: chrismark <[email protected]>
(cherry picked from commit d2a8922)
(cherry picked from commit d2a8922)
…ne-2.0-arm * upstream/master: (29 commits) Fix librpm installation in auditbeat build (elastic#21239) Fix prometheus default config (elastic#21253) Fix dev guide test command (elastic#21254) Move aws lambda metricset to GA (elastic#21255) [Docs] Typo in table syntax (elastic#20227) [ECS] Adds related.hosts to capture all hostnames and host identifiers on an event. (elastic#21160) Add recursive split to httpjson (elastic#21214) [DOCS] Add beat specific start widgets (elastic#21217) Fix timestamp handling in remote_write (elastic#21166) Fix aws, azure and googlecloud compute dashboards (elastic#21098) Add acceptable event log keys to winlog (elastic#21205) Add elastic-agent to gitignore (elastic#21219) Add cloudfoundry tags to events (elastic#21177) [Ingest Manager] Agent includes pgp file (elastic#19480) Add compatibility note about ingress-controller-v0.34.1 (elastic#21209) [Ingest Manager] Support for UPGRADE_ACTION (elastic#21002) Fix libbeat.output.*.bytes metrics of Elasticsearch output (elastic#21197) [packaging] use docker.elastic.co/ubi8/ubi-minimal (elastic#21154) Add host inventory metrics to system module (elastic#20415) [Filebeat][Azure Module] Fixing event.outcome from result_type issue (elastic#20998) ...
…ne-2.0 * upstream/master: (33 commits) Stop running agent container as root by default (elastic#21213) Stop running auditbeat container as root by default (elastic#21202) Fix autodiscover flaky tests (elastic#21242) [Ingest Manager] Enabled dev builds (elastic#21241) Fix librpm installation in auditbeat build (elastic#21239) Fix prometheus default config (elastic#21253) Fix dev guide test command (elastic#21254) Move aws lambda metricset to GA (elastic#21255) [Docs] Typo in table syntax (elastic#20227) [ECS] Adds related.hosts to capture all hostnames and host identifiers on an event. (elastic#21160) Add recursive split to httpjson (elastic#21214) [DOCS] Add beat specific start widgets (elastic#21217) Fix timestamp handling in remote_write (elastic#21166) Fix aws, azure and googlecloud compute dashboards (elastic#21098) Add acceptable event log keys to winlog (elastic#21205) Add elastic-agent to gitignore (elastic#21219) Add cloudfoundry tags to events (elastic#21177) [Ingest Manager] Agent includes pgp file (elastic#19480) Add compatibility note about ingress-controller-v0.34.1 (elastic#21209) [Ingest Manager] Support for UPGRADE_ACTION (elastic#21002) ...
* upstream/master: (417 commits) libbeat/cmd/instance: report cgroup stats (elastic#21113) Configurable index template loading (elastic#21212) [Ingest Manager] Thread safe sorted set (elastic#21290) Change mirror of kafka download (elastic#19645) [Ingest manager] Copy Action store on upgrade (elastic#21298) [CI] Pipeline 2.0 for monorepos (elastic#20104) Stop running agent container as root by default (elastic#21213) Stop running auditbeat container as root by default (elastic#21202) Fix autodiscover flaky tests (elastic#21242) [Ingest Manager] Enabled dev builds (elastic#21241) Fix librpm installation in auditbeat build (elastic#21239) Fix prometheus default config (elastic#21253) Fix dev guide test command (elastic#21254) Move aws lambda metricset to GA (elastic#21255) [Docs] Typo in table syntax (elastic#20227) [ECS] Adds related.hosts to capture all hostnames and host identifiers on an event. (elastic#21160) Add recursive split to httpjson (elastic#21214) [DOCS] Add beat specific start widgets (elastic#21217) Fix timestamp handling in remote_write (elastic#21166) Fix aws, azure and googlecloud compute dashboards (elastic#21098) ...
* upstream/master: (399 commits) libbeat/cmd/instance: report cgroup stats (elastic#21113) Configurable index template loading (elastic#21212) [Ingest Manager] Thread safe sorted set (elastic#21290) Change mirror of kafka download (elastic#19645) [Ingest manager] Copy Action store on upgrade (elastic#21298) [CI] Pipeline 2.0 for monorepos (elastic#20104) Stop running agent container as root by default (elastic#21213) Stop running auditbeat container as root by default (elastic#21202) Fix autodiscover flaky tests (elastic#21242) [Ingest Manager] Enabled dev builds (elastic#21241) Fix librpm installation in auditbeat build (elastic#21239) Fix prometheus default config (elastic#21253) Fix dev guide test command (elastic#21254) Move aws lambda metricset to GA (elastic#21255) [Docs] Typo in table syntax (elastic#20227) [ECS] Adds related.hosts to capture all hostnames and host identifiers on an event. (elastic#21160) Add recursive split to httpjson (elastic#21214) [DOCS] Add beat specific start widgets (elastic#21217) Fix timestamp handling in remote_write (elastic#21166) Fix aws, azure and googlecloud compute dashboards (elastic#21098) ...
* upstream/master: (60 commits) libbeat/cmd/instance: report cgroup stats (elastic#21113) Configurable index template loading (elastic#21212) [Ingest Manager] Thread safe sorted set (elastic#21290) Change mirror of kafka download (elastic#19645) [Ingest manager] Copy Action store on upgrade (elastic#21298) [CI] Pipeline 2.0 for monorepos (elastic#20104) Stop running agent container as root by default (elastic#21213) Stop running auditbeat container as root by default (elastic#21202) Fix autodiscover flaky tests (elastic#21242) [Ingest Manager] Enabled dev builds (elastic#21241) Fix librpm installation in auditbeat build (elastic#21239) Fix prometheus default config (elastic#21253) Fix dev guide test command (elastic#21254) Move aws lambda metricset to GA (elastic#21255) [Docs] Typo in table syntax (elastic#20227) [ECS] Adds related.hosts to capture all hostnames and host identifiers on an event. (elastic#21160) Add recursive split to httpjson (elastic#21214) [DOCS] Add beat specific start widgets (elastic#21217) Fix timestamp handling in remote_write (elastic#21166) Fix aws, azure and googlecloud compute dashboards (elastic#21098) ...
What does this PR do?
Fixes how we handle timestamps when we process Prometheus Samples using remote_write. Before this change it could be possible that we miss some metrics in case of having the same metric in the same received chunk (with different timestamps). Metrics with different timestamps should go in different final events and hence
timestamp
should be taken into account when grouping metrics into events.Why is it important?
To avoid losing data due to wrong grouping.
Checklist
CHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.