-
Notifications
You must be signed in to change notification settings - Fork 4.9k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[Metricbeat] Convert millis-since-epoch timestamps in elasticsearch/ml_job
metricset to ints
#14222
Conversation
Pinging @elastic/stack-monitoring (Stack monitoring) |
metricbeat/helper/elastic/json.go
Outdated
func Unmarshal(data []byte, v interface{}) error { | ||
dec := json.NewDecoder(bytes.NewReader(data)) | ||
dec.UseNumber() | ||
return dec.Decode(v) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we also need to run
func TransformNumbers(dict common.MapStr) { |
on the output to convert the json.Number
back to a basic type?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since the stack modules in Metricbeat are meant for Stack Monitoring, the events they produce are meant to get indexed into Elasticsearch (directly via output.elasticsearch
or indirectly via another output). Elasticsearch can support numbers as strings (which is what json.Number
is) as long as they are mapped as a numeric type. For example:
PUT foo
{
"mappings": {
"properties": {
"n": { "type": "integer" },
"d": { "type": "date", "format": "epoch_millis"}
}
}
}
# 1571647285959 == Monday, October 21, 2019 8:41:25.959 AM
POST foo/_doc
{
"n": "999",
"d": "1571647285959"
}
# Should return no hits
POST foo/_search
{
"query": {
"range": {
"d": {
"lt": 1571647285900
}
}
}
}
# Should return 1 hit
POST foo/_search
{
"query": {
"range": {
"n": {
"lt": 1000
}
}
}
}
So I think, in the case of these stack modules, it's okay to keep the umarshalled numbers as quoted strings?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I decided to go a different way altogether with the solution here. See #14222 (comment).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(Haven't looked at the new approach yet.) My only concern with leaving the json.Number
values in the event is that conditions in processors probably won't work. For example the range
condition only works on primitive values.
Based on the comment in elastic/elasticsearch#48472 (comment), it looks like only |
e784fce
to
69f919d
Compare
elasticsearch/ml_job
metricset to ints
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I can't comment too much on the specific implementation but on the whole, this looks like a positive change.
jenkins, test this |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That should fix it. 👍
c4e9f8e
to
be7c8d9
Compare
Travis CI is green. Jenkins CI failures are unrelated. Merging... |
…earch/ml_job` metricset to ints (#14222) (#14293) * [Metricbeat] Convert millis-since-epoch timestamps in `elasticsearch/ml_job` metricset to ints (#14222) * Convert timestamp field to int from float64 * Reformatting comment to be less wide * Fixing up comment * Adding unit tests * Adding CHANGELOG entry * Fixing up CHANGELOG
…earch/ml_job` metricset to ints (#14222) (#14294) * [Metricbeat] Convert millis-since-epoch timestamps in `elasticsearch/ml_job` metricset to ints (#14222) * Convert timestamp field to int from float64 * Reformatting comment to be less wide * Fixing up comment * Adding unit tests * Adding CHANGELOG entry * Fixing up CHANGELOG
…earch/ml_job` metricset to ints (#14222) (#14295) * [Metricbeat] Convert millis-since-epoch timestamps in `elasticsearch/ml_job` metricset to ints (#14222) * Convert timestamp field to int from float64 * Reformatting comment to be less wide * Fixing up comment * Adding unit tests * Adding CHANGELOG entry * Fixing up CHANGELOG
…ml_job` metricset to ints (elastic#14222) * Convert timestamp field to int from float64 * Reformatting comment to be less wide * Fixing up comment * Adding unit tests * Adding CHANGELOG entry
…earch/ml_job` metricset to ints (elastic#14222) (elastic#14293) * [Metricbeat] Convert millis-since-epoch timestamps in `elasticsearch/ml_job` metricset to ints (elastic#14222) * Convert timestamp field to int from float64 * Reformatting comment to be less wide * Fixing up comment * Adding unit tests * Adding CHANGELOG entry * Fixing up CHANGELOG
…earch/ml_job` metricset to ints (elastic#14222) (elastic#14294) * [Metricbeat] Convert millis-since-epoch timestamps in `elasticsearch/ml_job` metricset to ints (elastic#14222) * Convert timestamp field to int from float64 * Reformatting comment to be less wide * Fixing up comment * Adding unit tests * Adding CHANGELOG entry * Fixing up CHANGELOG
Resolves #14220.
The
elasticsearch/ml_job
metricset calls theGET "/anomaly_detectors/_all/_stats
Elasticsearch API to collect monitoring metrics about ML jobs running in Elasticsearch. This API's response contains several fields that are timestamps represented as milliseconds-since-epoch; see https://www.elastic.co/guide/en/elasticsearch/reference/current/ml-get-job-stats.html#ml-get-job-stats-example.Of these fields, two are actually indexed in
.monitoring-es-*
:job_stats.data_counts.earliest_record_timestamp
andjob_stats.data_counts.latest_record_timestamp
.As the mappings indicate, these fields are of
type: "date"
. Per elastic/elasticsearch#36691, the Elasticsearch date parser can no longer accept millis-since-epoch values in scientific notation (e.g.1.571647285959e+12
). However, during the unmarshalling of the ML API response JSON into memory, thejson.Unmarshal
function will, by default, producefloat64
values. These values, when marshalled back into JSON for indexing into Elasticsearch may be represented in scientific notation if they are sufficiently large, which will be rejected by the Elasticsearch date parser with an error like so:This PR fixes the problem by converting both these fields to
int
s fromfloat64
s before they are written to the event.