Allow users to configure fitting/filling function for missing/null buckets #17717

Closed
timroes opened this issue Apr 16, 2018 · 17 comments
Labels
Feature:Visualizations Generic visualization features (in case no more specific feature label is available) release_note:enhancement Team:Visualizations Visualization editors, elastic-charts and infrastructure

Comments

@timroes
Contributor

timroes commented Apr 16, 2018

Kibana currently handles null values differently depending on the chart type.

Since the right way to handle a null value depends on the semantics of the data, we cannot make a meaningful decision for the user. Instead, we should let the user choose a "fitting function" that calculates which values are "fitted" in for null values. Possible values could be (the examples show what each fitting function would turn the input series [2, null, null, 8] into):

| Name | Description | Example |
| --- | --- | --- |
| None | Don't draw that value on the graph | [2, null, null, 8] |
| Carry | Use the last non-null value before it | [2, 2, 2, 8] |
| Nearest | Use the closest non-null value (either before or after) | [2, 2, 8, 8] |
| Lookahead | Use the next non-null value after it (opposite of Carry) | [2, 8, 8, 8] |
| Average | Use the average of the last and next non-null values | [2, 5, 5, 8] |
| Linear | Linearly interpolate between the closest values | [2, 4, 6, 8] |
| Zero | Replace values with 0 | [2, 0, 0, 8] |
| Explicit | Specify an explicit value (x) that should be used instead | [2, x, x, 8] |

These fitting settings would behave similarly to the Timelion .fit function, but should work for all charts.
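To make the table concrete, here is a minimal TypeScript sketch of the eight proposed fitting functions. It is not Kibana or elastic-charts code: `fitSeries`, `FitType` and `explicitValue` are hypothetical names, and two behaviours are assumptions the proposal does not specify (Nearest prefers the earlier value on a tie, and a null with no neighbour on the required side stays null).

```typescript
type Series = Array<number | null>;
type FitType =
  | 'none' | 'carry' | 'nearest' | 'lookahead'
  | 'average' | 'linear' | 'zero' | 'explicit';

// Hypothetical helper: returns a new series with null entries fitted.
function fitSeries(series: Series, fit: FitType, explicitValue: number = 0): Series {
  return series.map((value, i) => {
    if (value !== null) return value;
    if (fit === 'none') return null;
    if (fit === 'zero') return 0;
    if (fit === 'explicit') return explicitValue;

    // Find the closest non-null neighbour on each side of index i.
    let prev = i - 1;
    while (prev >= 0 && series[prev] === null) prev--;
    let next = i + 1;
    while (next < series.length && series[next] === null) next++;

    const before = prev >= 0 ? series[prev] : null;
    const after = next < series.length ? series[next] : null;

    switch (fit) {
      case 'carry':
        return before;
      case 'lookahead':
        return after;
      case 'nearest':
        if (before === null || after === null) return before ?? after;
        // Assumption: on a tie, prefer the earlier value.
        return i - prev <= next - i ? before : after;
      case 'average':
        return before === null || after === null ? null : (before + after) / 2;
      case 'linear':
        if (before === null || after === null) return null;
        return before + ((after - before) * (i - prev)) / (next - prev);
      default:
        return null;
    }
  });
}

console.log(fitSeries([2, null, null, 8], 'linear')); // → [2, 4, 6, 8]
```

Looking up both neighbours once keeps each fit type a one-line decision, at the cost of rescanning long runs of nulls; a production version would likely precompute the neighbours in a single pass.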

@timroes timroes added release_note:enhancement Feature:Visualizations Generic visualization features (in case no more specific feature label is available) labels Apr 16, 2018
@emilmirzayev

With those options, would it also make sense to allow a 4th option, Custom value, which replaces the null value with a given value? I am not sure how complex that is, though.

@timroes
Contributor Author

timroes commented Apr 16, 2018

Hey, sorry I actually hit Send before I finished typing :D So I was still editing that issue, and yeah that's on the list too.

@emilmirzayev

I suggest replacing Scale with LinScale, since Scale may be confusing given that the two most used scale types are linear and log.

@timroes
Contributor Author

timroes commented Apr 17, 2018

Thanks for that feedback. I changed the name in the above description.

@timroes timroes added the Feature:ElasticCharts Issues related to the elastic-charts library label May 18, 2018
@timroes timroes added the Team:Visualizations Visualization editors, elastic-charts and infrastructure label Sep 16, 2018
@markov00
Member

Hey @timroes, can you better describe the difference between the carry and the nearest functions?
Maybe it's a typo in the nearest description:

Use the closest value (either before or behind) that was non null

Maybe what you want to say is: use the closest value (either before or after) that was non null?

@timroes
Contributor Author

timroes commented Feb 13, 2019

Yeap that's what I want to say :)

So carry would turn [3, null, null, 5] into [3, 3, 3, 5], while nearest would result in [3, 3, 5, 5].

Update: extended the table above to hold an example for each fitting function.

@cjcenizal
Contributor

Can we also add an option so that the user can choose to interpret 0 values as null? For very good reasons, ES returns a null bucket value when there are no documents in the bucket for average/max/etc aggregations but a 0 bucket value for a sum aggregation.

As an example use case we may have the chart below, where the drop to 0 communicates unintended information, when we really want to treat those as null and just show no information at all.

image

@markov00
Member

I will leave a comment here regarding this last comment: I think we should be careful with that and always check the bucket's doc count before treating a 0 sum as null. We can have buckets with 2 values and a sum of 0 that should be treated as a valid 0.
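The doc-count check could look roughly like the sketch below. The `SumBucket` shape and the `sumOrNull` helper are hypothetical names used for illustration, not the actual Elasticsearch response format or Kibana code.

```typescript
// Hypothetical, simplified bucket shape for a sum aggregation.
interface SumBucket {
  docCount: number;
  sum: number;
}

// Treat a sum as "missing" only when the bucket contains no documents.
function sumOrNull(bucket: SumBucket): number | null {
  return bucket.docCount === 0 ? null : bucket.sum;
}

const buckets: SumBucket[] = [
  { docCount: 3, sum: 12 },
  { docCount: 0, sum: 0 }, // empty bucket: no information, so null
  { docCount: 2, sum: 0 }, // e.g. values 5 and -5: a valid 0
];

const values = buckets.map(sumOrNull); // → [12, null, 0]
console.log(values);
```

Keying the decision on the document count rather than on the value itself keeps a genuine zero sum distinguishable from an empty bucket.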

@nickofthyme
Contributor

nickofthyme commented Oct 1, 2019

How would end values be handled in this scenario? Specifically referring to Carry, Lookahead, Nearest, Average and Linear.

Would these values just be dropped from the dataset? Or assume 0 value for the missing previous/next value?

Example:

const nullStart = [null, 2, 5, 8];
const nullEnd = [2, 5, 8, null];
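One possible answer, sketched below in TypeScript, is to never extrapolate: a null stays null whenever there is no value on the side the fit needs. This is only an assumption for illustration, not a decision made in this thread, and `carry` and `lookahead` are hypothetical helper names.

```typescript
type Series = Array<number | null>;

// Carry: repeat the last non-null value; leading nulls have nothing to carry.
function carry(series: Series): Series {
  let last: number | null = null;
  return series.map((v) => {
    if (v !== null) last = v;
    return last;
  });
}

// Lookahead is Carry applied to the reversed series.
function lookahead(series: Series): Series {
  return carry([...series].reverse()).reverse();
}

const nullStart: Series = [null, 2, 5, 8];
const nullEnd: Series = [2, 5, 8, null];

console.log(carry(nullStart));     // → [null, 2, 5, 8]
console.log(carry(nullEnd));       // → [2, 5, 8, 8]
console.log(lookahead(nullStart)); // → [2, 2, 5, 8]
console.log(lookahead(nullEnd));   // → [2, 5, 8, null]
```

Under the same assumption, Average and Linear would leave both edge cases null, because they need a value on each side.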

@nickofthyme
Contributor

@wylieconlon any thoughts on this ☝️

@wylieconlon
Contributor

@nickofthyme I think you're assuming that all of these fitting functions will return continuous data, as in we can draw a line between any values. Are there two types of fitting functions, one that is continuous and one that is discrete?

In the examples you mentioned, I would expect that a line chart based on the Carry function would draw a line that doesn't reach to the last bucket (unlike the chart above, which draws a path down to 0)

@woodchalk

Can another option be added for a Zero function? While Custom can certainly facilitate setting the null value to 0, I’d wager that users are looking for a 0 value more than anything else. This is especially true when performing count metrics over a date histogram on a line chart. [2, null, null, 8] would be rendered as [2, 0, 0, 8].

GH-6245 references this issue in more detail, but I’m not sure anything was ever concluded with it. Newer users shouldn’t be expected to input { "min_doc_count": 0 } to get the proper buckets displayed - or understand why that’s needed.
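For context, the workaround mentioned above corresponds to a date histogram request along the lines of the sketch below. The aggregation name `per_day`, the field `@timestamp`, the interval and the bounds are placeholder assumptions; `min_doc_count` and `extended_bounds` are the Elasticsearch date histogram options that make empty buckets come back with a count of 0.

```typescript
// Sketch of a search request body; names and values are illustrative only.
const body = {
  size: 0,
  aggs: {
    per_day: {
      date_histogram: {
        field: '@timestamp',      // assumed timestamp field
        calendar_interval: '1d',  // assumed interval
        min_doc_count: 0,         // also return buckets without documents
        // Without bounds, empty buckets are only created between the first
        // and last bucket that contain documents.
        extended_bounds: { min: 'now-7d/d', max: 'now/d' },
      },
    },
  },
};

console.log(JSON.stringify(body, null, 2));
```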

@nickofthyme
Contributor

@sec-init so rather than using the Explicit value option with 0 just using a Zero fit type?

@woodchalk

I would say in addition to Explicit value there should be a Zero value option, or something that satisfies the intent. The most commonly used explicit value will likely be 0, so making an option for it makes sense. I suspect that without an option in place for users to trial-and-error their way through (until successful), they’ll just assume Kibana/Visualizations aren’t capable of doing it correctly, regardless of Elasticsearch returning the appropriate response.

@timroes timroes changed the title Allow users to configure fitting function for missing/null buckets Allow users to configure fitting/filling function for missing/null buckets Oct 18, 2019
@piellick

piellick commented Nov 12, 2019

Hi everyone,
Any news? @timroes's proposal looks great. It could be useful to apply this in Visual Builder's time series module, no?

@flash1293
Contributor

Removing Lens and elastic-charts labels as fitting functions are supported there by now

@timroes
Contributor Author

timroes commented Nov 26, 2020

This feature has been available in Lens since 7.9.0 (implemented via #69820).

9 participants