[META] PPL new trendline command #3011
Comments
Hi @YANG-DB , could you provide any background about distinguishing between |
Hi |
My concern is there could be more window function related requests in PPL. I prefer to add a new command to support all of them instead of introducing specific
or named |
Hello, I've started the trendline PPL implementation. Implementation-wise I think this should be very similar to how average is implemented, except the AggregationState should factor in the window specification. Does this seem like the right way to go? One difference between this and average is that this is a root level command rather than a sub-command of stats |
Also, noticed that the syntax suggested above differs from the SPL definition of trendline: Notably, in SPL, the period argument is sort of embedded into moving average type instead of being within the parentheses of the moving average type |
@jduo thanks |
@YANG-DB |
@jduo thanks for the update |
Should the alias part of the trendline command be mandatory? As a user I would expect it to be optional, with the original field name used if it is omitted. In SPL it is optional. |
Is your feature request related to a problem?
Add a new PPL trendline command to support computing moving averages of fields. We would like to support two flavours of moving average:
SMA : Simple moving average
SMA(t) = (1/n) * Σ(f[i]), where i = t-n+1 to t
WMA : Weighted moving average
WMA(t) = Σ(w[i] * f[i]) / Σ(w[i]), where i = t-n+1 to t
Where w[i] is the weight for the i-th data-point.
In a typical WMA, the weights are linearly decreasing from the most recent to the oldest data-point:
w[i] = n - (t - i), where i = t-n+1 to t
The complete formulation would be:
WMA(t) = Σ((n - (t - i)) * f[i]) / Σ(n - (t - i)), where i = t-n+1 to t
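The two formulas above can be sketched in Python as follows. This is only an illustration of the arithmetic, not the proposed engine implementation: the function names `sma` and `wma` are hypothetical, and returning `None` for positions that do not yet have a full window of n points is an assumption that the issue does not specify.

```python
from typing import List, Optional, Sequence


def sma(values: Sequence[float], n: int) -> List[Optional[float]]:
    """Simple moving average over a trailing window of n points."""
    if n <= 0:
        raise ValueError("n must be positive")
    out: List[Optional[float]] = []
    for t in range(len(values)):
        if t < n - 1:
            # Assumption: an incomplete window yields None.
            out.append(None)
            continue
        window = values[t - n + 1 : t + 1]  # f[t-n+1] .. f[t]
        out.append(sum(window) / n)
    return out


def wma(values: Sequence[float], n: int) -> List[Optional[float]]:
    """Weighted moving average with linear weights w[i] = n - (t - i)."""
    if n <= 0:
        raise ValueError("n must be positive")
    weight_sum = n * (n + 1) / 2  # 1 + 2 + ... + n
    out: List[Optional[float]] = []
    for t in range(len(values)):
        if t < n - 1:
            # Assumption: an incomplete window yields None.
            out.append(None)
            continue
        window = values[t - n + 1 : t + 1]
        # The oldest point in the window gets weight 1, the newest gets weight n.
        weighted = sum(w * f for w, f in enumerate(window, start=1))
        out.append(weighted / weight_sum)
    return out


if __name__ == "__main__":
    data = [1.0, 2.0, 3.0, 4.0, 5.0]
    print(sma(data, 3))
    print(wma(data, 3))
```

With linear weights the denominator Σ(n - (t - i)) is always n(n+1)/2, so it can be computed once rather than per window.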
Example
The next command shows a trendline of events by month over a 5-month period
The next command would compute a 5-point simple moving average of the 'cpu_usage' field and store it in a new field called 'smooth_cpu'.
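As a rough illustration of the intended result (written in Python, not PPL syntax), the sketch below adds a 'smooth_cpu' field to each row from a 5-point simple moving average of 'cpu_usage'. The helper `add_sma_field` is hypothetical, the rows are made-up sample data, and the choice to emit `None` until five points are available is an assumption.

```python
from collections import deque
from typing import Any, Dict, List


def add_sma_field(
    rows: List[Dict[str, Any]], source: str, alias: str, n: int
) -> List[Dict[str, Any]]:
    """Return copies of rows with an n-point trailing SMA of `source` stored in `alias`."""
    window: deque = deque(maxlen=n)  # keeps only the latest n values
    result = []
    for row in rows:
        window.append(row[source])
        # Assumption: rows before the window fills get None.
        value = sum(window) / n if len(window) == n else None
        result.append({**row, alias: value})
    return result


if __name__ == "__main__":
    # Made-up sample data, assumed to be already sorted in the desired order.
    rows = [{"cpu_usage": v} for v in [10, 20, 30, 40, 50, 60]]
    for row in add_sma_field(rows, "cpu_usage", "smooth_cpu", 5):
        print(row)
```

The input rows are left unmodified; each output row is a shallow copy with the extra field.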
Multiple trendlines could be calculated in a single command, such as
Support for PPL trendline functionality is required for both:
- OpenSearch based PPL engine
- Spark based PPL engine
Do you have any additional context?