Add standard deviation / variance sampling to extended stats aggregation #49554
Labels
:Analytics/Aggregations
Aggregations
>enhancement
good first issue
low hanging fruit
Team:Analytics
Meta label for analytical engine team (ESQL/Aggs/Geo)
Comments
costin added the >enhancement, good first issue, low hanging fruit, and :Analytics/Aggregations labels on Nov 25, 2019
Pinging @elastic/es-analytics-geo (:Analytics/Aggregations)
I am a beginner. Can I start working on this?
Hi @hoonti06, sure! Let me know if you have questions or need some guidance :)
Is anyone working on this currently?
Hi! I added a pull request for this issue.
rjernst added the Team:Analytics label on May 4, 2020
imotov added a commit that referenced this issue on Jun 10, 2020
Per 49554 I added standard deviation sampling and variance sampling to the extended stats interface. Closes #49554 Co-authored-by: Igor Motov <igor@motovs.org>
imotov added a commit to imotov/elasticsearch that referenced this issue on Jun 10, 2020
…ic#49782) Per 49554 I added standard deviation sampling and variance sampling to the extended stats interface. Closes elastic#49554 Co-authored-by: Igor Motov <igor@motovs.org>
Currently Elasticsearch offers standard deviation (STDDEV) and variance (VAR), both in population form; however, there's also the sampling form which, depending on the data size, can yield significantly different results. As it's just a matter of a (somewhat) different formula, it should be straightforward to expand the current Extended Stats implementation to support this variant as well.
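To illustrate how small the difference in formula is, here is a minimal sketch in plain Python (not the Elasticsearch implementation; the function names `population_variance` and `sample_variance` are made up for this example). The population form divides the sum of squared deviations by `n`, while the sampling form divides by `n - 1`, so the gap is largest for small data sets and shrinks as `n` grows.

```python
import math


def population_variance(values):
    """Variance in population form: squared deviations divided by n."""
    n = len(values)
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / n


def sample_variance(values):
    """Variance in sampling form: squared deviations divided by n - 1.

    Returning NaN for fewer than two values is an assumption of this
    sketch, not a statement about what Elasticsearch returns.
    """
    n = len(values)
    if n < 2:
        return float("nan")
    mean = sum(values) / n
    return sum((v - mean) ** 2 for v in values) / (n - 1)


values = [2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0]

# The standard deviation is the square root of the variance in either form.
std_population = math.sqrt(population_variance(values))
std_sampling = math.sqrt(sample_variance(values))

print("population:", population_variance(values), std_population)
print("sampling:  ", sample_variance(values), std_sampling)
```

For these eight values the sum of squared deviations is 32, so the population variance is 32 / 8 and the sampling variance is 32 / 7; with only a handful of documents the two forms diverge noticeably.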
Potentially, to avoid any ambiguities going forward, the current `std_deviation` could be aliased to `std_deviation_population` (same for `variance`) so one could easily pick the desired type while also being clear about what type the default fields are. The improved response can look something like this: