Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

New trendline ppl command (SMA only) #833

Merged
merged 24 commits into from
Nov 1, 2024

Conversation

salyh
Copy link
Contributor

@salyh salyh commented Oct 29, 2024

Description

New trendline ppl command (SMA only). For WMA see #831

Related Issues

#655

Check List

  • Updated documentation (docs/ppl-lang/README.md)
  • Implemented unit tests
  • Implemented tests for combination with other commands
  • New added source code should include a copyright header
  • Commits are signed per the DCO using --signoff

By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.

Copy link
Member

@YANG-DB YANG-DB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Please also add command to the README, a dedicated page for this command with a link to the dedicated doc and examples in the examples page

@salyh salyh force-pushed the trendline-command-sma branch from 2c04f75 to 4a5d3c1 Compare October 30, 2024 16:59
@YANG-DB YANG-DB self-requested a review October 30, 2024 17:40
Copy link
Member

@YANG-DB YANG-DB left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@salyh plz add logical plan test in the relevant folder
And also plz add a test with no alias example (for logical and actual queries)

@salyh
Copy link
Contributor Author

salyh commented Oct 30, 2024

@salyh plz add logical plan test in the relevant folder And also plz add a test with no alias example (for logical and actual queries)

done

@YANG-DB
Copy link
Member

YANG-DB commented Oct 30, 2024

@jduo thanks for u'r feedback - really appreciated !

Copy link
Member

@LantaoJin LantaoJin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

https://github.com/opensearch-project/opensearch-spark/pull/833/files#r1823633566

To save time, you can ignore the plan assertion and just check the return data matching the expectation. Many thanks! Here is an example for chaining trendline:
| source = $testTable | eval age_1 = age, age_2 = age | trendline sort - age_1 sma(3, age_1) | trendline sort + age_2 sma(3, age_2)

@LantaoJin
Copy link
Member

Another question is whether the sort field can be different from the calculated field? If the answer is no, please throw out an exception for | source = $testTable | trendline sort + name sma(3, age).
If the answer is yes, what is the special semantic for above query?

But this sort keyword in trendline still seems an one way door design, deep diving we could get another case:
| source = $testTable | eval age_alias = age | trendline sort + age_alias sma(3, age).
Will this query fail either? There is no semantic problem IMO. That's why I comment and suggest here.

However, I can give an approval to unblock releasing this as is for now as long as above chaining case works.

@salyh
Copy link
Contributor Author

salyh commented Oct 31, 2024

https://github.com/opensearch-project/opensearch-spark/pull/833/files#r1823633566

To save time, you can ignore the plan assertion and just check the return data matching the expectation. Many thanks! Here is an example for chaining trendline: | source = $testTable | eval age_1 = age, age_2 = age | trendline sort - age_1 sma(3, age_1) | trendline sort + age_2 sma(3, age_2)

@LantaoJin @YANG-DB IT chaining test added

kt-eliatra and others added 20 commits October 31, 2024 22:33
Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
@salyh salyh force-pushed the trendline-command-sma branch from 5a51ceb to ffcc918 Compare October 31, 2024 21:38
@YANG-DB YANG-DB requested a review from LantaoJin November 1, 2024 19:18
@YANG-DB
Copy link
Member

YANG-DB commented Nov 1, 2024

Another question is whether the sort field can be different from the calculated field? If the answer is no, please throw out an exception for | source = $testTable | trendline sort + name sma(3, age). If the answer is yes, what is the special semantic for above query?

But this sort keyword in trendline still seems an one way door design, deep diving we could get another case: | source = $testTable | eval age_alias = age | trendline sort + age_alias sma(3, age). Will this query fail either? There is no semantic problem IMO. That's why I comment and suggest here.

However, I can give an approval to unblock releasing this as is for now as long as above chaining case works.

@LantaoJin I'm approving the PR for merge as this issue was addresses here

@YANG-DB YANG-DB merged commit bdb4848 into opensearch-project:main Nov 1, 2024
4 checks passed
YANG-DB added a commit that referenced this pull request Nov 2, 2024
* update antlr grammar for (future) P1 command syntax

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* add trendline command

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* add expand command

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* add geoip command

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* PPl `flatten` command (#784)

* The flatten command implemented

Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>

* The flatten command integration tests were extended with additional checks for logical plans.

Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>

* flatten, added more tests related to plan translation and integration tests

Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>

* Flatten command added to command names list.

Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>

---------

Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>

* Extract source table names from mv query (#854)

* add sourceTables to MV index metadata properties

Signed-off-by: Sean Kao <seankao@amazon.com>

* parse source tables from mv query

Signed-off-by: Sean Kao <seankao@amazon.com>

* test cases for parse source tables from mv query

Signed-off-by: Sean Kao <seankao@amazon.com>

* use constant for metadata cache version

Signed-off-by: Sean Kao <seankao@amazon.com>

* write source tables to metadata cache

Signed-off-by: Sean Kao <seankao@amazon.com>

* address comment

Signed-off-by: Sean Kao <seankao@amazon.com>

* generate source tables for old mv without new prop

Signed-off-by: Sean Kao <seankao@amazon.com>

* syntax fix

Signed-off-by: Sean Kao <seankao@amazon.com>

---------

Signed-off-by: Sean Kao <seankao@amazon.com>

* Fallback to internal scheduler when index creation failed (#850)

* Fallback to internal scheduler when index creation failed

Signed-off-by: Louis Chu <clingzhi@amazon.com>

* Fix IT

Signed-off-by: Louis Chu <clingzhi@amazon.com>

* Fix IOException

Signed-off-by: Louis Chu <clingzhi@amazon.com>

---------

Signed-off-by: Louis Chu <clingzhi@amazon.com>

* New trendline ppl command (SMA only) (#833)

* WIP trendline command

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* wip

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* trendline supports sorting

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* run scalafmtAll

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* return null when there are too few data points

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* sbt scalafmtAll

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* Remove WMA references

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* trendline - sortByField as Optional<Field>

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* introduce TrendlineStrategy

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* keywordsCanBeId -> replace SMA with trendlineType

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* handle trendline alias as qualifiedName instead of fieldExpression

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* Add docs

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Make alias optional

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Adapt tests for optional alias

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Adden logical plan unittests

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Add missing license headers

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Fix docs

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* numberOfDataPoints must be 1 or greater

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Rename TrendlineStrategy to  TrendlineCatalystUtils

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Validate TrendlineType early and pass around enum type

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Add trendline chaining test

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Fix compile errors

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Fix imports

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Fix imports

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

---------

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Co-authored-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* update iplocation antlr

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

---------

Signed-off-by: YANGDB <yang.db.dev@gmail.com>
Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>
Signed-off-by: Sean Kao <seankao@amazon.com>
Signed-off-by: Louis Chu <clingzhi@amazon.com>
Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Co-authored-by: lukasz-soszynski-eliatra <110241464+lukasz-soszynski-eliatra@users.noreply.github.com>
Co-authored-by: Sean Kao <seankao@amazon.com>
Co-authored-by: Louis Chu <clingzhi@amazon.com>
Co-authored-by: Hendrik Saly <hendrik.saly@eliatra.com>
Co-authored-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
YANG-DB added a commit that referenced this pull request Nov 5, 2024
* update antlr grammar for (future) P1 command syntax

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* add trendline command

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* add expand command

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* add geoip command

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* PPl `flatten` command (#784)

* The flatten command implemented

Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>

* The flatten command integration tests were extended with additional checks for logical plans.

Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>

* flatten, added more tests related to plan translation and integration tests

Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>

* Flatten command added to command names list.

Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>

---------

Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>

* Extract source table names from mv query (#854)

* add sourceTables to MV index metadata properties

Signed-off-by: Sean Kao <seankao@amazon.com>

* parse source tables from mv query

Signed-off-by: Sean Kao <seankao@amazon.com>

* test cases for parse source tables from mv query

Signed-off-by: Sean Kao <seankao@amazon.com>

* use constant for metadata cache version

Signed-off-by: Sean Kao <seankao@amazon.com>

* write source tables to metadata cache

Signed-off-by: Sean Kao <seankao@amazon.com>

* address comment

Signed-off-by: Sean Kao <seankao@amazon.com>

* generate source tables for old mv without new prop

Signed-off-by: Sean Kao <seankao@amazon.com>

* syntax fix

Signed-off-by: Sean Kao <seankao@amazon.com>

---------

Signed-off-by: Sean Kao <seankao@amazon.com>

* Fallback to internal scheduler when index creation failed (#850)

* Fallback to internal scheduler when index creation failed

Signed-off-by: Louis Chu <clingzhi@amazon.com>

* Fix IT

Signed-off-by: Louis Chu <clingzhi@amazon.com>

* Fix IOException

Signed-off-by: Louis Chu <clingzhi@amazon.com>

---------

Signed-off-by: Louis Chu <clingzhi@amazon.com>

* New trendline ppl command (SMA only) (#833)

* WIP trendline command

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* wip

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* trendline supports sorting

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* run scalafmtAll

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* return null when there are too few data points

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* sbt scalafmtAll

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* Remove WMA references

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* trendline - sortByField as Optional<Field>

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* introduce TrendlineStrategy

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* keywordsCanBeId -> replace SMA with trendlineType

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* handle trendline alias as qualifiedName instead of fieldExpression

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* Add docs

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Make alias optional

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Adapt tests for optional alias

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Adden logical plan unittests

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Add missing license headers

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Fix docs

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* numberOfDataPoints must be 1 or greater

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Rename TrendlineStrategy to  TrendlineCatalystUtils

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Validate TrendlineType early and pass around enum type

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Add trendline chaining test

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Fix compile errors

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Fix imports

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Fix imports

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

---------

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Co-authored-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* update iplocation antlr

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* update scala fmt style

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* `cidrmatch` ppl command add logical tests and docs (#865)

* update logical tests and docs

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* update scala fmt style

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* fix type error

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

---------

Signed-off-by: YANGDB <yang.db.dev@gmail.com>

* Support Lambda and add related array functions (#864)

* json function enhancement

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Add JavaToScalaTransformer

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Apply scalafmtAll

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Address comments

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Add IT and change to use the same function name as spark

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Address comments

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Add document and separate lambda functions from json functions

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Add lambda functions transform and reduce

Signed-off-by: Heng Qian <qianheng@amazon.com>

* polish lambda function document

Signed-off-by: Heng Qian <qianheng@amazon.com>

* polish lambda function document

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Minor fix

Signed-off-by: Heng Qian <qianheng@amazon.com>

* Minor change to polish the documents

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: Heng Qian <qianheng@amazon.com>

---------

Signed-off-by: YANGDB <yang.db.dev@gmail.com>
Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>
Signed-off-by: Sean Kao <seankao@amazon.com>
Signed-off-by: Louis Chu <clingzhi@amazon.com>
Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Heng Qian <qianheng@amazon.com>
Co-authored-by: lukasz-soszynski-eliatra <110241464+lukasz-soszynski-eliatra@users.noreply.github.com>
Co-authored-by: Sean Kao <seankao@amazon.com>
Co-authored-by: Louis Chu <clingzhi@amazon.com>
Co-authored-by: Hendrik Saly <hendrik.saly@eliatra.com>
Co-authored-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
Co-authored-by: qianheng <qianheng@amazon.com>
kenrickyap pushed a commit to Bit-Quill/opensearch-spark that referenced this pull request Dec 11, 2024
* WIP trendline command

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* wip

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* trendline supports sorting

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* run scalafmtAll

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* return null when there are too few data points

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* sbt scalafmtAll

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* Remove WMA references

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* trendline - sortByField as Optional<Field>

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* introduce TrendlineStrategy

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* keywordsCanBeId -> replace SMA with trendlineType

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* handle trendline alias as qualifiedName instead of fieldExpression

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>

* Add docs

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Make alias optional

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Adapt tests for optional alias

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Adden logical plan unittests

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Add missing license headers

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Fix docs

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* numberOfDataPoints must be 1 or greater

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Rename TrendlineStrategy to  TrendlineCatalystUtils

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Validate TrendlineType early and pass around enum type

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Add trendline chaining test

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Fix compile errors

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Fix imports

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

* Fix imports

Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>

---------

Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Co-authored-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
0.6 Lang:PPL Pipe Processing Language support
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants