PPL `flatten` command #784
private val planTransformer = new CatalystQueryPlanVisitor()
private val pplParser = new PPLSyntaxParser()

test("test fillnull only field") {
@lukasz-soszynski-eliatra plz change the test name to match the ppl query
done
@lukasz-soszynski-eliatra plz add more tests with different queries including
- stats (aggregations)
- parse (parsing of regExp)
- eval
done
super.beforeAll()

// Create test table
createSimpleNestedJsonContentTable(tempFile, testTable)
@lukasz-soszynski-eliatra plz add additional tests for other types of tables or use cases:
- NestedTable
- structTable
- table with multiValue column (Array)
done
@lukasz-soszynski-eliatra also plz rebase ...
Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>
…hecks for logical plans. Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>
… tests Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>
#### Data
| \_time              | bridges                                     | city   | coor                   | country |
|---------------------|---------------------------------------------|--------|------------------------|---------|
| 2024-09-13T12:00:00 | [{801, Tower Bridge}, {928, London Bridge}] | London | {35, 51.5074, -0.1278} | England |
Where do these example data come from? I suggest creating example data ourselves instead of copying from the internet, unless they come from a public dataset. @YANG-DB please do not use internet data, especially documentation of other similar products, in GitHub issues either.
These data come from the issue #669 but were extended.
@@ -53,6 +53,7 @@ commands
   | renameCommand
   | fillnullCommand
   | fieldsummaryCommand
   | flattenCommand
   ;

commandName
Please add the `FLATTEN` keyword at the end of the `commandName` section.
Great job! LGTM except this^
Corrected.
> Please add the `FLATTEN` keyword at the end of the `commandName` section.
please add here
done
Using `flatten` command to flatten a field of type:
- `struct<?,?>`
- `array<struct<?,?>>`
Are these data types what we expect for this command? I thought the expected input was a JSON string. But it's fine for now; we can enhance it later.
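To make the two input shapes under discussion concrete, here is a minimal plain-Scala sketch of what flattening each of them might produce for the sample row in the Data table above. It is only a model under stated assumptions: it does not use Spark or the PR's `CatalystQueryPlanVisitor`, the sub-field names (`length`, `name`, `alt`, `lat`, `long`) are guesses that do not appear in this thread, the `Record` and `FlattenSketch` names are hypothetical, and dropping the original column after flattening is assumed rather than confirmed here.

```scala
// Plain-Scala model of the two input shapes; no Spark involved.
// All names below are hypothetical and exist only for this sketch.
final case class Bridge(length: Int, name: String)          // element of array<struct<?,?>>
final case class Coor(alt: Int, lat: Double, long: Double)  // struct<?,?>
final case class Record(
    time: String,
    bridges: Seq[Bridge],
    city: String,
    coor: Coor,
    country: String)

object FlattenSketch {
  type Row = Map[String, Any]

  // The sample row from the Data table.
  val london: Record = Record(
    "2024-09-13T12:00:00",
    Seq(Bridge(801, "Tower Bridge"), Bridge(928, "London Bridge")),
    "London",
    Coor(35, 51.5074, -0.1278),
    "England")

  // Columns that are neither `bridges` nor `coor`.
  private def base(r: Record): Row =
    Map("_time" -> r.time, "city" -> r.city, "country" -> r.country)

  // struct<?,?>: the struct column is replaced by one column per sub-field
  // (assumed behaviour).
  def flattenCoor(r: Record): Row =
    base(r) ++ Map(
      "bridges" -> r.bridges,
      "alt" -> r.coor.alt,
      "lat" -> r.coor.lat,
      "long" -> r.coor.long)

  // array<struct<?,?>>: one output row per array element, with the element's
  // sub-fields as columns. An empty array yields no rows in this model; how
  // the real command treats empty or null arrays is not covered here.
  def flattenBridges(r: Record): Seq[Row] =
    r.bridges.map { b =>
      base(r) ++ Map("coor" -> r.coor, "length" -> b.length, "name" -> b.name)
    }
}
```

For the sample row, `FlattenSketch.flattenBridges(FlattenSketch.london)` yields two rows (one per bridge), while `FlattenSketch.flattenCoor(FlattenSketch.london)` yields a single row with `alt`, `lat`, and `long` in place of `coor`.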
Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>
* update antlr grammar for (future) P1 command syntax Signed-off-by: YANGDB <yang.db.dev@gmail.com>
* add trendline command Signed-off-by: YANGDB <yang.db.dev@gmail.com>
* add expand command Signed-off-by: YANGDB <yang.db.dev@gmail.com>
* add geoip command Signed-off-by: YANGDB <yang.db.dev@gmail.com>
* PPl `flatten` command (#784)
  * The flatten command implemented Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>
  * The flatten command integration tests were extended with additional checks for logical plans. Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>
  * flatten, added more tests related to plan translation and integration tests Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>
  * Flatten command added to command names list. Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>
  ---------
  Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>
* Extract source table names from mv query (#854)
  * add sourceTables to MV index metadata properties Signed-off-by: Sean Kao <seankao@amazon.com>
  * parse source tables from mv query Signed-off-by: Sean Kao <seankao@amazon.com>
  * test cases for parse source tables from mv query Signed-off-by: Sean Kao <seankao@amazon.com>
  * use constant for metadata cache version Signed-off-by: Sean Kao <seankao@amazon.com>
  * write source tables to metadata cache Signed-off-by: Sean Kao <seankao@amazon.com>
  * address comment Signed-off-by: Sean Kao <seankao@amazon.com>
  * generate source tables for old mv without new prop Signed-off-by: Sean Kao <seankao@amazon.com>
  * syntax fix Signed-off-by: Sean Kao <seankao@amazon.com>
  ---------
  Signed-off-by: Sean Kao <seankao@amazon.com>
* Fallback to internal scheduler when index creation failed (#850)
  * Fallback to internal scheduler when index creation failed Signed-off-by: Louis Chu <clingzhi@amazon.com>
  * Fix IT Signed-off-by: Louis Chu <clingzhi@amazon.com>
  * Fix IOException Signed-off-by: Louis Chu <clingzhi@amazon.com>
  ---------
  Signed-off-by: Louis Chu <clingzhi@amazon.com>
* New trendline ppl command (SMA only) (#833)
  * WIP trendline command Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
  * wip Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
  * trendline supports sorting Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
  * run scalafmtAll Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
  * return null when there are too few data points Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
  * sbt scalafmtAll Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
  * Remove WMA references Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
  * trendline - sortByField as Optional<Field> Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
  * introduce TrendlineStrategy Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
  * keywordsCanBeId -> replace SMA with trendlineType Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
  * handle trendline alias as qualifiedName instead of fieldExpression Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
  * Add docs Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
  * Make alias optional Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
  * Adapt tests for optional alias Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
  * Adden logical plan unittests Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
  * Add missing license headers Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
  * Fix docs Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
  * numberOfDataPoints must be 1 or greater Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
  * Rename TrendlineStrategy to TrendlineCatalystUtils Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
  * Validate TrendlineType early and pass around enum type Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
  * Add trendline chaining test Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
  * Fix compile errors Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
  * Fix imports Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
  * Fix imports Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
  ---------
  Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
  Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
  Co-authored-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
* update iplocation antlr Signed-off-by: YANGDB <yang.db.dev@gmail.com>
* update scala fmt style Signed-off-by: YANGDB <yang.db.dev@gmail.com>
* `cidrmatch` ppl command add logical tests and docs (#865)
  * update logical tests and docs Signed-off-by: YANGDB <yang.db.dev@gmail.com>
  * update scala fmt style Signed-off-by: YANGDB <yang.db.dev@gmail.com>
  * fix type error Signed-off-by: YANGDB <yang.db.dev@gmail.com>
  ---------
  Signed-off-by: YANGDB <yang.db.dev@gmail.com>
* Support Lambda and add related array functions (#864)
  * json function enhancement Signed-off-by: Heng Qian <qianheng@amazon.com>
  * Add JavaToScalaTransformer Signed-off-by: Heng Qian <qianheng@amazon.com>
  * Apply scalafmtAll Signed-off-by: Heng Qian <qianheng@amazon.com>
  * Address comments Signed-off-by: Heng Qian <qianheng@amazon.com>
  * Add IT and change to use the same function name as spark Signed-off-by: Heng Qian <qianheng@amazon.com>
  * Address comments Signed-off-by: Heng Qian <qianheng@amazon.com>
  * Add document and separate lambda functions from json functions Signed-off-by: Heng Qian <qianheng@amazon.com>
  * Add lambda functions transform and reduce Signed-off-by: Heng Qian <qianheng@amazon.com>
  * polish lambda function document Signed-off-by: Heng Qian <qianheng@amazon.com>
  * polish lambda function document Signed-off-by: Heng Qian <qianheng@amazon.com>
  * Minor fix Signed-off-by: Heng Qian <qianheng@amazon.com>
  * Minor change to polish the documents Signed-off-by: Heng Qian <qianheng@amazon.com>
  ---------
  Signed-off-by: Heng Qian <qianheng@amazon.com>
---------
Signed-off-by: YANGDB <yang.db.dev@gmail.com>
Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>
Signed-off-by: Sean Kao <seankao@amazon.com>
Signed-off-by: Louis Chu <clingzhi@amazon.com>
Signed-off-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
Signed-off-by: Hendrik Saly <hendrik.saly@eliatra.com>
Signed-off-by: Heng Qian <qianheng@amazon.com>
Co-authored-by: lukasz-soszynski-eliatra <110241464+lukasz-soszynski-eliatra@users.noreply.github.com>
Co-authored-by: Sean Kao <seankao@amazon.com>
Co-authored-by: Louis Chu <clingzhi@amazon.com>
Co-authored-by: Hendrik Saly <hendrik.saly@eliatra.com>
Co-authored-by: Kacper Trochimiak <kacper.trochimiak@eliatra.com>
Co-authored-by: qianheng <qianheng@amazon.com>
* The flatten command implemented Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>
* The flatten command integration tests were extended with additional checks for logical plans. Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>
* flatten, added more tests related to plan translation and integration tests Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>
* Flatten command added to command names list. Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>
---------
Signed-off-by: Lukasz Soszynski <lukasz.soszynski@eliatra.com>
Description
The `flatten` command is introduced.
Issues Resolved
#669
By submitting this pull request, I confirm that my contribution is made under the terms of the Apache 2.0 license.
For more information on following Developer Certificate of Origin and signing off your commits, please check here.