Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix(unpivot): _field should not be added as a group key by default #5272

Merged
merged 2 commits into from
Oct 13, 2022

Conversation

skartikey
Copy link
Contributor

if _field is in otherColumns parameter, _field should not be made group key by default
The issue was discovered during the iox group push down work

Checklist

Dear Author 👋, the following checks should be completed (or explicitly dismissed) before merging.

  • ✏️ Write a PR description, regardless of triviality, to include the value of this PR
  • 🔗 Reference related issues
  • 🏃 Test cases are included to exercise the new code
  • 🧪 If new packages are being introduced to stdlib, link to Working Group discussion notes and ensure it lands under experimental/
  • 📖 If language features are changing, ensure docs/Spec.md has been updated

Dear Reviewer(s) 👋, you are responsible (among others) for ensuring the completeness and quality of the above before approval.

@skartikey skartikey requested a review from a team as a code owner October 11, 2022 14:05
@skartikey skartikey requested review from wolffcm and removed request for a team October 11, 2022 14:05
Copy link

@wolffcm wolffcm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this is pretty close, but the logic is not quite right.

f0: 20.1,
f1: 20.2,
_time: 2018-12-01T00:00:10Z,
_field: "load1",
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

For this to be a representative test case, there shouldn't be a _field column in the input to unpivot, since pivoted data does generally not have a _field column.

Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thinking it through, if there is a _field column in the input an error seems to be appropriate? Since there is no way to produce output that does not overwrite the data that already exists in _field. Thoughts?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it makes sense to show an error if the input to unpivot contains _field in it. wondering if we have to do the same for _value column?

The reason I have to put up this PR is that the pushdown group test was failing because it contains _field in the input.

The fix would be to throw a user error when there is _field or _value column in the input to unpivot.

At the moment, if a column is present in the otherColumns and missing in the input column, the code throws an error.

import "array"
import "experimental"
import "testing"

 array.from(
    rows: [
        {
            _measurement: "m",
            tag: "t1",
            f0: 10.1,
            f1: 10.2,
            _time: 2018-12-01T00:00:00Z,
        },
        {
            _measurement: "m",
            tag: "t1",
            f0: 20.1,
            f1: 20.2,
            _time: 2018-12-01T00:00:10Z,
        },
    ],
)
    |> group(columns: ["_measurement"])
    |> experimental.unpivot(otherColumns: ["_time", "tag", "_field"])
Result: _result
Error: runtime error @24:8-24:70: unpivot: unpivot could not find column named "_field"

columns = append(columns, flux.ColMeta{Label: "_field", Type: flux.TString})
if defaultFieldColLen != 0 {
columns = append(columns, flux.ColMeta{Label: influxdb.DefaultFieldColLabel, Type: flux.TString})
}
Copy link

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This logic doesn't seem quite right to me. If there is no _field column in the input, then the _field column will never get created. We want to create it regardless, we just want to omit the column from groupCols and groupValues below if _field is in the otherCols parameter.

Copy link
Contributor Author

@skartikey skartikey Oct 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

No, you’ve got it wrong. _field will only 'not' get created when it is present in the otherColumns, rest of the cases it will always get created.

import "array"
import "experimental"
import "testing"

 array.from(
    rows: [
        {
            _measurement: "m",
            tag: "t1",
            f0: 10.1,
            f1: 10.2,
            _time: 2018-12-01T00:00:00Z,
        },
        {
            _measurement: "m",
            tag: "t1",
            f0: 20.1,
            f1: 20.2,
            _time: 2018-12-01T00:00:10Z,
        },
    ],
)
    |> group(columns: ["_measurement"])
    |> experimental.unpivot(otherColumns: ["_time", "tag"])
Result: _result
Table: keys: [_measurement, _field]
   _measurement:string           _field:string                      _time:time              tag:string                  _value:float
----------------------  ----------------------  ------------------------------  ----------------------  ----------------------------
                     m                      f0  2018-12-01T00:00:00.000000000Z                      t1                          10.1
                     m                      f0  2018-12-01T00:00:10.000000000Z                      t1                          20.1
Table: keys: [_measurement, _field]
   _measurement:string           _field:string                      _time:time              tag:string                  _value:float
----------------------  ----------------------  ------------------------------  ----------------------  ----------------------------
                     m                      f1  2018-12-01T00:00:00.000000000Z                      t1                          10.2
                     m                      f1  2018-12-01T00:00:10.000000000Z                      t1                          20.2

@skartikey skartikey force-pushed the skartikey-flux-unpivot branch from ce7d364 to 29c4d23 Compare October 13, 2022 13:04
@skartikey skartikey requested a review from wolffcm October 13, 2022 13:06
Copy link

@wolffcm wolffcm left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks good, thanks.

@skartikey skartikey merged commit 29766af into master Oct 13, 2022
@skartikey skartikey deleted the skartikey-flux-unpivot branch October 13, 2022 19:09
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants