[SPARK-49451][FOLLOW-UP] Add support for duplicate keys in from_json(_, 'variant') #48177

Closed
wants to merge 3 commits into from

harshmotw-db
Contributor

What changes were proposed in this pull request?

This PR adds duplicate key support to the from_json(_, 'variant') query pattern. Duplicate key support has already been introduced in parse_json, JSON scans, and from_json expressions with nested schemas, but this code path was not updated.

Why are the changes needed?

This change makes the behavior of from_json(_, 'variant') consistent with every other variant construction expression.

Does this PR introduce any user-facing change?

Yes, potentially. Depending on a config, users can now apply the from_json(<input>, 'variant') expression to JSON inputs with duplicate keys.
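
For readers unfamiliar with config-gated duplicate key handling, here is a minimal sketch of the general idea in plain Python. It is not Spark's implementation. The names `parse_json_with_policy`, `allow_duplicate_keys`, and `DuplicateKeyError` are hypothetical, and the "last value wins" rule when duplicates are allowed is an assumption made for illustration, not something this PR description states.

```python
import json


class DuplicateKeyError(ValueError):
    """Hypothetical error raised when a JSON object repeats a key."""


def _build_object(pairs, allow_duplicate_keys):
    # json.loads hands every JSON object over as a list of (key, value)
    # pairs in document order, so repeated keys are still visible here.
    result = {}
    for key, value in pairs:
        if key in result and not allow_duplicate_keys:
            raise DuplicateKeyError(f"duplicate key: {key!r}")
        # Assumption for this sketch: a later occurrence overwrites an earlier one.
        result[key] = value
    return result


def parse_json_with_policy(text, allow_duplicate_keys=False):
    """Parse JSON text, rejecting or tolerating duplicate keys based on a flag."""
    return json.loads(
        text,
        object_pairs_hook=lambda pairs: _build_object(pairs, allow_duplicate_keys),
    )
```

With the flag set, `parse_json_with_policy('{"a": 1, "a": 2}', allow_duplicate_keys=True)` keeps a single entry for `a`. With the flag unset, the same input raises `DuplicateKeyError`. The hook runs for nested objects too, so the policy applies at every level of the input.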

How was this patch tested?

Unit tests.

Was this patch authored or co-authored using generative AI tooling?

No.

@github-actions github-actions bot added the SQL label Sep 20, 2024
@harshmotw-db
Contributor Author

@chenhao-db @cloud-fan This PR follows up on the previous PR that added support for duplicate keys, which left out one corner case. Could you review it? Thanks!

Member

@MaxGekk MaxGekk left a comment

Waiting for CI.

@HyukjinKwon
Member

Merged to master.

attilapiros pushed a commit to attilapiros/spark that referenced this pull request Oct 4, 2024
…_, 'variant')

### What changes were proposed in this pull request?

This PR adds duplicate key support to the `from_json(_, 'variant')` query pattern. Duplicate key support [has already been introduced](apache#47920) in `parse_json`, JSON scans, and `from_json` expressions with nested schemas, but this code path was not updated.

### Why are the changes needed?

This change makes the behavior of `from_json(_, 'variant')` consistent with every other variant construction expression.

### Does this PR introduce _any_ user-facing change?

Yes, potentially. Depending on a config, users can now apply the `from_json(<input>, 'variant')` expression to JSON inputs with duplicate keys.

### How was this patch tested?

Unit tests.

### Was this patch authored or co-authored using generative AI tooling?

No.

Closes apache#48177 from harshmotw-db/harshmotw-db/master.

Authored-by: Harsh Motwani <harsh.motwani@databricks.com>
Signed-off-by: Hyukjin Kwon <gurwls223@apache.org>
himadripal pushed a commit to himadripal/spark that referenced this pull request Oct 19, 2024
4 participants