Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[SPARK-47307][SQL][FOLLOWUP] Promote spark.sql.legacy.chunkBase64String.enabled from a legacy/internal config to a regular/public one #47410

Closed
wants to merge 4 commits into from

Conversation

wForget
Copy link
Member

@wForget wForget commented Jul 19, 2024

What changes were proposed in this pull request?

  • Promote spark.sql.legacy.chunkBase64String.enabled from a legacy/internal config to a regular/public one.
  • Add test cases for unbase64

Why are the changes needed?

Keep the same behavior as before. More details: #47303 (comment)

Does this PR introduce any user-facing change?

yes, revert behavior change introduced in #47303

How was this patch tested?

existing unit test

Was this patch authored or co-authored using generative AI tooling?

No

@github-actions github-actions bot added the SQL label Jul 19, 2024
@yaooqinn
Copy link
Member

cc @gatorsmile too


// check if unbase64 works well for chunked and non-chunked encoded strings
checkEvaluation(StringDecode(UnBase64(Literal(encoded)), Literal("utf-8")), longString)
checkEvaluation(StringDecode(UnBase64(Literal(chunkEncoded)), Literal("utf-8")), longString)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I added this case to confirm that the chunkEncoded string can be decoded correctly. cc @cloud-fan

Copy link
Member

@dongjoon-hyun dongjoon-hyun left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@dongjoon-hyun
Copy link
Member

BTW, @wForget , if you change the configuration name in this PR, the PR title and description should be updated accordingly.

Currently, the PR title and description is not matched.

@yaooqinn yaooqinn changed the title [SPARK-47307][SQL][FOLLOWUP] Change the default value of spark.sql.legacy.chunkBase64String.enabled to true [SPARK-47307][SQL][FOLLOWUP] Promote spark.sql.legacy.chunkBase64String.enabled from a legacy/internal config to a regular/public one Jul 19, 2024
@wForget
Copy link
Member Author

wForget commented Jul 19, 2024

BTW, @wForget , if you change the configuration name in this PR, the PR title and description should be updated accordingly.

Currently, the PR title and description is not matched.

Updated, thank you.

@dongjoon-hyun
Copy link
Member

Let's wait for @gatorsmile and @cloud-fan 's review.

@wForget
Copy link
Member Author

wForget commented Jul 19, 2024

The test golden files need to be regenerated, I am doing it.

@dongjoon-hyun
Copy link
Member

Oh, this one?

[info] *** 1 TEST FAILED ***
[error] Failed tests:
[error] 	org.apache.spark.sql.connect.ProtoToParsedPlanTestSuite

@wForget
Copy link
Member Author

wForget commented Jul 19, 2024

Oh, this one?

yeah

@yaooqinn
Copy link
Member

Merged to master.

Hi @wForget, please help backport this to 3.5. Thank you

wForget added a commit to wForget/spark that referenced this pull request Jul 19, 2024
…ng.enabled from a legacy/internal config to a regular/public one

### What changes were proposed in this pull request?

+ Promote spark.sql.legacy.chunkBase64String.enabled from a legacy/internal config to a regular/public one.
+ Add test cases for unbase64

### Why are the changes needed?

Keep the same behavior as before. More details: apache#47303 (comment)

### Does this PR introduce _any_ user-facing change?

yes, revert behavior change introduced in apache#47303

### How was this patch tested?

existing unit test

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#47410 from wForget/SPARK-47307_followup.

Lead-authored-by: wforget <643348094@qq.com>
Co-authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Kent Yao <yao@apache.org>

(cherry picked from commit af5eb08)
@cloud-fan
Copy link
Contributor

It's good to keep the old behavior by default, but I'd like to understand the difference. Which one is more commonly supported in the ecosystem (decoding)? The chunked one or the unchunked one?

@yaooqinn
Copy link
Member

Unchunking is the behavior before 3.2(inclusive). Chunking was the behavior introduced accidentally in 3.3. I suggested @wForget in #47303 to make chunking as legacy behavior and unchunking as default, considering that unchunking is more reasonable here and users may not upgrade so eagerly, while chunking is mostly for editors who can not handle long lines. For the unbase64 part, both are valid for decoding.

yaooqinn pushed a commit that referenced this pull request Jul 20, 2024
…4String.enabled from a legacy/internal config to a regular/public one

Backports #47410 to 3.5

### What changes were proposed in this pull request?

+ Promote spark.sql.legacy.chunkBase64String.enabled from a legacy/internal config to a regular/public one.
+ Add test cases for unbase64

### Why are the changes needed?

Keep the same behavior as before. More details: #47303 (comment)

### Does this PR introduce _any_ user-facing change?

yes, revert behavior change introduced in #47303

### How was this patch tested?

existing unit test

### Was this patch authored or co-authored using generative AI tooling?

No

Closes #47416 from wForget/SPARK-47307_followup_3.5.

Authored-by: wforget <643348094@qq.com>
Signed-off-by: Kent Yao <yao@apache.org>
jingz-db pushed a commit to jingz-db/spark that referenced this pull request Jul 22, 2024
…ng.enabled from a legacy/internal config to a regular/public one

### What changes were proposed in this pull request?

+ Promote spark.sql.legacy.chunkBase64String.enabled from a legacy/internal config to a regular/public one.
+ Add test cases for unbase64

### Why are the changes needed?

Keep the same behavior as before. More details: apache#47303 (comment)

### Does this PR introduce _any_ user-facing change?

yes, revert behavior change introduced in apache#47303

### How was this patch tested?

existing unit test

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#47410 from wForget/SPARK-47307_followup.

Lead-authored-by: wforget <643348094@qq.com>
Co-authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Kent Yao <yao@apache.org>
attilapiros pushed a commit to attilapiros/spark that referenced this pull request Oct 4, 2024
…ng.enabled from a legacy/internal config to a regular/public one

### What changes were proposed in this pull request?

+ Promote spark.sql.legacy.chunkBase64String.enabled from a legacy/internal config to a regular/public one.
+ Add test cases for unbase64

### Why are the changes needed?

Keep the same behavior as before. More details: apache#47303 (comment)

### Does this PR introduce _any_ user-facing change?

yes, revert behavior change introduced in apache#47303

### How was this patch tested?

existing unit test

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#47410 from wForget/SPARK-47307_followup.

Lead-authored-by: wforget <643348094@qq.com>
Co-authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Kent Yao <yao@apache.org>
himadripal pushed a commit to himadripal/spark that referenced this pull request Oct 19, 2024
…ng.enabled from a legacy/internal config to a regular/public one

### What changes were proposed in this pull request?

+ Promote spark.sql.legacy.chunkBase64String.enabled from a legacy/internal config to a regular/public one.
+ Add test cases for unbase64

### Why are the changes needed?

Keep the same behavior as before. More details: apache#47303 (comment)

### Does this PR introduce _any_ user-facing change?

yes, revert behavior change introduced in apache#47303

### How was this patch tested?

existing unit test

### Was this patch authored or co-authored using generative AI tooling?

No

Closes apache#47410 from wForget/SPARK-47307_followup.

Lead-authored-by: wforget <643348094@qq.com>
Co-authored-by: Kent Yao <yao@apache.org>
Signed-off-by: Kent Yao <yao@apache.org>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants