Enable kudo serializer by default. #12202

Open
liurenjie1024 opened this issue Feb 24, 2025 · 0 comments · May be fixed by #12222
Assignees
Labels
ease of use (Makes the product simpler to use or configure), shuffle (things that impact the shuffle plugin)


liurenjie1024 commented Feb 24, 2025

The kudo serializer has been verified and tested in customer production environments for a long time, where it shows a significant performance improvement over the original jcudf serializer. We also ran many NDS tests in different environments, such as spark2a, nds, and did not observe any regression.

Summary of the NDS results:

  1. At least 1% performance improvement on Dataproc.
  2. No regression on spark2a.

The tests were run using spark-rapids at git hash 417c6a0, but the latest branch-25.04 should give similar results.
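For anyone who wants to compare the two serializers before the default changes, here is a minimal sketch of pinning the choice explicitly on the command line. It assumes the switch is the `spark.rapids.shuffle.kudo.serializer.enabled` key; please verify that name against the configs documentation for your spark-rapids release. The helper `kudo_submit_args` is hypothetical and is not part of the plugin.

```python
# Assumption: this key is the plugin's switch for the kudo shuffle serializer.
# Check it against the configs documentation for your spark-rapids release.
KUDO_CONF_KEY = "spark.rapids.shuffle.kudo.serializer.enabled"


def kudo_submit_args(enabled: bool) -> list[str]:
    """Build spark-submit --conf arguments that pin the serializer choice.

    Hypothetical helper for illustration only; not part of spark-rapids.
    """
    value = "true" if enabled else "false"
    return ["--conf", f"{KUDO_CONF_KEY}={value}"]


if __name__ == "__main__":
    # Print the flags for a run with kudo on and a baseline run with it off.
    print(" ".join(kudo_submit_args(True)))
    print(" ".join(kudo_submit_args(False)))
```

Running the same NDS query set once with each setting gives a like-for-like comparison that does not depend on what the default is.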

liurenjie1024 added the ease of use and shuffle labels Feb 24, 2025
liurenjie1024 self-assigned this Feb 24, 2025
liurenjie1024 linked a pull request Feb 25, 2025 that will close this issue