Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

[Improvement] Use OR operation instead of serialization for cloning BitMaps #103

Merged
merged 1 commit into from
Jul 29, 2022

Conversation

kaijchen
Copy link
Contributor

@kaijchen kaijchen commented Jul 29, 2022

What changes were proposed in this pull request?

  1. Add RssUtils#cloneBitMap().
  2. Replace deserializeBitMap(serializeBitMap(bitmap)) by cloneBitMap(bitmap).

Why are the changes needed?

  1. No need to handle IOException.
  2. More efficient.

Does this PR introduce any user-facing change?

No.

How was this patch tested?

New unit test RssUtilsTest#testCloneBitmap()

@codecov-commenter
Copy link

codecov-commenter commented Jul 29, 2022

Codecov Report

Merging #103 (0fcac2b) into master (ccb39ed) will increase coverage by 0.01%.
The diff coverage is 100.00%.

@@             Coverage Diff              @@
##             master     #103      +/-   ##
============================================
+ Coverage     56.33%   56.34%   +0.01%     
  Complexity     1176     1176              
============================================
  Files           149      149              
  Lines          7992     7987       -5     
  Branches        766      766              
============================================
- Hits           4502     4500       -2     
+ Misses         3246     3242       -4     
- Partials        244      245       +1     
Impacted Files Coverage Δ
...che/uniffle/client/impl/ShuffleReadClientImpl.java 89.69% <100.00%> (+3.28%) ⬆️
.../java/org/apache/uniffle/common/util/RssUtils.java 66.19% <100.00%> (+0.72%) ⬆️
.../org/apache/uniffle/server/ShuffleTaskManager.java 64.10% <100.00%> (-0.31%) ⬇️
...storage/handler/impl/DataSkippableReadHandler.java 81.25% <0.00%> (-3.13%) ⬇️
.../apache/uniffle/coordinator/ClientConfManager.java 91.54% <0.00%> (-1.41%) ⬇️
...e/uniffle/server/storage/SingleStorageManager.java 67.21% <0.00%> (+1.63%) ⬆️

Help us with your feedback. Take ten seconds to tell us how you rate us.

@jerqi
Copy link
Contributor

jerqi commented Jul 29, 2022

Could we have a simple benchmark test?

@kaijchen
Copy link
Contributor Author

kaijchen commented Jul 29, 2022

Could we have a simple benchmark test?

Performance is not the main concern here.
Exception-free is more important.

If you find this change is impacting performance later,
you can change the method body to

try {
    return deserializeBitMap(serializeBitMap(bitmap));
} catch(IOException e) {
    // not expected
}

Just look at the body of serializeBitMap and deserializeBitMap,
it feels they are more costly and error-prone.

And explicitly throwing RuntimeException is not a good practice,
because they are difficult to catch (without catching real RuntimeException).

@jerqi
Copy link
Contributor

jerqi commented Jul 29, 2022

Could we have a simple benchmark test?

Performance is not the main concern here. Exception-free is more important.

If you find this change is impacting performance later, you can change the method body to

try {
    return deserializeBitMap(serializeBitMap(bitmap));
} catch(IOException e) {
    // not expected
}

Just look at the body of serializeBitMap and deserializeBitMap, it feels they are more costly and error-prone.

And explicitly throwing RuntimeException is not a good practice, because they are difficult to catch (without catching real RuntimeException).

Sadly, we care the performance here. We need performance test to ensure that the pr won't bring worse performance. Bitmap method or will do many operations, it may cost much time, I can't ensure the performance easily.

@kaijchen
Copy link
Contributor Author

kaijchen commented Jul 29, 2022

It turns out the new method is 15x faster than the old one.
I'll include the following test in this PR.

screenshot_ccc1ad40-80da-418d-9d17-568a6dc428c9

@jerqi
Copy link
Contributor

jerqi commented Jul 29, 2022

Great work!

@jerqi
Copy link
Contributor

jerqi commented Jul 29, 2022

Could you use some random number to test them again? Because bitmap will compress the ordered number, the size of bitmap will be smaller, we test in more situations, the result will be more persuasive.

@kaijchen
Copy link
Contributor Author

Random ints
image

Random longs
image

@kaijchen
Copy link
Contributor Author

I'll include the following test in this PR.

Removed this test because it might be flaky due to resource sharing in the CI environment.

@kaijchen kaijchen changed the title [Improvement] Add RssUtils#cloneBitMap() [Improvement] Use OR operation instead of serialization for cloning BitMaps Jul 29, 2022
Copy link
Contributor

@jerqi jerqi left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, thanks for your contribution @kaijchen.

@jerqi jerqi merged commit 4e4b940 into apache:master Jul 29, 2022
@kaijchen
Copy link
Contributor Author

Thanks @jerqi for the review.

@kaijchen kaijchen deleted the clone-bitmap branch July 29, 2022 08:51
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants