[ISSUE-455] Lazily create uncompressedData #457

xianjingfeng · 2023-01-05T02:59:25Z

What changes were proposed in this pull request?

Lazily create uncompressedData.

Why are the changes needed?

Save memory. #455

Does this PR introduce any user-facing change?

No

How was this patch tested?

The existing UTs are enough.

codecov-commenter · 2023-01-05T03:16:04Z

Codecov Report

Merging #457 (b23b540) into master (9572b84) will increase coverage by 3.06%.
The diff coverage is n/a.

@@             Coverage Diff              @@
##             master     #457      +/-   ##
============================================
+ Coverage     58.67%   61.74%   +3.06%     
- Complexity     1654     1655       +1     
============================================
  Files           199      193       -6     
  Lines         11217    10053    -1164     
  Branches        997      997              
============================================
- Hits           6582     6207     -375     
+ Misses         4243     3514     -729     
+ Partials        392      332      -60

Impacted Files	Coverage Δ
...e/spark/shuffle/reader/RssShuffleDataIterator.java	`94.28% <ø> (+1.13%)`	⬆️
...y/kubernetes/operator/pkg/webhook/inspector/rss.go
...y/kubernetes/operator/pkg/webhook/inspector/pod.go
...oy/kubernetes/operator/pkg/controller/util/util.go
...bernetes/operator/pkg/controller/controller/rss.go
...rnetes/operator/pkg/webhook/inspector/inspector.go
deploy/kubernetes/operator/pkg/webhook/manager.go

advancedxy · 2023-01-06T02:26:06Z

client-spark/common/src/main/java/org/apache/spark/shuffle/reader/RssShuffleDataIterator.java

+          // todo: support off-heap bytebuffer
+          uncompressedData = ByteBuffer.allocate(
+              (int) rssConf.getSizeAsBytes(
+                  RssClientConfig.RSS_WRITER_BUFFER_SIZE,


Should this be RSS_READER_BUFFER_SIZE?
And for this if block, should the bufferSize be max(readBufferSize, uncompressedLen)?

Could you add a test case for this scenario?

I think the purpose of using RSS_WRITER_BUFFER_SIZE may be to avoid allocating memory multiple times. Let's remove it.

Yes. The initial uncompressedData size is hard to set. I prefer to remove it.
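For readers following this thread, here is a minimal sketch of the idea under discussion: allocate nothing up front, size the buffer from the block's uncompressed length on first use, and reallocate only when a later block is larger. It is not the merged code in RssShuffleDataIterator; the class `LazyBufferSketch` and the methods `ensureCapacity` and `allocationCount` are hypothetical names used only for illustration.

```java
import java.nio.ByteBuffer;

// Hypothetical illustration of lazy buffer creation; not the actual Uniffle class.
final class LazyBufferSketch {
    // Stays null until the first block is decompressed, so idle readers hold no buffer.
    private ByteBuffer uncompressedData;
    // Counts allocations so the reuse behavior can be observed.
    private int allocations;

    // Returns a heap buffer with at least uncompressedLen bytes of capacity,
    // allocating only when no buffer exists yet or the current one is too small.
    public ByteBuffer ensureCapacity(int uncompressedLen) {
        if (uncompressedLen < 0) {
            throw new IllegalArgumentException("uncompressedLen must be non-negative");
        }
        if (uncompressedData == null || uncompressedData.capacity() < uncompressedLen) {
            // todo: support off-heap bytebuffer
            uncompressedData = ByteBuffer.allocate(uncompressedLen);
            allocations++;
        }
        // Reset position and expose exactly uncompressedLen bytes to the caller.
        uncompressedData.clear();
        uncompressedData.limit(uncompressedLen);
        return uncompressedData;
    }

    public int allocationCount() {
        return allocations;
    }

    public static void main(String[] args) {
        LazyBufferSketch sketch = new LazyBufferSketch();
        sketch.ensureCapacity(64);   // first use: allocates
        sketch.ensureCapacity(16);   // smaller block: reuses the same buffer
        sketch.ensureCapacity(128);  // larger block: reallocates
        System.out.println("allocations = " + sketch.allocationCount());
    }
}
```

Sizing from the uncompressed length removes the need to guess an initial size from a config value, at the cost of an occasional reallocation when block sizes grow.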

jerqi

LGTM

jerqi · 2023-01-06T08:51:23Z

client-spark/common/src/main/java/org/apache/spark/shuffle/reader/RssShuffleDataIterator.java

@@ -69,13 +67,6 @@ public RssShuffleDataIterator(
    this.shuffleReadClient = shuffleReadClient;
    this.shuffleReadMetrics = shuffleReadMetrics;
    this.codec = Codec.newInstance(rssConf);
-    // todo: support off-heap bytebuffer


We had better not remove this todo comment.

zuston

LGTM

zuston · 2023-01-06T10:37:16Z

Merged. Thanks @jerqi @xianjingfeng @advancedxy

Lazily create uncompressedData

9b8ab7e

xianjingfeng requested a review from zuston January 5, 2023 02:59

jerqi reviewed Jan 5, 2023

client-spark/common/src/main/java/org/apache/spark/shuffle/reader/RssShuffleDataIterator.java

advancedxy reviewed Jan 6, 2023

Optimize

889e7ac

jerqi previously approved these changes Jan 6, 2023

jerqi reviewed Jan 6, 2023

zuston previously approved these changes Jan 6, 2023

Add todo comment back

b23b540

xianjingfeng dismissed stale reviews from zuston and jerqi via b23b540 January 6, 2023 09:18

jerqi approved these changes Jan 6, 2023

zuston approved these changes Jan 6, 2023

zuston merged commit 2b756c3 into apache:master Jan 6, 2023

jerqi mentioned this pull request Jan 6, 2023

[Improvement] Lazily create uncompressedData #455

Closed

3 tasks

xianjingfeng deleted the issue_455 branch March 1, 2023 13:25
