[SPARK-4452][Core]Shuffle data structures can starve others on the same thread for memory #10024
Conversation
Test build #46839 has finished for PR 10024 at commit
test this please
Test build #46852 has finished for PR 10024 at commit
Test build #49948 has finished for PR 10024 at commit
Test build #49951 has finished for PR 10024 at commit
Hi @lianhuiwang, thanks for submitting this patch. I just have a high-level question first. If I understand how this works correctly, the idea is that:
Is this a correct understanding? If so, this seems to hinge on one key assumption. I think this assumption is sound -- it is implied by the "destructive" SortedIterator in the internals. FWIW, I started down the path of writing something similar with*out* that assumption -- when a spill was requested on an in-flight iterator, the entire in-memory structure would get spilled to disk, and the in-flight iterator would switch to the spilled data and advance to the same location in the spilled data that it was at in the in-memory data. This was pretty convoluted, and as I started writing tests I realized there were corner cases that needed work, so I decided to submit the simpler change instead. It seems much easier to do it your way. I do have some tests which I think I can add as well -- let me dig those up and send them later today.
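The approach being discussed can be sketched roughly as follows. This is a hypothetical, simplified illustration (the class and method names are made up, and an in-memory deque stands in for a real spill file): an in-flight iterator whose remaining elements can be spilled by another consumer that needs memory, after which iteration transparently continues from the spilled copy.

```java
import java.util.ArrayDeque;
import java.util.Iterator;

// Hypothetical sketch of a spillable in-flight iterator. In real Spark the
// SpillableIterator writes the remaining records to a spill file and frees
// the in-memory pages; here an ArrayDeque stands in for the spill file.
class SpillableIterator<T> implements Iterator<T> {
    private Iterator<T> upstream;        // current source: in-memory, then spilled
    private final Object lock = new Object();
    private boolean spilled = false;

    SpillableIterator(Iterator<T> inMemory) {
        this.upstream = inMemory;
    }

    /** Called when another memory consumer requests that we spill. */
    long spill() {
        synchronized (lock) {
            if (spilled) {
                return 0L;               // nothing left to spill
            }
            // Drain the not-yet-consumed elements to the stand-in "disk",
            // then switch the iterator over to the spilled copy.
            ArrayDeque<T> onDisk = new ArrayDeque<>();
            while (upstream.hasNext()) {
                onDisk.add(upstream.next());
            }
            upstream = onDisk.iterator();
            spilled = true;
            return onDisk.size();        // stand-in for bytes freed
        }
    }

    @Override
    public boolean hasNext() {
        synchronized (lock) {
            return upstream.hasNext();
        }
    }

    @Override
    public T next() {
        synchronized (lock) {
            return upstream.next();
        }
    }
}
```

The locking matters because spill() is triggered by a different thread (another consumer under memory pressure) than the one driving hasNext()/next().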
/**
 * Allocates a heap memory of `size`.
 */
public long allocateHeapExecutionMemory(long size) {
This function does not actually create any object, so I'd like to call it acquireOnHeapMemory.
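The naming point above is that the call only books bytes against a shared execution-memory pool; no object is allocated. A minimal sketch of that accounting (hypothetical class and field names, not Spark's actual MemoryManager):

```java
// Hypothetical sketch of why "acquire" fits better than "allocate":
// the method only reserves bytes from a bounded pool and returns how
// many were granted; no heap object is created by the call itself.
class ExecutionMemoryPool {
    private final long poolSize;
    private long used = 0;

    ExecutionMemoryPool(long poolSize) {
        this.poolSize = poolSize;
    }

    /** Books up to `size` bytes; returns the amount actually granted. */
    synchronized long acquireOnHeapMemory(long size) {
        long granted = Math.min(size, poolSize - used);
        used += granted;
        return granted;
    }

    /** Returns previously booked bytes to the pool. */
    synchronized void releaseOnHeapMemory(long size) {
        used = Math.max(0, used - size);
    }

    synchronized long memoryUsed() {
        return used;
    }
}
```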
@lianhuiwang Thanks for working on this, I think it's going in the right direction. Two things left:
Test build #56099 has finished for PR 10024 at commit
@davies Thanks. I have added a SpillableIterator that makes the consumer and the spill call thread-safe. Please take a look at it.
Test build #56229 has finished for PR 10024 at commit
@squito Yes, I think your understanding is correct. This PR only supports calling a Spillable's iterator once. Code like `val sort = new Spillable(); sort.iterator(); sort.iterator()` would be wrong.
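The single-call contract described above can be illustrated with a small sketch (hypothetical class, not the actual Spillable): once iterator() hands out a destructive iterator, the in-memory collection may be nulled out (as forceSpill() does), so a second call has nothing left to iterate and must fail.

```java
import java.util.Iterator;
import java.util.List;

// Hypothetical illustration of the "iterator() may be called once" rule.
// Clearing the collection mimics destructive consumption / forceSpill().
class OnceSpillable<T> {
    private List<T> collection;
    private boolean iteratorCalled = false;

    OnceSpillable(List<T> data) {
        this.collection = data;
    }

    Iterator<T> iterator() {
        if (iteratorCalled) {
            throw new IllegalStateException("iterator() may only be called once");
        }
        iteratorCalled = true;
        Iterator<T> it = collection.iterator();
        collection = null;  // the backing data is no longer owned here
        return it;
    }
}
```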
Test build #56246 has finished for PR 10024 at commit
Test build #56365 has finished for PR 10024 at commit
@davies Yes, I have updated it using object.lock. I will rebase to master.
Test build #56465 has finished for PR 10024 at commit
test it please.
Jenkins, test this please
Test build #56456 has finished for PR 10024 at commit
Test build #56459 has finished for PR 10024 at commit
Test build #56462 has finished for PR 10024 at commit
Test build #56466 has finished for PR 10024 at commit
@davies Now all tests have passed, so could you take a look again? Thanks.
Test build #56475 has finished for PR 10024 at commit
val freeMemory = myMemoryThreshold - initialMemoryThreshold
_memoryBytesSpilled += freeMemory
releaseMemory()
freeMemory
We should free memory first, then release memory
It does collection = null in forceSpill() before releaseMemory().
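The ordering under discussion can be sketched as follows (hypothetical names loosely modeled on the Scala snippet above; a Java stand-in, not Spark's actual Spillable): the collection reference is cleared before releaseMemory() resets the booking, so the freed accounting matches what is actually reclaimable.

```java
// Hypothetical sketch of the spill ordering discussed above: drop the
// data reference first, then return the booked bytes to the manager.
class SpillableMap {
    private long myMemoryThreshold = 5 * 1024 * 1024;            // current booking
    private final long initialMemoryThreshold = 1024 * 1024;     // baseline booking
    private long memoryBytesSpilled = 0;
    private Object collection = new Object();                    // stand-in for the map

    long forceSpill() {
        collection = null;  // free the in-memory data first, so GC can reclaim it
        long freeMemory = myMemoryThreshold - initialMemoryThreshold;
        memoryBytesSpilled += freeMemory;
        releaseMemory();    // then shrink the booking back to the baseline
        return freeMemory;
    }

    private void releaseMemory() {
        myMemoryThreshold = initialMemoryThreshold;
    }

    long spilled() {
        return memoryBytesSpilled;
    }
}
```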
LGTM
@lianhuiwang Have you run any stress tests with the latest change?
This is a big change, maybe not.
Sorry, I mistakenly deleted my comment.
@davies I have also run the unit tests with a large N.
Test build #56496 has finished for PR 10024 at commit
Jenkins, test this please
Test build #56513 has finished for PR 10024 at commit
Test build #56532 has finished for PR 10024 at commit
Merging this into master, thanks!
@davies Thanks.
…he same thread for memory apache#10024
[SPARK-14007] [SQL] Manage the memory used by hash map in shuffled hash join (just TaskMemoryManager.java)
[SPARK-13113] [CORE] Remove unnecessary bit operation when decoding page number
…same thread for memory In apache#9241 a mechanism was implemented to call spill() on those SQL operators that support spilling when there is not enough memory for execution. But ExternalSorter and AppendOnlyMap in Spark core did not benefit from it, so this PR makes them benefit from apache#9241. Now, when there is not enough memory for execution, memory can be obtained by spilling ExternalSorter and AppendOnlyMap in Spark core. Adds two unit tests for it. Author: Lianhui Wang <lianhuiwang09@gmail.com> Closes apache#10024 from lianhuiwang/SPARK-4452-2.
What changes were proposed in this pull request?
In #9241, a mechanism was implemented to call spill() on those SQL operators that support spilling when there is not enough memory for execution.
But ExternalSorter and AppendOnlyMap in Spark core did not benefit from it. This PR makes them benefit from #9241: now, when there is not enough memory for execution, memory can be obtained by spilling ExternalSorter and AppendOnlyMap in Spark core.
How was this patch tested?
Added two unit tests for it.