[FEA] clean up host memory limit configs #8878

Open
revans2 opened this issue Jul 31, 2023 · 4 comments
Labels
reliability — Features to improve reliability or bugs that severely impact the reliability of the plugin
task — Work required that improves the product but is not user facing

Comments

revans2 (Collaborator) commented Jul 31, 2023

Is your feature request related to a problem? Please describe.

Once the changes to the plugin are far enough along that we feel it should be turned on by default:

  • We will set spark.rapids.memory.hostOffHeapLimit.enabled to true by default, and deprecate it. We will have a follow-on issue to remove it once we have confidence that customers can move to this without a lot of problems.
  • We will deprecate spark.rapids.memory.host.spillStorageSize and point people to spark.rapids.memory.hostOffHeapLimit.size instead.
  • We will move spark.rapids.memory.hostPageable.taskOverhead.size from being an internal config to being an advanced config.
  • We will write user-facing docs about the remaining non-hidden configs.
  • We will add a warning to the user if the configs appear to be set in a way that would make it very hard to run well. This would be something like spark.rapids.memory.hostOffHeapLimit.size < 2 * spark.rapids.memory.hostPageable.taskOverhead.size * numberOfTasks OR spark.rapids.memory.hostOffHeapLimit.size < spark.rapids.sql.batchSizeBytes + spark.sql.files.maxPartitionBytes * some factor (but we can adjust this based off of the testing we do). The warning should let the user know that we are adjusting the config to the new minimum value to avoid problems. We should also warn if the pinned pool is larger than the offHeapLimit; for now we will adjust the pinned pool down to fit in the off-heap limit. All of this should be documented (see the sketch below).
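
To make the intent of that last bullet concrete, here is a rough sketch of what such a sanity check could look like. The config names come from this issue, but the object name, the clamping behavior, and the exact formula (including how "some factor" is applied) are assumptions for illustration only, not the plugin's actual implementation:

```scala
// Illustrative sketch only: the config names are taken from this issue, but the
// object name, thresholds, and clamping logic are assumptions, not plugin code.
object HostMemoryLimitChecks {
  def checkAndAdjust(
      offHeapLimitBytes: Long,  // spark.rapids.memory.hostOffHeapLimit.size
      taskOverheadBytes: Long,  // spark.rapids.memory.hostPageable.taskOverhead.size
      numTasks: Int,
      batchSizeBytes: Long,     // spark.rapids.sql.batchSizeBytes
      maxPartitionBytes: Long,  // spark.sql.files.maxPartitionBytes
      pinnedPoolBytes: Long,
      readFactor: Double = 1.0): (Long, Long) = {
    // The two minimums described above; "some factor" is modeled as readFactor.
    val minForTasks = 2L * taskOverheadBytes * numTasks
    val minForReads = batchSizeBytes + (maxPartitionBytes * readFactor).toLong
    val minLimit = math.max(minForTasks, minForReads)

    val adjustedLimit = if (offHeapLimitBytes < minLimit) {
      println(s"WARNING: hostOffHeapLimit.size is below the minimum of $minLimit bytes; raising it to avoid problems")
      minLimit
    } else {
      offHeapLimitBytes
    }

    // Warn if the pinned pool is larger than the off-heap limit and shrink it to fit.
    val adjustedPinned = if (pinnedPoolBytes > adjustedLimit) {
      println("WARNING: pinned pool is larger than the off-heap limit; shrinking it to fit")
      adjustedLimit
    } else {
      pinnedPoolBytes
    }

    (adjustedLimit, adjustedPinned)
  }
}
```
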
@revans2 revans2 added feature request New feature or request ? - Needs Triage Need team to review and classify task Work required that improves the product but is not user facing labels Jul 31, 2023
@revans2 revans2 added reliability Features to improve reliability or bugs that severely impact the reliability of the plugin and removed feature request New feature or request labels Jul 31, 2023
@mattahrens mattahrens removed the ? - Needs Triage Need team to review and classify label Aug 8, 2023
binmahone (Collaborator) commented

Hi @revans2, I recently found that, unless spark.rapids.memory.hostOffHeapLimit.enabled is set, host memory consumption is unbounded, so the whole Spark process is at risk of being killed by the OOM killer or by YARN. Is there any reason why we're still not enabling spark.rapids.memory.hostOffHeapLimit.enabled by default?

revans2 (Collaborator, Author) commented Jul 11, 2024

There are really two reasons:

  1. We never finished updating all of the CPU allocation code, so it is still technically unbounded, just not as unbounded as before.
  2. It is likely to cause a lot of current jobs to run much slower because they will end up spilling.

When we finally finish the first item, we will discuss the second one and whether there are things we can or should do to help mitigate it.

binmahone (Collaborator) commented

Hi @revans2, can you give some examples of the CPU allocation code to help me better understand it? Per my understanding, spark.rapids.memory.hostOffHeapLimit places a limit on HostAlloc, so the total size of buffers created afterwards will be bounded (because creating a buffer requires going through HostAlloc.tryAlloc). With this, I assume all of the host allocations are bounded, so what does "CPU allocation code" actually mean?
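
For what it's worth, the bounding pattern being described could be sketched roughly like this (a simplified illustration only, assuming a tryAlloc-style interface; this is not the actual HostAlloc code in spark-rapids):

```scala
// Simplified illustration of a bounded host allocator, not the real HostAlloc.
// All names and fields here are assumptions for the sake of the example.
class BoundedHostAlloc(limitBytes: Long) {
  private var allocatedBytes: Long = 0L

  /** Returns Some(sizeBytes) when the request fits under the limit, else None. */
  def tryAlloc(sizeBytes: Long): Option[Long] = synchronized {
    if (allocatedBytes + sizeBytes <= limitBytes) {
      allocatedBytes += sizeBytes
      Some(sizeBytes)
    } else {
      None // the caller has to spill, wait, or retry, so total usage stays bounded
    }
  }

  def free(sizeBytes: Long): Unit = synchronized {
    allocatedBytes -= sizeBytes
  }
}
```
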

The original problem we're facing is this: when running a customer SQL query with buffer spilling, we want to maximize spilling to the memory store and minimize spilling to the disk store (to get better performance). Meanwhile, we have to limit the total off-heap memory being used, to prevent the executor process from eating up all OS memory and then being killed by the OOM killer. Our current solution is to set:

spark.rapids.memory.host.offHeapLimit.enabled=true
spark.rapids.memory.host.offHeapLimit.size=40g --> limit offheap used by RapidsHostMemoryStore

spark.memory.offHeap.enabled=true
spark.memory.offHeap.size=40g --> limit offheap used by Spark in places like ShuffleExternalSorter, etc.

With these configs we hope the total off-heap memory is bounded. Any comments on your side?

revans2 (Collaborator, Author) commented Jul 16, 2024

It should be close, but I would have to go back and look at the EPIC to see exactly what is left. You definitely could try that. I think we are 99% of the way to truly limiting host memory, but it has been a while.

Be aware that the pool Spark uses for off-heap memory does not overlap with the pool that we use for off-heap memory. It would be nice to eventually combine them, but as it stands your config could use up to 80 GiB of off-heap memory.
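
As a rough back-of-the-envelope illustration of that point (the numbers are just the 40g values quoted above; the budgeting logic here is an assumption about how to size a container, not anything the plugin computes):

```scala
// Back-of-the-envelope budget only; the two pools below are separate and add up.
val sparkOffHeapGiB  = 40L // spark.memory.offHeap.size
val rapidsOffHeapGiB = 40L // spark.rapids.memory.host.offHeapLimit.size
val worstCaseOffHeapGiB = sparkOffHeapGiB + rapidsOffHeapGiB // 80 GiB worst case

// The container or OS limit has to leave room for this total on top of the
// executor heap and any other native overhead, or the OOM killer / YARN can
// still terminate the process.
println(s"Worst-case off-heap usage: $worstCaseOffHeapGiB GiB")
```
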
