[VL] Task killed by Yarn #6947
Comments
We may need to provide the ability to proxy malloc/free in Gluten and inject this proxy into the Velox MemoryAllocator.
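A minimal sketch of what such a proxy could look like, assuming a simple delegating design; the class and method names are illustrative, not the actual Velox `MemoryAllocator` API:

```cpp
// Illustrative only: a delegating malloc/free proxy that reports every
// allocation to a reservation counter (standing in for Spark's memory
// manager) before forwarding to the real allocator. The real integration
// point would be Velox's MemoryAllocator, whose API is not shown here.
#include <atomic>
#include <cstdint>
#include <cstdlib>

class MallocProxy {
 public:
  void* allocate(size_t bytes) {
    reserved_.fetch_add(bytes, std::memory_order_relaxed); // charge Spark
    return std::malloc(bytes);
  }

  void release(void* p, size_t bytes) {
    std::free(p);
    reserved_.fetch_sub(bytes, std::memory_order_relaxed); // refund Spark
  }

  int64_t reservedBytes() const {
    return reserved_.load(std::memory_order_relaxed);
  }

 private:
  std::atomic<int64_t> reserved_{0};
};
```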
Related to #6960. Global memory allocation isn't tracked either, and it has no limit now; that is most likely the root cause. Thanks to @zhztheplayer and @marin-ma.
PR #6988 added a limit on global memory allocation of 0.75× the overhead memory. But Velox has no control over global memory allocation, so with PR #6988 the workaround is to set a large overhead memory; otherwise you will see the error. (With Spark's default overhead of max(384 MB, 0.1 × executor memory), for example, a 10 GB executor gets 1 GB of overhead, so the cap would be 768 MB.)
Here is a simple way to detect this: it outputs any memory allocation larger than 1 MB that does not come from a memory pool. Disable jemalloc and set executor cores to 1, and you can then see output like the following (a sketch of one way to build such a detector appears after this comment).
In this example, the root cause is
The solution is to set
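One way to build such a detector is an LD_PRELOAD malloc interposer that logs a stack trace for large allocations. The sketch below is a hedged assumption about how it could be done, not the exact tool used above:

```cpp
// Hypothetical LD_PRELOAD interposer: logs a stack trace for any malloc of
// 1 MB or more, surfacing large allocations that bypass the memory pools.
//
// Build: g++ -shared -fPIC -O1 -o libbigalloc.so bigalloc.cpp -ldl
// Run:   LD_PRELOAD=./libbigalloc.so <executor process>
#include <dlfcn.h>
#include <execinfo.h>
#include <cstdio>
#include <cstdlib>

static void* (*realMalloc)(size_t) = nullptr;
static thread_local bool inHook = false; // avoid recursing via backtrace()

extern "C" void* malloc(size_t size) {
  if (!realMalloc) {
    realMalloc =
        reinterpret_cast<void* (*)(size_t)>(dlsym(RTLD_NEXT, "malloc"));
  }
  void* p = realMalloc(size);
  if (size >= (1u << 20) && !inHook) {
    inHook = true;
    void* frames[32];
    int depth = backtrace(frames, 32);
    fprintf(stderr, "untracked malloc of %zu bytes\n", size);
    backtrace_symbols_fd(frames, depth, /*fd=*/2); // stderr; avoids malloc
    inHook = false;
  }
  return p;
}
```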
The other places:
It's a temp buffer and will be released after flush.
Another known BKM (best known method): the malloc implementation also matters. To boost performance, jemalloc holds on to freed memory, and that retained memory is counted into overhead memory by Yarn as well.
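If jemalloc is kept, one hedged mitigation is to shorten its decay times so freed pages are returned to the OS sooner, at some performance cost. jemalloc reads an application-defined `malloc_conf` symbol (or the `MALLOC_CONF` environment variable); the values below are illustrative, not recommendations from this issue:

```cpp
// Illustrative jemalloc tuning: return dirty/muzzy pages to the OS after
// ~1s instead of the default decay, so less retained memory is counted
// into Yarn's overhead. Can also be set via MALLOC_CONF at process start.
extern "C" const char* malloc_conf = "dirty_decay_ms:1000,muzzy_decay_ms:1000";
```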
It looks like timsort needs multiple memory allocations; in this case each allocation is ~3 MB.

Window-related fix: facebookincubator/velox#11077. Sort-related fix: facebookincubator/velox#11129.
A basic design in Velox is that before spill, memory is allocated from the Spark memory pool (offheap size), while during spill, memory is allocated from the global memory pool (overhead).
Three configurations have a big impact on offheap and overhead memory usage (an illustrative setting follows the list):

- `spark.gluten.sql.columnar.backend.velox.spillWriteBufferSize` controls the buffer size when spill writes data to disk. It appears to also control the read buffer size when spilled data is fetched back. Each file must have one buffer allocated in offheap memory; if the size is too large, it will report an OOM error triggered by getOutput.
- `maxSpillRunRows` controls the batch size of spill. The bigger the number, the more overhead memory is allocated, because during spill all memory allocation is overhead memory. The smaller the number, the more spill files.
- `maxSpillFileSize` controls the file size of spill. The smaller the number, the more spill files.
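For illustration only, the trade-off might be expressed as below. The values are arbitrary, and the full property names of the last two are an assumption based on the prefix used by the first; check the Gluten docs for the exact keys:

```
spark.gluten.sql.columnar.backend.velox.spillWriteBufferSize=1MB
spark.gluten.sql.columnar.backend.velox.maxSpillRunRows=3000000
spark.gluten.sql.columnar.backend.velox.maxSpillFileSize=1GB
```

Smaller write buffers reduce offheap pressure per spill file at the cost of more I/O; smaller run rows reduce overhead usage during spill at the cost of more spill files.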
Backend
VL (Velox)
Bug description
One common issue with Gluten is that the task is killed by Yarn. Currently, some memory allocations in Gluten, such as those made by std::vector, are not tracked by Spark's memory management; they are counted into executor overhead memory for now, and some of these allocations can carry a large data size. We should create a proxy allocator for such memory allocations (a hedged sketch follows).
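A hedged sketch of the proposed proxy allocator, shown here as an STL-compatible allocator so containers like std::vector become visible to tracking. The counter stands in for Spark's memory manager; nothing here is existing Gluten code:

```cpp
// Hypothetical proxy allocator: charges a global counter (a stand-in for
// Spark's memory manager) on every container allocation and refunds on free.
#include <atomic>
#include <cstdint>
#include <cstdlib>
#include <vector>

inline std::atomic<int64_t> gTracked{0}; // stand-in for Spark accounting

template <typename T>
struct TrackedAllocator {
  using value_type = T;
  TrackedAllocator() = default;
  template <typename U>
  TrackedAllocator(const TrackedAllocator<U>&) {}

  T* allocate(std::size_t n) {
    gTracked.fetch_add(n * sizeof(T)); // charge before allocating
    return static_cast<T*>(std::malloc(n * sizeof(T)));
  }
  void deallocate(T* p, std::size_t n) {
    std::free(p);
    gTracked.fetch_sub(n * sizeof(T)); // refund on free
  }
};

template <typename T, typename U>
bool operator==(const TrackedAllocator<T>&, const TrackedAllocator<U>&) {
  return true;
}
template <typename T, typename U>
bool operator!=(const TrackedAllocator<T>&, const TrackedAllocator<U>&) {
  return false;
}

// Usage: every reserve/resize now shows up in gTracked.
// std::vector<int, TrackedAllocator<int>> v;
// v.resize(1 << 20);
```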
Spark version
None
Spark configurations
No response
System information
No response
Relevant logs
No response