Have host spill use the new HostAlloc API #9257
Conversation
Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
build
@@ -192,7 +192,7 @@ class RapidsDiskStore(diskBlockManager: RapidsDiskBlockManager)
     val path = id.getDiskPath(diskBlockManager)
     withResource(new FileInputStream(path)) { fis =>
       val (header, hostBuffer) = SerializedHostTableUtils.readTableHeaderAndBuffer(fis)
-      val hostCols = closeOnExcept(hostBuffer) { _ =>
+      val hostCols = withResource(hostBuffer) { _ =>
nice catch
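For readers following along: closeOnExcept closes a resource only when the block throws, while withResource always closes it, so once hostBuffer is no longer needed after the host columns are built, leaving it open on the success path is a leak. Below is a minimal, self-contained sketch of those semantics; the two helpers are simplified stand-ins, not the actual spark-rapids Arm implementations.

```scala
// Simplified stand-ins illustrating the resource-handling difference that
// caused the leak fixed in this diff. Not the real plugin implementations.
object ResourceSemanticsSketch {
  def withResource[R <: AutoCloseable, T](r: R)(block: R => T): T =
    try block(r) finally r.close()                // always closes

  def closeOnExcept[R <: AutoCloseable, T](r: R)(block: R => T): T =
    try block(r) catch {
      case t: Throwable => r.close(); throw t     // closes only on failure
    }

  final class FakeHostBuffer extends AutoCloseable {
    var closed = false
    override def close(): Unit = closed = true
  }

  def main(args: Array[String]): Unit = {
    val a = new FakeHostBuffer
    closeOnExcept(a) { _ => "columns built" }
    println(s"closeOnExcept left buffer open: ${!a.closed}") // true -> leaked

    val b = new FakeHostBuffer
    withResource(b) { _ => "columns built" }
    println(s"withResource closed buffer: ${b.closed}")      // true -> no leak
  }
}
```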
-      // TODO: this is disabled for now since subsequent work will tie this into
-      // our host allocator apis.
-      if (false && !wouldFit) {
+      if (!wouldFit) {
@revans2 this will change the behavior in 23.10. Before this, we would still spill to host. Now we are going to skip to disk if we can't fit given the limit. Just making sure that was intended.
I personally think that is what we want, but if anyone has a different opinion I am all ears. We want to get to the point where we have a hard limit on host memory. The reason we don't want to make the changes piecemeal is to reduce the pain a customer might see when running a job. This change is only going to show up when a customer wants to spill from GPU memory, and it is larger than the spill store size. I think that is fairly rare, so I am willing to take the hit.
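To make the behavior change concrete, here is a hedged sketch of the decision being discussed; the names and the exact capacity check are illustrative, not the plugin's actual RapidsHostMemoryStore code. With the wouldFit check re-enabled, a buffer that exceeds the host spill store limit goes straight to disk instead of being copied to host first.

```scala
// Illustrative spill-target decision, assuming "would fit" means the buffer
// fits within the remaining host spill store capacity (an assumption; the
// real check may differ).
object SpillTargetSketch {
  sealed trait SpillTarget
  case object HostStore extends SpillTarget
  case object DiskStore extends SpillTarget

  def chooseTarget(bufferSize: Long, hostStoreUsed: Long, hostStoreLimit: Long): SpillTarget = {
    val wouldFit = hostStoreUsed + bufferSize <= hostStoreLimit
    if (!wouldFit) DiskStore  // behavior after this change: skip host, spill to disk
    else HostStore
  }

  def main(args: Array[String]): Unit = {
    val limit = 1L * 1024 * 1024 * 1024                         // ~1GB host spill store
    println(chooseTarget(2L * 1024 * 1024 * 1024, 0L, limit))   // 2GB batch -> DiskStore
    println(chooseTarget(256L * 1024 * 1024, 0L, limit))        // 256MB -> HostStore
  }
}
```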
By default the host store is configured to 1GB + pinnedPoolSize, and pinnedPoolSize defaults to 0. I think if we raised it to 2GB I'd agree; otherwise it seems fairly common to bump batchSizeBytes up to 2GB, and those batches would spill to disk.
That said, with such a small spill store, we are bound to spill to disk very often anyway... so I don't know how much we are really saving.
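For anyone tuning around this, a hedged example of the sizing discussed above. The config key names are my best recollection of the spark-rapids settings and should be verified against the plugin documentation; the values are illustrative only.

```scala
import org.apache.spark.SparkConf

// Illustrative sizing: raise the host spill store so batches near a 2GB
// batchSizeBytes target can still spill to host instead of skipping to disk.
object HostSpillSizingSketch {
  def main(args: Array[String]): Unit = {
    val conf = new SparkConf()
      // host spill store limit (the conversation notes a 1GB + pinnedPoolSize default)
      .set("spark.rapids.memory.host.spillStorageSize", (2L * 1024 * 1024 * 1024).toString)
      // pinned pool defaults to 0; a non-zero pool also grows the host store limit
      .set("spark.rapids.memory.pinnedPool.size", (1L * 1024 * 1024 * 1024).toString)
      // target batch size commonly bumped toward 2GB, which exceeds a 1GB host store
      .set("spark.rapids.sql.batchSizeBytes", Int.MaxValue.toString)
    conf.getAll.foreach { case (k, v) => println(s"$k=$v") }
  }
}
```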
@jlowe can you please take a look?
build
sql-plugin/src/main/scala/com/nvidia/spark/rapids/RapidsHostMemoryStore.scala (outdated review thread, resolved)
build
This also includes a small memory leak fix for the RapidsDiskStore that was exposed when I started testing spill this way.
This does not test the code path when we enable host memory limits instead of the host spill store limits. That will be covered by #8883, which should be the next thing I work on.
This fixes #8881