[FEA] Add Host Memory Retry for Row to Columnar Conversion #8887

Open
revans2 opened this issue Jul 31, 2023 · 5 comments · Fixed by #10011, #9929, #10450 or #10617
Assignees
Labels
reliability: Features to improve reliability or bugs that severely impact the reliability of the plugin
task: Work required that improves the product but is not user facing

Comments

@revans2
Collaborator

revans2 commented Jul 31, 2023

Is your feature request related to a problem? Please describe.
This gets to be a little difficult. We have two different implementations that we need to update to deal with this: GeneratedInternalRowToCudfRowIterator and RowToColumnarIterator.

#9862 will add limits on the amount of host memory being allocated, but we still need to make sure that the allocations can be retried if needed.

The generated version uses code generation to put the data into a host buffer with a fixed format, and then uses the GPU to transform that data into a Table on the GPU. This is relatively simple to deal with because we know up front how big a buffer we are going to allocate, and the logic for swapping buffers and allocating new ones is already in place. Switching this over to use the new allocation APIs with a blocking allocation is almost good enough (a rough sketch of that pattern is below). We just have to make sure that the buffer(s) we allocate can be made spillable after we finish inserting each row; this is where having a mutable spillable buffer would be nice. We also need a way to deal with a single really large row. But in general it is not too bad.
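To make the idea concrete, a minimal sketch of the blocking-allocation-plus-retry pattern for the generated path might look like the following. This is only an illustration: `allocHostBlocking`, `copyRowsIntoBuffer`, `makeSpillable`, and `SpillableBuffer` are hypothetical placeholders, not the plugin's actual APIs.

```scala
import org.apache.spark.sql.catalyst.InternalRow

// Hypothetical sketch: build the code-gen row buffer under a blocking host
// allocation, then hand it to the spill framework once it is fully written.
def buildCudfRowBuffer(rows: Iterator[InternalRow], bufferSize: Long): SpillableBuffer = {
  // Blocking allocation: this may pause the task until enough host memory is
  // available, or throw a retry exception that the surrounding retry
  // framework catches and replays.
  val hostBuf = allocHostBlocking(bufferSize)
  var succeeded = false
  try {
    // Fill the buffer with the code-generated row format. A single row that
    // is larger than bufferSize needs special handling (a larger allocation
    // or a row-at-a-time fallback).
    copyRowsIntoBuffer(rows, hostBuf)
    // Hand ownership to the spill framework so the buffer can be spilled
    // while we wait for our turn on the GPU.
    val spillable = makeSpillable(hostBuf)
    succeeded = true
    spillable
  } finally {
    // Do not leak the host buffer if copying or registration failed.
    if (!succeeded) {
      hostBuf.close()
    }
  }
}
```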

It also looks fairly simple to make the batches smaller so that we can chunk them, and then pass several of these chunks down to the kernel so we are not holding onto too much non-spillable CPU memory at a time (a rough sketch of the chunk sizing is below).
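The chunk sizing itself could be as simple as the following; the names (per-row estimate, host budget) are illustrative only and assume we already have some estimate of the bytes each row will take in the code-gen format.

```scala
// Hypothetical sketch: pick a per-chunk row count so that each chunk's
// non-spillable host footprint stays under a budget.
def rowsPerChunk(perRowEstimateBytes: Long, hostBudgetBytes: Long): Int = {
  // Always allow at least one row per chunk; a single row larger than the
  // budget needs its own special handling, as noted above.
  val rows = math.max(1L, hostBudgetBytes / math.max(1L, perRowEstimateBytes))
  math.min(rows, Int.MaxValue.toLong).toInt
}
```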

The other implementation, RowToColumnarIterator, is the slow path. It is much harder because we need a way to limit the host memory and still follow the rule that we never do a blocking allocation of host memory while holding on to non-spillable buffers. This is especially hard because all of the buffers we want to work with live behind the CUDF Java APIs.

The problem here is that there is a lot of coupling between the different parts of the code, and it might be simpler to copy much of the HostColumnVector.ColumnBuilder code into the plugin for the short term, until we understand exactly how to do this; then we can move it back once we have clean and generic APIs. Right now we create a GpuColumnarBatchBuilder inside the plugin with an estimate of the number of rows we want for that table. The buffers are allocated lazily and can grow at any point in time. There is no good way for us to say: do an initial allocation that looks a particular way, but don't grow the data; if you have to stop part way through a row, roll back anything that was partially written for that row; and make the buffers spillable in between each call to next to get another row.

I think what we want to do is allocate one host memory buffer that is close to a target batch size (possibly smaller if we want to chunk it). We can then slice that buffer up into smaller buffers corresponding to the estimated size we expect per column buffer. The initial guess would come from just the schema, but future guesses could be adjusted based on where earlier estimates turned out to be wrong. We would then populate these buffers with data, but each update would happen as a two-phase commit (sketched below). The first phase copies the data into the buffers, if there is room, and keeps track of which buffers did not have enough room. If every buffer had enough room, we commit the result and go on to the next row. If any of them did not have enough room, we do not commit the result; we update any stats we want for a better allocation next time, and process what we have, assuming that there is enough memory for at least a single row.
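A sketch of that two-phase commit, assuming a hypothetical `SlicedColumnBuilder` wrapper over one slice of the pre-allocated host buffer (not a real plugin or CUDF class): `tryAppend` writes past the committed offset without advancing it, `commit` advances it, and `rollback` discards the uncommitted write.

```scala
import org.apache.spark.sql.catalyst.InternalRow

// Hypothetical sketch of appending one row across all column slices with a
// two-phase commit, so no column is left holding a partially written row.
def appendRow(row: InternalRow, builders: Array[SlicedColumnBuilder]): Boolean = {
  // Phase 1: try to copy the row's value into every column's slice.
  var allFit = true
  var i = 0
  while (i < builders.length) {
    if (!builders(i).tryAppend(row, i)) {
      allFit = false
    }
    i += 1
  }
  // Phase 2: commit only if every column had room; otherwise roll back and
  // let the caller record the misestimate and flush the rows gathered so far.
  if (allFit) {
    builders.foreach(_.commit())
  } else {
    builders.foreach(_.rollback())
  }
  allFit
}
```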

@revans2 revans2 added ? - Needs Triage Need team to review and classify task Work required that improves the product but is not user facing reliability Features to improve reliability or bugs that severely impact the reliability of the plugin labels Jul 31, 2023
@mattahrens mattahrens removed the ? - Needs Triage Need team to review and classify label Aug 8, 2023
@revans2 revans2 changed the title [FEA] Add Host Memory Limits for Row to Columnar Conversion [FEA] Add Host Memory Retry for Row to Columnar Conversion Nov 27, 2023
@jbrennan333
Contributor

I put up #9929 as a first step for GeneratedInternalRowToCudfRowIterator.

@jbrennan333
Contributor

I put up another pr to improve this: #10450

@jbrennan333
Contributor

This should not have been closed. Only the first part (GeneratedInternalRowToCudfRowIterator) is done. I am still working on the second part (RowToColumnarIterator).

@jbrennan333
Contributor

I put up a draft PR #10617 that is a large part of the second part of this feature.
It allocates a single buffer up front and slices it up among the column builders.

@sameerz
Collaborator

sameerz commented Apr 22, 2024

Assigning to @jlowe for triage after #10617 was reverted
