Grab the GPU Semaphore when reading cached batch data with the GPU (#11991)

This fixes #11989,
except for the retry, for which I will file a follow-on issue.

---------

Signed-off-by: Robert (Bobby) Evans <bobby@apache.org>
revans2 authored Jan 27, 2025
1 parent 2fe7c6c commit 0aaeed9
Showing 1 changed file with 3 additions and 1 deletion.
@@ -1,5 +1,5 @@
 /*
- * Copyright (c) 2020-2024, NVIDIA CORPORATION.
+ * Copyright (c) 2020-2025, NVIDIA CORPORATION.
  *
  * Licensed under the Apache License, Version 2.0 (the "License");
  * you may not use this file except in compliance with the License.
@@ -487,13 +487,15 @@ protected class ParquetCachedBatchSerializer extends GpuCachedBatchSerializer {

     val cbRdd: RDD[ColumnarBatch] = input.map {
       case parquetCB: ParquetCachedBatch if parquetCB.sizeInBytes == 0 =>
+        GpuSemaphore.acquireIfNecessary(TaskContext.get())
         // If the buffer is empty, we have cached a batch with no columns, we don't need to decode
         // it, instead just return a ColumnarBatch with only rows
         withResource(new GpuColumnarBatchBuilder(originalSelectedAttributes.toStructType,
           parquetCB.numRows)) {
           builder => builder.build(parquetCB.numRows)
         }
       case parquetCB: ParquetCachedBatch =>
+        GpuSemaphore.acquireIfNecessary(TaskContext.get())
         val parquetOptions = ParquetOptions.builder()
           .includeColumn(selectedAttributes.map(_.name).asJavaCollection).build()
         val table = try {

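For readers unfamiliar with the pattern, the following is a minimal sketch of how an idempotent "acquire if necessary" call might behave: a task blocks until it holds a permit, and repeated calls from the same task do nothing. `TaskSemaphoreSketch`, its `Long` task ids, `release`, and `availablePermits` are hypothetical names chosen for illustration. This is not the spark-rapids `GpuSemaphore` implementation, which is called with a `TaskContext` as in the diff above. The sketch also assumes each task id is only used from one thread at a time.

```scala
import java.util.concurrent.{ConcurrentHashMap, Semaphore}

// Hypothetical stand-in for a per-task GPU semaphore (illustration only).
final class TaskSemaphoreSketch(maxConcurrentTasks: Int) {
  // Limits how many tasks may use the device at once.
  private val permits = new Semaphore(maxConcurrentTasks)
  // Ids of the tasks that currently hold a permit.
  private val holders = ConcurrentHashMap.newKeySet[Long]()

  /** Blocks until the task holds a permit; a no-op if it already holds one. */
  def acquireIfNecessary(taskId: Long): Unit = {
    if (!holders.contains(taskId)) {
      permits.acquire()
      holders.add(taskId)
    }
  }

  /** Returns the task's permit, if it holds one. */
  def release(taskId: Long): Unit = {
    if (holders.remove(taskId)) permits.release()
  }

  /** Number of permits not currently held by any task. */
  def availablePermits: Int = permits.availablePermits()
}
```

Because the call is idempotent in this sketch, it can be placed at the top of every branch that touches the device, as the commit does in both `case` arms, without a task ever taking more than one permit.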