Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Update how to manage host UDF instance #17770

Merged
merged 14 commits into from
Jan 23, 2025
11 changes: 10 additions & 1 deletion java/src/main/java/ai/rapids/cudf/Aggregation.java
Original file line number Diff line number Diff line change
Expand Up @@ -396,7 +396,16 @@ private HostUDFAggregation(HostUDFWrapper wrapper) {

@Override
long createNativeInstance() {
return Aggregation.createHostUDFAgg(wrapper.udfNativeHandle);
long udf = 0;
try {
udf = wrapper.createUDFInstance();
return Aggregation.createHostUDFAgg(udf);
Copy link
Contributor Author

@res-life res-life Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

UDF instance is hold by Aggregation instance via a unique_ptr, the Aggregation instance is responsible for the life cycle of UDF instance.
Here pass in a UDF instance and createHostUDFAgg clone it and construct a Aggregation instance, in the finally block, we close the original UDF instance.

} finally {
// a new UDF is cloned in `createHostUDFAgg`, here should close the UDF instance.
if (udf != 0) {
HostUDFWrapper.closeUDFInstance(udf);
}
}
}

@Override
Expand Down
56 changes: 47 additions & 9 deletions java/src/main/java/ai/rapids/cudf/HostUDFWrapper.java
Original file line number Diff line number Diff line change
@@ -1,5 +1,5 @@
/*
* Copyright (c) 2024, NVIDIA CORPORATION.
* Copyright (c) 2024-2025, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
Expand All @@ -19,18 +19,56 @@
/**
* A wrapper around native host UDF aggregations.
* <p>
* This class is used to store the native handle of a host UDF aggregation and is used as
* This class is used to create the native handle of a host UDF aggregation and is used as
* a proxy object to compute hash code and compare two host UDF aggregations for equality.
* <p>
* A new host UDF aggregation implementation must extend this class and override the
* {@code hashCode} and {@code equals} methods for such purposes.
* In addition, since this class implements {@code AutoCloseable}, the {@code close} method must
* also be overridden to automatically delete the native UDF instance upon class destruction.
* {@code computeHashCode} and {@code isEqual} methods for such purposes.
*
*/
public abstract class HostUDFWrapper implements AutoCloseable {
public final long udfNativeHandle;
public abstract class HostUDFWrapper {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pardon my ignorance, but why is this not an AutoCloseable any more?

Copy link
Contributor Author

@res-life res-life Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previsously HostUDFWrapper has a udfNativeHandle (UDF native instance), so it's AutoCloseable which is expected to release the UDF instance in it.
Now, the udfNativeHandle is removed, instead we create a function createUDFInstance to lazily create this udfNativeHandle and use closeUDFInstance it.
Now the createUDFInstance and closeUDFInstance are in the same class, and after UDF is created and used, we directly remove it, so this will reduce the possiblity of memory leak, and make the API more simple.


/**
* Create a derived host UDF native instance.
* The instance created by this function MUST be closed by `closeUDFInstance`
* <p>Typical usage, refer to Aggregation.java:</p>
* <pre>
* long udf = 0;
* try {
* udf = wrapper.createUDFInstance();
* return Aggregation.createHostUDFAgg(udf);
* } finally {
* // a new UDF is cloned in `createHostUDFAgg`, here should close the UDF instance.
* if (udf != 0) {
* HostUDFWrapper.closeUDFInstance(udf);
* }
* }
* </pre>
*
*/
public abstract long createUDFInstance();

/**
* Close the derived UDF instance created by `createUDFInstance`.
* @param hostUDFInstance the UDF instance
*/
public static void closeUDFInstance(long hostUDFInstance) {
close(hostUDFInstance);
}

public abstract int computeHashCode();

public HostUDFWrapper(long udfNativeHandle) {
this.udfNativeHandle = udfNativeHandle;
@Override
public int hashCode() {
return computeHashCode();
}

public abstract boolean isEqual(Object obj);

@Override
public boolean equals(Object obj) {
return isEqual(obj);
}

static native void close(long hostUDFInstance);
}
1 change: 1 addition & 0 deletions java/src/main/native/CMakeLists.txt
Original file line number Diff line number Diff line change
Expand Up @@ -148,6 +148,7 @@ add_library(
src/DataSourceHelperJni.cpp
src/HashJoinJni.cpp
src/HostMemoryBufferNativeUtilsJni.cpp
src/HostUDFWrapperJni.cpp
src/NvcompJni.cpp
src/NvtxRangeJni.cpp
src/NvtxUniqueRangeJni.cpp
Expand Down
34 changes: 34 additions & 0 deletions java/src/main/native/src/HostUDFWrapperJni.cpp
Original file line number Diff line number Diff line change
@@ -0,0 +1,34 @@
/*
* Copyright (c) 2025, NVIDIA CORPORATION.
*
* Licensed under the Apache License, Version 2.0 (the "License");
* you may not use this file except in compliance with the License.
* You may obtain a copy of the License at
*
* http://www.apache.org/licenses/LICENSE-2.0
*
* Unless required by applicable law or agreed to in writing, software
* distributed under the License is distributed on an "AS IS" BASIS,
* WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied.
* See the License for the specific language governing permissions and
* limitations under the License.
*/

#include "cudf_jni_apis.hpp"

#include <cudf/aggregation/host_udf.hpp>

extern "C" {

JNIEXPORT void JNICALL Java_ai_rapids_cudf_HostUDFWrapper_close(JNIEnv* env,
jclass class_object,
jlong ptr)
{
try {
auto to_del = reinterpret_cast<cudf::host_udf_base*>(ptr);
delete to_del;
Comment on lines +28 to +29
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

👍

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Here we can colse all kind of derived UDF instances.

Copy link
Contributor Author

@res-life res-life Jan 23, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Previously, it's in the derived class, link.

}
CATCH_STD(env, );
}

} // extern "C"
Loading