-
Notifications
You must be signed in to change notification settings - Fork 1.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add all_match Presto function #3356
Conversation
✅ Deploy Preview for meta-velox canceled.
|
ebfca34
to
24d346c
Compare
Hi @duanmeng. Are you still working on this function? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you for opening the PR and working on the function! I left a few comments, I would appreciate it if you could take a look.
336e301
to
587f8a1
Compare
@isadikov I've resolved all the comments, could you help to have a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good, thanks for working on this. I left a couple of nit comments.
@isadikov I've resolved the rest of nit comments, could you help to have a look? |
Looks good, thanks for working on this! @mbasmanova Could you take a look and review this PR that adds all_match function? Thank you! |
@mbasmanova Could you please help to take a look, I've resolved all the comments, thanks. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@duanmeng Looks good overall. Some comments below.
edae9dd
to
731f937
Compare
@mbasmanova I've resolved all the comments and added the error handling logic and related UT, could you help to take a look? |
212f8fb
to
c394eb4
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@duanmeng Thank you for iterating on this PR. Looks good to me % a couple comments.
Returns true if all the elements match the predicate (a special case is when the array is empty); | ||
Returns false if one or more elements don’t match; | ||
Returns NULL if the predicate function returns NULL for one or more elements and true for all other elements. | ||
Throw exceptions if the predicate function has errors on some elements and returns true or NULL on others. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
typos: Throws an exception if the predicate fails for one or more elements and returns true or NULL for the rest.
auto allMatch = true; | ||
auto hasNull = false; | ||
auto hasError = | ||
errors && row < errors->size() && !errors->isNullAt(row); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
perhaps, add a helper method to EvalCtx for readability:
auto hasError = context.hasError(row);
Also, this variable is not used until late. Let's move it next to first use.
if (hasError) { | ||
auto err = std::static_pointer_cast<std::exception_ptr>( | ||
context.errors()->valueAt(row)); | ||
std::rethrow_exception(*err); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We need to make sure we throw VeloxException. See throwError in EvalCtx.cpp.
auto throwError(const std::exception_ptr& exceptionPtr) {
std::rethrow_exception(toVeloxException(exceptionPtr));
}
Consider moving this logic into a helper method in EvalCtx to avoid copy-paste.
context.throwIfHasError(row);
exec::LocalDecodedVector bitsDecoder(context); | ||
auto it = args[1]->asUnchecked<FunctionVector>()->iterator(&rows); | ||
|
||
// Turns off the ThrowOnError flag and temporarily resets errors in the |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Use active verbs: Turn off ... and reset
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
auto result = evaluate<SimpleVector<bool>>( | ||
"all_match(c0, x -> ((10 / x) > 2))", makeRowVector({input})); | ||
|
||
auto expectedResult = makeNullableFlatVector<bool>({false, false}); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
makeNullableFlatVector -> makeFlatVector
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
} | ||
|
||
TEST_F(ArrayAllMatchTest, errorReThrow) { | ||
static constexpr std::string_view errorMessage{"division by zero"}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
naming: kErrorMessage and kErrorCode
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
static constexpr std::string_view errorMessage{"division by zero"}; | ||
static constexpr std::string_view errorCode{"ARITHMETIC_ERROR"}; | ||
|
||
try { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
use VELOX_ASSERT_THROW
Let's also verify that these expressions work properly with TRY.
- TRY(all_match(c0, x -> ))
- all_match(c0, x -> TRY()))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Fixed
velox/expression/LambdaExpr.cpp
Outdated
const BufferPtr& wrapCapture, | ||
EvalCtx* context, | ||
const std::vector<VectorPtr>& args, | ||
VectorPtr& errors, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
VectorPtr -> ErrorVectorPtr (sorry, I got it wrong in my earlier suggestion)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
OK, done.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@duanmeng Looks great % minor comments.
velox/expression/EvalCtx.h
Outdated
} | ||
|
||
/// Extract the actual exception of the row | ||
std::exception_ptr getError(vector_size_t row) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These new methods seem unused. Let's remove.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
rows.end(), | ||
std::move(allVectors)); | ||
EvalCtx lambdaCtx(context->execCtx(), context->exprSet(), row.get()); | ||
auto row = createRowVector(context, wrapCapture, args, rows.end()); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice refactor.
velox/expression/LambdaExpr.cpp
Outdated
transformErrorVector(lambdaCtx, context, rows, elementToTopLevelRows); | ||
} | ||
|
||
void applyNoThrowError( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
applyNoThrowError -> applyNoThrow for consistency (see EvalCtx::applyToSelectedNoThrow)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
auto allMatch = true; | ||
auto hasNull = false; | ||
std::exception_ptr errorPtr = nullptr; | ||
auto hasError = false; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe we don't need hasError variable. We can just check if errorPtr is null or not.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
auto hasError = false; | ||
for (auto i = 0; i < size; ++i) { | ||
auto idx = offset + i; | ||
if (elementErrors && idx < elementErrors->size() && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: consider extracting a helper function for readability
if (hasError(elementErrors, idx))
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
assertEqualVectors(expectedResult, result); | ||
} | ||
|
||
TEST_F(ArrayAllMatchTest, errorSuppress) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
errorSuppress -> errors
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
assertEqualVectors(expectedResult, result); | ||
} | ||
|
||
TEST_F(ArrayAllMatchTest, errorReThrow) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
maybe combine this test case with the previous one
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
kErrorMessage); | ||
} | ||
|
||
TEST_F(ArrayAllMatchTest, withTrys) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
makeRowVector({makeArrayVector<int8_t>({{1, 0}, {1}, {6}})})); | ||
expectedResult = makeNullableFlatVector<bool>({std::nullopt, true, false}); | ||
assertEqualVectors(expectedResult, result); | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let's add a test case with multiple lambdas. See TransformTest.cpp:
// Test different lambdas applied to different rows
TEST_F(TransformTest, conditional) {
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@mbasmanova I've resolved all comments, combined the error test cases, and added conditional multiple lambdas test cases, could you please help to have a look? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@duanmeng Looks great. Thank you for your patience.
kErrorMessage); | ||
VELOX_ASSERT_THROW( | ||
evaluate<SimpleVector<bool>>( | ||
"all_match(c0, x -> ((10 / x) > 2))", makeRowVector({errorInput})), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: consider changing errorInput to include makeRowVector, so it can be used directly and drop template parameter for readability:
VELOX_ASSERT_THROW(
evaluate("all_match(c0, x -> ((10 / x) > 2))", errorInput),
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
auto input = makeRowVector({makeNullableArrayVector<int8_t>( | ||
{{0, 2, 0, 5, 0}, {5, std::nullopt, 0}})}); | ||
auto result = | ||
evaluate<SimpleVector<bool>>("all_match(c0, x -> ((10 / x) > 2))", input); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: can drop template parameters here and in other similar places as well
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Done
@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
velox/vector/FunctionVector.h
Outdated
@@ -15,14 +15,11 @@ | |||
*/ | |||
#pragma once | |||
|
|||
#include "velox/expression/EvalCtx.h" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this introduces circular dependency: vectors depend expressions and expressions depend on vectors. We need to figure out how to break that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe move ErrorVectorPtr to FlatVector.h.
using ErrorVector = FlatVector<std::shared_ptr<void>>;
using ErrorVectorPtr = std::shared_ptr<ErrorVector>;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, done. ErrorVectorPtr
should be declared in the vectors scope.
@mbasmanova has imported this pull request. If you are a Meta employee, you can view this diff on Phabricator. |
@mbasmanova merged this pull request in d8c747f. |
@duanmeng Now that the all_match Presto function is merged into Velox, it would be nice to follow up with an e2e test case in Prestissimo to ensure this function works correctly end-to-end: presto-native-execution/src/test/java/com/facebook/presto/nativeworker/TestHiveArrayFunctionQueries.java |
Summary: all_match(array(T), function(T, boolean)) → boolean >Returns whether all elements of an array match the given predicate. Returns true if all the elements match the predicate (a special case is when the array is empty); false if one or more elements don’t match; NULL if the predicate function returns NULL for one or more elements and true for all other elements. Pull Request resolved: facebookincubator#3356 Reviewed By: Yuhta Differential Revision: D44165931 Pulled By: mbasmanova fbshipit-source-id: a23bc2619c3d2befcaf2aab9a373415ecc8f378a
…d of all partitions (facebookincubator#3356)
all_match(array(T), function(T, boolean)) → boolean