-
Notifications
You must be signed in to change notification settings - Fork 116
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
8371297134 Type promotion between integral and floating types #2173
base: master
Are you sure you want to change the base?
Conversation
Update tests: I deliberately promote to float64 here so we don't lose precision from the int32
…get that just returns a bool To prevent dangerous uses where only the static_cast<bool> of its return value was used, but where the type to be promoted to was not the second argument given to it.
cpp/arcticdb/entity/type_utils.cpp
Outdated
|
||
auto target_size = entity::SizeBits::UNKNOWN_SIZE_BITS; | ||
auto floating_size = is_floating_point_type(left_type) ? left_size : right_size; | ||
auto integral_size = is_floating_point_type(left_type) ? right_size : left_size; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: I think using is_integer_type
would be easier to follow.
cpp/arcticdb/entity/type_utils.cpp
Outdated
} else { | ||
// Non-numeric target type | ||
return std::nullopt; | ||
return false; | ||
} | ||
} else if (is_floating_point_type(source_type)) { | ||
if (is_unsigned_type(target_type) || is_signed_type(target_type)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not part of the change but is a bit more readable
if (is_unsigned_type(target_type) || is_signed_type(target_type)) { | |
if (is_integer_type(target_type)) { |
} | ||
|
||
return target; | ||
auto target_size = entity::SizeBits(uint8_t(std::max(left_size, right_size)) + 1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I get why this works but it seems a bit hacky and fragile.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is just the old logic moved around, so I'd rather not change it
cpp/arcticdb/entity/type_utils.cpp
Outdated
return (is_integer_type(left) && is_floating_point_type(right)) || (is_floating_point_type(left) && is_integer_type(right)); | ||
} | ||
|
||
std::optional<entity::TypeDescriptor> common_type_float_integer(const entity::TypeDescriptor& left, const entity::TypeDescriptor& right) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: If used only in this file mark it as static
std::optional<entity::TypeDescriptor> common_type_float_integer(const entity::TypeDescriptor& left, const entity::TypeDescriptor& right) { | |
static std::optional<entity::TypeDescriptor> common_type_float_integer(const entity::TypeDescriptor& left, const entity::TypeDescriptor& right) { |
cpp/arcticdb/entity/type_utils.cpp
Outdated
} | ||
} | ||
|
||
std::optional<entity::TypeDescriptor> common_type_mixed_sign_ints(const entity::TypeDescriptor& left, const entity::TypeDescriptor& right) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
std::optional<entity::TypeDescriptor> common_type_mixed_sign_ints(const entity::TypeDescriptor& left, const entity::TypeDescriptor& right) { | |
static std::optional<entity::TypeDescriptor> common_type_mixed_sign_ints(const entity::TypeDescriptor& left, const entity::TypeDescriptor& right) { |
… tolerance as a result
To fix the bug reported in 8371297134
I've added logic so that when we merge descriptors we combine:
This is because a 16 bit integer can safely fit in a 32 bit float without loss of precision, whereas a 32 bit integer cannot. We could instead always promote up to float 64 which would be more wasteful but simpler.
Separately, for query builder pipelines I added changes so that when combining a float and an integer we always promote up to float64. This misses the cases where it is actually safe to promote to float32, but is simpler and matches Pandas. This is the cause of the test change in
test_sort_merge.py
.I reworked
has_valid_type_promotion
, introducing a new functionis_valid_type_promotion_to_target
instead. This is because thehas_valid_type_promotion
signature was dangerous - it returned an optional type that was often interpreted only as a bool (the actual type inside it was ignored). I've replace the call sites that only tested the bool with calls to the newis_valid_type_promotion_to_target
that only returns a bool.