GH-32276: [C++][FlightRPC] Align buffers from Flight #35679
Given more time we might be able to avoid the extra copies, but I'd need to dig deeper into Protobuf & do some benchmarking. (In particular: instead of going gRPC slices -> CodedInputStream -> Arrow buffers, I'd rather directly go from gRPC slices -> Arrow.)
cda6048 to c9d1b82
// XXX: due to where we sit, we can't use a custom allocator
// XXX: any error here will likely crash or hang gRPC!
auto status =
    util::EnsureAlignment(std::move(out->body), 64, default_memory_pool())
Isn't 64 a bit heavy-handed? Basically, and assuming the distribution of gRPC slice alignments is uniform, we will reallocate almost all incoming buffers?
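To put a rough number on the question above, here is a small sketch of the arithmetic. It takes the uniform-distribution assumption from the comment at face value; the `granularity` parameter and both function names are illustrative and are not part of Arrow or gRPC.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Fraction of start addresses that already satisfy `alignment`, assuming the
// addresses are spread uniformly over multiples of `granularity`.
// Both arguments are expected to be powers of two.
constexpr double AlignedFraction(std::size_t alignment,
                                 std::size_t granularity = 1) {
  return granularity >= alignment
             ? 1.0
             : static_cast<double>(granularity) / static_cast<double>(alignment);
}

// True when `ptr` is a multiple of `alignment` bytes.
inline bool IsAligned(const void* ptr, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}
```

Under that assumption, `AlignedFraction(64)` is 1/64, so about 98.4% of incoming buffers would be copied, while `AlignedFraction(8)` is 1/8, so 87.5% would be. If slices happened to start on 16-byte boundaries, `AlignedFraction(64, 16)` gives 1/4. Actual gRPC slice alignments would need to be measured.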
This should be configurable as not all code using the C++ flight client is sensitive to alignment.
Misaligned data pointers can be a significant hurdle when working with client libraries that require proper alignment. If these libraries offered better built-in support for data alignment, it would greatly reduce the challenges for developers.
Ensuring data is aligned by default leads to a smoother, more efficient developer experience.
@@ -380,6 +382,14 @@ ::grpc::Status FlightDataDeserialize(ByteBuffer* buffer,
      return ::grpc::Status(::grpc::StatusCode::INTERNAL,
                            "Unable to read FlightData body");
    }
    // XXX: due to where we sit, we can't use a custom allocator
    // XXX: any error here will likely crash or hang gRPC!
This seems to be a problem and could actually be a regression for current users of the Flight C++ server, no?
This is true of any error returned here. Last I checked, gRPC doesn't actually handle errors despite this being a fallible API, instead choosing to crash.
I'm skeptical that we should force this onto users of Flight C++, many of whom may not be concerned with misaligned buffers.
Given the nature of the gRPC API there's not a great way to customize it at runtime. If someone complains about this in the future we can point them here then. Possibly we can make it a compile-time toggle, though that's effectively useless to most users. @westonpace @rtpsw were the most recent to mention this. If there's not a real desire for this then I'll close the PR.
This could be settable using an env var. But I don't think we should realign buffers by default. This can decrease performance and increase memory fragmentation for unsuspecting users.
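For illustration, an opt-in environment variable along those lines might look like the sketch below. The variable name `EXAMPLE_FLIGHT_BUFFER_ALIGNMENT` and both helper functions are hypothetical; nothing here is an existing Arrow setting. Realignment stays off unless the variable holds a positive power of two.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdlib>

// Hypothetical variable name, for illustration only; not an Arrow setting.
constexpr const char* kAlignmentEnvVar = "EXAMPLE_FLIGHT_BUFFER_ALIGNMENT";

// Parses a requested alignment. Returns 0 (realignment disabled) unless the
// text is a positive power of two written in decimal with no trailing junk.
inline std::int64_t ParseAlignment(const char* text) {
  if (text == nullptr || *text == '\0') return 0;
  char* end = nullptr;
  const long long value = std::strtoll(text, &end, 10);
  if (*end != '\0' || value <= 0) return 0;   // non-numeric or non-positive
  if ((value & (value - 1)) != 0) return 0;   // not a power of two
  return static_cast<std::int64_t>(value);
}

// Reads the variable once per call; an unset variable means "do not realign".
inline std::int64_t AlignmentFromEnv() {
  return ParseAlignment(std::getenv(kAlignmentEnvVar));
}
```

With this shape the default behavior is unchanged, and the deserializer would only pay for the extra copy when `AlignmentFromEnv()` returns a nonzero value.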
Now that we have a workaround enabled in Acero I don't have any urgent need for this. I'm fine with waiting for it to become a problem (and it may never do so).
The Arrow standard requires it.
Hmm, the language in our columnar spec is weird. Alignment has no defined meaning in an IPC stream (but padding has).
Rationale for this change
Ensure buffers from Flight are at least 8-byte aligned to meet general expectations of downstream users.
What changes are included in this PR?
Manually align buffers after deserialization.
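As a rough, standard-library-only sketch of what "align after deserialization" amounts to (this is not the implementation of Arrow's `util::EnsureAlignment`, and `AlignedView` and `AlignOrCopy` are hypothetical names): leave the bytes alone when the pointer is already aligned, otherwise allocate aligned storage and copy.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstdlib>
#include <cstring>
#include <memory>

struct FreeDeleter {
  void operator()(std::uint8_t* p) const { std::free(p); }
};
using AlignedBytes = std::unique_ptr<std::uint8_t[], FreeDeleter>;

inline bool IsAligned(const void* ptr, std::size_t alignment) {
  return reinterpret_cast<std::uintptr_t>(ptr) % alignment == 0;
}

struct AlignedView {
  const std::uint8_t* data;  // the original pointer, or the owned copy
  AlignedBytes owned;        // non-null only when a copy was made
};

// `alignment` must be a power of two. Returns {nullptr, nullptr} if the
// allocation fails. Assumes a platform that provides std::aligned_alloc
// (C++17; not available with MSVC).
inline AlignedView AlignOrCopy(const std::uint8_t* src, std::size_t size,
                               std::size_t alignment) {
  if (IsAligned(src, alignment)) return {src, nullptr};  // no copy needed
  // std::aligned_alloc expects the size to be a multiple of the alignment.
  std::size_t padded = (size + alignment - 1) / alignment * alignment;
  if (padded == 0) padded = alignment;
  auto* dst = static_cast<std::uint8_t*>(std::aligned_alloc(alignment, padded));
  if (dst == nullptr) return {nullptr, nullptr};
  std::memcpy(dst, src, size);  // this is the additional copy
  return {dst, AlignedBytes(dst)};
}
```

The copy only happens on the misaligned path, so the cost depends on how often incoming bodies are already aligned to the requested boundary.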
Are these changes tested?
Yes
Are there any user-facing changes?
Flight data will be aligned, at the cost of an additional copy.