I have Arrow record batches in the IPC format in memory (e.g. as {{&[u8]}}) and would like to create a vector of batches while avoiding as many memory copies as possible. It would be great if there were a way to create the vector without having to go through the file abstraction.
I might be misunderstanding how the file reader works, and maybe it does not incur memory copies. I think it does, though, since creating Arrow record batches from a larger Arrow buffer takes much longer.
Comment from Andrew Lamb(alamb) @ 2021-03-08T22:10:56.189+0000:
[~domoritz] I wonder if you mean this reader: https://docs.rs/arrow/3.0.0/arrow/ipc/reader/struct.FileReader.html#method.try_new
If so, while it is called a `FileReader` I think that is somewhat misleading. It requires something that implements `std::io::Read` -- which `&[u8]` does.
https://doc.rust-lang.org/std/io/trait.Read.html#impl-Read-2
Comment from Dominik Moritz(domoritz) @ 2021-03-09T08:12:41.105+0000:
But {{&[u8]}} does not seem to implement Seek so FileReader does not work.
The error is:
{{the trait bound `&[u8]: Seek` is not satisfied}}
{{ required by `FileReader::<R>::try_new`}}
If I switch to the StreamReader, I get an IO error at runtime:
{{Io error: failed to fill whole buffer}}
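For context, the wording of that message matches what the standard library's `Read::read_exact` returns when the input ends before the requested number of bytes has been read. The sketch below reproduces it with plain `std` and no Arrow types. `read_exactly` is a hypothetical helper, and whether `StreamReader` hits exactly this path internally is an assumption, not something confirmed in this thread.

```rust
use std::io::{ErrorKind, Read};

// Hypothetical helper: try to read exactly `n` bytes from an in-memory slice.
// `&[u8]` implements `Read`, so `read_exact` is available on it directly.
fn read_exactly(mut input: &[u8], n: usize) -> std::io::Result<Vec<u8>> {
    let mut buf = vec![0u8; n];
    // Fails with `UnexpectedEof` if fewer than `n` bytes remain in `input`.
    input.read_exact(&mut buf)?;
    Ok(buf)
}

fn main() {
    // Ask for 8 bytes when only 3 are available.
    let err = read_exactly(b"abc", 8).unwrap_err();
    assert_eq!(err.kind(), ErrorKind::UnexpectedEof);
    println!("{}", err);
}
```

A reader that misjudges how many bytes the next message should contain would surface this kind of error, which is one plausible reading of the failure above.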
So what I implemented was
{{let cursor = std::io::Cursor::new(contents);}}
{{ let reader = match arrow::ipc::reader::FileReader::try_new(cursor) {}}
{{ Ok(reader) => reader,}}
{{ Err(error) => return Err(format!("{}", error).into()),}}
{{ };}}
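A minimal, std-only sketch of why the `Cursor` wrapper satisfies the `Read + Seek` bound without copying the input: `Cursor<&[u8]>` borrows the slice and only tracks a position. `read_tail` is a hypothetical helper standing in for any API with that bound, and the bytes are placeholders rather than valid IPC data. This says nothing about copies made later, when the reader decodes batches.

```rust
use std::io::{Cursor, Read, Seek, SeekFrom};

// Hypothetical helper with the same `Read + Seek` bound the file reader asks for.
// It seeks to `n` bytes before the end and returns the remaining bytes.
fn read_tail<R: Read + Seek>(mut reader: R, n: u64) -> std::io::Result<Vec<u8>> {
    reader.seek(SeekFrom::End(-(n as i64)))?;
    let mut tail = Vec::new();
    reader.read_to_end(&mut tail)?;
    Ok(tail)
}

fn main() -> std::io::Result<()> {
    // Placeholder bytes, not a real Arrow IPC file.
    let contents: &[u8] = b"ARROW1 placeholder bytes ARROW1";

    // Wrapping the slice does not copy it: the cursor holds the same pointer.
    let cursor = Cursor::new(contents);
    assert!(std::ptr::eq(cursor.get_ref().as_ptr(), contents.as_ptr()));

    // A bare `&[u8]` would not compile here because it does not implement `Seek`.
    let tail = read_tail(cursor, 6)?;
    assert_eq!(tail, b"ARROW1".to_vec());
    Ok(())
}
```

So the wrapper itself is essentially free; any remaining cost would come from what the reader does with the bytes after reading them.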
Note: migrated from original JIRA: https://issues.apache.org/jira/browse/ARROW-11696