feat: support fastlanes bitpacking #2886
Merged
35 commits
c089644 feature: support fastlanes bitpacking for uint8 type (broccoliSpicy)
a1e3cdf minor fix (broccoliSpicy)
3f340a3 fix a bug, add self.buffer_offset in byte range (broccoliSpicy)
9a6c489 minor fix 2 (broccoliSpicy)
f55a445 feat: add fastlanes bitpacking for other types (broccoliSpicy)
7c21438 address initial PR comments (broccoliSpicy)
3b82ec5 Merge branch 'main' into fastlanes (broccoliSpicy)
f0bd3a8 fix lint (broccoliSpicy)
ce0f798 return a slice of LanceBuffer in `decode` (broccoliSpicy)
fb9ede2 use `elems_per_chunk` constant to represent 1024, delete (broccoliSpicy)
3ad773c use macro in encode method (broccoliSpicy)
403e89d Don't pass strings to the choose_array_encoder method when choosing a… (westonpace)
0ae8362 fix a bug in `bitpacked_for_non_neg_decode` (broccoliSpicy)
23e261c add stable rust fastlanes (broccoliSpicy)
3f92fcd Merge remote-tracking branch 'refs/remotes/origin/fastlanes' into fas… (broccoliSpicy)
dba9a48 fix lint (broccoliSpicy)
1eb75e2 remove external fastlanes crate (broccoliSpicy)
1485759 license header (broccoliSpicy)
8543f54 fix lint (broccoliSpicy)
ee78fc6 delete a unnecessary file rust/lance-encoding/compression-algo/mod.rs (broccoliSpicy)
fe3fda8 delete two redundant file (broccoliSpicy)
697af4a hangle nullable and all null data block in `encode`. (broccoliSpicy)
922c2fe fix `choose_array_encoder` issue when enable V2.1 (broccoliSpicy)
f09cad7 fix lint (broccoliSpicy)
fc89bf4 fix a bug scheduling ranges for data types other than 32-bit width (broccoliSpicy)
d5b9201 Make sure to use version 2.1 in tests for bitpacking (westonpace)
13b757a make `locate_chunk_start` and `locate_chunk_end` a method (broccoliSpicy)
ca4dba3 Merge branch 'main' into fastlanes (broccoliSpicy)
f42af4c add test_pack (broccoliSpicy)
1c2878b add test_unchecked_pack (broccoliSpicy)
688bb1f address PR comments (broccoliSpicy)
4d7557f Update rust/lance-encoding/src/buffer.rs (broccoliSpicy)
c7ecb08 Merge branch 'fix/use-v2-1-on-bitpack-tests' of https://github.com/we… (broccoliSpicy)
dedb306 fix fastlanes original code link (broccoliSpicy)
655a063 lint (broccoliSpicy)
@@ -283,6 +283,32 @@ impl LanceBuffer {
     pub fn copy_array<const N: usize>(array: [u8; N]) -> Self {
         Self::Owned(Vec::from(array))
     }
+
+    #[allow(clippy::len_without_is_empty)]
+    pub fn len(&self) -> usize {
+        match self {
+            Self::Borrowed(buffer) => buffer.len(),
+            Self::Owned(buffer) => buffer.len(),
+        }
+    }
+
+    /// Returns a new [LanceBuffer] that is a slice of this buffer starting at `offset`,
+    /// with `length` bytes.
+    /// Doing so allows the same memory region to be shared between lance buffers.
+    /// # Panics
+    /// Panics if `(offset + length)` is larger than the existing length.
+    /// If the buffer is owned this method will require a copy.
+    pub fn slice_with_length(&self, offset: usize, length: usize) -> Self {
+        let original_buffer_len = self.len();
+        assert!(
+            offset.saturating_add(length) <= original_buffer_len,
+            "the offset + length of the sliced Buffer cannot exceed the existing length"
+        );
+        match self {
+            Self::Borrowed(buffer) => Self::Borrowed(buffer.slice_with_length(offset, length)),
+            Self::Owned(buffer) => Self::Owned(buffer[offset..offset + length].to_vec()),
+        }
+    }
 }
 
 impl AsRef<[u8]> for LanceBuffer {

Review comment on `#[allow(clippy::len_without_is_empty)]`: We might as well add an `is_empty`.
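
To make the `is_empty` suggestion and the slicing semantics concrete, here is a minimal sketch on a stand-in type. `SketchBuffer` is hypothetical and is not the real `LanceBuffer`: its `Borrowed` variant is assumed here to be a shared `Arc<[u8]>` plus a byte range, and `as_slice` is an illustrative helper. The point is only that a borrowed slice shares memory while an owned slice copies, and that `is_empty` can be defined directly from `len`.

```rust
use std::sync::Arc;

/// Hypothetical stand-in for `LanceBuffer` (an assumption for illustration).
/// `Borrowed` shares one allocation; `Owned` holds its own `Vec<u8>`.
#[derive(Debug, Clone)]
enum SketchBuffer {
    Borrowed { data: Arc<[u8]>, start: usize, len: usize },
    Owned(Vec<u8>),
}

impl SketchBuffer {
    fn len(&self) -> usize {
        match self {
            Self::Borrowed { len, .. } => *len,
            Self::Owned(v) => v.len(),
        }
    }

    /// Companion to `len`, as suggested in review; with it in place the
    /// `len_without_is_empty` allow attribute would no longer be needed.
    fn is_empty(&self) -> bool {
        self.len() == 0
    }

    /// Illustrative helper: view the bytes this buffer currently covers.
    fn as_slice(&self) -> &[u8] {
        match self {
            Self::Borrowed { data, start, len } => &data[*start..*start + *len],
            Self::Owned(v) => v.as_slice(),
        }
    }

    /// Same contract as the method in the diff: panics if `offset + length`
    /// exceeds the current length; copies only in the owned case.
    fn slice_with_length(&self, offset: usize, length: usize) -> Self {
        assert!(
            offset.saturating_add(length) <= self.len(),
            "the offset + length of the sliced buffer cannot exceed the existing length"
        );
        match self {
            // Zero-copy: bump the reference count and narrow the range.
            Self::Borrowed { data, start, .. } => Self::Borrowed {
                data: Arc::clone(data),
                start: start + offset,
                len: length,
            },
            // Owned data has no shared handle, so the bytes are copied.
            Self::Owned(v) => Self::Owned(v[offset..offset + length].to_vec()),
        }
    }
}
```

A zero-length slice at the very end of the buffer (`offset == len`, `length == 0`) passes the assertion and yields an empty buffer, which is the case where `is_empty` is most useful to callers.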
|
What do you think `BitpackedWithNeg` will look like?
I plan to make it a cascading encoding: a `BTreeMap` from `row number index` to `real value` for the few very wide (in bit width) values, i.e. the negative values, followed by `bitpacking`. For arrays that have too many negative values (for example, 50 percent), I think we should not use `bitpacking` at all.
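
A rough sketch of the idea described in this comment, under stated assumptions: `split_exceptions`, `bit_width`, `restore`, and `worth_patching` are hypothetical names invented for illustration, and nothing here is the design that was eventually implemented. Negative values are pulled out into a `BTreeMap` keyed by row index and replaced by 0, so the remaining values determine the bit width that a bitpacking step would use.

```rust
use std::collections::BTreeMap;

/// Separate negative values (the "wide" ones) from the rest.
/// Returns the non-negative stream to bitpack and a row -> value exception map.
fn split_exceptions(values: &[i32]) -> (Vec<u32>, BTreeMap<usize, i32>) {
    let mut exceptions = BTreeMap::new();
    let packed_input: Vec<u32> = values
        .iter()
        .enumerate()
        .map(|(row, &v)| {
            if v < 0 {
                exceptions.insert(row, v);
                0 // placeholder; the real value lives in the map
            } else {
                v as u32
            }
        })
        .collect();
    (packed_input, exceptions)
}

/// Number of bits needed for the largest value (0 if all values are 0).
fn bit_width(values: &[u32]) -> u32 {
    values
        .iter()
        .copied()
        .max()
        .map_or(0, |m| u32::BITS - m.leading_zeros())
}

/// Undo `split_exceptions`: write each exception back at its row.
fn restore(packed_input: &[u32], exceptions: &BTreeMap<usize, i32>) -> Vec<i32> {
    let mut out: Vec<i32> = packed_input.iter().map(|&v| v as i32).collect();
    for (&row, &v) in exceptions {
        out[row] = v;
    }
    out
}

/// Heuristic from the comment: skip this scheme when too many values are
/// negative. `max_fraction` is an assumed tuning knob (e.g. 0.5).
fn worth_patching(values: &[i32], max_fraction: f64) -> bool {
    let negatives = values.iter().filter(|&&v| v < 0).count();
    (negatives as f64) <= max_fraction * values.len() as f64
}
```

For the input `[3, 7, -1, 5, 0, -200, 6]`, the two negative rows go into the map and the remaining values fit in 3 bits, whereas storing the negatives inline would force the full 32-bit width.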
In theory these should still be bitpackable, right? For example, if all data is between `[-8, 8]` then we could shift to `[0, 16]` and bitpack. Although I suppose frame-of-reference would do that for us 🤔
Yeah, I think there may be some literature on how to combine lightweight integer encoding algorithms in an intuitive way. I will research that.
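
The shift discussed above can be sketched as frame-of-reference followed by a generic bitpack. This is an illustration only: the function names are hypothetical, and `pack`/`unpack` use a plain little-endian bit stream, not the FastLanes 1024-element layout this PR implements. Note that values in `[-8, 8]` shift to `[0, 16]`, and 16 needs 5 bits.

```rust
/// Frame of reference: subtract the minimum so every delta is non-negative.
fn for_encode(values: &[i32]) -> (i32, Vec<u32>) {
    let reference = values.iter().copied().min().unwrap_or(0);
    // i64 arithmetic avoids overflow when the range spans most of i32.
    let shifted: Vec<u32> = values
        .iter()
        .map(|&v| (v as i64 - reference as i64) as u32)
        .collect();
    (reference, shifted)
}

fn for_decode(reference: i32, shifted: &[u32]) -> Vec<i32> {
    shifted
        .iter()
        .map(|&d| (reference as i64 + d as i64) as i32)
        .collect()
}

/// Bits needed for the largest value (0 if all values are 0).
fn bit_width(values: &[u32]) -> u32 {
    values
        .iter()
        .copied()
        .max()
        .map_or(0, |m| u32::BITS - m.leading_zeros())
}

/// Pack each value into `width` bits, least significant bits first.
fn pack(values: &[u32], width: u32) -> Vec<u8> {
    let mut out = Vec::new();
    let (mut acc, mut bits) = (0u64, 0u32);
    for &v in values {
        acc |= (v as u64) << bits;
        bits += width;
        while bits >= 8 {
            out.push(acc as u8); // emit the low byte
            acc >>= 8;
            bits -= 8;
        }
    }
    if bits > 0 {
        out.push(acc as u8); // final partial byte
    }
    out
}

/// Inverse of `pack`: read `count` values of `width` bits each.
fn unpack(bytes: &[u8], width: u32, count: usize) -> Vec<u32> {
    let mask = if width == 0 { 0 } else { (1u64 << width) - 1 };
    let mut out = Vec::with_capacity(count);
    let (mut acc, mut bits) = (0u64, 0u32);
    let mut next = bytes.iter();
    for _ in 0..count {
        while bits < width {
            acc |= (*next.next().expect("truncated input") as u64) << bits;
            bits += 8;
        }
        out.push((acc & mask) as u32);
        acc >>= width;
        bits -= width;
    }
    out
}
```

With this sketch, five values drawn from `[-8, 8]` occupy 25 bits (4 bytes) instead of 20 bytes as raw `i32`, and the only extra metadata is the reference value and the bit width.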