feat: support fastlanes bitpacking #2886

Merged
merged 35 commits into from
Sep 27, 2024
Commits
c089644
feature: support fastlanes bitpacking for uint8 type
broccoliSpicy Sep 16, 2024
a1e3cdf
minor fix
broccoliSpicy Sep 16, 2024
3f340a3
fix a bug, add self.buffer_offset in byte range
broccoliSpicy Sep 17, 2024
9a6c489
minor fix 2
broccoliSpicy Sep 17, 2024
f55a445
feat: add fastlanes bitpacking for other types
broccoliSpicy Sep 18, 2024
7c21438
address initial PR comments
broccoliSpicy Sep 18, 2024
3b82ec5
Merge branch 'main' into fastlanes
broccoliSpicy Sep 18, 2024
f0bd3a8
fix lint
broccoliSpicy Sep 18, 2024
ce0f798
return a slice of LanceBuffer in `decode`
broccoliSpicy Sep 18, 2024
fb9ede2
use `elems_per_chunk` constant to represent 1024, delete
broccoliSpicy Sep 19, 2024
3ad773c
use macro in encode method
broccoliSpicy Sep 19, 2024
403e89d
Don't pass strings to the choose_array_encoder method when choosing a…
westonpace Sep 18, 2024
0ae8362
fix a bug in `bitpacked_for_non_neg_decode`
broccoliSpicy Sep 20, 2024
23e261c
add stable rust fastlanes
broccoliSpicy Sep 20, 2024
3f92fcd
Merge remote-tracking branch 'refs/remotes/origin/fastlanes' into fas…
broccoliSpicy Sep 20, 2024
dba9a48
fix lint
broccoliSpicy Sep 20, 2024
1eb75e2
remove external fastlanes crate
broccoliSpicy Sep 21, 2024
1485759
license header
broccoliSpicy Sep 21, 2024
8543f54
fix lint
broccoliSpicy Sep 21, 2024
ee78fc6
delete an unnecessary file rust/lance-encoding/compression-algo/mod.rs
broccoliSpicy Sep 21, 2024
fe3fda8
delete two redundant files
broccoliSpicy Sep 21, 2024
697af4a
handle nullable and all null data block in `encode`.
broccoliSpicy Sep 23, 2024
922c2fe
fix `choose_array_encoder` issue when enable V2.1
broccoliSpicy Sep 24, 2024
f09cad7
fix lint
broccoliSpicy Sep 24, 2024
fc89bf4
fix a bug scheduling ranges for data types other than 32-bit width
broccoliSpicy Sep 24, 2024
d5b9201
Make sure to use version 2.1 in tests for bitpacking
westonpace Sep 24, 2024
13b757a
make `locate_chunk_start` and `locate_chunk_end` a method
broccoliSpicy Sep 24, 2024
ca4dba3
Merge branch 'main' into fastlanes
broccoliSpicy Sep 24, 2024
f42af4c
add test_pack
broccoliSpicy Sep 25, 2024
1c2878b
add test_unchecked_pack
broccoliSpicy Sep 25, 2024
688bb1f
address PR comments
broccoliSpicy Sep 25, 2024
4d7557f
Update rust/lance-encoding/src/buffer.rs
broccoliSpicy Sep 25, 2024
c7ecb08
Merge branch 'fix/use-v2-1-on-bitpack-tests' of https://github.com/we…
broccoliSpicy Sep 25, 2024
dedb306
fix fastlanes original code link
broccoliSpicy Sep 27, 2024
655a063
lint
broccoliSpicy Sep 27, 2024
minor fix
broccoliSpicy committed Sep 16, 2024
commit a1e3cdf9bf950ac1436589110d5b3bb9199ee2f8
2 changes: 1 addition & 1 deletion rust/lance-encoding/benches/decoder.rs
@@ -60,10 +60,10 @@
     keep_original_array: true,
 };

 fn bench_decode2(c: &mut Criterion) {
     let rt = tokio::runtime::Runtime::new().unwrap();
     let mut group = c.benchmark_group("decode_uint8");
-    group.measurement_time(std::time::Duration::new(12, 0)); // 11.1 seconds
+    group.measurement_time(std::time::Duration::new(12, 0));
     let array = UInt8Array::from(vec![5; 1024 * 1024 * 1024]);
     let data = RecordBatch::try_new(
         Arc::new(Schema::new(vec![Field::new(
20 changes: 8 additions & 12 deletions rust/lance-encoding/src/encodings/physical/bitpack_fastlanes.rs
@@ -220,13 +220,13 @@ pub struct BitpackedForNonNegScheduler {
     buffer_offset: u64,
 }

-fn locate_chunk_start2(scheduler: &BitpackedForNonNegScheduler, relative_row_num: u64) -> u64 {
+fn locate_chunk_start(scheduler: &BitpackedForNonNegScheduler, relative_row_num: u64) -> u64 {
     let elems_per_chunk = 1024;
     let chunk_size = elems_per_chunk * scheduler.compressed_bit_width / 8;
     relative_row_num / elems_per_chunk * chunk_size
 }

-fn locate_chunk_end2(scheduler: &BitpackedForNonNegScheduler, relative_row_num: u64) -> u64 {
+fn locate_chunk_end(scheduler: &BitpackedForNonNegScheduler, relative_row_num: u64) -> u64 {
     let elems_per_chunk: u64 = 1024;
     let chunk_size = elems_per_chunk * scheduler.compressed_bit_width / 8;
     relative_row_num / elems_per_chunk * chunk_size + chunk_size
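To make the chunk arithmetic in the two helpers above concrete, here is a minimal standalone sketch. It assumes the 1024-value chunk size shown in the diff. The free functions `chunk_size`, `chunk_start`, and `chunk_end`, and their plain `compressed_bit_width` parameter, are hypothetical stand-ins for the scheduler-based helpers, not the PR's code.

```rust
/// Number of values packed into one chunk (the value used in the diff).
const ELEMS_PER_CHUNK: u64 = 1024;

/// Size in bytes of one packed chunk at the given bit width (hypothetical helper).
fn chunk_size(compressed_bit_width: u64) -> u64 {
    ELEMS_PER_CHUNK * compressed_bit_width / 8
}

/// Byte offset, relative to the start of the packed data, of the chunk containing `row`.
fn chunk_start(compressed_bit_width: u64, row: u64) -> u64 {
    row / ELEMS_PER_CHUNK * chunk_size(compressed_bit_width)
}

/// Byte offset one past the end of the chunk containing `row`.
fn chunk_end(compressed_bit_width: u64, row: u64) -> u64 {
    chunk_start(compressed_bit_width, row) + chunk_size(compressed_bit_width)
}

fn main() {
    // With 3-bit values, each chunk occupies 1024 * 3 / 8 = 384 bytes.
    assert_eq!(chunk_size(3), 384);
    // Row 1023 is the last row of the first chunk.
    assert_eq!(chunk_start(3, 1023), 0);
    assert_eq!(chunk_end(3, 1023), 384);
    // Row 1024 is the first row of the second chunk.
    assert_eq!(chunk_start(3, 1024), 384);
    assert_eq!(chunk_end(3, 1024), 768);
}
```

Because the division is integer division, every row in the same 1024-row group maps to the same byte offsets, which is why the scheduler always reads whole chunks.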
@@ -260,15 +260,15 @@ impl PageScheduler for BitpackedForNonNegScheduler {
         let mut byte_ranges = vec![];
         let mut bytes_idx_to_range_indices = vec![];
         let first_byte_range = std::ops::Range {
-            start: self.buffer_offset + locate_chunk_start2(self, ranges[0].start),
-            end: self.buffer_offset + locate_chunk_end2(self, ranges[0].end - 1),
+            start: self.buffer_offset + locate_chunk_start(self, ranges[0].start),
+            end: self.buffer_offset + locate_chunk_end(self, ranges[0].end - 1),
         }; // the ranges are half-open
         byte_ranges.push(first_byte_range);
         bytes_idx_to_range_indices.push(vec![ranges[0].clone()]);
         for (i, range) in ranges.iter().enumerate().skip(1) {
-            let this_start = locate_chunk_start2(self, range.start);
-            let this_end = locate_chunk_end2(self, range.end - 1);
-            if this_start == locate_chunk_start2(self, ranges[i - 1].end - 1) {
+            let this_start = locate_chunk_start(self, range.start);
+            let this_end = locate_chunk_end(self, range.end - 1);
+            if this_start == locate_chunk_start(self, ranges[i - 1].end - 1) {
Contributor

Instead of re-running `locate_chunk_start`, can we look at the last item in `byte_ranges`?

Contributor Author

Actually, the last item in `byte_ranges` is calculated using `locate_chunk_start(range.start)` and `locate_chunk_end(range.end - 1)`. Here I am comparing `this_start` with `locate_chunk_start` of the previous range's `end - 1` to decide whether to coalesce the byte ranges.
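The coalescing rule discussed in this thread can be sketched in isolation as follows. This is a simplified illustration, not the scheduler code from the PR: the function `coalesce_byte_ranges` and its parameters are hypothetical, the input is assumed to be sorted, non-empty, non-overlapping half-open row ranges, and `buffer_offset` is applied uniformly to both ends of every byte range as an assumption of this sketch.

```rust
use std::ops::Range;

/// Number of values packed into one chunk (the value used in the diff).
const ELEMS_PER_CHUNK: u64 = 1024;

/// Byte offset of the start of the chunk containing `row` (hypothetical helper).
fn chunk_start(bit_width: u64, row: u64) -> u64 {
    row / ELEMS_PER_CHUNK * (ELEMS_PER_CHUNK * bit_width / 8)
}

/// Byte offset one past the end of the chunk containing `row` (hypothetical helper).
fn chunk_end(bit_width: u64, row: u64) -> u64 {
    chunk_start(bit_width, row) + ELEMS_PER_CHUNK * bit_width / 8
}

/// Maps row ranges to chunk-aligned byte ranges, merging a range into the
/// previous byte range when it starts in the chunk where the previous row
/// range ended.
fn coalesce_byte_ranges(
    bit_width: u64,
    buffer_offset: u64,
    ranges: &[Range<u64>],
) -> Vec<Range<u64>> {
    let mut byte_ranges: Vec<Range<u64>> = Vec::new();
    for (i, range) in ranges.iter().enumerate() {
        let this_start = chunk_start(bit_width, range.start);
        // Row ranges are half-open, so the last row is `end - 1`.
        let this_end = chunk_end(bit_width, range.end - 1);
        let shares_chunk =
            i > 0 && this_start == chunk_start(bit_width, ranges[i - 1].end - 1);
        if shares_chunk {
            // Extend the previous byte range instead of re-reading the chunk.
            byte_ranges.last_mut().unwrap().end = buffer_offset + this_end;
        } else {
            byte_ranges.push((buffer_offset + this_start)..(buffer_offset + this_end));
        }
    }
    byte_ranges
}

fn main() {
    // 8-bit values: each chunk is 1024 bytes. The first two row ranges fall
    // in chunk 0 and are merged; the third falls in chunk 2.
    let ranges = vec![0..10, 20..30, 3000..3010];
    let byte_ranges = coalesce_byte_ranges(8, 100, &ranges);
    assert_eq!(byte_ranges, vec![100..1124, 2148..3172]);
}
```

Under the assumptions of this sketch, the `end` of the last byte range minus one chunk size equals `buffer_offset` plus the start of the chunk holding the previous range's last row, so inspecting the last item in `byte_ranges` and re-running the chunk-start calculation would identify the same case.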

                 byte_ranges.last_mut().unwrap().end = this_end;
                 bytes_idx_to_range_indices
                     .last_mut()
@@ -299,7 +299,7 @@ impl PageScheduler for BitpackedForNonNegScheduler {

         let bytes = scheduler.submit_request(byte_ranges.clone(), top_level_row);

-        // Clone the necessary data from `self` to move into the async block
+        // copy the necessary data from `self` to move into the async block
         let compressed_bit_width = self.compressed_bit_width;
         let uncompressed_bits_per_value = self.uncompressed_bits_per_value;
         let num_rows = ranges.iter().map(|range| range.end - range.start).sum();
@@ -416,10 +416,6 @@ mod tests {
     use arrow::record_batch::RecordBatch;
     use arrow::util::bit_util::ceil;

-    #[test]
-    fn test_encode() {
-        // Prepare input data
-    }
     use crate::decoder::decode_batch;
     use crate::decoder::DecoderMiddlewareChain;
     use crate::decoder::FilterExpression;