Port component storage to arrow-rs #8725

Open. 26 commits to be merged into base branch main.

@emilk (Member) commented Jan 17, 2025

This makes our store run 100% on arrow-rs.

arrow2 is now relegated to the margins:

  • IPC
  • re_types_builder
  • Some legacy methods on Loggable, Archetype, etc.

@emilk added the labels "🏹 arrow", "🚜 refactor", and "exclude from changelog" on Jan 17, 2025

github-actions bot commented Jan 17, 2025

Web viewer built successfully. If applicable, you should also test it:

  • I have tested the web viewer

Commit: 67ef6ec
Link: https://rerun.io/viewer/pr/8725
Manifest: +nightly +main

Note: This comment is updated whenever you push a commit.

@emilk force-pushed the emilk/arrow-re_chunk branch from 58f628c to e94a3b1 on January 17, 2025 14:45
---
ChunkStore {
    id: test_id
    config: ChunkStoreConfig { enable_changelog: true, chunk_max_bytes: 393216, chunk_max_rows: 4096, chunk_max_rows_if_unsorted: 1024 }
    stats: {
        num_chunks: 1
-       total_size_bytes: 1.3 KiB
+       total_size_bytes: 1.8 KiB

@emilk (author) commented:

This size increase should go away once we update arrow and can run .shrink_to_fit.
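
As a rough illustration of why shrinking matters for a reported byte size, here is a minimal std-only sketch. It assumes the extra bytes come from unused buffer capacity, which the comment above implies but does not spell out. `reserved_bytes` and `shrink_demo` are hypothetical names, and `Vec<u64>` stands in for an arrow buffer.

```rust
/// Hypothetical helper: bytes reserved by a buffer, counting unused capacity.
fn reserved_bytes(buf: &Vec<u64>) -> usize {
    buf.capacity() * std::mem::size_of::<u64>()
}

/// Builds a buffer with far more capacity than it needs and returns
/// (bytes reserved before shrinking, bytes reserved after shrinking, bytes in use).
fn shrink_demo() -> (usize, usize, usize) {
    // Assumption for illustration: the buffer was over-allocated while being built.
    let mut buf: Vec<u64> = Vec::with_capacity(1024);
    buf.extend(0..10);

    let before = reserved_bytes(&buf);
    // Ask the allocator to release capacity beyond what the contents need.
    buf.shrink_to_fit();
    let after = reserved_bytes(&buf);

    let used = buf.len() * std::mem::size_of::<u64>();
    (before, after, used)
}
```

A size statistic that counts capacity rather than length would report the `before` figure until the buffer is shrunk.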

match ArrowRecordBatch::try_new_with_options(
    self.schema().clone(),
    row,
    &arrow::array::RecordBatchOptions::new().with_row_count(Some(1)),

@emilk (author) commented:

Explicitly setting the row count to one means this works even when there are no columns (e.g. due to heavy filtering).
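
The point about empty column sets can be sketched with a small std-only model. This is not arrow's implementation: `Batch` and `make_batch` are hypothetical, and the model only assumes that a row count must come either from an explicit option or from the first column.

```rust
/// Hypothetical stand-in for a record batch: only the shape is tracked.
#[derive(Debug, PartialEq)]
struct Batch {
    num_columns: usize,
    num_rows: usize,
}

/// Uses the explicit row count if given, otherwise infers it from the first column.
/// With no columns and no explicit count there is nothing to infer from.
fn make_batch(column_lens: &[usize], row_count: Option<usize>) -> Result<Batch, String> {
    let num_rows = row_count
        .or_else(|| column_lens.first().copied())
        .ok_or_else(|| "need a row count or at least one column".to_string())?;

    // Every column must agree with the chosen row count.
    if column_lens.iter().any(|&len| len != num_rows) {
        return Err(format!("all columns must have {num_rows} rows"));
    }

    Ok(Batch {
        num_columns: column_lens.len(),
        num_rows,
    })
}
```

In this model, `make_batch(&[], None)` fails while `make_batch(&[], Some(1))` yields a one-row batch with zero columns, which is the case the comment describes.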

Comment on lines +1236 to +1240
if cfg!(debug_assertions) {
    panic!("Failed to create record batch: {err}");
} else {
    re_log::error_once!("Failed to create record batch: {err}");
    None

@emilk (author) commented:

Better than the previous silent error
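
The pattern in the snippet above (fail loudly in debug builds, log once and carry on in release builds) can be sketched without any logging crate. `error_once`, `ok_or_report`, and `ok_or_report_default` are hypothetical names, and `std::sync::Once` is only a stand-in for a log-once macro.

```rust
use std::sync::Once;

static LOG_ONCE: Once = Once::new();

/// Hypothetical stand-in for a log-once macro: prints the first message only.
fn error_once(msg: &str) {
    LOG_ONCE.call_once(|| eprintln!("ERROR: {msg}"));
}

/// Unwraps `result`. On error it panics when `strict` is true,
/// otherwise it logs once and returns `None`.
fn ok_or_report<T>(result: Result<T, String>, strict: bool) -> Option<T> {
    match result {
        Ok(value) => Some(value),
        Err(err) if strict => panic!("Failed to create record batch: {err}"),
        Err(err) => {
            error_once(&format!("Failed to create record batch: {err}"));
            None
        }
    }
}

/// Mirrors the snippet above: strict in debug builds, lenient in release builds.
fn ok_or_report_default<T>(result: Result<T, String>) -> Option<T> {
    ok_or_report(result, cfg!(debug_assertions))
}
```

Passing `strict` explicitly keeps the lenient path testable from a debug build, where `cfg!(debug_assertions)` would otherwise always select the panic.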

Comment on lines +50 to +57
let field = match array.data_type() {
    arrow::datatypes::DataType::List(field) => field.clone(),
    _ => unreachable!(
        "{} should always be a list array",
        self.descriptor().full_name()
    ),
};
ArrowListArray::try_new(field, offsets, array, None).map_err(|err| err.into())

@emilk (author) commented:

Is this right? 🤔
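
For context on what a list-array constructor has to check, here is a std-only sketch of the offsets-plus-values layout. It does not answer the question above and it is not arrow's code. `split_lists` is a hypothetical helper that assumes list `i` covers `values[offsets[i]..offsets[i + 1]]`.

```rust
/// Hypothetical model of a list array: `offsets[i]..offsets[i + 1]` is the
/// range of `values` that belongs to list `i`.
fn split_lists<T: Clone>(offsets: &[usize], values: &[T]) -> Result<Vec<Vec<T>>, String> {
    if offsets.is_empty() {
        return Err("offsets must contain at least one entry".to_string());
    }
    // Each list must end at or after the point where it starts.
    if offsets.windows(2).any(|pair| pair[0] > pair[1]) {
        return Err("offsets must be non-decreasing".to_string());
    }
    // The final offset may not point past the end of the values.
    if *offsets.last().unwrap() > values.len() {
        return Err("last offset exceeds the number of values".to_string());
    }

    Ok(offsets
        .windows(2)
        .map(|pair| values[pair[0]..pair[1]].to_vec())
        .collect())
}
```

With offsets `[0, 2, 2, 5]` over five values, the model yields three lists, the middle one empty.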

emilk commented Jan 17, 2025

@rerun-bot full-check

@emilk marked this pull request as ready for review on January 17, 2025 15:19

emilk commented Jan 17, 2025

@rerun-bot full-check
