-
Notifications
You must be signed in to change notification settings - Fork 265
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
feat: move-stable row ids in compaction #2544
Conversation
51402ce
to
77214c7
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
These functions are moved from the parent module.
50424ea
to
f652924
Compare
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #2544 +/- ##
==========================================
+ Coverage 79.81% 79.93% +0.12%
==========================================
Files 224 225 +1
Lines 65871 66377 +506
Branches 65871 66377 +506
==========================================
+ Hits 52572 53060 +488
+ Misses 10231 10213 -18
- Partials 3068 3104 +36
Flags with carried forward coverage won't be shown. Click here to find out more. ☔ View full report in Codecov by Sentry. |
rust/lance-table/src/rowids.rs
Outdated
@@ -152,6 +152,27 @@ impl RowIdSequence { | |||
self.0.extend_from_slice(remaining_segments); | |||
} | |||
|
|||
/// Mask |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
/// Mask | |
/// Delete row ids by position |
Or some other comment explaining the difference between this and delete
(they look very similar)
let deletions = read_deletion_file(&dataset.base, frag, dataset.object_store()).await?; | ||
if let Some(deletions) = deletions { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we guaranteed the deletion was materialized at this point? Is there any way two fragments can be compacted without materializing their deletions?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah that is guaranteed. We compacting by scanning and then writing out. The scan removes deleted rows.
.iter_mut() | ||
.flat_map(|group| group.new_fragments.iter_mut()) | ||
.collect::<Vec<_>>(); | ||
reserve_fragment_ids(dataset, new_fragments.into_iter()).await?; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Do we even need to call reserve_fragment_ids
in this case? For some reason I thought commit would just assign ids if they were left as -1.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
That's a good point. We don't need to create another transaction for that.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nevermind, turns out this was in fact load bearing. The assigned fragment ids from this step are used to compute the new fragment bitmap for each of the indices. I'll add a comment to clarify this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, thanks for the try
2dfe148
to
2637df7
Compare
Part of #2307.
Also addresses #2397