feat: expose number of rows covered per delta #2979
I'm a little concerned about whether the result is useful if we don't even guarantee the order of the index deltas. Otherwise seems fine. Though I feel like it's not that hard for users to compute this themselves, as the fragment_bitmap and num_rows should both be exposed publicly, in Python and Rust.
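As a rough illustration of the do-it-yourself route suggested in this comment, the sketch below sums per-fragment row counts over the set of fragments each delta covers. It uses only the standard library: the `BTreeSet<u32>` stands in for a delta's fragment bitmap and the `HashMap<u32, usize>` for per-fragment row counts. The function `rows_per_delta` and its parameter names are hypothetical and are not part of the Lance API.

```rust
use std::collections::{BTreeSet, HashMap};

/// Hypothetical helper, not part of Lance: for each delta, sum the row counts
/// of the fragments that delta covers. Fragment ids missing from
/// `fragment_rows` (for example, fragments that no longer exist) count as 0.
fn rows_per_delta(
    delta_fragments: &[BTreeSet<u32>],
    fragment_rows: &HashMap<u32, usize>,
) -> Vec<usize> {
    delta_fragments
        .iter()
        .map(|fragments| {
            fragments
                .iter()
                .map(|id| fragment_rows.get(id).copied().unwrap_or(0))
                .sum::<usize>()
        })
        .collect()
}
```

The output has one entry per input delta, in the order the deltas were passed in, so any ordering guarantee has to come from the caller.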
@@ -550,6 +555,7 @@ impl DatasetIndexExt for Dataset {
 "num_indexed_rows": num_indexed_rows,
 "num_unindexed_fragments": num_unindexed_fragments,
 "num_unindexed_rows": num_unindexed_rows,
+"num_indexed_rows_per_delta": num_indexed_rows_per_delta,
Is this implicitly in the same order that list_indices returns? Do we even guarantee an order there?
I think we guarantee append order, let me check
We don't expose the per-delta fragment bitmap to Python or Rust. This is only a stopgap so we have something for calculating the "base" index rows. Eventually this should be something we can track natively in v3.
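To make the "base" index rows idea concrete, here is a minimal sketch that splits a list of per-delta row counts into the base count and the counts of later deltas. It rests on an assumption the thread does not settle: that the entries arrive in append order, so the first entry is the base. The name `split_base` is hypothetical.

```rust
/// Hypothetical helper: split per-delta row counts into the base index count
/// and the counts of the later deltas.
/// ASSUMPTION: entries are in append order, so the first entry is the base.
/// The discussion above does not confirm this ordering.
fn split_base(rows_per_delta: &[usize]) -> Option<(usize, &[usize])> {
    rows_per_delta
        .split_first()
        .map(|(base, rest)| (*base, rest))
}
```

Returning `None` for an empty list keeps "no index deltas at all" distinct from "a base that covers zero rows".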
1ad2c73 to b3069e0
Actually, looks like it's a
Codecov Report
Attention: Patch coverage is
Additional details and impacted files
@@ Coverage Diff @@
## main #2979 +/- ##
==========================================
+ Coverage 78.81% 78.85% +0.03%
==========================================
Files 235 236 +1
Lines 73141 73581 +440
Branches 73141 73581 +440
==========================================
+ Hits 57648 58019 +371
- Misses 12513 12554 +41
- Partials 2980 3008 +28
Going to merge this for the time being, as it saves us from drifting from OSS.
This PR adds a field to index_stats where we now return the number of rows covered by each delta in num_indexed_rows_per_delta.
TODO: