Clarify BTree `range_search` comments #81312

dylni · 2021-01-23T22:03:19Z

These comments were added by #81169. However, the soundness issue might not be exploitable here, so the comments should be updated.

cc @ssomers

rust-highfive · 2021-01-23T22:03:22Z

r? @shepmaster

(rust-highfive has picked a reviewer for you, use r? to override)

ssomers · 2021-01-23T23:05:40Z

library/alloc/src/collections/btree/navigate.rs

@@ -88,6 +87,7 @@ where
            (_, Unbounded) => max_node.last_edge(),
        };

+        // This can't be used for safety. The `PartialOrd` impl can change between calls.


This is btree's own PartialOrd (defined in node.rs, until #81094 reorganizes it away) and it is about the only real safety we have. If the condition fails, the iterator would never hit the other end and run off the parent of the root node (which is None) without a safe unwrap - that last safety has been "optimized" away).

Good catch. This is a merge conflict waiting to happen.

dylni · 2021-02-25T04:53:35Z

@shepmaster This PR is ready for review when you have time.

dylni · 2021-02-25T04:55:31Z

Actually, if you're waiting on #81094, that might be better.

bors · 2021-03-01T05:30:56Z

☔ The latest upstream changes (presumably #81094) made this pull request unmergeable. Please resolve the merge conflicts.

shepmaster · 2021-03-08T14:46:28Z

r? @m-ou-se

dylni · 2021-03-15T04:30:23Z

@ssomers I updated this PR based on #81094. Do you agree with the changes?

ssomers · 2021-03-15T11:30:31Z

library/alloc/src/collections/btree/search.rs

@@ -94,7 +94,7 @@ impl<BorrowType: marker::BorrowType, K, V> NodeRef<BorrowType, K, V, marker::Lea
        K: Borrow<Q>,
        R: RangeBounds<Q>,
    {
-        // WARNING: Inlining these variables would be unsound (#81138)
+        // It might be unsound to inline these variables if this logic changes (#81138).


I think it's good to tone down the warning, but I don't know what you mean with "if this logic changes". What logic? It might be bad to inline these variables without any other change, because successfully passing the match statement might fool you to believe the start and end actually used were fine.

What logic?

I meant the code in the rest of the method. This might be better:

// Inlining these variables should be avoided (#81138).

The rest of the comment already explains why inlining can cause a safety issue.

ssomers · 2021-03-15T12:51:53Z

library/alloc/src/collections/btree/search.rs

+            // SAFETY: This panic is used for safety, so external impls can't be called here. The
+            // comparison is done with integers for that reason.


I'm quite confused here. If with "external impls" you mean the K: Ord implementation, sure we can call it, we have to call it, we just had the find functions call it profusely. But we can't blindly trust it to implement a total order ~~, and even if it does, it may not correspond to the Q: Ord implementation that we used to check the range bounds~~.

The comparison is done with integers because those are what we end up with, what we need to identify positions in the data structure. For me, the question is: why do we compare and panic at all? Because if we didn't, and if the Ord implementations are invalid ~~or inconsistent~~, or until #81138 if the range implementation was adversarial, we could easily and silently return a result that would lead callers to loop beyond the end of the tree, which they do unsafely with unwrap_unchecked, because they rely on a promise delivered here but documented nowhere.

I thought "SAFETY:" was for documenting why calling something marked unsafe is in fact safe, and here it's kind of the opposite.

So I would add API documentation about the promise here and elsewhere (not sure if I could or should squeeze that into this PR), and change this comment to something like:

We've called an external `Ord` implementation and can't trust it implements a total order

~~nor (if type K != Q) that it corresponds to the Ord implementation we used to check the range bounds.~~

Turns out I'm pretty wrong, after noticing there is no K: Ord declared here. An inconsistency between K: Ord and Q: Ord doesn't play up here, only Q: Ord is ever used. We assume that for various types of Q used over time, the same order among keys prevails. We only use K: Ord directly in some places like insert, because we actually move the key into the map so there is no room to play for a Q type.

sure we can call it, we have to call it, we just had the find functions call it profusely.

We can call it, but the result can't be used for a safety check. The user impl could return anything, which makes it useless.

The comparison is done with integers because those are what we end up with

Yes, but it's really because the Ord impl can't be assumed to be consistent or performant. Otherwise, the integers shouldn't be necessary IIUC.

I thought "SAFETY:" was for documenting why calling something marked unsafe is in fact safe, and here it's kind of the opposite.

I debated putting it here. However, it does document that the panic allows using unsafe code later with user impls involved.

We've called an external Ord implementation and can't trust it implements a total order

This would be okay with me, but it should say something about the panic providing safety.

We can call it, but the result can't be used for a safety check.

Again, I don't understand why you write that. We have called it, we needed to call it, it's the purpose of the function we're discussing. Yes, the user impl could return anything, but that doesn't make it useless. We need to know what it says in case it's not trying to pull our leg.

Otherwise, the integers shouldn't be necessary IIUC.

The purpose of this function is to translate a pair of bounds (let's say a pair of keys) to (a node and) a pair of integers, edge indices, indices into various arrays. To translate from the problem domain to the solution domain, in terms of the btree data structure.

I debated putting it here. However, it does document that the panic allows using unsafe code later with user impls involved.

In my view, the fact that later code is calling unsafe code under some assumption about the result is not an unsafety problem in this function and not the only reason to have the panic. Even if there was no unsafe code outside, without ~~a panic~~ countermeasures, the outside code could erroneously access key-value pairs outside the range. Well, that's up for debate, because if your Ord stinks, there is no inside and outside a range. In any case, I think this function should define (which I tried in #83147) and honour its contract, and then callers can exploit those guarantees with any unsafe shenanigans they like.

We've called an external Ord implementation and can't trust it implements a total order

This would be okay with me, but it should say something about the panic providing safety.

Actually, come to think of it, that's not even true. AFAIK the panic used to provide safety (to code outside the function) in the old code, but no longer after #81094. The line below the main panic (if lower_edge_idx < upper_edge_idx {) is what provides adherence to the contract / external safety. If we'd leave out all the panics from this function, the whole btree code would be just as safe, it would merely return results as weird as the Ord implementation, instead of barking at users.

All the panics in this function are only there to fulfill an additional service this function delivers, diagnosing user error.

We need to know what it says in case it's not trying to pull our leg.

I agree. All I'm saying is that we can't make a safety decision based on it. For example, you can't decide if a range can be passed to slice::get_unchecked based on a user impl.

I think this function should define (which I tried in #83147) and honour its contract, and then callers can exploit those guarantees with any unsafe shenanigans they like.

You wrote the new code, so whatever you prefer is good with me. The purpose of this comment was to document what part of the implementation of this function makes it safe, since that wasn't obvious when we talked about the function before your PR.

If we'd leave out all the panics from this function, the whole btree code would be just as safe, it would merely return results as weird as the Ord implementation, instead of barking at users.

I didn't realize this. In this case, I see no reason to have this comment. I'll remove it and fix the other one as we discussed.

I also didn't realize this, so it was a fruitful discussion.

dylni · 2021-03-17T23:41:54Z

@m-ou-se This PR is now ready for review.

m-ou-se · 2021-03-17T23:50:19Z

Thanks!

@bors r+ rollup

bors · 2021-03-17T23:50:20Z

📌 Commit 35a2096 has been approved by m-ou-se

bors · 2021-03-18T08:18:11Z

⌛ Testing commit 35a2096 with merge 895a8e7...

bors · 2021-03-18T11:03:44Z

☀️ Test successful - checks-actions
Approved by: m-ou-se
Pushing 895a8e7 to master...

rust-highfive assigned shepmaster Jan 23, 2021

rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 23, 2021

ssomers reviewed Jan 23, 2021

View reviewed changes

camelid added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Feb 12, 2021

rust-highfive assigned m-ou-se and unassigned shepmaster Mar 8, 2021

Clarify BTree range searching comments

922ccac

ssomers suggested changes Mar 15, 2021

View reviewed changes

ssomers mentioned this pull request Mar 15, 2021

BTree: clarify order guarantee provided by searches #83147

Merged

Fix comments based on review

35a2096

ssomers approved these changes Mar 17, 2021

View reviewed changes

bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 17, 2021

bors added the merged-by-bors This PR was explicitly merged by bors. label Mar 18, 2021

bors merged commit 895a8e7 into rust-lang:master Mar 18, 2021

rustbot added this to the 1.52.0 milestone Mar 18, 2021

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Clarify BTree `range_search` comments #81312

Clarify BTree `range_search` comments #81312

dylni commented Jan 23, 2021 •

edited

Loading

rust-highfive commented Jan 23, 2021

ssomers Jan 23, 2021 •

edited

Loading

dylni Jan 23, 2021

dylni commented Feb 25, 2021

dylni commented Feb 25, 2021

bors commented Mar 1, 2021

shepmaster commented Mar 8, 2021

dylni commented Mar 15, 2021

ssomers Mar 15, 2021

dylni Mar 16, 2021

ssomers Mar 15, 2021 •

edited

Loading

ssomers Mar 15, 2021

dylni Mar 16, 2021

ssomers Mar 16, 2021 •

edited

Loading

dylni Mar 17, 2021

ssomers Mar 17, 2021

dylni commented Mar 17, 2021

m-ou-se commented Mar 17, 2021

bors commented Mar 17, 2021

bors commented Mar 18, 2021

bors commented Mar 18, 2021

		// SAFETY: This panic is used for safety, so external impls can't be called here. The
		// comparison is done with integers for that reason.

Clarify BTree range_search comments #81312

Clarify BTree range_search comments #81312

Conversation

dylni commented Jan 23, 2021 • edited Loading

rust-highfive commented Jan 23, 2021

ssomers Jan 23, 2021 • edited Loading

Choose a reason for hiding this comment

dylni Jan 23, 2021

Choose a reason for hiding this comment

dylni commented Feb 25, 2021

dylni commented Feb 25, 2021

bors commented Mar 1, 2021

shepmaster commented Mar 8, 2021

dylni commented Mar 15, 2021

ssomers Mar 15, 2021

Choose a reason for hiding this comment

dylni Mar 16, 2021

Choose a reason for hiding this comment

ssomers Mar 15, 2021 • edited Loading

Choose a reason for hiding this comment

ssomers Mar 15, 2021

Choose a reason for hiding this comment

dylni Mar 16, 2021

Choose a reason for hiding this comment

ssomers Mar 16, 2021 • edited Loading

Choose a reason for hiding this comment

dylni Mar 17, 2021

Choose a reason for hiding this comment

ssomers Mar 17, 2021

Choose a reason for hiding this comment

dylni commented Mar 17, 2021

m-ou-se commented Mar 17, 2021

bors commented Mar 17, 2021

bors commented Mar 18, 2021

bors commented Mar 18, 2021

Clarify BTree `range_search` comments #81312

Clarify BTree `range_search` comments #81312

dylni commented Jan 23, 2021 •

edited

Loading

ssomers Jan 23, 2021 •

edited

Loading

ssomers Mar 15, 2021 •

edited

Loading

ssomers Mar 16, 2021 •

edited

Loading