Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Clarify BTree range_search comments #81312

Merged
merged 2 commits into from
Mar 18, 2021
Merged

Clarify BTree range_search comments #81312

merged 2 commits into from
Mar 18, 2021

Conversation

dylni
Copy link
Contributor

@dylni dylni commented Jan 23, 2021

These comments were added by #81169. However, the soundness issue might not be exploitable here, so the comments should be updated.

cc @ssomers

@rust-highfive
Copy link
Collaborator

r? @shepmaster

(rust-highfive has picked a reviewer for you, use r? to override)

@rust-highfive rust-highfive added the S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. label Jan 23, 2021
@@ -88,6 +87,7 @@ where
(_, Unbounded) => max_node.last_edge(),
};

// This can't be used for safety. The `PartialOrd` impl can change between calls.
Copy link
Contributor

@ssomers ssomers Jan 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is btree's own PartialOrd (defined in node.rs, until #81094 reorganizes it away) and it is about the only real safety we have. If the condition fails, the iterator would never hit the other end and run off the parent of the root node (which is None) without a safe unwrap - that last safety has been "optimized" away).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch. This is a merge conflict waiting to happen.

@camelid camelid added the T-libs Relevant to the library team, which will review and decide on the PR/issue. label Feb 12, 2021
@dylni
Copy link
Contributor Author

dylni commented Feb 25, 2021

@shepmaster This PR is ready for review when you have time.

@dylni
Copy link
Contributor Author

dylni commented Feb 25, 2021

Actually, if you're waiting on #81094, that might be better.

@bors
Copy link
Contributor

bors commented Mar 1, 2021

☔ The latest upstream changes (presumably #81094) made this pull request unmergeable. Please resolve the merge conflicts.

@shepmaster
Copy link
Member

r? @m-ou-se

@rust-highfive rust-highfive assigned m-ou-se and unassigned shepmaster Mar 8, 2021
@dylni
Copy link
Contributor Author

dylni commented Mar 15, 2021

@ssomers I updated this PR based on #81094. Do you agree with the changes?

@@ -94,7 +94,7 @@ impl<BorrowType: marker::BorrowType, K, V> NodeRef<BorrowType, K, V, marker::Lea
K: Borrow<Q>,
R: RangeBounds<Q>,
{
// WARNING: Inlining these variables would be unsound (#81138)
// It might be unsound to inline these variables if this logic changes (#81138).
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think it's good to tone down the warning, but I don't know what you mean with "if this logic changes". What logic? It might be bad to inline these variables without any other change, because successfully passing the match statement might fool you to believe the start and end actually used were fine.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

What logic?

I meant the code in the rest of the method. This might be better:

// Inlining these variables should be avoided (#81138).

The rest of the comment already explains why inlining can cause a safety issue.

Comment on lines 117 to 118
// SAFETY: This panic is used for safety, so external impls can't be called here. The
// comparison is done with integers for that reason.
Copy link
Contributor

@ssomers ssomers Mar 15, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm quite confused here. If with "external impls" you mean the K: Ord implementation, sure we can call it, we have to call it, we just had the find functions call it profusely. But we can't blindly trust it to implement a total order , and even if it does, it may not correspond to the Q: Ord implementation that we used to check the range bounds.

The comparison is done with integers because those are what we end up with, what we need to identify positions in the data structure. For me, the question is: why do we compare and panic at all? Because if we didn't, and if the Ord implementations are invalid or inconsistent, or until #81138 if the range implementation was adversarial, we could easily and silently return a result that would lead callers to loop beyond the end of the tree, which they do unsafely with unwrap_unchecked, because they rely on a promise delivered here but documented nowhere.

I thought "SAFETY:" was for documenting why calling something marked unsafe is in fact safe, and here it's kind of the opposite.

So I would add API documentation about the promise here and elsewhere (not sure if I could or should squeeze that into this PR), and change this comment to something like:

We've called an external `Ord` implementation and can't trust it implements a total order

nor (if type K != Q) that it corresponds to the Ord implementation we used to check the range bounds.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Turns out I'm pretty wrong, after noticing there is no K: Ord declared here. An inconsistency between K: Ord and Q: Ord doesn't play up here, only Q: Ord is ever used. We assume that for various types of Q used over time, the same order among keys prevails. We only use K: Ord directly in some places like insert, because we actually move the key into the map so there is no room to play for a Q type.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sure we can call it, we have to call it, we just had the find functions call it profusely.

We can call it, but the result can't be used for a safety check. The user impl could return anything, which makes it useless.

The comparison is done with integers because those are what we end up with

Yes, but it's really because the Ord impl can't be assumed to be consistent or performant. Otherwise, the integers shouldn't be necessary IIUC.

I thought "SAFETY:" was for documenting why calling something marked unsafe is in fact safe, and here it's kind of the opposite.

I debated putting it here. However, it does document that the panic allows using unsafe code later with user impls involved.

We've called an external Ord implementation and can't trust it implements a total order

This would be okay with me, but it should say something about the panic providing safety.

Copy link
Contributor

@ssomers ssomers Mar 16, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We can call it, but the result can't be used for a safety check.

Again, I don't understand why you write that. We have called it, we needed to call it, it's the purpose of the function we're discussing. Yes, the user impl could return anything, but that doesn't make it useless. We need to know what it says in case it's not trying to pull our leg.

Otherwise, the integers shouldn't be necessary IIUC.

The purpose of this function is to translate a pair of bounds (let's say a pair of keys) to (a node and) a pair of integers, edge indices, indices into various arrays. To translate from the problem domain to the solution domain, in terms of the btree data structure.

I debated putting it here. However, it does document that the panic allows using unsafe code later with user impls involved.

In my view, the fact that later code is calling unsafe code under some assumption about the result is not an unsafety problem in this function and not the only reason to have the panic. Even if there was no unsafe code outside, without a panic countermeasures, the outside code could erroneously access key-value pairs outside the range. Well, that's up for debate, because if your Ord stinks, there is no inside and outside a range. In any case, I think this function should define (which I tried in #83147) and honour its contract, and then callers can exploit those guarantees with any unsafe shenanigans they like.

We've called an external Ord implementation and can't trust it implements a total order

This would be okay with me, but it should say something about the panic providing safety.

Actually, come to think of it, that's not even true. AFAIK the panic used to provide safety (to code outside the function) in the old code, but no longer after #81094. The line below the main panic (if lower_edge_idx < upper_edge_idx {) is what provides adherence to the contract / external safety. If we'd leave out all the panics from this function, the whole btree code would be just as safe, it would merely return results as weird as the Ord implementation, instead of barking at users.

All the panics in this function are only there to fulfill an additional service this function delivers, diagnosing user error.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We need to know what it says in case it's not trying to pull our leg.

I agree. All I'm saying is that we can't make a safety decision based on it. For example, you can't decide if a range can be passed to slice::get_unchecked based on a user impl.

I think this function should define (which I tried in #83147) and honour its contract, and then callers can exploit those guarantees with any unsafe shenanigans they like.

You wrote the new code, so whatever you prefer is good with me. The purpose of this comment was to document what part of the implementation of this function makes it safe, since that wasn't obvious when we talked about the function before your PR.

If we'd leave out all the panics from this function, the whole btree code would be just as safe, it would merely return results as weird as the Ord implementation, instead of barking at users.

I didn't realize this. In this case, I see no reason to have this comment. I'll remove it and fix the other one as we discussed.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also didn't realize this, so it was a fruitful discussion.

@dylni
Copy link
Contributor Author

dylni commented Mar 17, 2021

@m-ou-se This PR is now ready for review.

@m-ou-se
Copy link
Member

m-ou-se commented Mar 17, 2021

Thanks!

@bors r+ rollup

@bors
Copy link
Contributor

bors commented Mar 17, 2021

📌 Commit 35a2096 has been approved by m-ou-se

@bors bors added S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. and removed S-waiting-on-review Status: Awaiting review from the assignee but also interested parties. labels Mar 17, 2021
@bors
Copy link
Contributor

bors commented Mar 18, 2021

⌛ Testing commit 35a2096 with merge 895a8e7...

@bors
Copy link
Contributor

bors commented Mar 18, 2021

☀️ Test successful - checks-actions
Approved by: m-ou-se
Pushing 895a8e7 to master...

@bors bors added the merged-by-bors This PR was explicitly merged by bors. label Mar 18, 2021
@bors bors merged commit 895a8e7 into rust-lang:master Mar 18, 2021
@rustbot rustbot added this to the 1.52.0 milestone Mar 18, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
merged-by-bors This PR was explicitly merged by bors. S-waiting-on-bors Status: Waiting on bors to run and complete tests. Bors will change the label on completion. T-libs Relevant to the library team, which will review and decide on the PR/issue.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

8 participants