-
Notifications
You must be signed in to change notification settings - Fork 78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix handling of escape codes in paths #118
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One of those patches where I really appreciate having all the tests - thank you!
Looks good, just a handful of comments.
* We use this type to avoid confusion with paths used to define routes. | ||
*/ | ||
#[derive(Debug)] | ||
pub struct InputPath<'a>(&'a str); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I have no issue with this current use of typing, but I feel like I've usually seen the strong types used as an "output" of parsing functions, rather than input. IMO that seems more useful, since it propagates the notion that "this &str
has been validated", whereas I think InputPath
is currently truly indistinguishable from an arbitrary &str
.
(I'm kinda eyeing route_path_to_segments
and input_path_to_segments
below - it does seem like both could take an arbitrary &str
as input, but the results kinda have different implications)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't have a strong feeling about what would be most idiomatic here. I wanted to "quarantine" the input type to avoid accidental use of it. I agree that it's all a little meh. I'd love any input regarding how to improve it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wanna give the caveat - I think this patch is decidedly a net improvement, so this comment shouldn't be a blocker!
The reason I'm a little skeptical of the current usage is that I don't think the quarantining is actually happening very clearly (aside from requiring a bunch of .into()
calls). Any function that operates on an InputPath
may as well operate on a &str
, since there is (seemingly?) no check or consideration to distinguish them.
IMO, the path that's the result of a "user input" actually is a str. If we want to distinguish that from a route-defining path, which has a higher burden of validation, perhaps that is the type we should abstract more cautiously.
As an example, route_path_to_segments
returns a Vec<&str>
- but it implicitly carries the implication that "this is actually a non-empty &str
type, without /
characters.
We could make that more explicit by returning a Vec<Segment>
, where Segment is a &str
, but it can be only constructed post-validation.
Similarly, input_path_to_segments
could return a Vec<Segment>
, to indicate that it represents a percent-decoded, non "./.." str.
(This actually raises the question - should these functions actually be returning distinct Segment
types? They appear to be performing slightly different validations)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
FWIW: I think this pattern is generally the one Cliff calls "the typestate pattern".
@smklein thanks for the thorough review. I've pushed a new commit that I think addresses your feedback. |
One last comment on the |
I'm going to attempt to revisit the discussion of |
Previously we haven't had any special handling for HTTP path escape codes. This implicitly required consumers to do all their own decoding... which they surely were not. This PR adds proper handling for input paths i.e. paths as the results of a query, and it enforces stricter requirements for paths specified by consumers for endpoints. For example, we strip out "//" for a query, but
panic!()
if a consumer were to specify a path that contained "//" (note we also panic if a consumer tries to register two handlers for the same path).