Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix handling of escape codes in paths #118

Merged
merged 4 commits into from
Jun 23, 2021
Merged

Fix handling of escape codes in paths #118

merged 4 commits into from
Jun 23, 2021

Conversation

ahl
Copy link
Collaborator

@ahl ahl commented Jun 22, 2021

Previously we haven't had any special handling for HTTP path escape codes. This implicitly required consumers to do all their own decoding... which they surely were not. This PR adds proper handling for input paths i.e. paths as the results of a query, and it enforces stricter requirements for paths specified by consumers for endpoints. For example, we strip out "//" for a query, but panic!() if a consumer were to specify a path that contained "//" (note we also panic if a consumer tries to register two handlers for the same path).

Copy link
Contributor

@smklein smklein left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

One of those patches where I really appreciate having all the tests - thank you!

Looks good, just a handful of comments.

dropshot/src/api_description.rs Outdated Show resolved Hide resolved
dropshot/src/api_description.rs Outdated Show resolved Hide resolved
dropshot/src/router.rs Outdated Show resolved Hide resolved
dropshot/src/router.rs Outdated Show resolved Hide resolved
* We use this type to avoid confusion with paths used to define routes.
*/
#[derive(Debug)]
pub struct InputPath<'a>(&'a str);
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I have no issue with this current use of typing, but I feel like I've usually seen the strong types used as an "output" of parsing functions, rather than input. IMO that seems more useful, since it propagates the notion that "this &str has been validated", whereas I think InputPath is currently truly indistinguishable from an arbitrary &str.

(I'm kinda eyeing route_path_to_segments and input_path_to_segments below - it does seem like both could take an arbitrary &str as input, but the results kinda have different implications)

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't have a strong feeling about what would be most idiomatic here. I wanted to "quarantine" the input type to avoid accidental use of it. I agree that it's all a little meh. I'd love any input regarding how to improve it.

Copy link
Contributor

@smklein smklein Jun 23, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wanna give the caveat - I think this patch is decidedly a net improvement, so this comment shouldn't be a blocker!

The reason I'm a little skeptical of the current usage is that I don't think the quarantining is actually happening very clearly (aside from requiring a bunch of .into() calls). Any function that operates on an InputPath may as well operate on a &str, since there is (seemingly?) no check or consideration to distinguish them.

IMO, the path that's the result of a "user input" actually is a str. If we want to distinguish that from a route-defining path, which has a higher burden of validation, perhaps that is the type we should abstract more cautiously.

As an example, route_path_to_segments returns a Vec<&str> - but it implicitly carries the implication that "this is actually a non-empty &str type, without / characters.

We could make that more explicit by returning a Vec<Segment>, where Segment is a &str, but it can be only constructed post-validation.

Similarly, input_path_to_segments could return a Vec<Segment>, to indicate that it represents a percent-decoded, non "./.." str.

(This actually raises the question - should these functions actually be returning distinct Segment types? They appear to be performing slightly different validations)

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FWIW: I think this pattern is generally the one Cliff calls "the typestate pattern".

@ahl
Copy link
Collaborator Author

ahl commented Jun 23, 2021

@smklein thanks for the thorough review. I've pushed a new commit that I think addresses your feedback.

@smklein
Copy link
Contributor

smklein commented Jun 23, 2021

@smklein thanks for the thorough review. I've pushed a new commit that I think addresses your feedback.

One last comment on the InputPath type, but LGTM!

@ahl ahl merged commit feea258 into main Jun 23, 2021
@ahl
Copy link
Collaborator Author

ahl commented Jun 23, 2021

I'm going to attempt to revisit the discussion of InputPath in #110

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants