Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Provide a means to deal with malformed facet text representation for the query parser #1056

Conversation

moriyoshi
Copy link
Contributor

@moriyoshi moriyoshi commented May 19, 2021

This patch requires API change.

As Facet::from_text is not designed to return a Result, the query parser cannot handle invalid facet representations. This patch enables it to do so by modifying Facet::from_text() to return a Result<Facet, TantivyError> instead of a bare Facet. Also, I added a new error kind to QueryParserError.

BTW, I will agree to not insist any copyright on this patch.

@fulmicoton fulmicoton requested a review from PSeitz May 21, 2021 06:23
Copy link
Contributor

@PSeitz PSeitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! lgtm besides the two comments

Comment on lines 72 to 74
/// The format for the facet field invalid.
#[error("The facet field is malformed")]
FacetFormatError(String),
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this should be moved to TantivyError

Copy link
Contributor Author

@moriyoshi moriyoshi May 21, 2021

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While an additional entry in TantivyError for this will possibly be handy later, I prefer this to be a query parser error like others (e.g. date time format error) because one can think Facet string representations are coincidentally the same between Facet and QueryParser.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

From the view point of Facet::from_text(), it will just suffice, for now, to return with a single kind of error (meaning TantivyError::InvalidArgument) because the parser isn't made possible to report a detailed error.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had another look and think from_text should return a FacetParseError. This error would then be included in the tantivty and parse error sets. Currently there is a conversion to _ => Err(QueryParserError::SyntaxError),, but this is dead code as I understand it, which would be removed.

/// Error that may occur when parsing a facet
#[derive(Debug, Error)]
pub enum FacetParseError {
    /// The format for the facet field invalid.
    #[error("The facet field is malformed")]
    FacetParseError(String),
}
pub enum TantivyError {
    /// Failed to parse facet.
    #[error("{0}")]
    FacetParseError(#[from] FacetParseError),

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Having a specific error enum for the facet parser totally makes sense to me.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I just added an enum to facet as I wouldn't think extending TantivyError was necessary.

src/schema/facet.rs Outdated Show resolved Hide resolved
@moriyoshi
Copy link
Contributor Author

moriyoshi commented May 21, 2021

As a sidenote, I intended to use Tantivy in a production and found that user-supplied queries including wrong facet strings can effectively cause DoS (the server itself is written in Python). That's why I made this fix. Anyway, thanks for offering such a great product for free with a generous license!

@@ -68,6 +68,9 @@ pub enum QueryParserError {
/// The format for the date field is not RFC 3339 compliant.
#[error("The date field has an invalid format")]
DateFormatError(chrono::ParseError),
/// The format for the facet field is invalid.
#[error("The facet field is malformed")]
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure, but I think this error should forward the original error message or the user would not see which part of the query failed

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

like this #[error("{0}")]

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch :) I missed that. Thanks.

@moriyoshi moriyoshi force-pushed the facet-path-from-text-must-return-error branch from 2a13c61 to 29aaf00 Compare May 21, 2021 13:00
Copy link
Contributor

@PSeitz PSeitz left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks! Looks good!

@fulmicoton fulmicoton merged commit 4afba00 into quickwit-oss:main May 27, 2021
@moriyoshi moriyoshi deleted the facet-path-from-text-must-return-error branch May 27, 2021 12:20
This was referenced Feb 18, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants