-
Notifications
You must be signed in to change notification settings - Fork 13k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Create real parser for search queries #90630
Create real parser for search queries #90630
Conversation
Some changes occurred in HTML/CSS/JS. |
This comment has been minimized.
This comment has been minimized.
5ad8a97
to
fc63739
Compare
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
This comment has been minimized.
Ok, seems like I fixed all tidy issues. :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I really wish you'd talked this over with the team before adding lots more custom syntax. I was under the impression you were fixing bugs, not adding entirely new features.
src/bootstrap/test.rs
Outdated
} else { | ||
builder.info("No nodejs found, skipping \"src/test/rustdoc-js-parser\" tests"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please do not force-skip this. Make the default condition whether node
is installed instead.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
<code>* -> vec</code>)", | ||
"Search multiple things at once by splitting your query with comma (e.g., \ | ||
<code>str,u8</code> or <code>String,struct:Vec,test</code>)", | ||
"You can look for functions based on its arguments (e.g., \ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why does this need special syntax? There's already a "function parameters" tab.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It's not a special syntax or anything. Maybe I badly wrote it but it's simply that you can look at function arguments specifically if you want. Extra explanations:
item
This will look for item pretty much everywhere: in the items' name, in their arguments if they have any and in the returned type (if there is any).
item(arg)
It'll now look for item
only in item's name and will look for arg
only in arguments.
And so this:
(arg)
Will look for arg
only in arguments. This is what I meant here.
Except I didn't add any new syntax in here... All this was already supported before in very hackish ways. Test it on the current docs and you'll see. I just wrote a proper parser over the already existing syntax. |
... this seems pretty clearly not true? Rustdoc doesn't currently support double-quotes to mean "exact match", nor the tuple |
Oh you're right: it doesn't support specifying arguments directly (with However, it supports double quotes, it supports |
@camelid Then please tell what you think of the UI. ;) |
I really don't think we should be adding this. It exposes way too much of rustdoc's internals, and it'll likely confuse users. Instead, we should try to make rustdoc's search "Just Work" the way users expect. |
Also, the search doesn't seem to be working? |
Indeed... If we keep this approach, I'll check what's wrong.
This the big question: what do users expect? Maybe the approach I took here to allow to have more advanced check isn't the right way. Maybe we should simplify this as follows: generating the items "signature" and then simply run levenshtein on all of them and sort the result. But that approach would have so many downsides as well. Or did you simply suggest to not display this? |
Yes, that's what I suggest :) |
Fine by me! |
Trying to understand quotes for exact match better. On current nightly, if you search for https://doc.rust-lang.org/nightly/settings.html?search=str If you add quotation marks, you'll only get matches where the last component of the path matches "str" case-insensitively: https://doc.rust-lang.org/nightly/settings.html?search=%22str%22 That seems like pretty reasonable behavior. Am I right in guessing that we use quotes for this exact match meaning because that's how Google indicates exact match? https://support.google.com/websearch/answer/2466433?hl=en. Google's use of quotes for exact match originates in text search: normally a search for It's useful to know that history, because our use case for quotes is different: we never have multi-word searches (so far), and we don't do stemming. We do do sub-token matching, which is not something traditional search engines usually do because it's fairly expensive. For instance It looks like this PR intends to allow quotes for exact match in all positions. For instance, I could search for That seems like overkill, feature-wise. Instead, we should say:
By the way, other reviewers: you may notice that searches on the demo link are slower to load than on doc.rust-lang.org. I believe that's because I haven't set up automatic compression on rustdoc.crud.net, so the search JS is larger than it would be in real life. I'll try to fix that later this week. |
…eBNF does not allow elements without a name
b3c8cfb
to
5c6c1e1
Compare
Updated (with the correct handling of whitespaces for type filter this time...). |
This comment has been minimized.
This comment has been minimized.
--- rustdoc.json
+++ rustdoc-search-ref.json
@@ -3,7 +3,7 @@
"foundElems": 0,
"original": "mod: :",
"returned": [],
- "typeFilter": 0,
+ "typeFilter": -1,
"userQuery": "mod: :",
- "error": null
+ "error": "ERROR"
}
\ No newline at end of file
Error: "FAIL"
Query: mod: : |
And fixed this one as well. |
Well, all four-token checks pass when run against the fuzzer. They seem to be parsing the same language now! |
Awesome! Thanks a lot for writing the eBNF fuzzer! |
@bors r+ |
📌 Commit 4d26bde has been approved by |
Rollup of 5 pull requests Successful merges: - rust-lang#90630 (Create real parser for search queries) - rust-lang#96193 ([fuchsia] Add implementation for `current_exe`) - rust-lang#96196 (Remove assertion that all paths in `ShouldRun` exist) - rust-lang#96228 (Fix locations for intrinsics impls and change to links) - rust-lang#96236 (Add an explicit `Span` field to `OutlivesConstraint`) Failed merges: r? `@ghost` `@rustbot` modify labels: rollup
You can test it here.
This PR adds a real parser for the query engine in rustdoc. The parser is quite simple but it allows to makes query handling much easier. I added a new testsuite to ensure it works as expected and ran fuzzing checks on it for a few hours without problems.
So about the parser: as you can see in the screenshot, it handles recursive generics parsing. It also allows to set which item should use exact matching by adding double-quotes around it (look for
exact_search
in the screenshot).Now about the query engine itself: I simplified it a lot thanks to the parsed query. It behaves mostly the same when there is only one argument, but is much more powerful when there are more than one.
When making this change, we also removed the support for multi-query.
PS: A big part of the PR is tests and test-related code. :)
r? @camelid