Keyword extraction doesn't work when optional regexp precedes a keyword #1404
Comments
I think it’s because your optional regex conflicts with ‘identifier’, which is your word token. Is there a reason you don’t want to use ‘identifier’ as the optional part of ‘sigil’?
I actually simplified the identifier, the actual version is more like
On top of that, if we replace the optional regexp with
Yeah, that makes sense, since all of those alphabetical tokens overlap with your The problem is when parsing this:
Without keyword extraction, there are no lexical conflicts. But suppose that Tree-sitter did perform keyword extraction on the You can probably tweak your grammar to make keyword extraction possible though. If you list Alternatively, as I mentioned above, I would just generalize
Thanks for explaining, this makes sense :) In the example above, moving
module.exports = grammar({
  name: "test",
  word: ($) => $.identifier,
  rules: {
    source: ($) => seq(choice($.integer, $.sigil), "or", $.integer),
    integer: ($) => /\d+/,
    sigil: ($) => seq("~", "()", optional(/[a-zA-Z]+/)),
    identifier: ($) => /[\p{ID_Start}][\p{ID_Continue}]*[?!]?/u,
  },
});
However it no longer works when using
Ah
Either way, gonna close the issue since it's been answered :) Thanks for all the help!
Let's assume a word operator as in `1 or 1`. We don't want `1 or1` to be valid, so we use keyword extraction so that `or1` is consumed at once. Here's the stripped down grammar:

The meaning of `sigil` or its exact content is not particularly relevant, the noteworthy fact is the optional regexp at the end. Also, given the `~` and `()` this path should be quickly rejected when parsing `1 or1`.

Even with keyword extraction, this grammar doesn't produce an error for `1 or1`. The problem seems to be in the possible `$.sigil` left operand, more specifically in the optional regexp. As soon as we make the regexp not optional, the grammar properly errors for `1 or1`.

(tree-sitter-cli: 0.20.0)
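
For readers less familiar with the idea, here is a rough sketch of the behaviour the report expects from keyword extraction: lex the whole word first, then check whether it is exactly the keyword. This is a simplified model in plain JavaScript, not tree-sitter's actual lexer. The function names `matchKeywordNaive` and `matchKeywordExtracted` are made up for illustration, and `WORD` reuses the `identifier` regexp from the grammar quoted in the comments.

```javascript
// Simplified model only; tree-sitter's real lexer works differently internally.
// WORD mirrors the `identifier` rule used as the word token in the thread.
const WORD = /^[\p{ID_Start}][\p{ID_Continue}]*[?!]?/u;

// Naive matching: the keyword is accepted as a plain string prefix,
// so "or1" is read as the keyword "or" followed by "1".
function matchKeywordNaive(input, keyword) {
  return input.startsWith(keyword) ? keyword.length : 0;
}

// Keyword-extraction-style matching: consume the whole word first,
// then accept the keyword only if the word is exactly the keyword.
function matchKeywordExtracted(input, keyword) {
  const m = WORD.exec(input);
  if (m === null) return 0;
  return m[0] === keyword ? keyword.length : 0;
}

console.log(matchKeywordNaive("or1", "or"));      // 2: "or" accepted, "1" left over
console.log(matchKeywordExtracted("or1", "or"));  // 0: the word is "or1", not "or"
console.log(matchKeywordExtracted("or 1", "or")); // 2: the word is exactly "or"
```

In this model `1 or1` is rejected because `or1` is a single word. The issue is that the grammar did not get this behaviour while the optional regexp was present in `sigil`.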