Question: Access/Match Last Consumed Character(s) of Input #144
Issue Description

I wrote a very basic parser for a small subset of YAML using PEGTL. While I was able to construct rules for most of the YAML specification without too many problems, there is one grammar rule that my code currently does not handle. The spec defines the rule

ns-plain-char(c) ::= ( ns-plain-safe(c) - ":" - "#" )
                   | ( /* An ns-char preceding */ "#" )
                   | ( ":" /* Followed by an ns-plain-safe(c) */ )

The problem here is the comment “An ns-char preceding” […]

One option to work around this issue would be to create a custom rule that checks the last character in the input. However, as far as I can tell, the input class only provides methods to check the text the parser has not consumed yet ( […]
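To make the look-behind requirement concrete, here is a minimal sketch of the rule as a plain C++ predicate over a buffer. It does not use PEGTL at all. The helper names `is_ns_char` and `is_ns_plain_char_at` are hypothetical, the input is assumed to be ASCII-only, and a block context is assumed, where ns-plain-safe(c) is the same as ns-char.

```cpp
#include <cstddef>
#include <string_view>

// Simplified ns-char: any byte that is not a space, tab, or line break.
// Assumption: ASCII-only input; the real rule is defined over Unicode code points.
inline bool is_ns_char(char c) {
    return c != ' ' && c != '\t' && c != '\n' && c != '\r';
}

// Hypothetical helper: does text[pos] qualify as ns-plain-char?
// Assumption: block context, so ns-plain-safe(c) is treated as ns-char.
inline bool is_ns_plain_char_at(std::string_view text, std::size_t pos) {
    if (pos >= text.size()) {
        return false;
    }
    const char c = text[pos];
    if (c == '#') {
        // Look-behind: the byte before pos has already been consumed.
        return pos > 0 && is_ns_char(text[pos - 1]);
    }
    if (c == ':') {
        // Look-ahead: the byte after pos has not been consumed yet.
        return pos + 1 < text.size() && is_ns_char(text[pos + 1]);
    }
    return is_ns_char(c);
}
```

Only the "#" branch needs data that lies before the current position, which is exactly the part an input that exposes just the unconsumed text cannot answer.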
Replies: 6 comments
Scanning backwards is not properly supported because the usual (?) way with PEGs is to make use of the infinite look-ahead and disambiguate in advance. Alternatively, you can rework the grammar in other ways, deferring decisions until as late as possible.

Now, you would presumably prefer to use the YAML grammar as is, without major refactoring. The memory_input now also remembers where it started parsing, so you could actually check whether there are bytes between begin() and current(), and implement a kind of backward-scanning UTF-8 decoding function.

So it's a hassle, but it's possible, and I couldn't say whether there is a more elegant solution without thinking about it some more.
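A backward-scanning UTF-8 decoder of the kind suggested here could look roughly like the sketch below. It is a standalone function over two pointers, not part of PEGTL, and the name `previous_code_point` is hypothetical. It only checks the byte structure of the sequence; it does not reject overlong encodings or surrogates.

```cpp
#include <optional>

// Decode the code point that ends immediately before `current`, without ever
// reading before `begin`. Returns std::nullopt if nothing precedes `current`
// or if the preceding bytes are not a structurally valid UTF-8 sequence.
inline std::optional<char32_t> previous_code_point(const char* begin, const char* current) {
    const char* p = current;
    int trailing = 0;  // number of continuation bytes (10xxxxxx) stepped over
    while (p != begin) {
        --p;
        const auto byte = static_cast<unsigned char>(*p);
        if ((byte & 0xC0) == 0x80) {
            // Continuation byte: keep walking back, but at most 3 of them.
            if (++trailing > 3) {
                return std::nullopt;
            }
            continue;
        }
        // Lead byte (or ASCII): its pattern must match the bytes stepped over.
        char32_t cp = 0;
        int expected = 0;
        if (byte < 0x80) {
            cp = byte;
            expected = 0;
        } else if ((byte & 0xE0) == 0xC0) {
            cp = byte & 0x1F;
            expected = 1;
        } else if ((byte & 0xF0) == 0xE0) {
            cp = byte & 0x0F;
            expected = 2;
        } else if ((byte & 0xF8) == 0xF0) {
            cp = byte & 0x07;
            expected = 3;
        } else {
            return std::nullopt;
        }
        if (expected != trailing) {
            return std::nullopt;
        }
        // Append the payload bits of each continuation byte.
        for (int i = 1; i <= trailing; ++i) {
            cp = (cp << 6) | (static_cast<unsigned char>(p[i]) & 0x3F);
        }
        return cp;
    }
    return std::nullopt;  // reached `begin` without finding a lead byte
}
```

Inside a custom rule, the two pointers would presumably come from the input's begin() and current() as described above; how such a rule is declared depends on the PEGTL version in use, so that part is left out here.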
Thank you for the detailed answer. I used the second approach you described in my project. As far as I can tell, it seems to work.
How are you bridging the gap between the input's […]
I guess I do not do that 😄. I just assumed […] Is there an easy method to handle the encoding?
I changed the rule […]
There's no easy way (that we supply), glad that you managed to make it work (even if the hard way).