Expose raw byte position in position information. #31

samhocevar · 2016-09-24T10:16:03Z

In some cases it may be desirable to know the byte offset in addition
to the line / column information, for instance when parsing binary
files, or when feeding the parser with partial data.

samhocevar · 2016-09-24T10:23:34Z

This commit changes the API, so I am not sure it is acceptable
as is. However, the feature is desirable for my purposes, and I
found it easier to modify the PEGTL than to implement a byte
tracking mechanism using additional state variables, which
I would then have to add to every parser needing it.

I am also not sure how to format the byte offset (or whether I should
leave it out) in position_info::operator<<.

If this does not fit within your overall plans, maybe you can suggest a
more elegant solution?
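
For illustration, one possible way to track and print a byte offset next to line and column is sketched below. This is not the code from the pull request and not the PEGTL API: `tracked_position`, `bump`, `to_string`, and the `line:column(byte)` output format are all hypothetical choices made for this example.

```cpp
#include <cassert>
#include <cstddef>
#include <ostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for a parser's position record; not the PEGTL type.
struct tracked_position
{
   std::size_t byte = 0;    // bytes consumed since the start of the input
   std::size_t line = 1;    // 1-based line number
   std::size_t column = 0;  // bytes consumed since the start of the current line

   // Advance over `size` bytes starting at `data`; the column restarts after each newline.
   void bump( const char* data, const std::size_t size )
   {
      assert( data != nullptr || size == 0 );
      for( std::size_t i = 0; i < size; ++i ) {
         if( data[ i ] == '\n' ) {
            ++line;
            column = 0;
         }
         else {
            ++column;
         }
      }
      byte += size;
   }
};

// Assumed format: line and column first, byte offset in parentheses.
inline std::ostream& operator<<( std::ostream& o, const tracked_position& p )
{
   return o << p.line << ':' << p.column << '(' << p.byte << ')';
}

// Convenience helper that renders a position through operator<<.
inline std::string to_string( const tracked_position& p )
{
   std::ostringstream o;
   o << p;
   return o.str();
}
```

Keeping the byte offset in parentheses leaves the familiar line:column prefix intact for tools that already read it, while callers that need the offset programmatically can read the `byte` member directly.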

coveralls · 2016-09-24T10:30:24Z

Coverage remained the same at 100.0% when pulling e5f8968 on lolengine:byte-position-in-input into 84a64ef on ColinH:master.

ColinH · 2016-09-26T07:21:50Z

Thanks for the pull request. I won't merge it right now because it might interfere with some other things that we have planned, but I'll keep it as a reminder and for possible future inclusion if it turns out to be sufficiently independent of those other changes.

ColinH · 2016-11-29T16:35:32Z

Do you later use the byte position programmatically, or do you only need the human-readable form in the exception message?

samhocevar · 2016-12-01T17:55:29Z

I use the byte position programmatically, yes.

My current use case here is the creation of a transpiler that does not need an intermediate AST. The parser analyses the input and marks parts of the code using their byte offsets. A post-process then uses search/replace to perform the language transformation, and if a replacement changes the size of a chunk, the offsets located after it get updated. Using line/column notation would make the search/replace work more complex.
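
The offset bookkeeping described above could look roughly like the following sketch. It is not taken from the transpiler in question: `marked_span` and `replace_span` are hypothetical names, and the code assumes that marked regions never overlap.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Hypothetical record: a region of the input marked by the parser,
// identified only by its byte offset and length.
struct marked_span
{
   std::size_t offset;  // byte offset from the start of the input
   std::size_t length;  // number of bytes covered
};

// Replace the bytes covered by spans[ index ] with `replacement`, then shift
// every other span that starts at or after the end of the replaced chunk.
// Assumes the spans do not overlap.
inline void replace_span( std::string& text,
                          std::vector< marked_span >& spans,
                          const std::size_t index,
                          const std::string& replacement )
{
   assert( index < spans.size() );
   const marked_span old = spans[ index ];
   assert( old.offset + old.length <= text.size() );

   text.replace( old.offset, old.length, replacement );
   spans[ index ].length = replacement.size();

   const std::size_t old_end = old.offset + old.length;
   for( std::size_t i = 0; i < spans.size(); ++i ) {
      if( ( i != index ) && ( spans[ i ].offset >= old_end ) ) {
         // Add before subtracting so the unsigned arithmetic cannot wrap.
         spans[ i ].offset = spans[ i ].offset + replacement.size() - old.length;
      }
   }
}
```

With byte offsets, each replacement is a single `std::string::replace` call plus one addition per later span. With line/column pairs, a replacement that inserts or removes a newline would require recomputing both numbers for every later span.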

ColinH · 2016-12-02T09:39:15Z

Ok, thanks, another small question: Do you ever need the byte offset together with column and line, or do you not need the column and line in cases where you use the byte offset?

samhocevar · 2016-12-02T12:53:55Z

In my case, unless I am parsing binary data, I think I always need column and line in order to provide meaningful error reporting, regardless of whether my action class uses the byte offset.

ColinH · 2016-12-06T17:03:59Z

I've been trying to make this more flexible, to not always keep all three numbers, but it seems to be more work than anticipated, and only for a small optimisation. Your pull-request is now unfortunately out of date, but some of the changes will be simpler with the latest commits. I will now look into re-doing this.

ColinH closed this in 267e992 Dec 6, 2016

d-frey assigned ColinH Jan 9, 2018

d-frey added the enhancement label Jan 9, 2018

schrewe mentioned this pull request Feb 8, 2021

Stack overflow in json parser when processing lots of "[" #256

Closed
