Tokenizer returning pointers makes calculation of position info more complicated #97973

Closed
lysnikolaou opened this issue Oct 6, 2022 · 0 comments · Fixed by #97984
Labels
type-feature A feature request or enhancement

Comments

@lysnikolaou
Member

Because the tokenizer only returns pointers to the beginning and the end of the token, calculating line numbers and column offsets is more complicated than it needs to be.

Feature or enhancement

Instead of the tokenizer returning a token type and setting two pointers, we want it to return the token type (as it does now) and fill in a struct token that holds the following information:

  1. Pointers to beginning and end
  2. Location information (lineno, col_offset, etc.)
  3. Level (the level in the parenstack)

This way the parser will have a much easier job of setting line numbers and column offsets in the generated AST nodes, and it will make some of our work on f-string parsing easier.
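To make the proposal concrete, here is a minimal sketch of what such a token record could look like, together with a toy scanner that fills it in. This is not the actual CPython tokenizer: the field names, the `toy_*` helpers, and the `TOY_*` token types are assumptions made up for illustration, and the scanner only understands names, single-character operators, and whitespace.

```c
#include <ctype.h>
#include <stddef.h>

/* Hypothetical token record mirroring the three items listed above.
   Field names are illustrative assumptions, not the real definition. */
struct token {
    const char *start, *end;        /* 1. pointers to beginning and end */
    int lineno, col_offset;         /* 2. location of the first byte */
    int end_lineno, end_col_offset; /*    location just past the last byte */
    int level;                      /* 3. depth in the parenthesis stack */
};

/* Hypothetical scanner state, used only by this sketch. */
struct toy_tokenizer {
    const char *cur;        /* next unread byte */
    const char *line_start; /* first byte of the current line */
    int lineno;             /* 1-based current line number */
    int level;              /* current parenthesis nesting depth */
};

enum { TOY_END = 0, TOY_NAME = 1, TOY_OP = 2 };

static void toy_init(struct toy_tokenizer *tok, const char *src)
{
    tok->cur = src;
    tok->line_start = src;
    tok->lineno = 1;
    tok->level = 0;
}

/* Return the token type and fill *out with all position information,
   so the caller never has to recompute it from raw pointers. */
static int toy_get(struct toy_tokenizer *tok, struct token *out)
{
    /* Skip whitespace, tracking line starts as we go. */
    while (*tok->cur == ' ' || *tok->cur == '\t' ||
           *tok->cur == '\r' || *tok->cur == '\n') {
        if (*tok->cur == '\n') {
            tok->lineno++;
            tok->line_start = tok->cur + 1;
        }
        tok->cur++;
    }

    out->start = tok->cur;
    out->lineno = tok->lineno;
    out->col_offset = (int)(tok->cur - tok->line_start);

    int type;
    if (*tok->cur == '\0') {
        type = TOY_END;
    }
    else if (isalnum((unsigned char)*tok->cur) || *tok->cur == '_') {
        while (isalnum((unsigned char)*tok->cur) || *tok->cur == '_') {
            tok->cur++;
        }
        type = TOY_NAME;
    }
    else {
        char c = *tok->cur++;
        if (c == '(') {
            tok->level++;
        }
        else if (c == ')' && tok->level > 0) {
            tok->level--;
        }
        type = TOY_OP;
    }

    /* In this sketch, level is recorded after the token is consumed. */
    out->end = tok->cur;
    out->end_lineno = tok->lineno;
    out->end_col_offset = (int)(tok->cur - tok->line_start);
    out->level = tok->level;
    return type;
}
```

With this shape, a caller that receives a token for `b` in the input `"f(a\n  b)"` can read `lineno == 2`, `col_offset == 2`, and `level == 1` directly from the struct, instead of walking back from the `start` pointer to count newlines. Column offsets here are byte offsets from the start of the line, which is a simplifying assumption of the sketch.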

@lysnikolaou lysnikolaou added the type-feature A feature request or enhancement label Oct 6, 2022
@lysnikolaou lysnikolaou self-assigned this Oct 6, 2022
lysnikolaou added a commit to lysnikolaou/cpython that referenced this issue Oct 6, 2022
Right now, the tokenizer only returns type and two pointers to the
start and end of the token. This PR modifies the tokenizer to return
the type and set all of the necessary information, so that the parser
does not have to do this.
lysnikolaou added a commit to lysnikolaou/cpython that referenced this issue Oct 6, 2022
Right now, the tokenizer only returns type and two pointers to the start and end of the token.
This PR modifies the tokenizer to return the type and set all of the necessary information,
so that the parser does not have to do this.
lysnikolaou added a commit that referenced this issue Oct 6, 2022
Right now, the tokenizer only returns type and two pointers to the start and end of the token.
This PR modifies the tokenizer to return the type and set all of the necessary information,
so that the parser does not have to do this.
carljm added a commit to carljm/cpython that referenced this issue Oct 8, 2022
* main:
  pythonGH-97002: Prevent `_PyInterpreterFrame`s from backing more than one `PyFrameObject` (pythonGH-97996)
  pythongh-97973: Return all necessary information from the tokenizer (pythonGH-97984)
  fixes pythongh-96078: os.sched_yield release the GIL while calling sched_yield(2). (pythongh-97965)
  pythongh-65961: Do not rely solely on `__cached__` (pythonGH-97990)
  pythongh-97850: Remove the open issues section from the import reference (python#97935)
  Docs: pin sphinx-lint (pythonGH-97992)
  pythongh-94590: add signatures to operator itemgetter, attrgetter, methodcaller (python#94591)
  Add Pynche's move to the What's new in 3.11 (python#97974)
  pythongh-97781: Apply changes from importlib_metadata 5. (pythonGH-97785)
  pythongh-86482: Document assignment expression need for ()s (python#23291)
  pythongh-97943: PyFunction_GetAnnotations should return a borrowed reference. (python#97949)
mpage pushed a commit to mpage/cpython that referenced this issue Oct 11, 2022
…ythonGH-97984)

Right now, the tokenizer only returns type and two pointers to the start and end of the token.
This PR modifies the tokenizer to return the type and set all of the necessary information,
so that the parser does not have to do this.