Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Improve syntax highlighting #3

Closed
4 tasks done
nineteendo opened this issue Dec 2, 2023 · 21 comments
Closed
4 tasks done

Improve syntax highlighting #3

nineteendo opened this issue Dec 2, 2023 · 21 comments
Assignees
Labels
enhancement New feature or request

Comments

@nineteendo
Copy link
Contributor

nineteendo commented Dec 2, 2023

The extension doesn't handle my byte optimised program very well:

Screenshot 2023-12-03 at 11 03 19

Easy problems:

  • DelVar: is matched with (?<=[^a-z]|Ans)DelVar(?=[^a-zA-Z]|$) instead of (?<=[^a-z]|Ans|^)DelVar(?=[^a-zA-Z]|$)
  • logical operations: are scoped as keyword.operator.8xp instead of keyword.operator.logical.8xp (or, and, not(), xor)

Harder problems:

  • DelVar: instructions can be appended to a DelVar without using a colon
  • implicit multiplication: variables and functions used in an implicit multiplication aren't highlighted
@TIny-Hacker TIny-Hacker added the enhancement New feature or request label Dec 2, 2023
@TIny-Hacker TIny-Hacker self-assigned this Dec 2, 2023
@TIny-Hacker
Copy link
Owner

Thanks! I'll try to look into this.

Variables preceded by are not highlighted because the extension follows the TI-BASIC -> text scheme SourceCoder3 uses, which uses |L instead. Some of those other things should probably be recognized so I'll have to figure out what's going wrong.

@nineteendo
Copy link
Contributor Author

OK, I've converted the program using SourceCoder3, and indented it properly (the End on line 22 acts like a continue).

@nineteendo
Copy link
Contributor Author

I found the solution to the easy problems.

@TIny-Hacker
Copy link
Owner

I've implemented your changes in c7aca74. I've published the update on the Marketplace and the Releases tab as well.

I've been a bit busy this past week but I'll try to look into the other two things. I guess it's time to figure out Regex again 😆

@nineteendo
Copy link
Contributor Author

nineteendo commented Dec 14, 2023

It looks like you need keyword.operator.expression.8xp instead of keyword.operator.logical.8xp:

  • keyword.operator.new: new
  • keyword.operator.expression: in, instanceof, delete, of, void, typeof, is
  • keyword.operator.cast: const_cast, dynamic_cast, reinterpret_cast, static_cast
  • keyword.operator.sizeof: sizeof
  • keyword.operator.alignof: alignof
  • keyword.operator.typeid: typeid
  • keyword.operator.alignas: alignas
  • keyword.operator.instanceof: instanceof
  • keyword.operator.logical.python: and, or, not, in, is (in Python)
  • keyword.operator.wordlike

keyword.operator.logical is not in the list.

@TIny-Hacker
Copy link
Owner

As far as I'm aware, how things are highlighted depends on the theme. If a theme doesn't have a color for keyword.operator.logical, it won't be highlighted any differently. IDK if the best move is to keep changing it until it's something that every theme uses, or keep it more with the correct category and then end up with it only being highlighted in some themes.

@nineteendo
Copy link
Contributor Author

Yes, you're right, that's controlled by the theme.
The reason they aren't highlighted right now is because in the default themes keyword.operator overrides the color of keyword with black. Which isn't overridden again, because it doesn't belong to the previously mentioned categories.

To fix this you can classify them as keyword.operator.expression.
You don't need to make a release just to fix this.

@TIny-Hacker
Copy link
Owner

TIny-Hacker commented Jan 6, 2024

I've looked into the two more difficult issues again and I'm wondering if the best course of action is to just match the token rather than also checking for correct syntax. This is what TI-Planet's Project Builder does, for example. What would you think of that solution?

To clarify: Currently stuff is matched with word breaks for more correct syntax, but removing those would properly highlight the token at all times. However, there's less error checking because checking for just If means something like abc123If would still recognize If .

@nineteendo
Copy link
Contributor Author

nineteendo commented Jan 6, 2024

Sounds good, just make sure to match the tokens in the correct order and use lookbehind and lookahead:

  • Control keywords: Disp , Goto , If , ... are always followed by a space.
  • Functions: DS<(, For(, Get(, ... are always followed by a left parenthesis.
  • Logical operators: and, or, xor, ... are always between spaces.
  • Statistic variables: [|a], [|b], [|c], ... are always preceded by a vertical line and enclosed in square brackets.

Good luck!

Edit: Escaping parenthesis is possible: ANOVA\(. is A * N * O * V * A * (...
Edit 2: You're also not detecting matrices yet:

  • [A], [B], [C], ...

@TIny-Hacker
Copy link
Owner

I've made the changes and pushed them, along with updating the latest release. Let me know if you feel like the issue has been resolved or there's still changes to make :)

@nineteendo
Copy link
Contributor Author

nineteendo commented Jan 7, 2024

AMAZING work, just some minor polishing...

  • DelVar and Pause must be followed by a space.
  • rand and GetKey don't need to be followed by a bracket.

Note that this is NOT a complete list.

@TIny-Hacker
Copy link
Owner

I've updated the extension and did a little testing of my own and it seems that things are properly detected, at least the way I think they should.

Matr>list( isn't detected because the extension expects Matr▶list, and L1 isn't detected as list 1 because |L1 is.

@nineteendo
Copy link
Contributor Author

nineteendo commented Jan 24, 2024

I'm afraid we're not done yet. I've spent the entire day creating a file with every token and they aren't properly highlighted:

Screenshot 2024-01-24 at 16 19 45

Persisting problems (besides the final boss):

  • {r1}, {r2}, {r3}, ... are still not highlighted correctly.
  • ->ABCD is still detected as 6 tokens: -, >, A, B, C and D instead of 2: -> and ABCD (list assignment). I suggest going through the file with > Developer: Inspect Editor Tokens and Scopes to check for this:
    Screenshot 2024-01-24 at 17 17 46
  • Matr>list is still not highlighted correctly.
  • L1 is still not highlighted correctly (list 1).
  • The regular expressions are still a bit unorganised. E.g:
    [
        {
            "name": "keyword.control.8xp",
            "match": "(If |Then|Else|For(?=\\()|While |Repeat |End|Pause |Lbl |Goto |Wait |IS>(?=\\()|DS<(?=\\()|Menu(?=\\()|Return|Stop|GraphStyle(?=\\()|GraphColor(?=\\()|OpenLib(?=\\()|ExecLib )"
        }
    ]
    I believe this could be split up based on syntax, making the extension more maintainable:
    [
        {
            "name": "keyword.control.8xp",
            "match": "(If|While|Repeat|Pause|Lbl|Goto|Wait|ExecLib) "
        },
        {
            "name": "keyword.control.8xp",
            "match": "Then|Else|End|Return|Stop"
        },
        {
            "name": "keyword.control.8xp",
            "match": "(For|IS>|DS<|Menu|GraphStyle|GraphColor|OpenLib)(?=\\()"
        }
    ]

I believe that Source Coder 3 does something wrong with the accented letters, so you don't have to worry about that.

It also can't load this file even though TI Connect can.

@nineteendo nineteendo changed the title Support "unconventional" code Improve syntax highlighting Jan 24, 2024
@TIny-Hacker
Copy link
Owner

TIny-Hacker commented Jan 25, 2024

Thanks so much for taking the time to create that file! I found it very useful for debugging highlighting. Have a look at d05c76d and let me know what you think. There are a few tokens that were not properly highlighted because they were missing a ( or at the end, but I've updated the tokens file accordingly to match the way they are shown in TI-OS (I think it was only ExecLib and getDtStr( but there may have been a few more.)

Let me know what you think and when it's ready I'll publish a new release!

@nineteendo
Copy link
Contributor Author

nineteendo commented Jan 25, 2024

Looks fairly good, but 3 remarks:

  • (?<=(Goto|Lbl) )[0-9A-Zθ]{1,2}, it's worth noting that this doesn't detect labels in Menu( although I'm not sure how to fix that.
  • It looks like this won't work if brackets aren't closed: \\√\\((.*)\\), \\√\\( should be fine.
  • So I would not use [A-Zθ]* (which should even have been [A-Zθ]+) because this might lead to many false positives.
    Instead replace \\|L[A-Zθ][0-9A-Zθ]{0,4} with ((?<=->|→)|\\|L)[A-Zθ][0-9A-Zθ]{0,4} to detect these list assignments. This might still not detect them properly in some list functions, but that's fine.

Hmm, are these missing characters getting fixed for Source Coder 3? Because it would be a pain to do manually.

@TIny-Hacker
Copy link
Owner

TIny-Hacker commented Jan 25, 2024

The changes are made! I'll try to look into getting it corrected in SourceCoder and confirm what's going on before I publish a release.

@nineteendo
Copy link
Contributor Author

Hmm, (?<=(Goto) )|(?<=(Lbl) )[0-9A-Zθ]{1,2} is now detecting Goto without a label. I just wanted to make note of the issue with Menu(.

@TIny-Hacker
Copy link
Owner

Looks like the issue with SourceCoder has been corrected: http://ceme.tech/p305502

I think I need to change the Goto detection to ((?<=(Goto) )|(?<=(Lbl) ))[0-9A-Zθ]{1,2} instead.

@nineteendo
Copy link
Contributor Author

nineteendo commented Jan 26, 2024

Seq seems to be fixed, but still 6 differences remain.

Some more ideas:

  • You don't need brackets around (Asm...Sto)), (CENTER...RIGHT), (BLUE...DARKGRAY), (Blue...DarkGray), (Xmin...PlotStep), (ZXmin...ZPlotStep), (\\[n\\]...Q3), (Tbl...Tbl) and (theta...tau)
  • I also think that we can just put DelVar here: (If...ExecLib|DelVar)
  • I believe it's better to use (?:(?<=Goto )|(?<=Lbl ))[0-9A-Zθ]{1,2} because you don't need to capture an empty string.
  • You can use (Pic|Image)[0-9]

@TIny-Hacker
Copy link
Owner

I feel like this is in a good enough state that I'm going to mark this closed for now (unless there's any significant issues left). I'm also considering changing token syntax slightly in #4. Feel free to make a new issue or submit a PR if there's any major issues you're still encountering, but looking through the tokens file you provided everything looks okay to me.

@nineteendo
Copy link
Contributor Author

I've updated the TOKENS2.8xp.zip.

@github-staff github-staff deleted a comment Apr 26, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement New feature or request
Projects
None yet
Development

No branches or pull requests

2 participants