-
-
Notifications
You must be signed in to change notification settings - Fork 538
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Improve AST with respect to estree #2854
Comments
The "modifiers" concept was ported from a contributor from a long time ago. We may need to reevaluate this decision and make larger changes. The underlying problem is how recoverable our parsing should be. |
Don't include `trailing_comma` fields in JSON AST, for compat with ESTree (#2854).
This comment was marked as resolved.
This comment was marked as resolved.
|
…ortDeclaration.specifiers` (#8560) refs #2854 (comment)
To surface all the obscure differences between our JS-side AST and ESTree, I think we ideally need conformance tests. Initially we'll have tons of failures, but we can chip away at them until there are none. I suggest:
The Acorn AST JSON files will take time to compute and will be many many files. We could:
Then the conformance tests could be written purely in Rust, and we only need to run Acorn once, not every time conformance runs. That'll only cover us for JS syntax. I guess we could do similar for TS ASTs using fixtures from TypeScript repo and using typescript-eslint's parser to generate the JSON files. But personally I think we should concentrate on the JS AST first. Does that sound like a good idea? If so, does anyone have an interest in building the infrastructure? |
I think that a tool like this would be useful for interfacing with other tools that manipulate source code (my initial goal that I still want to pursue at some point is having a fast prettier parser). And for me source code is TS code and I don't know anyone still thinking that people write mainly JS (expect ESLint). So I think the infra should target TS code in priority, but I don't have the bandwidth to do it so maybe Test262 is better than nothing. |
Just to clarify: The target is very much TS. But as TS is a superset of JS, in my opinion it'd be easier if we tackle JS first, and then go on to TS. I am also keen to make sure that there aren't any places where JS and TS ASTs diverge from each other (aside from TS AST adding new types, and new properties to existing JS types). i.e. I want to make sure that we can satisfy both ESTree and TS AST compatibility with a single AST shape. If we can't, that will inform how the serializer needs to be designed (which is a work in progress - #6347). |
- part of #2854 This PR attempts to handle estree ast incompatibility of `Function.params: FormalParameters` as mentioned in the above issue: > `FormalParameters` is closer to ESTree now, but should be inlined directly into `<node>.params` in multiple nodes. Estree spec has `Function.params: Pattern[]` https://github.com/estree/estree/blob/master/es5.md#functions, but oxc already has `interface Pattern`, so I named it to `Function.params: ParamPattern[]` for now. Also I'm not sure about the testing (I suppose that's a part of #8630), so I snapshoted one example code. For comparison, here is acorn's output https://astexplorer.net/#/gist/25138c0605f82dcfc1a8fd363dc2a681/5ad30d36c9f276519063e6fd2e340c113d8c85b0 --------- Co-authored-by: overlookmotel <theoverlookmotel@gmail.com>
Part of #2854 I added a bit weird macro `add_entry(computed = true)` to add constant. Likely this is not the best way. I would appreciate any suggestion for a better approach 🙏 --------- Co-authored-by: overlookmotel <theoverlookmotel@gmail.com>
part of #2854 ref: https://github.com/estree/estree/blob/master/es2015.md#programs I tried to move `via` customization on `struct SourceType`, but couldn't find a way maybe because struct is defined in `oxc_span`. For now, I made this as `via` on `Program::source_type` field.
Part of #2854 difff os estree_test262.snap ```diff -Positive Passed: 13042/44293 (29.44%) +Positive Passed: 16040/44293 (36.21%) ```
Part of #2854 ```diff -Positive Passed: 28924/44293 (65.30%) +Positive Passed: 30446/44293 (68.74%) ```
Part of #2854 difff os estree_test262.snap ```diff -Positive Passed: 13042/44293 (29.44%) +Positive Passed: 16040/44293 (36.21%) ```
Part of #2854 ```diff -Positive Passed: 28924/44293 (65.30%) +Positive Passed: 30446/44293 (68.74%) ```
Part of #2854 Related: #8972 (comment) I updated `append_to` macro to support concatenating two `Vec`. The logic is very similar so I reused it, but maybe it's better to introduce a new macro or maybe rename `append_to`?
…ression` (#8988) Part of #2854 Serialization of `ArrowFunctionExpression.body` requires knowing `ArrowFunctionExpression.expression: boolean`, so I generalized `ESTreeStructField::via` macro to cover this use case. This alone doesn't improve conformance since `ArrowFunctionExpression.generator/id` needs to be fixed first by #8980.
We are now at 99.88% on the conformance tests vs Acorn for JS syntax. What's left is: Two oddities:
And 2 groups of edge cases which we currently need to accept failure on:
With apologies, I do intend to destroy these fine results shortly by making the conformance test runner stricter - not ignoring extraneous fields, and requiring fields to be in the right order as well in the JSON. But all the test failures we'll get from those changes should be easy to fix. Main next step, once the last few tests listed above are fixed, will be to go on to TS and JSX. |
Looks like |
Consider this issue done when acorn estree compat is complete. We'll create another tracking issue for typescript eslint estree, but the core team will not spend time on it for the time being. |
All JS tests now passing except the small number which fail due to #3526 and #9029. #9073 filters out those test cases, which leaves 100% pass! Next step: Make the tester stricter to make sure no extraneous fields in our AST which aren't related to TS, and to ensure fields are in the right order (important for usability and consistent object "shapes" for JS engines). This will no doubt cause many of the passing tests to fail again, but they should be fairly quick to fix. I'm going to check what Rollup ( |
Looks like Rollup's
@Boshen I assume this means that we do need to make our JS-side AST conform to On the positive side, given how swiftly we've been able to achieve ESTree compatibility, and that our TS types are broadly similar to |
I don't think For jsx support, that's fairly recent addition in rollup https://github.com/rollup/rollup/releases/tag/v4.24.0, so I'd expect the downstream usage is very small or maybe none. For Vite ecosystem specifically, estree without jsx and typescript should suffice as far as I know. |
OK great! I do think we should support JSX, since Rollup does. Thankfully it'll be pretty simple as there's very few JSX types. I'll open a separate issue for that. |
Re-opening this issue for the remaining work on JS syntax - correct property order, and ensuring no extraneous properties in the AST (except those related to TS). |
#9285 ticks off those last 2. After that's merged, JS-side AST is 100% ESTree compatible. The new stricter conformance tester flushed out a few bugs, which are now fixed. And the new JSON serializer is 25% faster than All that remains is to decide how to plumb it into the NAPI/WASM bindings. |
Oxc v0.53.0 now returns an estree for direct consumption in JavaScript through napi and wasm APIs, no more ugly work arounds. |
@Boshen is there a way to access the information about whether an identifier is a reference one or not, like it was possible before? (in javascript oxc-parser) Or is it only estree from now on? |
Nope, it's estree, people wants it, it has less information than our Rust AST, sadly we can't go back now. |
I see. Thank you! |
@cernymatej Sorry for belated response. We may consider in the future providing a way to access this information on JS side. But I'm afraid that's likely a way off. ESTree compatibility was the primary goal, as it's what's needed for interop with other tools. |
Great, thank you for the info! 🙂 |
This is not exhaustive but should cover all 'types' of differences to resolve:
setProp(node, "typeArguments", node.typeParameters)
is not a mistake, it's because TSESLint deprecatedtypeParameters
in multiple nodes fortypeArguments
but keep both for now for backward compatibility. I'm not sure what is the right move for OXC here. For ref: fix: rename typeParameters to typeArguments where needed typescript-eslint/typescript-eslint#5384prefix: true
onUnaryExpression
. I don't yet know how to do that in serdeStaticMemberExpression
,ComputedMemberExpression
&PrivateFieldExpression
need to be serialized toMemberExpression
with few changesStringLiteral
,BooleanLiteral
, ... need to be serialized asLiteral
with raw value set. For BigInt, because it's not in the JSON spec, it would still requires some work on the JS side but this can be done as part of the reviver function when desrializing.modifiers
is not an ESTree concept. So anytime there is this prop in OXC it probably means you need to set one or two boolean on the equivalent node. Examples areVariableDeclaration.declare
,ClassExpression.abstract
,TSEnumDeclaration.const
. My take here is that it will make the OXC AST more clean and easy to explore to have named booleans on nodes instead of a comment to say which one are 'allowed'. But I don't know the implication of this.TSInterfaceDeclaration.extends
,ClassExpression.implements
FunctionBody
is aBlockStatement
in ESTreeFormalParameters
is closer to ESTree now, but should be inlined directly into<node>.params
in multiple nodesTSMappedTypeModifierOperator
isboolean | '-' | '+' | undefined
in TSESTree, I think my conversion is incomplete at the momentUsingDeclaration
is aVariableDeclaration
(Not done in my wrapper)Some changes could be seen as rename only, but as soon we touch the name of the nodes the type generation became complex to handle that why I didn't push more PRs for that for now.
For info the
TSUnionType
hack is really prettier specific.But the biggest blocker to make my repo public is the utf16 conversion. The time to print some large files goes in seconds with the dump implementation, and because I saw you plan to solve that I don't want to optimize for it.
Some good news is that my repo successfully gave the same output of prettier on ~500 TSX files, ~8k dts files & 10k files inside my node_modules. This is obviously a very biased dataset, but still promising!
Originally posted by @ArnaudBarre in #2463 (comment)
The text was updated successfully, but these errors were encountered: