Skip to content
This repository has been archived by the owner on Jul 31, 2018. It is now read-only.

Replace section 5.1 with unambiguous JavaScript grammar. #33

Closed
wants to merge 1 commit into from
Closed

Replace section 5.1 with unambiguous JavaScript grammar. #33

wants to merge 1 commit into from

Conversation

jdalton
Copy link
Member

@jdalton jdalton commented Jun 24, 2016

This PR replaces section 5.1 of the ES6 Module Interoperability proposal with the UnambiguousJavaScriptGrammar proposal.

TL;DR

  • CJS and ES modules just work without new extensions, extra ceremony, or
    excessive scaffolding
  • Performance is generally on par with existing CJS module loading
  • Performance is significantly improved for ES modules over transpilation workflows
  • Change JS grammars for Script and Module to be unambiguous
  • Determine grammar of .js files by parsing as one grammar and falling back to others

@jdalton
Copy link
Member Author

jdalton commented Jun 24, 2016

Let me know if you'd like more of the UJSG proposal brought over, things regrouped, or split out.

A package opts-in to the Module goal by specifying `"module"` as the parse goal
field *(name not final)* in its `package.json`. Package dependencies are not
affected by the opt-in and may be a mix of CJS and ES module packages. If a parse
goal is not specified, then attempt to parse source text as the preferred goal.
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

probably need to be specific that CJS (I think) is the preferred goal

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

err .. Script "goal" that would be I suppose

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, you're correct. I'll update that.

@rvagg
Copy link
Member

rvagg commented Jun 27, 2016

So I guess there are a couple of ways forward here:

  • Come up with a position on whether this would be a preferred way forward if it were accepted by TC39, the spec was changed and engines (specifically V8) made changes to make this work; or
  • Agree that this is the preferred way forward (or not) then adopt this as our detection mechanism regardless of how TC39 reacts to this and effectively make it a de facto part of the spec because we don't handle the ambiguous case at all.

The only problem with the second option is that we'd have to build the detection code ourselves without being able to rely on V8, I assume, perhaps that would be as simple (?) as doing a Module parse, which, if successful, could be inspected for import and export and then fail if those don't exist.

Also, this doesn't include the "modules.root" property for fat packages, I would have thought that's a necessary part for a migration period (which could be quite long).

/cc @nodejs/collaborators - please chime in on this, it'd be great to have more than just the CTC engaged on the modules questions, I think we'd all appreciate more core team minds being applied to this than we currently have.

@ronkorving
Copy link

Since you're asking (and I'll keep it short), I would definitely wait for TC39's position on this. If an alternative solution comes out of this (again), we should probably re-evaluate. I see absolutely no reason to rush here. It's not like CommonJS is broken.

@vkurchatkin
Copy link

I don't think that parsing ourselves is a great idea, so detection part needs to be implemented in v8 first

@rvagg
Copy link
Member

rvagg commented Jun 27, 2016

@ronkorving even if we opt for waiting for TC39 to pass judgement, they may be looking to us to provide a signal wrt whether this is going to be something we'd even end up using. While the unambiguous grammar proposal serves more than just Node, we are the primary beneficiary and target of it. TC39 are obviously not so comfortable with the idea of changing a spec that they've already signed off on, but given the consternation around Node's preference for .mjs, if TC39 hears from us that we may be willing to give up on .mjs in favour of this if it gets through their hoops then they may be more likely to consider it. So it's a bit of a multi-party dance and the priority now is to get maximal input from Node Collaborators so we can figure out whether we are even willing to consider this path.

@ronkorving
Copy link

I wasn't gonna touch the topic of preference of the actual solution, but I for one much prefer this over a new file extension.

@vkurchatkin
Copy link

I'm -1. This is solution is not significantly better (or not better at all), but is much harder to implement and affects performance

@Fishrock123
Copy link

Fishrock123 commented Jun 27, 2016

@vkurchatkin How does it effect performance? It doesn't seem like this would actually significantly impact startup time, but we'd have to benchmark to say that it would, which is pretty much impossible without V8 implementing it.

@YurySolovyov
Copy link

@Fishrock123 I wonder then how

Performance is generally on par with existing CJS module loading

is achieved

@rvagg
Copy link
Member

rvagg commented Jun 27, 2016

@Fishrock123 it'll require a full double-parse unless the module type being loaded is the same as the default goal being parsed by the version of Node.js being used, and we can't really cache that between processes, only within a process. The .mjs proposal only costs a few extra stat calls.

@RReverser
Copy link
Member

RReverser commented Jun 27, 2016

it'll require a full double-parse

This really depends on the implementation. It's totally possible not to drop away all the parsing results, but only to delay strict early errors + things like octal literals until the end of the parsing (and, because V8 already uses lazy parsing, it should result in pretty small if no overhead).

@rvagg
Copy link
Member

rvagg commented Jun 27, 2016

I guess that's a call for V8 to make re optimisation, my understanding was that the two goals are handled completely separately in the current incarnation of Modules that V8 has implemented (but not shipped). It'd be nice if this was all handled upstream from Node but I doubt we'll get such a free ride.

@jdalton @joshgav can we pull in some ChakraCore folks in here as well to have them comment on where this might be implemented in a ChakraCore-backed Node and how much Node could get for free if this became part of the spec?

@ChALkeR
Copy link
Member

ChALkeR commented Jun 27, 2016

@RReverser

It's totally possible not to drop away all the parsing results, but only to delay strict early errors + things like octal literals until the end of the parsing (and, because V8 already uses lazy parsing, it should result in pretty small if no overhead).

Note that e.g. a top-level await(fo); would have a different meaning in scripts and modules, and it might happen that this would be correct in both of those, but have different results.

@RReverser
Copy link
Member

would have different meaning in scripts and modules, and it might happen that this would be correct in both of those, but have different results

Sure, but I guess you're talking about runtime semantics here, not parsing? Note that for distinguishing modules only parsing stage is important.

@ChALkeR
Copy link
Member

ChALkeR commented Jun 27, 2016

@RReverser In that example, await will be a function name for scripts, and it would likely be a keyword in modules case.

@RReverser
Copy link
Member

and it would likely be a keyword in modules case

Ah right. Another one to the list of delayed elements (like octal literals), although I agree it's somewhat trickier. (Btw, AFAIR it's not even part of the current proposal, only a future consideration?)

@vkurchatkin
Copy link

This really depends on the implementation. It's totally possible not to drop away all the parsing results, but only to delay strict early error

It can't be done outside of engine, and engines will probably do it only if it's a part of Ecma-262

@RReverser
Copy link
Member

It can't be done outside of engine, and engines will probably do it only if it's a part of Ecma-262

Just saying that it can be done efficiently on the level of engine, but if not, we can just rely on own parsing pass, which will be less efficient, but still work. Personally, I really like this idea independently of specific implementation details.

@bmeck
Copy link
Member

bmeck commented Jun 27, 2016

@rvagg modules.root is removed in prep for talking with TC39 and to simplify spec. We know a solution and I can make a separate EP for that which won't cause extraneous discussion with TC39.

We do need v8 buy in for the case of export {} since it does not export anything we cannot detect that it exists in the code. import ... always produces a request to load some code, and export {name}/export default always add a value to the list of exports.

We have a few contacts at ChakraCode we have talked to in private about the steps necessary for this, I had a very specific call with @bterlson about parsing and bytecode caching. It was held under the idea of a grammar change, but they seemed pretty unfazed by things.

@RReverser v8 has said it would be non-trivial for us to get a parser that does both modes. It may be possible to get them to provide it, but not in the near future.

A key part to understanding the way to get max perf is:

  1. cache the goal + bytecode somewhere (speedup)
  2. the fact that modern engines do light parsing until eval of fn. CJS sadly always runs a function asap. (initial parsing of non-evaluated stuff isn't too expensive). @trevnorris did some tests on raw parsing, maybe he has a link.
  3. if you guess the right goal the first time, you don't parse it the second time (no slowdown). Add a possible "goal" flag to the package.json and you only parse once.

All this said, I am heavily on the side of a file extensions unless this removal of ambiguity lands in spec. File extensions do not scale like parsing does; but, extending the spec has gotten us arguments about doing things not to spec in the past. I fear we will get bitten in some way where the spec does not consider the implications outside of browsers like "uncaughtRejection" is currently dealing with.

@jdalton
Copy link
Member Author

jdalton commented Jun 27, 2016

To echo a bit of @bmeck, after meeting with the CTC it became clear that modules.root is really its own thing, separate from the UnambiguousJavaScriptGrammar proposal. It has applications for both module formats and has plenty of other things to debate and iron out on its side.

Performance is not a big concern. Investigations were had, benchmarks run (@trevnorris), and the numbers showed that even without caching, and worse case double parsing many many files, that the performance impact was not a blocker.

I can't see a future in which engines don't help Node achieve its goals. V8 and Chakra are already optimizing specifically for Node and are invested in Node on several fronts. SpiderMonkey has experiments in the Node space as well.

As for the TC39, I'm optimistic. This July will be @bmeck's first meeting with them as a TC39 member. It's going to be great to have Node represented there. I'm sure something can come out of the meeting to avoid the "doing things not to spec" arguments.

While iterating on this proposal @bmeck and I kept coming back to keeping it simple. What came out was something that can scale as more parse goals are added, something that supports stdin, something that doesn't require lots of metadata, and something that "just works" for developers without the ecosystem concerns of the current proposal.

@trevnorris
Copy link
Contributor

Note that e.g. a top-level await(fo); would have a different meaning in scripts and modules

Non-module code is allowed top-level await?

@bmeck
Copy link
Member

bmeck commented Jun 27, 2016

@trevnorris await is not a reserved word in Script, so it could be anything a normal identifier could refer to, function, variable, etc.

@joshgav
Copy link

joshgav commented Jun 29, 2016

@rvagg

@jdalton @joshgav can we pull in some ChakraCore folks in here as well

@jdalton is working on it, in the meantime /cc @aruneshchandra @nodejs/chakracore

@aruneshchandra
Copy link

@rvagg
After a brief discussion with @jdalton on this proposal I believe that, for supporting Node, exposing a new parse API in ChakraCore that takes some source text and returns whether it has an Import or Export declaration, is doable and not likely to be expensive in terms of implementation and performance.

@jdalton
Copy link
Member Author

jdalton commented Jun 29, 2016

🤘

I pinged @ajklein (V8) who echoed that providing an API for Node would be doable. From chatting with both Chakra/V8 folks it looks like the new API would not have the ES spec change as a hard requirement either. An engine API like this could be useful beyond Module detection too, e.g. detecting something like asm.js as a goal.

@Fishrock123
Copy link

@jdalton So wait, do we not need to change the spec then, or does changing the spec just make the parseability faster?

@jdalton
Copy link
Member Author

jdalton commented Jun 29, 2016

@Fishrock123
The spec change isn't really about performance or helper API implementation.
See the ecma262-solution section.

@joshgav
Copy link

joshgav commented Jun 29, 2016

@aruneshchandra

takes some source text and returns whether it has an Import or Export declaration

@rvagg @Fishrock123: Is this all Node.js needs? Or do we want to know that Chakra and V8 will parse a module without import/export keywords as a script? I think that would be the spec change.

@bmeck
Copy link
Member

bmeck commented Jun 29, 2016

@joshgav I am heavily, and I do stress heavily, against doing parsing while ambiguity remains in the spec. It leads to surprises for developers in non-Node environments, and lovely comments about how Node is non-standard by spec developers. It also has led to situations where spec has not considered implication on Node since behavior is seen as non-standard.

We can take the parsing approach if given an API, but I will fight it will all my might if it is not endorsed by TC39 / spec.

@jdalton
Copy link
Member Author

jdalton commented Jun 29, 2016

@bmeck

It leads to surprises for developers in non-Node environments

Code moving from node to the browser (current npm workflow) will continue to work regardless of spec change, so that's nice.

It also has led to situations where spec has not considered implication on Node since behavior is seen as non-standard.

I think having you on on the TC39, representing Node, will help avoid issues like that going forward :)

We can take the parsing approach if given an API,

🙌

@chrisdickinson
Copy link

Reading through this, some reactions:

  • I agree wholeheartedly with @bmeck: we should not implement this approach (or merge this PR) until the following conditions are true:
    • TC39 changes the spec or issues a recommendation,
    • V8 offers an ParseModule API that fails on source that does not contain imports/exports.
  • If either or both of those are not possible, I am in favor of .mjs.
  • Re: performance:
    • My position is that the slowdown this would impose is not a problem.
    • The worst case scenario (no caching, etc) was minimal compared to the time spent loading modules (@trevnorris benchmarked this, I believe.)
    • If we absolutely have to make up the perf gap somehow, we should avoid filesystem bytecode caching.
  • I am generally positive on this, assuming the TC39 and V8 prerequisites are fulfilled.
  • Re: modules.root: This is strictly OT for this PR, but it would be valuable to talk about this topic with @isaacs. I believe he implemented something like this behavior in an early version of Node, and the resulting confusion led him to remove the behavior.

@jdalton
Copy link
Member Author

jdalton commented Jun 29, 2016

@chrisdickinson

I agree wholeheartedly with @bmeck: we should not implement this approach (or merge this PR)

I don't believe @bmeck is suggesting the PR should not be merged.

@bmeck
Copy link
Member

bmeck commented Jun 29, 2016

Correct, I think it should be merged in order to show commitment (just like the file extension was). I doubt we can make easy progress without a vote of confidence prior to getting support. We are not moving to ACCEPTED and as such the proposal will continue to be open to change / reverting.

@chrisdickinson
Copy link

chrisdickinson commented Jun 29, 2016

To rephrase: I agree with @bmeck that committing to this approach should be contingent on TC39 blessing the approach (by recommendation or spec change) and V8 giving us a CompileModule API corresponding to that recommendation or change.

That said, assuming that this EP represents the CTC's intent with regards to modules, we should either re-include the .mjs bits (noting that we prefer unambiguous parsing if it becomes available to us) or wait to merge this PR until we know whether or not unambiguous grammar will become available to us.

@jdalton
Copy link
Member Author

jdalton commented Jun 29, 2016

As requested from today's CTC meeting, I moved a bit of the ecma262-solution section over to this PR.

@rvagg
Copy link
Member

rvagg commented Jun 30, 2016

Thanks to @jdalton for his comments today at the CTC meeting about "modules.root" being removed and the "fat packages" concern maybe being misplaced—I hadn't thought about it this way but I can see that accommodating fat packages like this might lead to sub-optimal outcomes, with the runtime taking different paths depending on which version it loads and that it might be best to leave it up to users to figure out their own migration timing for a switch from transpiling to native.

@nodejs/collaborators if possible, we'd like to have a vote at the next @nodejs/ctc meeting and we'd like input from the broader group before that can happen. Please review the change this PR makes, and comment here if you have any questions or have any concerns that we should hold up the process to resolve. A -1 is an acceptable comment but please give some justification for that in case there are ways your concerns can be resolved and to help guide the CTC to a final decision. Collaborators are still the first gate in the decision-making process so please engage!

The basic question here goes something like this:

If TC39 agrees to revisiting the Modules spec to accommodate the needs spelled out in https://github.com/bmeck/UnambiguousJavaScriptGrammar and the ability to detect filetype is made possible by our JS engine(s) then we would use such a mechanism to load, detect and run .js files whether they are of Script or Module type; i.e. discarding the .mjs detection option.

As for how to reflect this conditionality in the doc, I like @chrisdickinson's suggestion of including the .mjs option in the doc still as a fallback case if we don't get what we need from TC39 and making it clear that the decision to go either way is contingent on having a spec change. Let's be absolutely clear about the path forward, which does have a fork in it.

@Fishrock123
Copy link

I can't say I understand at the current time what "modules.root" was removed, but that definitely makes me a bit happier...

@jdalton
Copy link
Member Author

jdalton commented Jun 30, 2016

@rvagg

As for how to reflect this conditionality in the doc, I like @chrisdickinson's suggestion of including the .mjs option in the doc still as a fallback case if we don't get what we need from TC39 and making it clear that the decision to go either way is contingent on having a spec change.

Cool. I can add a link to the current .mjs revision. (updated ✔)

Let's be absolutely clear about the path forward, which does have a fork in it.

We should clarify why the fork, Node restricting ES modules to having at least one import/export and browsers not, is an issue. Especially because:

  • The current workflow of going from npm/node to the browser will still work since it would be going from a more restricted (node) to less restricted (browser) grammar environment
  • The TC39 sees the Node restriction as allowable since it's not a forbidden specification mod

I've been a bit wiggly with the proposal text:

A specification change or at least an official endorsement of this Node proposal
would be welcomed.

It's wiggly because at the time the root concern seemed to be upset emails/tweets about not following the spec, so a 👍 of some kind from TC39 would provide a way to avoid that. I think clarifying the issue would help the wiggles :)

@bmeck
Copy link
Member

bmeck commented Jun 30, 2016

I am more concerned with things progressing in spec without consideration to the approach taken by Node to things, unhandledRejection is something that falls in that category of having to deal with things because it was not well understood that Node had a different set of expectations. The emails and discussions are also a concern, but show deeper problem of "non-standard" behavior not being taken into as much consideration as spec progresses.

@jdalton
Copy link
Member Author

jdalton commented Jun 30, 2016

@bmeck
Would both of your concerns be tackled by your joining of the TC39 to champion for and highlight issues that affect Node?

@bmeck
Copy link
Member

bmeck commented Jun 30, 2016

TC39 is consensus based, I could influence things but not guarantee that any non-standard behavior would be respected in the future.

@jdalton
Copy link
Member Author

jdalton commented Jun 30, 2016

I think Node representation at the TC39 goes a long way. I'm not sure guarantees are guaranteed. For example, I relied on a the spec'ed Reflect.enumerate path but it was pulled after being added and rubber stamped.

If design choices that effect Node are a concern, having friends like Chakra/V8 can help. Node/Microsoft/Google/jQuery-Foundation should be able to elevate concerns. I see Node being a part of the TC39 as a way to get things like the Node package ecosystem taken into account when new built-in APIs are proposed, in much the same way folks look to browsers for API conflict investigations.

@rvagg
Copy link
Member

rvagg commented Jul 6, 2016

Noting here for clarity that this was added in response to the request above:

Note: While the ES2015 specification does not forbid this extension, Node wants to avoid acting as a rogue agent. Node has a TC39 representative, @bmeck, to champion this proposal. A specification change or at least an official endorsement of this Node proposal would be welcomed. If a resolution is not possible, this proposal will fallback to the previous .mjs file extension proposal.

We'll try and move for a vote at the CTC meeting today unless anyone has reason to hold it up further.

@jdalton
Copy link
Member Author

jdalton commented Jul 6, 2016

🎉

trevnorris pushed a commit that referenced this pull request Jul 6, 2016
@trevnorris
Copy link
Contributor

Thanks much! Landed in 86e0241.

@trevnorris trevnorris closed this Jul 6, 2016
@jdalton jdalton deleted the unambiguous branch July 6, 2016 21:28
@bmeck bmeck mentioned this pull request Sep 7, 2016
@Trott Trott removed the tsc-agenda label Oct 20, 2016
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.