-
Notifications
You must be signed in to change notification settings - Fork 30.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
esm: refactor hooks – add loader instances #49315
base: main
Are you sure you want to change the base?
Conversation
Review requested:
|
const { defaultResolve } = require('internal/modules/esm/resolve'); | ||
|
||
exports.resolve = defaultResolve; | ||
exports.load = defaultLoad; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we create a new file we might need to add it to the startup snapshot (see #45849). Alternatively is there a way to stick this in an existing file?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The reason I put this in its own file is so that the url
property actually led to a valid file that represented the base "loader".
It can live in literally any file that can be treated as a loader (exports load
, resolve
, etc.) This code is the relevant block:
const defaultLoader = 'internal/modules/esm/default_loader';
this.addCustomLoader(`node:${defaultLoader}`, require(defaultLoader));
So that if we ever expose a list loaders or anything like that the url
property is what you see above.
} | ||
|
||
#rebuildChains() { | ||
this.#rebuildChain('globalPreload'); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Let’s wait to land this after #49144.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Works for me 🎉
Previously we had one array for each kind of chain in the variable `#chains`; this is probably the most efficient, but it decouples the connection between the original `register` call and the hook functions themselves. By adding `#loaderInstances` we can keep track of each time `register` is called. This allows, for example, trivially removing the loader later on, or associating shared context with a particular `register` call. The tradeoff here if we just did this would be that during hook traversal we may iterate over a slightly larger number of objects. Instead we keep `#hooks` but it is now a derived property; every time `#loaderInstances` changes we simply rebuild `#hooks`.
a989338
to
a1aef89
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm, the code here is less readable and more convoluted in execution than it was before. I think my proposals will help, but I'd need to see them all together in the diff.
Note that you basically cannot use for…of
or native object methods at all due to the risk from prototype pollution. See https://github.com/nodejs/node/blob/main/doc/contributing/primordials.md
#rebuildChain(name) { | ||
const chain = this.#chains[name] = []; | ||
let i = 0; | ||
for (const instance of this.#loaderInstances) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can't use for…of :( gotta use a plain ol' for
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@@ -87,45 +86,40 @@ let importMetaInitializer; | |||
// [2] `validate...()`s throw the wrong error | |||
|
|||
class Hooks { | |||
#chains = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why remove these? If you need them to start out empty, you can still leave the fields (resolve
etc) and then also the code docs
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So that the built-in hooks are not treated as any kind of "special" object; they now work just like any other loader – you can see them being added in the constructor. This ensures that if we attach any data to a loader instance that the default node loader gets it too – right now it's just url
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah, yes. I figured that :) I think you can still leave everything else here and just initialise the lists as empty 🙂
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ah yes true 😄 I can do that. Part of me considered that hard-coding the hook names in a bunch of places might be less clean… but it's more clear and that works for me 🎉
const { hookName } = meta; | ||
const { | ||
fn: hook, | ||
url: hookFilePath, | ||
next, | ||
} = current; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Did anything change here? It looks like it's now just less readable
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes. The previous structure of chain:
type Fn = (arg0: any, context: any, next: Fn) => any;
interface ChainItem {
fn: Fn
url: string;
}
type Chain = ChainItem[];
The new structure of chain:
type Fn = (arg0: any, context: any, next: Fn) => any;
interface LoaderInstance {
load?: Fn;
resolve?: Fn;
initialize?: Fn;
url: string;
}
interface ChainItem {
fn: Fn
loader: LoaderInstance;
}
type Chain = ChainItem[];
if (typeof instance[name] !== 'function') { | ||
continue; | ||
} | ||
chain.push({ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Gotta use primordials for these sorts of things ArrayPrototypePush
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Derp thanks 😄
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
chain.push({ | |
ArrayPrototypePush(chain, { | |
__proto__: null, |
} | ||
} | ||
|
||
#rebuildChains() { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a performance loss. I think it would be better for there to be only rebuildChains
and for it to accept a list of chain names to rebuild, like
#rebuildChains(...names) {
if (!names.length) {
names = new Array('initialize','load','resolve'); // [1]
}
const shouldRebuild = { __proto__: null };
for (let n = names.length - 1; n > -1; n--) {
const name = names[n];
shouldRebuild[name]: true;
this.#chains[name].length = 0; // [2] Might need a primordial?
}
for (
let i = 0,
l = this.#loaderInstances.length - 1;
i < l;
i++
) {
const {
initialize,
load,
resolve,
} = this.#loaderInstances[i];
if (shouldRebuild.initialize) { /* … */ }
if (shouldRebuild.load) { /* … */ }
if (shouldRebuild.resolve) { /* … */ }
}
}
this.#rebuildChains('resolve'); // Rebuild only `resolve` chain
this.#rebuildChains('load','resolve'); // Rebuild `load` and `resolve`
this.#rebuildChains(); // Rebuild all chains
[1] I believe this (new Array(…)
instead of […]
) will invoke an optimisation in V8 to choose the most appropriate specialised C++ array from the start and also to pre-allocation only exactly the space for these 3 items. If it doesn't, no need for the otherwise uncommon use of new Array()
instead of []
.
[2] We don't need to re-create the array, only empty it. For that, setting its length to 0 is far better.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we can make this happen 😄 Probably a bit of a µ-optimization considering how often #rebuildChains
is called and the size of N
… but it's not a problem to make it a little bit faster. If the actual chain execution was affected I would be more concerned, considering that is called on every import
/require
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mm, it likely won't break the performance bank. But it's easy to avoid the de-op, so why not 🙂
Previously we had one array for each kind of chain in the variable
#chains
; this is probably the most efficient, but it decouples the connection between the originalregister
call and the hook functions themselves.By adding
#loaderInstances
we can keep track of each timeregister
is called. This allows, for example, trivially removing the loader later on, or associating shared context with a particularregister
call. The tradeoff here if we just did this would be that during hook traversal we may iterate over a slightly larger number of objects. Instead we keep#chains
but it is now a derived property; every time#loaderInstances
changes we simply rebuild#chains
.This sets the stage for doing other things, like adding
deregister
: #49159 or adding the ability to attach local state to individual loader instances.N.B. I should probably add back the documentation about the entries in
#chains
somehow.Developer's Certificate of Origin 1.1
By making a contribution to this project, I certify that:
(a) The contribution was created in whole or in part by me and I
have the right to submit it under the open source license
indicated in the file; or
(b) The contribution is based upon previous work that, to the best
of my knowledge, is covered under an appropriate open source
license and I have the right under that license to submit that
work with modifications, whether created in whole or in part
by me, under the same open source license (unless I am
permitted to submit under a different license), as indicated
in the file; or
(c) The contribution was provided directly to me by some other
person who certified (a), (b) or (c) and I have not modified
it.
(d) I understand and agree that this project and the contribution
are public and that a record of the contribution (including all
personal information I submit with it, including my sign-off) is
maintained indefinitely and may be redistributed consistent with
this project or the open source license(s) involved.