vm: add experimental NodeRealm implementation #47855

Closed
30 commits in total. The diff below shows changes from 5 commits.
2 changes: 1 addition & 1 deletion LICENSE
@@ -2187,7 +2187,7 @@ The externally maintained libraries used by Node.js are:
SOFTWARE, EVEN IF ADVISED OF THE POSSIBILITY OF SUCH DAMAGE.
"""

- synchronous-worker, located at lib/internal/vm/localworker.js, is licensed as follows:
- synchronous-worker, located at lib/internal/vm/node_realm.js, is licensed as follows:
"""
The MIT License (MIT)

6 changes: 3 additions & 3 deletions doc/api/cli.md
@@ -530,15 +530,15 @@ changes:
Specify the `module` of a custom experimental [ECMAScript module loader][].
`module` may be any string accepted as an [`import` specifier][].

### `--experimental-noderealm`
### `--experimental-node-realm`

<!-- YAML
added: REPLACEME
-->

Enable experimental support for `vm.NodeRealm`.

### `--no-experimental-noderealm`
### `--no-experimental-node-realm`

<!-- YAML
added: REPLACEME
@@ -2129,7 +2129,7 @@ Node.js options that are allowed are:
* `--experimental-import-meta-resolve`
* `--experimental-json-modules`
* `--experimental-loader`
* `--experimental-noderealm`
* `--experimental-node-realm`
* `--experimental-modules`
* `--experimental-network-imports`
* `--experimental-permission`
11 changes: 11 additions & 0 deletions doc/api/errors.md
@@ -3536,6 +3536,17 @@ removed:

The linker function returned a module for which linking has failed.

<a id="ERR_VM_NODE_REALM_INVALID_PARENT"></a>

### `ERR_VM_NODE_REALM_INVALID_PARENT`

<!-- YAML
added: REPLACEME
-->

The `createImport()` function was passed a value that was neither
a string nor a `URL`.

<a id="ERR_WORKER_UNSUPPORTED_EXTENSION"></a>

### `ERR_WORKER_UNSUPPORTED_EXTENSION`
25 changes: 16 additions & 9 deletions doc/api/vm.md
@@ -1575,7 +1575,8 @@ are not controllable through the timeout either.

### Class: `NodeRealm`

> Stability: 1 - Experimental. Use `--experimental-noderealm` CLI flag to enable this feature.
> Stability: 1 - Experimental. Use `--experimental-node-realm` CLI flag to
> enable this feature.

<!-- YAML
added: REPLACEME
@@ -1584,12 +1585,17 @@ added: REPLACEME
* Extends: {EventEmitter}

A `NodeRealm` is effectively a Node.js environment that runs within the
same thread.
same thread. It is similar to a [ShadowRealm][], but with a few main differences:

* `NodeRealm` allows loading both CommonJS and ESM modules.
* Full interoperability between the host realm and the `NodeRealm` instance
  is allowed.
* There is a deliberate `stop()` function.

```mjs
import { NodeRealm } from 'node:vm';
const noderealm = new NodeRealm();
const myAsyncFunction = noderealm.createImport(import.meta.url)('my-module');
const nodeRealm = new NodeRealm();
const myAsyncFunction = nodeRealm.createImport(import.meta.url)('my-module');
console.log(await myAsyncFunction());
```
Member

I do think the docs should clarify the difference between this and a ShadowRealm.

Contributor

I would also like to understand the differences (and similarities) between this and a worker. Because they look very similar. For example, does a realm have an event loop? Does it share globals? (I'm assuming yes and no?)


@@ -1599,7 +1605,7 @@ console.log(await myAsyncFunction());
added: REPLACEME
-->

#### `noderealm.stop()`
#### `nodeRealm.stop()`

<!-- YAML
added: REPLACEME
@@ -1614,18 +1620,18 @@ This method returns a promise that will be resolved when all resources
associated with this Node.js instance are released. This promise resolves on
the event loop of the _outer_ Node.js instance.

#### `noderealm.createImport(filename)`
#### `nodeRealm.createImport(filename)`
Member

Can we create the NodeRealm with a specifier?

interface NodeRealm {
  constructor(specifier: string);
  import(specifier, importAssertions): ModuleNamespace;
}

In this way, we can get rid of the higher-order function createImport and make the class method more aligned with ShadowRealm.prototype.importValue:

import { NodeRealm } from 'node:vm';
const nodeRealm = new NodeRealm(import.meta.url);
const { myAsyncFunction } = await nodeRealm.import('my-module');
console.log(await myAsyncFunction());


<!-- YAML
added: REPLACEME
-->

* `filename` {string}
* `specifier` {string} A module specifier like './file.js' or 'my-package'

Create a function that can be used for loading
modules inside the inner Node.js instance.

#### `noderealm.globalThis`
#### `nodeRealm.globalThis`

<!-- YAML
added: REPLACEME
@@ -1635,7 +1641,7 @@ added: REPLACEME

Returns a reference to the global object of the inner Node.js instance.
Member

It should be clarified whether this value is mutable, e.g. is it possible to set localworker.globalThis.foo = 1 and have that value reflected within the local worker?
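
For comparison, the existing `vm` context API already behaves this way: mutations made to a contextified object from the outside are visible to code running inside the context, and vice versa. The sketch below uses only `vm.createContext()` and `vm.runInContext()`. It does not exercise `NodeRealm`, and whether `nodeRealm.globalThis` behaves the same is exactly the open question above.

```js
'use strict';
const vm = require('node:vm');

// A contextified object acts as the global object inside the context.
const sandbox = vm.createContext({});

// Mutate the object from the outside...
sandbox.foo = 1;

// ...and read the value from code running inside the context.
const seenInside = vm.runInContext('foo', sandbox);
console.log(seenInside); // 1

// Mutations made inside the context are visible from the outside as well.
vm.runInContext('globalThis.bar = foo + 1', sandbox);
console.log(sandbox.bar); // 2
```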


#### `noderealm.process`
#### `nodeRealm.process`

<!-- YAML
added: REPLACEME
@@ -1671,3 +1677,4 @@ Returns a reference to the `process` object of the inner Node.js instance.
[global object]: https://es5.github.io/#x15.1
[indirect `eval()` call]: https://es5.github.io/#x10.4.2
[origin]: https://developer.mozilla.org/en-US/docs/Glossary/Origin
[ShadowRealm]: https://github.com/tc39/proposal-shadowrealm
3 changes: 3 additions & 0 deletions doc/node.1
@@ -166,6 +166,9 @@ to use as a custom module loader.
.It Fl -experimental-network-imports
Enable experimental support for loading modules using `import` over `https:`.
.
.It Fl -experimental-node-realm
Enable experimental support for vm.NodeRealm.
.
.It Fl -experimental-permission
Enable the experimental permission model.
.
1 change: 1 addition & 0 deletions lib/internal/errors.js
@@ -1706,6 +1706,7 @@ E('ERR_VM_MODULE_LINK_FAILURE', function(message, cause) {
E('ERR_VM_MODULE_NOT_MODULE',
'Provided module is not an instance of Module', Error);
E('ERR_VM_MODULE_STATUS', 'Module status %s', Error);
E('ERR_VM_NODE_REALM_INVALID_PARENT', 'createImport() must be called with a string or URL; received "%s"', Error);
E('ERR_WASI_ALREADY_STARTED', 'WASI instance has already started', Error);
E('ERR_WEBASSEMBLY_RESPONSE', 'WebAssembly response %s', TypeError);
E('ERR_WORKER_INIT_FAILED', 'Worker initialization failure: %s', Error);
6 changes: 3 additions & 3 deletions lib/internal/process/pre_execution.js
@@ -269,10 +269,10 @@ function setupFetch() {
}

function setupNodeRealm() {
// Patch the vm module when --experimental-noderealm is on.
// Patch the vm module when --experimental-node-realm is on.
// Please update the comments in vm.js when this block changes.
if (getOptionValue('--experimental-noderealm')) {
const NodeRealm = require('internal/vm/noderealm');
if (getOptionValue('--experimental-node-realm')) {
const NodeRealm = require('internal/vm/node_realm');
const vm = require('vm');
vm.NodeRealm = NodeRealm;
}
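
Since `setupNodeRealm()` only attaches `NodeRealm` to the `vm` module when the flag is set, calling code could feature-detect it before use. A minimal sketch, assuming the patching behavior shown in this hunk; `hasNodeRealm` is a hypothetical helper name and not part of this PR:

```js
'use strict';
const vm = require('node:vm');

// Hypothetical helper: reports whether vm.NodeRealm was attached during
// pre-execution, i.e. whether the experimental flag was passed.
function hasNodeRealm() {
  return typeof vm.NodeRealm === 'function';
}

console.log(hasNodeRealm() ?
  'vm.NodeRealm is available' :
  'vm.NodeRealm is not enabled');
```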
16 changes: 13 additions & 3 deletions lib/internal/vm/noderealm.js → lib/internal/vm/node_realm.js
@@ -10,10 +10,19 @@ const {
Promise,
} = primordials;

const {
emitExperimentalWarning,
} = require('internal/util');

const {
ERR_VM_NODE_REALM_INVALID_PARENT,
} = require('internal/errors').codes;

const {
NodeRealm: NodeRealmImpl,
} = internalBinding('contextify');

const { URL } = require('internal/url');
const EventEmitter = require('events');
const { setTimeout } = require('timers');
const { pathToFileURL } = require('url');
@@ -33,6 +42,7 @@ class NodeRealm extends EventEmitter {
*/
constructor() {
super();
emitExperimentalWarning('NodeRealm');
this.#handle = new NodeRealmImpl();
this.#handle.onexit = (code) => {
this.stop();
@@ -78,7 +88,7 @@ class NodeRealm extends EventEmitter {
// but at this point it seems like a premature optimization.
// We cannot unref() this because we need to shut this down properly.
// TODO(@mcollina): refactor to use a close callback
setTimeout(tryClosing, 100)
setTimeout(tryClosing, 100);
} else {

this.#handle.stop();
@@ -90,7 +100,7 @@ class NodeRealm extends EventEmitter {
// phase of the event loop. This is important because the immediate queue
// would crash if the environment it refers to has been already closed.
// We cannot unref() this because we need to shut this down properly.
setTimeout(tryClosing, 100)
setTimeout(tryClosing, 100);
});
}
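
The shutdown logic in these hunks retries on a timer until the inner instance has no open handles left. A generic sketch of that polling pattern follows. `closeWhenIdle` and its `tryClose` callback are hypothetical names, and the exact condition used by `tryClosing` is not visible in the diff, so "retry while the reported count is above zero" is an assumption.

```js
'use strict';

// Hypothetical helper: calls tryClose() every intervalMs until it reports
// that no handles remain, then resolves with the number of attempts made.
function closeWhenIdle(tryClose, intervalMs = 100) {
  return new Promise((resolve, reject) => {
    let attempts = 0;
    const attempt = () => {
      attempts++;
      let remaining;
      try {
        remaining = tryClose();
      } catch (err) {
        reject(err);
        return;
      }
      if (remaining > 0) {
        // Deliberately not unref()'d: the timer has to keep the process
        // alive until shutdown completes.
        setTimeout(attempt, intervalMs);
      } else {
        resolve(attempts);
      }
    };
    // The first attempt is also scheduled on a timer rather than run inline.
    setTimeout(attempt, intervalMs);
  });
}
```

A fixed 100 ms interval is simple but adds latency to every shutdown, which is presumably why the TODO in the diff suggests switching to a close callback.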

@@ -113,7 +123,7 @@ class NodeRealm extends EventEmitter {
parentURL = pathToFileURL(parentURL);
}
} else if (!(parentURL instanceof URL)) {
throw new Error('createImport() must be called with a string or URL');
throw new ERR_VM_NODE_REALM_INVALID_PARENT(parentURL);
}

return (specifiers, importAssertions) => {
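
The argument check in this hunk can be sketched in isolation. This is a standalone approximation, not the PR's code: `InvalidParentError` stands in for the internal `ERR_VM_NODE_REALM_INVALID_PARENT` class, `normalizeParentURL` is a hypothetical name, and the rule for telling URL strings from paths is an assumption because the full condition is not shown in the diff.

```js
'use strict';
const { pathToFileURL } = require('node:url');

// Hypothetical stand-in for the internal ERR_VM_NODE_REALM_INVALID_PARENT.
class InvalidParentError extends Error {
  constructor(received) {
    super('createImport() must be called with a string or URL; ' +
          `received "${String(received)}"`);
    this.code = 'ERR_VM_NODE_REALM_INVALID_PARENT';
  }
}

// Normalizes the parent argument to a URL object.
function normalizeParentURL(parentURL) {
  if (typeof parentURL === 'string') {
    // Assumption: strings that already look like file: URLs are parsed
    // as-is, anything else is treated as a filesystem path.
    return parentURL.startsWith('file:') ?
      new URL(parentURL) :
      pathToFileURL(parentURL);
  }
  if (!(parentURL instanceof URL)) {
    throw new InvalidParentError(parentURL);
  }
  return parentURL;
}
```

Callers can then branch on `err.code` instead of matching the message text, which is the usual reason for registering a dedicated error code.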
2 changes: 1 addition & 1 deletion lib/vm.js
@@ -344,4 +344,4 @@ module.exports = {
// and vm.SyntheticModule in the pre-execution phase when
// --experimental-vm-modules is on.
// The vm module is also patched to include vm.NodeRealm in the
// pre-execution phase when --experimental-noderealm is on.
// pre-execution phase when --experimental-node-realm is on.
3 changes: 1 addition & 2 deletions src/env.cc
@@ -1021,8 +1021,7 @@ void Environment::CleanupHandles() {

CleanupHandlesNoUvRun();

while (handle_cleanup_waiting_ != 0 ||
request_waiting_ != 0 ||
while (handle_cleanup_waiting_ != 0 || request_waiting_ != 0 ||
!handle_wrap_queue_.IsEmpty()) {
uv_run(event_loop(), UV_RUN_ONCE);
}
2 changes: 1 addition & 1 deletion src/env_properties.h
@@ -366,7 +366,7 @@
V(socketaddress_constructor_template, v8::FunctionTemplate) \
V(streambaseentry_ctor_template, v8::FunctionTemplate) \
V(streambaseoutputstream_constructor_template, v8::ObjectTemplate) \
V(noderealm_constructor_template, v8::FunctionTemplate) \
V(node_realm_constructor_template, v8::FunctionTemplate) \
V(streamentry_ctor_template, v8::FunctionTemplate) \
V(streamentry_opaque_ctor_template, v8::FunctionTemplate) \
V(qlogoutputstream_constructor_template, v8::ObjectTemplate) \
32 changes: 14 additions & 18 deletions src/node_contextify.cc
@@ -21,8 +21,8 @@

#include "node_contextify.h"

#include "base_object-inl.h"
#include "async_wrap-inl.h"
#include "base_object-inl.h"
#include "memory_tracker-inl.h"
#include "module_wrap.h"
#include "node_context_data.h"
@@ -1398,7 +1398,7 @@ void MicrotaskQueueWrap::RegisterExternalReferences(
Local<FunctionTemplate> NodeRealm::GetConstructorTemplate(
IsolateData* isolate_data) {
Local<FunctionTemplate> tmpl =
isolate_data->noderealm_constructor_template();
isolate_data->node_realm_constructor_template();
if (tmpl.IsEmpty()) {
Isolate* isolate = isolate_data->isolate();
tmpl = NewFunctionTemplate(isolate, New);
@@ -1412,13 +1412,13 @@ Local<FunctionTemplate> NodeRealm::GetConstructorTemplate(
SetProtoMethod(isolate, tmpl, "tryCloseAllHandles", TryCloseAllHandles);
SetProtoMethod(isolate, tmpl, "internalRequire", InternalRequire);

isolate_data->set_noderealm_constructor_template(tmpl);
isolate_data->set_node_realm_constructor_template(tmpl);
}
return tmpl;
}

void NodeRealm::CreatePerIsolateProperties(IsolateData* isolate_data,
v8::Local<v8::ObjectTemplate> target) {
void NodeRealm::CreatePerIsolateProperties(
IsolateData* isolate_data, v8::Local<v8::ObjectTemplate> target) {
SetConstructorFunction(isolate_data->isolate(),
target,
"NodeRealm",
@@ -1437,8 +1437,7 @@ void NodeRealm::RegisterExternalReferences(
registry->Register(InternalRequire);
}

NodeRealm::NodeRealmScope::NodeRealmScope(
NodeRealm* w)
NodeRealm::NodeRealmScope::NodeRealmScope(NodeRealm* w)
: EscapableHandleScope(w->isolate_),
Scope(w->context()),
Isolate::SafeForTerminationScope(w->isolate_),
@@ -1464,8 +1463,7 @@ NodeRealm::NodeRealm(Environment* env, Local<Object> object)
outer_context_.Reset(env->isolate(), outer_context);
}

NodeRealm* NodeRealm::Unwrap(
const FunctionCallbackInfo<Value>& args) {
NodeRealm* NodeRealm::Unwrap(const FunctionCallbackInfo<Value>& args) {
Member

Should NodeRealm inherit from BaseObject? This seems pretty similar to BaseObject's work.

Member

Yes. The only reason I didn't do this in synchronous-worker is because it was built outside of Node.js core.

Local<Value> value = args.This();
if (!value->IsObject() || value.As<Object>()->InternalFieldCount() < 1) {
THROW_ERR_INVALID_THIS(Environment::GetCurrent(args.GetIsolate()));
@@ -1486,8 +1484,7 @@ void NodeRealm::Start(const FunctionCallbackInfo<Value>& args) {
self->Start();
}

void NodeRealm::TryCloseAllHandles(
const FunctionCallbackInfo<Value>& args) {
void NodeRealm::TryCloseAllHandles(const FunctionCallbackInfo<Value>& args) {
auto count = 0;
NodeRealm* self = Unwrap(args);
if (self == nullptr) return;
@@ -1496,11 +1493,10 @@ void NodeRealm::TryCloseAllHandles(
args.GetReturnValue().Set(v8::Number::New(self->isolate_, count));
}

void NodeRealm::InternalRequire(
const FunctionCallbackInfo<Value>& args) {
void NodeRealm::InternalRequire(const FunctionCallbackInfo<Value>& args) {
NodeRealm* self = Unwrap(args);
Local<Function> require = Realm::GetCurrent(
self->context())->builtin_module_require();
Local<Function> require =
Realm::GetCurrent(self->context())->builtin_module_require();
args.GetReturnValue().Set(require);
}

@@ -1540,7 +1536,7 @@ void NodeRealm::Start() {
assert(loop != nullptr);

MicrotaskQueue* microtask_queue =
outer_context_.Get(isolate_)->GetMicrotaskQueue();
outer_context_.Get(isolate_)->GetMicrotaskQueue();

Local<Context> context = Context::New(
isolate_,
@@ -1563,8 +1559,8 @@ void NodeRealm::Start() {
GetArrayBufferAllocator(GetEnvironmentIsolateData(outer_env)));
assert(isolate_data_ != nullptr);
ThreadId thread_id = AllocateEnvironmentThreadId();
auto inspector_parent_handle = GetInspectorParentHandle(
outer_env, thread_id, "file:///noderealm.js");
auto inspector_parent_handle =
GetInspectorParentHandle(outer_env, thread_id, "file:///node_realm.js");
env_ = CreateEnvironment(isolate_data_,
Member

This doesn't seem to be a good idea - I think so far we've been assuming in the code base that there is a 1:1 relationship between the environment and the isolate, things can be subtly broken once we have n:1. It would be cleaner if we follow the Realm approach and just split out states in the Environment that are unique to each context/realm to a subclass of node::Realm (or maybe it is already just node::PrincipalRealm).

Member Author

> I think so far we've been assuming in the code base that there is a 1:1 relationship between the environment and the isolate, things can be subtly broken once we have n:1.

Quite a few of the crashes I've been fighting with this approach are due to that. I'm using the reference to env to clean up all the handles. Otherwise, we will have bad crashes after stop() is called (which is why I'm doing this in core). Using a node::PrincipalRealm would require removing the deliberate stop() method: how would we shut it down?

The other question I have is: if we ditch the CreateEnvironment call, how do we guarantee some level of isolation?

Member

You can just move the handles into the Realm instead. Basically, just move things that should belong to individual realms instead of being shared across them to the realm, and do the setup/cleanup on a per-realm basis, which is what we've been trying to do with the ShadowRealm integration.

Member Author

> You can just move the handles into the Realm instead.

How would you do that?

As an example, HandleWrap adds things to the Environment here:

env()->handle_wrap_queue()->PushBack(this);
(there are a few other places too).

Should I try to move that list from the Env to the Realm?

Member @joyeecheung, May 5, 2023

Yes, just make that realm()->handle_wrap_queue()->PushBack(this). That's what we've been doing to support other BaseObjects in ShadowRealms, we just haven't got to HandleWrap yet. If our intent is to make individual realms have their own sets of handles etc., we'd have to move them into realms and stop mixing them in one giant Environment that contains per-thread information anyway.

Member @legendecas, May 6, 2023

Yeah, tracking handle wraps and request wraps by realms is the plan for shadow realm integration. It's not the current focus yet.

However, moving the list from the Environment to the Realm breaks the postmortem diagnostics since tools like llnode are built on top of the Environment::handle_wrap_queue structure: https://github.com/nodejs/node/blob/main/src/node_postmortem_metadata.cc#L23

One option is to track the handle wraps and request wraps in both the Env and the Realm that created them.
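
The dual-tracking idea can be illustrated with a toy model. It is written in JavaScript for brevity and is purely illustrative: the classes below are hypothetical and do not mirror the real C++ `node::Environment`, `node::Realm`, or `HandleWrap` types.

```js
'use strict';

// Toy model only: per-thread state that tracks every handle.
class Environment {
  constructor() {
    this.handleWrapQueue = new Set();
  }
}

// Toy model only: per-context state that tracks the handles it created.
class Realm {
  constructor(env) {
    this.env = env;
    this.handleWrapQueue = new Set();
  }

  // Closes only the handles created in this realm.
  closeHandles() {
    for (const handle of [...this.handleWrapQueue]) handle.close();
  }
}

class HandleWrap {
  constructor(realm) {
    this.realm = realm;
    // Track the handle in both the creating realm and its environment.
    realm.handleWrapQueue.add(this);
    realm.env.handleWrapQueue.add(this);
  }

  close() {
    this.realm.handleWrapQueue.delete(this);
    this.realm.env.handleWrapQueue.delete(this);
  }
}

const env = new Environment();
const principal = new Realm(env);
const inner = new Realm(env);
new HandleWrap(principal);
new HandleWrap(inner);
new HandleWrap(inner);

// Stopping the inner realm leaves the principal realm's handle untouched.
inner.closeHandles();
console.log(env.handleWrapQueue.size);       // 1
console.log(principal.handleWrapQueue.size); // 1
console.log(inner.handleWrapQueue.size);     // 0
```

In this model the environment-level queue stays complete, which is the property postmortem tooling relies on, while each realm can still be torn down on its own.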

Member

I think we don't need to worry much about the postmortem diagnostics data, we only need to leave the offset of the queues within Realms and the tools would then figure out how to adapt to newer versions of Node.js. It's never guaranteed that we'd never change the layout of our internals, only that when we do, we still leave some information in the binary (the offsets) for these tools to figure out how to extract information from a core dump / process memory.

Member Author

My understanding is that we won't be able to have a process object without a new Node.js Environment, right?

May we land this PR and work to remove CreateEnvironment in follow-on work?

Member

> I think so far we've been assuming in the code base that there is a 1:1 relationship between the environment and the isolate

That that's not the case is precisely the difference between Environment and IsolateData, though. Just because the Node.js CLI doesn't create multiple Environment instances per Isolate doesn't mean that it's not semantically correct to do so.

Member

> That that's not the case is precisely the difference between Environment and IsolateData, though. Just because the Node.js CLI doesn't create multiple Environment instances per Isolate doesn't mean that it's not semantically correct to do so.

I don't think in practice we have really been writing the code that way. There are some places where we configure V8 isolate hooks (e.g. the ones in Environment::InitializeDiagnostics) with the current Environment as data to be used in the callback. The async hooks, where things like Environment::GetCurrent(isolate)->trigger_async_id() are used a lot, don't look particularly robust against a multi-Environment architecture either, and there are probably more places than I can think of off the top of my head.

context,
{},
2 changes: 1 addition & 1 deletion src/node_contextify.h
@@ -4,9 +4,9 @@
#if defined(NODE_WANT_INTERNALS) && NODE_WANT_INTERNALS

#include "base_object-inl.h"
#include "memory_tracker-inl.h"
#include "node_context_data.h"
#include "node_errors.h"
#include "memory_tracker-inl.h"

namespace node {
class ExternalReferenceRegistry;
4 changes: 2 additions & 2 deletions src/node_options.cc
@@ -442,9 +442,9 @@ EnvironmentOptionsParser::EnvironmentOptionsParser() {
&EnvironmentOptions::experimental_vm_modules,
kAllowedInEnvvar);
AddOption("--experimental-worker", "", NoOp{}, kAllowedInEnvvar);
AddOption("--experimental-noderealm",
AddOption("--experimental-node-realm",
"experimental NodeRealm support",
&EnvironmentOptions::experimental_noderealm,
&EnvironmentOptions::experimental_node_realm,
kAllowedInEnvironment,
false);
AddOption("--experimental-report", "", NoOp{}, kAllowedInEnvvar);
2 changes: 1 addition & 1 deletion src/node_options.h
@@ -128,7 +128,7 @@ class EnvironmentOptions : public Options {
bool experimental_repl_await = true;
bool experimental_vm_modules = false;
bool expose_internals = false;
bool experimental_noderealm = false;
bool experimental_node_realm = false;

bool force_node_api_uncaught_exceptions_policy = false;
bool frozen_intrinsics = false;