Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Permit sending arbitrary closures through ports within isolate group on the VM #40370

Closed
mraleph opened this issue Jan 29, 2020 · 13 comments
Closed
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. library-isolate

Comments

@mraleph
Copy link
Member

mraleph commented Jan 29, 2020

We currently prohibit sending arbitrary closures through ports. I think this restrictions originates from the web where it is impossible to send a closure represented as a JS closure through a port to a web worker. This restriction however does not make any sense in the VM.

I suggest lifting it unless somebody has any objections.

/cc @leafpetersen @lrhn @vsmenon

@mraleph
Copy link
Member Author

mraleph commented Jan 29, 2020

/cc @mkustermann @a-siva

@lrhn
Copy link
Member

lrhn commented Jan 29, 2020

Closures can contain references to local variables. It's unclear what would happen if you send two closures referring to the same local variable—will they refer to the same location when they are received as well? What if you send them in separate messages on the same port?
It's definitely possible to send something, but it's unclear that the result will have a meaningful semantics.
Any object that the closure closes over will need to be sent as well. That makes it unpredictable how much data a port message will have to copy. Sending objects generally make a copy each time it's sent, so if you send an object and a function which closes over that object, you might end up with two different objects (at least if the two are sent in different messages).

Will closures sent through a port preserve equality? Identity?
Currently we can only send static/top-level functions, which means they are constants and the VM treats serialization of constants specially to preserve identity. If I send the same local function, created in the same scope, through a port twice, will it preserve identity?

If a closure references both local and global variables, will it clone the local closure, but use the other side's global state? That can break any number of invariants (but then, an object suffers from that same issue, it can contain its own state and refer to global state as well, and sending it to a different isolate can make those out of sync).

Sending closures is definitely possible, but I'm not sure the behavior is predictable enough that it's desirable. Too many footguns.

@mraleph
Copy link
Member Author

mraleph commented Jan 29, 2020

@lrhn Given that you can send an instance of a class like this:

class MyClosure {
  final List<dynamic> context;
  void call() {
    // ...
  }
}

What is the difference between sending this and a real closure? In my eyes not much. All the same questions and concerns apply.

  • It is unpredictable how much we will copy.
  • If you send it twice you get two different copies.
  • Identity is not preserved
  • etc

So why do we permit sending MyClosure(...) and not () => ...? Seems rather arbitrary.

@mkustermann
Copy link
Member

It has always been the plan to lift this restriction with isolate groups.

Though it is true that sending arbitrary closures can have unexpected behavior, because of the Dart VM's implementation of captured variables: We capture more than necessary, which can significantly increase the transitive graph to copy due to context sharing and not capturing final instance field values but rather the this pointer (if a closure accesses any instance field then the entire this object, and it's transitive closure has to be cloned)

Partially this is also an implementation problem atm, where we can have deeply nested inner closures where we would need to be able to identify the corresponding RawFunction to use on the receiver side. We cannot use our normal canonical names for this. Kernel offsets would be an option, but wouldn't work for AOT ...

By sharing the entire program structure within an isolate group, this is trivial because the sender and receiver RawFunction are the same, and the shared heap allows the copy of the RawClosure to point to the same function.

@lrhn
Copy link
Member

lrhn commented Jan 29, 2020

The big difference is closing over local variables.

To do that, you need to clone the closure as part of cloning the function. That again means that you no longer refer to the same variable. That makes it unpredictable what happens if you send such a closure. Sure, we can just say that that is the case, and that you shouldn't modify (or depend on) local variables in the functions you send, but we still have to give it a semantics when it happens.
Also, it's hard to predict how much a closure actually closes over. You have to be very good at not remembering anything which cannot be used (no sharing closures between local functions), otherwise you add significant overhead to sending it.

If you send a function that you receive as an argument, not one you define yourself, you have no way to predict what it will include when serialized.

@leafpetersen
Copy link
Member

Though it is true that sending arbitrary closures can have unexpected behavior, because of the Dart VM's implementation of captured variables: We capture more than necessary, which can significantly increase the transitive graph to copy due to context sharing and not capturing final instance field values but rather the this pointer (if a closure accesses any instance field then the entire this object, and it's transitive closure has to be cloned)

Slightly off topic, but is there any desire to fix/change this? This can dramatically (asymptotically actually) increase memory usage in a very hard to diagnose way. I have, concretely, run into this problem on other platforms in the past.

@a-siva a-siva added area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. library-isolate labels Jan 29, 2020
@vsmenon
Copy link
Member

vsmenon commented Jan 30, 2020

I think this restrictions originates from the web where it is impossible to send a closure represented as a JS closure through a port to a web worker. This restriction however does not make any sense in the VM.

Other issues aside, I have no objections with ignoring web-only restrictions.

(At some point, we will probably want a common abstraction for concurrency. I don't dart:isolate will be it.)

@mraleph
Copy link
Member Author

mraleph commented May 27, 2020

Returning to this request:

@lrhn

The big difference is closing over local variables.

I am still not sure what's the big difference between local variables and object fields - to me these are two completely isomorphic concepts, so I don't believe anybody would be surprised that when you send a closure across the port and then send it back you get a copy of this closure closing over a copy of your state.

Though I must acknowledge that our implementation of closures might result in unobvious amount of state being copied to another isolate due to parent. We could address this issue by tracking exactly which variables are referenced by a specific function (and its nested functions) and pruning context on serialisation.

In any case it seems that there is strong opposition to this proposal from either the language team or web team - so we could actually move forward and expand Dart VM capabilities here.

@mkustermann
Copy link
Member

... so we could actually move forward and expand Dart VM capabilities here.

As mentioned in #40370 (comment) our plan is to lift this restriction when isolate groups are fully implemented (that makes it safer with regards to hot reload and avoids hacks to identify correct closure function on receiver side)

@lrhn
Copy link
Member

lrhn commented May 27, 2020

The difference is that users understand what an object is. If I wrote the class, I can control which fields it has and which values it keeps references to. When I send an object, its state is copied and the result is again an object. It's a copy, so it has a different identity, but that's all well-known concepts.

A closure retains references to some local variables, whatever those really are. It's not clear which variables are closed over, and there is no way to control it. It's the latter part which worries me the most, and that users won't understand what's going on. And why local variables are closed over, but global ones are not.

One consequence of sending local functions is that the argument to Isolate.spawn can be a local function too, closing over values. We need to make sure that it serializes the function and the argument together, so objects occurring in both will keep their identity. The isn't necessary when the function must be a top-level function. (We probably do that, but let's make sure).

(Can you send an Expando? Does it work? 😁)

@leafpetersen
Copy link
Member

It's not clear which variables are closed over, and there is no way to control it.

This is where I insert my obligatory note that this can be (and I would argue should be) fixed.

@mraleph
Copy link
Member Author

mraleph commented May 28, 2020

This is where I insert my obligatory note that this can be (and I would argue should be) fixed.

I think the time is ripe to try. We will track the work in #36983 - I am a bit worried that it might lead to worse code size than the current implementation, but I think leaks are bad enough to warrant trying.

@lrhn lrhn added area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. and removed area-core-library SDK core library issues (core, async, ...); use area-vm or area-web for platform specific libraries. labels Sep 29, 2020
@mkustermann
Copy link
Member

As @mraleph points out, the fact that closures hang on to more memory than needed is now tracked at #36983

We'll track sending closures and other objects via SendPorts in the more general #46623 and fix it in one go once lightweight isolates are enabled by-default.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area-vm Use area-vm for VM related issues, including code coverage, and the AOT and JIT backends. library-isolate
Projects
None yet
Development

No branches or pull requests

6 participants