-
Notifications
You must be signed in to change notification settings - Fork 161
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Should we "lock" readable streams while piping? #241
Comments
This might be overcomplicating things, but would it be possible to introduce a separate "read lock" structure + method of sorts? Something like: var lock = readable.getReadLock()
lock.read() // proxies original readable.read()
stream.isLocked() // true
readable.getReadLock() // throws, only one lock can be extant at a time per readable
stream.read() // throws, or is a rejected promise
lock.release()
stream.isLocked() // false
stream.read() // read works as expected The idea being that this affords |
That's ... really interesting. It starts down the path of "some kind of radical redesign that splits out the concept of readable stream into a stream and a stream reader" but doesn't go in nearly as crazy a direction as I anticipated. I do definitely like that it gives us a simpler primitive that pipeTo then builds on top of, and can be used for .read()-based consumers. I'll noodle on it for an hour and see what it would end up looking like.
With normal author-implemented streams no, but sometimes in UA-stream to UA-stream case there might be. |
Wrote up spec-ish detail for @chrisdickinson's suggestion at https://gist.github.com/domenic/d421643d95cdec9a9b5b. Minor ergonomic tweaks bring it to var lock = new ReadableStreamLock(readable)
lock.read() // proxies original readable.read()
stream.isLocked // true
new ReadableStreamLock(readable) // throws, only one lock can be extant at a time per readable
stream.read() // throws
lock.release()
stream.isLocked // false
stream.read() // read works as expected I actually think this is strictly superior to my OP proposal because it explains the mechanism by which my proposal works and gives authors access to its power instead of reserving it for pipeTo only. Readable stream look-alikes, like @tyoshino's ReadableByteStream, still don't have an easy time. The brand check in the ReadableStreamLock constructor prohibits it from being used with ReadableByteStream. And that in turn means that ReadableStream.prototype.pipeTo doesn't work directly when applied to a ReadableByteStream. (We could hack it to work with ReadableByteStream specifically, but we shouldn't, because we're still not solving the real problem for e.g. author-defined stream lookalikes.) You could create a separate lock class for ReadableByteStreams? And hook it up via |
Is this due to the "Brand-check that stream is a real ReadableStream." step in the EDIT: I just realized that even if there were no need to get to |
Yep
Not quite; the main properties we use are (a) has a @[[locked]] property we can manipulate; (b) can be fruitfully passed to ReadFromReadableStream, which will do a bunch of internal state stuff (see equivalent current spec). You can't just call Hmm, maybe you use JS's single-threaded nature to un-lock the stream, call its .read(), then re-lock it after read completes in a finally block. That still leaves (a) though, but it's seeming more soluble...
That is starting to sound promising... basically it maintains a side table (weak map) of whether a stream is locked. Then ReadableStream.prototype.read() can consult that side table? But how? Via
Slot, or "internal slot," means "truly private state." It's a spec-only concept, but can be reified in JS (not very performantly) via a weak map. A symbol is a JS concept, and is just a non-string-named property. It can be exposed to user code through e.g. |
Oh, no, I just realized, with your suggestions, you can use the same side table for every type of stream; you don't need one side table per stream type. Updated gist coming shortly. |
OK, pseudo-specced out your suggestions at https://gist.github.com/domenic/d421643d95cdec9a9b5b#file-readable-stream-locks-2-md. The key differences are that by moving to a side-table model instead of using the [[locked]] slot, and by doing the unlock-read()-lock again dance in ReadableStreamLock.prototype.read(), we can be completely generic and work with any object that has a read() method. It's now entirely extensible. Specs that need strong guarantees can use "If %ReadableStreamLocks% has this" to guard their read() methods; author code that wants locking support can just use I'm a bit worried about the performance impact of this kind of "weak map" usage though. |
I think Readable(Byte)Stream can provide ReadFromReadableStream and unlock to the ReadableStreamLock constructor. If streams assure they expose the methods only to ReadableStreamLock, lock state check / manipulation will go away from ReadableStreamLock implementation. |
@yutakahirano I don't quite understand what you're proposing. Would it remove the |
@domenic Sorry for the poor expression.
Does it make sense? |
This was wrong and should be:
|
Ah I see. Hmm, that gives the advantage that user-defined streams can create their own locking mechanism by creating their own getReadLock() function. It does seem to work pretty nicely. The token is slightly awkward but solves the problem in a clever way. I like it. Will try to prove it out a bit more, but seems good. |
It is natural to throw an exception when |
Hmm. So My original plan was for the stream to appear to be "waiting" whenever off-main-thread piping was going on. But I am not sure how to fit that into the lock design. Need to think on it more. |
I think it is good to proxy It might be useful if closing a stream unlocks it implicitly. Then we don't need to proxy |
I tried to prototype your idea in a bit more detail at https://github.com/whatwg/streams/blob/lock/reference-implementation/lib/experimental/exclusive-stream-reader.js. I am still uneasy about how much we are duplicating the stream interface here. Unsure what to do about it.
This would require the stream to track any locks it creates. Certainly doable, but adds a bit more complexity. Unsure if there's much gain either in preventing multiple consumers from seeing that the stream was closed or became errored. I wonder if there is a simpler design we can use that doesn't require a new object that parallels the stream API. One idea would be something like var lock = stream.lock();
stream.read(lock); // works
stream.read(); // throws
lock.release();
stream.read(); // works now This is kind of nice since the code stays entirely within the ReadableStream class. However I don't know how to solve this for the getters, lol -_-. |
At this point I am tempted to go back to my original idea of [[piping]] built in to the stream. It would cause .read() to throw, .state to return "waiting", and .ready to return a promise that is fulfilled when the piping is over and the stream becomes readable or closed (perhaps errored too, given #245). Everything else seems over-engineered. Which would be OK if the user-facing ergonomics were worth it, but I am not sure that they are. |
Having piping built in to the stream sounds tempting for some performance cases but it's only going to work if it can take a user specified read method and just make it private for use by the internal pipeing. There's all kinds of reasons you need to customize or wrap the read function. I do it in almost every new style stream I've written, usually to do lazy completion and state checking the first time read() is called. |
I guess the disadvantage of going back to [[piping]] is that other consumers which want an exclusive lock (like the |
In IRC @tabatkins made me feel better about implementing .json() in terms of piping to a writable stream that accumulates the data. Roughly his argument was that .json() is a pipe-like consumer, in that it tries to direct all the stream's data to a single place. It hides the piping from you behind a nice simple API, but underneath the covers it's basically piping. HOWEVER, NEW IDEA: This might be competely crazy but what if instead of duplicating .read(), .ready, and .state into the ExclusiveStreamReader, we said they were only available in the reader, and the stream itself only had pipeTo/pipeThrough (which operate via getting a reader?). I think the implementation would be more or less the same as https://github.com/whatwg/streams/blob/lock/reference-implementation/lib/experimental/exclusive-stream-reader.js except we'd remove .ready/.read()/.state from ReadableStream. Whether this is acceptable kind of depends on how strongly we believe that readable streams should primarily be consumed by piping them places. It'd also be a kind of big change, which I am leery of at this stage. |
It's a very interesting idea, I think. |
I didn't even think about writable stream :-/. After sleeping on it, I think it is OK to duplicate the interface. The way @tabatkins phrased it was:
If we make that clear in the documentation, it feels OK. I'm not sure if it'll affect the spec or implementation strategy. The remaining issue in my mind is exactly how we create these reader objects. My spike follows your suggestion, which is pretty clever. However, I made a tweak in the last commit that makes it easier for custom stream implementers. For example let's say I want to do something like class InstrumentedReadableStream extends ReadableStream {
read() {
const chunk = super.read();
console.log('chunk!', chunk);
return chunk;
}
} In your design, any reads made through the lock would not trigger the instrumentation, since it delegates directly to ReadFromReadableStream. With my tweak, it delegates to Overall I am feeling better about this. A few minor issues we should think about:
|
I think I did this via the latest commit to my spike.
I sent https://readable-email.org/list/public-script-coord/topic/weak-set-usage-in-streams to get some opinions. |
That is an idea of proxying |
Makes sense? |
Makes sense, although I will need to try validating to make sure that things like
Are doable in a sane way, and to figure out the "not sure" things. Will try to do so Monday.
I think "waiting" works fine, as the observable behavior is the same---you cannot read or see anything happen until |
At least for ReadableStream, checking if the lock is acquired is useful, I think. Having a boolean property ( |
friendly ping |
Heh, thanks. I've been working through this all day actually. Last week was spent on some V8 work, sorry. Here is some stuff I wrote up: Currently I am trying to work out all the implications of exactly how to specify this. I think the work at is pretty OK, although there are a few issues:
Anyway this will be my focus for the next day or two; it's getting to be end of day here so tomorrow is likely when most of the work will be done. I aim to have a pull request up for discussion by EOD tomorrow. |
Took a look at I agree that there's no compelling reason for exposing stream.isLocked, yet.
Even after release, reader's |
Thanks for the review @tyoshino; sorry I didn't get a chance to pass it by you earlier.
Yeah, maybe we should remove it until someone asks for it? I guess part of the reason I have it is that it's better to avoid forcing people to use try/catch. I can't think of a real world situation where you would, but it seems nicer to be able to do if (reader.isActive && reader.state === "readable") {
console.log(reader.read());
} else {
console.log("not locked!");
} instead of if (reader.state === "readable") {
try {
console.log(reader.read());
} catch (e) {
console.log("not locked! or maybe another error?", e);
}
}
I said releaseLock() since usually the variable will be
I at first had a design where they would fail but it turned out to be quite hard to write |
I was on vacation and didn't have time to do it, sorry. Thanks so much for great progress!
Oh, I see. I'm now a little inclined to keep it.
Got it!
Thanks for explanation. I understand and agree that it makes sense for transition-to-"closed" case. When a reader is unlocked while the associated stream is readable/waiting, it sounds a bit weird to keep exposing the non-closed stream's status via the released reader. What do you think about making a reader look like a closed stream regardless for what kind of transition the unlock happened? |
I.e. making
or keep it look waiting? Ah, for errored case, we should make the reader look errored, sorry. |
Hmm, that's interesting. I think it is better than what we have now, yes. I will open an issue to take care of it. |
Inspired by:
Right now, if you are piping a readable stream to a writable stream, you can also concurrently call
.read()
. Usually this will just cause a mess, but it fell out of the design naturally.Both "off-main thread piping" (#97) and a similar situation with Fetch's Request class, which has a
.json()
method, deal with the hazards of "C++ readers" concurrent with potential "JS readers." Ideally once a stream is being read by C++, JS cannot touch it, both for performance and implementation-complexity reasons.One solution for this might be to add some kind of "locking" mechanism so that only one consumer can read. But I don't know how to define that, without some kind of radical redesign that splits out the concept of readable stream into a stream and a stream reader or something like that.
Another possibility that would work quite well I think would be to lock the stream while it's being piped somewhere. That is, at the beginning of pipeTo, set a [[piping]] = true property; when pipeTo finishes, set [[piping]] = false. While [[piping]] is true, .read() instantly throws, saying that the stream is being piped elsewhere and cannot be read.
This would solve all of our problems that inspire this issue:
.json()
, by "explaining".json()
as piping to a concatenator stream. (See Backpressure on fetch integrated with Streams w3c/ServiceWorker#452 (comment)) Under the hood the UA will probably not implement.json()
that way, but what's important is that they could, with the same author-observable behavior---so that way,.json()
isn't doing anything magic.It has a few theoretical disadvantages, but I am not sure how big of a deal they are in practice:
ReadableStream.prototype.pipeTo
will no longer be implementable purely on top of the public readable interface; it will be using some private [[piping]] state, and some private version of the.read()
method that doesn't throw while the stream is [[piping]]. This means e.g. it doesn't apply directly to ReadableByteStreams. (Note however that it still only uses public writable stream interfaces, unless we want to lock those too...)Note that this does not disable the use case of e.g. .read()ing headers from a stream and then piping the rest of it, or piping with
{ preventCancel: false }
and .read()ing the leftover if the destination gets closed prematurely. It only disables concurrent pipes + reads, not sequential.Would love thoughts from @tyoshino @yutakahirano @sicking and anyone else.
The text was updated successfully, but these errors were encountered: