Skip to content
This repository has been archived by the owner on Apr 22, 2023. It is now read-only.

Readable stream documentation is unclear #14124

Closed
fresheneesz opened this issue Mar 26, 2015 · 8 comments
Closed

Readable stream documentation is unclear #14124

fresheneesz opened this issue Mar 26, 2015 · 8 comments

Comments

@fresheneesz
Copy link

The documentation makes it really unclear how to implement Readable streams.

Note: Implement this function, but do NOT call it directly.
This function should NOT be called directly. It should be implemented by child classes, and only called by the internal Readable class methods.
All Readable stream implementations must provide a _read method to fetch data from the underlying resource.
This method is prefixed with an underscore because it is internal to the class that defines it, and should not be called directly by user programs. However, you are expected to override this method in your own extension classes.

I get it, don't call read directly. Don't need to be told 4 separate times. This redundancy obscures the very small amount of useful information written about this method. Lets contract this into something less redundant like: "_Note: Implement this function, but do NOT call it directly. _This method is prefixed with an underscore because it is internal to the class that defines it and only the internal Readable class methods should call it. All Readable stream implementations must provide a read method to fetch data from the underlying resource."

When data is available, put it into the read queue by calling readable.push(chunk). If push returns false, then you should stop reading. When _read is called again, you should start pushing more data.

What in god's name does this mean about _read? This is documentation under the _read method. Is it saying that the whole point of _read is to indicate that the stream should pull from its data source after having stopped doing that? What is it talking about "again"? It never mentioned _read being called before that. When is _read called initially?

Perhaps that paragraph should be changed to: "__read will be called [insert when read is first called here]. When the object stops reading as a result of push returning false, the object should only continue reading from the resource when _read is called again._"

What's the return value for _read? Nothing?

For push:

The _read() function will not be called again until at least one push(chunk) call is made.

Once again, the documentation refers to calling _read "again" even tho it never mentions anything about calling it before. What does this mean?

The way the documentation describes things seems like a chicken and egg problem. _read won't be called until push is called (with something other than null), and push shouldn't be called until a call to _read requests more data. How does this make sense?

How would I implement a simple passthrough that properly handles backpressure and all that? Would it be something like this?:

var Passthrough = function(stream, callback) {
    Readable.call(this)
    this.stream = stream

    stream.on('readable', function() {
        var data = stream.read()
        if(data !== null) {
            if(!this.push(data)) 
              stream.pause()
        }
    }.bind(this))

    stream.on('end', function() {
        this.push(null)
    }.bind(this))
}
util.inherits(StreamPeeker, Readable)
StreamPeeker.prototype._read = function() {
    this.stream.resume()
}
@fresheneesz
Copy link
Author

Also, the highwatermark option is not very well explained, especially with regards to what behavior actually changes when the buffer is full. Is the only thing that happens that calls to push will return false? So if you keep pushing even tho it is returning false, the buffer can grow without bound, correct? The docs should mention this.

@fresheneesz
Copy link
Author

Part of the problem it seems, is that node.js essentially has 2 different stream APIs mangled together. You guys should never have modified streams1 into streams2. Instead you should have made a completely separate streams2 module, and deprecated streams1.

The resume and pause methods are incredibly unintuitive if you're trying to do streams2. For one thing, it seems that if you're doing streams2 with backpressure, you should NEVER call resume or pause because it switches it into flowing mode (streams1 mode). What I would expect pause to do is to make calls to read return null. This is just so confusing.

Please make a separate module for this and throw out legacy flowing mode.

@jasnell
Copy link
Member

jasnell commented Jun 25, 2015

@fresheneesz ... interested in putting together a quick PR (targeted at either v0.12 or master) that clarifies the documentation?

@fresheneesz
Copy link
Author

@jasnell I'm really not sure I understand streams enough to be trusted with something like that. Honestly I wouldn't trust myself not to get 50% of it wrong.

@fresheneesz
Copy link
Author

I guess I could clarify just the parts I'm most confident about.. I'll give it a go

@fresheneesz
Copy link
Author

@jasnell
Copy link
Member

jasnell commented Jun 25, 2015

Yes, that's the right place!

@fresheneesz
Copy link
Author

@jasnell Do you know the answers to some of my questions? Eg about highwatermark?

fresheneesz added a commit to fresheneesz/node that referenced this issue Jun 26, 2015
fresheneesz added a commit to fresheneesz/node that referenced this issue Jun 26, 2015
jasnell pushed a commit to jasnell/node-joyent that referenced this issue Jul 1, 2015
Minor clarifications around Readable._read and Readable.push
to make their implementation/usage easier to understand.

nodejs#14124 (comment)
jasnell pushed a commit that referenced this issue Jul 10, 2015
Minor clarifications around Readable._read and Readable.push
to make their implementation/usage easier to understand.

#14124 (comment)

Reviewed-By: James M Snell <jasnell@gmail.com>
PR-URL: #25635
@jasnell jasnell closed this as completed Jul 10, 2015
jasnell pushed a commit to jasnell/node-joyent that referenced this issue Jul 10, 2015
Minor clarifications around Readable._read and Readable.push
to make their implementation/usage easier to understand.

nodejs#14124 (comment)

Reviewed-By: James M Snell <jasnell@gmail.com>
PR-URL: nodejs#25635
jasnell pushed a commit that referenced this issue Aug 4, 2015
Minor clarifications around Readable._read and Readable.push
to make their implementation/usage easier to understand.

#14124 (comment)

Reviewed-By: James M Snell <jasnell@gmail.com>
PR-URL: #25591
jasnell pushed a commit to jasnell/node that referenced this issue Aug 4, 2015
Minor clarifications around Readable._read and Readable.push
to make their implementation/usage easier to understand.

nodejs/node-v0.x-archive#14124 (comment)

Reviewed-By: James M Snell <jasnell@gmail.com>
PR-URL: nodejs/node-v0.x-archive#25591
jBarz pushed a commit to ibmruntimes/node that referenced this issue Nov 4, 2016
Minor clarifications around Readable._read and Readable.push
to make their implementation/usage easier to understand.

nodejs#14124 (comment)

Reviewed-By: James M Snell <jasnell@gmail.com>
PR-URL: nodejs#25591
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

2 participants