Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fs: add FileHandle.prototype.readLines #42590

Merged
merged 1 commit into from
Oct 10, 2022
Merged

Conversation

aduh95
Copy link
Contributor

@aduh95 aduh95 commented Apr 3, 2022

It seems to be a frequent enough use case that it makes sense to have a shortcut method.

@aduh95 aduh95 added fs Issues and PRs related to the fs subsystem / file system. semver-minor PRs that contain new features and should be released in the next minor version. labels Apr 3, 2022
@nodejs-github-bot nodejs-github-bot added the needs-ci PRs that need a full CI run. label Apr 3, 2022
@aduh95
Copy link
Contributor Author

aduh95 commented Apr 3, 2022

//cc @nodejs/fs @nodejs/readline

Copy link
Contributor

@RaisinTen RaisinTen left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think convenience methods like this one which isn't strictly necessary for the API to be fully functional should belong to userland modules like https://github.com/jprichardson/node-fs-extra.

@aduh95
Copy link
Contributor Author

aduh95 commented Apr 29, 2022

@RaisinTen I disagree, I think it would be a shame to have to use a third party module for something like this.

Other languages have this feature built-in:

Additionally, it's very little code to maintain on out end, and saves our users a good chunk of boilerplate.

doc/api/fs.md Outdated
console.log(line);
}
} finally {
await file?.close();
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could this be handled automatically by the async iterator?

Copy link
Contributor Author

@aduh95 aduh95 Apr 29, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh it is indeed, autoClose defaults to true so that explicit close call is superfluous.

@RaisinTen
Copy link
Contributor

@RaisinTen I disagree, I think it would be a shame to have to use a third party module for something like this.

Other languages have this feature built-in:

I think python is the only language in this list that has an equivalent API for what we are doing in this PR.

Aren't both of these like our 'readline' module? These deal with streams not files specifically which is a kind of a stream.

Additionally, it's very little code to maintain on out end, and saves our users a good chunk of boilerplate.

But what's wrong with using an npm module? Those have loads of utilities already to reduce boilerplate code.

@aduh95
Copy link
Contributor Author

aduh95 commented Apr 29, 2022

But what's wrong with using an npm module? Those have loads of utilities already to reduce boilerplate code.

I'm not saying using an npm package is wrong, and I don't think adding this utility would change anything to the download stats of e.g. node-fs-extra. What I'm saying is I don't want to have to download an external package (that I have to audit now?) for something like this.

@aduh95
Copy link
Contributor Author

aduh95 commented Apr 29, 2022

Aren't both of these like our 'readline' module? These deal with streams not files specifically which is a kind of a stream.

To be clear, I'm totally OK to move this to the node:readline module (or node:readline/promises) if you think it's a better place, what I'm looking for is an util to easily iterate over the lines of an open file handle.

@RaisinTen
Copy link
Contributor

What I'm saying is I don't want to have to download an external package (that I have to audit now?) for something like this.

If we don't want to audit an external package, I think it's okay if we use the boilerplate code. I don't think the core repo is supposed to solve the use case of reducing boilerplate code.

To be clear, I'm totally OK to move this to the node:readline module (or node:readline/promises) if you think it's a better place, what I'm looking for is an util to easily iterate over the lines of an open file handle.

I still don't think that would be a good idea because readline is supposed to be an interface - a wrapper around any sort of readable stream we want to read lines from. If we allow folks to use it in a way that's not like using an interface, I feel it kinda goes against the principles of that module.

@aduh95
Copy link
Contributor Author

aduh95 commented Apr 29, 2022

My use case is the following:

  • I want to read the content of an open file handle line by line.
  • I don't want to use npm.
  • I don't want to learn by hearth the boilerplate.

Is your response to this: "just use Python"? I don't really get how this util being part of core is detrimental to you or anyone (and if it's not detrimental, why do you block it?).
If you're OK with this feature landing, but not in the current form, what would you like me to change?

@RaisinTen
Copy link
Contributor

My use case is the following:

  • I want to read the content of an open file handle line by line.
  • I don't want to use npm.
  • I don't want to learn by hearth the boilerplate.

Why don't you want to use npm? It would be really helpful to reduce a lot of boilerplate code throughout your application. Even if you are against using npm, you could create a utility function in your application and use that wherever you need to read lines from a file. That way, you could write the boilerplate code just once and forget it and use the more approachable utility function everywhere you need it - no need to learn anything by heart. Maybe you could put that into a private package repo if you don't want that utility to be there in your main application source?

Is your response to this: "just use Python"?

That's a very valid solution.

I don't really get how this util being part of core is detrimental to you or anyone (and if it's not detrimental, why do you block it?).

It's not detrimental, it's very helpful. It just doesn't look like something that should belong to core because of the reasons I stated above.

If you're OK with this feature landing, but not in the current form, what would you like me to change?

If we absolutely do have to add this functionality to core, I would start off by adding it to readable streams, so that all sorts of readable streams can benefit from this (readline accepts a readable stream as the input object for that purpose). If it gets accepted there then I'm okay with adding it to fs if the createReadStream() part looks too much.

@aduh95
Copy link
Contributor Author

aduh95 commented Apr 30, 2022

It just doesn't look like something that should belong to core because of the reasons I stated above.

Sorry, but what reasons? The only reason I'm reading is that you don't think it belongs in core.

@RaisinTen Let's assume that:

  • We both know what npm is and how it works.
  • We both are knowledgable enough to chose for ourselves if npm fits our usecase.
  • We both know what is a utility function and know how to set it up.
  • We both know how to make a (private or not) package when something's too niche for our use case.

And now with all that, I'm telling you: I still want this in core, I think it belongs there.

I working with Node.js because I find it fun, and on Node.js core development because it let me overcome the frustrations I have with it sometimes. If you are not going to let me fix them when I found one, well that's not cool. Node.js is a community-owned project, refusing a feature on the basis that you don't like it is not acceptable IMHO, so if you agree that it's a useful feature and would not be detrimental as far as you know, then please remove your block.

I would start off by adding it to readable streams, so that all sorts of readable streams can benefit from this (readline accepts a readable stream as the input object for that purpose).

TBF that's already a one-liner:

import { createInterface as readLines } from 'node:readline'

for await(const line of readLines(myStream)) console.log(line)

But that's not really my use-case anyway, mine is specifically about open file handles.

@RaisinTen
Copy link
Contributor

It just doesn't look like something that should belong to core because of the reasons I stated above.

Sorry, but what reasons? The only reason I'm reading is that you don't think it belongs in core.

This is my reasoning:

  • This function is purely for utility purposes and what it does can already be achieved in userland using the already exposed file system primitive operations. Exposing this from fs would add a non-primitive fs operation and make the API inconsistent, so IMO it doesn't belong to fs.
  • And I don't think it belongs to the readline module either because readline is supposed to be a wrapper around a readable stream object and it is used to read lines from it. If we bind the interface to a specific object, it doesn't remain purely an interface anymore.

I would start off by adding it to readable streams, so that all sorts of readable streams can benefit from this (readline accepts a readable stream as the input object for that purpose).

TBF that's already a one-liner:

import { createInterface as readLines } from 'node:readline'

for await(const line of readLines(myStream)) console.log(line)

But that's not really my use-case anyway, mine is specifically about open file handles.

Well, there is a one-liner for this too

for await(const line of readLines(openFileHandle.createReadStream())) console.log(line)

@RaisinTen RaisinTen dismissed their stale review April 30, 2022 13:52

I'll let others from @nodejs/fs and @nodejs/readline decide if we should add this

@RaisinTen
Copy link
Contributor

Actually, please disregard my second point. Could you please share what the API would look like if we expose this from readline?

@aduh95
Copy link
Contributor Author

aduh95 commented Apr 30, 2022

  • Exposing this from fs would add a non-primitive fs operation and make the API inconsistent, so IMO it doesn't belong to fs.

Thanks for clarifying, I didn't get that point, I think it's reasonable (although I don't agree 😅).

Well, there is a one-liner for this too

for await(const line of readLines(openFileHandle.createReadStream())) console.log(line)

That's the gotcha: you need to provide crlfDelay option if you don't want to see weird behaviour.

for await(const line of readLines({ input: openFileHandle.createReadStream(), crlfDelay: Infinity })) console.log(line)

Could you please share what the API would look like if we expose this from readline?

Well I suppose the easiest would be to add support for FileHandle as input for the readline Interface:

import { createInterface as readLines } from 'node:readline';
import { open } from 'node:fs/promises';

const fh = await open('path/to/file', 'r');
for await(const line of readLines(fh)) console.log(line);

@RaisinTen
Copy link
Contributor

That's the gotcha: you need to provide crlfDelay option if you don't want to see weird behaviour.

Does it sound reasonable to change the crlfDelay default to Infinity? Not sure if the one-liner would satisfy your needs then.

@aduh95
Copy link
Contributor Author

aduh95 commented May 20, 2022

@nodejs/fs @nodejs/readline can I get some more reviews please?

@aduh95 aduh95 added the review wanted PRs that need reviews. label May 20, 2022
@RaisinTen
Copy link
Contributor

For anyone else following along, the other half of the discussion regarding the inclusion of this functionality in core took place in #43102.

Copy link
Contributor

@LiviaMedeiros LiviaMedeiros left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+1 on having this method in fs; I think that importance of builtin utility functions is underestimated: their absence might be an invisible obstacle for the whole language to become a convenient utility tool for daily routine.

It doesn't create inconsistency between low- and high-level API here, since we already have .readFile .writeFile anyway.

doc/api/fs.md Outdated Show resolved Hide resolved
@RaisinTen
Copy link
Contributor

RaisinTen commented May 22, 2022

I think that importance of builtin utility functions is underestimated: their absence might be an invisible obstacle for the whole language to become a convenient utility tool for daily routine.

Well, the whole readline module is a builtin utility API. I don't see why we should create additional APIs on fs which internally just calls a single public API of readline given that it can be done very easily in userland, which is why we never got so many complaints about fs.readline not being a thing in Node.js.

I feel that landing this change would have some consequences:

I think it's only fair to add a builtin utility function to core if it's tedious to implement in userland using public APIs of Node.js.

It doesn't create inconsistency between low- and high-level API here, since we already have .readFile .writeFile anyway.

.readFile and .writeFile is exposed from fs., fsPromises. and filehandle.. If we don't expose the readline function from all of these 3, I get the signal that we aren't supporting fs. and fsPromises.. I think we should add this feature to those too in the same PR that's adding this feature.

@tniessen
Copy link
Member

I agree that a utility to read a stream of lines from an arbitrary data source, whether it's a node stream or a web stream or a file handle, is desirable -- the lack thereof might be why I often switch to Python whenever I have to process line-based input data.

The readline module has always felt a bit hacky to me -- its design clearly goes way beyond simply reading lines. For example, I don't think the result of file.readLines() should have a question() function. As long as it does, this probably does not belong in fs but rather in readline.

@aduh95
Copy link
Contributor Author

aduh95 commented Sep 15, 2022

@tniessen do you think it's ok to land this now that .question has been removed on main?

@aduh95 aduh95 added author ready PRs that have at least one approval, no pending requests for changes, and a CI started. request-ci Add this label to start a Jenkins CI on a PR. labels Sep 15, 2022
@github-actions github-actions bot removed the request-ci Add this label to start a Jenkins CI on a PR. label Sep 15, 2022
@nodejs-github-bot
Copy link
Collaborator

@nodejs-github-bot
Copy link
Collaborator

@aduh95 aduh95 merged commit a2fb3f9 into nodejs:main Oct 10, 2022
@aduh95 aduh95 deleted the fh-readLines branch October 10, 2022 00:16
danielleadams pushed a commit that referenced this pull request Oct 11, 2022
PR-URL: #42590
Reviewed-By: LiviaMedeiros <livia@cirno.name>
danielleadams added a commit that referenced this pull request Oct 11, 2022
Notable changes:

* cli:
  * (SEMVER-MINOR) add `--watch` (Moshe Atlow) #44366
* fs:
  * (SEMVER-MINOR) add `FileHandle.prototype.readLines` (Antoine du Hamel) #42590
* http:
  * (SEMVER-MINOR) add writeEarlyHints function to ServerResponse (Wing) #44180
* http2:
  * (SEMVER-MINOR) make early hints generic (Yagiz Nizipli) #44820
* src:
  * (SEMVER-MINOR) add detailed embedder process initialization API (Anna Henningsen) #44121
* util:
  * (SEMVER-MINOR) add default value option to parsearg (Manuel Spigolon) #44631
danielleadams added a commit that referenced this pull request Oct 11, 2022
Notable changes:

* cli:
  * (SEMVER-MINOR) add `--watch` (Moshe Atlow) #44366
* fs:
  * (SEMVER-MINOR) add `FileHandle.prototype.readLines` (Antoine du Hamel) #42590
* http:
  * (SEMVER-MINOR) add writeEarlyHints function to ServerResponse (Wing) #44180
* http2:
  * (SEMVER-MINOR) make early hints generic (Yagiz Nizipli) #44820
* src:
  * (SEMVER-MINOR) add detailed embedder process initialization API (Anna Henningsen) #44121
* util:
  * (SEMVER-MINOR) add default value option to parsearg (Manuel Spigolon) #44631

PR-URL: #44968
danielleadams added a commit that referenced this pull request Oct 11, 2022
Notable changes:

* cli:
  * (SEMVER-MINOR) add `--watch` (Moshe Atlow) #44366
* fs:
  * (SEMVER-MINOR) add `FileHandle.prototype.readLines` (Antoine du Hamel)
  #42590
* http:
  * (SEMVER-MINOR) add writeEarlyHints function to ServerResponse (Wing)
  #44180
* http2:
  * (SEMVER-MINOR) make early hints generic (Yagiz Nizipli) #44820
* lib:
  * (SEMVER-MINOR) refactor transferable AbortSignal (flakey5) #44048
* src:
  * (SEMVER-MINOR) add detailed embedder process initialization API (Anna
  Henningsen) #44121
* util:
  * (SEMVER-MINOR) add default value option to parsearg (Manuel Spigolon)
  #44631

PR-URL: #44968
danielleadams added a commit that referenced this pull request Oct 12, 2022
watch mode (experimental):

Running in 'watch' mode using `node --watch` restarts the process when an
imported file is changed.

Contributed by Moshe Atlow in [#44366](#44366)

Other notable changes:

* fs:
  * (SEMVER-MINOR) add `FileHandle.prototype.readLines` (Antoine du Hamel)
  #42590
* http:
  * (SEMVER-MINOR) add writeEarlyHints function to ServerResponse (Wing)
  #44180
* http2:
  * (SEMVER-MINOR) make early hints generic (Yagiz Nizipli) #44820
* lib:
  * (SEMVER-MINOR) refactor transferable AbortSignal (flakey5) #44048
* src:
  * (SEMVER-MINOR) add detailed embedder process initialization API (Anna
  Henningsen) #44121
* util:
  * (SEMVER-MINOR) add default value option to parsearg (Manuel Spigolon)
  #44631

PR-URL: #44968
danielleadams added a commit that referenced this pull request Oct 12, 2022
watch mode (experimental):

Running in 'watch' mode using `node --watch` restarts the process when an
imported file is changed.

Contributed by Moshe Atlow in [#44366](#44366)

Other notable changes:

* fs:
  * (SEMVER-MINOR) add `FileHandle.prototype.readLines` (Antoine du Hamel)
  #42590
* http:
  * (SEMVER-MINOR) add writeEarlyHints function to ServerResponse (Wing)
  #44180
* http2:
  * (SEMVER-MINOR) make early hints generic (Yagiz Nizipli) #44820
* lib:
  * (SEMVER-MINOR) refactor transferable AbortSignal (flakey5) #44048
* src:
  * (SEMVER-MINOR) add detailed embedder process initialization API (Anna
  Henningsen) #44121
* util:
  * (SEMVER-MINOR) add default value option to parsearg (Manuel Spigolon)
  #44631

PR-URL: #44968
danielleadams added a commit that referenced this pull request Oct 13, 2022
watch mode (experimental):

Running in 'watch' mode using `node --watch` restarts the process when an
imported file is changed.

Contributed by Moshe Atlow in [#44366](#44366)

Other notable changes:

* fs:
  * (SEMVER-MINOR) add `FileHandle.prototype.readLines` (Antoine du Hamel)
  #42590
* http:
  * (SEMVER-MINOR) add writeEarlyHints function to ServerResponse (Wing)
  #44180
* http2:
  * (SEMVER-MINOR) make early hints generic (Yagiz Nizipli) #44820
* lib:
  * (SEMVER-MINOR) refactor transferable AbortSignal (flakey5) #44048
* src:
  * (SEMVER-MINOR) add detailed embedder process initialization API (Anna
  Henningsen) #44121
* util:
  * (SEMVER-MINOR) add default value option to parsearg (Manuel Spigolon)
  #44631

PR-URL: #44968
danielleadams added a commit that referenced this pull request Oct 13, 2022
watch mode (experimental):

Running in 'watch' mode using `node --watch` restarts the process when an
imported file is changed.

Contributed by Moshe Atlow in [#44366](#44366)

Other notable changes:

* fs:
  * (SEMVER-MINOR) add `FileHandle.prototype.readLines` (Antoine du Hamel)
  #42590
* http:
  * (SEMVER-MINOR) add writeEarlyHints function to ServerResponse (Wing)
  #44180
* http2:
  * (SEMVER-MINOR) make early hints generic (Yagiz Nizipli) #44820
* lib:
  * (SEMVER-MINOR) refactor transferable AbortSignal (flakey5) #44048
* src:
  * (SEMVER-MINOR) add detailed embedder process initialization API (Anna
  Henningsen) #44121
* util:
  * (SEMVER-MINOR) add default value option to parsearg (Manuel Spigolon)
  #44631

PR-URL: #44968
danielleadams added a commit that referenced this pull request Oct 13, 2022
watch mode (experimental):

Running in 'watch' mode using `node --watch` restarts the process when an
imported file is changed.

Contributed by Moshe Atlow in #44366

Other notable changes:

* fs:
  * (SEMVER-MINOR) add `FileHandle.prototype.readLines` (Antoine du Hamel)
  #42590
* http:
  * (SEMVER-MINOR) add writeEarlyHints function to ServerResponse (Wing)
  #44180
* http2:
  * (SEMVER-MINOR) make early hints generic (Yagiz Nizipli) #44820
* lib:
  * (SEMVER-MINOR) refactor transferable AbortSignal (flakey5) #44048
* src:
  * (SEMVER-MINOR) add detailed embedder process initialization API (Anna
  Henningsen) #44121
* util:
  * (SEMVER-MINOR) add default value option to parsearg (Manuel Spigolon)
  #44631

PR-URL: #44968
danielleadams added a commit that referenced this pull request Oct 13, 2022
watch mode (experimental):

Running in 'watch' mode using `node --watch` restarts the process when an
imported file is changed.

Contributed by Moshe Atlow in #44366

Other notable changes:

* fs:
  * (SEMVER-MINOR) add `FileHandle.prototype.readLines` (Antoine du Hamel)
  #42590
* http:
  * (SEMVER-MINOR) add writeEarlyHints function to ServerResponse (Wing)
  #44180
* http2:
  * (SEMVER-MINOR) make early hints generic (Yagiz Nizipli) #44820
* lib:
  * (SEMVER-MINOR) refactor transferable AbortSignal (flakey5) #44048
* src:
  * (SEMVER-MINOR) add detailed embedder process initialization API (Anna
  Henningsen) #44121
* util:
  * (SEMVER-MINOR) add default value option to parsearg (Manuel Spigolon)
  #44631

PR-URL: #44968
danielleadams added a commit that referenced this pull request Oct 13, 2022
watch mode (experimental):

Running in 'watch' mode using `node --watch` restarts the process when an
imported file is changed.

Contributed by Moshe Atlow in #44366

Other notable changes:

* fs:
  * (SEMVER-MINOR) add `FileHandle.prototype.readLines` (Antoine du Hamel)
  #42590
* http:
  * (SEMVER-MINOR) add writeEarlyHints function to ServerResponse (Wing)
  #44180
* http2:
  * (SEMVER-MINOR) make early hints generic (Yagiz Nizipli) #44820
* lib:
  * (SEMVER-MINOR) refactor transferable AbortSignal (flakey5) #44048
* src:
  * (SEMVER-MINOR) add detailed embedder process initialization API (Anna
  Henningsen) #44121
* util:
  * (SEMVER-MINOR) add default value option to parsearg (Manuel Spigolon)
  #44631

PR-URL: #44968
@kamatil-dev
Copy link

is there a way to use zlib in this function? or i must use the readline package?

@tniessen
Copy link
Member

@kamatil-dev As you can see from the implementation, this is just a thin wrapper around readline and doesn't work for arbitrary streams. I'd suggest using readline.

@kamatil-dev
Copy link

@tniessen thank you ❤, but as i can see in my end, the new readLines() function performs better than the actual seperated package

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
author ready PRs that have at least one approval, no pending requests for changes, and a CI started. fs Issues and PRs related to the fs subsystem / file system. needs-ci PRs that need a full CI run. review wanted PRs that need reviews. semver-minor PRs that contain new features and should be released in the next minor version.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

7 participants