Line-wise transforms #695

ehmicky · 2024-01-16T20:13:03Z

Background

This feature was suggested by both #21 and #121.
Also mentioned by @sindresorhus in #693 (comment).

Rationale

Users can now transform stdin/stdout/stderr/stdio[*] by passing generators to those options.

The vast majority of processes' input and output is text-based. When transforming text-based data, it is very convenient to map one line at a time. For example, this is required for #121.

Currently, each chunk being transformed might be broken in the middle of a line. For example, the following does not currently work because the word secret might be split into 2 different chunks.

const transform = async function * (chunks) {
	for await (const chunk of chunks) {
		if (!chunk.includes('secret')) {
			yield chunk;
		}
	}
};

const {stdout} = await execa('echo', ['This is a secret.'], {stdout: transform});

Solution

Transforms should iterate over chunks line by line. We would use \n as an end delimiter (which works for \r\n too).

However, I think we should still allow users to opt out of this behavior because, although less common, some processes' input/output is in fact binary. Splitting binary input/output line by line would result in chunks of varying size, some potentially very large if the binary input/output does not contain any newline. This would take away the main benefits of streaming (incremental CPU load + low memory).

API

The users would get this behavior by default:

execa('...', {stdout: transform});

To allow users to opt-out, I would suggest the following format.

execa('...', {stdout: { transform, binary: true }});

In other words, a transform can be either AsyncGeneratorFunction or { transform: AsyncGeneractorFunction, binary: boolean }.

This format is also extensible to other options.

Alternatives

One other possible API would be:

execa('...', {stdout: ['binary', transform]});

However, some users might want to use multiple transforms for stdout, with some of them being binary, others not. For example, one transform might convert a binary stdout to a text-based one. Then a second transform might operate line-wise. The above approach would not allow this.

execa('...', {stdout: ['binary', binaryToTextTransform, secretFilterTransform]});

Yet another possible API would be:

execa('...', {stdout: transform, binary: true});

However, this has the same problem. It also has an additional one: it would apply to all streams (stdin/stdout/stderr/stdio[*]). For example, some processes might produce a binary stdout but a text-based stderr.

execa('...', {stdout: binaryTransform, stderr: textTransform, binary: true});

Implementing this should be rather simple, since splitting chunks line-wise can be done using the same kind of generator users can now use themselves. In other words, we just need to add an internal generator, like we already do for encoding the chunks.

execa/lib/stdio/generator.js

Lines 74 to 85 in e8f6bbd

    
           const encodingStartStringGenerator = async function * (chunks) { 
        
           	const textDecoder = new TextDecoder(); 
        
           	for await (const chunk of chunks) { 
        
           		yield textDecoder.decode(chunk, {stream: true}); 
        
           	} 
        
           	const lastChunk = textDecoder.decode(); 
        
           	if (lastChunk !== '') { 
        
           		yield lastChunk; 
        
           	} 
        
           };

What do you think @sindresorhus?

The text was updated successfully, but these errors were encountered:

sindresorhus · 2024-01-17T12:39:57Z

execa('...', {stdout: { transform, binary: true }});

👍

ehmicky · 2024-01-17T21:14:03Z

Done in #699.

This was referenced Jan 16, 2024

Allow passing generators to the std* options #693

Merged

Ability to prefix child process stdio lines with a custom string #121

Closed

ehmicky mentioned this issue Jan 17, 2024

Line-wise transforms #699

Merged

sindresorhus closed this as completed in #699 Jan 19, 2024

ehmicky mentioned this issue Jan 20, 2024

Line-wise standard streams #706

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Line-wise transforms #695

Line-wise transforms #695

ehmicky commented Jan 16, 2024

sindresorhus commented Jan 17, 2024

ehmicky commented Jan 17, 2024

Line-wise transforms #695

Line-wise transforms #695

Comments

ehmicky commented Jan 16, 2024

Background

Rationale

Solution

API

Alternatives

sindresorhus commented Jan 17, 2024

ehmicky commented Jan 17, 2024