-
Notifications
You must be signed in to change notification settings - Fork 30.2k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
process: add execve #56496
base: main
Are you sure you want to change the base?
process: add execve #56496
Conversation
Review requested:
|
What does it do on non-Unix systems? |
Codecov ReportAttention: Patch coverage is
Additional details and impacted files@@ Coverage Diff @@
## main #56496 +/- ##
==========================================
+ Coverage 88.53% 89.18% +0.64%
==========================================
Files 657 662 +5
Lines 190761 191855 +1094
Branches 36616 36895 +279
==========================================
+ Hits 168899 171097 +2198
+ Misses 15048 13629 -1419
- Partials 6814 7129 +315
|
On Windows:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
src/node_process_methods.cc
Outdated
node::Utf8Value pathname_string(env->isolate(), args[0].As<String>()); | ||
|
||
argv = new char*[2]; | ||
argv[0] = strdup(*pathname_string); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we need check not null also in here, maybe(?)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good spot. Added.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use std::memcpy or a similar modern C++ function other than this.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
lgtm
@ljharb I was totally wrong. Despite of naming and similar signatures, the function I'll disable this function on Windows. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If this is to be added, I'd absolutely recommend referring to it by a standard name such as execve()
, or otherwise something that makes it clear that it spawns a new process.
This requires integration with the permissions API or is otherwise an immediate security hole.
Overall I'd recommend not adding this though, unless there's a concrete reason to believe that it fills a significant gap that the existing child_process
API doesn't cover.
doc/api/process.md
Outdated
resources from the current process are preserved, except for the standard input, | ||
standard output and standard error file descriptor. | ||
All other resources are discarded by system when the processes are swapped. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What if this isn't the a desirable behavior in a given situation?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you please expand this? What do you mean?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If there's any point in adding execve(), it's the customizability that the method brings with it. Leaving file descriptors open and/or redirecting them intentionally can be part of that -- just like you could in a bash script do exec bash 0<&4 1>&4
to create a shell that reads and writes to e.g. a pre-opened socket or something along those lines.
(With a similar reasoning, you could also allow users to actually specify argv[0]
with a value they control -- it doesn't need to equal the filename that's being executed)
|
||
THROW_ERR_PROCESS_REPLACE_FAILED(env, error_code); | ||
} | ||
#endif |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is quite hard-to-follow C++ with lots of unnecessary explicit memory management that Node.js has been doing a lot of effort to move away from. I'd recommend looking a bit a how other parts of the Node.js code base handle strings and conversion between C++ and JS values.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+1
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm absolutely willing to. Since I'm not familiar with the C++ codebase, do you have any suggestion of places I can look to?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@ShogunPanda Well, mostly anywhere else works. As a general rule, you'll want to get rid of new
, new[]
, char*
, memcpy()
and strdup()
as much as possible, and replace them with std::vector
, std::string
/Utf8Value
as much as possible.
(You won't be entirely able to avoid something like std::vector<char*>
because execve
expects char**
arguments, but the std::vector<char*>
's entries could point to the entries of a std::vector<std::string>
or std::vector<Utf8Value>
rather than having to manage memory manually).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@addaleax I vastly refactored the C++ part following your suggestions. It was great, thanks a lot for all the good hint.
I tried to use a single vector but when I instantiate a std::string
or a Utf8Value
inside a for
loop it obviously went out of scope and garbage collected.
So I have to use two vectors for argv
and two for envp
. Is this the right approach or am I still missing something?
Why would we want to add a function that can't work on all tier 1 platforms? |
Well, we already have a number of such apis... |
Gotcha, i wasn't aware of that. |
I don't consider it to be ideal. I would have preferred a pattern like |
+1 on @addaleax's alternative name suggestion. I'm fine with adding this so long as the cleanup logic/expectations are clearly documented. |
Thanks for involuntary hint. :) |
63d5444
to
037edbc
Compare
As requested, I've renamed it to
I just added integration with the permission API. It will require
In the shell scripting context, there is no way to create a new process which would replace the current one. Think about using Node.js to build a complex command to run. After it, Node.js was not used anymore but since there was no way to swap the process, you would have to spawn a new process, manage it's stdin/stdout/stderr and so forth. That's why I added |
I added two tests which check that:
I've added this to the documentation to reflect that. Does it look good now? |
I mean, to be clear, this is still a one-line operation (managing stdio literally just comes down to setting |
I kinda agree. Is your objection a full block or just an "intent"? |
@ShogunPanda Just fyi, have you seen https://www.npmjs.com/package/foreground-child? That also works on Windows 🙂 My "request changes" marker only refers to the C++ memory management here, i.e. this comment specifically (and the integration with the permissions API, of course). I don't intend to block this feature, I've given my opinion but it's pretty clear overall that the Node.js project has a different stance from my own when it comes to 'what should go into core' 🙂 |
src/node_process_methods.cc
Outdated
int error_code = errno; | ||
|
||
// Free up memory, then throw. | ||
delete[] argv; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The abundance of manual memory management in this function really makes me nervous.
envp_strings[i] = | ||
Utf8Value(isolate, envp_array->Get(context, i).ToLocalChecked()) | ||
.ToString(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For things like this, instead of using ToLocalChecked()
, use a pattern like the following... this will avoid a process crash when a throw is appropriate:
Local<Value> str;
if (!envp_array->Get(context, i).ToLocal(&str)) {
return;
}
envp_strings[i] = Utf8Value(str).ToString();
|
||
// Set stdin, stdout and stderr to be non-close-on-exec | ||
// so that the new process will inherit it. | ||
for (int i = 0; i < 3; i++) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rather than a using a loop here, consider something like:
if (persist_standard_stream(0) < 0 ||
persist_standard_stream(1) < 0 ||
persist_standard_stream(2) < 0) {
THROW_ERR_PROCSS_EXECVE_FAILED(env, errno);
}
Or persist all three at once in one call to persist_standard_stream()
execve(*executable, argv.data(), envp.data()); | ||
int error_code = errno; | ||
|
||
THROW_ERR_PROCESS_EXECVE_FAILED(env, error_code); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Given that you're running the RunAtExit(...)
here, and given that the expectation is for this process to be replaced, I wonder if it would be more appropriate to crash the process here instead of just throwing (a throw can be caught and ignored and I'm not sure we want that here)
This PR adds a new
process.execve
method (absolutely willing to change the name if you want), which is a wrapper for theexecve
UNIX function.The function will never return and will swap the current process with a new one.
All memory and system resources are automatically collected from
execve
, except for std{in,out,err}.The primary use of this function is in shell scripts to allow to setup proper logics and then spawn another command.