-
Notifications
You must be signed in to change notification settings - Fork 1.3k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
[subprocess][go] run the subprocesses from go-land #843
Conversation
[py] install pympler too [py] fix multiple issues with python memory collector [py] initialize entries
Found another race condition on CircleCI (not locally) fixing that.... In this case the race only surfaces with golang 1.8.3 (and I presume others), but not with our default Data race on
No trace of the problem on Related material: |
It looks like there's still a GIL related bug, it could also be a "sticky" thread issue - will address tomorrow. |
It may be fixed now after locking the go routine to the thread. Haven't experienced the weird crashes for a while now but I will need to verify tomorrow. |
Feel free to merge now to test this a bit with a packaged agent, I went through the code quickly and it looks good. I'll make comments from a thorough review on this PR, a bit later :) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Made a bunch of comments, but this looks great overall :) Let me know what your thoughts are
One important thing we need to check is the safety of calling UnlockOSThread
from GetSubprocessOutput
.
Py_RETURN_NONE; | ||
} | ||
|
||
if (!PyList_Check(cmd_args)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for subprocess.Popen
, cmd_args
could also be a sequence (i.e. not strictly a list
) or a string. We don't have to handle this here (we can handle the conversion in the python get_subprocess_output
), but we should if we want to preserve the api.
if (!PyArg_ParseTuple(args, "O|O:get_subprocess_output", &cmd_args, &cmd_raise_on_empty)) { | ||
PyGILState_Release(gstate); | ||
PyErr_SetString(PyExc_TypeError, "unable to parse arguments"); | ||
Py_RETURN_NONE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should use return NULL
instead of Py_RETURN_NONE
when we set an error on the interpreter (otherwise, in python land, the function returns None
and doesn't raise an exception).
if (!PyList_Check(cmd_args)) { | ||
PyGILState_Release(gstate); | ||
PyErr_SetString(PyExc_TypeError, "command args not a list"); | ||
Py_RETURN_NONE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same thing here
if (cmd_raise_on_empty != NULL && !PyBool_Check(cmd_raise_on_empty)) { | ||
PyGILState_Release(gstate); | ||
PyErr_SetString(PyExc_TypeError, "bad raise on empty argument - should be bool"); | ||
Py_RETURN_NONE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
if(!(subprocess_args = (char **)malloc(sizeof(char *)*subprocess_args_sz))) { | ||
PyGILState_Release(gstate); | ||
PyErr_SetString(PyExc_MemoryError, "unable to allocate memory, bailing out"); | ||
Py_RETURN_NONE; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same
cErr := C.CString("unable to import subprocess empty output exception") | ||
C.PyErr_SetString(C.PyExc_Exception, cErr) | ||
C.free(unsafe.Pointer(cErr)) | ||
return C._none() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return nil
cErr := C.CString("get_subprocess_output expected output but had none.") | ||
C.PyErr_SetString((*C.PyObject)(unsafe.Pointer(excClass)), cErr) | ||
C.free(unsafe.Pointer(cErr)) | ||
return C._none() |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return nil
C.free(unsafe.Pointer(cModuleName)) | ||
if utilModule == nil { | ||
cErr := C.CString("unable to import subprocess empty output exception") | ||
C.PyErr_SetString(C.PyExc_Exception, cErr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure we should overwrite the original exception
C.free(unsafe.Pointer(cExcName)) | ||
if excClass == nil { | ||
cErr := C.CString("unable to import subprocess empty output exception") | ||
C.PyErr_SetString(C.PyExc_Exception, cErr) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
not sure we should overwrite the original exception
@@ -75,6 +75,67 @@ func TestFindSubclassOf(t *testing.T) { | |||
assert.Equal(t, 1, sclass.RichCompareBool(barClass, python.Py_EQ)) | |||
} | |||
|
|||
func TestSubprocessBindings(t *testing.T) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
could you also write a test that executes a python check that uses get_subprocess_output
? (a bit like what the tests in check_test.go
do)
What does this PR do?
Runs the subprocess from go-land as opposed to forking from within the python interpreter. This should hopefully address memory leak issues we have experienced, as well as resolve the zombie fork issue (triggered by an obscure race-condition) we have experienced.
Motivation
The
subprocess32
approach in #830 has helped with the memory leak, but wasn't a truly satisfactory fix as it didn't quite fix the issue on windows. The alternative, setting a global lock on windows would likely be less desirable than a pure-go cross-platform fix. Hopefully this addresses that.Additional Notes
I've been unable to get the tests to fly on gitlab, they seem fine locally. Let's see what happens on CircleCI. Let's be thorough with the GIL, reference count during the review 😅