flaky test-process-exit-recursive #18323
Failing run: https://ci.nodejs.org/job/node-test-commit-linux/15696/nodes=alpine34-container-x64/console
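For context, the test name suggests it exercises calling process.exit() again from inside an 'exit' handler. A minimal sketch of that pattern (an assumption based on the test name, not the verbatim test source):

```js
// Sketch of the "recursive exit" pattern the test name suggests
// (assumption, not the actual test source). Calling process.exit()
// again from inside an 'exit' handler must terminate the process
// cleanly rather than recurse or segfault.
process.on('exit', () => {
  // We are already exiting; this second call should re-enter the
  // exit path and terminate, not crash.
  process.exit();
});
process.exit(0);
```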
/cc @gibfahn
I think our alpine containers are just docker images. We can get you access, but it might be easier for you to just spin up an alpine image and test in there.
thanks @gibfahn. Would you mind setting one up for me on one of our (IBM) dev boxes? (I am sure it would take me much more time than it would take you.)
That might be because of alpine's ridiculously low default stack size. Can you reproduce when you do […]?
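On the stack-size theory: musl-based Alpine gives threads a considerably smaller default stack than glibc systems, so deep recursion fails much earlier. A quick, illustrative way to see the effective limit from JavaScript is a recursion probe like this (exact depths vary by build and platform):

```js
// Illustrative probe: recurse until V8 throws RangeError
// ("Maximum call stack size exceeded") and report the depth reached.
// On a platform with a small default stack, the depth is noticeably lower.
let depth = 0;
function probe() {
  depth++;
  probe();
}
try {
  probe();
} catch (e) {
  console.log(`stack overflowed at depth ${depth}`);
}
```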
Created an alpine image with @gibfahn's help and attempted it 2K times, but no luck reproducing. These are the image attributes, fwiw: […]
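For anyone repeating this kind of stress run, a minimal sketch of a loop that re-executes the test until it fails; the test path here is an assumption based on the usual Node.js repo layout, not taken from the thread:

```js
// Minimal stress-runner sketch: re-run a single test many times and
// stop at the first non-zero exit. The test path is an assumption.
const { spawnSync } = require('child_process');

const test = 'test/parallel/test-process-exit-recursive.js';
for (let i = 1; i <= 2000; i++) {
  const r = spawnSync(process.execPath, [test], { stdio: 'inherit' });
  if (r.status !== 0) {
    console.error(`failed on iteration ${i} (code ${r.status}, signal ${r.signal})`);
    process.exit(1);
  }
}
console.log('no failures in 2000 iterations');
```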
At this point I'll stop working on this and wait to see if it shows up again in the CI runs.
@gireeshpunathil @bnoordhuis any further thoughts here? This has happened on a few other tests, all very short-running tests that don't do much. For example: #18585. It makes me think this must be something related to the bootstrap / teardown, where the code is clearly overreaching. For example, we could be manipulating V8 objects after Dispose, or something of that nature? I first considered problematic […]
The shortest-running example was #18505, which literally exits right after requiring […]
I had stopped working on this, but thanks for the collated information: it shows a pattern for the failure and gives a further opportunity to investigate. Let me look into these. One challenge is the intermittent nature, even in the CI. If you ever spot a consistent failure in CI, the way forward would be to set […]
@apapirovski - that provides a sufficient proof point that the issue can be outside of node itself. The practical code that runs in #18505 on alpine is:

```js
'use strict';
const common = require('../common');

if (!(common.isOSX || common.isWindows))
  common.skip('recursive option is darwin/windows specific');
```
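For readers unfamiliar with the harness, common.skip() reports the test as skipped and exits the process, so on alpine this test is essentially require, print, exit. A rough sketch of that behaviour (an assumption, not the harness's actual source):

```js
// Rough sketch of what common.skip() boils down to (assumption, not
// the actual test/common source): emit a TAP skip line, then exit 0.
function skip(msg) {
  console.log(`1..0 # Skipped: ${msg}`);
  process.exit(0);
}
```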
@gibfahn: are those containers persistent, or are they recreated for each run?
I think the containers are persistent, but I'm not familiar with the specifics. That's more a question for @rvagg.
Dockerfiles are here: https://github.com/nodejs/build/tree/master/ansible/roles/docker/templates. Also, Alpine 3.4 is kind of old now; what do people here think about just removing it? I'd like to do that at least by the time 3.8 is released.
pull requests welcome to the Dockerfiles of course! Especially from people who actually use Alpine and have more of a clue than me about how it's typically configured.
thanks @rvagg. In that case, is it possible to zip up the alpine34-container-x64 and alpine35-container-x64 images and send them over for debugging? The fresh images I created from the Dockerfile template never reproduce the noted failures. If we are able to recreate the failures with the CI images, we can conclude that the issue lies in the images themselves, and refresh them.
I think that would fix a lot of the issues people have reported, both test failures and build issues. I'm testing against 3.7 and it's been pretty smooth. Mine is a stock install in a KVM virtual machine, not a Docker image; I don't know if that makes a difference.
Seems reasonable to me. I assume 99% of people running Node in Alpine are doing it through docker, so they're unlikely to be pulling new versions of Node into existing images.
I'm leaning towards closing this: we've fixed several segfaults in the last 3 months, and this test hasn't failed in a long time. In addition, the V8 team helped us run Node.js with debug parameters that uncover segfaults, and we've fixed everything those runs turned up.
I am fine with that.