OSX nightly debug failed cloning from GitHub on mini1 and mini2 #14
Comments
And this job failed with a different but similar error:
Note that this problem only shows up on mini2 and not on the other machines.
Ah, thanks. I thought that might be the case, but I couldn't quite remember :).
I also noticed this is specific to the macOS jobs; I didn't think to check whether it's specific to one particular host. It may be a certificate bundle or OpenSSL version issue that could be resolved by grabbing a fresh version of either of those from Homebrew. If the issue persists after bumping those packages, we should find a way to troubleshoot the connection issues by setting
The trend view is pretty useful to gather this kind of information:
👍
We recently had an occurrence of this on mini1 as well: http://ci.ros2.org/view/nightly/job/nightly_osx_release/526/
Searching for the error output suggests that this is a frequent issue with macOS hosts running CI. It seems like it may just be a persistent error somewhere in the network stack. It occurs not only during Git clones but also during other curl-based HTTPS operations. Folks not using Git have reported that switching to wget, which has more (which is to say: some) retry behavior, has given them more stability. As far as I know, there is no way to swap Git's HTTP backend, nor a way to instruct the HTTP backend to auto-retry. A hacky fix would be to check the vcs exit code and try the vcs import again after 3-5 seconds.
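For illustration, the exit-code retry described above could look roughly like the sketch below. This is a minimal sketch, not what any CI script actually does: `run_with_retry`, its parameters, and the default of three attempts are hypothetical names and values chosen for the example, and the 4-second delay is simply picked from the 3-5 second range mentioned above.

```python
import subprocess
import sys
import time


def run_with_retry(cmd, attempts=3, delay=4.0, run=subprocess.run, sleep=time.sleep):
    """Run cmd, retrying while it exits non-zero; return the last exit code.

    Hypothetical helper: `run` and `sleep` are injectable only so the
    retry logic can be exercised without touching the network.
    """
    returncode = 1  # reported as a failure if attempts < 1
    for attempt in range(1, attempts + 1):
        returncode = run(cmd).returncode
        if returncode == 0:
            break  # the command succeeded, no retry needed
        if attempt < attempts:
            print(
                f"attempt {attempt}/{attempts} exited with code {returncode}, "
                f"retrying in {delay}s",
                file=sys.stderr,
            )
            sleep(delay)
    return returncode
```

The caller would pass the import command as an argument list, for example whatever invocation of `vcs import` the job already uses, and fail the build only if the returned code is still non-zero after the last attempt. Note that this retries on any failure, not just on the SSLRead error, which is part of what makes it a hack.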
Thanks for the info! I think we can live with it until we switch to the new buildfarm. ros_buildfarm has retry behavior for most network-related operations (git, apt, etc.); ideally we could leverage the same type of retry behavior on non-Docker-based builds.
We've brought up this new buildfarm a couple of times in offline conversation, and we'll almost certainly have more offline discussion about it, but I think it would be really helpful to have an issue or document somewhere about what we're missing with the current CI: what advantages it has over a vanilla ROS buildfarm and how we can grow either in the right direction. @mikaelarguedas, I've talked with you about it the most and you seem to have the clearest vision at the moment. Could you make a pitch- or prompt-style issue for us to iterate on and discuss?
I will close this, since as of ros2/ci#103 we retry if the cloning fails.
This job: http://ci.ros2.org/view/nightly/job/nightly_osx_debug/435/
failed to clone from github with an error:
07:39:33 error: RPC failed; curl 56 SSLRead() return error -9806
We've seen this from time to time, and it usually goes away. Still, we've seen it often enough that I think we should probably look into it. I think @dhood had done some research before, but I thought I would re-report it here.