campaigns: add and use volume mounts by default on Intel macOS #412
This mode is added by sourcegraph/src-cli#412.
Fantastic, thorough work - docs, tests, scripts, code. And things can get up to 4 times faster. Good start into the new year :)
OK, so this has changed around a bit in response to @mrnugget's review. Most notably: there's now a PTAL! |
Very nice work 🌟
Co-authored-by: Thorsten Ball <mrnugget@gmail.com>
Background
As we now know all too well, the default bind mount behaviour we use to manage workspaces when executing campaigns is slow on macOS: any file I/O in the workspace has to go through the loopback interface, and this causes problems with I/O heavy operations such as pretty much anything to do with `npm` or `yarn`.

The steps we perform within a workspace are extremely simple, however: we unzip the repository archive and run a variety of `git` commands to generate the diff. It's handy for this to be done on the host filesystem, since it allows the user to inspect what happens to their repository if a campaign step fails, but it's not a technical requirement.

What's in the ~~box~~ PR

This PR generalises our existing concept of a workspace, and expands it significantly. Workspace modes are now defined by implementing a `WorkspaceCreator` interface, which knows how to create `Workspace` implementations which implement the common operations we need to set up, diff, and tear down workspaces. This is used to add a new workspace mode that uses a Docker volume to contain the workspace, rather than bind mounting the workspace from the host. This is controlled by the new `-workspace` flag, and the default for Intel macOS has been switched to the new volume mode.

For this to work, we have to perform the workspace management commands (the aforementioned unzipping and `git` malarkey) within a container, since that's the only way to access the volume. As such, part of this PR adds a Docker image that extends `alpine` to always have `git` available and configured appropriately, along with `unzip` and `curl`. A GitHub Action has been added that will rebuild and push this image on `src` release.

There's also a lot of new testing stuff: the volume workspace code is copiously unit tested. In order to support this, there's a (very) minimal implementation of a mock API client, and an expanded ~~ripoff~~ implementation of the method Go's `os/exec` test suite uses to mock external commands, which means the new tests don't need `dummydocker` and work on Windows.

Adam's guess at obvious questions
Why is CI failing?
The `os/exec` method for mocking external commands basically redirects `exec.Cmd` calls back into the test binary, which gains a little bit of logic in its `TestMain` to handle those correctly.

This works fine cross-platform, but something about AppVeyor specifically makes this not work on Windows. The errors are fairly inscrutable. However, the approach doesn't have any systemic issue on Windows: it works fine both for me locally, and in GitHub Actions, which now has Windows runner support.
I've opened #415, which is included in this PR as well, to remove our AppVeyor support and only use GitHub Actions for CI. If we decide to go ahead with that, then we can remove the AppVeyor integration and this PR will start passing.
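For readers unfamiliar with the re-exec trick described above, here is a minimal sketch of the idea. It is not the code from this PR: it is a standalone program that uses `init` where the PR uses `TestMain`, and the environment variable name and the `mockCommand` helper are hypothetical names chosen for illustration.

```go
package main

import (
	"fmt"
	"os"
	"os/exec"
	"strings"
)

// helperEnv is a hypothetical marker variable: when it is set to "1", this
// process acts as the mocked command instead of running the normal program.
const helperEnv = "SKETCH_WANT_HELPER_PROCESS"

// init stands in for the logic the PR puts in TestMain: if the binary was
// re-executed as a helper, behave like the fake command and exit before main.
func init() {
	if os.Getenv(helperEnv) != "1" {
		return
	}
	// os.Args[1:] holds the command name and arguments the caller asked for.
	fmt.Printf("mock %s\n", strings.Join(os.Args[1:], " "))
	os.Exit(0)
}

// mockCommand returns an exec.Cmd that the caller treats as `name args...`,
// but which actually re-executes the current binary in helper mode.
func mockCommand(name string, args ...string) (*exec.Cmd, error) {
	self, err := os.Executable()
	if err != nil {
		return nil, err
	}
	cmd := exec.Command(self, append([]string{name}, args...)...)
	cmd.Env = append(os.Environ(), helperEnv+"=1")
	return cmd, nil
}

func main() {
	cmd, err := mockCommand("git", "diff", "--cached")
	if err != nil {
		fmt.Fprintln(os.Stderr, "cannot locate own binary:", err)
		os.Exit(1)
	}
	out, err := cmd.Output()
	if err != nil {
		fmt.Fprintln(os.Stderr, "helper failed:", err)
		os.Exit(1)
	}
	fmt.Print(string(out)) // prints "mock git diff --cached"
}
```

Because the fake command is the binary itself, nothing needs to be installed on the PATH, which is why this style of mock does not depend on a platform-specific stand-in executable.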
How much faster is this on macOS?
It depends.
For campaigns that generate small-to-medium sized diffs and do little I/O, there isn't much difference: maybe a few percent here and there. For campaigns that generate large diffs and do little I/O, this can actually be slower. (That feels like a really weird case, though.)
For campaigns that perform lots of I/O, this is a big win. A test campaign that upgrades TypeScript in Sourcegraph completes in approximately a quarter of the time in volume mode compared to bind mode. (~11 minutes compared to ~42.)
Why change the default on macOS only?
Volume mode isn't universally better. On Linux (assuming nothing weird like a remote Docker server), there's no reason not to use bind mount mode: the performance is the same, and you have the advantage that you can inspect what happened in your workspace if a step fails.
Why change the default on Intel macOS only?
Conservatism. I've put in a bunch of effort to make multi-architecture builds of the Docker image needed by volume mode work, but I don't have an M1 Mac sitting around to test this. It should work, but I'd prefer to find that out by testing it using `-workspace volume` explicitly, rather than finding out that it doesn't work after making it the default.

What about Windows?
My suspicion is that this will provide similar performance improvements on Windows, but I don't have good numbers on this right now, don't have an environment ready to go to prove that out, and I'd prefer to focus on macOS for now. It's easy enough to change the default later.
That said, I think there's another reason we should consider this for Windows sooner rather than later: it effectively removes the requirement to have `git` in your PATH, and removes any potential for Windows-specific `git` weirdness.

What future work could we do to improve this further?
Glad you asked! I had more ideas, but this PR was more than big enough already.
- [...] `sleep` to `exec` into the container to inspect the workspace if that was necessary. We should probably provide some sort of interactive debugging mode where you get dropped into a container with the workspace already set up and running on error. (I mean, this is probably a good idea for bind mode, too.)
- [...] `stdout` than anything endemic to the approach. Another way of doing this would be to have the utility container running the whole time while executing a campaign on a repository, and provide some sort of RPC interface to perform setup and teardown and (most importantly) get diffs without having to go through Docker.
- [...] `internal/exec` package instead of `os/exec` directly. If we migrate other modules to use this, then we could use that down the track to provide verbose logging of every command that's run, since command execution will go through a central place.

PR links
Skeletal end user documentation is provided by sourcegraph/sourcegraph#16979.
Fixes sourcegraph/sourcegraph#16809.