Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix: parallel plan and apply also in a single workspace (rebased) #5264

Open
wants to merge 5 commits into
base: main
Choose a base branch
from

Conversation

plentydone
Copy link

what

@finnag did all the real work here. I just rebased and regenerated mocks. I'm new to this codebase, but I've reviewed the code and it seems reasonable to me, and I've reviewed all the commits between October and today and I see no logical conflicts. If in fresh review any changes are needed, I'm happy to do so.

From the original PR description:

- Add more thorough locking around Clone() calls, covering all of these phases:
   Am I on the right commit
   Merge with upstream
   Clone if necessary
- Reduce the number of remote git operations when planning or applying in parallel
   Clean up the Clone() method, split into Clone() and MergeAgain()

For parallel mode to work, you must either set the environment variable `TF_PLUGIN_CACHE_MAY_BREAK_DEPENDENCY_LOCK_FILE` to something, or check in your .hcl files. Otherwise terraform cannot run in parallel.

why

The Clone call had several race conditions where it could miss clones or delete the working directory under running processes causing failures.

tests

I ran make test-all fmt lint

references

This is just #3670 rebased

@plentydone plentydone requested review from a team as code owners January 23, 2025 04:28
@plentydone plentydone requested review from jamengual, lukemassa and nitrocode and removed request for a team January 23, 2025 04:28
@github-actions github-actions bot added go Pull requests that update Go code provider/github provider/gitlab labels Jan 23, 2025
@dosubot dosubot bot added the feature New functionality/enhancement label Jan 23, 2025
@plentydone plentydone changed the title Parallel plans fix: parallel plan and apply also in a single workspace (rebased) Jan 23, 2025
@plentydone
Copy link
Author

I'm not sure what the etiquette is around signing off on this as the committer given that I just rebased someone else's work from their own MR.

@jamengual jamengual added the waiting-on-review Waiting for a review from a maintainer label Jan 23, 2025
@jamengual
Copy link
Contributor

I'm not sure what the etiquette is around signing off on this as the committer given that I just rebased someone else's work from their own MR.

just sign it with yours

We generate a list of all interesting directories, so we can target
the locks to the affected directories instead of using a (too) global lock

Signed-Off-By: Andrew Carter <andrew@emailcarter.com>
There is a race condition here where we test if we are current, and
only then if we are not current we grab the lock. In the meantime,
that information could be stale.

Extend the lock to cover all operations, and unconditionally wait for
the lock. We can't assume anything can be skipped if we have to wait for
the lock.

Signed-Off-By: Andrew Carter <andrew@emailcarter.com>
All Clone() calls that have signaled an interest in merging
before another Clone() checks whether a merge is necessary
can skip their own checks.

This should reduce the thundering herd problem at the
beginning of large paralell runs.

Signed-Off-By: Andrew Carter <andrew@emailcarter.com>
Clone is now a NOP if the PR has not changed, and loses its second
return value, the MergedAgain flag.

MergeAgain must be called explicitly in the only location that
cares about this flag, just before planning.

This cleans up the code for Clone and re-merging a bit.

Also regenerated mocks

Signed-Off-By: Andrew Carter <andrew@emailcarter.com>
@jseiser
Copy link

jseiser commented Jan 27, 2025

While we wait on this to get merged, we wanted to test running it locally. Are the images built, ever pushed to ghcr?

docker pull ghcr.io/runatlantis/atlantis:pr-5264-debian                                                         
Error response from daemon: manifest unknown

@jamengual
Copy link
Contributor

While we wait on this to get merged, we wanted to test running it locally. Are the images built, ever pushed to ghcr?

docker pull ghcr.io/runatlantis/atlantis:pr-5264-debian                                                         
Error response from daemon: manifest unknown

sha256:5f2412560425761d177fab5e1e878e3d9e08c0bce7a248d65eff88efde593b84

@jseiser
Copy link

jseiser commented Jan 27, 2025

sha256:5f2412560425761d177fab5e1e878e3d9e08c0bce7a248d65eff88efde593b84

Any chance you can paste syntax on this? Everything Im attempting is apparently wrong.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
feature New functionality/enhancement go Pull requests that update Go code provider/github provider/gitlab waiting-on-review Waiting for a review from a maintainer
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants