feat: GitHub - Support loading git token from disk #4928

Merged
merged 1 commit into from
Nov 4, 2024


meringu
Contributor

@meringu meringu commented Sep 17, 2024

what

Adds a gh-token-file server setting that can be used instead of gh-token. The token is read from disk as part of the GitHub client transport, allowing the token to be rotated without restarting the Atlantis process. I've also re-used the .git-credentials token rotator from the GitHub app integration to ensure that write-git-creds updates the .git-credentials file as the gh-token-file is updated. This only works for GitHub, so the flag uses the --gh-* prefix like the other gh configs. It could be extended to the other VCS options as needed.

why

We run about 150 Atlantis instances in our GitHub org. GitHub have a hard limit of 100 GitHub apps per org, and charge a seat per service account user. To get around these constraints, we have developed a GitHub app which issues a scoped token for each Atlantis instance and loads it as a Kubernetes secret. The app is also responsible for re-signing webhooks with per-instance webhook secrets and forwarding them to the correct Atlantis instance. I'd potentially be interested in open-sourcing this app if there is interest. There are still a few issues we are working through, like this one, and it is currently in a complicated relationship with Keda.
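As a rough illustration of what re-signing a webhook could involve, the sketch below verifies a payload against one secret and produces a new signature with another. It assumes GitHub's X-Hub-Signature-256 scheme (the string "sha256=" followed by the hex-encoded HMAC-SHA256 of the request body). The function names and secrets are hypothetical, and this is not the source of the app described above.

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"errors"
	"fmt"
)

// sign returns a signature header value in the "sha256=<hex>" form.
func sign(secret, body []byte) string {
	mac := hmac.New(sha256.New, secret)
	mac.Write(body)
	return "sha256=" + hex.EncodeToString(mac.Sum(nil))
}

// verify checks a received header value using a constant-time comparison.
func verify(secret, body []byte, header string) bool {
	return hmac.Equal([]byte(sign(secret, body)), []byte(header))
}

// resign (hypothetical) validates the incoming signature against the secret
// shared with GitHub, then signs the same body with the per-instance secret.
func resign(upstreamSecret, instanceSecret, body []byte, header string) (string, error) {
	if !verify(upstreamSecret, body, header) {
		return "", errors.New("upstream signature mismatch")
	}
	return sign(instanceSecret, body), nil
}

func main() {
	upstream := []byte("secret-shared-with-github") // placeholder value
	instance := []byte("secret-for-one-atlantis")   // placeholder value
	body := []byte(`{"action":"opened"}`)

	incoming := sign(upstream, body) // stands in for the header GitHub sent
	outgoing, err := resign(upstream, instance, body, incoming)
	if err != nil {
		panic(err)
	}
	fmt.Println("forward with X-Hub-Signature-256:", outgoing)
}
```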

We run Atlantis as a GitHub app, but configure it with short-term credentials to run as a GitHub user. The tokens only last one hour, so we manage restarts as part of scale-to-zero to ensure that Atlantis is always running with a valid token.

If the Atlantis instance fails to restart within an hour due to high activity or long-running plans or applies, the commands will finish, but the results will fail to be commented back to the GitHub pull request.

With this change we can load the token from disk, and as our GitHub app rotates the token, it is immediately picked up by the running Atlantis instance, allowing it to run uninterrupted for longer periods of time.

tests

I ran an apply with the following Terraform, which used to fail to comment back:

resource "time_sleep" "wait_over_hour" {
  create_duration = "4000s"
}

I also ran some cat commands against the .git-credentials file with --write-git-creds specified to ensure that it was getting updated as the token was being rotated.

references

@meringu meringu requested review from a team as code owners September 17, 2024 01:26
@meringu meringu requested review from GenPage, lukemassa and X-Guardian and removed request for a team September 17, 2024 01:26
@github-actions github-actions bot added docs Documentation go Pull requests that update Go code provider/github labels Sep 17, 2024
@meringu meringu changed the title Support loading git token from disk feat: Support loading git token from disk Sep 17, 2024
@X-Guardian
Contributor

X-Guardian commented Nov 2, 2024

Thanks for this @meringu, I have requested a few small changes. Can you also update the PR description to indicate that this is for GitHub?

@X-Guardian X-Guardian added the waiting-on-response Waiting for a response from the user label Nov 2, 2024
@meringu meringu changed the title feat: Support loading git token from disk feat: GitHub - Support loading git token from disk Nov 3, 2024
@meringu
Contributor Author

meringu commented Nov 3, 2024

Cheers for the review @X-Guardian, I appreciate it.

I have addressed your comments.

Signed-off-by: Henry Muru Paenga <meringu@gmail.com>
@dosubot dosubot bot added the lgtm This PR has been approved by a maintainer label Nov 4, 2024
@X-Guardian X-Guardian enabled auto-merge (squash) November 4, 2024 23:01
@X-Guardian X-Guardian merged commit 74916fb into runatlantis:main Nov 4, 2024
34 checks passed
@X-Guardian
Contributor

Thanks for this @meringu. You can test using one of these container images: dev-debian-74916fb or dev-alpine-74916fb

@meringu
Contributor Author

meringu commented Nov 5, 2024

Thanks for merging @X-Guardian!

I hope I haven't messed up the release testing, but I noticed that there was a gap in my testing methodology when doing some more testing in my environment with the dev-debian-74916fb image.

The current gh-token is still working, so there was no regression, but the gh-token-file setting isn't working as expected. I've raised a fix here: #5067

@X-Guardian
Contributor

No problem @meringu, this is why we have these beta container image releases. I'll take a look at your new PR.

terakoya76 pushed a commit to terakoya76/atlantis that referenced this pull request Dec 31, 2024
kvanzuijlen pushed a commit to kvanzuijlen/atlantis that referenced this pull request Jan 4, 2025
Signed-off-by: kvanzuijlen <8818390+kvanzuijlen@users.noreply.github.com>
2 participants