Release 5.10.2 not starting any servers #3870

Closed
lionslair opened this issue Apr 25, 2024 · 10 comments · Fixed by #3885
Assignees
Labels
bug Something isn't working

Comments

@lionslair

The latest release, https://github.com/philips-labs/terraform-aws-github-runner/releases/tag/v5.10.2, is not starting any servers; v5.10.1 worked fine.

I have destroyed and recreated the Terraform stack multiple times with no luck. I also deleted the termination-watcher.zip and added it back, but that made no difference (I don't actually fully understand what that component does).

Whatever I try, no servers start on 5.10.2, while 5.10.1 was fine.

@ryzr

ryzr commented Apr 26, 2024

We suddenly started having issues, though we were running 5.9. Not sure if it's related, but we noticed this in our scale-up lambda logs, shortly after the "Received event" log.

"Ignoring error: response.json(...).catch is not a function"

EDIT: actually, I've only been getting the above error since upgrading to 5.10.x. Reverting to see what happens. My issue on 5.9 was likely something else.

EDIT2: response.json issue gone after reverting to 5.9.0. Curious if you find similar logs to above once you're all wiped/restored.
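
Not the module's actual code, but a minimal sketch of how this TypeError can arise: if a dependency update changes json() from returning a Promise to returning the parsed body directly, the chained .catch() no longer exists. All names here are illustrative.

    // TypeScript sketch (hypothetical shapes, not the real lambda code).
    type AnyResponse = { json: () => any };

    // Old behaviour: json() returns a Promise, so .catch() exists.
    const promiseStyle: AnyResponse = { json: () => Promise.resolve({ ok: true }) };
    // After a hypothetical dependency swap: json() returns the body
    // directly, and .catch is not a function on a plain object.
    const eagerStyle: AnyResponse = { json: () => ({ ok: true }) };

    function readBody(response: AnyResponse): void {
      try {
        response.json().catch((e: unknown) => console.warn(`Ignoring error: ${e}`));
      } catch (e) {
        // TypeError: response.json(...).catch is not a function
        console.warn(`Ignoring error: ${(e as Error).message}`);
      }
    }

    readBody(promiseStyle); // fine
    readBody(eagerStyle);   // logs the warning seen in the lambda logs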

@lionslair
Author

lionslair commented Apr 26, 2024

I have now got myself into such a state trying to wipe everything from AWS and recreate it fresh.

I keep getting errors like:

    The specified log group already exists

    EntityAlreadyExists: Instance Profile *****-gh-ci-runner-profile already exists

@lionslair
Author

> Not sure if related, but noticed this in our scale-up lambda logs, shortly after the "Received event" log: "Ignoring error: response.json(...).catch is not a function"

Yes, I have the same error in that log:

    "level": "WARN",
    "message": "Ignoring error: response.json(...).catch is not a function",
    "service": "service_undefined",

@lionslair
Author

lionslair commented Apr 26, 2024

My uneducated guess is that it's this change:
v5.10.1...v5.10.2#diff-4eccdb723f617ab228bd897ae3b78dba1fb4a779f818efa21e55ccb75b06113dL19

I have rolled back to 5.10.1 and servers are starting again; however, no jobs are picked up.

@sykhro

sykhro commented Apr 26, 2024

Same issue here.

@lionslair
Author

I have finally got things back up and working using v5.10.0, the last version I know worked fine.

I made a total mess of the system. The AMI user I was using within the instance images also did not have all the permissions it needed, so instances were starting and then shutting down again.

Once it had permission to read the GitHub runner download from the S3 bucket, things started to work again. Clearing everything out is where it really got messed up.
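
For anyone hitting the same start-then-shutdown loop, a quick smoke test of the instance permissions, run under the runner's instance profile (a sketch; the bucket and key are placeholders, not the module's real names):

    // TypeScript, AWS SDK v3: verify the runner binary in S3 is readable.
    import { GetObjectCommand, S3Client } from '@aws-sdk/client-s3';

    const s3 = new S3Client({});
    const bucket = 'my-runner-binaries-bucket';     // placeholder
    const key = 'actions-runner-linux-x64.tar.gz';  // placeholder

    s3.send(new GetObjectCommand({ Bucket: bucket, Key: key }))
      .then(() => console.log('runner binary is readable'))
      .catch((err) => console.error('instance profile likely lacks s3:GetObject', err));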

@espizo

espizo commented Apr 29, 2024

We were seeing the same thing; downgrading to 5.10.0 fixed the problem.

We have also been seeing this in lambdas across several versions, although things appear to be working:

    "service": "service_undefined",

@joemiller

Ran into this after upgrading v5.10.0 -> v5.10.2 as well. Instances would not spin up anymore. Rolled back to v5.10.0 and things seem to be working again.

@rsavage-nozominetworks

rsavage-nozominetworks commented May 1, 2024

I can confirm I am seeing the same issues. I upgraded both my runner code and lambdas from 5.9.0 to 5.10.2 and started seeing the problem in all of my scale-up lambdas (see below). If I downgrade just the lambdas from 5.10.2 to 5.9.0, they work. I have tried this twice (upgrade and downgrade); the results are the same.

Scale Up Lambdas Errors

{
    "level": "WARN",
    "message": "Ignoring error: response.json(...).catch is not a function",
    "service": "service_undefined",
    "timestamp": "2024-05-01T14:03:59.036Z",
    "xray_trace_id": "1-66324bcd-b91f32230d4f1c5052a29617",
    "region": "us-east-1",
    "environment": "<REDACTED",
    "aws-request-id": "d9b95d22-1a54-59ca-9a20-038e545cb1f1",
    "function-name": "<REDACTED>-scale-up",
    "module": "lambda.ts"
}

This seems like a widespread issue, as I'm not the only one in the community experiencing this problem.

@npalm
Member

npalm commented May 3, 2024

Thanks for reporting, we are going to dig in further.

@npalm npalm self-assigned this May 3, 2024
@npalm npalm added the bug Something isn't working label May 3, 2024
npalm added a commit that referenced this issue May 3, 2024
Release 5.10.2 is broken, caused by PR #3867. This PR reverts the change.

fix #3870, #3883
npalm pushed a commit that referenced this issue May 3, 2024
🤖 I have created a release *beep* *boop*
---


## [5.10.3](v5.10.2...v5.10.3) (2024-05-03)


### Bug Fixes

* revert dependency update / broken release 5.10.2 ([#3885](#3885)) ([7464f2b](7464f2b))
* problem reported in issues #3883 #3870 @rsavage-nozominetworks @lionslair @ryzr @sykhro @espizo @bicefalo @nap @joemiller


---
This PR was generated with [Release Please](https://github.com/googleapis/release-please). See [documentation](https://github.com/googleapis/release-please#release-please).

Co-authored-by: forest-releaser[bot] <80285352+forest-releaser[bot]@users.noreply.github.com>