api error VPCIdNotSpecified: No default VPC for this user #7834
Comments
For me it worked on one environment after upgrading 1.2.2 -> 1.3.0, but didn't work on another after a fresh installation. |
Hi, I have the same experience: after upgrading from 1.2.2 to 1.3.0 it seems to work, but after a clean install it doesn't. My ec2nodeclass looks like:
The cluster_primary_security_group_id has the VPC association. CloudTrail also complains. Maybe I'm missing something in the class. |
We see this and similar issues after upgrading from v1.2.0 to v1.3.0. The error seems to come from new validation logic introduced in v1.3.0 in this PR: #7624. We also see a similar error like this:
|
I also encountered the same issue after upgrading karpenter chart from v1.2.1 to v1.3.0. The karpenter controller pod logs show multiple VPCIdNotSpecified errors when trying to create instances. It seems like Karpenter is trying to use GroupName, which is only supported in EC2-Classic or the default VPC. However, our AWS account does not have a default VPC.
karpenter pod's error log:
kubectl logs -l app.kubernetes.io/name=karpenter -n kube-system
{"level":"ERROR","time":"2025-03-05T09:07:52.015Z","logger":"controller","message":"Reconciler error","commit":"ff59416","controller":"nodeclass","controllerGroup":"karpenter.k8s.aws","controllerKind":"EC2NodeClass","EC2NodeClass":{"name":"<REDACTED>"},"namespace":"","name":"<REDACTED>","reconcileID":"fff90e63-0b13-4ab8-a6b6-0a94461a2e26","error":"validating ec2:RunInstances authorization, operation error EC2: RunInstances, https response error StatusCode: 400, RequestID: e9ce7351-e9ad-43fd-ba6f-bc87279bfa97, api error VPCIdNotSpecified: No default VPC for this user. GroupName is only supported for EC2-Classic and default VPC."}
{"level":"ERROR","time":"2025-03-05T09:07:52.204Z","logger":"controller","message":"Reconciler error","commit":"ff59416","controller":"nodeclass","controllerGroup":"karpenter.k8s.aws","controllerKind":"EC2NodeClass","EC2NodeClass":{"name":"<REDACTED>"},"namespace":"","name":"default","reconcileID":"d1960fd9-270c-41e8-a0c4-9163ec7f8bb6","error":"validating ec2:RunInstances authorization, operation error EC2: RunInstances, https response error StatusCode: 400, RequestID: e7ee63c6-2e7b-4cf0-86bd-ea420389372a, api error VPCIdNotSpecified: No default VPC for this user. GroupName is only supported for EC2-Classic and default VPC."} |
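To check how many nodeclasses are affected, the JSON log lines above can be tallied with a short script. This is only a sketch: it assumes the controller emits one JSON object per line with the `level`, `name`, and `error` fields seen in the log excerpt, and the function name `count_vpc_errors` is made up for illustration.

```python
import json
from collections import Counter


def count_vpc_errors(log_lines):
    """Count VPCIdNotSpecified reconciler errors per EC2NodeClass name.

    Assumes one JSON object per line, shaped like the excerpt in this thread.
    """
    counts = Counter()
    for line in log_lines:
        line = line.strip()
        if not line:
            continue
        try:
            entry = json.loads(line)
        except json.JSONDecodeError:
            continue  # skip anything that is not a JSON log line
        if not isinstance(entry, dict):
            continue
        if entry.get("level") != "ERROR":
            continue
        if "VPCIdNotSpecified" not in str(entry.get("error", "")):
            continue
        # The top-level "name" field carries the nodeclass name in these logs.
        counts[entry.get("name", "<unknown>")] += 1
    return counts
```

For example, saving the `kubectl logs` output to a file and calling `count_vpc_errors(open(path))` would give one counter entry per affected nodeclass.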
It looks like setting a security group is missing in the call to ec2:RunInstances. |
I wonder, does this only affect AWS accounts where there is no default VPC set up? It happened to me going from 1.2.2 to 1.3.0 too. |
Tested on two accounts: one with a default VPC and a second without. The one with a default VPC works fine, without errors; the one without a default VPC produces a lot of errors.
|
I can see exactly the same behavior as MKnichal, but in 2 different regions of the same account. |
Thanks all for the reports, we'll get someone looking at this today. /triage needs-investigation |
/assign @jonathan-innis |
Update: Looks like a miss in our auth checking validation logic that doesn't specify the subnets or security groups that are normally passed in through CreateFleet -- validating the fix now but it should just be passing these in. I'm also updating our CI testing accounts so they don't have a default VPC, since that would have caught this ahead of time. |
Hoping I can replace with a new public image. |
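To illustrate the difference described in the update above, here is a rough sketch of the two request shapes. It is not Karpenter's code (which is Go) and not the actual fix. It only uses the EC2 RunInstances parameter names as boto3's `run_instances` exposes them, and the helper name `dry_run_params` plus all IDs are placeholders.

```python
def dry_run_params(image_id, instance_type, subnet_id=None, security_group_ids=None):
    """Build keyword arguments for a dry-run RunInstances call.

    Without SubnetId, EC2 falls back to the account's default VPC, which is
    where the errors in this thread appear when no default VPC exists.
    """
    params = {
        "ImageId": image_id,
        "InstanceType": instance_type,
        "MinCount": 1,
        "MaxCount": 1,
        "DryRun": True,  # authorization check only, nothing is launched
    }
    if subnet_id:
        params["SubnetId"] = subnet_id
    if security_group_ids:
        # Security group IDs (not names) work in any VPC.
        params["SecurityGroupIds"] = list(security_group_ids)
    return params


# Shape that depends on a default VPC (placeholder IDs):
without_network = dry_run_params("ami-PLACEHOLDER", "t3.micro")

# Shape that pins the request to an explicit subnet and security group:
with_network = dry_run_params(
    "ami-PLACEHOLDER", "t3.micro",
    subnet_id="subnet-PLACEHOLDER",
    security_group_ids=["sg-PLACEHOLDER"],
)
```

With boto3 installed and credentials configured, such a dict could be passed as `ec2_client.run_instances(**with_network)`. A dry-run request that is authorized comes back as a `ClientError` with code `DryRunOperation` rather than a normal response.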
This error is gone for me now with that latest image in the PR that was merged, but not able to get nodes up. Just see a bunch of "Starting Controller/etc" messages. |
Could we please get a comment when the new image is available? Still getting the issue as of now. |
Description
Observed Behavior:
On AWS EKS, I just upgraded Karpenter from v1.2.2 to v1.3.0, and the controller logs the following error every minute (1 line per nodeclass):
Did I forget to update some IAM permission, or did I miss a tag or something in the SG config?
Expected Behavior:
Controller does not log ERRORs
Reproduction Steps (Please include YAML):
Versions:
kubectl version