
Feature request: shared ALB across multiple Ingress objects #688

Closed
rbtcollins opened this issue Oct 23, 2018 · 64 comments

Comments

@rbtcollins

rbtcollins commented Oct 23, 2018

I want to highlight that this is not 'cross-namespace ingress', which is contentious in the core (kubernetes/kubernetes#17088), but rather the ability to use a single ALB, with vhosts, to satisfy multiple independent ingress objects, which may safely be spread over multiple namespaces.

The key difference is that rather than having one ingress that is able to send traffic in ways unexpected to services, each team or service defines an ingress, and the vhost(s) they want, exactly as normal - it's just an optimisation for the controller to reuse an ALB rather than making new ones.

Why? Two big reasons.

  1. Security group limitations. With many ELBs (we were exceeding 120-ish at one point), the number of security groups becomes a limitation, even though they are all very boring and almost identical. Having a fixed SG works until someone needs a different port, and then the whole self-service proposition breaks and an admin has to fix things. Ugh.

  2. For test cluster deployment we've found DNS propagation issues (ELB provisioned but NXDOMAIN on a lookup to it if you are too fast, and then short negative caching, argh) and AWS ELB provisioning times (not all the time, but often enough to break assumptions in code too often to be good) to be ongoing stability and performance problems, and we're expecting that to translate to ALBs too; if someone from AWS can categorically state that that's not the case, then yippee, otherwise ... :)

The basic top level sketch is just to accept additional ingress documents that use different virtual hosts, and have the same ALB configuration, and then bundle them onto the same ALB.

Other ingress controllers like nginx have done exactly this (https://github.com/ProProgrammer/cross-namespace-nginx-ingress-kubernetes).

Seems to me that it could be made a controller cmdline option to enable, making it opt-in, and thus mitigating any reliability / stability issues by not exposing every user to them.

Possible issues:

  • what if an existing ingress adds a vhost already served by a second ingress?
    What of it: it's broken, but it's broken whether or not the ALB is shared.
  • what if previously compatible ALB definitions are changed?
    Ignore the change until all ingresses sharing the same ALB agree on the new setting (whatever it is), then action it (and in the meantime complain via events).
@M00nF1sh
Collaborator

How about adding an annotation like alb.k8s.io/ingress.group: myGroup1
Ingresses belonging to the same group will be aggregated into the same ALB.
This brings a few benefits:

  1. Users can freely choose how many ALBs they want to use.
  2. Users can/should ensure all rules for ingresses of the same group don't conflict.
  3. Isolate changes in the cluster. The current codebase reconciles every ingress in the cluster on each event, which can cause problems as the ingress count increases. With ingressGroup, changes can be isolated within the same group of ingresses.
  4. We can add a permission controller to limit which users can create ingresses that join an ingressGroup, once initializers and admission controllers are generally available 😸
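
As a minimal sketch (assuming the proposed alb.k8s.io/ingress.group annotation above; the hostnames, namespaces and services are made up), two independent ingresses sharing one ALB might look like this:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: team-a
  namespace: team-a
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.k8s.io/ingress.group: shared-group-1   # proposed annotation, not yet released
spec:
  rules:
    - host: a.example.com
      http:
        paths:
          - backend:
              serviceName: service-a
              servicePort: 80
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: team-b
  namespace: team-b
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.k8s.io/ingress.group: shared-group-1   # same group, so same ALB
spec:
  rules:
    - host: b.example.com
      http:
        paths:
          - backend:
              serviceName: service-b
              servicePort: 80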

@rbtcollins
Author

If the group works in all other respects as I sketched, then I think it's no different from a security perspective, but it does have the downside of requiring explicit coordination between different teams. I think conflicts have to be dealt with by the controller - refuse to update an ALB where rules currently conflict, and log errors to the relevant ingresses - not hard to do.

Using the group idea could make it reasonable to not just use different hostnames but also to permit routes within one or more common hostnames, because namespaces are opting into this - but that seems harder to reason about than just 'unique hostnames apply' :) - OTOH, some folk may need that!

I think I see the group concept as largely a separate feature building on the sharing capability; I have no objections to having it exist and even be a mandatory thing to enable sharing - for our use case it would be fine.

w.r.t. performance, I don't understand the problem:

  • assuming the controller has a cache of the state of all relevant ingress objects (which except for the special case of controller restart, it should)
  • changes to any one ingress object would require at most updating one ALB configuration, which is the same overhead as today
  • more k8s objects may need to be consulted; but the cache gives a low constant factor on the lookup time, and there are no NP or NPC problems involved: finding compatible ALB configurations is an n log n problem (hash the relevant fields in canonical form to get a key, insert all ingress objects into a map under their key, et voila). Resolving conflicts is similarly n log n - build a unified route table from all the objects, and if a key already exists, it's an error.

If the problem with many ingresses today is due to the ALB interactions and AWS rate limits, then that would be made better simply by the act of grouping; if the problem is k8s interactions - I'd need more detail to model and help - but happy to do so...

@bigkraig

I like the group idea, and with regards to conflict resolution, we could log errors on the ingress that created the conflict, leveraging the last modified timestamps. It might look fairly similar in a debugging situation to what we have now.

I'm also a bit of a fan of having the group be an opt-in feature via an annotation either on the ingress or at a namespace level.

From a performance perspective, it's worth mentioning that ALBs do not scale instantaneously, and when you start to share an ALB across ingresses, you are susceptible to noisy neighbors. The scaling rate is also something along the lines of 2x every N minutes if I remember correctly. That is slow, but it will be increasingly slow if multiple services on the ALB receive a burst of traffic simultaneously, which is very common in SOA. All the more reason why, from my perspective and for Ticketmaster's use case, it should be an opt-in cost optimization.

@M00nF1sh
Collaborator

:D If we have such groups, we can add an optional controller option like defaultIngressGroup, which would apply to all ingresses without an explicit group annotation and fulfills @rbtcollins's use case 😄.
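
A rough sketch of what such an option might look like in the controller's container args (the --default-ingress-group flag is hypothetical and only illustrates the idea; it is not an existing flag):

containers:
  - name: aws-alb-ingress-controller
    args:
      - --cluster-name=myCluster
      - --ingress-class=alb
      # hypothetical flag: ingresses without an explicit group annotation
      # would all be folded into this default ALB group
      - --default-ingress-group=default-shared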

For the performance issue, it mainly comes from the AWS side; the k8s side already relies on a cache.

@rbtcollins
Author

Cool; so to summarise:

  • valid use case
  • design seems sound
  • controlled via a 'group' concept
  • opt-in capability to have a default group
  • whether it makes sense for a given use case depends on workload + ALB specifics (e.g. SOA noise + ALB scale responsiveness), so we offer both shared and non-shared as a result.

@M00nF1sh
Collaborator

M00nF1sh commented Oct 23, 2018

Yeah, I'd like to implement this once I finish the refactor on hand.
Also, we need to figure out how to deal with conflicts between annotations on ingresses within a group 😸 (e.g. internet-facing ALB vs internal).
A naive solution could be to aggregate all explicitly set annotations from the ingresses and, if the values mismatch, report an error (from the user's perspective, they would then only apply a given annotation to one ingress inside the group).
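
To illustrate the conflict case (a sketch using the proposed group annotation; names and services are hypothetical): two ingresses claim the same group but disagree on the ALB scheme, so the controller would report an error rather than silently pick a winner.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress-a
  annotations:
    kubernetes.io/ingress.class: alb
    alb.k8s.io/ingress.group: shared               # proposed annotation
    alb.ingress.kubernetes.io/scheme: internet-facing
spec:
  backend:
    serviceName: service-a
    servicePort: 80
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: ingress-b
  annotations:
    kubernetes.io/ingress.class: alb
    alb.k8s.io/ingress.group: shared               # same group...
    alb.ingress.kubernetes.io/scheme: internal     # ...but conflicting scheme
spec:
  backend:
    serviceName: service-b
    servicePort: 80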

@rbtcollins
Author

I think conflict handling for things that are either a) already grouped or b) requested to be grouped should be:

  • notify, via events, all ingresses whose requested configuration will not take effect

  • don't apply conflicting requests at all

  • since precedence may not be establishable, a safe default is to cease updating the ALB altogether until the ingresses converge on a non-conflicting specification

    • but in some cases it may be possible to still honour ALB updates for some ingresses. I think that should be a second stage development.
  • not all differing annotations will conflict - for instance, target-type can be different per ingress object with no issues (that I can see)

Example of the special cases I'm thinking of:
Ingress A: host X, /
Ingress B: host X, /
Ingress C: host Y, /

While we can't canonically say which of A or B should get host X:/, Y:/ is unconflicted, and we could in principle merge that into the existing ALB config without altering any existing rules for X. Then only A and B ingresses need error events registered.

@dshmatov

Is there any activity around the ALB sharing stuff? Maybe some help is needed?

@M00nF1sh
Collaborator

@dshmatov
Hi, the sharing stuff should be delivered in Dec. I want to add cognito support before that 😄

@sppwf

sppwf commented Jan 9, 2019

Hi Guys,

Any idea when the sharing will be implemented?

@M00nF1sh
Collaborator

M00nF1sh commented Jan 9, 2019

Hi Guys,

Any idea when the sharing will be implemented?

@sppwf Sorry for the delay on this feature. It needs far more work than I originally expected (also more features to support, like using it to configure auth/rules on a specific ingress path, and I took a long vacation in Dec =.=).
The current priorities on ALB are:

  1. CognitoAuth(blocked on internal Sec review)
  2. GlobalAcceleratorSupport(in progress)
  3. IpModesForKubenet
  4. IngressGroup
    The ingress group work needs a lot of refactoring in the code base; given the priority and complexity, I expect it to take another 2 months.

(BTW, is this a hard blocker for your use case? I.e. the only use case that cannot be mitigated with alternatives is having services in multiple namespaces that want a single ALB. I can try to re-prioritize if it's indeed a hard blocker.)

@sppwf

sppwf commented Jan 10, 2019

Hi,

Well, right now I am using the Nginx controller with a classical ELB with SSL on it. We are still in the development stage; however, I would like to move to ALB once we get to prod. We keep our services in different namespaces, but I still want to use the same ALB, from a cost perspective and an operational one as well.

Two months is fine to wait; hope that will be the start of March.

Thanks
Sergiu

@hikhvar

hikhvar commented Jan 11, 2019

Hey,

We want to use a single ALB for services in multiple namespaces. They should serve different HTTP paths for the same hostname. Our RBAC setup requires different namespaces for different services.
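
A minimal sketch of this use case, assuming the grouping proposal discussed above (the alb.k8s.io/ingress.group annotation is the proposed one; host, paths and services are illustrative): two ingresses in different namespaces share one hostname and split it by path.

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: orders
  namespace: team-orders
  annotations:
    kubernetes.io/ingress.class: alb
    alb.k8s.io/ingress.group: api          # proposed annotation, not yet released
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /orders/*
            backend:
              serviceName: orders
              servicePort: 80
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: billing
  namespace: team-billing
  annotations:
    kubernetes.io/ingress.class: alb
    alb.k8s.io/ingress.group: api          # same group, same host, different path
spec:
  rules:
    - host: api.example.com
      http:
        paths:
          - path: /billing/*
            backend:
              serviceName: billing
              servicePort: 80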

Thanks for your effort! Can we help you somehow? Maybe by giving early feedback?

Regards,
Christoph

@xdrus

xdrus commented Jan 16, 2019

Hello,

Our use case is a microservice API that has multiple components developed and deployed by different teams into different namespaces (with proper RBAC), but we want to expose it through one endpoint (ALB).

Having an ALB per namespace is a huge blocker for us (mainly because of very strict limits on subnet size in our air-gapped env).

Best,
Rus.

@M00nF1sh
Collaborator

Hey,

We want to use a single ALB for services in multiple namespaces. They should serve different HTTP paths for the same hostname. Our RBAC setup requires different namespaces for different services.

Thanks for your effort! Can we help you somehow? Maybe giving early feedback?

Regards,
Christoph

Sure, I'll give an update when there is a draft version available, and testing will be super helpful 👍

@xdrus I'm aware this is a major pain point for different namespaces. Will address this soon.

@marcosdiez
Contributor

Hey, I might have solved your problem in this PR: #830. Testing and feedback are welcome :)

@linuxbsdfreak

Hi @marcosdiez. How do I test this single ALB feature? The PR is still not merged. Any docs or updated config I need to use? Could you guide me?

@mdiez-modus

@linuxbsdfreak I updated the instructions on the PR itself: #830
Please read and ask me questions there! I'll be more than happy to help.

@linuxbsdfreak

@marcosdiez. I am installing the package via the Helm chart as described here:

https://hub.helm.sh/charts/incubator/aws-alb-ingress-controller

Set clusterName, awsRegion, awsVpcID and podAnnotations."iam.amazonaws.com/role": k8s-alb-controller, plus:

image:
  repository: docker.io/amazon/aws-alb-ingress-controller
  tag: "v1.1.1"

What do I need to do extra?

@marcosdiez
Contributor

Please don't use Helm this time. Please install it using kubectl apply -f theYamlFileYouHaveEdited.yaml

@linuxbsdfreak

@marcosdiez - Could you post the final YAML that I need to use for a single ALB with multiple namespaces?

@linuxbsdfreak

@marcosdiez. This is the YAML:

apiVersion: v1
kind: ServiceAccount
metadata:
  name: RELEASE-NAME-aws-alb-ingress-controller
  labels:
    app.kubernetes.io/name: aws-alb-ingress-controller
    app.kubernetes.io/instance: RELEASE-NAME

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRole
metadata:
  name: RELEASE-NAME-aws-alb-ingress-controller
  labels:
    app.kubernetes.io/name: aws-alb-ingress-controller
    app.kubernetes.io/instance: RELEASE-NAME
rules:
  - apiGroups:
      - ""
      - extensions
    resources:
      - configmaps
      - endpoints
      - events
      - ingresses
      - ingresses/status
      - services
    verbs:
      - create
      - get
      - list
      - update
      - watch
      - patch
  - apiGroups:
      - ""
      - extensions
    resources:
      - nodes
      - pods
      - secrets
      - services
      - namespaces
    verbs:
      - get
      - list
      - watch

---
apiVersion: rbac.authorization.k8s.io/v1
kind: ClusterRoleBinding
metadata:
  name: RELEASE-NAME-aws-alb-ingress-controller
  labels:
    app.kubernetes.io/name: aws-alb-ingress-controller
    app.kubernetes.io/instance: RELEASE-NAME
roleRef:
  apiGroup: rbac.authorization.k8s.io
  kind: ClusterRole
  name: RELEASE-NAME-aws-alb-ingress-controller
subjects:
  - kind: ServiceAccount
    name: RELEASE-NAME-aws-alb-ingress-controller
    namespace: default

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: RELEASE-NAME-aws-alb-ingress-controller
  labels:
    app.kubernetes.io/name: aws-alb-ingress-controller
    app.kubernetes.io/instance: RELEASE-NAME
spec:
  replicas: 1
  selector:
    matchLabels:
      app.kubernetes.io/name: aws-alb-ingress-controller
      app.kubernetes.io/instance: RELEASE-NAME
  template:
    metadata:
      labels:
        app.kubernetes.io/name: aws-alb-ingress-controller
        app.kubernetes.io/instance: RELEASE-NAME
      annotations:
        iam.amazonaws.com/role: k8s-alb-controller

    spec:
      containers:
        - name: aws-alb-ingress-controller
          image: "docker.io/amazon/aws-alb-ingress-controller:v1.1.1"
          imagePullPolicy: IfNotPresent
          args:
            - --cluster-name=TestCluster
            - --ingress-class=alb
            - --aws-region=eu-central-1
            - --aws-vpc-id=vpc-xxxxxx
          env: []  # no extra environment variables
          ports:
            - name: health
              containerPort: 10254
              protocol: TCP
          readinessProbe:
            httpGet:
              path: /healthz
              port: health
              scheme: HTTP
            initialDelaySeconds: 30
            periodSeconds: 60
            timeoutSeconds: 3
          livenessProbe:
            httpGet:
              path: /healthz
              port: health
              scheme: HTTP
            initialDelaySeconds: 60
            periodSeconds: 60
          resources:
            {}

      serviceAccountName: RELEASE-NAME-aws-alb-ingress-controller
      terminationGracePeriodSeconds: 60

@marcosdiez
Contributor

@linuxbsdfreak https://gist.github.com/marcosdiez/d6943375c6d8b1dc607529e42d01f44e

Don't forget to change some fields like the AWS key and secret (in case you are not using IAM) and some parameters like the cluster name.

@linuxbsdfreak

@marcosdiez. Thanks for the info. What does the service config look like? What annotations do I need to provide for the service to use a single ALB? The ALB name in my case is k8s-alb in the controller config.

@marcosdiez
Contributor

You don't have to change your annotations.
It will overwrite whatever annotation you have and always use the same ALB.
If you have annotations for both internal and external ALBs, then bad things will happen!

Just try. It should work!

@Revolution1
Contributor

Revolution1 commented Mar 10, 2020

It's been 2 years since then.

Is there anyone still working on this?

@sadoMasupilami

This would be very useful to us.

@RuBiCK

RuBiCK commented Mar 13, 2020

I have two services, each with its own ALB, with different hostnames and ACM certificates. Now, joining the two ALBs into one,
I don't know how to handle the annotations for ACM certificates. Should each ingress declare its own ACM certificate exclusively, or should I use the same ACM list for each ingress?

@kbbqiu

kbbqiu commented Apr 10, 2020

Bump. This would be a really useful feature to have.

@excavador

+1

@fejta-bot

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@k8s-ci-robot k8s-ci-robot added the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 19, 2020
@tylerohlsen

@M00nF1sh any update on this request?

@phaze-

phaze- commented Jul 22, 2020

/remove-lifecycle stale

@k8s-ci-robot k8s-ci-robot removed the lifecycle/stale Denotes an issue or PR has remained open with no activity and has become stale. label Jul 22, 2020
@satyamkapoor

no update? :(

@mynameisneo7

This is a desperately needed feature for my organization as we already have hundreds of load balancers created by various departments for lots of legacy apps. My department is transitioning one set of environments over to EKS which would further explode that number if we can't share ALBs created by the controller.

@rajeshwrn

rajeshwrn commented Sep 5, 2020

I came across the same kind of requirement.

I have to deploy 20+ services in Kubernetes on an AWS Fargate profile. Since Fargate does not support NLB as of now, the only option is ALB. But each deployment created a new ALB. This needs 20+ public IPs and also adds more ALB cost.

I achieved the solution with two ingress controllers: the ALB ingress controller and the nginx ingress controller.

Nginx is the target for the ALB on port 80, and application services running in the cluster on different ports and namespaces communicate with nginx.

I have documented my solution; I think it will help with your requirement.

https://github.com/rajeshwrn/alb-nginx-controller
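
The pattern boils down to a single ALB ingress whose only backend is the nginx-ingress service, with nginx doing the per-host/per-path routing across namespaces. A rough sketch (service name, namespace and annotations are illustrative, not taken verbatim from the repo above):

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: alb-to-nginx
  namespace: ingress-nginx
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/target-type: ip   # pod IP targets, as needed on Fargate
spec:
  rules:
    - http:
        paths:
          - path: /*
            backend:
              serviceName: ingress-nginx-controller   # nginx then routes by host/path
              servicePort: 80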

@excavador

Alternative solution: https://github.com/zalando-incubator/kube-ingress-aws-controller

  • uses CloudFormation to deploy the ALB (guarantees resource cleanup / transparency)
  • automatically groups ingresses

@excavador

@rajeshwrn Yes, I read it before. There is no contradiction between the solutions. We also use ALB + ingress controller, but a few days ago I found the Zalando ingress controller, and it handles cases with multiple domains better, from my humble point of view.

@rajeshwrn

@rajeshwrn Yes, I read it before. There is no contradiction between the solutions. We also use ALB + ingress controller, but a few days ago I found the Zalando ingress controller, and it handles cases with multiple domains better, from my humble point of view.

@excavador Sure, I will try this out. But do you have any idea whether it will work on EKS Fargate?

@teochenglim

Hi guys, just sharing my 2 cents. Complete solutions might need AWS ACM (HTTPS) and AWS Route 53 (DNS) to work together. Most of you miss this part and start a new way of doing things.

Stacked load balancers can be hard to troubleshoot, so think twice first. For example timeouts and headers... you will know what I mean when you start using them.

ALB isn't always the best because of the pre-warm behavior, meaning it's hard to achieve zero downtime. You may consider NLB. Another reason is that ALB is HTTP only. But I still like it because of target group binding.

To use 2 ACM certificates, you just need to put a comma between the ARNs.
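
For example (the certificate ARNs are placeholders, and whether multiple comma-separated ARNs are accepted depends on the controller version):

annotations:
  alb.ingress.kubernetes.io/listen-ports: '[{"HTTPS": 443}]'
  alb.ingress.kubernetes.io/certificate-arn: arn:aws:acm:eu-central-1:111122223333:certificate/aaaabbbb-1111-2222-3333-444455556666,arn:aws:acm:eu-central-1:111122223333:certificate/ccccdddd-7777-8888-9999-000011112222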

I'd also like to point to the AWS-provided way of using an NLB to do the same thing:
https://aws.amazon.com/blogs/opensource/network-load-balancer-nginx-ingress-controller-eks/

Cheers, and have fun.

@excavador

excavador commented Sep 6, 2020

@teochenglim ALB + ACM works perfectly fine, as does the SNI (multi-domain) solution.
The only problem is on the Kubernetes side: it is more useful to have separate small ingresses than a single ingress for all domains terminated at the nginx-ingress-controller to route traffic between namespaces (because of slow start-up and the AWS bill).

If alb-ingress-controller were able to group separate small ingresses into a single ALB, I personally would be completely happy.

@teochenglim

If alb-ingress-controller were able to group separate small ingresses into a single ALB, I personally would be completely happy.
@excavador That's what everybody hopes for.

https://github.com/zalando-incubator/kube-ingress-aws-controller looks very, very nice. Would you consider creating a Helm chart for it?

@excavador

Would you consider creating a Helm chart for it?

@teochenglim I am not a developer, just a random user from a random company :)

helm/charts#4538

@astrived

We are doing the final phase of testing for the new aws-load-balancer-controller, and the RC image is here: https://github.com/kubernetes-sigs/aws-alb-ingress-controller/releases/tag/v2.0.0-rc5. You can find details here: https://github.com/kubernetes-sigs/aws-alb-ingress-controller/tree/v2_ga. Stay tuned for the What's New post, coming soon!

@excavador

I looked into the documentation of the new version: https://github.com/kubernetes-sigs/aws-alb-ingress-controller/blob/v2_ga/docs/guide/ingress/annotations.md

WOW. Just WOW. Thank you for your work!

@astrived

We have published the v2.0.0 release at https://github.com/kubernetes-sigs/aws-load-balancer-controller/releases/tag/v2.0.0. The documents have been updated (https://docs.aws.amazon.com/eks/latest/userguide/alb-ingress.html and https://docs.aws.amazon.com/eks/latest/userguide/load-balancing.html), and the blog post is here: https://aws.amazon.com/blogs/containers/introducing-aws-load-balancer-controller/
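
In v2, the grouping requested in this issue is exposed through the alb.ingress.kubernetes.io/group.name annotation (see the v2 annotations documentation linked above); a minimal example, with an illustrative hostname and service:

apiVersion: networking.k8s.io/v1beta1
kind: Ingress
metadata:
  name: team-a
  namespace: team-a
  annotations:
    kubernetes.io/ingress.class: alb
    alb.ingress.kubernetes.io/scheme: internet-facing
    alb.ingress.kubernetes.io/group.name: shared-alb   # ingresses with the same group.name share one ALB
spec:
  rules:
    - host: a.example.com
      http:
        paths:
          - backend:
              serviceName: service-a
              servicePort: 80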

@M00nF1sh
Collaborator

Closing this as v2.0.0 is released.

@benderillo

🎉 https://aws.amazon.com/blogs/containers/introducing-aws-load-balancer-controller/ 🎉
