
haproxy-ingress is not resolving the correct pod/container port to use #280

Closed
mhyllander opened this issue Jan 9, 2019 · 9 comments

@mhyllander

I am using dynamic scaling with DNS to resolve the backend server IP addresses. The k8s services are headless (ClusterIP: "None"). I have noticed that haproxy-ingress does not correctly determine the pod port to use; instead it uses the port exactly as specified in the Ingress object. If the Ingress object specifies a port name instead of a port number, haproxy-ingress produces an invalid haproxy configuration.

For example, given the following resources:

apiVersion: extensions/v1beta1
kind: Deployment
metadata:
  name: platform-ticket
  namespace: platform
  labels:
    app: ticket
    release: platform
spec:
  replicas: 2
  selector:
    matchLabels:
      app: ticket
      release: platform
  template:
    metadata:
      labels:
        app: ticket
        release: platform
    spec:
      containers:
      - name: ticket
        image: path:tag
        ports:
        - name: http
          containerPort: 80
          protocol: TCP
      imagePullSecrets:
      - name: acr-reader-secret
---
apiVersion: v1
kind: Service
metadata:
  name: platform-ticket
  namespace: platform
  labels:
    app: ticket
    release: platform
spec:
  clusterIP: None
  ports:
  - name: http
    port: 9899
    protocol: TCP
    targetPort: http
  selector:
    app: ticket
    release: platform
  sessionAffinity: None
  type: ClusterIP
---
apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: platform-ticket
  namespace: platform
  annotations:
    ingress.kubernetes.io/use-resolver: kubernetes
    kubernetes.io/ingress.class: haproxy
  labels:
    app: ticket
    release: platform
spec:
  rules:
  - host: ticket.example.com
    http:
      paths:
      - backend:
          serviceName: platform-ticket
          servicePort: http
        path: /v1/auth
  tls:
  - hosts:
    - ticket.example.com
    secretName: example.com-tls

Notice that the ingress backend.servicePort is "http", which matches the service's port name. But this will not work with haproxy, because it tries to use "http" as the port number in the configuration file. If I change backend.servicePort to 9899, the port number used by the service, haproxy-ingress will create a valid backend definition:

backend platform-platform-ticket-9899
    mode http
    balance roundrobin
    server-template server-dns 100 platform-ticket.platform.svc.cluster.local:9899 resolvers kubernetes resolve-prefer ipv4 init-addr none check inter 2s

but this will still fail, because the pod is listening on port 80, not port 9899, so haproxy's health check will fail and it will not find a backend server that is UP.

The configuration above works with nginx-ingress, which does the appropriate manifest lookups to determine the port number to use. I believe haproxy-ingress should also do this: check the service's targetPort and the pod's containerPort to get the correct pod port.
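
For illustration, a rough Go sketch of the lookup being proposed, using client-go types (the helper name and error handling are hypothetical, not existing controller code):

package main

import (
	"fmt"

	v1 "k8s.io/api/core/v1"
	"k8s.io/apimachinery/pkg/util/intstr"
)

// resolvePodPort is a hypothetical helper: map the ingress servicePort
// (name or number) to the container port the pods actually listen on.
func resolvePodPort(svc *v1.Service, pod *v1.Pod, ingressPort intstr.IntOrString) (int32, error) {
	for _, sp := range svc.Spec.Ports {
		// Match the ingress servicePort against the service by name or by number.
		match := sp.Port == ingressPort.IntVal
		if ingressPort.Type == intstr.String {
			match = sp.Name == ingressPort.StrVal
		}
		if !match {
			continue
		}
		// targetPort may itself be a number or a container port name.
		if sp.TargetPort.Type == intstr.Int {
			return sp.TargetPort.IntVal, nil
		}
		// Named targetPort: find the matching containerPort on the pod.
		for _, c := range pod.Spec.Containers {
			for _, cp := range c.Ports {
				if cp.Name == sp.TargetPort.StrVal {
					return cp.ContainerPort, nil
				}
			}
		}
	}
	return 0, fmt.Errorf("no container port found for servicePort %q", ingressPort.String())
}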

@jcmoraisjr
Owner

Hi, when using a resolver, haproxy ingress uses ingress/servicePort to fill the port number. It was done this way because, when using a name, the endpoints of that service could have different port numbers, which isn't supported by the use-resolver config. Try changing the targetPort (service) and also the servicePort (ingress) to the port number; hope this solves the issue.

Perhaps this could be improved a bit - e.g. use the service/targetPort instead of the ingress one; check the endpoints and use their port number if they all match, ignoring and error-logging the backend if not. I'll have a look at this.

Out of curiosity - which type of ingress-nginx config works when using a name as the target port? Do they also implement dynamic config via dns and headless services?

@mhyllander
Author

mhyllander commented Jan 11, 2019

Hi @jcmoraisjr
The problem is precisely that the endpoints might be using a different port number than the service. Please correct me if I'm wrong, but doesn't haproxy-ingress always bypass the service and load balance over the endpoints (pods) directly? So the haproxy-ingress controller should always determine the correct endpoint port number to use, even when using a headless service and a DNS resolver for IP address lookups, and use that when generating haproxy.cfg.

And since kubernetes allows referencing ports by either name or number in the Ingress and Service objects, haproxy-ingress should be able to handle either when it looks up and finds the endpoint port number (pod/containerPort).

Or am I missing something here? I guess it's possible for a service to match a mix of endpoint pods that use different port numbers, and I can see that this could be a problem when using a headless service in combination with a DNS resolver.

But, according to https://www.haproxy.com/blog/dns-service-discovery-haproxy/, haproxy 1.8 supports doing DNS lookups for SRV records. This means you can configure haproxy to find the endpoint port numbers automatically. Instead of configuring the backend with:

server-template server-dns 100 platform-ticket.platform.svc.cluster.local:9899 resolvers kubernetes resolve-prefer ipv4 init-addr none check inter 2s

you could specify

server-template server-dns 100 _http._tcp.platform-ticket.platform.svc.cluster.local resolvers kubernetes resolve-prefer ipv4 init-addr none check inter 2s

and it should find the correct port numbers automatically. Here "_http" is the name of the service port (as specified in the Service and Ingress objects), and I suppose "_tcp" should be the protocol specified for the service port in the Service object. So maybe the solution is to use SRV lookups instead? Notice that the SRV lookup returns the correct endpoint port (80):

/ # dig _http._tcp.platform-ticket.platform.svc.cluster.local SRV

; <<>> DiG 9.12.3 <<>> _http._tcp.platform-ticket.platform.svc.cluster.local SRV
;; global options: +cmd
;; Got answer:
;; WARNING: .local is reserved for Multicast DNS
;; You are currently testing what happens when an mDNS query is leaked to DNS
;; ->>HEADER<<- opcode: QUERY, status: NOERROR, id: 1768
;; flags: qr aa rd ra; QUERY: 1, ANSWER: 2, AUTHORITY: 0, ADDITIONAL: 2

;; QUESTION SECTION:
;_http._tcp.platform-ticket.platform.svc.cluster.local. IN SRV

;; ANSWER SECTION:
_http._tcp.platform-ticket.platform.svc.cluster.local. 5 IN SRV 0 50 80 10-182-0-217.platform-ticket.platform.svc.cluster.local.
_http._tcp.platform-ticket.platform.svc.cluster.local. 5 IN SRV 0 50 80 10-182-0-75.platform-ticket.platform.svc.cluster.local.

;; ADDITIONAL SECTION:
10-182-0-217.platform-ticket.platform.svc.cluster.local. 5 IN A 10.182.0.217
10-182-0-75.platform-ticket.platform.svc.cluster.local. 5 IN A 10.182.0.75

;; Query time: 1 msec
;; SERVER: 10.128.0.10#53(10.128.0.10)
;; WHEN: Fri Jan 11 10:29:28 UTC 2019
;; MSG SIZE  rcvd: 467

Regarding nginx-ingress, I don't think nginx has the ability to use DNS lookups for dynamic scaling. Instead I think nginx uses a Lua script to implement something similar to haproxy's Runtime API, and the nginx-ingress controller sends updates to the Lua script/nginx as endpoints come and go. The Lua script dynamically changes the list of backend servers in the nginx config.
We noticed when we were using nginx-ingress that the pods were using very different amounts of memory and were not so stable under load, which is why we switched to haproxy-ingress. (BTW I just submitted all my updates to the incubator/haproxy-ingress helm chart as a result of that switch.)

@jcmoraisjr
Owner

Hi, yes, haproxy ingress will retrieve the real endpoints behind a service, but only if dns/resolver isn't used. When using dns/resolver, HAProxy itself queries kubedns/coredns; there is no action from haproxy ingress. Because of that - HAProxy querying the DNS - it's important that the service is configured as headless, otherwise HAProxy will proxy to the service IP instead of balancing across the endpoints.

You can also try dynamic update instead of dns/resolver. This should also avoid reloads if only endpoints and weights are changed.
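
For reference, this is roughly what enabling dynamic update looks like in the controller configmap (a minimal sketch; the configmap name and namespace are whatever your deployment already references, and the values are illustrative):

apiVersion: v1
kind: ConfigMap
metadata:
  # Use the configmap your controller deployment already points at.
  name: haproxy-ingress
  namespace: ingress-controller
data:
  # Apply endpoint/weight changes via HAProxy's Runtime API instead of reloading.
  dynamic-scaling: "true"
  # Illustrative value: server slots pre-allocated per backend for scaling.
  backend-server-slots-increment: "32"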

I admit I've missed the lookup of SRV records, and I can confirm that this should work fine on coredns. I'll definitely have a look at this and tag a new backward-compatible beta in the following days.

The community is really awesome. I wasn't aware of haproxy-ingress chart, thanks for sharing. And thanks also for the detailed explanation about this issue.

@mhyllander
Author

Hi, yes, that makes sense. It would be awesome if SRV lookups could be supported. The issue with the differing ports bit us when we switched from nginx-ingress to haproxy-ingress. I understood that I had to use port numbers instead of names in the Ingress, but I missed the part that it didn't work when using the DNS resolver and the pod had a different port, which resulted in a service being unreachable for some time.

The reason we wanted to use the DNS resolver for dynamic scaling is that we are trying to use the readiness probes on the endpoint pods to control when they are available for connections. When a pod's readiness probe fails, the pod is removed from the service DNS lookup, and therefore from the haproxy backend, and will not receive new connections. But we think this was not a good solution for various reasons and will test using haproxy's agent check feature instead. (BTW it would be very nice if it was possible to configure agent checks for a backend as well.)
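
For reference, a hand-written haproxy.cfg agent check looks roughly like this (the backend name, server address, and ports are illustrative; haproxy-ingress does not generate these options today, which is the feature being wished for):

backend example
    # The agent is a small sidecar service that replies with up/down/drain
    # or a weight percentage; haproxy polls it on agent-port every agent-inter.
    server srv1 10.182.0.217:80 check agent-check agent-port 9999 agent-inter 5s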

I'm not sure how SRV lookup configuration would best be implemented. I'm thinking that regardless of whether a port number or name is used in the Ingress, it should be possible to find the port name in the Service and use that for the SRV lookup. The protocol should always be "_tcp", right? And if no service port name can be found, I suppose it should fall back to the current solution, using the A record lookup with an explicit port?

@jcmoraisjr
Owner

Hi, on #285 I just check whether the backend port is a valid number; if not, SRV records from DNS are used. The protocol is always _tcp. Try replacing the template in your environment and check whether this works as expected. Let me know if I can help.
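
A minimal sketch of the kind of check described (not the actual #285 code; the function and parameter names are made up):

package main

import (
	"fmt"
	"strconv"
)

// serverTemplateTarget: a numeric backend port keeps the host:port A-record
// target, while a named port becomes an SRV record name.
func serverTemplateTarget(svc, namespace, port string) string {
	if _, err := strconv.Atoi(port); err == nil {
		// Numeric port: HAProxy resolves A records and uses the port as-is.
		return fmt.Sprintf("%s.%s.svc.cluster.local:%s", svc, namespace, port)
	}
	// Named port: HAProxy resolves SRV records, which carry the endpoint port.
	return fmt.Sprintf("_%s._tcp.%s.%s.svc.cluster.local", port, svc, namespace)
}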

Regarding readiness: if the probe fails, the endpoint is also removed, which will update the haproxy backends almost at the same time. If using dynamic-scaling this will happen without reloads - add --v=2 to the command line and follow the controller logs.
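
For example, in the controller deployment (the container name and existing args are illustrative; only --v=2 is the addition):

spec:
  containers:
  - name: haproxy-ingress
    args:
    - --default-backend-service=$(POD_NAMESPACE)/ingress-default-backend
    - --v=2   # verbose logging, to follow dynamic updates in the controller logs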

@mhyllander
Author

Hi, I've tested the change in #285 now, and it seems to work correctly. I used different ports on the pod container and the service, then I used the service port name ("http") in the ingress (instead of the service port number). This is the backend that was generated:

backend platform-platform-ticket-http
    mode http
    balance roundrobin
    server-template server-dns 100 _http._tcp.platform-ticket.platform.svc.cluster.local resolvers kubernetes resolve-prefer ipv4 init-addr none check inter 2s

I was able to send requests to the platform-ticket service. Very nice!

@jcmoraisjr
Owner

Nice, thanks! Merging the fix, hope to release a new beta tomorrow.

@jcmoraisjr
Owner

beta.7 was just released and has this fix.

@jcmoraisjr added this to the v0.7 milestone Jan 25, 2019
@jcmoraisjr
Owner

Closing. Just update this same issue if you have any problem.
