nginx-ingress-controller crashes with SIGSEGV #3545

Closed
adriansqrd opened this issue Dec 11, 2018 · 15 comments
Labels
kind/bug Categorizes issue or PR as related to a bug.

Comments

@adriansqrd

Is this a request for help? (If yes, you should use our troubleshooting guide and community support channels, see https://kubernetes.io/docs/tasks/debug-application-cluster/troubleshooting/.):

What keywords did you search in NGINX Ingress controller issues before filing this one? (If you have found any duplicates, you should instead reply there.):


Is this a BUG REPORT or FEATURE REQUEST? (choose one):
BUG REPORT

NGINX Ingress controller version:
NGINX Ingress controller
Release: 0.20.0
Build: git-e8d8103
Repository: https://github.com/kubernetes/ingress-nginx.git
Image: quay.io/kubernetes-ingress-controller/nginx-ingress-controller-arm:0.20.0

Kubernetes version (use kubectl version):
Client Version: version.Info{Major:"1", Minor:"10", GitVersion:"v1.10.3", GitCommit:"2bba0127d85d5a46ab4b778548be28623b32d0b0", GitTreeState:"clean", BuildDate:"2018-05-21T09:17:39Z", GoVersion:"go1.9.3", Compiler:"gc", Platform:"darwin/amd64"}

Server Version: version.Info{Major:"1", Minor:"11", GitVersion:"v1.11.4", GitCommit:"bf9a868e8ea3d3a8fa53cbb22f566771b3f8068b", GitTreeState:"clean", BuildDate:"2018-10-25T19:06:30Z", GoVersion:"go1.10.3", Compiler:"gc", Platform:"linux/arm"}

Environment:

  • Cloud provider or hardware configuration: Raspberry Pi 3 Model B+
  • OS (e.g. from /etc/os-release): Raspbian GNU/Linux 9 (stretch)
  • Kernel (e.g. uname -a): Linux 4.14.79-v7+ #1159 SMP Sun Nov 4 17:50:20 GMT 2018 armv7l GNU/Linux
  • Install tools:
    Client: &version.Version{SemVer:"v2.11.0", GitCommit:"2e55dbe1fdb5fdb96b75ff144a339489417b146b", GitTreeState:"clean"}
    Server: &version.Version{SemVer:"v2.11.0+unreleased", GitCommit:"158d6dbb746f525bad9a0aacb698af7d370ac3f5", GitTreeState:"dirty"}
  • Others:
    • Docker version 18.04.0-ce, build 3d479c0
    • metallb/controller:v0.7.3-arm

What happened:
The nginx-ingress-controller pod crashes with a segfault after a successful request to the service.

  1. kubectl get pod nginx-ingress-controller-59cc49954d-b5kch
    nginx-ingress-controller-59cc49954d-b5kch   1/1       Running   26        3d
    
  2. Request to service
    curl --verbose ranger/towel/helloworld --header "Host: ranger"
    *   Trying 2003:c8:9f2e:2100:15d8:1b9f:5a9c:b715...
    * TCP_NODELAY set
    * Connection failed
    * connect to 2003:c8:9f2e:2100:15d8:1b9f:5a9c:b715 port 80 failed: Connection refused
    *   Trying 192.168.178.77...
    * TCP_NODELAY set
    * Connected to ranger (192.168.178.77) port 80 (#0)
    > GET /towel/helloworld HTTP/1.1
    > Host: ranger
    > User-Agent: curl/7.54.0
    > Accept: */*
    >
    < HTTP/1.1 200
    < Server: nginx/1.15.5
    < Date: Tue, 11 Dec 2018 07:46:08 GMT
    < Content-Type: application/json;charset=UTF-8
    < Transfer-Encoding: chunked
    < Connection: keep-alive
    < Vary: Accept-Encoding
    <
    * Connection #0 to host ranger left intact
    {"id":14,"message":"Hello, world!"}%
    
  3. nginx-ingress-controller crashes

kubectl get pod nginx-ingress-controller-59cc49954d-b5kch:

nginx-ingress-controller-59cc49954d-b5kch   0/1       Error     27        3d

kubectl logs nginx-ingress-controller-59cc49954d-b5kch:

I1211 07:46:20.803405       8 flags.go:180] Watching for Ingress class: nginx
nginx version: nginx/1.15.5
W1211 07:46:20.814264       8 client_config.go:552] Neither --kubeconfig nor --master was specified.  Using the inClusterConfig.  This might not work.
I1211 07:46:20.817531       8 main.go:196] Creating API client for https://10.96.0.1:443
I1211 07:46:20.947391       8 main.go:240] Running in Kubernetes cluster version v1.11 (v1.11.4) - git (clean) commit bf9a868e8ea3d3a8fa53cbb22f566771b3f8068b - platform linux/arm
I1211 07:46:20.954249       8 main.go:101] Validated default/nginx-ingress-default-backend as the default backend.
I1211 07:46:23.600322       8 nginx.go:256] Starting NGINX Ingress controller
I1211 07:46:23.668833       8 event.go:221] Event(v1.ObjectReference{Kind:"ConfigMap", Namespace:"default", Name:"nginx-ingress-controller", UID:"9dc046fd-fa2a-11e8-bc35-b827eb5c793c", APIVersion:"v1", ResourceVersion:"150947", FieldPath:""}): type: 'Normal' reason: 'CREATE' ConfigMap default/nginx-ingress-controller
I1211 07:46:24.710633       8 event.go:221] Event(v1.ObjectReference{Kind:"Ingress", Namespace:"default", Name:"towel-ingress", UID:"23aabc05-fc87-11e8-bc35-b827eb5c793c", APIVersion:"extensions/v1beta1", ResourceVersion:"573396", FieldPath:""}): type: 'Normal' reason: 'CREATE' Ingress default/towel-ingress
I1211 07:46:24.801389       8 leaderelection.go:185] attempting to acquire leader lease  default/ingress-controller-leader-nginx...
I1211 07:46:24.801457       8 nginx.go:277] Starting NGINX process
I1211 07:46:24.813672       8 controller.go:177] Configuration changes detected, backend reload required.
I1211 07:46:24.825633       8 leaderelection.go:194] successfully acquired lease default/ingress-controller-leader-nginx
I1211 07:46:24.825894       8 status.go:197] new leader elected: nginx-ingress-controller-59cc49954d-b5kch
I1211 07:46:25.298432       8 controller.go:195] Backend successfully reloaded.
I1211 07:46:25.298820       8 controller.go:205] Initial synchronization of the NGINX configuration.
I1211 07:46:26.309819       8 controller.go:212] Dynamic reconfiguration succeeded.
W1211 07:46:50.597452       8 controller.go:824] Service "default/towel-tool" does not have any active Endpoint.
I1211 07:46:50.597887       8 controller.go:177] Configuration changes detected, backend reload required.
I1211 07:46:51.053256       8 controller.go:195] Backend successfully reloaded.
I1211 07:46:51.057831       8 controller.go:212] Dynamic reconfiguration succeeded.
W1211 07:47:18.153682       8 controller.go:824] Service "default/towel-tool" does not have any active Endpoint.
I1211 07:47:21.487755       8 controller.go:177] Configuration changes detected, backend reload required.
I1211 07:47:21.939385       8 controller.go:195] Backend successfully reloaded.
I1211 07:47:21.946202       8 controller.go:212] Dynamic reconfiguration succeeded.
192.168.178.50 - [192.168.178.50] - - [11/Dec/2018:07:55:42 +0000] "GET /towel/helloworld HTTP/1.1" 200 46 "-" "curl/7.54.0" 86 0.022 [default-towel-tool-8080] 10.244.2.27:8080 46 0.020 200 93bea19734a3da7dc93776b3c9a5d2ba
panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x1 addr=0x0 pc=0x11a70]

goroutine 1308 [running]:
runtime/internal/atomic.goXadd64(0x38a652c, 0x2, 0x0, 0x20c49ba, 0x3f96872b)
        /usr/local/go/src/runtime/internal/atomic/atomic_arm.go:96 +0x1c
k8s.io/ingress-nginx/vendor/github.com/prometheus/client_golang/prometheus.(*histogram).Observe(0x38a64d0, 0x20c49ba, 0x3f96872b)
        /go/src/k8s.io/ingress-nginx/vendor/github.com/prometheus/client_golang/prometheus/histogram.go:272 +0x68
k8s.io/ingress-nginx/internal/ingress/metric/collectors.(*SocketCollector).handleMessage(0x38f7c00, 0x407ac00, 0x146, 0x600)
        /go/src/k8s.io/ingress-nginx/internal/ingress/metric/collectors/socket.go:269 +0xb8c
k8s.io/ingress-nginx/internal/ingress/metric/collectors.(*SocketCollector).handleMessage-fm(0x407ac00, 0x146, 0x600)
        /go/src/k8s.io/ingress-nginx/internal/ingress/metric/collectors/socket.go:317 +0x34
k8s.io/ingress-nginx/internal/ingress/metric/collectors.handleMessages(0x66c40d00, 0x3d3adb8, 0x3d3adc0)
        /go/src/k8s.io/ingress-nginx/internal/ingress/metric/collectors/socket.go:435 +0xa8
created by k8s.io/ingress-nginx/internal/ingress/metric/collectors.(*SocketCollector).Start
        /go/src/k8s.io/ingress-nginx/internal/ingress/metric/collectors/socket.go:317 +0xc0
  4. Controller then restarts and runs until the next request
nginx-ingress-controller-59cc49954d-b5kch   0/1       CrashLoopBackOff   27        3d
nginx-ingress-controller-59cc49954d-b5kch   0/1       Running   28        3d
nginx-ingress-controller-59cc49954d-b5kch   1/1       Running   28        3d
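
A note on reading the trace above. This is an interpretation, not something confirmed in this thread. On 32-bit ARM, Go's 64-bit atomic operations require the target word to be 8-byte aligned, and the runtime's goXadd64 deliberately dereferences nil when it is not, which would match the addr=0x0 in the SIGSEGV line. The sketch below assumes the first value in the goXadd64 frame (0x38a652c) is the pointer being updated and the first value in the Observe frame (0x38a64d0) is the histogram receiver. The helper name isAligned64 is made up for illustration.

```go
package main

import "fmt"

// isAligned64 reports whether addr is a multiple of 8, the alignment that
// 64-bit atomic operations need on 32-bit ARM. (Hypothetical helper.)
func isAligned64(addr uintptr) bool {
	return addr&7 == 0
}

func main() {
	// Assumed to be the pointer passed to goXadd64 in the trace.
	const counterAddr uintptr = 0x38a652c
	// Assumed to be the *histogram receiver of Observe in the trace.
	const histogramAddr uintptr = 0x38a64d0

	fmt.Printf("counter   %#x aligned: %v\n", counterAddr, isAligned64(counterAddr))
	fmt.Printf("histogram %#x aligned: %v\n", histogramAddr, isAligned64(histogramAddr))
	// Offset of the counter inside the struct; not a multiple of 8.
	fmt.Printf("field offset: %d\n", counterAddr-histogramAddr)
}
```

Under those assumptions the struct itself is aligned but the 64-bit field inside it is not, which would point at struct layout on 32-bit ARM and not at the request or the Ingress rule.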

What you expected to happen:
The controller not to crash.

How to reproduce it (as minimally and precisely as possible):

ingress.yml used:

apiVersion: extensions/v1beta1
kind: Ingress
metadata:
  name: towel-ingress
  annotations:
    nginx.ingress.kubernetes.io/rewrite-target: /
    nginx.ingress.kubernetes.io/ssl-redirect: "false"
spec:
  rules:
  - host: ranger
    http:
      paths:
        - path: /towel
          backend:
            serviceName: towel-tool
            servicePort: 8080

Service:

apiVersion: v1
kind: Service
metadata:
  name: towel-tool
  labels:
    app: towel-tool
spec:
  ports:
    - port: 8080
      protocol: TCP
      targetPort: 8080
  selector:
    app: towel-tool

Deployment:

apiVersion: apps/v1beta1
kind: Deployment
metadata:
  name: towel-tool
spec:
  replicas: 1
  template:
    metadata:
      labels:
        app: towel-tool
    spec:
      containers:
      - name: towel-tool
        image: aqube/towel-tool-arm:1.1-SNAPSHOT
        imagePullPolicy: Always
        ports:
        - containerPort: 8080
          protocol: TCP

The towel-tool image runs a spring app returning a hello world message at /helloworld.

Anything else we need to know:
Only happens when a host is specified in the ingress rule.

aledbf added the kind/bug label Dec 14, 2018
@aledbf
Member

aledbf commented Dec 14, 2018

@addieter please update to 0.21.0

@pvormittag

@aledbf - I'm experiencing the same issue, however there doesn't appear to be a 0.21.0 build for arm?

https://quay.io/repository/kubernetes-ingress-controller/nginx-ingress-controller-arm

@aledbf
Member

aledbf commented Dec 22, 2018

@pvormittag that's correct. The luajit dependency (from openresty) only works on amd64 and arm64. I've been trying to build from the original luajit source code, but the NGINX compilation fails (because of lua-nginx-module).
If I cannot find a way to fix this, we will drop support for arm.

@jorneilander

> @pvormittag that's correct. The luajit dependency (from openresty) only works on amd64 and arm64. I've been trying to build from the original luajit source code, but the NGINX compilation fails (because of lua-nginx-module).
> If I cannot find a way to fix this, we will drop support for arm.

Any update on this? I'd seriously like to use this on my RPi-cluster :)

@aledbf
Member

aledbf commented Feb 2, 2019

> Any update on this? I'd seriously like to use this on my RPi-cluster :)

arm64 works fine. I still cannot compile for arm.

@goshlanguage

Is there no way the fix for the nil pointer dereference can be backported into a patch version of 0.20.x?

That seems like a fairly easy resolution. The luajit dependency may have been introduced in 0.21, but on the surface this looks like a bug that I wouldn't expect a new dependency to fix.

Alternatively, the stack trace mentions prometheus and other metrics collection calls. arm users may not be using these; does turning this feature off resolve the issue?

@aledbf
Member

aledbf commented Feb 21, 2019

> Is there no way the fix for the nil pointer dereference can be backported into a patch version of 0.20.x?

I am sorry but we don't have the resources to do that. Keep in mind this is a community project and no one works full time on ingress-nginx.
That said, you are more than welcome to check the tag for 0.22, create a patch and use a personal registry.

> The luajit dependency may have been introduced in 0.21, but on the surface this looks like a bug that I wouldn't expect a new dependency to fix.

That is not something we can control. To be able to use the lua nginx module we need to use https://github.com/openresty/luajit2, and that dependency is where we have the problem with arm.
It seems this is now fixed in openresty/luajit2#37, but I still need to test it against current master.

> Alternatively, the stack trace mentions prometheus and other metrics collection calls. arm users may not be using these; does turning this feature off resolve the issue?

The flag --enable-metrics was introduced in 0.22.0

@goshlanguage

@aledbf I'm a little confused. You mention luajit yet again. How does luajit factor into this? I wouldn't imagine that adding a dependency on lua would ever fix a nil pointer error in golang, could you please elaborate on this a little bit?

@goshlanguage

P.S. Understood that this is an open source project. Thanks for your hard work, and for the hard work of this community.

@aledbf
Member

aledbf commented Feb 21, 2019

> You mention luajit yet again. How does luajit factor into this?

We depend on luajit but between 0.20 and 0.21 support for arm was removed.

> I wouldn't imagine that adding a dependency on lua would ever fix a nil pointer error in golang, could you please elaborate on this a little bit?

There is no correlation. That issue is fixed in 0.22 but there is no arm image because of the lack of luajit support.

Edit: "was removed" not in this repository but in the upstream project

@SteveLillis

Since:

  • nginx ingress supports ARM64 but not ARM going forward
  • Raspbian OS does not currently have an ARM64 offering

It's worth noting for people landing here regarding Raspberry Pi 3 (like myself) that Ubuntu 18.04.02 LTS ARM64 Server offers a pre-installed image for Pi 3, and there are also many other 64-bit Pi OS alternatives such as Pi64, so there are at least options for getting compatibility with nginx ingress. Of course, if you also need ARM nodes, you can mix and match nodes and use node selectors to appropriately locate pods.

I will be testing switching over to Ubuntu Server at the weekend, but feedback here seems pretty positive.

@SteveLillis

SteveLillis commented May 24, 2019

So, the only pain point with Ubuntu ARM64 so far has been spending about six hours before realising that /boot/cmdline.txt is instead /boot/firmware/cmdline.txt.

So, when adding cgroup_enable=memory, be sure to add it to /boot/firmware/cmdline.txt if using the Ubuntu 18.04.02 LTS ARM64 image I linked above!

@aledbf
Member

aledbf commented May 24, 2019

> nginx ingress supports ARM64 but not ARM going forward

@SteveLillis I've been trying to get ARM working again in #3852, but we get compilation errors. I'm still waiting for feedback from jaegertracing/jaeger-client-cpp#151.

@SteveLillis

Thanks @aledbf

For completeness, I can say that the Ubuntu ARM64 image linked above seems stable so far and that the latest version of nginx ingress for ARM64 does not have this issue. So people can either migrate to ARM64 or wait for @aledbf's changes to allow for an updated version of the ARM image.

@aledbf
Member

aledbf commented Jun 27, 2019

Closing. We enabled ARM again.
Please use quay.io/kubernetes-ingress-controller/nginx-ingress-controller-arm:dev

ARM will be included in 0.25.0

@aledbf aledbf closed this as completed Jun 27, 2019