Server crash if any, but not all, service lacks an identity #19986

Closed
sorenisanerd opened this issue Feb 14, 2024 · 3 comments · Fixed by #19987
Labels
stage/accepted Confirmed, and intend to work on. No timeline commitment though. theme/crash theme/workload-identity type/bug

@sorenisanerd
Contributor

Nomad version

Nomad v1.7.6-dev
BuildDate 2024-02-14T16:14:11Z
Revision 994a2b10363dab995109d172a8ee772616d2c901+CHANGES

Operating system and Environment details

Irrelevant

Issue

Scheduling a job that has a service whose identity needs to be signed crashes the server if any other service in the job lacks an identity:

panic: runtime error: invalid memory address or nil pointer dereference
[signal SIGSEGV: segmentation violation code=0x2 addr=0x0 pc=0x106599664]

goroutine 250 [running]:
github.com/hashicorp/nomad/nomad.(*Alloc).signServices(0x14000c00c00, 0x14001568000, 0x140006af600, 0x14000666a40, 0x14000b6cd20, {0x211fc068, 0xedd5f0b41, 0x0})
	/home/soren/src/nomad/nomad/alloc_endpoint.go:640 +0x124
github.com/hashicorp/nomad/nomad.(*Alloc).SignIdentities(0x14000c00c00, 0x14000bfe790, 0x14000b6cd20)
	/home/soren/src/nomad/nomad/alloc_endpoint.go:592 +0xc0c
reflect.Value.call({0x1400084ae40, 0x14000689c18, 0x13}, {0x107057221, 0x4}, {0x14000e35b60, 0x3, 0x3})
	/usr/local/go/src/reflect/value.go:596 +0x9e8
reflect.Value.Call({0x1400084ae40, 0x14000689c18, 0x13}, {0x14000e35b60, 0x3, 0x3})
	/usr/local/go/src/reflect/value.go:380 +0x74
net/rpc.(*service).call(0x1400017fc00, 0x140005dbe40, 0x14000610430, 0x0, 0x140008a0c00, 0x14000c00de0, {0x107ee2760, 0x14000bfe790, 0x16}, {0x107884b60, ...}, ...)
	/usr/local/go/src/net/rpc/server.go:382 +0x200
net/rpc.(*Server).ServeRequest(0x140005dbe40, {0x1080ba998, 0x1400004e600})
	/usr/local/go/src/net/rpc/server.go:503 +0x254
github.com/hashicorp/nomad/nomad.(*rpcHandler).handleNomadConn(0x14000685180, {0x1080b93d0, 0x14000968370}, {0x1080c51c8, 0x14000e66000}, 0x140005dbe40)
	/home/soren/src/nomad/nomad/rpc.go:456 +0x180
created by github.com/hashicorp/nomad/nomad.(*rpcHandler).handleMultiplexV2 in goroutine 200
	/home/soren/src/nomad/nomad/rpc.go:561 +0x860
Process 98006 has exited with status 2
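
The trace ends in a nil pointer dereference inside `signServices`. A minimal sketch of how this kind of failure can arise, assuming each service holds an optional pointer to its identity; the `Service` and `Identity` types and both functions below are hypothetical stand-ins for illustration, not Nomad's actual code:

```go
package main

import "fmt"

// Identity and Service are hypothetical stand-ins, not Nomad's real structs.
type Identity struct {
	Name string
}

type Service struct {
	Name     string
	Identity *Identity // nil when the service declares no identity block
}

// signUnsafe dereferences Identity without a nil check, so a single
// service without an identity panics for the whole request.
func signUnsafe(services []*Service) []string {
	var signed []string
	for _, s := range services {
		signed = append(signed, s.Name+"/"+s.Identity.Name)
	}
	return signed
}

// signSafe skips services that have no identity to sign.
func signSafe(services []*Service) []string {
	var signed []string
	for _, s := range services {
		if s.Identity == nil {
			continue
		}
		signed = append(signed, s.Name+"/"+s.Identity.Name)
	}
	return signed
}

func main() {
	services := []*Service{
		{Name: "group-svc"}, // no identity, like the group-level service in the job
		{Name: "task-svc", Identity: &Identity{Name: "consul-service"}},
	}
	fmt.Println(signSafe(services))
	// Calling signUnsafe(services) here would panic with a nil pointer dereference.
}
```

The guard in `signSafe` is one plausible shape for a fix; whether the actual patch skips such services or handles them differently is not shown in this report.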

Reproduction steps

cat <<EOF > nomad.hcl
consul {
  enabled = true
  address = "http://127.0.0.1:8500"
}
EOF

cat <<EOF > consul.hcl
acl {
  enabled = true
}
EOF
cat <<EOF > job.hcl
job "a" {
  group "b" {
    service {
      provider = "consul"
    }
    task "c" {
      service {
        identity {
          aud         = ["consul.io"]
        }
      }
      driver = "raw_exec"
      config {
        command = "/bin/sh"
        args    = ["-c", "env;sleep 3600"]
      }
    }
  }
}
EOF
export NOMAD_ADDR=http://127.0.0.1:4646/ CONSUL_HTTP_ADDR=http://127.0.0.1:8500
consul agent -dev -config-file=consul.hcl > consul.log &

echo Sleeping for 10 seconds to give Consul some time to get ready
sleep 10
../bin/nomad agent -dev -config=nomad.hcl > nomad.log &

echo Sleeping for 10 seconds to give Nomad some time to get ready
sleep 10

export CONSUL_HTTP_TOKEN=$(consul acl bootstrap -format=json | jq -r .SecretID)
../bin/nomad setup consul -y
../bin/nomad job run job.hcl

Expected Result

No crash.

Actual Result

Complete crash. The RPC handler does not recover from the panic, so the whole server process exits instead of continuing to serve requests.
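
For context on why one bad request takes the whole process down: in Go, a panic that is not recovered in the goroutine where it occurs terminates the program. The sketch below shows the general recover pattern that turns a handler panic into an error. `safeCall` and `nameOf` are hypothetical helpers written for this illustration; this is not how Nomad's RPC layer is implemented, and whether a server should recover at all is a design decision.

```go
package main

import "fmt"

// record is a hypothetical type used only to trigger a nil dereference.
type record struct {
	Name string
}

// nameOf panics with a nil pointer dereference when r is nil.
func nameOf(r *record) string {
	return r.Name
}

// safeCall runs handler and converts a panic into an error instead of
// letting it propagate and crash the process.
func safeCall(handler func() error) (err error) {
	defer func() {
		if r := recover(); r != nil {
			err = fmt.Errorf("handler panicked: %v", r)
		}
	}()
	return handler()
}

func main() {
	err := safeCall(func() error {
		fmt.Println(nameOf(nil)) // panics; safeCall recovers
		return nil
	})
	fmt.Println("process still running, error:", err)
}
```

Recovering keeps the process alive but can hide corrupted state, so fixing the nil dereference itself remains the primary remedy.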

sorenisanerd added a commit to sorenisanerd/nomad that referenced this issue Feb 14, 2024
Fixes a null pointer exception if `Alloc.SignIdentities` was called for
any service and any service lacked an identity.

Fixes hashicorp#19986
@jrasell
Member

jrasell commented Feb 19, 2024

Hi @sorenisanerd, thanks for raising this issue and for the related PR.

@jrasell jrasell added theme/crash stage/accepted Confirmed, and intend to work on. No timeline commitment though. theme/workload-identity labels Feb 19, 2024
tgross pushed a commit that referenced this issue Feb 22, 2024
Fixes a null pointer exception if `Alloc.SignIdentities` was called for
any service and any service lacked an identity.

Fixes #19986
@tgross tgross added this to the 1.7.x milestone Feb 22, 2024
@tgross
Member

tgross commented Feb 22, 2024

Fixed by #19987, which will ship in the next version of Nomad.


I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Dec 31, 2024