[question] Services using Consul Connect and TCP Checks? #9945

Closed
evandam opened this issue Feb 2, 2021 · 6 comments

@evandam

evandam commented Feb 2, 2021

Nomad version

Nomad v1.0.3 (08741d9f2003ec26e44c72a2c0e27cdf0eadb6ee)

Operating system and Environment details

Ubuntu 18.04

Issue

Hi folks,

I was just hoping for some clarification around Consul Connect and TCP health checks.

I have a Nomad job to run a Redis server that previously used a TCP check that I would like to use with Consul Connect, but it's not allowed. I'm not sure if this is an issue on the Nomad or Consul side, though.

Not all of our services use Consul Connect (ex: Fabio), so I'm still using a named port and publishing it so other services can connect to evandam-redis.service.consul if needed.

My assumption is that TCP checks are disallowed on the theory that the port will not be accessible from outside the bridge network. However, with this approach Consul should still be able to reach the published port to do a TCP check. Is my understanding of this correct?

Reproduction steps

$ nomad job plan evandam-redis.hcl
Error during plan: Unexpected response code: 500 (1 error occurred:
	* Task group cache validation failed: 1 error occurred:
	* Task group service validation failed: 1 error occurred:
	* Service[0] evandam-redis validation failed: 1 error occurred:
	* Check evandam-redis-ping invalid: tcp checks are not valid for Connect enabled services

Job file

job "evandam-redis" {
  datacenters = ["test"]
  type        = "service"

  group "cache" {
    network {
      mode = "bridge"
      port "tcp" {}
    }

    service {
      name = "evandam-redis"
      port = "tcp"

      connect {
        sidecar_service {}
        sidecar_task {
          resources {
            cpu    = 20
            memory = 20
          }
        }
      }

      check {
        name     = "evandam-redis-ping"
        type     = "tcp"
        interval = "10s"
        timeout  = "2s"
      }
    }

    task "evandam-redis" {
      driver = "docker"

      config {
        image       = "redis:${NOMAD_META_image_tag}"
        ports       = ["tcp"]
        force_pull  = true
        args        = [
          "--port",
          "${NOMAD_PORT_tcp}",
        ]
      }
    }
  }
}
@nickethier
Member

Hi @evandam

When Consul Connect is enabled, the application (Redis in this case) is usually configured to listen on localhost, and Nomad configures the Envoy proxy to act as a gateway to your application. Consul (which executes the TCP check) sits in the host's network namespace, whereas this allocation has its own network namespace. This means that Consul can't perform a meaningful TCP health check: it can't reach Redis over plaintext (because of the separate network namespaces), and a TCP check against the Envoy proxy only verifies that Envoy is listening.

I want to call out the expose field of the check stanza. Since Envoy understands layer 7 traffic, we can tell it to expose a specific HTTP path or gRPC service over plaintext. This option injects a port just for this purpose and registers it with the check, so Consul can perform the check through the exposed endpoint. So your best bet may be to run a second task that exposes a health check through an HTTP endpoint. Some quick googling led me to this project.
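As a rough illustration of the expose approach described above, here is a minimal sketch using the explicit `expose` block of the sidecar proxy. It assumes a second task (omitted here) serves an HTTP health endpoint inside the allocation; the `health` port label, port 8080, and the `/health` path are all hypothetical and not taken from this thread.

```hcl
# Sketch only: the "health" port label, port 8080, and the /health path
# are assumptions for illustration, not values from this thread.
group "cache" {
  network {
    mode = "bridge"

    # Dynamic host port that Envoy listens on for the exposed path.
    port "health" {
      to = -1
    }
  }

  service {
    name = "evandam-redis"
    port = "6379"

    connect {
      sidecar_service {
        proxy {
          expose {
            path {
              path            = "/health"
              protocol        = "http"
              local_path_port = 8080     # where the hypothetical health task listens
              listener_port   = "health" # plaintext listener opened by Envoy
            }
          }
        }
      }
    }

    # Consul reaches this check through the exposed plaintext listener.
    check {
      name     = "evandam-redis-health"
      type     = "http"
      port     = "health"
      path     = "/health"
      interval = "10s"
      timeout  = "2s"
    }
  }
}
```

When the health endpoint is served on the service's own port, setting `expose = true` on the check is the shorter form and lets Nomad generate the listener port itself.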

Finally, I wanted to point out one more thing in your job file. You do not need to define ports in your Docker task because you are using the bridge group network mode. You do have to tell the port where to map to inside the allocation's network namespace. So you could set the following for your port, then remove the args and ports fields from your driver config, since the port will forward to the default Redis port:

network {
  mode = "bridge"
  port "tcp" {
    to = 6379
  }
}
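With that port mapping in place, the task block from the original job file would presumably shrink to something like the sketch below. This is just the original task with the `ports` and `args` fields removed, as suggested above; nothing else is changed.

```hcl
task "evandam-redis" {
  driver = "docker"

  config {
    image      = "redis:${NOMAD_META_image_tag}"
    force_pull = true
    # No ports or args here: Redis listens on its default port 6379 inside
    # the allocation's network namespace, and the group-level port maps to it.
  }
}
```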

I hope this helps, please let us know if you have more questions.

@evandam
Author

evandam commented Feb 2, 2021

Thanks @nickethier! I also noticed that script checks work, so switching to a check that essentially does redis-cli ping works too 🤷‍♂️
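For reference, a script check along those lines might look like the sketch below. It assumes the check lives in the group-level service block (so it needs a `task` to run in), that `redis-cli` is available in the task's image, and that Redis listens on its default port; if Redis listens on another port, a `-p` argument would be needed as well.

```hcl
check {
  name     = "evandam-redis-ping"
  type     = "script"
  task     = "evandam-redis" # the task whose environment the command runs in
  command  = "redis-cli"
  args     = ["ping"]
  interval = "10s"
  timeout  = "2s"
}
```

Because the command runs inside the task rather than from Consul on the host, the separate network namespace is not an obstacle.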

I tried switching the ports bit like you mentioned, but I think something is getting lost along the way.

Here's my HCL: https://gist.github.com/evandam/6642b3ff3dcf49baaa97da2635a31dae

I'm testing with this:

nomad alloc exec -task evandam-redis-client -job evandam-redis redis-cli ping

Note with the first one you get PONG back as expected, but the second job file results in:

Error: Connection reset by peer

I think there's something going on like the sidecar proxy is using the wrong port? Not sure if this is a separate problem - happy to open a new issue if needed.

Thanks!

@nickethier
Member

nickethier commented Feb 3, 2021

Ah yes script checks are a great option too!!

I forked your gist and made some changes and added comments. I figured that would translate better: https://gist.github.com/nickethier/dbb4312a66f7688d37fcf71a246b55cc/revisions

That should get you going. Let me know if it doesn't.

Also, as an FYI, we have a discuss board where we like to direct questions. This issue is totally fine; I'm just letting you know for future reference.

@evandam
Author

evandam commented Feb 3, 2021

Thanks @nickethier

I think this makes perfect sense for the majority of use cases using Consul Connect, but I was looking at things slightly differently.

While this example is using Redis, a more practical example is a service that connects to upstream services using Consul Connect, but should also be accessible with Fabio. In this case, I actually want the port stanza so it can be accessed from the host network by Fabio. I realize it goes around some of the security benefits of Consul Connect, but it seems to be the only option to keep things working with services like Fabio.
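One way such a setup could be expressed is sketched below, under stated assumptions: the group registers one Connect-enabled service for its upstreams and a second, plain service on a published port for Fabio. The service names, port 8080, the `/health` path, and the `urlprefix-/web` route are hypothetical, and the task itself is omitted.

```hcl
# Sketch only: service names, ports, and paths are hypothetical.
group "web" {
  network {
    mode = "bridge"

    # Published on the host so Fabio (outside the mesh) can reach the app.
    port "http" {
      to = 8080
    }
  }

  # Mesh-facing registration: provides the sidecar and an upstream.
  service {
    name = "web"
    port = "8080"

    connect {
      sidecar_service {
        proxy {
          upstreams {
            destination_name = "evandam-redis"
            local_bind_port  = 6379
          }
        }
      }
    }
  }

  # Host-facing registration: advertises the published port to Fabio.
  service {
    name = "web-public"
    port = "http"
    tags = ["urlprefix-/web"]

    check {
      type     = "http"
      path     = "/health"
      interval = "10s"
      timeout  = "2s"
    }
  }
}
```

As noted above, traffic arriving through the published port bypasses the sidecar, so it does not get the mTLS and intention enforcement that Connect provides.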

Either way, I'll keep the discussion board in mind for the future!

@nickethier
Member

Ah, I see. In that case, uncommenting the port stanza should get you there.

It sounds like we can close this issue. Please let me know if that's not the case. Thanks!

@github-actions

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 24, 2022