
mount: operation not permitted #4199

Closed
SuryaBommadevara opened this issue Apr 23, 2018 · 11 comments


Nomad version

Nomad v0.7.0

Operating system and Environment details

Ubuntu 16.04

Issue

I have a Nomad cluster on AWS with 3 Nomad servers and 5 Nomad clients. I see this issue very often these days: `Setup Failure: failed to build task directory for elasticsearch: mount: operation not permitted`. What bothers me is that the client where I am seeing this error is successfully running my other containers. Any clue as to why this is happening?

Reproduction steps

Nomad Server logs (if appropriate)

There are no server error logs

Nomad Client logs (if appropriate)

There are no client error logs

Client config file (if appropriate)

```hcl
name      = "nomad-client-1"
region    = "california"
bind_addr = "0.0.0.0"
data_dir  = "/opt/nomad/data"
log_level = "DEBUG"

advertise {
  http = "xxxxxxxxx"
  rpc  = "xxxxxxxxx"
  serf = "xxxxxxxxx"
}

client {
  enabled = true
  options {
    "driver.raw_exec.enable"    = "1"
    "docker.privileged.enabled" = "true"
    "docker.auth.config"        = "/etc/docker/config.json"
    "docker.auth.helper"        = "ecr-login"
  }
}

consul {
  address = "127.0.0.1:8500"
}
```
dadgar (Contributor) commented Apr 23, 2018

Hey,

You will need to share the job file in question, the client logs, and the output of `alloc-status` before we can help. There is not enough information in this ticket.
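
A rough sketch of how that output could be gathered with the Nomad 0.7-era CLI, assuming the affected job is the `elasticsearch` job named in the error message:

```sh
# List the job's allocations to find the ID of the failed one.
nomad status elasticsearch

# Show the detailed allocation status, including task events such as
# "Setup Failure"; <alloc-id> is a placeholder for the failed allocation ID.
nomad alloc-status <alloc-id>
```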

SuryaBommadevara (Author)

@dadgar This behaviour is not confined to a particular job file; it happens once in a while regardless of which Nomad job we run. And I was not able to retrieve the logs, because as soon as the allocation fails I don't see any log files for that particular allocation.

dadgar (Contributor) commented Apr 23, 2018

Are you mounting particular directories? What drivers are you using?

SuryaBommadevara (Author) commented Apr 23, 2018

Okay. Below is the job file, in case it helps. And no, I am not mounting any directories.

```hcl
job "mapper" {  # opening line missing from the original paste; job name assumed
  region      = "california"
  datacenters = ["us-west-1"]
  type        = "service"
  priority    = 50

  update {
    stagger      = "10s"
    max_parallel = 1
  }

  group "group-mapper" {
    count = 1

    restart {
      attempts = 10
      interval = "5m"
      delay    = "25s"
      mode     = "delay"
    }

    task "run-mapper" {
      driver = "docker"

      config {
        image        = "310880495183.dkr.ecr.us-west-1.amazonaws.com/jenkins-ci:mapper-latest"
        network_mode = "host"
        force_pull   = true

        port_map {
          mapperport = 9712
        }
      }

      service {
        name = "mapper"
        tags = ["mapper", "mapper-run"]
        port = "mapperport"

        check {
          name     = "alive"
          type     = "tcp"
          interval = "10s"
          timeout  = "2s"
        }
      }

      resources {
        cpu    = 500
        memory = 1500

        network {
          mbits = 10

          port "mapperport" {
            static = 9712
          }
        }
      }

      logs {
        max_files     = 10
        max_file_size = 15
      }
    }
  }
}
```

dadgar (Contributor) commented Apr 24, 2018

@SuryaBommadevara Are you by chance running Nomad in a container itself? Can you mount a tmpfs regularly?

```sh
mkdir /tmp/foo
sudo mount -t tmpfs none /tmp/foo
sudo umount /tmp/foo
```

SuryaBommadevara (Author)

@dadgar No, I am running Nomad directly on AWS host machines. And yes, I am able to mount a tmpfs:

[screenshot: successful tmpfs mount and unmount, 2018-04-24]

nickethier (Member) commented Apr 26, 2018

Hey @SuryaBommadevara

Could you add a few more details to help us reproduce/debug this (a command sketch for gathering them follows the list), including:

  • Linux kernel and docker versions
  • Output of mount on affected client(s)
  • Output of ps auxf on affected client(s)
  • AWS instance type
  • Is this issue only occurring on a single client?
  • How long have you been observing this error? Has the occurrence increased over time?
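
A minimal sketch of commands that would collect most of the above on an affected client; only standard Linux and Docker tooling is assumed:

```sh
# Kernel and Docker versions
uname -a
docker version

# Current mounts and the full process tree on the affected client
mount
ps auxf
```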

nickethier (Member)
@SuryaBommadevara in the process tree you sent me I see a Nomad client running inside a Docker container. It looks like Nomad is running on the host machine and a second Nomad agent is running in a Jenkins slave container on the same host. Can you confirm this?

If the Nomad client in the container is the one setting up the task, that is likely what is causing the "operation not permitted" error from the mount call: the container must have the SYS_ADMIN capability to mount a tmpfs for the task directory.
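
For illustration only: if a Nomad client really must run inside Docker, the container has to be started with that capability. The image name below is a placeholder, and depending on the Docker version the default seccomp/AppArmor profiles may also block mount:

```sh
# Grant CAP_SYS_ADMIN so the containerized agent can mount a tmpfs for
# task directories; --privileged would also work but grants far more.
docker run --cap-add=SYS_ADMIN jenkins-slave-with-nomad
```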

SuryaBommadevara (Author) commented May 1, 2018

@nickethier My apologies for the delayed response. So this is why we were seeing the mount error occasionally: we have 5 Nomad clients on which we run Jenkins master and slave containers, and the catch was that we were also running a Nomad agent inside those containers so we could submit jobs to the server. What I completely overlooked is that when a Nomad agent runs inside a container, the Nomad servers treat it as just another client. So a job we expected to land on one of the 5 host clients was sometimes placed on the client running inside a container, and since we had not given those containerized agents the necessary permissions, the mount failed. We are no longer running Nomad agents in containers, and that fixed it. Thank you for assisting us on this, and sorry for any inconvenience caused. You can go ahead and close this issue.
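
A quick way to confirm a fix like this, sketched with the Nomad 0.7-era CLI: list the clients the servers have registered and check that only the intended host machines appear.

```sh
# Every agent running in client mode shows up here, including any started
# inside containers; unexpected entries indicate accidental extra clients.
nomad node-status
```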

nickethier (Member)
@SuryaBommadevara no worries! I'm glad we were able to help sort this out. Thanks for using nomad!

github-actions (bot)

I'm going to lock this issue because it has been closed for 120 days ⏳. This helps our maintainers find and focus on the active issues.
If you have found a problem that seems similar to this, please open a new issue and complete the issue template so we can capture all the details necessary to investigate further.

github-actions bot locked as resolved and limited conversation to collaborators Nov 30, 2022