Nomad and Consul

Nomad

  • workload orchestrator (containers, Java, VMs, Windows, binaries)
  • single binary (Linux, ARM, macOS, Windows)
  • unified workflow for deployments
    • zero downtime (blue-green, canary, rolling updates*)
    • batch
    • service / system
  • easily integrates with Consul and Vault
  • plugin system
    • Container Storage Interface (CSI)
    • Container Network Interface (CNI)
  • use HCL to define workloads

Nomad versions

asdf plugin-add nomad
asdf install nomad 0.12.5
asdf local nomad 0.12.5
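
To confirm which binary asdf has activated:

nomad version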

CLI

nomad run           # run/update a job definition
nomad stop          # stop a job
nomad status        # check a resource status
nomad agent         # run nomad agent
nomad system -help  # show help for a command
nomad -help         # show global help

Architecture

  • Job is a declared workload for Nomad; it defines a desired state
  • Task Group is a set of tasks that must run together
  • Driver is the tool used to execute a workload (e.g. docker, exec, java)
  • Task is the smallest unit of work in Nomad
  • Client is a machine where Nomad can run tasks
  • Allocation is the mapping of a job's task group onto a Client
  • Evaluation is the mechanism by which Nomad makes scheduling decisions
  • Servers are the brain of the cluster (both roles can be listed from the CLI, as shown below)
  • Regions and Datacenters are places where Nomad Clients and Servers are deployed
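
Once an agent is running, both roles can be inspected from the CLI:

$ nomad server members  # list servers and the current leader
$ nomad node status     # list clients registered with the servers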

Architecture Overview

Single Region

(diagram: Nomad single-region architecture)

Multiple Regions

(diagram: Nomad multi-region architecture)

Scheduling in Nomad

(diagram: scheduling in Nomad)

(diagram: evaluation flow)

Job Specification

Options

  • affinity - soft placement preferences
  • artifact - download an archive
  • check_restart - define when to restart an unhealthy task
  • constraint - hard placement requirements (see the sketch after this list)
  • env - set environment variables
  • ephemeral_disk - disk setup
  • group - a set of tasks that run together on the same client
  • lifecycle - define task dependencies
  • logs - set log rotation policies
  • migrate - define a strategy for migrating off draining nodes
  • network - define network settings (ports, throughput, DNS)
  • periodic - run a Nomad job on a cron-like schedule
  • reschedule - define a strategy to move a job's allocations to another node when an allocation becomes "failed"
  • resources - describe the resources a task needs to execute
  • restart - define a task's behaviour on failure
  • service - describe how tasks are registered in Consul
  • spread - define how allocations should be spread across nodes
  • template - simple template engine to render configuration files
  • update - define a strategy for job updates (rolling upgrades, canary, blue-green)
  • volume - allow a group to require a volume from the cluster
  • volume_mount - mount a volume at a specific location inside a task
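
A sketch of how placement options combine (attr.kernel.name and node.datacenter are standard node attributes; the job and task names here are made up):

job "placement-demo" {
  datacenters = ["dc1"]

  # hard requirement: only Linux clients are eligible
  constraint {
    attribute = "${attr.kernel.name}"
    value     = "linux"
  }

  # soft preference: spread allocations across datacenters
  spread {
    attribute = "${node.datacenter}"
  }

  group "demo" {
    count = 3

    task "sleep" {
      driver = "raw_exec"

      config {
        command = "sleep"
        args    = ["3600"]
      }
    }
  }
}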

Hierarchy

job
  \_ group
        \_ task

Example

# This declares a job named "docs". There can be exactly one
# job declaration per job file.
job "docs" {
  # Specify this job should run in the region named "us". Regions
  # are defined by the Nomad servers' configuration.
  region = "us"

  # Spread the tasks in this job between us-west-1 and us-east-1.
  datacenters = ["us-west-1", "us-east-1"]

  # Run this job as a "service" type. Each job type has different
  # properties. See the documentation below for more examples.
  type = "service"

  # Specify this job to have rolling updates, two-at-a-time, with
  # 30 second intervals.
  update {
    stagger      = "30s"
    max_parallel = 2
  }


  # A group defines a series of tasks that should be co-located
  # on the same client (host). All tasks within a group will be
  # placed on the same host.
  group "webs" {
    # Specify the number of these tasks we want.
    count = 5

    network {
      # This requests a dynamic port named "http". This will
      # be something like "46283", but we refer to it via the
      # label "http".
      port "http" {}

      # This requests a static port on 443 on the host. This
      # will restrict this task to running once per host, since
      # there is only one port 443 on each host.
      port "https" {
        static = 443
      }
    }

    # The service block tells Nomad how to register this service
    # with Consul for service discovery and monitoring.
    service {
      # This tells Consul to monitor the service on the port
      # labelled "http". Since Nomad allocates high dynamic port
      # numbers, we use labels to refer to them.
      port = "http"

      check {
        type     = "http"
        path     = "/health"
        interval = "10s"
        timeout  = "2s"
      }
    }

    # Create an individual task (unit of work). This particular
    # task utilizes a Docker container to front a web application.
    task "frontend" {
      # Specify the driver to be "docker". Nomad supports
      # multiple drivers.
      driver = "docker"

      # Configuration is specific to each driver.
      config {
        image = "hashicorp/web-frontend"
      }

      # It is possible to set environment variables which will be
      # available to the task when it runs.
      env {
        "DB_HOST" = "db01.example.com"
        "DB_USER" = "web"
        "DB_PASS" = "loremipsum"
      }

      # Specify the maximum resources required to run the task,
      # including CPU and memory.
      resources {
        cpu    = 500 # MHz
        memory = 128 # MB
      }
    }
  }
}

Running Nomad

  • dev mode (do not use dev mode in production)
  • agent runs on servers and clients
  • UI runs on port 4646
  • http://localhost:4646
$ nomad agent -dev

Sample job

$ nomad job init
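
This writes a documented example job to example.nomad in the current directory; it can be submitted as-is:

$ nomad job run example.nomad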

Demo Examples

Static Port

job "http-echo" {
  datacenters = ["dc1"]
  type        = "service" # default

  group "echo" {
    count = 1
    task "server" {
      driver = "docker"

      config {
        image = "hashicorp/http-echo:latest"
        args = [
          "-listen", ":8080",
          "-text", "Hello there from 127.0.0.1:8080"
        ]
      }

      resources {
        network {
          mbits = 10
          port "http" {
            static = 8080
          }
        }
      }
    }
  }
}
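
Because the port is static, the service is reachable directly on the host:

$ curl http://localhost:8080
Hello there from 127.0.0.1:8080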

Dynamic Port

job "http-echo" {
  datacenters = ["dc1"]
  type        = "service" # default

  group "echo" {
    count = 1
    task "server" {
      driver = "docker"

      config {
        image = "hashicorp/http-echo:latest"
        args = [
          "-listen", ":${NOMAD_PORT_http}",
          "-text", "Hello there from 127.0.0.1:${NOMAD_PORT_http}"
        ]
      }

      resources {
        network {
          mbits = 10
          port "http" { }
        }
      }
    }
  }
}
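
With a dynamic port, the bound address has to be looked up, e.g. via the allocation status (<alloc-id> is a placeholder for an ID printed by the first command):

$ nomad status http-echo   # lists allocations and their IDs
$ nomad status <alloc-id>  # the Addresses line shows the bound port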

Service Discovery aka Consul

  • adding more than one container (of a given image)
  • challenges: ports
  • simplifies sending requests between instances

Consul

  • run Consul
asdf plugin-add consul
asdf install consul 1.8.4
asdf local consul 1.8.4

or

https://www.consul.io/downloads

Run Consul

job "consul" {
  datacenters = ["dc1"]

  group "consul" {
    task "consul" {
      driver = "raw_exec"

      config {
        command = "consul"
        args    = ["agent", "-dev"]
      }

      artifact {
        source = "https://releases.hashicorp.com/consul/1.8.4/consul_1.8.4_linux_amd64.zip"
      }
    }
  }
}
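
Once the allocation is running, the dev agent answers on its default ports (8500 for HTTP/UI, 8600 for DNS):

$ consul members
$ curl http://localhost:8500/v1/catalog/services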

Register service

job "http-echo" {
  datacenters = ["dc1"]
  type        = "service" # default

  group "echo" {
    count = 3
    task "server" {
      driver = "docker"

      config {
        image = "hashicorp/http-echo:latest"
        args = [
          "-listen", ":${NOMAD_PORT_http}",
          "-text", "Hello there from 127.0.0.1:${NOMAD_PORT_http}"
        ]
      }

      resources {
        network {
          mbits = 10
          port "http" {}
        }
      }

      service {
        name = "http-echo"
        tags = ["http-echo", "we", "need", "mode", "tags"]
        port = "http"

        check {
          name     = "http-echo port alive"
          type     = "http"
          path     = "/"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}
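
After registration, each instance is resolvable through Consul; the dev agent serves DNS on port 8600:

$ dig @127.0.0.1 -p 8600 http-echo.service.consul SRV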

Internal Load Balancing

  • use Fabio deployed as a container
    • internal load balancer - fabio
    • http://localhost:9998/routes
    • watch -n 0.2 "curl -s http://localhost:9999/echo" (and keep it open all the time)

Run Fabio

job "fabio" {
  datacenters = ["dc1"]

  group "fabio" {
    task "fabio" {
      driver = "docker"

      config {
        network_mode = "host"
        image        = "fabiolb/fabio:1.5.14-go1.15"
        args         = ["-proxy.strategy=rr"]
      }

      resources {
        network {
          mbits = 10

          port "lb" {
            static = 9998
          }

          port "ui" {
            static = 9999
          }
        }
      }
    }
  }
}
job "http-echo" {
  datacenters = ["dc1"]
  type        = "service" # default

  group "echo" {
    count = 3
    task "server" {
      driver = "docker"

      config {
        image = "hashicorp/http-echo:latest"
        args = [
          "-listen", ":${NOMAD_PORT_http}",
          "-text", "Hello there from 127.0.0.1:${NOMAD_PORT_http}"
        ]
      }

      resources {
        network {
          mbits = 100
          port "http" {}
        }
      }

      service {
        name = "http-echo"
        tags = [
          "http-echo",
          "urlprefix-/echo" # <--LOOK-HERE--
        ]
        port = "http"

        check {
          name     = "http-echo port alive"
          type     = "http"
          path     = "/"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}
  • run the checker: watch -n 0.2 "curl -s http://localhost:9999/echo"

Deployment and Job Versions

  • Nomad servers (masters) keep the state of each job
  • use plan and then run (as Terraform does)
  • or use Terraform (with the Nomad provider)
$ nomad job plan job.nomad
..
...
Job Modify Index: 0
To submit the job with version verification run:

nomad job run -check-index 0 job.nomad
$ nomad job run -check-index 0 job.nomad

==> Monitoring evaluation "dc8266e6"
    Evaluation triggered by job "http-echo"
    Allocation "54ac20ff" created: node "25f73cb5", group "echo"
    Evaluation within deployment: "543d74a0"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "dc8266e6" finished with status "complete"

Deployments

Canary

job "http-echo" {
  datacenters = ["dc1"]
  type        = "service" # default

  update {
    canary       = 1
    max_parallel = 2
    # auto_promote = true
  }

  group "echo" {
    count = 3
    task "server" {
      driver = "docker"

      config {
        image = "hashicorp/http-echo:latest"
        args = [
          "-listen", ":${NOMAD_PORT_http}",
          "-text", "Hello there from 127.0.0.1:${NOMAD_PORT_http} - CANARY 1/2"
        ]
      }

      resources {
        network {
          mbits = 100
          port "http" {}
        }
      }

      service {
        name = "http-echo"
        tags = [
          "http-echo",
          "urlprefix-/echo"
        ]
        port = "http"

        check {
          name     = "http-echo port alive"
          type     = "http"
          path     = "/"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}
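
Without auto_promote, the canary runs alongside the old allocations until the deployment is promoted by hand (the ID comes from the first command):

$ nomad deployment list
$ nomad deployment promote <deployment-id>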

Environment variables

job "http-echo" {
  datacenters = ["dc1"]
  type        = "service" # default

  update {
    canary       = 3
    max_parallel = 3
    # auto_promote = true
  }

  group "echo" {
    count = 3
    task "server" {
      driver = "docker"

      template { # <--LOOK-HERE--
        data = <<EOH
DUMMY_KEY="{{key "dummy_key"}}"
EOH

        destination = "secrets/file.env"
        env         = true
      }

      config {
        image = "hashicorp/http-echo:latest"
        args = [
          "-listen", ":${NOMAD_PORT_http}",
          "-text", "Hello there from 127.0.0.1:${NOMAD_PORT_http} - CANARY 3/3 - ${DUMMY_KEY}" # <--LOOK-HERE--
        ]
      }

      resources {
        network {
          mbits = 100
          port "http" {}
        }
      }

      service {
        name = "http-echo"
        tags = [
          "http-echo",
          "urlprefix-/echo"
        ]
        port = "http"

        check {
          name     = "http-echo port alive"
          type     = "http"
          path     = "/"
          interval = "10s"
          timeout  = "2s"
        }
      }
    }
  }
}
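
The {{key "dummy_key"}} lookup reads Consul's KV store, so the key must exist before the deployment can become healthy (the value here is only an example):

$ consul kv put dummy_key "hello-from-consul"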

CLI

$ nomad status
No running jobs
$ nomad server members
Name        Address    Port  Status  Leader  Protocol  Build   Datacenter  Region
x1e.global  127.0.0.1  4648  alive   true    2         0.12.5  dc1         global
$ nomad run job.nomad
==> Monitoring evaluation "dc8266e6"
    Evaluation triggered by job "http-echo"
    Allocation "54ac20ff" created: node "25f73cb5", group "echo"
    Evaluation within deployment: "543d74a0"
    Evaluation status changed: "pending" -> "complete"
==> Evaluation "dc8266e6" finished with status "complete"
$ nomad status
ID         Type     Priority  Status   Submit Date
http-echo  service  50        running  2020-10-16T08:01:06+02:00
$ nomad status http-echo
ID            = http-echo
Name          = http-echo
Submit Date   = 2020-10-16T08:01:06+02:00
Type          = service
Priority      = 50
Datacenters   = dc1
Namespace     = default
Status        = running
Periodic      = false
Parameterized = false

Summary
Task Group  Queued  Starting  Running  Failed  Complete  Lost
echo        0       0         1        0       0         0

Latest Deployment
ID          = 543d74a0
Status      = running
Description = Deployment is running

Deployed
Task Group  Desired  Placed  Healthy  Unhealthy  Progress Deadline
echo        1        1       0        0          2020-10-16T08:11:06+02:00

Allocations
ID        Node ID   Task Group  Version  Desired  Status   Created  Modified
54ac20ff  25f73cb5  echo        0        run      running  48s ago  46s ago
$ nomad status 54ac20ff
ID                  = 54ac20ff-4390-9884-eb35-f691a7241b50
Eval ID             = dc8266e6
Name                = http-echo.echo[0]
Node ID             = 25f73cb5
Node Name           = x1e
Job ID              = http-echo
Job Version         = 0
Client Status       = running
Client Description  = Tasks are running
Desired Status      = run
Desired Description = <none>
Created             = 1m14s ago
Modified            = 1m12s ago
Deployment ID       = 543d74a0
Deployment Health   = unset

Task "server" is "running"
Task Resources
CPU        Memory           Disk     Addresses
0/100 MHz  288 KiB/300 MiB  300 MiB  http: 127.0.0.1:24183

Task Events:
Started At     = 2020-10-16T06:01:09Z
Finished At    = N/A
Total Restarts = 0
Last Restart   = N/A

Recent Events:
Time                       Type        Description
2020-10-16T08:01:09+02:00  Started     Task started by client
2020-10-16T08:01:06+02:00  Driver      Downloading image
2020-10-16T08:01:06+02:00  Task Setup  Building Task Directory
2020-10-16T08:01:06+02:00  Received    Task received by client
$ nomad exec 54ac20ff bash
$

History and revert

$ nomad job history fabio
Version     = 4
Stable      = true
Submit Date = 2020-10-14T12:58:08+02:00

Version     = 3
Stable      = true
Submit Date = 2020-10-14T12:57:13+02:00

Version     = 2
Stable      = true
Submit Date = 2020-10-14T12:54:04+02:00

Version     = 1
Stable      = true
Submit Date = 2020-10-14T12:51:53+02:00

Version     = 0
Stable      = false
Submit Date = 2020-10-14T12:51:42+02:00

$ nomad job revert fabio 3

Schedule jobs

job "cron" {
  datacenters = ["dc1"]
  type        = "batch"

  periodic {
    cron = "*/1 * * * * *"
  }

  group "cron" {
    task "cron" {
      driver = "raw_exec"
      config {
        command = "/tmp/cron"
      }
    }
  }
}
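
The job assumes an executable /tmp/cron on the client; a stand-in script like the hypothetical one below is enough for a demo, and the periodically launched children show up under the job's status:

cat > /tmp/cron <<'EOF'
#!/bin/sh
date >> /tmp/cron.log
EOF
chmod +x /tmp/cron

nomad job status cron   # lists the periodically launched child jobs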