1.2.5 Running a worker node

The worker node registers with the web node and is then used for executing builds and performing resource checks. It doesn't really decide much on its own.

Prerequisites

  • Linux: We test and support the following distributions. Minimum kernel version tested is 4.4.

    • Ubuntu 16.04 (kernel 4.4)

    • Ubuntu 18.04 (kernel 5.3)

    • Ubuntu 20.04 (kernel 5.4)

    • Debian 10 (kernel 4.19)


  • Windows/Darwin: no special requirements (that we know of).

    NOTE: Windows containers are currently not supported and Darwin does not have native containers. Steps will run inside a temporary directory on the Windows/Darwin worker. Any dependencies needed for your tasks (e.g. git, .NET, golang, ssh) should be pre-installed on the worker. Windows/Darwin workers do not come with any resource types.

Running concourse worker

The concourse CLI can run as a worker node via the worker subcommand.

First, you'll need to configure a directory for the worker to store data:

CONCOURSE_WORK_DIR=/opt/concourse/worker

This is where all the builds run, and where all resources are fetched into, so make sure it's backed by enough storage.

Next, point the worker at your web node like so:

CONCOURSE_TSA_HOST=10.0.2.15:2222
CONCOURSE_TSA_PUBLIC_KEY=path/to/tsa_host_key.pub
CONCOURSE_TSA_WORKER_PRIVATE_KEY=path/to/worker_key

Finally start the worker:

# run with -E to forward env config, or just set it all as root
sudo -E concourse worker

Note that the worker must be run as root because it orchestrates containers.
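Putting the above together, a minimal launch script might look like the following sketch. The work dir and TSA address come from the examples above; the key paths are assumptions, so adjust everything for your environment.

#!/bin/sh
# Minimal worker launch sketch - values taken from the examples above (adjust paths).
export CONCOURSE_WORK_DIR=/opt/concourse/worker
export CONCOURSE_TSA_HOST=10.0.2.15:2222
export CONCOURSE_TSA_PUBLIC_KEY=/etc/concourse/tsa_host_key.pub
export CONCOURSE_TSA_WORKER_PRIVATE_KEY=/etc/concourse/worker_key
# the worker must run as root so it can orchestrate containers
exec concourse worker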

All logs will be emitted to stdout, with any panics or lower-level errors being emitted to stderr.

Resource utilization

CPU usage: almost entirely subject to pipeline workloads. More resources configured will result in more checking, and in-flight builds will use as much CPU as they want.

Memory usage: also subject to pipeline workloads. Expect usage to increase with the number of containers on the worker and spike as builds run.

Bandwidth usage: again, almost entirely subject to pipeline workloads. Expect spikes from periodic checking, though the intervals should spread out over enough time. Resource fetching and pushing will also use arbitrary bandwidth.

Disk usage: arbitrary data will be written as builds run, and resource caches will be kept and garbage collected on their own life cycle. We suggest going for a larger disk size if it's not too much trouble. All state on disk must not outlive the worker itself; it is all ephemeral. If the worker is re-created (i.e. fresh VM/container and all processes were killed), it should be brought back with an empty disk.

Highly available: not applicable. Workers are inherently singletons, as they're being used as drivers running entirely different workloads.

Horizontally scalable: yes; workers directly correlate to your capacity required by however many pipelines, resources, and in-flight builds you want to run. It makes sense to scale them up and down with demand.

Outbound traffic:

  • External traffic to arbitrary locations as a result of periodic resource checking and running builds

  • External traffic to the web node's configured external URL when downloading the inputs for a fly execute

  • External traffic to the web node's TSA port (2222) for registering the worker

  • If P2P streaming is enabled there will be traffic to other workers.

Inbound traffic:

  • From the web node on ports 7777 (Garden) and 7788 (BaggageClaim). These ports do not need to be exposed; they are forwarded to the web node via the SSH connection on port 2222.

  • If P2P streaming is enabled there will be traffic to other workers.

Operating a worker node

The worker nodes are designed to be stateless and as interchangeable as possible. Tasks and Resources bring their own Docker images, so you should never have to install dependencies on the worker. Windows and Darwin workers are the exception: any dependencies should be pre-installed on them.

In Concourse, all important data is represented by Resources, so the workers themselves are dispensable. Any data in the work-dir is ephemeral and should go away when the worker machine is removed - it should not be persisted between worker VM or container re-creates.

Scaling Workers

More workers should be added to accommodate more pipelines. To know when this is necessary you should probably set up Metrics and keep an eye on container counts. If average container count starts to approach 200 or so per worker, you should probably add another worker. Load average is another metric to keep an eye on.
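Even without a full metrics stack, fly gives a quick view of per-worker container counts (the target name ci below is a placeholder):

fly -t ci workers
# the containers column shows how many containers each worker is currently running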

To add a worker, just create another machine for the worker and follow the Running concourse worker instructions again.

Note: it doesn't make sense to run multiple workers on one machine, since they'll all be contending for the same physical resources. Workers should be given their own VMs or physical machines to make the most of those resources.

Horizontal vs Vertical Scaling

The answer to whether you should scale your workers horizontally or vertically depends heavily on what workloads your pipelines are running. Anecdotally though, we have seen that a lot of smaller workers (horizontal scaling) is usually better than a few large workers (vertical scaling).

Again, this is not an absolute answer! You will have to test this out against the workloads your pipelines demand and adjust based on the Metrics that you are tracking.

Worker Heartbeating & Stalling

Workers will continuously heartbeat to the Concourse cluster in order to remain registered and healthy. If a worker hasn't checked in after a while, possibly due to a network error, being overloaded, or having crashed, the web node will transition its state to stalled and new workloads will not be scheduled on that worker until it recovers.

If the worker remains in this state and cannot be recovered, it can be removed using the fly prune-worker command.
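For example (target and worker name below are placeholders):

fly -t ci workers                              # confirm the worker is in the stalled state
fly -t ci prune-worker --worker some-stalled-worker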

Restarting a Worker

Workers can be restarted in-place by sending SIGTERM to the worker process and starting it back up. Containers will remain running and Concourse will reattach to builds that were in flight.

This is a pretty aggressive way to restart a worker, and may result in errored builds - there are a few moving parts involved and we're still working on making this airtight.

A safer way to restart a worker is to land it by sending SIGUSR1 to the worker process. This will switch the worker to the landing state and Concourse will stop scheduling new work on it. When all builds running on the worker have finished, the process will exit.
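For example, if your init system writes a pidfile for the worker process, landing is a single signal (the pidfile path here is an assumption):

kill -USR1 "$(cat /var/run/concourse-worker.pid)"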

You may want to enforce a timeout for draining - that way a stuck build won't prevent your workers from being upgraded. This can be enforced by common tools like start-stop-daemon:

start-stop-daemon \
  --pidfile worker.pid \
  --stop \
  --retry USR1/300/TERM/15/KILL

This will send SIGUSR1, wait up to 5 minutes, and then send SIGTERM. If it's still running, it will be killed after an additional 15 seconds.

Once the timeout is enforced, there's still a chance that builds that were running will continue when the worker comes back.

Gracefully Removing a Worker

When a worker machine is going away, it should be retired. This is similar to landing, except at the end the worker is completely unregistered, along with its volumes and containers. This should be done when a worker's VM or container is being destroyed.

To retire a worker, send SIGUSR2 to the worker process. This will switch the worker to retiring state, and Concourse will stop scheduling new work on it. When all builds running on the worker have finished, the worker will be removed and the worker process will exit.
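As with landing, retiring can be triggered directly with a signal (again assuming a pidfile is written for the worker process):

kill -USR2 "$(cat /var/run/concourse-worker.pid)"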

Just like with landing, you may want to enforce a timeout for draining - that way a stuck build won't prevent your workers from being upgraded. This can be enforced by common tools like start-stop-daemon:

start-stop-daemon \
  --pidfile worker.pid \
  --stop \
  --retry USR2/300/TERM/15/KILL

This will send SIGUSR2, wait up to 5 minutes, and then send SIGTERM. If it's still running, it will be killed after an additional 15 seconds.

Configuring the worker node

Tagging Workers

If there's something special about your worker and you'd like to target builds at it specifically, you can configure tags like so:

CONCOURSE_TAG="tag-1,tag-2"

A tagged worker is taken out of the default placement logic. Tagged workers will not be used for any untagged Steps.

To run build steps on a tagged worker, specify the tags on any particular step in your job.

To perform resource checks on a tagged worker, specify tags on the resource declaration.
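One-off builds can also be pointed at tagged workers; for instance, fly execute can pass tags so the build lands on a matching worker (target and task config below are placeholders):

fly -t ci execute -c task.yml --tag tag-1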

Team Workers

If you want to isolate all workloads for a team then you can configure a worker to belong to a single team like so:

CONCOURSE_TEAM="lightweavers"

Once an untagged team worker is registered, Concourse will schedule all of that team's untagged builds on its team worker(s). Builds for this team will no longer be scheduled on any untagged, non-team workers.

It is possible to have a Concourse cluster made up of only team workers and zero non-team workers, though this is not a common setup because it tends to leave workers underutilized. It is useful though if you have a particular team with heavy workloads that regularly disrupts other teams' pipelines.

Tags and Team Workers

When you have a worker configured with tag(s) and a team like so:

CONCOURSE_TAG="tag-1,tag-2"
CONCOURSE_TEAM="lightweavers"

Only steps that are tagged and from the specified team will be scheduled on such a worker. Any untagged work the team has will land on either:

  1. Untagged team workers belonging to the team, or

  2. Untagged workers not configured to a specific team

Healthcheck Endpoint

The worker will automatically listen on port 8888 as its healthcheck endpoint. It will return an HTTP 200 status code with an empty body on a successful check. A successful check means the worker can reach the Garden and BaggageClaim servers.

The healthcheck endpoint is configurable through three variables:

--healthcheck-bind-ip=
IP address on which to listen for health checking requests. (default: 0.0.0.0)

--healthcheck-bind-port=
Port on which to listen for health checking requests. (default: 8888)

--healthcheck-timeout=
HTTP timeout for the full duration of health checking. (default: 5s)
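For a quick manual check, you can hit the endpoint with curl, assuming the default bind address and port (the root path is used here as an assumption):

curl -fsS -o /dev/null -w "%{http_code}\n" http://127.0.0.1:8888/
# prints 200 when the worker can reach its Garden and BaggageClaim servers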

Resource Types

The following section only applies to Linux workers. Resource types are simply Linux container images and therefore can't be run on Windows or Darwin workers.

Bundled Resource Types

Workers come prepackaged with a bundle of resource types. They are included in the tarball from the GitHub release page and are part of the concourse/concourse image.

To view the resource types available on a worker run:

fly workers --details

If you want more details, like the version number of each resource, you can run:

fly curl api/v1/workers

Installing or Upgrading Bundled Resource Types

You may want to upgrade the bundled resource types outside of Concourse upgrades or even install additional resource types on your workers to reduce the polling on some external image repository like Docker Hub.

We will use the git resource as our example. We will assume your Concourse installation is at /usr/local/concourse.

First, pull and create a container of the resource you're installing/upgrading. Grab the ID of the container that Docker creates.

$ docker run -d concourse/git-resource
b253417142565cd5eb43902e94a2cf355d5354b583fbc686488c9a153584c6ba

Export the container's file system into a gzip-compressed tar archive named rootfs.tgz:

docker export b253417142 | gzip > rootfs.tgz

Create a file called resource_metadata.json and populate it with the following contents. Make sure the type does not conflict with an existing resource type when you're installing a new resource type. In our example here we're calling the type gitv2 to avoid conflicting with the pre-existing git resource.

{
  "type": "gitv2",
  "version": "1.13.0",
  "privileged": false,
  "unique_version_history": false
}

At this point you should have two files: rootfs.tgz and resource_metadata.json.

Create a new directory under the resource-types folder in your Concourse installation directory. By convention it should be the same name as the type.

mkdir /usr/local/concourse/resource-types/gitv2
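Then copy both files into the new directory (assuming rootfs.tgz and resource_metadata.json were created in your current working directory):

cp rootfs.tgz resource_metadata.json /usr/local/concourse/resource-types/gitv2/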

With both files in place, restart your worker and verify the new resource type is available by running one of the following commands:

fly workers --details
# or
fly curl api/v1/workers

You can also verify that Concourse can create a container with the rootfs.tgz you made by running a simple pipeline:

resources:
- name: some-resource
  type: gitv2 #change to your resource type
  source:
    uri: https://github.com/concourse/git-resource.git

jobs:
- name: simple-job
  plan:
  - get: some-resource
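To run this verification pipeline, save it to a file and set it with fly (target, pipeline name, and file name below are placeholders):

fly -t ci set-pipeline -p verify-gitv2 -c pipeline.yml
fly -t ci unpause-pipeline -p verify-gitv2
fly -t ci trigger-job -j verify-gitv2/simple-job --watch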

Configuring Runtimes

The worker can be run with multiple container runtimes - containerd, Guardian, and Houdini (an experimental runtime, and the only one available for Darwin and Windows). Only containerd and Guardian are meant for production use. Guardian is the default runtime for Concourse.

Note about architecture: The web node (ATC) talks to all 3 runtimes via a single interface called the Garden server. While Guardian comes packaged with a Garden server and its flags in Concourse are unfortunately prefixed with --garden-*, Guardian (a runtime) and Garden (an interface and server) are two separate tools. An analogy for Garden would be the Container Runtime Interface (CRI) used in Kubernetes. Kubernetes uses containerd via CRI. Concourse uses containerd via Garden.

containerd runtime

To use the containerd runtime, set the --runtime flag (CONCOURSE_RUNTIME) to containerd on the concourse worker command.
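For example, either of the following starts the worker with containerd (the flag and the environment variable are equivalent):

concourse worker --runtime containerd
# or equivalently
CONCOURSE_RUNTIME=containerd concourse worker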

The following is a list of the containerd runtime specific flags for Concourse that can be set. They are all optional and have default values.

Containerd Configuration:
  --containerd-config=                               Path to a config file to use for the Containerd daemon. [$CONCOURSE_CONTAINERD_CONFIG]
  --containerd-bin=                                  Path to a containerd executable (non-absolute names get resolved from $PATH). [$CONCOURSE_CONTAINERD_BIN]
  --containerd-init-bin=                             Path to an init executable (non-absolute names get resolved from $PATH). (default: /usr/local/concourse/bin/init) [$CONCOURSE_CONTAINERD_INIT_BIN]
  --containerd-cni-plugins-dir=                      Path to CNI network plugins. (default: /usr/local/concourse/bin) [$CONCOURSE_CONTAINERD_CNI_PLUGINS_DIR]
  --containerd-request-timeout=                      How long to wait for requests to Containerd to complete. 0 means no timeout. (default: 5m) [$CONCOURSE_CONTAINERD_REQUEST_TIMEOUT]
  --containerd-max-containers=                       Max container capacity. 0 means no limit. (default: 250) [$CONCOURSE_CONTAINERD_MAX_CONTAINERS]

Containerd Container Networking:
  --containerd-external-ip=                          IP address to use to reach container's mapped ports. Autodetected if not specified. [$CONCOURSE_CONTAINERD_EXTERNAL_IP]
  --containerd-dns-server=                           DNS server IP address to use instead of automatically determined servers. Can be specified multiple times. [$CONCOURSE_CONTAINERD_DNS_SERVER]
  --containerd-restricted-network=                   Network ranges to which traffic from containers will be restricted. Can be specified multiple times. [$CONCOURSE_CONTAINERD_RESTRICTED_NETWORK]
  --containerd-network-pool=                         Network range to use for dynamically allocated container subnets. (default: 10.80.0.0/16) [$CONCOURSE_CONTAINERD_NETWORK_POOL]
  --containerd-mtu=                                  MTU size for container network interfaces. Defaults to the MTU of the interface used for outbound access by the host. [$CONCOURSE_CONTAINERD_MTU]
  --containerd-allow-host-access                     Allow containers to reach the host's network. This is turned off by default. [$CONCOURSE_CONTAINERD_ALLOW_HOST_ACCESS]

DNS Proxy Configuration:
  --containerd-dns-proxy-enable                      Enable proxy DNS server. Note: this will enable containers to access the host network. [$CONCOURSE_CONTAINERD_DNS_PROXY_ENABLE]

Make sure to read A note on allowing host access and DNS proxy to understand the implications of using --containerd-allow-host-access and --containerd-dns-proxy-enable.

Transitioning from Guardian to containerd

If you are transitioning from Guardian to containerd you will need to convert any --garden-* (CONCOURSE_GARDEN_*) flags to their containerd (CONCOURSE_CONTAINERD_*) counterparts:

Guardian flag (env var) → containerd flag (env var):

  • --garden-request-timeout (CONCOURSE_GARDEN_REQUEST_TIMEOUT) → --containerd-request-timeout (CONCOURSE_CONTAINERD_REQUEST_TIMEOUT)

  • --garden-dns-proxy-enable (CONCOURSE_GARDEN_DNS_PROXY_ENABLE) → --containerd-dns-proxy-enable (CONCOURSE_CONTAINERD_DNS_PROXY_ENABLE)

  • CONCOURSE_GARDEN_ALLOW_HOST_ACCESS (no equivalent CLI flag) → --containerd-allow-host-access (CONCOURSE_CONTAINERD_ALLOW_HOST_ACCESS)

  • --garden-network-pool (CONCOURSE_GARDEN_NETWORK_POOL) → --containerd-network-pool (CONCOURSE_CONTAINERD_NETWORK_POOL)

  • --garden-max-containers (CONCOURSE_GARDEN_MAX_CONTAINERS) → --containerd-max-containers (CONCOURSE_CONTAINERD_MAX_CONTAINERS)

  • CONCOURSE_GARDEN_DENY_NETWORKS (no equivalent CLI flag) → --containerd-restricted-network (CONCOURSE_CONTAINERD_RESTRICTED_NETWORK)

  • CONCOURSE_GARDEN_DNS_SERVER (no equivalent CLI flag) → --containerd-dns-server (CONCOURSE_CONTAINERD_DNS_SERVER)

  • CONCOURSE_GARDEN_EXTERNAL_IP (no equivalent CLI flag) → --containerd-external-ip (CONCOURSE_CONTAINERD_EXTERNAL_IP)

  • CONCOURSE_GARDEN_MTU (no equivalent CLI flag) → --containerd-mtu (CONCOURSE_CONTAINERD_MTU)
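As a concrete sketch, a worker configured with the Guardian variables on the left-hand side of the table would be converted like so (the values are illustrative):

# before: Guardian
CONCOURSE_GARDEN_NETWORK_POOL=10.254.0.0/16
CONCOURSE_GARDEN_DNS_SERVER=8.8.8.8
CONCOURSE_GARDEN_MAX_CONTAINERS=250

# after: containerd
CONCOURSE_RUNTIME=containerd
CONCOURSE_CONTAINERD_NETWORK_POOL=10.254.0.0/16
CONCOURSE_CONTAINERD_DNS_SERVER=8.8.8.8
CONCOURSE_CONTAINERD_MAX_CONTAINERS=250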

Guardian runtime

Guardian is currently the default runtime for Concourse. It can also be selected explicitly by setting the --runtime flag to guardian on the concourse worker command.

The concourse worker command automatically configures and runs Guardian using the gdn binary, but depending on the environment you're running Concourse in, you may need to pop open the hood and configure a few things.

The gdn server can be configured in two ways:

  1. By creating a config.ini file and passing it as --garden-config (or CONCOURSE_GARDEN_CONFIG).

    The .ini file should look something like this:

    [server]
    flag-name=flag-value

    To learn which flags can be set, consult gdn server --help. Each flag listed can be set under the [server] heading.

  2. By setting CONCOURSE_GARDEN_* environment variables.

    This is primarily supported for backwards compatibility, and these variables are not present in concourse worker --help. They are translated to flags passed to gdn server by lower-casing the * portion and replacing underscores with hyphens.
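    For example, under this second option, an environment variable like the one below (the value is illustrative) is translated into the matching gdn server flag:

    CONCOURSE_GARDEN_MAX_CONTAINERS=150
    # is passed to the runtime as: gdn server --max-containers=150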

Troubleshooting and fixing DNS resolution

Note: The Guardian runtime handled many container creation details for Concourse in the past and was very convenient for the project to use as a container runtime. While implementing the containerd runtime, most reported bugs were actually differences in containerd's default behavior compared to Guardian's. Currently Concourse's containerd runtime mostly behaves like the Guardian runtime did, so most of the following DNS section should apply to both runtimes.

By default, containers created by the Guardian or containerd runtime (referred to below simply as "the runtime") will carry over the /etc/resolv.conf from the host into the container. This is often fine, but some Linux distributions configure a special 127.x.x.x DNS resolver (e.g. systemd-resolved).

When the runtime copies the resolv.conf over, it removes these entries as they won't be reachable from the container's network namespace. As a result, your containers may not have any valid nameservers configured.

To diagnose this problem you can fly intercept into a failing container and check which nameservers are in /etc/resolv.conf:

$ fly -t ci intercept -j concourse/concourse
bash-5.0$ grep nameserver /etc/resolv.conf
bash-5.0$

In this case it is empty, as the host only listed a single 127.0.0.53 address which was then stripped out. To fix this you'll need to explicitly configure DNS instead of relying on the default runtime behavior.

Pointing to external DNS servers

If you have no need for special DNS resolution within your Concourse containers, you can configure your containers to use specific DNS server addresses external to the VM.

The Guardian and containerd runtimes can have their DNS servers configured with flags or env vars.

DNS Servers via flags (containerd runtime only)
concourse worker --containerd-dns-server="1.1.1.1" --containerd-dns-server="8.8.8.8"

DNS Servers via env vars
# containerd runtime
CONCOURSE_CONTAINERD_DNS_SERVER="1.1.1.1,8.8.8.8"
# Guardian runtime
CONCOURSE_GARDEN_DNS_SERVER="1.1.1.1,8.8.8.8"

config.ini (Guardian runtime only)
[server]
; configure Google DNS
dns-server=8.8.8.8
dns-server=8.8.4.4

To verify this solves your problem you can fly intercept into a container and check which nameservers are in /etc/resolv.conf:

$ fly -t ci intercept -j my-pipeline/the-job
bash-5.0$ cat /etc/resolv.conf
nameserver 1.1.1.1
nameserver 8.8.8.8
bash-5.0$ ping google.com
PING google.com (108.177.111.139): 56 data bytes
64 bytes from 108.177.111.139: seq=0 ttl=47 time=2.672 ms
64 bytes from 108.177.111.139: seq=1 ttl=47 time=0.911 ms

Using a local DNS server

If you would like to use Consul, dnsmasq, or some other DNS server running on the worker VM, you'll have to configure the LAN address of the VM as the DNS server and allow the containers to reach the address, like so:

Local DNS Servers via flags (containerd runtime only)
concourse worker --containerd-dns-server="10.0.1.3" --containerd-allow-host-access="true"

Local DNS Servers via env vars
# containerd runtime
CONCOURSE_CONTAINERD_DNS_SERVER="10.0.1.3"
CONCOURSE_CONTAINERD_ALLOW_HOST_ACCESS="true"
# Guardian runtime
CONCOURSE_GARDEN_DNS_SERVER="10.0.1.3"
CONCOURSE_GARDEN_ALLOW_HOST_ACCESS="true"

config.ini (Guardian runtime only)
[server]
; internal IP of the worker machine
dns-server=10.0.1.3

; allow containers to reach the above IP
allow-host-access=true

Make sure to read A note on allowing host access and DNS proxy to understand the implications of using allow-host-access.

To validate whether the changes have taken effect, you can fly intercept into any container and check /etc/resolv.conf once again:

$ fly -t ci intercept -j my-pipeline/the-job
bash-5.0$ cat /etc/resolv.conf
nameserver 10.0.1.3
bash-5.0$ nslookup concourse-ci.org
Server:         10.0.1.3
Address:        10.0.1.3#53

Non-authoritative answer:
Name:   concourse-ci.org
Address: 185.199.108.153
Name:   concourse-ci.org
Address: 185.199.109.153
Name:   concourse-ci.org
Address: 185.199.110.153
Name:   concourse-ci.org
Address: 185.199.111.153

If nslookup times out or fails, you may need to open up firewalls or security group configuration so that the worker VM can send UDP/TCP packets to itself.

A note on allowing host access and DNS proxy

Setting allow-host-access will, well, allow containers to access your host VM's network. If you don't trust your container workloads, you may not want to allow this. With host network access, containers can reach any other network processes running locally on the worker, including the Garden and BaggageClaim servers, which would allow them to issue commands and manipulate other containers and volumes on the same worker.

Setting dns-proxy-enable will also enable allow-host-access (since the DNS proxy runs on the host, host access must be enabled).

Configuring Peer-to-Peer Volume Streaming

Peer-to-Peer (P2P) volume streaming enables the workers to stream volumes directly to each other instead of always streaming volumes through the web node(s). This can reduce the time it takes for individual steps in a job to start and reduce the amount of network traffic used by the Concourse cluster.

NOTE: This feature is experimental. It is not as robust as the default volume streaming setup which always goes through web nodes.

Pre-Requisites

  • All worker nodes need to be able to reach each other via IP address. This usually means they are on the same LAN. You can test this by trying to ping one worker from another worker. If even one worker does not meet this requirement then you cannot use P2P volume streaming.

  • The baggageclaim port (7788 by default) must be open to traffic on all worker nodes. You can verify the port is open and reaching the baggageclaim API server by hitting the /volumes endpoint:

    curl http://<worker-IP-address>:7788/volumes

To enable P2P volume streaming you need to configure some settings on the web and worker nodes. Configure the worker nodes first. Configure the web node(s) last.

P2P Worker Configuration

  • CONCOURSE_BAGGAGECLAIM_BIND_IP=0.0.0.0 - Required. The worker needs to listen for traffic over 127.0.0.1 (to receive info from the web node) as well as its LAN IP in a P2P setup, so baggageclaim's bind IP needs to be set to 0.0.0.0.

  • CONCOURSE_BAGGAGECLAIM_P2P_INTERFACE_NAME_PATTERN=eth0 - Optional. Regular expression to match a network interface for P2P streaming. This is how a worker determines its own LAN IP address, by looking it up via the LAN interface specified by this flag.

    You can determine the name of the LAN interface for any worker by listing all network interfaces and noting which interface has the LAN IP that you want the worker to use.

    To view all available network interfaces on your worker:

    • On Linux run ip addr list

    • On MacOS run ifconfig

    • On Windows run ipconfig. Windows network interface names are very different from Unix device names. Example network interface names for Windows include:

      Ethernet 4
      Local Area Connection* 2
      Local Area Connection* 12
      Wi-Fi 5
      Bluetooth Network Connection 2
      Loopback Pseudo-Interface 1

  • CONCOURSE_BAGGAGECLAIM_P2P_INTERFACE_FAMILY=4 - Optional. Tells the worker to use IPv4 or IPv6. Defaults to 4 for IPv4. Set to 6 for IPv6.
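Putting the worker-side settings together, a P2P-enabled Linux worker might be configured like this (the interface name eth0 is an assumption; match it to your own LAN interface):

CONCOURSE_BAGGAGECLAIM_BIND_IP=0.0.0.0
CONCOURSE_BAGGAGECLAIM_P2P_INTERFACE_NAME_PATTERN=eth0
CONCOURSE_BAGGAGECLAIM_P2P_INTERFACE_FAMILY=4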

P2P Web Configuration

You need to tell the web node(s) to use P2P volume streaming.

CONCOURSE_ENABLE_P2P_VOLUME_STREAMING=true

Once that flag is set and the web node is restarted, P2P volume streaming will start occurring in your Concourse cluster.