Notes to self

How does Kamal deploy to multiple hosts

How does Kamal deploy to multiple hosts at once? And how to configure it?

SSHKit

Kamal is built around SSHKit, which provides the SSH connections Kamal uses to issue remote commands. Throughout the Kamal codebase we can see the following SSHKit DSL, which lets Kamal schedule work on each host in its own thread:

on(KAMAL.hosts) do |host|
  # Execute commands on each host
end
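
To make the thread-per-host idea concrete, here is a minimal plain-Ruby sketch. It does not use SSHKit and opens no SSH connections; run_on_each and the host names are hypothetical stand-ins for illustration only.

```ruby
# Hypothetical stand-in for thread-per-host scheduling (not SSHKit's code).
# No SSH is involved: the block simply receives the host name.
def run_on_each(hosts)
  threads = hosts.map do |host|
    Thread.new(host) { |h| yield h }
  end
  # Thread#value joins the thread and returns the block's result.
  threads.map(&:value)
end

results = run_on_each(%w[web-1 web-2 worker-1]) do |host|
  "deployed to #{host}"
end
puts results
```

Results come back in the order the hosts were given, even though the blocks themselves may run concurrently.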

Kamal also extends SSHKit in a few ways. The most notable is the patch to SSHKit::Runner::Parallel, which lets Kamal wait on all threads and collect every failure:

class SSHKit::Runner::Parallel
  # SSHKit joins the threads in sequence and fails on the first error it encounters, which means that we wait threads
  # before the first failure to complete but not for ones after.
  #
  # We'll patch it to wait for them all to complete, and to record all the threads that errored so we can see when a
  # problem occurs on multiple hosts.
  module CompleteAll
    def execute
      threads = hosts.map do |host|
        Thread.new(host) do |h|
          backend(h, &block).run
        rescue ::StandardError => e
          e2 = SSHKit::Runner::ExecuteError.new e
          raise e2, "Exception while executing #{host.user ? "as #{host.user}@" : "on host "}#{host}: #{e.message}"
        end
      end

      exceptions = []
      threads.each do |t|
        begin
          t.join
        rescue SSHKit::Runner::ExecuteError => e
          exceptions << e
        end
      end
      if exceptions.one?
        raise exceptions.first
      elsif exceptions.many?
        raise exceptions.first, [ "Exceptions on #{exceptions.count} hosts:", exceptions.map(&:message) ].join("\n")
      end
    end
  end

  prepend CompleteAll
end
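
The difference between the two joining strategies can be seen in a small plain-Ruby sketch without SSHKit. The host names and the simulated failures are made up for the demo, and the helper names are hypothetical.

```ruby
Thread.report_on_exception = false # keep the demo output quiet

# Hypothetical workload: hosts "b" and "c" fail, "a" and "d" succeed.
def failing_threads
  %w[a b c d].map do |host|
    Thread.new(host) do |h|
      raise "boom on #{h}" if %w[b c].include?(h)
      h
    end
  end
end

# Join in sequence and stop at the first failure:
# failures on later threads are never observed.
def join_until_first_error(threads)
  threads.each(&:join)
  []
rescue StandardError => e
  [e.message]
end

# Join every thread and record every failure.
def join_collecting_errors(threads)
  threads.each_with_object([]) do |t, errors|
    t.join
  rescue StandardError => e
    errors << e.message
  end
end

first_only = join_until_first_error(failing_threads)
all_errors = join_collecting_errors(failing_threads)
puts first_only.inspect
puts all_errors.inspect
```

The first helper only ever reports the failure on "b", while the second one reports both "b" and "c", which is the behaviour the patch above is after.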

Web barrier

Kamal splits servers into roles with a primary role. The usual primary role is a web role, e.g. our Puma server serving HTTP requests.

This role matters because it is booted before any other role:

def run
  old_version = old_version_renamed_if_clashing
  wait_at_barrier if queuer?

  begin
    start_new_version
  rescue => e
    close_barrier if gatekeeper?
    stop_new_version
    raise
  end

  release_barrier if gatekeeper?

  if old_version
    stop_old_version(old_version)
  end
end

Only once at least one host of the primary role has booted successfully does everything else get the green light to boot as well.
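
The gatekeeper and queuer idea can be sketched with a mutex and a condition variable. This is an illustration under my own assumptions, not Kamal's implementation: the Barrier class, its release, close and wait methods, and the host names are all hypothetical.

```ruby
# Hypothetical barrier: queuers block in #wait until a gatekeeper
# either releases the barrier (success) or closes it (failure).
class Barrier
  def initialize
    @mutex = Mutex.new
    @cond = ConditionVariable.new
    @state = :pending
  end

  def release
    settle(:released)
  end

  def close
    settle(:closed)
  end

  # Blocks until the barrier is settled; raises if it was closed.
  def wait
    @mutex.synchronize do
      @cond.wait(@mutex) while @state == :pending
      raise "primary role failed to boot" if @state == :closed
    end
  end

  private

  # The first call wins; later calls leave the state unchanged.
  def settle(state)
    @mutex.synchronize do
      @state = state if @state == :pending
      @cond.broadcast
    end
  end
end

barrier = Barrier.new
log = Queue.new

# Queuers: non-primary hosts wait before booting.
workers = %w[worker-1 worker-2].map do |host|
  Thread.new do
    barrier.wait
    log << "#{host} booted"
  end
end

# Gatekeeper: the primary host boots first, then lets everyone through.
log << "web-1 booted"
barrier.release
workers.each(&:join)

order = Array.new(log.size) { log.pop }
puts order
```

Because the first settle call wins, a failure on one primary host after another has already released the barrier does not block the waiting hosts.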

Controlled rollouts

Kamal supports rolling deployments through the boot configuration in config/deploy.yml.

With limit and wait we can decide how many servers to deploy at a time and how long to wait between groups:

boot:
  limit: 2  # Deploy to 2 servers at a time
  wait: 10  # Wait 10 seconds between groups

Alternatively, the limit can be specified as a percentage:

boot:
  limit: 25% # Deploy to 25% of servers at a time
  wait: 10

This lets us roll out new changes at a pace we are comfortable with.
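
As a rough illustration of how a limit turns into deployment groups, here is a plain-Ruby sketch. It assumes a percentage limit is rounded down with a minimum of one host per group; resolve_limit, rollout_groups and the host names are hypothetical, and the wait between groups is left out.

```ruby
# Assumption: a percentage limit is rounded down, with a minimum of one host.
def resolve_limit(limit, host_count)
  if limit.to_s.end_with?("%")
    [host_count * limit.to_i / 100, 1].max
  else
    Integer(limit)
  end
end

# Split hosts into consecutive groups of at most `limit` hosts.
def rollout_groups(hosts, limit)
  hosts.each_slice(resolve_limit(limit, hosts.size)).to_a
end

hosts = %w[app-1 app-2 app-3 app-4 app-5 app-6 app-7 app-8]
groups = rollout_groups(hosts, "25%")
puts groups.inspect
```

With eight hosts and a 25% limit this yields four groups of two hosts each, deployed one group after another.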

Note that on each host, containers are booted in a blue-green style of deployment, where the old and new containers run side by side for a brief period of time.

Finally, there is one more related setting, which controls how many connections can be started concurrently:

sshkit:
  max_concurrent_starts: 10
by Josef Strzibny