How does Kamal deploy to multiple hosts at once? And how to configure it?
SSHKit
Kamal is built around SSHKit, which provides the SSH connections Kamal uses to issue remote commands. Throughout the Kamal codebase we can spot the following SSHKit DSL, which lets Kamal schedule work on each host in its own thread:
on(KAMAL.hosts) do |host|
  # Execute commands on each host
end
Kamal also extends SSHKit in a few ways. The most notable is the change to SSHKit::Runner::Parallel that lets Kamal wait on all threads and collect the failures:
class SSHKit::Runner::Parallel
  # SSHKit joins the threads in sequence and fails on the first error it encounters, which means that we wait threads
  # before the first failure to complete but not for ones after.
  #
  # We'll patch it to wait for them all to complete, and to record all the threads that errored so we can see when a
  # problem occurs on multiple hosts.
  module CompleteAll
    def execute
      threads = hosts.map do |host|
        Thread.new(host) do |h|
          backend(h, &block).run
        rescue ::StandardError => e
          e2 = SSHKit::Runner::ExecuteError.new e
          raise e2, "Exception while executing #{host.user ? "as #{host.user}@" : "on host "}#{host}: #{e.message}"
        end
      end
      exceptions = []
      threads.each do |t|
        begin
          t.join
        rescue SSHKit::Runner::ExecuteError => e
          exceptions << e
        end
      end
      if exceptions.one?
        raise exceptions.first
      elsif exceptions.many?
        raise exceptions.first, [ "Exceptions on #{exceptions.count} hosts:", exceptions.map(&:message) ].join("\n")
      end
    end
  end
  prepend CompleteAll
end
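To see the "join everything, then report" idea in isolation, here is a minimal sketch in plain Ruby without SSHKit. The collect_failures method, the host names and the simulated failure are all hypothetical; only the pattern of joining every thread and rescuing around each join mirrors the patch above.

```ruby
# Hypothetical helper: runs a block of "work" per host in its own thread,
# waits for ALL threads, and returns the messages of every failure.
def collect_failures(hosts, failing:)
  threads = hosts.map do |host|
    Thread.new(host) do |h|
      # Silence the default stderr report; we surface errors ourselves.
      Thread.current.report_on_exception = false
      # Simulated remote work: fail only on the hosts listed in `failing`.
      raise "failed on #{h}" if failing.include?(h)
    end
  end

  failures = []
  threads.each do |t|
    begin
      # Thread#join re-raises the thread's exception in the caller.
      t.join
    rescue StandardError => e
      # Record the failure and keep joining the remaining threads.
      failures << e.message
    end
  end
  failures
end

failures = collect_failures(%w[web-1 web-2 web-3], failing: %w[web-1 web-3])
puts failures
```

Because every join is wrapped in its own rescue, a failure on the first host no longer hides a failure on the last one.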
Web barrier
Kamal splits servers into roles, one of which is the primary role. The primary role is usually the web role, e.g. our Puma server serving HTTP requests.
This role is important because it is booted before any other role:
def run
  old_version = old_version_renamed_if_clashing
  wait_at_barrier if queuer?
  begin
    start_new_version
  rescue => e
    close_barrier if gatekeeper?
    stop_new_version
    raise
  end
  release_barrier if gatekeeper?
  if old_version
    stop_old_version(old_version)
  end
end
Only after at least one host of the primary role boots successfully does everything else get the green light to boot as well.
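The gatekeeper and queuer interplay can be sketched with a small one-shot barrier. This is an illustration only, not Kamal's own barrier class: SimpleBarrier, its open, close and wait methods, and the host names are all made up for the example.

```ruby
# Hypothetical one-shot barrier: queuers block in #wait until a gatekeeper
# either opens it (boot succeeded) or closes it (boot failed).
class SimpleBarrier
  def initialize
    @mutex = Mutex.new
    @cond = ConditionVariable.new
    @state = :waiting
  end

  def open
    settle(:opened)
  end

  def close
    settle(:closed)
  end

  # Blocks until the barrier is settled; true if opened, false if closed.
  def wait
    @mutex.synchronize do
      @cond.wait(@mutex) while @state == :waiting
      @state == :opened
    end
  end

  private

  # Only the first call decides the outcome; later calls are ignored.
  def settle(state)
    @mutex.synchronize do
      @state = state if @state == :waiting
      @cond.broadcast
    end
  end
end

barrier = SimpleBarrier.new
booted = Queue.new

# Queuers: hosts of non-primary roles wait at the barrier before booting.
queuers = %w[job-1 job-2].map do |host|
  Thread.new { booted << host if barrier.wait }
end

# Gatekeeper: a primary role host boots first, then releases the barrier.
booted << "web-1"
barrier.open
queuers.each(&:join)

boot_order = Array.new(3) { booted.pop }
puts boot_order.first
```

Had the gatekeeper called close instead, wait would return false and the queuers would skip booting altogether.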
Controlled rollouts
Kamal supports rolling deployments through the boot configuration in config/deploy. With limit and wait we can decide how many servers to deploy at a time and how long to wait between groups:
boot:
  limit: 2 # Deploy to 2 servers at a time
  wait: 10 # Wait 10 seconds between groups
Alternatively, the limit can also be specified as a percentage:
boot:
  limit: 25% # Deploy to 25% of servers at a time
  wait: 10
This lets us roll out the new changes at a pace we are comfortable with.
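To make the grouping concrete, here is a rough sketch of how a limit could translate into deployment groups. The group_size and rollout_groups helpers and the host names are hypothetical, and rounding a percentage down with a floor of one host is an assumption of this sketch rather than something taken from the configuration above.

```ruby
# Hypothetical helper: turn a `limit` (Integer or "N%" String) into a group size.
def group_size(limit, host_count)
  if limit.to_s.end_with?("%")
    # Assumption: round down, but never go below one host per group.
    [ host_count * limit.to_i / 100, 1 ].max
  else
    limit
  end
end

# Hypothetical helper: split the hosts into consecutive groups of that size.
def rollout_groups(hosts, limit)
  hosts.each_slice(group_size(limit, hosts.size)).to_a
end

hosts = %w[web-1 web-2 web-3 web-4 web-5 web-6 web-7 web-8]
groups = rollout_groups(hosts, "25%")

groups.each_with_index do |group, index|
  puts "Group #{index + 1}: #{group.join(', ')}"
  # A real rollout would pause for `wait` seconds here before the next group.
end
```

With eight hosts, a limit of 25% and a limit of 2 produce the same four groups of two.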
Note that on each host, containers are booted in a blue-green style of deployment, where both old and new containers run next to each other for a brief period of time.
Finally, there is one more related setting, which controls how many connections can be started concurrently:
sshkit:
  max_concurrent_starts: 10
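The idea behind such a cap can be illustrated with a small semaphore built on Ruby's SizedQueue. This is a sketch of the general technique, not Kamal's implementation: the constant, the simulated connection start and the counters are hypothetical, and the assumption is that only the connection start is throttled while the work afterwards runs freely.

```ruby
MAX_CONCURRENT_STARTS = 2 # hypothetical value for the sketch

# A SizedQueue doubles as a counting semaphore: push blocks once it is full.
slots = SizedQueue.new(MAX_CONCURRENT_STARTS)
mutex = Mutex.new
starting = 0
peak = 0

threads = 6.times.map do
  Thread.new do
    slots.push(true) # take a slot; blocks while the limit is reached
    begin
      mutex.synchronize do
        starting += 1
        peak = [ peak, starting ].max
      end
      sleep 0.01 # stand-in for opening the SSH connection
      mutex.synchronize { starting -= 1 }
    ensure
      slots.pop # free the slot so another connection can start
    end
    # Work on the established connection would continue here, unthrottled.
  end
end
threads.each(&:join)

puts "Peak concurrent starts: #{peak}"
```

However many threads are spawned, the number of connection starts in flight never exceeds the configured limit.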