Replaces our current host-based HTTP healthchecks with Docker healthchecks, and adds a new `healthcheck.cmd` config option that can be used to define a custom healthcheck command. Also removes Traefik's healthchecks, since they are no longer necessary.

When deploying a container that has a healthcheck defined, we wait for it to report a healthy status before stopping the old container that it replaces. Containers that don't have a healthcheck defined continue to wait for `MRSK.config.readiness_delay`.

There are some pros and cons to using Docker healthchecks rather than checking from the host. The main advantages are:

- Supports non-HTTP checks, and app-specific check scripts provided by a container.
- When booting a container, allows MRSK to wait for a container to be healthy before shutting down the old container it replaces. This should be safer than relying on a timeout.
- Containers with healthchecks won't be active in Traefik until they reach a healthy state, which prevents any traffic from being routed to them before they are ready.

The main _disadvantage_ is that containers are now required to provide some way to check their health. Our default check assumes that `curl` is available in the container, which, while common, won't always be the case.
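To make the mechanism concrete, here is a rough sketch of how a healthcheck config could be turned into Docker flags, and how a container's status could be read back. It is not the code in this PR: the helper names, the `port`, `path` and `interval` keys, and their defaults are assumptions for illustration. Only the `cmd` option and the `curl`-based default come from the description above. `--health-cmd`, `--health-interval` and `docker inspect --format` are standard Docker CLI features.

```ruby
# Hypothetical helper: builds `docker run` health flags from a config hash.
# The "port", "path" and "interval" keys and their defaults are assumptions.
def health_flags(config)
  port = config.fetch("port", 3000)
  path = config.fetch("path", "/up")
  # A custom `cmd` wins; otherwise fall back to a curl-based HTTP check.
  cmd = config.fetch("cmd") { "curl -f http://localhost:#{port}#{path} || exit 1" }
  interval = config.fetch("interval", "1s")
  ["--health-cmd", cmd, "--health-interval", interval]
end

# Go template that prints the health status when a healthcheck is defined,
# and falls back to the plain container state (e.g. "running") otherwise.
STATUS_FORMAT = '{{if .State.Health}}{{.State.Health.Status}}{{else}}{{.State.Status}}{{end}}'

# Hypothetical helper: the command whose output would be polled.
def status_command(container)
  ["docker", "inspect", "--format", STATUS_FORMAT, container]
end

puts health_flags({ "cmd" => "pg_isready" }).join(" ")
puts status_command("app-web").join(" ")
```

A container without a healthcheck reports its plain state, which is why `"running"` can be treated as "no healthcheck configured" when polling.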
40 lines
1.0 KiB
Ruby
class Mrsk::Utils::HealthcheckPoller
  TRAEFIK_HEALTHY_DELAY = 1

  class HealthcheckError < StandardError; end

  class << self
    def wait_for_healthy(pause_after_ready: false, &block)
      attempt = 1
      max_attempts = MRSK.config.healthcheck["max_attempts"]

      begin
        case status = block.call
        when "healthy"
          sleep TRAEFIK_HEALTHY_DELAY if pause_after_ready
        when "running" # No health check configured
          sleep MRSK.config.readiness_delay if pause_after_ready
        else
          raise HealthcheckError, "container not ready (#{status})"
        end
      rescue HealthcheckError => e
        if attempt <= max_attempts
          info "#{e.message}, retrying in #{attempt}s (attempt #{attempt}/#{max_attempts})..."
          sleep attempt
          attempt += 1
          retry
        else
          raise
        end
      end

      info "Container is healthy!"
    end

    private

    def info(message)
      SSHKit.config.output.info(message)
    end
  end
end
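
The retry loop in `wait_for_healthy` can be tried in isolation with a simplified, self-contained sketch. This is not the class above: the `MRSK` and `SSHKit` dependencies are replaced by plain arguments (`max_attempts:`, `sleeper:` and `log:` are hypothetical names), the `pause_after_ready` delays are left out, and the method returns the final status so the result can be inspected.

```ruby
# Simplified stand-in for the poller; the keyword arguments are assumptions
# introduced so the example runs without MRSK or SSHKit.
class HealthcheckError < StandardError; end

def wait_for_healthy(max_attempts:, sleeper: ->(seconds) { sleep seconds }, log: ->(message) { puts message })
  attempt = 1

  begin
    status = yield
    # "running" means no healthcheck is configured, so it also counts as ready.
    raise HealthcheckError, "container not ready (#{status})" unless %w[healthy running].include?(status)
  rescue HealthcheckError => e
    raise if attempt > max_attempts

    log.call "#{e.message}, retrying in #{attempt}s (attempt #{attempt}/#{max_attempts})..."
    sleeper.call attempt # Linear backoff: wait 1s, then 2s, then 3s...
    attempt += 1
    retry
  end

  status
end

# Usage: a fake container that becomes healthy on the third poll.
statuses = %w[starting starting healthy]
slept = []
logs = []

result = wait_for_healthy(max_attempts: 5, sleeper: ->(s) { slept << s }, log: ->(m) { logs << m }) do
  statuses.shift
end
```

Injecting the sleeper keeps the example fast and lets the backoff be recorded. In the real class the waits are actual `sleep` calls, and the status would come from querying Docker on the host.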