* main:
Simplify domain language to just "boot" and unscoped config keys
Retain a fixed number of containers when pruning
Don't assume rolling back in message
Check all hosts before rolling back
Ensure Traefik service name is consistent
Extend traefik delay by 1 second
Include traefik access logs
Check if we are still getting a 404
Also dump load balancer logs
Dump traefik logs when app not booted
Fix missing for apt-get
Report on container health after failure
Fix the integration test healthcheck
Allow percentage-based rolling deployments
Move `group_limit` & `group_wait` under `boot`
Limit rolling deployment to boot operation
Allow performing boot & start operations in groups
Time based container and image retention can have variable space
requirements depending on how often we deploy.
- Only prune stopped containers, retaining the 5 newest
- Then prune dangling images so we only keep images for the retained
containers.
Hosts could end up out of sync with each other if prune commands are run
manually or when new hosts are added.
Before rolling back confirm that the required container is available on
all hosts and roles.
If we don't specify any service properties when labelling containers,
the generated service will be named according to the container. However,
we change the container name on every deployment (as it is versioned),
which means that the auto-generated service name will be different in
each container.
That is a problem for two reasons:
- Multiple containers share a common router while a deployment is
happening. At this point, the router configuration will be different
between the containers; Traefik flags this as an error, and stops
routing to the containers until it's resolved.
- We allow custom labels to be set in an app's config. In order to
define custom configuration on the service, we'll need to know what
it will be called.
Changed to force the service name by setting one of its properties.
Add checks for:
* Docker installed locally
* Docker buildx plugin installed locally
* Dockerfile exists
If checks fail, it will halt deployment and provide more specific error messages.
Also adds a cli subcommand:
`mrsk build dependencies`
Fixes: #109 and #237
Getting the lock status with invoke passes through any options from the
original command which will raise an exception if they are not also
valid for the lock status command.
Fixes https://github.com/mrsked/mrsk/issues/239
Rather than waiting 5 seconds and hoping for the best after we boot
docker compose, add docker healthchecks and wait for all the containers
to be healthy.
Adds a simple integration test to ensure that `mrsk deploy` works.
Everything required is spun up with docker compose:
- shared: a container that contains an ssh key and a self signed cert to
be shared between the images
- deployer: the image we will deploy from
- registry: a docker registry
- two vm images to deploy into
- load_balancer: an nginx load balancer to use between our images
The other images are in privileged mode so that we can run
docker-in-docker. We need to run docker inside the images - mapping in
the docker socket doesn't work because both VMs would share the host
daemon.
The docker registry requires a self signed cert as you cannot use basic
auth over HTTP except on localhost. It runs on port 4443 rather than 443
because docker refused to accept that "registry" is a docker host and
tries to push images to docker.io/registry. "registry:4443" works fine.
The shared container contains the ssh keys for the deployer and vms, and
the self signed cert for the registry. When the shared container boots,
it copies them into a shared volume.
The other deployer and vm images are built with soft links from the
shared volume to the require locations. Their boot scripts wait for the
files to be copied in before continuing.
The root mrsk folder is mapped into the deployer container. On boot it
builds the gem and installs it.
Right now there's just a single test. We confirm that the load balancer
is returning a 502, run `mrsk deploy` and then confirm it returns 200.