Adding -T to the copy command ensures that the files are copied at the
same level into the target directory whether it exists or not.
That allows us to drop the `/*` which was not picking up hidden files.
Fixes: https://github.com/basecamp/kamal/issues/465
Kamal needs images to have the service label so it can track them for
pruning. Images built by Kamal will have the label, but externally built
ones may not.
Without it images will build up over time. The worst case is an outage
if all the hosts disks fill up at the same time.
We'll add a check for the label and halt if it is not there.
During deployments both the old and new containers will be active for a
small period of time. There also may be lagging requests for older CSS
and JS after the deployment.
This can lead to 404s if a request for old assets hits a new container
or visa-versa.
This PR makes sure that both sets of assets are available throughout the
deployment from before the new version of the app is booted.
This can be configured by setting the asset path:
```yaml
asset_path: "/rails/public/assets"
```
The process is:
1. We extract the assets out of the container, with docker run, docker
cp, docker stop. Docker run sets the container command to "sleep" so
this needs to be available in the container.
2. We create an asset volume directory on the host for the new version
of the app on the host and copy the assets in there.
3. If there is a previous deployment we also copy the new assets into
its asset volume and copy the older assets into the new asset volume.
4. We start the new container mapping the asset volume over the top of
the container's asset path.
This means the both the old and new versions have replaced the asset
path with a volume containing both sets of assets and should be able
to serve any request during the deployment. The older assets will
continue to be available until the next deployment.
The go template was concatenating all the mounts into one line. It
happened to work because the mount we are interested was always first.
Fix it to output one mount per line instead.
When replacing a container currently we:
1. Boot the new container
2. Wait for it to become healthy
3. Stop the old container
Traefik will send requests to the old container until it notices that it
is unhealthy. But it may have stopped serving requests before that point
which can result in errors.
To get round that the new boot process is:
1. Create a directory with a single file on the host
2. Boot the new container, mounting the cord file into /tmp and
including a check for the file in the docker healthcheck
3. Wait for it to become healthy
4. Delete the healthcheck file ("cut the cord") for the old container
5. Wait for it to become unhealthy and give Traefik a couple of seconds
to notice
6. Stop the old container
The extra steps ensure that Traefik stops sending requests before the
old container is shutdown.
Setting env variables in the docker arguments requires having them on
the deploy host.
Instead we'll add two new commands `kamal env push` and
`kamal env delete` which will manage copying the environment as .env
files to the remote host.
Docker will pick up the file with `--env-file <path-to-file>`. Env files
will be stored under `<kamal run directory>/env`.
Running `kamal env push` will create env files for each role and
accessory, and traefik if required.
`kamal envify` has been updated to also push the env files.
By avoiding using `kamal envify` and creating the local and remote
secrets manually, you can now avoid accessing secrets needed
for the docker runtime environment locally. You will still need build
secrets.
One thing to note - the Docker doesn't parse the environment variables
in the env file, one result of this is that you can't specify multi-line
values - see https://github.com/moby/moby/issues/12997.
We maybe need to look docker config or docker secrets longer term to get
around this.
Hattip to @kevinmcconnell - this was all his idea.
To avoid polluting the default SSH directory with lots of Kamal config,
we'll default to putting them in a `kamal` sub directory.
But also make the directory configurable with the `run_directory` key,
so for example you can set it as `/var/run/kamal/`
The directory is created during bootstrap or before any command that
will need to access a file.
Adds the `publish` option which, if set to false, does not pass `--publish` to
`docker run` when starting Traefik. This is useful when running Traefik
behind a reverse proxy, for example.
The version extraction assumed that the version is everything after the
last `-` in the container name. This doesn't work if you deploy a
non-MRSK generated version that contains a `-`.
To fix we'll generate the non version prefix and strip it off. In some
places for this to work we need to make sure to pass the role through.
Fixes: https://github.com/mrsked/mrsk/issues/402
To make it easier to identity where a docker container is running,
prefix its hostname with the underlying one from the host.
Docker chooses a 12 character random hex string by default, so we'll
keep that as the suffix.
If you different images with the same git SHA, on the second deploy the
tag is moved and the first image becomes untagged. It may however still
be attached to an existing container.
To handle this:
1. Initially prune dangling images - this will remove any untagged
images that are not attached to an existing image
2. Then filter out the untagged images when deleting tagged images - any
that remain will be attached to a container.
The second issue is that `docker container ls -a --format '{{.Image}}`
will sometimes return the image id rather than a tag. This means that
the image doesn't get filtered out when we grep to remove the active
images.
To fix that we'll grep against both the image id and repo:tag.
dangling=true doesn't prune any images, as we are not creating dangling
images.
Using --all should remove unused images, but it considers the Git SHA
tag on the latest image to be unused (presumably because there are two
tags, the SHA and latest and the running container is only considered to
be using "latest"). As a result it deletes the tag, which means that we
can't rollback to that SHA later.
Its a bit more complicated to only remove images that are not referenced
by any containers.
First we find the tags we want to keep from the containers (running and
stopped).
Then we append the latest tag to that list.
Then we get a full list of image tags and remove those tags from that
list (using `grep -v -w`).
Finally we pass the tags to `docker rmi`. That either deletes the tag if
there are other references to the image or both the tag and the image if
it is the only one.
These replace the custom audit_broadcast_cmd code. An additional env
variable MRSK_RUNTIME is passed to them.
The audit broadcast after booting an accessory has been removed.
Adds hooks to MRSK. Currently just two hooks, pre-build and post-push.
We could break the build and push into two separate commands if we
found the need for post-build and/or pre-push hooks.
Hooks are stored in `.mrsk/hooks`. Running `mrsk init` will now create
that folder and add sample hook scripts.
Hooks returning non-zero exit codes will abort the current command.
Further potential work here:
- We could replace the audit broadcast command with a
post-deploy/post-rollback hook or similar
- Maybe provide pre-command/post-command hooks that run after every
mrsk invocation
- Also look for hooks in `~/.mrsk/hooks`
In the image prune command --all overrides --dangling=true. This removes
the image git sha image tag for the latest image which prevented
us from rolling back to it.
I've updated the integration test to now test deploy, redeploy and
rollback.
Audit details
* Audit logs and broadcasts accept `details` whose values are included as log tags and MRSK_* env vars passed to the broadcast command
* Commands may return execution options to the CLI in their args list
* Introduce `mrsk broadcast` helper for sending audit broadcasts
* Report UTC time, not local time, in audit logs. Standardize on ISO 8601 format
* main:
Don't run actions twice on PRs
Further distinguish dependency verification
Naming
Reveal configured dockerfile path
Style
Distinguish from server dependencies
Distinguish from local dependency verification
Improve clarity and intent
Style
Style
Style
Add local dependencies check
Bootstrap: use multi-platform installer
* main:
Simplify domain language to just "boot" and unscoped config keys
Retain a fixed number of containers when pruning
Don't assume rolling back in message
Check all hosts before rolling back
Ensure Traefik service name is consistent
Extend traefik delay by 1 second
Include traefik access logs
Check if we are still getting a 404
Also dump load balancer logs
Dump traefik logs when app not booted
Fix missing for apt-get
Report on container health after failure
Fix the integration test healthcheck
Allow percentage-based rolling deployments
Move `group_limit` & `group_wait` under `boot`
Limit rolling deployment to boot operation
Allow performing boot & start operations in groups
* main:
Simplify domain language to just "boot" and unscoped config keys
Retain a fixed number of containers when pruning
Don't assume rolling back in message
Check all hosts before rolling back
Ensure Traefik service name is consistent
Extend traefik delay by 1 second
Include traefik access logs
Check if we are still getting a 404
Also dump load balancer logs
Dump traefik logs when app not booted
Fix missing for apt-get
Report on container health after failure
Fix the integration test healthcheck
Allow percentage-based rolling deployments
Move `group_limit` & `group_wait` under `boot`
Limit rolling deployment to boot operation
Allow performing boot & start operations in groups