If there are uncommitted changes in the app repository when building,
then append `_uncommitted_<random>` to it to distinguish the image
from one built from a clean checkout.
Also change the version used when renaming a container on redeploy to
distinguish and explain the version suffixes.
Audit details
* Audit logs and broadcasts accept `details` whose values are included as log tags and MRSK_* env vars passed to the broadcast command
* Commands may return execution options to the CLI in their args list
* Introduce `mrsk broadcast` helper for sending audit broadcasts
* Report UTC time, not local time, in audit logs. Standardize on ISO 8601 format
Replaces our current host-based HTTP healthchecks with Docker
healthchecks, and adds a new `healthcheck.cmd` config option that can be
used to define a custom health check command. Also removes Traefik's
healthchecks, since they are no longer necessary.
When deploying a container that has a healthcheck defined, we wait for
it to report a healthy status before stopping the old container that it
replaces. Containers that don't have a healthcheck defined continue to
wait for `MRSK.config.readiness_delay`.
There are some pros and cons to using Docker healthchecks rather than
checking from the host. The main advantages are:
- Supports non-HTTP checks, and app-specific check scripts provided by a
container.
- When booting a container, allows MRSK to wait for a container to be
healthy before shutting down the old container it replaces. This
should be safer than relying on a timeout.
- Containers with healthchecks won't be active in Traefik until they
reach a healthy state, which prevents any traffic from being routed to
them before they are ready.
The main _disadvantage_ is that containers are now required to provide
some way to check their health. Our default check assumes that `curl` is
available in the container which, while common, won't always be the
case.
Adds top-level configuration options for `group_limit` and `group_wait`.
When a `group_limit` is present, we'll perform app boot & start
operations on no more than `group_limit` hosts at a time, optionally
sleeping for `group_wait` seconds after each batch.
We currently only do this batching on boot & start operations (including
when they are part of a deployment). Other commands, like `app stop` or
`app details` still work on all hosts in parallel.
If we get an error we'll only hold the deploy lock if it occurs while
trying to switch the running containers.
We'll also move tagging the latest image from when the image is pulled
to just before the container switch. This ensures that earlier errors
don't leave the hosts with an updated latest tag while still running the
older version.
When deploying check if there is already a container with the existing
name. If there is rename it to "<version>_<random_hex_string>" to remove
the name clash with the new container we want to boot.
We can then do the normal zero downtime run/wait/stop.
While implementing this I discovered the --filter name=foo does a
substring match for foo, so I've updated those filters to do an exact
match instead.
* main: (32 commits)
Inline default as with other options
Symbols!
Fix tests
test stop with custom stop wait time
No need to replicate Docker default
Describe purpose rather than elements
Style and ordering
Customizable stop wait time
Fix tests
Ensure it also works when configuring just log options without setting a driver
Add accessory test
Undo change
Improve test
Update README
Ensure default log option `max-size=10m`
#142 Allow to customize container options in accessories
Fix flaky test
Fix tests
More resilient tests
Fix other tests
...
* main:
Wording
Remove accessory images using tags rather than labels
Update readme to point to ghcr.io/mrsked/mrsk
Validate that all roles have hosts
Commander needn't accumulate configuration
Pull latest image tag, so we can identity it
Default to deploying the config version
Remove unneeded Dockerfile.dind, update Readme
add D-in-D dockerfile, update Readme
Add a deploy lock for commands that are unsafe to run concurrently.
The lock is taken by creating a `mrsk_lock` directory on the primary
host. Details of who took the lock are added to a details file in that
directory.
Additional CLI commands have been added to manual release and acquire
the lock and to check its status.
```
Commands:
mrsk lock acquire -m, --message=MESSAGE # Acquire the deploy lock
mrsk lock help [COMMAND] # Describe subcommands or one specific subcommand
mrsk lock release # Release the deploy lock
mrsk lock status # Report lock status
Options:
-v, [--verbose], [--no-verbose] # Detailed logging
-q, [--quiet], [--no-quiet] # Minimal logging
[--version=VERSION] # Run commands against a specific app version
-p, [--primary], [--no-primary] # Run commands only on primary host instead of all
-h, [--hosts=HOSTS] # Run commands on these hosts instead of all (separate by comma)
-r, [--roles=ROLES] # Run commands on these roles instead of all (separate by comma)
-c, [--config-file=CONFIG_FILE] # Path to config file
# Default: config/deploy.yml
-d, [--destination=DESTINATION] # Specify destination to be used for config file (staging -> deploy.staging.yml)
-B, [--skip-broadcast], [--no-skip-broadcast] # Skip audit broadcasts
```
If we add support for running multiple deployments on a single server
we'll need to extend the locking to lock per deployment.
Commander had version/destination solely to incrementally accumulate CLI
options. Simpler to configure in one shot.
Clarifies responsibility and lets us introduce things like
`abbreviated_version` in one spot - Configuration.
`docker image ls` doesn't tell us what the latest deployed image is (e.g
if we've rolled back). Pull the latest image tag through to the server
so we can use it instead.
Match word
Language
Suggest what accessories are
There are also accessories
Default already shown
Better example
Warn about secrets being shown
Now also accessories
Wording
Clarifications
Clarify how to see options
General option for all
Options important here too
Hide subcommands
Implied
Simpler as just version
Be concise
Missing word
Wordsmith
Simpler and uniform words are better
Clarify what exactly we're manipulating
Wordsmithing
Implicit
Simpler language
Hide subcommands
Clarify its container management
Just one per server
Simpler