Audit details
* Audit logs and broadcasts accept `details` whose values are included as log tags and MRSK_* env vars passed to the broadcast command
* Commands may return execution options to the CLI in their args list
* Introduce `mrsk broadcast` helper for sending audit broadcasts
* Report UTC time, not local time, in audit logs. Standardize on ISO 8601 format
* main:
Simplify domain language to just "boot" and unscoped config keys
Retain a fixed number of containers when pruning
Don't assume rolling back in message
Check all hosts before rolling back
Ensure Traefik service name is consistent
Extend traefik delay by 1 second
Include traefik access logs
Check if we are still getting a 404
Also dump load balancer logs
Dump traefik logs when app not booted
Fix missing for apt-get
Report on container health after failure
Fix the integration test healthcheck
Allow percentage-based rolling deployments
Move `group_limit` & `group_wait` under `boot`
Limit rolling deployment to boot operation
Allow performing boot & start operations in groups
When invoking the audit broadcast command, provide a few environment
variables so that people can customize the format of the message if they
want.
We currently provide `MRSK_PERFORMER`, `MRSK_ROLE`, `MRSK_DESTINATION` and
`MRSK_EVENT`.
Also adds the destination to the default message, which we continue to
send as the first argument as before.
Replaces our current host-based HTTP healthchecks with Docker
healthchecks, and adds a new `healthcheck.cmd` config option that can be
used to define a custom health check command. Also removes Traefik's
healthchecks, since they are no longer necessary.
When deploying a container that has a healthcheck defined, we wait for
it to report a healthy status before stopping the old container that it
replaces. Containers that don't have a healthcheck defined continue to
wait for `MRSK.config.readiness_delay`.
There are some pros and cons to using Docker healthchecks rather than
checking from the host. The main advantages are:
- Supports non-HTTP checks, and app-specific check scripts provided by a
container.
- When booting a container, allows MRSK to wait for a container to be
healthy before shutting down the old container it replaces. This
should be safer than relying on a timeout.
- Containers with healthchecks won't be active in Traefik until they
reach a healthy state, which prevents any traffic from being routed to
them before they are ready.
The main _disadvantage_ is that containers are now required to provide
some way to check their health. Our default check assumes that `curl` is
available in the container which, while common, won't always be the
case.
Adds top-level configuration options for `group_limit` and `group_wait`.
When a `group_limit` is present, we'll perform app boot & start
operations on no more than `group_limit` hosts at a time, optionally
sleeping for `group_wait` seconds after each batch.
We currently only do this batching on boot & start operations (including
when they are part of a deployment). Other commands, like `app stop` or
`app details` still work on all hosts in parallel.
* main:
Simpler
Make it explicit, focus on Ubuntu
More explicit
Not that --bundle is a Rails 7+ option
Update README.md
Update README.md
Add github discussions link to readme
Bump debug to fix missing deps in CI
Only redact the non-sensitive bits of build args and env vars.
improve code sample (traefik configuration)
I realize that there's a discussions link on github but I didn't realize mrsk actually utilized it until I saw it mentioned on Discord. I was thinking adding it to the readme would help push people there.
Accounts for the 2.9.10 security release and allows testing Traefik 3 betas.
* Use `image` to configure a specific Traefik Docker image.
* Default to `traefik:v2.9` to track future 2.9.x minor releases rather
than tightly pinning to `v2.9.9`.
* Support images from the configured registry.
References #165
Allow the hosts for accessories to be specified by host or role, or on
all app hosts by setting `daemon: true`.
```
# Single host
mysql:
host: 1.1.1.1
# Multiple hosts
redis:
hosts:
- 1.1.1.1
- 1.1.1.2
# By role
monitoring:
roles:
- web
- jobs
```
* main: (32 commits)
Inline default as with other options
Symbols!
Fix tests
test stop with custom stop wait time
No need to replicate Docker default
Describe purpose rather than elements
Style and ordering
Customizable stop wait time
Fix tests
Ensure it also works when configuring just log options without setting a driver
Add accessory test
Undo change
Improve test
Update README
Ensure default log option `max-size=10m`
#142 Allow to customize container options in accessories
Fix flaky test
Fix tests
More resilient tests
Fix other tests
...