kamal

Author	SHA1	Message	Date
Donal McBreen	5d33fb6c33	Better lock messages - Debug verbosity commands - Show lock status when we fail to acquire it - Include lock acquire/release in runtime	2023-05-09 14:17:58 +01:00
David Heinemeier Hansson	aafaee7ac8	Merge pull request #223 from basecamp/customizable-audit-broadcast Allow customizing audit broadcast with env	2023-05-05 14:30:04 +02:00
Donal McBreen	326711a3e0	Fix aggressive prune breaking rollback In the image prune command --all overrides --dangling=true. This removes the image git sha image tag for the latest image which prevented us from rolling back to it. I've updated the integration test to now test deploy, redeploy and rollback.	2023-05-05 12:13:14 +01:00
Kevin McConnell	82be521e66	Merge branch 'main' into customizable-audit-broadcast * main: Fix staging label bug Fix typo Capture container health log when unhealthy Bump version for 0.12.0	2023-05-05 11:40:29 +01:00
Jberczel	0e19ead37c	Capture container health log when unhealthy	2023-05-03 15:03:05 -04:00
Jeremy Daer	048aecf352	Audit details (#1 ) Audit details * Audit logs and broadcasts accept `details` whose values are included as log tags and MRSK_* env vars passed to the broadcast command * Commands may return execution options to the CLI in their args list * Introduce `mrsk broadcast` helper for sending audit broadcasts * Report UTC time, not local time, in audit logs. Standardize on ISO 8601 format	2023-05-02 11:42:05 -07:00
David Heinemeier Hansson	b7877c59b4	Merge branch 'main' into docker-readiness	2023-05-02 14:30:35 +02:00
David Heinemeier Hansson	35b5b317af	Merge branch 'main' into pr/205 * main: Simplify domain language to just "boot" and unscoped config keys Retain a fixed number of containers when pruning Don't assume rolling back in message Check all hosts before rolling back Ensure Traefik service name is consistent Extend traefik delay by 1 second Include traefik access logs Check if we are still getting a 404 Also dump load balancer logs Dump traefik logs when app not booted Fix missing for apt-get Report on container health after failure Fix the integration test healthcheck Allow percentage-based rolling deployments Move `group_limit` & `group_wait` under `boot` Limit rolling deployment to boot operation Allow performing boot & start operations in groups	2023-05-02 14:29:06 +02:00
David Heinemeier Hansson	4c448f7eb1	Merge pull request #256 from Jberczel/check-local-dependencies Add local dependencies check	2023-05-02 14:13:23 +02:00
David Heinemeier Hansson	263a24afe3	Further distinguish dependency verification	2023-05-02 14:09:10 +02:00
David Heinemeier Hansson	a2d99e48bf	Naming	2023-05-02 14:08:29 +02:00
David Heinemeier Hansson	ae2effb80c	Improve clarity and intent	2023-05-02 14:04:23 +02:00
David Heinemeier Hansson	8854bb63a1	Merge pull request #254 from basecamp/retain-last-5-containers Retain a fixed number of containers when pruning	2023-05-02 13:16:49 +02:00
David Heinemeier Hansson	35ea9f3c81	Merge pull request #255 from basecamp/check-all-hosts-for-rollback-container Check all hosts before rolling back	2023-05-02 13:16:03 +02:00
David Heinemeier Hansson	71bc9bcf54	Merge pull request #222 from basecamp/deploy-groups Allow booting containers in groups for rolling restarts	2023-05-02 13:14:32 +02:00
David Heinemeier Hansson	c83b74dcb7	Simplify domain language to just "boot" and unscoped config keys	2023-05-02 13:11:31 +02:00
Donal McBreen	971a91da15	Retain a fixed number of containers when pruning Time based container and image retention can have variable space requirements depending on how often we deploy. - Only prune stopped containers, retaining the 5 newest - Then prune dangling images so we only keep images for the retained containers.	2023-05-02 10:15:08 +01:00
Donal McBreen	7fe24d5048	Check all hosts before rolling back Hosts could end up out of sync with each other if prune commands are run manually or when new hosts are added. Before rolling back confirm that the required container is available on all hosts and roles.	2023-05-02 10:14:50 +01:00
Jberczel	bfb70b2118	Add local dependencies check Add checks for: * Docker installed locally * Docker buildx plugin installed locally * Dockerfile exists If checks fail, it will halt deployment and provide more specific error messages. Also adds a cli subcommand: `mrsk build dependencies` Fixes: #109 and #237	2023-05-01 16:32:41 -04:00
Jeremy Daer	e85bd5ff63	Bootstrap: use multi-platform installer * Limit auto-install to root users; otherwise, give manual install guidance * Support non-Debian/Ubuntu with the multi-OS get.docker.com installer	2023-05-01 13:26:00 -07:00
David Heinemeier Hansson	4fa6a6c06d	Merge pull request #219 from basecamp/docker-health-checks	2023-04-28 11:43:33 +02:00
Donal McBreen	cd668066ff	Get lock status by executing directly Getting the lock status with invoke passes through any options from the original command which will raise an exception if they are not also valid for the lock status command. Fixes https://github.com/mrsked/mrsk/issues/239	2023-04-25 16:57:02 +01:00
Donal McBreen	52ca5b846a	Wait for healthy containers in integration test Rather than waiting 5 seconds and hoping for the best after we boot docker compose, add docker healthchecks and wait for all the containers to be healthy.	2023-04-25 15:41:25 +01:00
Kevin McConnell	f055766918	Allow percentage-based rolling deployments	2023-04-14 12:46:14 +01:00
Kevin McConnell	df202d6ef4	Move health checks into Docker Replaces our current host-based HTTP healthchecks with Docker healthchecks, and adds a new `healthcheck.cmd` config option that can be used to define a custom health check command. Also removes Traefik's healthchecks, since they are no longer necessary. When deploying a container that has a healthcheck defined, we wait for it to report a healthy status before stopping the old container that it replaces. Containers that don't have a healthcheck defined continue to wait for `MRSK.config.readiness_delay`. There are some pros and cons to using Docker healthchecks rather than checking from the host. The main advantages are: - Supports non-HTTP checks, and app-specific check scripts provided by a container. - When booting a container, allows MRSK to wait for a container to be healthy before shutting down the old container it replaces. This should be safer than relying on a timeout. - Containers with healthchecks won't be active in Traefik until they reach a healthy state, which prevents any traffic from being routed to them before they are ready. The main _disadvantage_ is that containers are now required to provide some way to check their health. Our default check assumes that `curl` is available in the container which, while common, won't always be the case.	2023-04-13 16:08:43 +01:00
Kevin McConnell	f530009a6e	Allow performing boot & start operations in groups Adds top-level configuration options for `group_limit` and `group_wait`. When a `group_limit` is present, we'll perform app boot & start operations on no more than `group_limit` hosts at a time, optionally sleeping for `group_wait` seconds after each batch. We currently only do this batching on boot & start operations (including when they are part of a deployment). Other commands, like `app stop` or `app details` still work on all hosts in parallel.	2023-04-13 15:58:27 +01:00
Jacopo	9ddb181f50	Merge branch 'main' into cleanup-excessive-containers-running * main: Pull the primary host from the role Minimise holding the deploy lock	2023-04-12 15:19:19 +02:00
Donal McBreen	051556674f	Minimise holding the deploy lock If we get an error we'll only hold the deploy lock if it occurs while trying to switch the running containers. We'll also move tagging the latest image from when the image is pulled to just before the container switch. This ensures that earlier errors don't leave the hosts with an updated latest tag while still running the older version.	2023-04-12 12:09:56 +01:00
Jacopo	5ed431b807	Merge branch 'main' into cleanup-excessive-containers-running * main: (24 commits) Bump version for 0.11.0 Labels can be added to Traefik Make rollbacks role-aware fix typo role to roles Explained the latest modifications of Traefik container labels Remove .idea folder Updated README.md with new healthcheck.max_attempts option Fix test case: console output message was not updated to display the current/total attempts Require net-ssh ~> 7.0 for SHA-2 support Improved deploy lock acquisition Excess CR Style Simpler Make it explicit, focus on Ubuntu More explicit Not that --bundle is a Rails 7+ option Update README.md Update README.md Improved: configurable max_attempts for healthcheck Traefik service name to be derived from role and destination ...	2023-04-12 11:52:47 +02:00
Donal McBreen	43f7409de0	Make rollbacks role-aware Rollbacks stopped working after https://github.com/mrsked/mrsk/pull/99. We'll confirm that a container is available for the first role on the primary host before attempting to rollback.	2023-04-12 09:59:39 +01:00
David Heinemeier Hansson	daa0c9b5be	Merge pull request #196 from handy-la/main Configurable max_attempts for healthcheck	2023-04-11 14:17:17 +02:00
Jacopo	03d933d10b	Add Role to the message	2023-04-11 10:59:25 +02:00
Jacopo	579b4cd9aa	Simplify By using and ad-hoc command to detect and stop stale containers. By default stale containers are only detected.	2023-04-11 10:22:03 +02:00
Jacopo	8ae5331d97	Boot stop all the old containers	2023-04-11 08:53:33 +02:00
Jacopo	4d47fbdf41	Merge stop and stop_stale_containers	2023-04-11 08:53:33 +02:00
Jacopo	e980f1164e	Avoid using GNU-only Perl Regepx Grep	2023-04-11 08:53:33 +02:00
Jacopo	e2f6db5cae	Clear stale containers By stopping all the older containers with matching /#{service}-#{role}-#{dest}-.*/ running on the same host.	2023-04-11 08:53:33 +02:00
Arturo Ojeda	514b2aa243	Fix test case: console output message was not updated to display the current/total attempts	2023-04-10 09:29:19 -06:00
Donal McBreen	c4df440c79	Improved deploy lock acquisition 1. Don't raise lock error for non-lock issues during lock acquire (see https://github.com/mrsked/mrsk/pull/181) 2. If there is an error while the lock is held, don't release the lock and send a warning to stderr	2023-04-10 15:23:00 +01:00
Jeremy Daer	bd8f13dd5e	Traefik image config for version pinning, upgrades, and custom images Accounts for the 2.9.10 security release and allows testing Traefik 3 betas. * Use `image` to configure a specific Traefik Docker image. * Default to `traefik:v2.9` to track future 2.9.x minor releases rather than tightly pinning to `v2.9.9`. * Support images from the configured registry. References #165	2023-04-07 14:15:25 -07:00
David Heinemeier Hansson	1f83b5f6be	Fix failure to pass on class options to subcommands	2023-03-28 18:04:16 +02:00
Kevin McConnell	2957388bf6	Pin Traefik to v2.9.9	2023-03-28 14:59:03 +01:00
David Heinemeier Hansson	7f178101f7	Merge pull request #164 from basecamp/accessory-hosts-or-roles Run accessories on multiple hosts or roles	2023-03-28 14:31:24 +02:00
Donal McBreen	c06585fef4	Daemon/host/role accessories Allow the hosts for accessories to be specified by host or role, or on all app hosts by setting `daemon: true`. ``` # Single host mysql: host: 1.1.1.1 # Multiple hosts redis: hosts: - 1.1.1.1 - 1.1.1.2 # By role monitoring: roles: - web - jobs ```	2023-03-28 13:26:27 +01:00
milk1000cc	03614bfb79	Rolify app logs cli/command	2023-03-27 23:08:46 +09:00
Donal McBreen	05488e4c1e	Zero downtime redeploys When deploying check if there is already a container with the existing name. If there is rename it to "<version>_<random_hex_string>" to remove the name clash with the new container we want to boot. We can then do the normal zero downtime run/wait/stop. While implementing this I discovered the --filter name=foo does a substring match for foo, so I've updated those filters to do an exact match instead.	2023-03-24 17:09:20 +00:00
David Heinemeier Hansson	84540cee7b	Merge branch 'main' into pr/154 * main: (32 commits) Inline default as with other options Symbols! Fix tests test stop with custom stop wait time No need to replicate Docker default Describe purpose rather than elements Style and ordering Customizable stop wait time Fix tests Ensure it also works when configuring just log options without setting a driver Add accessory test Undo change Improve test Update README Ensure default log option `max-size=10m` #142 Allow to customize container options in accessories Fix flaky test Fix tests More resilient tests Fix other tests ...	2023-03-24 15:43:17 +01:00
Samuel Sieg	bc64a07a95	Merge branch 'main' into global-logging-config	2023-03-24 15:24:06 +01:00
Samuel Sieg	86e99fb079	Merge branch 'main' into global-logging-config	2023-03-24 14:40:27 +01:00
David Heinemeier Hansson	494e29d672	Fix tests	2023-03-24 14:35:17 +01:00

... 3 4 5 6 7

321 Commits