kamal

Author	SHA1	Message	Date
Donal McBreen	db0bf6bb16	Add a pre-deploy hook Useful for checking the status of CI before deploying. Doing this at this point in the deployment maximises the parallelisation of building and running CI.	2023-05-29 16:06:41 +01:00
Donal McBreen	ff7a1e6726	Prune unused images correctly dangling=true doesn't prune any images, as we are not creating dangling images. Using --all should remove unused images, but it considers the Git SHA tag on the latest image to be unused (presumably because there are two tags, the SHA and latest, and the running container is only considered to be using "latest"). As a result it deletes the tag, which means that we can't roll back to that SHA later. It's a bit more complicated to only remove images that are not referenced by any containers. First we find the tags we want to keep from the containers (running and stopped). Then we append the latest tag to that list. Then we get a full list of image tags and remove those tags from that list (using `grep -v -w`). Finally we pass the tags to `docker rmi`. That either deletes the tag if there are other references to the image or both the tag and the image if it is the only one.	2023-05-25 17:16:46 +01:00
David Heinemeier Hansson	e35334e5fe	Merge pull request #313 from basecamp/stop-restarting-containers Stop containers with restarting status	2023-05-25 14:04:09 +02:00
Donal McBreen	cedb8d900f	Stop containers with restarting status When stopping the old container we need to also look for ones with a restarting status.	2023-05-25 12:10:26 +01:00
Donal McBreen	66f9ce0e90	Add a pre-connect hook This can be used for hooks that should run before connecting to remote hosts. An example use case is pre-warming DNS.	2023-05-24 14:39:30 +01:00
Donal McBreen	19f0f40adf	Add skip_hooks option	2023-05-23 15:56:47 +01:00
Donal McBreen	f9cb87e55a	Fixup rebase issues	2023-05-23 14:10:38 +01:00
Donal McBreen	cc2b321d93	Combine post-deploy and post-rollback	2023-05-23 13:57:24 +01:00
Donal McBreen	004f1b04e6	Remove the skip_broadcast option	2023-05-23 13:57:00 +01:00
Donal McBreen	9fd184dc32	Add post-deploy and post-rollback hooks These replace the custom audit_broadcast_cmd code. An additional env variable MRSK_RUNTIME is passed to them. The audit broadcast after booting an accessory has been removed.	2023-05-23 13:56:16 +01:00
Donal McBreen	38023fe538	Remove post push hook	2023-05-23 13:55:05 +01:00
Donal McBreen	58c1096a90	MRSK hooks Adds hooks to MRSK. Currently just two hooks, pre-build and post-push. We could break the build and push into two separate commands if we found the need for post-build and/or pre-push hooks. Hooks are stored in `.mrsk/hooks`. Running `mrsk init` will now create that folder and add sample hook scripts. Hooks returning non-zero exit codes will abort the current command. Further potential work here: - We could replace the audit broadcast command with a post-deploy/post-rollback hook or similar - Maybe provide pre-command/post-command hooks that run after every mrsk invocation - Also look for hooks in `~/.mrsk/hooks`	2023-05-23 13:55:04 +01:00
Donal McBreen	340ed94fa9	Make verify_local_dependencies private We don't need to know what it returns, it raises if there is a problem. Move it out of the run_locally block to make it easier to add hooks.	2023-05-23 13:55:04 +01:00
David Heinemeier Hansson	4e9c39f26d	Merge pull request #271 from basecamp/app-boot-for-rollback Call app:boot to rollback	2023-05-23 13:17:30 +02:00
Donal McBreen	7cd25fd163	Add more integration tests Add tests for main, app, accessory, traefik and lock commands. Other commands are generally covered by the main tests. Also adds some changes to speed up the integration specs: - Use a persistent volume for the registry so we can push images to reuse between runs (also gets around docker hub rate limits) - Use persistent volume for mrsk gem install, to avoid re-installing between tests - Shorter stop wait time - Shorter connection timeouts on the load balancer Takes just over 2 minutes to run all tests locally on an M1 Mac after docker caches are primed.	2023-05-16 10:35:35 +01:00
Donal McBreen	ee25f200d7	Call app:boot to rollback The code in Mrsk::Cli::Main#rollback was very similar to Mrsk::Cli::App#boot. Modify Mrsk::Cli::App#boot so it can handle rollbacks by: 1. Only renaming running containers 2. Trying first to start then run the new container	2023-05-16 08:59:07 +01:00
Donal McBreen	a5ef1f254f	Highlight uncommitted changes in version If there are uncommitted changes in the app repository when building, then append `_uncommitted_<random>` to it to distinguish the image from one built from a clean checkout. Also change the version used when renaming a container on redeploy to distinguish and explain the version suffixes.	2023-05-12 11:08:48 +01:00
Donal McBreen	5d33fb6c33	Better lock messages - Debug verbosity commands - Show lock status when we fail to acquire it - Include lock acquire/release in runtime	2023-05-09 14:17:58 +01:00
David Heinemeier Hansson	aafaee7ac8	Merge pull request #223 from basecamp/customizable-audit-broadcast Allow customizing audit broadcast with env	2023-05-05 14:30:04 +02:00
Donal McBreen	326711a3e0	Fix aggressive prune breaking rollback In the image prune command --all overrides --dangling=true. This removes the Git SHA tag for the latest image, which prevented us from rolling back to it. I've updated the integration test to now test deploy, redeploy and rollback.	2023-05-05 12:13:14 +01:00
Kevin McConnell	82be521e66	Merge branch 'main' into customizable-audit-broadcast * main: Fix staging label bug Fix typo Capture container health log when unhealthy Bump version for 0.12.0	2023-05-05 11:40:29 +01:00
Jberczel	0e19ead37c	Capture container health log when unhealthy	2023-05-03 15:03:05 -04:00
Jeremy Daer	048aecf352	Audit details (#1 ) Audit details * Audit logs and broadcasts accept `details` whose values are included as log tags and MRSK_* env vars passed to the broadcast command * Commands may return execution options to the CLI in their args list * Introduce `mrsk broadcast` helper for sending audit broadcasts * Report UTC time, not local time, in audit logs. Standardize on ISO 8601 format	2023-05-02 11:42:05 -07:00
David Heinemeier Hansson	b7877c59b4	Merge branch 'main' into docker-readiness	2023-05-02 14:30:35 +02:00
David Heinemeier Hansson	35b5b317af	Merge branch 'main' into pr/205 * main: Simplify domain language to just "boot" and unscoped config keys Retain a fixed number of containers when pruning Don't assume rolling back in message Check all hosts before rolling back Ensure Traefik service name is consistent Extend traefik delay by 1 second Include traefik access logs Check if we are still getting a 404 Also dump load balancer logs Dump traefik logs when app not booted Fix missing for apt-get Report on container health after failure Fix the integration test healthcheck Allow percentage-based rolling deployments Move `group_limit` & `group_wait` under `boot` Limit rolling deployment to boot operation Allow performing boot & start operations in groups	2023-05-02 14:29:06 +02:00
David Heinemeier Hansson	4c448f7eb1	Merge pull request #256 from Jberczel/check-local-dependencies Add local dependencies check	2023-05-02 14:13:23 +02:00
David Heinemeier Hansson	263a24afe3	Further distinguish dependency verification	2023-05-02 14:09:10 +02:00
David Heinemeier Hansson	a2d99e48bf	Naming	2023-05-02 14:08:29 +02:00
David Heinemeier Hansson	ae2effb80c	Improve clarity and intent	2023-05-02 14:04:23 +02:00
David Heinemeier Hansson	8854bb63a1	Merge pull request #254 from basecamp/retain-last-5-containers Retain a fixed number of containers when pruning	2023-05-02 13:16:49 +02:00
David Heinemeier Hansson	35ea9f3c81	Merge pull request #255 from basecamp/check-all-hosts-for-rollback-container Check all hosts before rolling back	2023-05-02 13:16:03 +02:00
David Heinemeier Hansson	71bc9bcf54	Merge pull request #222 from basecamp/deploy-groups Allow booting containers in groups for rolling restarts	2023-05-02 13:14:32 +02:00
David Heinemeier Hansson	c83b74dcb7	Simplify domain language to just "boot" and unscoped config keys	2023-05-02 13:11:31 +02:00
Donal McBreen	971a91da15	Retain a fixed number of containers when pruning Time based container and image retention can have variable space requirements depending on how often we deploy. - Only prune stopped containers, retaining the 5 newest - Then prune dangling images so we only keep images for the retained containers.	2023-05-02 10:15:08 +01:00
Donal McBreen	7fe24d5048	Check all hosts before rolling back Hosts could end up out of sync with each other if prune commands are run manually or when new hosts are added. Before rolling back confirm that the required container is available on all hosts and roles.	2023-05-02 10:14:50 +01:00
Jberczel	bfb70b2118	Add local dependencies check Add checks for: * Docker installed locally * Docker buildx plugin installed locally * Dockerfile exists If checks fail, it will halt deployment and provide more specific error messages. Also adds a cli subcommand: `mrsk build dependencies` Fixes: #109 and #237	2023-05-01 16:32:41 -04:00
Jeremy Daer	e85bd5ff63	Bootstrap: use multi-platform installer * Limit auto-install to root users; otherwise, give manual install guidance * Support non-Debian/Ubuntu with the multi-OS get.docker.com installer	2023-05-01 13:26:00 -07:00
David Heinemeier Hansson	4fa6a6c06d	Merge pull request #219 from basecamp/docker-health-checks	2023-04-28 11:43:33 +02:00
Donal McBreen	cd668066ff	Get lock status by executing directly Getting the lock status with invoke passes through any options from the original command which will raise an exception if they are not also valid for the lock status command. Fixes https://github.com/mrsked/mrsk/issues/239	2023-04-25 16:57:02 +01:00
Donal McBreen	52ca5b846a	Wait for healthy containers in integration test Rather than waiting 5 seconds and hoping for the best after we boot docker compose, add docker healthchecks and wait for all the containers to be healthy.	2023-04-25 15:41:25 +01:00
Kevin McConnell	f055766918	Allow percentage-based rolling deployments	2023-04-14 12:46:14 +01:00
Kevin McConnell	df202d6ef4	Move health checks into Docker Replaces our current host-based HTTP healthchecks with Docker healthchecks, and adds a new `healthcheck.cmd` config option that can be used to define a custom health check command. Also removes Traefik's healthchecks, since they are no longer necessary. When deploying a container that has a healthcheck defined, we wait for it to report a healthy status before stopping the old container that it replaces. Containers that don't have a healthcheck defined continue to wait for `MRSK.config.readiness_delay`. There are some pros and cons to using Docker healthchecks rather than checking from the host. The main advantages are: - Supports non-HTTP checks, and app-specific check scripts provided by a container. - When booting a container, allows MRSK to wait for a container to be healthy before shutting down the old container it replaces. This should be safer than relying on a timeout. - Containers with healthchecks won't be active in Traefik until they reach a healthy state, which prevents any traffic from being routed to them before they are ready. The main _disadvantage_ is that containers are now required to provide some way to check their health. Our default check assumes that `curl` is available in the container which, while common, won't always be the case.	2023-04-13 16:08:43 +01:00
Kevin McConnell	f530009a6e	Allow performing boot & start operations in groups Adds top-level configuration options for `group_limit` and `group_wait`. When a `group_limit` is present, we'll perform app boot & start operations on no more than `group_limit` hosts at a time, optionally sleeping for `group_wait` seconds after each batch. We currently only do this batching on boot & start operations (including when they are part of a deployment). Other commands, like `app stop` or `app details` still work on all hosts in parallel.	2023-04-13 15:58:27 +01:00
Jacopo	9ddb181f50	Merge branch 'main' into cleanup-excessive-containers-running * main: Pull the primary host from the role Minimise holding the deploy lock	2023-04-12 15:19:19 +02:00
Donal McBreen	051556674f	Minimise holding the deploy lock If we get an error we'll only hold the deploy lock if it occurs while trying to switch the running containers. We'll also move tagging the latest image from when the image is pulled to just before the container switch. This ensures that earlier errors don't leave the hosts with an updated latest tag while still running the older version.	2023-04-12 12:09:56 +01:00
Jacopo	5ed431b807	Merge branch 'main' into cleanup-excessive-containers-running * main: (24 commits) Bump version for 0.11.0 Labels can be added to Traefik Make rollbacks role-aware fix typo role to roles Explained the latest modifications of Traefik container labels Remove .idea folder Updated README.md with new healthcheck.max_attempts option Fix test case: console output message was not updated to display the current/total attempts Require net-ssh ~> 7.0 for SHA-2 support Improved deploy lock acquisition Excess CR Style Simpler Make it explicit, focus on Ubuntu More explicit Not that --bundle is a Rails 7+ option Update README.md Update README.md Improved: configurable max_attempts for healthcheck Traefik service name to be derived from role and destination ...	2023-04-12 11:52:47 +02:00
Donal McBreen	43f7409de0	Make rollbacks role-aware Rollbacks stopped working after https://github.com/mrsked/mrsk/pull/99. We'll confirm that a container is available for the first role on the primary host before attempting to rollback.	2023-04-12 09:59:39 +01:00
David Heinemeier Hansson	daa0c9b5be	Merge pull request #196 from handy-la/main Configurable max_attempts for healthcheck	2023-04-11 14:17:17 +02:00
Jacopo	03d933d10b	Add Role to the message	2023-04-11 10:59:25 +02:00
Jacopo	579b4cd9aa	Simplify By using an ad-hoc command to detect and stop stale containers. By default stale containers are only detected.	2023-04-11 10:22:03 +02:00
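
Commit ff7a1e6726 above describes the tag selection behind the image prune: collect the tags referenced by containers (running and stopped), add the latest tag, and remove everything else. The commit itself does this with Docker commands and `grep -v -w`; the following is only a minimal Python sketch of the same selection logic, with exact string matching standing in for the whole-word grep. The function name `tags_to_remove` and the example tags are hypothetical and do not come from the repository.

```python
def tags_to_remove(all_tags, container_tags, latest_tag):
    """Return the image tags that no container references.

    all_tags:       every image tag present on the host
    container_tags: tags used by running and stopped containers
    latest_tag:     the "latest" tag, which is always kept
    """
    # Tags to keep: anything a container points at, plus latest.
    keep = set(container_tags) | {latest_tag}
    # Everything else would be handed to `docker rmi`.
    return [tag for tag in all_tags if tag not in keep]


# Hypothetical example data, not taken from a real host.
all_tags = ["app:abc123", "app:def456", "app:latest", "app:old789"]
container_tags = ["app:abc123", "app:def456"]
print(tags_to_remove(all_tags, container_tags, "app:latest"))
# → ['app:old789']
```

Because the SHA tag of the running container is in the keep set, it survives the prune, which is what makes a later rollback to that SHA possible.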


238 Commits
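
Commit f530009a6e in the log above describes batching boot and start operations: at most `group_limit` hosts at a time, optionally sleeping for `group_wait` seconds after each batch. The following is a minimal Python sketch of that batching, not MRSK's actual code (MRSK is written in Ruby). The names `in_groups` and `boot_in_groups`, the `boot` callback, and the injectable `sleep` parameter are all hypothetical. Hosts within a batch are processed sequentially here for simplicity, and the percentage-based limits from commit f055766918 are not modelled.

```python
import time


def in_groups(hosts, group_limit=None):
    """Split hosts into batches of at most group_limit hosts.

    Without a group_limit, all hosts form a single batch.
    """
    size = group_limit or len(hosts) or 1
    return [hosts[i:i + size] for i in range(0, len(hosts), size)]


def boot_in_groups(hosts, boot, group_limit=None, group_wait=0, sleep=time.sleep):
    """Call boot(host) batch by batch, sleeping after each batch.

    boot and sleep are passed in so the sketch can be exercised
    without real hosts or real delays.
    """
    for batch in in_groups(hosts, group_limit):
        for host in batch:
            boot(host)
        if group_wait:
            sleep(group_wait)
```

For example, `boot_in_groups(["web-1", "web-2", "web-3"], print, group_limit=2, group_wait=1)` would handle `web-1` and `web-2`, pause for one second, then handle `web-3` and pause again.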