kamal

Author	SHA1	Message	Date
Donal McBreen	58c1096a90	MRSK hooks Adds hooks to MRSK. Currently just two hooks, pre-build and post-push. We could break the build and push into two separate commands if we found the need for post-build and/or pre-push hooks. Hooks are stored in `.mrsk/hooks`. Running `mrsk init` will now create that folder and add sample hook scripts. Hooks returning non-zero exit codes will abort the current command. Further potential work here: - We could replace the audit broadcast command with a post-deploy/post-rollback hook or similar - Maybe provide pre-command/post-command hooks that run after every mrsk invocation - Also look for hooks in `~/.mrsk/hooks`	2023-05-23 13:55:04 +01:00
Donal McBreen	340ed94fa9	Make verify_local_dependencies private We don't need to know what it returns; it raises if there is a problem. Move it out of the run_locally block to make it easier to add hooks.	2023-05-23 13:55:04 +01:00
David Heinemeier Hansson	4e9c39f26d	Merge pull request #271 from basecamp/app-boot-for-rollback Call app:boot to rollback	2023-05-23 13:17:30 +02:00
David Heinemeier Hansson	d08aacadac	Merge pull request #287 from Novtopro/traefik-inject-environment-variables Allow to inject environment variables to traefik	2023-05-22 10:10:34 +02:00
Donal McBreen	7cd25fd163	Add more integration tests Add tests for main, app, accessory, traefik and lock commands. Other commands are generally covered by the main tests. Also adds some changes to speed up the integration specs: - Use a persistent volume for the registry so we can push images to reuse between runs (also gets around docker hub rate limits) - Use persistent volume for mrsk gem install, to avoid re-installing between tests - Shorter stop wait time - Shorter connection timeouts on the load balancer Takes just over 2 minutes to run all tests locally on an M1 Mac after docker caches are primed.	2023-05-16 10:35:35 +01:00
Donal McBreen	ee25f200d7	Call app:boot to rollback The code in Mrsk::Cli::Main#rollback was very similar to Mrsk::Cli::App#boot. Modify Mrsk::Cli::App#boot so it can handle rollbacks by: 1. Only renaming running containers 2. Trying first to start then run the new container	2023-05-16 08:59:07 +01:00
Donal McBreen	a5ef1f254f	Highlight uncommitted changes in version If there are uncommitted changes in the app repository when building, then append `_uncommitted_<random>` to it to distinguish the image from one built from a clean checkout. Also change the version used when renaming a container on redeploy to distinguish and explain the version suffixes.	2023-05-12 11:08:48 +01:00
River He	44b83151e3	Allow to inject environment variables to traefik	2023-05-10 03:18:26 +00:00
Donal McBreen	5d33fb6c33	Better lock messages - Debug verbosity commands - Show lock status when we fail to acquire it - Include lock acquire/release in runtime	2023-05-09 14:17:58 +01:00
David Heinemeier Hansson	aafaee7ac8	Merge pull request #223 from basecamp/customizable-audit-broadcast Allow customizing audit broadcast with env	2023-05-05 14:30:04 +02:00
Donal McBreen	326711a3e0	Fix aggressive prune breaking rollback In the image prune command --all overrides --dangling=true. This removes the image git sha image tag for the latest image which prevented us from rolling back to it. I've updated the integration test to now test deploy, redeploy and rollback.	2023-05-05 12:13:14 +01:00
Kevin McConnell	82be521e66	Merge branch 'main' into customizable-audit-broadcast * main: Fix staging label bug Fix typo Capture container health log when unhealthy Bump version for 0.12.0	2023-05-05 11:40:29 +01:00
Jberczel	0e19ead37c	Capture container health log when unhealthy	2023-05-03 15:03:05 -04:00
Jeremy Daer	048aecf352	Audit details (#1 ) Audit details * Audit logs and broadcasts accept `details` whose values are included as log tags and MRSK_* env vars passed to the broadcast command * Commands may return execution options to the CLI in their args list * Introduce `mrsk broadcast` helper for sending audit broadcasts * Report UTC time, not local time, in audit logs. Standardize on ISO 8601 format	2023-05-02 11:42:05 -07:00
David Heinemeier Hansson	88a7413b3e	Merge branch 'main' into pr/223 * main: Don't run actions twice on PRs Further distinguish dependency verification Naming Reveal configured dockerfile path Style Distinguish from server dependencies Distinguish from local dependency verification Improve clarity and intent Style Style Style Add local dependencies check Bootstrap: use multi-platform installer	2023-05-02 14:44:16 +02:00
David Heinemeier Hansson	9cc73fed9a	Merge branch 'main' into pr/223 * main: Simplify domain language to just "boot" and unscoped config keys Retain a fixed number of containers when pruning Don't assume rolling back in message Check all hosts before rolling back Ensure Traefik service name is consistent Extend traefik delay by 1 second Include traefik access logs Check if we are still getting a 404 Also dump load balancer logs Dump traefik logs when app not booted Fix missing for apt-get Report on container health after failure Fix the integration test healthcheck Allow percentage-based rolling deployments Move `group_limit` & `group_wait` under `boot` Limit rolling deployment to boot operation Allow performing boot & start operations in groups	2023-05-02 14:43:17 +02:00
David Heinemeier Hansson	b7877c59b4	Merge branch 'main' into docker-readiness	2023-05-02 14:30:35 +02:00
David Heinemeier Hansson	35b5b317af	Merge branch 'main' into pr/205 * main: Simplify domain language to just "boot" and unscoped config keys Retain a fixed number of containers when pruning Don't assume rolling back in message Check all hosts before rolling back Ensure Traefik service name is consistent Extend traefik delay by 1 second Include traefik access logs Check if we are still getting a 404 Also dump load balancer logs Dump traefik logs when app not booted Fix missing for apt-get Report on container health after failure Fix the integration test healthcheck Allow percentage-based rolling deployments Move `group_limit` & `group_wait` under `boot` Limit rolling deployment to boot operation Allow performing boot & start operations in groups	2023-05-02 14:29:06 +02:00
David Heinemeier Hansson	4c448f7eb1	Merge pull request #256 from Jberczel/check-local-dependencies Add local dependencies check	2023-05-02 14:13:23 +02:00
David Heinemeier Hansson	263a24afe3	Further distinguish dependency verification	2023-05-02 14:09:10 +02:00
David Heinemeier Hansson	a2d99e48bf	Naming	2023-05-02 14:08:29 +02:00
David Heinemeier Hansson	ae2effb80c	Improve clarity and intent	2023-05-02 14:04:23 +02:00
David Heinemeier Hansson	8854bb63a1	Merge pull request #254 from basecamp/retain-last-5-containers Retain a fixed number of containers when pruning	2023-05-02 13:16:49 +02:00
David Heinemeier Hansson	35ea9f3c81	Merge pull request #255 from basecamp/check-all-hosts-for-rollback-container Check all hosts before rolling back	2023-05-02 13:16:03 +02:00
David Heinemeier Hansson	18312f5191	Merge pull request #253 from basecamp/ensure-consistent-service-name Ensure Traefik service name is consistent	2023-05-02 13:15:36 +02:00
David Heinemeier Hansson	71bc9bcf54	Merge pull request #222 from basecamp/deploy-groups Allow booting containers in groups for rolling restarts	2023-05-02 13:14:32 +02:00
David Heinemeier Hansson	c83b74dcb7	Simplify domain language to just "boot" and unscoped config keys	2023-05-02 13:11:31 +02:00
Donal McBreen	971a91da15	Retain a fixed number of containers when pruning Time based container and image retention can have variable space requirements depending on how often we deploy. - Only prune stopped containers, retaining the 5 newest - Then prune dangling images so we only keep images for the retained containers.	2023-05-02 10:15:08 +01:00
Donal McBreen	7fe24d5048	Check all hosts before rolling back Hosts could end up out of sync with each other if prune commands are run manually or when new hosts are added. Before rolling back confirm that the required container is available on all hosts and roles.	2023-05-02 10:14:50 +01:00
Kevin McConnell	a72f95f44d	Ensure Traefik service name is consistent If we don't specify any service properties when labelling containers, the generated service will be named according to the container. However, we change the container name on every deployment (as it is versioned), which means that the auto-generated service name will be different in each container. That is a problem for two reasons: - Multiple containers share a common router while a deployment is happening. At this point, the router configuration will be different between the containers; Traefik flags this as an error, and stops routing to the containers until it's resolved. - We allow custom labels to be set in an app's config. In order to define custom configuration on the service, we'll need to know what it will be called. Changed to force the service name by setting one of its properties.	2023-05-02 09:43:04 +01:00
David Heinemeier Hansson	19527b4f65	Merge branch 'main' into customizable-audit-broadcast	2023-05-02 10:25:25 +02:00
Jberczel	bfb70b2118	Add local dependencies check Add checks for: * Docker installed locally * Docker buildx plugin installed locally * Dockerfile exists If checks fail, it will halt deployment and provide more specific error messages. Also adds a cli subcommand: `mrsk build dependencies` Fixes: #109 and #237	2023-05-01 16:32:41 -04:00
Jeremy Daer	e85bd5ff63	Bootstrap: use multi-platform installer * Limit auto-install to root users; otherwise, give manual install guidance * Support non-Debian/Ubuntu with the multi-OS get.docker.com installer	2023-05-01 13:26:00 -07:00
Donal McBreen	650f9b1fbf	Include traefik access logs	2023-05-01 18:55:10 +01:00
Donal McBreen	1170e2311e	Check if we are still getting a 404	2023-05-01 18:32:07 +01:00
Donal McBreen	94f87edded	Also dump load balancer logs	2023-05-01 18:27:08 +01:00
Donal McBreen	548a1019c1	Dump traefik logs when app not booted	2023-05-01 18:21:22 +01:00
Donal McBreen	ca2e2bac2e	Fix missing for apt-get	2023-05-01 12:50:45 +01:00
Donal McBreen	494a1ae089	Report on container health after failure	2023-05-01 12:13:12 +01:00
Donal McBreen	a77428143f	Fix the integration test healthcheck The alpine nginx container doesn't contain curl, so let's override the healthcheck command to use wget.	2023-05-01 12:11:24 +01:00
David Heinemeier Hansson	4fa6a6c06d	Merge pull request #219 from basecamp/docker-health-checks	2023-04-28 11:43:33 +02:00
Donal McBreen	cd668066ff	Get lock status by executing directly Getting the lock status with invoke passes through any options from the original command which will raise an exception if they are not also valid for the lock status command. Fixes https://github.com/mrsked/mrsk/issues/239	2023-04-25 16:57:02 +01:00
Donal McBreen	52ca5b846a	Wait for healthy containers in integration test Rather than waiting 5 seconds and hoping for the best after we boot docker compose, add docker healthchecks and wait for all the containers to be healthy.	2023-04-25 15:41:25 +01:00
Kevin McConnell	99fe31d4b4	Rename MRSK_EVENT -> MRSK_MESSAGE It's a better name, and frees up `MRSK_EVENT` to be used later.	2023-04-14 16:11:42 +01:00
Donal McBreen	bcf8a927f5	Run a mrsk deploy integration test Adds a simple integration test to ensure that `mrsk deploy` works. Everything required is spun up with docker compose: - shared: a container that contains an ssh key and a self signed cert to be shared between the images - deployer: the image we will deploy from - registry: a docker registry - two vm images to deploy into - load_balancer: an nginx load balancer to use between our images The other images are in privileged mode so that we can run docker-in-docker. We need to run docker inside the images - mapping in the docker socket doesn't work because both VMs would share the host daemon. The docker registry requires a self signed cert as you cannot use basic auth over HTTP except on localhost. It runs on port 4443 rather than 443 because docker refuses to accept that "registry" is a docker host and tries to push images to docker.io/registry. "registry:4443" works fine. The shared container contains the ssh keys for the deployer and vms, and the self signed cert for the registry. When the shared container boots, it copies them into a shared volume. The other deployer and vm images are built with soft links from the shared volume to the required locations. Their boot scripts wait for the files to be copied in before continuing. The root mrsk folder is mapped into the deployer container. On boot it builds the gem and installs it. Right now there's just a single test. We confirm that the load balancer is returning a 502, run `mrsk deploy` and then confirm it returns 200.	2023-04-14 15:49:43 +01:00
Kevin McConnell	f055766918	Allow percentage-based rolling deployments	2023-04-14 12:46:14 +01:00
Kevin McConnell	a8726be20e	Move `group_limit` & `group_wait` under `boot` Also make formatting the group strategy the responsibility of the commander.	2023-04-14 11:31:51 +01:00
Kevin McConnell	828e56912e	Allow customizing audit broadcast with env When invoking the audit broadcast command, provide a few environment variables so that people can customize the format of the message if they want. We currently provide `MRSK_PERFORMER`, `MRSK_ROLE`, `MRSK_DESTINATION` and `MRSK_EVENT`. Also adds the destination to the default message, which we continue to send as the first argument as before.	2023-04-13 17:54:25 +01:00
Kevin McConnell	df202d6ef4	Move health checks into Docker Replaces our current host-based HTTP healthchecks with Docker healthchecks, and adds a new `healthcheck.cmd` config option that can be used to define a custom health check command. Also removes Traefik's healthchecks, since they are no longer necessary. When deploying a container that has a healthcheck defined, we wait for it to report a healthy status before stopping the old container that it replaces. Containers that don't have a healthcheck defined continue to wait for `MRSK.config.readiness_delay`. There are some pros and cons to using Docker healthchecks rather than checking from the host. The main advantages are: - Supports non-HTTP checks, and app-specific check scripts provided by a container. - When booting a container, allows MRSK to wait for a container to be healthy before shutting down the old container it replaces. This should be safer than relying on a timeout. - Containers with healthchecks won't be active in Traefik until they reach a healthy state, which prevents any traffic from being routed to them before they are ready. The main _disadvantage_ is that containers are now required to provide some way to check their health. Our default check assumes that `curl` is available in the container which, while common, won't always be the case.	2023-04-13 16:08:43 +01:00
Kevin McConnell	f530009a6e	Allow performing boot & start operations in groups Adds top-level configuration options for `group_limit` and `group_wait`. When a `group_limit` is present, we'll perform app boot & start operations on no more than `group_limit` hosts at a time, optionally sleeping for `group_wait` seconds after each batch. We currently only do this batching on boot & start operations (including when they are part of a deployment). Other commands, like `app stop` or `app details` still work on all hosts in parallel.	2023-04-13 15:58:27 +01:00
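
The batching described in commit f530009a6e can be sketched roughly as follows. This is an illustrative Python sketch, not MRSK's actual Ruby implementation: the function names `batches` and `boot_in_groups`, the `boot` callback, and the injectable `sleep` parameter are all hypothetical. It also assumes the wait is skipped after the final batch, and it boots hosts within a batch sequentially for simplicity, whereas the real tool would be expected to act on a batch in parallel.

```python
import time
from typing import Callable, List, Optional, Sequence


def batches(hosts: Sequence[str], group_limit: Optional[int]) -> List[List[str]]:
    """Split hosts into groups of at most group_limit hosts.

    With no group_limit, all hosts form a single group.
    """
    if not group_limit:
        return [list(hosts)]
    return [list(hosts[i:i + group_limit]) for i in range(0, len(hosts), group_limit)]


def boot_in_groups(
    hosts: Sequence[str],
    boot: Callable[[str], None],          # hypothetical per-host boot action
    group_limit: Optional[int] = None,
    group_wait: float = 0,
    sleep: Callable[[float], None] = time.sleep,
) -> None:
    """Boot hosts batch by batch, pausing group_wait seconds between batches."""
    groups = batches(hosts, group_limit)
    for index, group in enumerate(groups):
        for host in group:
            boot(host)
        # Assumption: no pause is needed once the last batch has booted.
        if group_wait and index < len(groups) - 1:
            sleep(group_wait)
```

For example, `boot_in_groups(["web1", "web2", "web3"], boot, group_limit=2, group_wait=5)` would boot `web1` and `web2`, pause for five seconds, then boot `web3`.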

456 Commits