Commit Graph

902 Commits

Author SHA1 Message Date
Kevin McConnell
100b72e4b4 Limit rolling deployment to boot operation 2023-04-14 10:41:07 +01:00
Kevin McConnell
828e56912e Allow customizing audit broadcast with env
When invoking the audit broadcast command, provide a few environment
variables so that people can customize the format of the message if they
want.

We currently provide `MRSK_PERFORMER`, `MRSK_ROLE`, `MRSK_DESTINATION` and
`MRSK_EVENT`.

Also adds the destination to the default message, which we continue to
send as the first argument as before.
2023-04-13 17:54:25 +01:00
Kevin McConnell
df202d6ef4 Move health checks into Docker
Replaces our current host-based HTTP healthchecks with Docker
healthchecks, and adds a new `healthcheck.cmd` config option that can be
used to define a custom health check command. Also removes Traefik's
healthchecks, since they are no longer necessary.

When deploying a container that has a healthcheck defined, we wait for
it to report a healthy status before stopping the old container that it
replaces. Containers that don't have a healthcheck defined continue to
wait for `MRSK.config.readiness_delay`.

There are some pros and cons to using Docker healthchecks rather than
checking from the host. The main advantages are:

- Supports non-HTTP checks, and app-specific check scripts provided by a
  container.
- When booting a container, allows MRSK to wait for a container to be
  healthy before shutting down the old container it replaces. This
  should be safer than relying on a timeout.
- Containers with healthchecks won't be active in Traefik until they
  reach a healthy state, which prevents any traffic from being routed to
  them before they are ready.

The main _disadvantage_ is that containers are now required to provide
some way to check their health. Our default check assumes that `curl` is
available in the container which, while common, won't always be the
case.
2023-04-13 16:08:43 +01:00
Kevin McConnell
f530009a6e Allow performing boot & start operations in groups
Adds top-level configuration options for `group_limit` and `group_wait`.
When a `group_limit` is present, we'll perform app boot & start
operations on no more than `group_limit` hosts at a time, optionally
sleeping for `group_wait` seconds after each batch.

We currently only do this batching on boot & start operations (including
when they are part of a deployment). Other commands, like `app stop` or
`app details` still work on all hosts in parallel.
2023-04-13 15:58:27 +01:00
Matteo Landi
4b36df5dab Configure git to trust /workdir
Resolves: #220
2023-04-13 15:13:13 +02:00
Gilles Demarty
79d46ceb16 Add OpenSSH Client to the alpine server 2023-04-12 19:20:09 +02:00
David Heinemeier Hansson
bc8875e020 Merge pull request #183 from basecamp/cleanup-excessive-containers-running
Clear stale containers
2023-04-12 15:58:59 +02:00
David Heinemeier Hansson
d4a72da9d8 Merge pull request #213 from ncreuschling/fix-spelling-of-label
fix spelling of label
2023-04-12 15:58:46 +02:00
David Heinemeier Hansson
04a04c05e0 Merge branch 'main' into fix-spelling-of-label 2023-04-12 15:58:41 +02:00
David Heinemeier Hansson
cff8b058af Merge pull request #214 from tannakartikey/traefik_lables_readme_example_fix
Traefik label example typo fix
2023-04-12 15:58:08 +02:00
David Heinemeier Hansson
b6f7d94ac3 Merge pull request #144 from monorkin/shell-escape-dollar-signs
Shell escape dollar signs
2023-04-12 15:57:37 +02:00
Stanko K.R
3ab16c8994 Shell escape dollar signs
But allow for shell expansion using curly braces e.g. ${PWD}
2023-04-12 15:55:54 +02:00
Kartikey Tanna
b6743e5e1c Traefik label example typo fix 2023-04-12 19:21:20 +05:30
Jacopo
9ddb181f50 Merge branch 'main' into cleanup-excessive-containers-running
* main:
  Pull the primary host from the role
  Minimise holding the deploy lock
2023-04-12 15:19:19 +02:00
Nicolai Reuschling
fbe1458478 fix spelling of label 2023-04-12 14:56:39 +02:00
David Heinemeier Hansson
2f1393cd92 Merge pull request #212 from basecamp/role-primary-hosts
Pull the primary host from the role
2023-04-12 14:09:38 +02:00
David Heinemeier Hansson
76673c0c1b Merge pull request #211 from basecamp/minimise-lock-retention
Minimise holding the deploy lock
2023-04-12 14:08:05 +02:00
Donal McBreen
fb62f2e6e1 Pull the primary host from the role
So commands like this run on a host with the specified role:
```
mrsk app exec -r=console -i "/bin/bash`
mrsk app logs -f -r=workers
```
2023-04-12 13:03:02 +01:00
Donal McBreen
051556674f Minimise holding the deploy lock
If we get an error we'll only hold the deploy lock if it occurs while
trying to switch the running containers.

We'll also move tagging the latest image from when the image is pulled
to just before the container switch. This ensures that earlier errors
don't leave the hosts with an updated latest tag while still running the
older version.
2023-04-12 12:09:56 +01:00
Jacopo
3cbf4aea46 Make method private method and use :send 2023-04-12 11:53:49 +02:00
Jacopo
5ed431b807 Merge branch 'main' into cleanup-excessive-containers-running
* main: (24 commits)
  Bump version for 0.11.0
  Labels can be added to Traefik
  Make rollbacks role-aware
  fix typo role to roles
  Explained the latest modifications of Traefik container labels
  Remove .idea folder
  Updated README.md with new healthcheck.max_attempts option
  Fix test case: console output message was not updated to display the current/total attempts
  Require net-ssh ~> 7.0 for SHA-2 support
  Improved deploy lock acquisition
  Excess CR
  Style
  Simpler
  Make it explicit, focus on Ubuntu
  More explicit
  Not that --bundle is a Rails 7+ option
  Update README.md
  Update README.md
  Improved: configurable max_attempts for healthcheck
  Traefik service name to be derived from role and destination
  ...
2023-04-12 11:52:47 +02:00
David Heinemeier Hansson
60a19f0b30 Bump version for 0.11.0 v0.11.0 2023-04-12 11:45:33 +02:00
David Heinemeier Hansson
2d0a7e1b67 Merge pull request #208 from tannakartikey/add_labels_to_traefik
Labels can be added to Traefik
2023-04-12 11:35:28 +02:00
David Heinemeier Hansson
49df19fb0d Merge pull request #209 from ncreuschling/fix-roles-documentation
fix typo role to roles
2023-04-12 11:34:02 +02:00
David Heinemeier Hansson
cef8fddfb4 Merge pull request #210 from basecamp/role-aware-rollbacks
Make rollbacks role-aware
2023-04-12 11:33:45 +02:00
Kartikey Tanna
c59eb00dd0 Labels can be added to Traefik 2023-04-12 14:53:48 +05:30
Donal McBreen
43f7409de0 Make rollbacks role-aware
Rollbacks stopped working after https://github.com/mrsked/mrsk/pull/99.

We'll confirm that a container is available for the first role on the
primary host before attempting to rollback.
2023-04-12 09:59:39 +01:00
Nicolai Reuschling
448ea7719f fix typo role to roles 2023-04-12 10:53:10 +02:00
Jacopo
72b70e3e9e More compact 2023-04-11 16:22:47 +02:00
Jacopo
e8697327fa Use no_commands block 2023-04-11 16:20:16 +02:00
Jacopo
0bfd4ca780 Use cli = self approach 2023-04-11 16:04:46 +02:00
Jacopo
12e3a562c4 Extract helper 2023-04-11 15:26:55 +02:00
David Heinemeier Hansson
ab54dbdb8b Merge pull request #206 from tannakartikey/traefik_rule_docs
Explained the latest modifications of Traefik container labels
2023-04-11 14:18:31 +02:00
David Heinemeier Hansson
ac3771447a Merge pull request #203 from matharvard/main
Require net-ssh ~> 7.0 for SHA-2 support
2023-04-11 14:17:52 +02:00
David Heinemeier Hansson
daa0c9b5be Merge pull request #196 from handy-la/main
Configurable max_attempts for healthcheck
2023-04-11 14:17:17 +02:00
Jacopo
c3393c8213 Remove dot 2023-04-11 11:03:11 +02:00
Jacopo
03d933d10b Add Role to the message 2023-04-11 10:59:25 +02:00
Jacopo
579b4cd9aa Simplify
By using and ad-hoc command to detect and stop stale containers.
By default stale containers are only detected.
2023-04-11 10:22:03 +02:00
Jacopo
f9436d5673 Style 2023-04-11 08:53:33 +02:00
Jacopo
8ae5331d97 Boot stop all the old containers 2023-04-11 08:53:33 +02:00
Jacopo
4d47fbdf41 Merge stop and stop_stale_containers 2023-04-11 08:53:33 +02:00
Jacopo
e980f1164e Avoid using GNU-only Perl Regepx Grep 2023-04-11 08:53:33 +02:00
Jacopo
e2f6db5cae Clear stale containers
By stopping all the older containers with matching /#{service}-#{role}-#{dest}-.*/ running on the same host.
2023-04-11 08:53:33 +02:00
Kartikey Tanna
d3936363d0 Explained the latest modifications of Traefik container labels 2023-04-11 10:20:16 +05:30
Arturo Ojeda
cfc8fa0590 Remove .idea folder 2023-04-10 22:33:20 -06:00
Arturo Ojeda
161ebe4bc1 Updated README.md with new healthcheck.max_attempts option 2023-04-10 22:26:10 -06:00
Arturo Ojeda
514b2aa243 Fix test case: console output message was not updated to display the current/total attempts 2023-04-10 09:29:19 -06:00
David Heinemeier Hansson
18031bc552 Merge pull request #202 from basecamp/deploy-lock-acquisition
Improved deploy lock acquisition
2023-04-10 16:42:03 +02:00
Mat Harvard
d8c61004e4 Require net-ssh ~> 7.0 for SHA-2 support
Versions of net-ssh before 7.0 do not support the SHA-2 algorithm and result in mrsk not being able to connect to hosts using keys generated with it. net-ssh is also a dependency of sshkit, however, sshkit has a version requirement of >= 2.8.0 for net-ssh, so is not effective at ensuring mrsk has the version it needs to be the most compatible.
2023-04-10 07:29:07 -07:00
Donal McBreen
c4df440c79 Improved deploy lock acquisition
1. Don't raise lock error for non-lock issues during lock acquire
  (see https://github.com/mrsked/mrsk/pull/181)
2. If there is an error while the lock is held, don't release the lock
  and send a warning to stderr
2023-04-10 15:23:00 +01:00