Commit Graph

32 Commits

Author SHA1 Message Date
Nick Hammond
89994c8b20 Add grep's context option to show lines before and after a match 2024-05-24 08:59:33 -07:00
Donal McBreen
187861fa60 Space not tab 2024-05-21 12:20:19 +01:00
Donal McBreen
0e73f02743 Split lock and connection setup
Allow the pre-connect hook to run before the first SSH command is executed,
but only take the lock in `with_lock` blocks.
2024-05-21 12:02:16 +01:00
Donal McBreen
64f5955444 Don't hold lock on error 2024-05-21 12:02:12 +01:00
Donal McBreen
6a7c90cf4d Only stopping containers locks 2024-05-21 12:01:07 +01:00
Donal McBreen
2c2d94c6d9 Merge pull request #740 from basecamp/remove-healthcheck-step
Remove the healthcheck step
2024-05-21 12:00:25 +01:00
Donal McBreen
9700e2b3c4 Merge pull request #646 from nickhammond/server/exec
Add in a server exec command for running ad-hoc commands directly on the server
2024-05-21 11:34:20 +01:00
Donal McBreen
ee758d951a Only use barrier when needed, more descriptive info 2024-05-20 12:18:30 +01:00
Donal McBreen
773ba3a5ab Show container logs and healthcheck status on failure 2024-05-20 12:18:30 +01:00
Donal McBreen
5be6fa3b4e Improve comments 2024-05-20 12:18:30 +01:00
Donal McBreen
07c5658396 Remove redundant method 2024-05-20 12:18:30 +01:00
Donal McBreen
0efb5ccfff Remove the healthcheck step
To speed up deployments, we'll remove the healthcheck step.

This adds some risk to deployments for non-web roles - if they don't
have a Docker healthcheck configured then the only check we do is if
the container is running.

If there is a bad image we might see the container running before it
exits and deploy it. Previously the healthcheck step would have avoided
this by ensuring a web container could boot and serve traffic first.

To mitigate this, we'll add a deployment barrier. Until one of the
primary role containers passes its healthcheck, we'll keep the barrier
up and avoid stopping the containers on the non-primary roles.

If the primary role container fails its healthcheck, we'll close the
barrier and shut down the new containers on the waiting roles.

We also have a new integration test to check we correctly handle a
broken image. This highlighted that SSHKit's default runner will
stop at the first error it encounters. We'll now have a custom runner
that waits for all threads to finish, allowing them to clean up.
2024-05-20 12:18:30 +01:00
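The barrier described in the commit above can be pictured with a minimal sketch. This is not Kamal's implementation: the `DeployBarrier` class, its method names, and the role names are all hypothetical, and it uses only Ruby's standard `Mutex` and `ConditionVariable`. Non-primary roles wait on the barrier and only stop their old containers if it opens.

```ruby
# Hypothetical sketch of a deployment barrier; not Kamal's actual code.
class DeployBarrier
  def initialize
    @mutex = Mutex.new
    @cond = ConditionVariable.new
    @state = :waiting # becomes :open or :closed exactly once
  end

  # Call when a primary role container passes its healthcheck.
  def open
    settle(:open)
  end

  # Call when the primary role container fails its healthcheck.
  def close
    settle(:closed)
  end

  # Blocks until the barrier is settled; returns true if it was opened.
  def wait
    @mutex.synchronize do
      @cond.wait(@mutex) while @state == :waiting
      @state == :open
    end
  end

  private

  def settle(state)
    @mutex.synchronize do
      @state = state if @state == :waiting # the first outcome wins
      @cond.broadcast
    end
  end
end

barrier = DeployBarrier.new
results = Queue.new

# Non-primary roles (names assumed) wait before touching their old containers.
threads = %w[ workers cron ].map do |role|
  Thread.new do
    if barrier.wait
      results << "#{role}: stop old container"
    else
      results << "#{role}: shut down new container"
    end
  end
end

barrier.open # pretend a primary role container passed its healthcheck
threads.each(&:join)
outcomes = Array.new(results.size) { results.pop }.sort
```

Calling `barrier.close` instead would make every waiting role take the second branch and shut down its new container.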
Nick Hammond
fb58fc0ba6 Add in a server exec command for running ad-hoc commands directly on the server 2024-05-13 14:17:06 -07:00
Donal McBreen
6d062ce271 Host specific env with tags
Allow hosts to be tagged so we can have host specific env variables.

We might want host specific env variables for things like datacenter
specific tags or testing GC settings on a specific host.

Right now you either need to set up a separate role, or have the app
be host aware.

Now you can define tag env variables and assign those to hosts.

For example:
```
servers:
  - 1.1.1.1
  - 1.1.1.2: tag1
  - 1.1.1.2: tag2
  - 1.1.1.3: [ tag1, tag2 ]
env_tags:
  tag1:
    ENV1: value1
  tag2:
    ENV2: value2
```

The tag env supports the full env format, allowing you to set secret and
clear values.
2024-05-09 16:02:45 +01:00
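As a rough illustration of how the tags in the example above might resolve to per-host env variables, here is a minimal sketch. The helpers `tags_for` and `env_for` are hypothetical, not Kamal's API. The sketch assumes tags from repeated entries for the same host are combined, and it handles only plain values rather than the full secret/clear env format.

```ruby
require "yaml"

# The example config from the commit message, loaded as plain YAML.
config = YAML.safe_load(<<~YAML)
  servers:
    - 1.1.1.1
    - 1.1.1.2: tag1
    - 1.1.1.2: tag2
    - 1.1.1.3: [ tag1, tag2 ]
  env_tags:
    tag1:
      ENV1: value1
    tag2:
      ENV2: value2
YAML

# Hypothetical helper: collect every tag assigned to a host.
def tags_for(config, host)
  config["servers"].flat_map do |server|
    server.is_a?(Hash) && server.key?(host) ? Array(server[host]) : []
  end.uniq
end

# Hypothetical helper: merge the env of each tag, later tags winning.
def env_for(config, host)
  tags_for(config, host).reduce({}) do |env, tag|
    env.merge(config.fetch("env_tags").fetch(tag))
  end
end

untagged_env = env_for(config, "1.1.1.1")
tagged_env = env_for(config, "1.1.1.3")
```

An untagged host gets no extra variables, while a host carrying both tags gets `ENV1` and `ENV2`.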
Donal McBreen
5e492ecc4d Merge pull request #748 from basecamp/latest-by-tag
Latest by tag
2024-04-03 09:11:03 +01:00
Donal McBreen
8a6a51977f Set env variables when running kamal app exec
Allow additional env variables to be set when running `kamal app exec`.
Works for both new and existing containers.
2024-04-01 15:01:32 +01:00
Donal McBreen
ba7a13f895 Only tag after deploying to all hosts 2024-03-29 10:29:58 +00:00
Donal McBreen
05ac808f2a Use image tag to determine stale containers
Use current_running_version to determine the latest version when finding
stale containers.
2024-03-29 10:23:50 +00:00
Donal McBreen
fb7d9077ff Use latest tag for the current destination 2024-03-29 09:48:09 +00:00
Donal McBreen
55dd2f49c1 Tag image after booting and include destination
If you are deploying more than one destination to a host, the latest
tags will conflict, so we'll append the destination to the tag.

The latest tag is used when booting the app or exec-ing a new container.

If a deploy doesn't complete on a host for all roles then we should
probably not be using it, so move the tagging to the end of the boot
process.
2024-03-29 08:51:50 +00:00
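The destination suffix described above could look something like the following sketch. The `latest_tag` helper is hypothetical, and the `-` separator is an assumption; the commit message only says the destination is appended to the tag.

```ruby
# Hypothetical helper: build the "latest" tag, suffixed with the destination
# when one is given, so two destinations on one host do not share a tag.
def latest_tag(destination = nil)
  [ "latest", destination ].compact.join("-")
end

default_tag = latest_tag
staging_tag = latest_tag("staging")
```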
Donal McBreen
e99e1955b8 Extract app boot steps
The Kamal::Cli::App#boot has a lot to do, so extract the steps to make
things clearer.
2024-03-22 09:21:52 +00:00
Donal McBreen
3ecfb3744f Add Rubocop
- Pull in the 37signals house style
- Autofix violations
- Add to CI
2024-03-20 10:23:02 +00:00
Donal McBreen
4966d52919 Pass around Roles instead of Strings
Avoid looking up roles by names everywhere. This avoids the awkward
role/role_config naming as well.
2024-03-08 08:44:35 +00:00
Ahmed Al Hafoudh
0d709a3fdb Allow lines option to be configured when following app logs 2024-01-08 09:34:38 +01:00
Donal McBreen
645f5ab72d App exec with env file
When calling `kamal app exec` for new non interactive containers, run
the command per role on each server and include the role config
including the environment.

Fixes: https://github.com/basecamp/kamal/issues/492
2023-09-25 15:07:05 +01:00
Donal McBreen
0861730e0e Run interactive commands with the correct host
Fixes https://github.com/basecamp/kamal/issues/430
2023-09-18 12:00:36 +01:00
dhh
59ac59d351 Healthcheck polling is a CLI concern
Also, it has no instance variables, so let's just have it be a module.
2023-09-16 11:19:38 -07:00
dhh
3ae855ef28 Explain method better 2023-09-16 09:53:03 -07:00
Donal McBreen
0b439362da Asset paths
During deployments both the old and new containers will be active for a
small period of time. There also may be lagging requests for older CSS
and JS after the deployment.

This can lead to 404s if a request for old assets hits a new container
or vice versa.

This PR makes sure that both sets of assets are available throughout the
deployment from before the new version of the app is booted.

This can be configured by setting the asset path:

```yaml
asset_path: "/rails/public/assets"
```

The process is:
1. We extract the assets out of the container, with docker run, docker
cp, docker stop. Docker run sets the container command to "sleep" so
this needs to be available in the container.
2. We create an asset volume directory on the host for the new version
of the app and copy the assets in there.
3. If there is a previous deployment we also copy the new assets into
its asset volume and copy the older assets into the new asset volume.
4. We start the new container mapping the asset volume over the top of
the container's asset path.

This means both the old and new versions have replaced the asset
path with a volume containing both sets of assets and should be able
to serve any request during the deployment. The older assets will
continue to be available until the next deployment.
2023-09-11 12:18:18 +01:00
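Steps 2 and 3 of the asset process above can be sketched with plain directories standing in for the asset volumes. The `share_assets` helper and the file names are hypothetical. The sketch assumes asset file names are fingerprinted, so the two sets never collide.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical helper: fill the new asset volume and, if there is a previous
# deployment, cross-copy so both volumes hold both sets of assets.
def share_assets(extracted_assets, new_volume, old_volume = nil)
  FileUtils.mkdir_p(new_volume)
  FileUtils.cp_r(File.join(extracted_assets, "."), new_volume)

  if old_volume && Dir.exist?(old_volume)
    FileUtils.cp_r(File.join(extracted_assets, "."), old_volume) # new assets into old volume
    FileUtils.cp_r(File.join(old_volume, "."), new_volume)       # old assets into new volume
  end
end

old_files, new_files = Dir.mktmpdir do |root|
  extracted = File.join(root, "extracted")
  old_volume = File.join(root, "volumes", "v1")
  new_volume = File.join(root, "volumes", "v2")

  FileUtils.mkdir_p([ extracted, old_volume ])
  FileUtils.touch(File.join(old_volume, "app-old.css"))
  FileUtils.touch(File.join(extracted, "app-new.css"))

  share_assets(extracted, new_volume, old_volume)
  [ Dir.children(old_volume).sort, Dir.children(new_volume).sort ]
end
```

After the call, both directories list both stylesheets, which is what lets either container serve a request for either version's assets.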
Donal McBreen
8a41d15b69 Zero downtime deployment with cord file
When replacing a container currently we:
1. Boot the new container
2. Wait for it to become healthy
3. Stop the old container

Traefik will send requests to the old container until it notices that it
is unhealthy. But it may have stopped serving requests before that point
which can result in errors.

To get round that the new boot process is:

1. Create a directory with a single file on the host
2. Boot the new container, mounting the cord file into /tmp and
including a check for the file in the docker healthcheck
3. Wait for it to become healthy
4. Delete the healthcheck file ("cut the cord") for the old container
5. Wait for it to become unhealthy and give Traefik a couple of seconds
to notice
6. Stop the old container

The extra steps ensure that Traefik stops sending requests before the
old container is shut down.
2023-09-06 14:35:30 +01:00
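The ordering of the cord-file boot process above can be sketched without Docker. Here the `Container` struct and the `create_cord` helper are hypothetical stand-ins, and a container counts as healthy only while its cord file exists. Real healthchecks would also probe the app itself.

```ruby
require "tmpdir"
require "fileutils"

# Hypothetical stand-in for a container whose healthcheck requires the cord file.
Container = Struct.new(:name, :cord_path) do
  def healthy?
    File.exist?(cord_path)
  end
end

# Step 1: create a directory on the host holding a single cord file.
def create_cord(host_dir, name)
  dir = File.join(host_dir, name)
  FileUtils.mkdir_p(dir)
  path = File.join(dir, "cord")
  FileUtils.touch(path)
  path
end

log = []
Dir.mktmpdir do |host_dir|
  old_container = Container.new("app-v1", create_cord(host_dir, "app-v1"))

  # Steps 1-2: create the cord and "boot" the replacement with it mounted.
  new_container = Container.new("app-v2", create_cord(host_dir, "app-v2"))

  # Step 3: wait for the new container to become healthy.
  log << "new healthy: #{new_container.healthy?}"

  # Step 4: cut the cord for the old container.
  FileUtils.rm(old_container.cord_path)

  # Step 5: the old container now reports unhealthy, so the proxy can drain it.
  log << "old healthy: #{old_container.healthy?}"

  # Step 6: only now stop the old container.
  log << "stop #{old_container.name}"
end
```

The point of the ordering is that the old container turns unhealthy while it is still able to serve requests, rather than at the moment it is stopped.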
Tobias Bühlmann
556f7f5a37 Fix picking the first available role on primary_host 2023-08-24 13:50:24 +02:00
David Heinemeier Hansson
c4a203e648 Rename to Kamal 2023-08-22 08:24:31 -07:00