Commit Graph

21 Commits

Author SHA1 Message Date
Donal McBreen
8b62e2694a Test non-ascii secret interpolation 2024-09-05 10:01:56 +01:00
Donal McBreen
3d502ab12d Add test adapter and interpolate secrets in integration tests 2024-09-04 12:40:27 +01:00
Donal McBreen
5f2384f123 Use docker info to get arch 2024-08-29 08:46:18 +01:00
Donal McBreen
d2d0223c37 Require an arch to be set, and default to amd64 in the template 2024-08-29 08:45:51 +01:00
Donal McBreen
56268d724d Simplify the builders configuration
1. Add driver as an option, defaulting to `docker-container`. For a
   "native" build you can set it to `docker`
2. Set arch as a array of architectures to build for, defaulting to
   `[ "amd64", "arm64" ]` unless you are using the docker driver in
   which case we default to not setting a platform
3. Remote is now just a connection string for the remote builder
4. If remote is set, we only use it for non-local arches, if we are
   only building for the local arch, we'll ignore it.

Examples:

On arm64, build for arm64 locally, amd64 remotely or
On amd64, build for amd64 locally, arm64 remotely:

```yaml
builder:
  remote: ssh://docker@docker-builder
```

On arm64, build amd64 on remote,
On amd64 build locally:

```yaml
builder:
  arch:
    - amd64
  remote:
    host: ssh://docker@docker-builder
```

Build amd64 on local:

```yaml
builder:
  arch:
    - amd64
```

Use docker driver, building for local arch:

```yaml
builder:
  driver: docker
```
2024-08-29 08:45:48 +01:00
Donal McBreen
cffb6c3d7e Allow the driver to be set 2024-08-29 08:44:11 +01:00
Donal McBreen
0efb5ccfff Remove the healthcheck step
To speed up deployments, we'll remove the healthcheck step.

This adds some risk to deployments for non-web roles - if they don't
have a Docker healthcheck configured then the only check we do is if
the container is running.

If there is a bad image we might see the container running before it
exits and deploy it. Previously the healthcheck step would have avoided
this by ensuring a web container could boot and serve traffic first.

To mitigate this, we'll add a deployment barrier. Until one of the
primary role containers passes its healthcheck, we'll keep the barrier
up and avoid stopping the containers on the non-primary roles.

It the primary role container fails its healthcheck, we'll close the
barrier and shut down the new containers on the waiting roles.

We also have a new integration test to check we correctly handle a
a broken image. This highlighted that SSHKit's default runner will
stop at the first error it encounters. We'll now have a custom runner
that waits for all threads to finish allowing them to clean up.
2024-05-20 12:18:30 +01:00
Donal McBreen
f48c227768 Move env_tags under env key
Instead of:

```
env:
  CLEAR_TAG: untagged
env_tags:
  tag1:
    CLEAR_TAG: tagged
```

We'll have:

```
env:
  clear:
    CLEAR_TAG: untagged
  tags:
    tag1:
      CLEAR_TAG: tagged
```
2024-05-15 10:19:22 +01:00
Donal McBreen
6d062ce271 Host specific env with tags
Allow hosts to be tagged so we can have host specific env variables.

We might want host specific env variables for things like datacenter
specific tags or testing GC settings on a specific host.

Right now you either need to set up a separate role, or have the app
be host aware.

Now you can define tag env variables and assign those to hosts.

For example:
```
servers:
  - 1.1.1.1
  - 1.1.1.2: tag1
  - 1.1.1.2: tag2
  - 1.1.1.3: [ tag1, tag2 ]
env_tags:
  tag1:
    ENV1: value1
  tag2:
    ENV2: value2
```

The tag env supports the full env format, allowing you to set secret and
clear values.
2024-05-09 16:02:45 +01:00
Donal McBreen
5481fbb973 Test that we pull in env host variables
Now that clear env variables specified on the command line we can check
that values specified as `${VAR}` are pulled in from the host.
2024-03-25 12:26:37 +00:00
Donal McBreen
cb030e8751 Merge pull request #680 from igor-alexandrov/traefik-2.10
Bump default Traefik image to 2.10
2024-03-04 11:58:37 +00:00
Krzysztof Adamski
b411356409 Allow for Custom Accessory Service Name 2024-02-15 11:12:18 +01:00
Igor Alexandrov
77e72e34ce Bumped default Traefik image to 2.10 2024-02-13 16:00:02 +04:00
Donal McBreen
0b439362da Asset paths
During deployments both the old and new containers will be active for a
small period of time. There also may be lagging requests for older CSS
and JS after the deployment.

This can lead to 404s if a request for old assets hits a new container
or visa-versa.

This PR makes sure that both sets of assets are available throughout the
deployment from before the new version of the app is booted.

This can be configured by setting the asset path:

```yaml
asset_path: "/rails/public/assets"
```

The process is:
1. We extract the assets out of the container, with docker run, docker
cp, docker stop. Docker run sets the container command to "sleep" so
this needs to be available in the container.
2. We create an asset volume directory on the host for the new version
of the app on the host and copy the assets in there.
3. If there is a previous deployment we also copy the new assets into
its asset volume and copy the older assets into the new asset volume.
4. We start the new container mapping the asset volume over the top of
the container's asset path.

This means the both the old and new versions have replaced the asset
path with a volume containing both sets of assets and should be able
to serve any request during the deployment. The older assets will
continue to be available until the next deployment.
2023-09-11 12:18:18 +01:00
Donal McBreen
8a41d15b69 Zero downtime deployment with cord file
When replacing a container currently we:
1. Boot the new container
2. Wait for it to become healthy
3. Stop the old container

Traefik will send requests to the old container until it notices that it
is unhealthy. But it may have stopped serving requests before that point
which can result in errors.

To get round that the new boot process is:

1. Create a directory with a single file on the host
2. Boot the new container, mounting the cord file into /tmp and
including a check for the file in the docker healthcheck
3. Wait for it to become healthy
4. Delete the healthcheck file ("cut the cord") for the old container
5. Wait for it to become unhealthy and give Traefik a couple of seconds
to notice
6. Stop the old container

The extra steps ensure that Traefik stops sending requests before the
old container is shutdown.
2023-09-06 14:35:30 +01:00
Donal McBreen
94bf090657 Copy env files to remote hosts
Setting env variables in the docker arguments requires having them on
the deploy host.

Instead we'll add two new commands `kamal env push` and
`kamal env delete` which will manage copying the environment as .env
files to the remote host.

Docker will pick up the file with `--env-file <path-to-file>`. Env files
will be stored under `<kamal run directory>/env`.

Running `kamal env push` will create env files for each role and
accessory, and traefik if required.

`kamal envify` has been updated to also push the env files.

By avoiding using `kamal envify` and creating the local and remote
secrets manually, you can now avoid accessing secrets needed
for the docker runtime environment locally. You will still need build
secrets.

One thing to note - the Docker doesn't parse the environment variables
in the env file, one result of this is that you can't specify multi-line
values - see https://github.com/moby/moby/issues/12997.

We maybe need to look docker config or docker secrets longer term to get
around this.

Hattip to @kevinmcconnell - this was all his idea.
2023-09-06 14:33:13 +01:00
Donal McBreen
7cd25fd163 Add more integration tests
Add tests for main, app, accessory, traefik and lock commands.
Other commands are generally covered by the main tests.

Also adds some changes to speed up the integration specs:
- Use a persistent volume for the registry so we can push images to to
reuse between runs (also gets around docker hub rate limits)
- Use persistent volume for mrsk gem install, to avoid re-installing
between tests
- Shorter stop wait time
- Shorter connection timeouts on the load balancer

Takes just over 2 minutes to run all tests locally on an M1 Mac
after docker caches are primed.
2023-05-16 10:35:35 +01:00
Donal McBreen
a5ef1f254f Highlight uncommitted changes in version
If there are uncommitted changes in the app repository when building,
then append `_uncommitted_<random>` to it to distinguish the image
from one built from a clean checkout.

Also change the version used when renaming a container on redeploy to
distinguish and explain the version suffixes.
2023-05-12 11:08:48 +01:00
Donal McBreen
650f9b1fbf Include traefik access logs 2023-05-01 18:55:10 +01:00
Donal McBreen
a77428143f Fix the integration test healthcheck
The alpine nginx container doesn't contain curl, so let's override the
healthcheck command to use wget.
2023-05-01 12:11:24 +01:00
Donal McBreen
bcf8a927f5 Run a mrsk deploy integration test
Adds a simple integration test to ensure that `mrsk deploy` works.

Everything required is spun up with docker compose:
- shared: a container that contains an ssh key and a self signed cert to
be shared between the images
- deployer: the image we will deploy from
- registry: a docker registry
- two vm images to deploy into
- load_balancer: an nginx load balancer to use between our images

The other images are in privileged mode so that we can run
docker-in-docker. We need to run docker inside the images - mapping in
the docker socket doesn't work because both VMs would share the host
daemon.

The docker registry requires a self signed cert as you cannot use basic
auth over HTTP except on localhost. It runs on port 4443 rather than 443
because docker refused to accept that "registry" is a docker host and
tries to push images to docker.io/registry. "registry:4443" works fine.

The shared container contains the ssh keys for the deployer and vms, and
the self signed cert for the registry. When the shared container boots,
it copies them into a shared volume.

The other deployer and vm images are built with soft links from the
shared volume to the require locations. Their boot scripts wait for the
files to be copied in before continuing.

The root mrsk folder is mapped into the deployer container. On boot it
builds the gem and installs it.

Right now there's just a single test. We confirm that the load balancer
is returning a 502, run `mrsk deploy` and then confirm it returns 200.
2023-04-14 15:49:43 +01:00