The previous 'localhost' value was causing connection problems on the
remote host, such that it would not create a connection through the
tunnel to the specified port.
With 1Password, there is a way to retrieve all fields
of a given item directly without having to enumerate them.
Allowing this when passing no arguments for secrets fetch
command.
It shouldn't be necessary to deploy the app on proxy reboot. When there
are multiple apps using the same proxy we'll only deploy the one we
run the reboot command from, so we don't always reboot anyway.
Instead of checking for the proxy key, we'll set the config to {} if it
is nil in the Kamal::Configuration::Proxy initializer.
This is a bit cleaner, and maybe it will help with
https://github.com/basecamp/kamal/issues/1555 if somehow
@raw_config.key?(:proxy) is false but @raw_config.proxy is not nil.
When booting non-primary role hosts we will always wait for a primary
role host to boor first.
So when booting in groups, if there are no primary role hosts in the
first batch, then booting will stall.
Sort primary role app_hosts first to avoid this.
Fixes: https://github.com/basecamp/kamal/issues/1553
Allow --metrics_port and --debug options to be set via the boot config.
--metrics_port support will come in kamal-proxy v0.8.8, so this option
doesn't work right now.
This will be updated before the next Kamal release though and we can add
integration tests for the metrics at that point.
This is for boot time configuration for the kamal proxy. Config in here
doesn't not belong in Kamal::Configuration::Proxy which is for deploy
time configuration for the app itself.
Kamal apps don't contain boot time config, because multiple apps can
share a proxy and the config could conflict.
We'll set the KAMAL_LOCK environment when calling run hooks. If set to
true we have the lock and the hook will not need to acquire it again if
it runs kamal commands.
Fixes: https://github.com/basecamp/kamal/issues/1517
Docker buildx build outputs the build logs to stderr by default.
SSHKit displays stderr logs in red, which can suggest that an error has
occurred.
Redirect the output to stdout, so it shows in green. If there is an
error, the output will be repeated in red anyway.
Fixes: https://github.com/basecamp/kamal/issues/1356
The `app exec` and `accessory exec` commands will run `docker run` if
they are not set to reuse existing containers. This might need to pull
an image so let's make sure we are logged in before running the command.
Fixes: https://github.com/basecamp/kamal/issues/1163
We hook into the SSHKit `on` method to run the pre-connect hook before
the first SSH command. This doesn't work for interactive exec commands
where ssh is called directly.
Fixes: https://github.com/basecamp/kamal/issues/1157
When booting an accessory, check for the container first and skip boot
if it exists. This allows us to rerun `kamal setup` on hosts with
accessories without raising an error.
Fixes: https://github.com/basecamp/kamal/issues/488
We only need to run the docker login commands for pushing and pulling
images.
So let's move the logins into those commands. This ensures we are logged
in when calling `kamal build` commands directly.
Fixes: https://github.com/basecamp/kamal/issues/919
If there are accessories defined in the configuration, we'll not require
servers to be defined as well.
This allows for accessory-only configurations which allows you to run
external images with kamal-proxy for zero-downtime deployments.
We don't manage image cleanup for accessories though so the user will
need to deal with that themselves.
Adds support for maintenance mode to Kamal.
There are two new commands:
- `kamal app maintenance` - puts the app in maintenance mode
- `kamal app live` - puts the app back in live mode
In maintenance mode, the kamal proxy will respond to requests with a
503 status code. It will use an error page built into kamal proxy.
You can use your own error page by setting `error_pages_path` in the
configuration. This will copy any 4xx.html or 5xx.html files from that
page to a volume mounted into the proxy container.
Maps in and external /home/kamal-proxy/.app-config volume that we can
use to map files to the proxy.
Can be used to store custom maintenance pages or SSL certificates.
Use the --registry, --repository and --image_version options of
`kamal proxy boot_config set` to change the kamal-proxy image used.
We'll still insist that the image version is at least as high as the
minimum.
v3 was recently released which broke the integration tests. Update them
to use the correct config file.
Set the major version to prevent this from happening when v4 is
released.
Adds the host the container is being deployed to as KAMAL_HOST.
My use case is to more easily tag the host for metrics tagging,
but there might be other uses as well.
We cat the options file, append the proxy image and then pass it
to xargs to ensure it handles spaces correctly.
Works better than using eval which can handle spaces but tries
to evaluate things like backticks.
Fixes: https://github.com/basecamp/kamal/issues/1448
See: https://github.com/sparklemotion/nokogiri/releases/tag/v1.18.0
- required for ruby 3.4 compatibility
- add more platforms to lockfile to support
docker build process, due to changes
in the nokogiri native gem setup where
-musl and -gnu linux platforms are no longer
interchangeable
- bundler >= 2.5.6 required according to Nokogiri release
notes, so updated to current latest version (2.6.5)
Somewhere along the way the docker build broke, it now needs yaml-dev
to be installed. Maybe the underlying image changed?
Update to 3.4-alpine while we are here.
- Reset KAMAL on alias command, rather than relying on checking
"invoked_via_subcommand"
- Validate the accessories roles when loading the configuration
not later on when trying to access them
Add two new hooks pre-app-boot and post-app-boot. They are analagous
to the pre/post proxy reboot hooks.
If the boot strategy deploys in groups, then the hooks are called once
per group of hosts and `KAMAL_HOSTS` contains a comma delimited list of
the hosts in that group.
If all hosts are deployed to at once, then they are called once with
`KAMAL_HOSTS` containing all the hosts.
It is possible to have pauses between groups of hosts in the boot config,
where this is the case the pause happens after the post-app-boot hook is
called.
which will build a "dirty" image using the working directory.
This command is different from `build push` in two important ways:
- the image tags will have a suffix of `-dirty`
- the export action is "docker", pushing to the local docker image store
The command also supports the `--output` option just added to `build
push` to override that default.
This command is intended to allow developers to quickly iterate on a
docker image built from their local working directory while avoiding
any confusion with a pristine image built from a git clone, and
keeping those images on the local dev system by default.
which controls where the build result is exported.
The default value is "registry" to reflect the current behavior of
`build push`.
Any value provided to this option will be passed to the `buildx build`
command as a `--output=type=<VALUE>` flag.
For example, the following command will push to the local docker image
store:
kamal build push --output=docker
squash
The ERB runs first so it does matter if it in a comment. If the file
doesn't exist (e.g. if not using Ruby, you'll get an error).
We'll change the example to match the Rails deploy.yml template won't
have than problem.
We were checking before `kamal build push`, but not `kamal registry login`.
Since `kamal registry login` is called first by a deploy we don't
get the nice error message.
We only loaded the configuration once, which meant that aliases always
used the initial configuration file and destination.
We don't want to load the configuration in subcommands as it is not
passed all the options we need. But just checking if we are in a
subcommand is enough - the alias reloads and the subcommand does not.
One thing to note is that anything passed on the command line overrides
what is in the alias, so if an alias says
`other_config: config -c config/deploy2.yml` and you run
`kamal other_config -c config/deploy.yml`, it won't switch.
There were typo-bug during `validate_servers!` invocation for role.
It wasn't discovered, because it never met condition. Because role_config wasn't correctly extracted for validation.
Also remove not used anymore `accessories_on`. Leftover from previous changes.
In Dotenv 3.1.5, `Dotenv.parse` no longer returns values that are
already in the environment.
See https://github.com/bkeepers/dotenv/issues/518
We can get the values though by setting overwrite: true, which works
with both 3.1.4 and 3.1.5.
this is useful for long running rake tasks or scripts
that can be run without having to keep open connection to the server.
Example:
```
kamal app exec 'bin/rails db:backfill_task' --detach
```
Previously when unknown role (or with typo) was placed in accessories.roles,
this error was thrown: `ERROR (NoMethodError): undefined method `hosts' for nil`.
The max-size log opt is not valid for all logging drivers, such as
syslog. Allow the option to be removed from the boot config with:
```
kamal proxy boot_config set --log-max-size=
or
kamal proxy boot_config set --log-max-size=""
```
Sometimes a projects has a lot of secrets (more than 10). And its
cumbersome to write $(kama secrets fetch ...) with a lot of field
names.
I want to be able to just fetch all the fields from a given item
and then just use these with $(kamal extract NAME)
* Update to be able to run on 3.4 with frozen strings
---------
Co-authored-by: Jeremy Daer <jeremydaer@gmail.com>
Co-authored-by: Sijawusz Pur Rahnama <sija@sija.pl>
Proxy changes:
- Add option to use custom TLS certificates (#17)
- Don't buffer SSE responses (#36)
- Allow routing to wildcard subdomains (#45)
Custom TLS certificates not supported in Kamal itself yet. Buffering
SSE responses and wildcard subdomains will work without any Kamal
changes.
Dotenv's variable substitution doesn't work the same way as commands run
in the shell. It needs values to be escaped.
```sh
$ cat /tmp/env
SECRETS=$(cat /tmp/json)
SECRETS2=$(echo $SECRETS | jq)
$ cat /tmp/json
\{\ \"foo\"\ :\ \"bar\" \}
$ SECRETS=$(cat /tmp/json)
$ SECRETS2=$(echo $SECRETS | jq)
jq: parse error: Invalid numeric literal at line 1, column 2
$ ruby -e 'require "dotenv"; puts Dotenv.parse("/tmp/env")["SECRETS2"]'
{
"foo": "bar"
}
```
Since you then can't use the shell to debug, `kamal secrets print` will
allow you to see what the secrets will be set to.
Upgrading kamal from `v1.8.3` to `v1.9.0` broke my [kamal playground](https://labs.iximiuz.com/playgrounds/kamal):
```
laborant@dev-machine:~/svc-a$ kamal setup
INFO [34d0def6] Running /usr/bin/env mkdir -p .kamal on 172.16.0.3
INFO [c34cf833] Running /usr/bin/env mkdir -p .kamal on 172.16.0.4
INFO [34d0def6] Finished in 0.147 seconds with exit status 0 (successful).
INFO [c34cf833] Finished in 0.204 seconds with exit status 0 (successful).
Acquiring the deploy lock...
Ensure Docker is installed...
INFO [413ee426] Running docker -v on 172.16.0.4
INFO [f1acacba] Running docker -v on 172.16.0.3
INFO [413ee426] Finished in 0.036 seconds with exit status 0 (successful).
INFO [f1acacba] Finished in 0.076 seconds with exit status 0 (successful).
Log into image registry...
INFO [94cff492] Running docker login registry.iximiuz.com -u [REDACTED] -p [REDACTED] on localhost
INFO [94cff492] Finished in 0.077 seconds with exit status 0 (successful).
INFO [605c535f] Running docker login registry.iximiuz.com -u [REDACTED] -p [REDACTED] on 172.16.0.4
INFO [6002b598] Running docker login registry.iximiuz.com -u [REDACTED] -p [REDACTED] on 172.16.0.3
INFO [605c535f] Finished in 0.083 seconds with exit status 0 (successful).
INFO [6002b598] Finished in 0.083 seconds with exit status 0 (successful).
Build and push app image...
INFO [9d172b1e] Running docker --version && docker buildx version on localhost
INFO [9d172b1e] Finished in 0.059 seconds with exit status 0 (successful).
INFO Cloning repo into build directory `/tmp/kamal-clones/svc-a-2f65914456263/workdir/`...
INFO [26fb1bd3] Running /usr/bin/env git -C /tmp/kamal-clones/svc-a-2f65914456263 clone /workdir --recurse-submodules on localhost
ERROR Error preparing clone: Failed to clone repo: git exit status: 32768
git stdout: Nothing written
git stderr: Cloning into 'workdir'...
fatal: detected dubious ownership in repository at '/workdir/.git'
To add an exception for this directory, call:
git config --global --add safe.directory /workdir/.git
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
, deleting and retrying...
INFO Cloning repo into build directory `/tmp/kamal-clones/svc-a-2f65914456263/workdir/`...
INFO [fd4aac0c] Running /usr/bin/env git -C /tmp/kamal-clones/svc-a-2f65914456263 clone /workdir --recurse-submodules on localhost
Finished all in 0.3 seconds
Releasing the deploy lock...
Finished all in 0.6 seconds
ERROR (SSHKit::Command::Failed): git exit status: 32768
git stdout: Nothing written
git stderr: Cloning into 'workdir'...
fatal: detected dubious ownership in repository at '/workdir/.git'
To add an exception for this directory, call:
git config --global --add safe.directory /workdir/.git
fatal: Could not read from remote repository.
Please make sure you have the correct access rights
and the repository exists.
laborant@dev-machine:~/svc-a$ kamal version
2.0.0
```
I checked the [v1.8.3...v1.9.0](https://github.com/basecamp/kamal/compare/v1.8.3...v1.9.0) diff, and couldn't find anything even remotely related to the above error.
Then I checked the `git` versions in kamal `v1.8.3` and `v1.9.0` images:
```
docker run -it --rm --entrypoint sh ghcr.io/basecamp/kamal:v1.8.3
/workdir # git --version
git version 2.38.5
```
vs.
```
docker run -it --rm --entrypoint sh ghcr.io/basecamp/kamal:v2.0.0
/workdir # git --version
git version 2.39.5
```
Apparently, something changed in between `2.38.5` and `2.39.5` git releases (likely yet another CVE fix), and the `git config --global --add safe.directory /workdir` stopped working.
Here is the mitigation I currently use, but it's a bit awkward to do it:
```
docker build -t ghcr.io/basecamp/kamal:v2.0.0 - <<EOF
FROM ghcr.io/basecamp/kamal:v2.0.0
RUN git config --global --add safe.directory /workdir/.git
EOF
```
Hence, this PR.
To repro, you can start a [kamal playground](https://labs.iximiuz.com/playgrounds/kamal), then `docker pull ghcr.io/basecamp/kamal:v2.0.0` to override my patched image, and `cd svc-a && kamal setup`.
When fetched item is not a login, Bitwarden adapter raises NoMethodError
because the returned JSON does not have the login.password value.
Add a nicer error message for that case.
Add commands for managing proxy boot config. Since the proxy can be
shared by multiple applications, the configuration doesn't belong in
`config/deploy.yml`.
Instead you can set the config with:
```
Usage:
kamal proxy boot_config <set|get|clear>
Options:
[--publish], [--no-publish], [--skip-publish] # Publish the proxy ports on the host
# Default: true
[--http-port=N] # HTTP port to publish on the host
# Default: 80
[--https-port=N] # HTTPS port to publish on the host
# Default: 443
[--docker-options=option=value option2=value2] # Docker options to pass to the proxy container
```
By default we boot the proxy with `--publish 80:80 --publish 443:443`.
You can stop it from publishing ports, specify different ports and pass
other docker options.
The config is stored in `.kamal/proxy/options` as arguments to be passed
verbatim to docker run.
Where someone wants to set the options in their application they can do
that by calling `kamal proxy boot_config set` in a pre-deploy hook.
There's an example in the integration tests showing how to use this to
front kamal-proxy with Traefik, using an accessory.
Use localhost for app_with_roles and 127.0.0.1 for app. Confirm we can
deploy both and the respond to requests. Ensure the proxy is removed
once both have been removed.
1. Update to kamal-proxy 0.4.0 which creates and chowns
/home/kamal-proxy/.config/kamal-proxy to kamal-proxy
2. Use a docker volume rather than mapping in a directory, so docker
keeps it owned by the correct user
By default only the primary role runs the proxy. To disable the proxy
for that role, you can set `proxy: false` under it.
For other roles they default to not running the proxy, but you can
enable it by setting `proxy: true` for the role, or alternatively
setting a proxy configuration.
The proxy configuration will be merged into the root proxy configuration.
Remove `stop_wait_time` and `readiness_timeout` from the root config
and remove `deploy_timeout` and `drain_timeout` from the proxy config.
Instead we'll just have `deploy_timeout` and `drain_timeout` in the
root config.
For roles that run the proxy, they are passed to the kamal-proxy deploy
command. Once that returns we can assume the container is ready to
shut down.
For other roles, we'll use the `deploy_timeout` when polling the
container to see if it is ready and the `drain_timeout` when stopping
the container.
Adds:
- `kamal upgrade` to upgrade all app hosts and accessory hosts
- `kamal proxy upgrade` to upgrade the proxy on all hosts
- `kamal accessory upgrade [name]` to upgrade accessories on all hosts
Upgrade takes rolling and confirmed options and calls `proxy upgrade`
and `accessory upgrade` in turn.
To just upgrade a single host add -h [host] to the command. But the
upgrade should run on all hosts, not just those running the proxy.
Calling upgrade on a host that has already been upgraded should work ok.
Upgrading hosts causes downtime but you can avoid if you run multiple
hosts by:
1. Implementing the pre-proxy-reboot and post-proxy-reboot hooks to
remove the host from external load balancers
2. Running the upgrade with the --rolling option
**kamal proxy upgrade**
1. Creates a `kamal` network if required
2. Stops and removes the old proxy (whether Traefik or kamal-proxy)
3. Starts a kamal-proxy container in the `kamal` network
4. Reboots the app containers in the `kamal` network
**kamal accessory upgrade [name]**
1. Creates a `kamal` network if required
2. Reboots the accessory containers in the `kamal` network
A matching `downgrade` command will be added to Kamal 1.9.
The proxy can be enabled via the config:
```
proxy:
enabled: true
hosts:
- 10.0.0.1
- 10.0.0.2
```
This will enable the proxy and cause it to be run on the hosts listed
under `hosts`, after running `kamal proxy reboot`.
Enabling the proxy disables `kamal traefik` commands and replaces them
with `kamal proxy` ones. However only the marked hosts will run the
kamal-proxy container, the rest will run Traefik as before.
When overriding the command, docker will still run the entrypoint. We
want to avoid that here - we just want to get the assets out as quickly
as possible. Otherwise maybe something important is going on when we
stop the container.
Add a shared secrets file used across all destinations. Useful for
things Github tokens or registry passwords.
The secrets are added to a new file called `secrets-common` to highlight
they are shared, and to avoid acciedentally inheriting a secret from the
`secrets` file to `secrets.destination`.
Secrets should be interpolated at runtime so we do want the file in git.
But add a warning at the top to avoid adding secrets or git ignore the
file if you do.
Also provide examples of the three options for interpolating secrets.
Avoid outputting this login error message, it wasn't an error and you
don't need to follow those instructions.
```
[ERROR] 2024/09/11 11:57:08 You are not currently signed in. Please run `op signin --help` for instructions
```
Rather than redirecting the global $stdout, which is not never clever in
a threaded program, we'll make the secrets commands aware they are
being inlined, so they return the value instead of printing it.
Additionally we no longer need to interrupt the parent process on error
as we've inlined the command - exit 1 is enough.
Add env files back in for secrets - hides them from process lists and
allows you to pick up the latest env file when running
`kamal app exec` without reusing.
By default look for the env file in .kamal/env to avoid clashes with
other tools using .env.
For now we'll still load .env and issue a deprecation warning, but in
future we'll stop reading those.
1. Add driver as an option, defaulting to `docker-container`. For a
"native" build you can set it to `docker`
2. Set arch as a array of architectures to build for, defaulting to
`[ "amd64", "arm64" ]` unless you are using the docker driver in
which case we default to not setting a platform
3. Remote is now just a connection string for the remote builder
4. If remote is set, we only use it for non-local arches, if we are
only building for the local arch, we'll ignore it.
Examples:
On arm64, build for arm64 locally, amd64 remotely or
On amd64, build for amd64 locally, arm64 remotely:
```yaml
builder:
remote: ssh://docker@docker-builder
```
On arm64, build amd64 on remote,
On amd64 build locally:
```yaml
builder:
arch:
- amd64
remote:
host: ssh://docker@docker-builder
```
Build amd64 on local:
```yaml
builder:
arch:
- amd64
```
Use docker driver, building for local arch:
```yaml
builder:
driver: docker
```
Include the host name in the builder name, so we can have one builder
per host/arch across all kamal projects.
Inherit from the remote builder. The difference in the hybrid builder
is that we create a local buildx instance and append the remote context
to it.
It's just a remote builder, that will build whichever platform is asked
for, so let's remove the "native" part.
We'll also remove the service name from the builder name, so multiple
services can share the same builder.
Combine the two builders, as they are almost identical. The only
difference was whether the platforms were set.
The native cached builder wasn't using the context it created, so now
we do.
We'll set the driver to `docker-container` - it seems to be the default
but the Docker docs claim it is `docker`.
If you can have an alias like:
```
aliases:
rails: app exec -p rails
```
Then `kamal rails db:migrate:status` will execute
`kamal app exec -p rails db:migrate:status`.
So this works, we'll allow multiple arguments `app exec` and
`server exec` to accept multiple arguments.
The arguments are combined by simply joining them with a space. This
means that these are equivalent:
```
kamal app exec -p rails db:migrate:status
kamal app exec -p "rails db:migrate:status"
```
If you want to pass an argument with spaces, you'll need to quote it:
```
kamal app exec -p "git commit -am \"My comment\""
kamal app exec -p git commit -am "\"My comment\""
```
Aliases are defined in the configuration file under the `aliases` key.
The configuration is a map of alias name to command. When we run the
command the we just do a literal replacement of the alias with the
string.
So if we have:
```yaml
aliases:
console: app exec -r console -i --reuse "rails console"
```
Then running `kamal console -r workers` will run the command
```sh
$ kamal app exec -r console -i --reuse "rails console" -r workers
```
Because of the order Thor parses the arguments, this allows us to
override the role from the alias command.
There might be cases where we need to munge the command a bit more but
that would involve getting into Thor command parsing internals,
which are complicated and possibly subject to change.
There's a chance that your aliases could conflict with future built-in
commands, but there's not likely to be many of those and if it happens
you'll get a validation error when you upgrade.
Thanks to @dhnaranjo for the idea!
The integrations tests use their own registry so avoid hitting docker
hub rate limits.
This was using a self signed certificate but instead use
`--insecure-registry` to let the docker daemon use HTTP.
Allow blocks prefixed with `x-` in the configuration as a place to
declare reusable blocks with YAML anchors and aliases.
Borrowed from the Docker Compose configuration file format -
https://github.com/compose-spec/compose-spec/blob/main/spec.md#extension
Thanks to @ruyrocha for the suggestion.
Find the first registry mirror on each host. If we find any, pull the
images on one host per mirror, then do the remainder concurrently.
The initial pulls will seed the mirrors ensuring that we pull the image
from Docker Hub once each.
This works best if there is only one mirror on each host.
Setting `GITHUB_TOKEN` as in the docs results in reusing the existing
`GITHUB_TOKEN` since `gh` returns that env var if it's set:
```bash
GITHUB_TOKEN=junk gh config get -h github.com oauth_token
junk
```
Using the original env ensures that the templates will be evaluated the
same way regardless of whether envify had been previously invoked.
We didn't log boot errors if there was one role because there was no
barrier and the logging is done by the first host to close the barrier.
Let's always create the barrier to fix this.
Fixes:
```
ERROR (Net::SSH::Exception): Exception while executing on host example.com: could not settle on kex algorithm
Server kex preferences: curve25519-sha256@libssh.org,ext-info-s,kex-strict-s-v00@openssh.com
Client kex preferences: ecdh-sha2-nistp521,ecdh-sha2-nistp384,ecdh-sha2-nistp256,diffie-hellman-group-exchange-sha256,diffie-hellman-group14-sha256,diffie-hellman-group14-sha1```
add x25519 in Gemfile.lock
- Add local logout to `kamal registry logout`
- Add `skip_local` and `skip_remote` options to `kamal registry` commands
- Skip local login in `kamal deploy` when `--skip-push` is used
Load the hosts from the contexts before trying to build.
If there is no context, we'll create one. If there is one but the hosts
don't match we'll re-create.
Where we just have a local context, there won't be any hosts but we
still inspect the builder to check that it exists.
Validate the Kamal configuration giving useful warning on errors.
Each section of the configuration has its own config class and a YAML
file containing documented example configuration.
You can run `kamal docs` to see the example configuration, and
`kamal docs <section>` to see the example configuration for a specific
section.
The validation matches the configuration to the example configuration
checking that there are no unknown keys and that the values are of
matching types.
Where there is more complex validation - e.g for envs and servers, we
have custom validators that implement those rules.
Additonally the configuration examples are used to generate the
configuration documentation in the kamal-site repo.
You generate them by running:
```
bundle exec bin/docs <kamal-site-checkout>
```
When cloning the git repo:
1. Try to clone
2. If there's already a build directory reset it
3. Check the clone is valid
If anything goes wrong during that process:
1. Delete the clone directory
2. Clone it again
3. Check the clone is valid
Raise any errors after that
Using a different SSHKit runner doesn't work well, because the group
runner uses the Parallel runner internally. So instead we'll patch its
behaviour to fail slow.
We'll also get it to return all the errors so we can report on all the
hosts that failed.
If a primary role container is unhealthy, we might take a while to
timeout the health check poller. In the meantime if we have started the
other roles, they'll be running tow containers.
This could be a problem, especially if they read run jobs as that
doubles the worker capacity which could cause exessive load.
We'll wait for the first primary role container to boot successfully
before starting the other containers from other roles.
To speed up deployments, we'll remove the healthcheck step.
This adds some risk to deployments for non-web roles - if they don't
have a Docker healthcheck configured then the only check we do is if
the container is running.
If there is a bad image we might see the container running before it
exits and deploy it. Previously the healthcheck step would have avoided
this by ensuring a web container could boot and serve traffic first.
To mitigate this, we'll add a deployment barrier. Until one of the
primary role containers passes its healthcheck, we'll keep the barrier
up and avoid stopping the containers on the non-primary roles.
It the primary role container fails its healthcheck, we'll close the
barrier and shut down the new containers on the waiting roles.
We also have a new integration test to check we correctly handle a
a broken image. This highlighted that SSHKit's default runner will
stop at the first error it encounters. We'll now have a custom runner
that waits for all threads to finish allowing them to clean up.
This includes a fix for a bug in the eviction thread that could cause
this error:
```
[ERROR (IOError): Exception while executing on host foo: closed stream]
```
See https://github.com/capistrano/sshkit/pull/534
Docker does not respect the .dockerignore file when building from a tar.
Instead by default we'll make a local clone into a tmp directory and
build from there. Subsequent builds will reset the clone to match the
checkout.
Compared to building directly in the repo, we'll have reproducible
builds.
Compared to using a git archive:
1. .dockerignore is respected
2. We'll have faster builds - docker can be smarter about caching the
build context on subsequent builds from a directory
To build from the repo directly, set the build context to "." in the
config.
If there are uncommitted changes, we'll warn about them either being
included or ignored depending on whether we build from the clone.
Allow hosts to be tagged so we can have host specific env variables.
We might want host specific env variables for things like datacenter
specific tags or testing GC settings on a specific host.
Right now you either need to set up a separate role, or have the app
be host aware.
Now you can define tag env variables and assign those to hosts.
For example:
```
servers:
- 1.1.1.1
- 1.1.1.2: tag1
- 1.1.1.2: tag2
- 1.1.1.3: [ tag1, tag2 ]
env_tags:
tag1:
ENV1: value1
tag2:
ENV2: value2
```
The tag env supports the full env format, allowing you to set secret and
clear values.
When ssh options are set, they overwrite username and password passed as ssh builder uri. Passing part of uri for ssh-kit is fine, as it then properly extracts username and password and forwards it as host.ssh_options (in which case it's no longer empty)
Occasionally in CI things run slowly and it takes more that 1 second
for a cli test to run, so let's allow any value for the runtime in the
hook checks.
Extract Kamal::Commander::Specifics to deal with host and role setup and
ensure that primary hosts and roles always come first. This means that
in a rolling deploy we deploy to the primary ones first.
This will be important when we remove the healthcheck step as we want
to confirm the primary host can be deployed to before completing a
deployment for other roles.
By setting the hosts and roles all together in one place we can sort
the primary ones to the front without creating infinite loops.
Currently the latest container is the one that was created last. But if
we have had a failed deployment that left two containers running that
would not be the one we want. The second container could be in a
restart loop for example.
Instead we want the container that is running the image tagged as
latest. As we now tag as latest after a successful deployment we can
trust that that is a healthy container.
In the case that there is no container running the latest image tag,
we'll fall back to the latest container.
This could happen if the deploy was halted in between the old container
being stopped and the image being tagged as latest.
If you are deploying more than one destination to a host, the latest
tags will conflict, so we'll append the destination to the tag.
The latest tag is used when booting the app or exec-ing a new container.
If a deploy doesn't complete on a host for all roles then we should
probably not be using it, so move the tagging to the end of the boot
process.
This will allow us to filter for containers that have no destination in
cases where we deploy an empty + a non empty destination to the same
host.
To note:
```
\# Containers with a destination label
$ docker ps --filter label=destination
\# Containers with an empty destination label
$ docker ps --filter label=destination=
```
If no context is specified and we are in a git repo, then we'll build
from a git archive by default. This means we don't need a separate
setting and gives us a safer default build.
Building directly from a checkout will pull in uncommitted files to or
more sneakily files that are git ignored, but not docker ignored.
To avoid this, we'll add an option to build from a git archive of HEAD
instead. Docker doesn't provide a way to build directly from a git
repo, so instead we create a tarball of the current HEAD with git
archive and pipe it into the build command.
When building from a git archive, we'll still display the warning about
uncommitted changes, but we won't add the `_uncommitted_...` suffix to
the container name as they won't be included in the build.
Perhaps this should be the default, but we'll leave that decision for
now.
Secret and clear env variables have different lifecycles. The clear ones
are part of the repo, so it makes sense to always deploy them with the
rest of the repo.
The secret ones are external so we can't be sure that they are up to
date, therefore they require an explicit push via `envify` or `env push`.
We'll keep the env file, but now it just contains secrets. The clear
values are passed directly to `docker run`.
Add an app with roles to the integration tests. We'll deploy two web
containers and one worker. The worker just sleeps, so we are testing
that the container has booted.
If curl is not available to download the docker install script, try
with wget instead.
If neither is available or both fail, return a simple failing script
so that we don't carry on regardless.
Fixes: https://github.com/basecamp/kamal/issues/395
This allows the user to make any necessary configuration changes to
Docker before setting up any containers, allowing those configuration
changes to take effect from the outset.
By default we keep 5 containers around for rollback. The containers
don't take much space, but the images for them can.
Make the number of containers to retain configurable, either in the
config with the `retain_containers` setting on the command line
with the `--retain` option.
1. Add Ruby 2.7 specific Gemfile that uses an older version of nokogiri
2. Rails edge doesn't support Ruby 2.7.0, so exclude it.
3. Add Ruby 3.3
4. Update Gemfile.lock to test against Rails 7.1.2 as it's the latest
version.
5. Remove continue-on-error from the matrix and always set to true
This is better then adding them together which confusingly results in
both ENV vars in the same file, though based on the load order, they
worked anyway.
This allows for more meaningful naming in roles.
The only caution here is that we don't support the renaming of roles, so
any migration is left to the user.
Provide pre and post reboot hooks for Traefik, that can be used to
remove/add to an external load balancer to prevent requests from being
sent during the reboot.
Works best with the --rolling setting, where each hook is called once
per host.
If the app container is down or not responding then traefik will return
a 404 response code. This is not ideal as it suggests a client rather
than a server problem.
To fix this, we'll define a catch all route that always returns a 502.
This is not ideal as this route would take priority over a shorter route
with priorty 1.
TODO: up the priority of the app route.
Calling `load_envs` again does not load updated env variables, because
Dotenv does not overwrite existing values.
To fix this we'll store the original ENV and reset to it before
reloading.
https://github.com/basecamp/kamal/issues/512
The "No git remote set" error message was appropriate for the previous
block (where it was presumably copy-pasted from), but in this line we
have failed the check that determines if we have a git branch checked
out, so we should output a corresponding error.
The env check is not needded anymore as all the commands rely on the
env files having already been created remotely.
The only place the env is needed is when running `kamal env push` and
that will still raise an apropriate error.
When calling `kamal app exec` for new non interactive containers, run
the command per role on each server and include the role config
including the environment.
Fixes: https://github.com/basecamp/kamal/issues/492
In lieu of a general purpose mechanism to pass dynamically-evaluated env-vars at container execution time, we can pass the `config.version` as KAMAL_VERSION to avoid having to take apart the container name just to determine the SHA of the deployed version in the entrypoint.
Adding -T to the copy command ensures that the files are copied at the
same level into the target directory whether it exists or not.
That allows us to drop the `/*` which was not picking up hidden files.
Fixes: https://github.com/basecamp/kamal/issues/465
Kamal needs images to have the service label so it can track them for
pruning. Images built by Kamal will have the label, but externally built
ones may not.
Without it images will build up over time. The worst case is an outage
if all the hosts disks fill up at the same time.
We'll add a check for the label and halt if it is not there.
If you always want to use a destination, and have a base deploy.yml file
that doesn't specify any hosts, then if you forget to specific the
destination you will get a cryptic error.
Add a "require_destination" setting you can use to avoid this.
An interrupted deployment can leave older containers lying around. To
ensure they are cleaned up subsequently, stop stale containers during
deployments instead of just reporting them.
During deployments both the old and new containers will be active for a
small period of time. There also may be lagging requests for older CSS
and JS after the deployment.
This can lead to 404s if a request for old assets hits a new container
or visa-versa.
This PR makes sure that both sets of assets are available throughout the
deployment from before the new version of the app is booted.
This can be configured by setting the asset path:
```yaml
asset_path: "/rails/public/assets"
```
The process is:
1. We extract the assets out of the container, with docker run, docker
cp, docker stop. Docker run sets the container command to "sleep" so
this needs to be available in the container.
2. We create an asset volume directory on the host for the new version
of the app on the host and copy the assets in there.
3. If there is a previous deployment we also copy the new assets into
its asset volume and copy the older assets into the new asset volume.
4. We start the new container mapping the asset volume over the top of
the container's asset path.
This means the both the old and new versions have replaced the asset
path with a volume containing both sets of assets and should be able
to serve any request during the deployment. The older assets will
continue to be available until the next deployment.
The go template was concatenating all the mounts into one line. It
happened to work because the mount we are interested was always first.
Fix it to output one mount per line instead.
When replacing a container currently we:
1. Boot the new container
2. Wait for it to become healthy
3. Stop the old container
Traefik will send requests to the old container until it notices that it
is unhealthy. But it may have stopped serving requests before that point
which can result in errors.
To get round that the new boot process is:
1. Create a directory with a single file on the host
2. Boot the new container, mounting the cord file into /tmp and
including a check for the file in the docker healthcheck
3. Wait for it to become healthy
4. Delete the healthcheck file ("cut the cord") for the old container
5. Wait for it to become unhealthy and give Traefik a couple of seconds
to notice
6. Stop the old container
The extra steps ensure that Traefik stops sending requests before the
old container is shutdown.
Setting env variables in the docker arguments requires having them on
the deploy host.
Instead we'll add two new commands `kamal env push` and
`kamal env delete` which will manage copying the environment as .env
files to the remote host.
Docker will pick up the file with `--env-file <path-to-file>`. Env files
will be stored under `<kamal run directory>/env`.
Running `kamal env push` will create env files for each role and
accessory, and traefik if required.
`kamal envify` has been updated to also push the env files.
By avoiding using `kamal envify` and creating the local and remote
secrets manually, you can now avoid accessing secrets needed
for the docker runtime environment locally. You will still need build
secrets.
One thing to note - the Docker doesn't parse the environment variables
in the env file, one result of this is that you can't specify multi-line
values - see https://github.com/moby/moby/issues/12997.
We maybe need to look docker config or docker secrets longer term to get
around this.
Hattip to @kevinmcconnell - this was all his idea.
To avoid polluting the default SSH directory with lots of Kamal config,
we'll default to putting them in a `kamal` sub directory.
But also make the directory configurable with the `run_directory` key,
so for example you can set it as `/var/run/kamal/`
The directory is created during bootstrap or before any command that
will need to access a file.
Adds the `publish` option which, if set to false, does not pass `--publish` to
`docker run` when starting Traefik. This is useful when running Traefik
behind a reverse proxy, for example.
ifoptions[:confirmed]||ask("This will remove all containers, images and data directories for #{name}. Are you sure?",limited_to:%w( y N ),default:"N")=="y"
with_accessory(name)do
stop(name)
remove_container(name)
remove_image(name)
remove_service_directory(name)
end
confirming"This will remove all containers, images and data directories for #{name}. Are you sure?"do
desc"deliver","Build app and push app image to registry then pull image on servers"
defdeliver
mutatingdo
push
pull
end
invoke:push
invoke:pull
end
desc"push","Build and push app image to registry"
option:output,type::string,default:"registry",banner:"export_type",desc:"Exported type for the build result, and may be any exported type supported by 'buildx --output'."
defpush
mutatingdo
cli=self
cli=self
verify_local_dependencies
run_hook"pre-build"
# Ensure pre-connect hooks run before the build, they may needed for a remote builder
desc"dev","Build using the working directory, tag it as dirty, and push to local image store."
option:output,type::string,default:"docker",banner:"export_type",desc:"Exported type for the build result, and may be any exported type supported by 'buildx --output'."
raiseKamal::Cli::Build::BuildError,"Clone in #{KAMAL.config.builder.build_directory} is not on the correct revision, expected `#{Kamal::Git.revision}` but got `#{revision}`"
end
rescueSSHKit::Command::Failed=>e
raiseKamal::Cli::Build::BuildError,"Failed to validate clone: #{e.message}"
raise"kamal-proxy version #{version} is too old, run `kamal proxy reboot` in order to update to at least #{Kamal::Configuration::Proxy::Boot::MINIMUM_VERSION}"
say"Running '#{cmd}' on #{hosts.join(', ')}...",:magenta
on(hosts)do|host|
execute*KAMAL.auditor.record("Executed cmd '#{cmd}' on #{host}"),verbosity::debug
puts_by_hosthost,capture_with_info(cmd)
end
end
end
ifmissing.any?
raise"Docker is not installed on #{missing.join(", ")} and can't be automatically installed without having root access and the `curl` command available. Install Docker manually: https://docs.docker.com/engine/install/"
raise"Docker is not installed on #{missing.join(", ")} and can't be automatically installed without having root access and either `wget` or `curl`. Install Docker manually: https://docs.docker.com/engine/install/"
raiseArgumentError,"No servers specified for the #{role.name} role"
ifprimary_role.hosts.empty?
raiseKamal::ConfigurationError,"No servers specified for the #{primary_role.name} primary_role"
end
unlessallow_empty_roles?
roles.eachdo|role|
ifrole.hosts.empty?
raiseKamal::ConfigurationError,"No servers specified for the #{role.name} role. You can ignore this with allow_empty_roles: true"
end
end
end
end
true
end
defensure_valid_service_name
raiseKamal::ConfigurationError,"Service name can only include alphanumeric characters, hyphens, and underscores"unlessraw_config[:service]=~/^[a-z0-9_-]+$/i
# The default registry is Docker Hub, but you can change it using `registry/server`.
#
# By default, Docker Hub creates public repositories. To avoid making your images public,
# set up a private repository before deploying, or change the default repository privacy
# settings to private in your [Docker Hub settings](https://hub.docker.com/repository-settings/default-privacy).
#
# A reference to a secret (in this case, `DOCKER_REGISTRY_TOKEN`) will look up the secret
# in the local environment:
registry:
server:registry.digitalocean.com
username:
- DOCKER_REGISTRY_TOKEN
password:
- DOCKER_REGISTRY_TOKEN
# Using AWS ECR as the container registry
#
# You will need to have the AWS CLI installed locally for this to work.
# AWS ECR’s access token is only valid for 12 hours. In order to avoid having to manually regenerate the token every time, you can use ERB in the `deploy.yml` file to shell out to the AWS CLI command and obtain the token:
registry:
server:<your aws account id>.dkr.ecr.<your aws region id>.amazonaws.com
username:AWS
password:<%= %x(aws ecr get-login-password) %>
# Using GCP Artifact Registry as the container registry
#
# To sign into Artifact Registry, you need to
# [create a service account](https://cloud.google.com/iam/docs/service-accounts-create#creating)
# and [set up roles and permissions](https://cloud.google.com/artifact-registry/docs/access-control#permissions).
# Normally, assigning the `roles/artifactregistry.writer` role should be sufficient.
#
# Once the service account is ready, you need to generate and download a JSON key and base64 encode it:
#
# ```shell
# base64 -i /path/to/key.json | tr -d "\\n"
# ```
#
# You'll then need to set the `KAMAL_REGISTRY_PASSWORD` secret to that value.
#
# Use the environment variable as the password along with `_json_key_base64` as the username.
# [SSHKit](https://github.com/capistrano/sshkit) is the SSH toolkit used by Kamal.
#
# The default, settings should be sufficient for most use cases, but
# when connecting to a large number of hosts, you may need to adjust.
# SSHKit options
#
# The options are specified under the sshkit key in the configuration file.
sshkit:
# Max concurrent starts
#
# Creating SSH connections concurrently can be an issue when deploying to many servers.
# By default, Kamal will limit concurrent connection starts to 30 at a time.
max_concurrent_starts:10
# Pool idle timeout
#
# Kamal sets a long idle timeout of 900 seconds on connections to try to avoid
# re-connection storms after an idle period, such as building an image or waiting for CI.
pool_idle_timeout:300
Some files were not shown because too many files have changed in this diff
Show More
Reference in New Issue
Block a user
Blocking a user prevents them from interacting with repositories, such as opening or commenting on pull requests or issues. Learn more about blocking a user.