167 Commits

Author SHA1 Message Date
Slavi Pantaleev
ca786cc343 Revert "Upgrade Synapse (1.31 -> 1.32)"
This reverts commit f825c7c263782638d352ada58dff1d6b64033f64.

Related to https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/1010
2021-04-20 23:40:55 +03:00
Slavi Pantaleev
f825c7c263 Upgrade Synapse (1.31 -> 1.32) 2021-04-20 17:47:34 +03:00
Slavi Pantaleev
3f426de599 Upgrade Synapse (1.30.1 -> 1.31.0) 2021-04-06 16:00:10 +03:00
Slavi Pantaleev
d09609daa8 Fix Jinja2 syntax error
Fixes a regression introduced in ffe649a2405a25
2021-03-22 17:13:10 +02:00
Slavi Pantaleev
ffe649a240 Update homeserver.yaml to keep up with Synapse v1.30.0
Related to https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/958
2021-03-22 16:43:10 +02:00
Slavi Pantaleev
f99dcd611f Pass proper UID/GID to Synapse
Fixes a regression caused by a5ee39266c29c6.

If the user id and group id were different than 991:991
(which used to be a hardcoded default for us long ago),
there was a mismatch between what Synapse was trying to use (991:991)
and what it was actually started with (in `--user=..`). It was then
trying to change ownership, which was failing.

This was mostly affecting newer installations which were not using the
991:991 defaults we had long ago (since a1c5a197a93d410).
2021-03-19 16:44:10 +02:00
Slavi Pantaleev
a5ee39266c Go through start.py when launching Synapse
This allows us to benefit from helpful things it does for us,
like enabling jemalloc: https://github.com/matrix-org/synapse/pull/8553

We weren't going through `start.py` before, because it was causing some
conflict with our `docker run --user=...` stuff, but it doesn't seem
to be a problem anymore.

Having done this, we won't need to do things like
https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/941
anymore.
2021-03-19 08:16:59 +02:00
Pablo Montepagano
52fe8a05b0 Adding vars to synapse for private servers. 2021-03-14 00:39:44 -03:00
Slavi Pantaleev
9b72384df7 Upgrade Synapse (1.28.0 -> 1.29.0) 2021-03-08 17:24:09 +02:00
Slavi Pantaleev
ae091d7b2d Upgrade Synapse (v1.27.0 -> v1.28.0) 2021-02-25 13:40:35 +02:00
Slavi Pantaleev
2ef1d9c537 Make healthchecks work for Synapse worker containers
Related to https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/456
2021-02-24 07:59:14 +02:00
Slavi Pantaleev
daae74b074 Merge branch 'master' into synapse-workers 2021-02-16 17:31:40 +02:00
Slavi Pantaleev
521160c12f Upgrade Synapse (v1.26.0 -> v1.27.0) 2021-02-16 17:30:48 +02:00
Slavi Pantaleev
43059bb040 Fix metrics listeners for Synapse workers
`::` leads to errors like:

> socket.gaierror: [Errno -9] Address family for hostname not supported
2021-02-15 11:19:07 +02:00
Slavi Pantaleev
5cfeae806b Merge branch 'master' into synapse-workers 2021-02-14 13:00:57 +02:00
Slavi Pantaleev
7e8e95a09a Make S3-mounting path configurable
This will make data migration easier.
2021-02-09 22:05:07 +02:00
Slavi Pantaleev
1cd2a218de Merge branch 'master' into synapse-workers 2021-01-27 21:41:54 +02:00
Slavi Pantaleev
c6feb0b99e Upgrade Synapse (v1.25.0 -> v1.26.0) 2021-01-27 21:41:47 +02:00
Slavi Pantaleev
d98a1ceadd Merge branch 'master' into synapse-workers 2021-01-27 10:27:17 +02:00
Slavi Pantaleev
512f42aa76 Do not report docker kill/rm attempts as errors
These are just defensive cleanup tasks that we run.
In the good case, there's nothing to kill or remove, so they trigger an
error like this:

> Error response from daemon: Cannot kill container: something: No such container: something

and:

> Error: No such container: something

People often ask us if this is a problem, so instead of always having to
answer with "no, this is to be expected", we'd rather eliminate it now
and make logs cleaner.

In the event that:
- a container is really stuck and needs cleanup using kill/rm
- and cleanup fails, and we fail to report it because of error
suppression (`2>/dev/null`)

.. we'd still get an error when launching ("container name already in use .."),
so it shouldn't be too hard to investigate.
2021-01-27 10:22:46 +02:00
Slavi Pantaleev
66cdc7bf5a Clean up worker.yaml generation a bit and make it more flexible 2021-01-25 13:02:01 +02:00
Slavi Pantaleev
1462409b34 Fix worker listening addresses
Not specifying bind addresses for the worker resulted in this warning:

> synapse.app - 47 - WARNING - None - Failed to listen on 0.0.0.0, continuing because listening on [::]

Additionally, metrics listening only on 127.0.0.1 seems like a no-op.
Only having it accessible from within the container is likely not what
we intend. Changed that to all interfaces as well.

Whether it actually gets exposed or not depends on the systemd service
and `matrix_synapse_workers_container_host_bind_address`.
2021-01-25 12:29:47 +02:00
Slavi Pantaleev
01747c8cc4 Prevent Synapse warning about enabling metric listeners with enable_metrics: false
> synapse.app.generic_worker - 606 - WARNING - None - Metrics listener configured, but enable_metrics is not True!
2021-01-25 12:24:12 +02:00
Slavi Pantaleev
70796703d3 Run Synapse workers in their own containers
This switches the `docker exec` method of spawning
Synapse workers inside the `matrix-synapse` container with
dedicated containers for each worker.

We also have dedicated systemd services for each worker,
so this are now:
- more consistent with everything else (we don't use systemd
instantiated services anywhere)
- we don't need the "parse systemd instance name into worker name +
port" part
- we don't need to keep track of PIDs manually
- we don't need jq (less depenendencies)
- workers dying would be restarted by systemd correctly, like any other
service
- `docker ps` shows each worker separately and we can observe resource
usage
2021-01-25 12:14:46 +02:00
Slavi Pantaleev
63301b0ef1 Improvements around Synapse worker/metrics ports exposure
There was a `matrix_nginx_proxy_enabled|default(False)` check, but:
- it didn't seem to work reliably for some reason (hmm)
- referring to a `matrix_nginx_proxy_*` variable from within the
  `matrix-synapse` role is not ideal
- exposing always happened on `127.0.0.1`, which may not be good enough
  for some rarer setups (where the own webserver is external to the host)
2021-01-25 08:25:43 +02:00
Marcel Partap
edc21f15e5 Restrict publishing worker (metrics) ports to localhost 2021-01-24 08:53:09 +01:00
Marcel Partap
183adec3d8 Merge remote-tracking branch 'origin/master' into synapse-workers 2021-01-23 15:04:11 +01:00
Marcel Partap
f2c7d79238 Drop probably incorrect comment from synapse homeserver.yaml.j2 2021-01-23 14:44:36 +01:00
Slavi Pantaleev
1692a28fe4 Work around annoying Docker warning about undefined $HOME
> WARNING: Error loading config file: .dockercfg: $HOME is not defined

.. which appeared in Docker 20.10.
2021-01-15 00:23:01 +02:00
Slavi Pantaleev
d5945c6e78 Upgrade Synapse (v1.24.0 -> v1.25.0) for amd64 2021-01-13 13:02:49 +02:00
Marcel Partap
cd8100544b Merge remote-tracking branch 'origin/master' into synapse-workers
Sync with upstream
2021-01-08 20:58:50 +01:00
Slavi Pantaleev
d08b27784f Fix systemd services autostart problem with Docker 20.10
The Docker 19.04 -> 20.10 upgrade contains the following change
in `/usr/lib/systemd/system/docker.service`:

```
-BindsTo=containerd.service
-After=network-online.target firewalld.service containerd.service
+After=network-online.target firewalld.service containerd.service multi-user.target
-Requires=docker.socket
+Requires=docker.socket containerd.service
Wants=network-online.target
```

The `multi-user.target` requirement in `After` seems to be in conflict
with our `WantedBy=multi-user.target` and `After=docker.service` /
`Requires=docker.service` definitions, causing the following error on
startup for all of our systemd services:

> Job matrix-synapse.service/start deleted to break ordering cycle starting with multi-user.target/start

A workaround which appears to work is to add `DefaultDependencies=no`
to all of our services.
2020-12-10 11:43:20 +02:00
Slavi Pantaleev
aa86e0dac6 Upgrade Synapse (v1.23.0 -> v1.24.0)
Because the ARM images are not pushed yet, we hold back to v1.23.0
for now.
2020-12-09 13:31:10 +02:00
Slavi Pantaleev
c07c927d9f Automatically enable openid listeners when ma1sd enabled
ma1sd requires the openid endpoints for certain functionality.
Example: 90b2b5301c/src/main/java/io/kamax/mxisd/auth/AccountManager.java (L67-L99)

If federation is disabled, we still need to expose these openid APIs on the
federation port.

Previously, we were doing similar magic for Dimension.
As per its documentation, when running unfederated, one is to enable
the openid listener as well. As per their recommendation, people
are advised to do enable it on the Client-Server API port
and use the `federationUrl` variable to override where the federation
port is (making federation requests go to the Client-Server API).

Because ma1sd always uses the federation port (unless you do some
DNS overwriting magic using its configuration -- which we'd rather not
do), it's better if we just default to putting the `openid` listener
where it belongs - on the federation port.

With this commit, we retain the "automatically enable openid APIs" thing
we've been doing for Dimension, but move it to the federation port instead.
We also now do the same thing when ma1sd is enabled.
2020-12-08 16:59:20 +02:00
Marcel Partap
e892ac464f synapse workers: untangle config template and specify bind address
.. to mitigate log noise - WARNING:
Failed to listen on 0.0.0.0, continuing because listening on [::]
2020-12-01 23:49:23 +01:00
Marcel Partap
f201bca519 synapse workers: define and expose METRICS port for each worker
As seen on TV:
https://github.com/matrix-org/synapse/blob/master/docs/metrics-howto.md#monitoring-workers
2020-12-01 22:49:15 +01:00
Marcel Partap
b73ac965ac Merge remote-tracking branch 'origin/master' into synapse-workers 2020-12-01 21:24:26 +01:00
Slavi Pantaleev
1fca917ad1 Replace some -v instances with --mount
`-v` magically creates the source destination as a directory,
if it doesn't exist already. We'd like to avoid this magic
and the potential breakage that it might cause.

We'd rather fail while Docker tries to find things to `--mount`
than have it automatically create directories and fail anyway,
while having contaminated the filesystem.

There's a lot more `-v` instances remaining to be fixed later on.
This is just some start.

Things like `matrix_synapse_container_additional_volumes` and
`matrix_nginx_proxy_container_additional_volumes` were not changed to
use `--mount`, as options for each one are passed differently
(`ro` is `ro`, but `rw` doesn't exist and `slave` is `bind-propagation=slave`).
To avoid breaking people's custom volume mounts, we keep it as it is for now.

A deficiency with `--mount` is that it lacks the `z` option (SELinux
ownership changes), and some of our `-v` instances use that. I'm not
sure how supported SELinux is for us right now, but it might be,
and breaking that would not be a good idea.
2020-11-24 10:26:05 +02:00
Slavi Pantaleev
b627d93cdc Update homeserver.yaml to keep up with Synapse v1.23.0
Related to #724 (Github Pull Request)
2020-11-18 16:57:50 +02:00
Marcel Partap
2d1b9f2dbf synapse workers: reworkings + get endpoints from upstream docs via awk
(yes, a bit awkward and brittle… xD)
2020-10-28 07:13:19 +01:00
Marcel Partap
87bd64ce9e Merge remote-tracking branch 'origin/master' into synapse-workers 2020-10-23 23:45:07 +02:00
Marcel Partap
a4125d5446 synapse workers: polishing, cleansing and installation of jq dependency 2020-10-23 20:49:53 +02:00
Marcel Partap
501efee07e synapse workers: supply systemd with actual worker PIDs (requires jq)
also, worker.yaml.j2:
  - hone worker_name
  - remove worker_pid_file entry (would only be used if worker_daemonize
    set to true; also, synapse only knows about the container namespace
    and thus can not provide the required host-view PID)
2020-10-22 20:53:41 +02:00
Aaron Raimist
78529cbd47
Upgrade Synapse (v1.20.1 -> v1.21.0) 2020-10-12 23:59:34 -05:00
Marcel Partap
d2e61af224 Add worker_name to synapse worker config template
& restrict federation listener; frontend_proxy / user_dir don't need it
2020-10-11 21:52:08 +02:00
Marcel Partap
e9241f5fb9 Improve synapse-workers systemd service template
Is the PID magic gonna work? or will it need an ExecStartPost hack..
2020-10-11 21:09:19 +02:00
Marcel Partap
40024e9b81 Prevent workers failing if their config doesn't exist
- cherry-pick "Ensure worker config exists in systemd service (#7528)"
  from synapse d74cdc1a42e8b487d74c214b1d0ca575429d546a:
  "check that the worker config file exists instead of silently failing."
2020-10-11 21:09:19 +02:00
Marcel Partap
93a8ea7e4a Merge remote-tracking branch 'master' into feature/add-worker-support 2020-10-11 20:59:05 +02:00
Slavi Pantaleev
dd217137b6 Upgrade Synapse (v1.19.3 -> v1.20.0) 2020-09-22 19:28:07 +03:00
Max Klenk
4fdfc0a34f
add missing ratelimiting options required for load testing 2020-09-11 09:46:20 +02:00