This is based on the PR (https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/3241)
by Tobias Diez (https://github.com/tobiasdiez).
I've refactored some parts, made it more configurable, polished it up,
and it's integrated into the playbook now.
Both the WeChat bridge and WeChat agent appear to be working.
The WeChat bridge joins rooms and responds as expected.
That said, end-to-end testing (actually bridging to a WeChat account) has not been done yet.
Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/701
Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/3092
This is sponsored https://etke.cc/ work related to https://gitlab.com/etke.cc/ansible/-/issues/2
Squashed commit of the following:
commit fdd37f02472a0b83d61b4fac80650442f90e7629
Author: Slavi Pantaleev <slavi@devture.com>
Date: Mon Jun 3 21:05:53 2024 +0300
Add documentation for WeChat bridge
commit 8426fc8b95bb160ea7f9659bd45bc59cf1326614
Author: Slavi Pantaleev <slavi@devture.com>
Date: Mon Jun 3 20:59:42 2024 +0300
Rename directory for matrix_wechat_agent_container_src_files_path
commit da200df82bbc9153d307095dd90e4769c400ea1e
Author: Slavi Pantaleev <slavi@devture.com>
Date: Mon Jun 3 20:58:26 2024 +0300
Make WeChat listen_secret configurable and auto-configured via matrix_homeserver_generic_secret_key
commit 4022cb1355828ac16af7d9228cb1066962bb35f5
Author: Slavi Pantaleev <slavi@devture.com>
Date: Mon Jun 3 20:54:56 2024 +0300
Refactor install.yml for WeChat a bit (using blocks, etc.)
commit d07a39b4c4f6b93d04204e13e384086d5a242d52
Author: Slavi Pantaleev <slavi@devture.com>
Date: Mon Jun 3 20:52:35 2024 +0300
Rename WeChat Agent configuration file
This makes it more clear that it belongs to the agent.
Otherwise, `config.yaml` and `configure.yaml` make you wonder.
commit ccca72f8d1e602f7c42f4bd552193afa153c9b9d
Author: Slavi Pantaleev <slavi@devture.com>
Date: Mon Jun 3 20:49:06 2024 +0300
Move WeChat agent configuration to a template
commit a4047d94d8877b4095712dfc76ac3082a1edca28
Author: Slavi Pantaleev <slavi@devture.com>
Date: Mon Jun 3 20:47:17 2024 +0300
Mount WeChat config as readonly and instruct bridge to not update it
commit bc0e89f345bf14bbdbfd574bb60d93918c2ac053
Author: Slavi Pantaleev <slavi@devture.com>
Date: Mon Jun 3 20:46:33 2024 +0300
Sync WeChat config with upstream
Brings up-to-date with:
https://github.com/duo/matrix-wechat/commits/0.2.4/example-config.yaml
commit a46f5b9cbc8bf16042685a18c77d25a606bc8232
Author: Slavi Pantaleev <slavi@devture.com>
Date: Mon Jun 3 19:48:17 2024 +0300
Rename some files
commit 3877679040cffc4ca6cccfa21a7335f8f796f06e
Author: Slavi Pantaleev <slavi@devture.com>
Date: Mon Jun 3 19:47:10 2024 +0300
Update WeChat logging config
This brings it up-to-date with what mautrix-go uses.
Otherwise, on startup we see:
> Migrating legacy log config
.. and it gets migrated to what we've done here.
commit e3e95ab234651867c7a975a08455549b31db4172
Author: Slavi Pantaleev <slavi@devture.com>
Date: Mon Jun 3 19:43:37 2024 +0300
Make sure matrix-wechat-agent runs as 1000:1000
It needs to write stuff to `/home/user/.vnc`.
`/home/user` is owned by `user:group` (`1000:1000`), so it cannot run
any other way.
Previously, if the `matrix` user was uid=1000 by chance, it would work,
but that's pure luck.
commit 4d5748ae9b84c81d6b48b0a41b790339d9ac4724
Author: Slavi Pantaleev <slavi@devture.com>
Date: Mon Jun 3 18:57:09 2024 +0300
Pin wechat and wechat-agent versions
commit 40d40009f19ebceed4126146cbb510a2c95af671
Author: Slavi Pantaleev <slavi@devture.com>
Date: Mon Jun 3 18:53:58 2024 +0300
docker_image -> container_image for WeChat bridge
commit cc33aff592541913070d13288d17b04ed6243176
Author: Slavi Pantaleev <slavi@devture.com>
Date: Mon Jun 3 18:00:25 2024 +0300
docker_src -> container_src in WeChat bridge
commit 42e6ae9a6483c8ca6d53b8052058d41d90d93797
Author: Slavi Pantaleev <slavi@devture.com>
Date: Mon Jun 3 17:54:24 2024 +0300
matrix_go_wechat_ -> matrix_wechat_
The bridge is written in Go, but does not include Go anywhere in its
name. As such, it's mostly useless to use `matrix_go_wechat` as the
prefix.
commit d6662a69d1916d215d5184320c36d2ef73afd3e9
Author: Tobias Diez <code@tobiasdiez.de>
Date: Mon Mar 25 10:55:16 2024 +0800
Add wechat bridge
In the process of writing the Draupnir for all role documentation it was forgotten that Draupnir needs to have the ability to write to the main management room policy list that controls who can access the bot. This flaw was overlooked during development as naturally without thinking the bot had these powers.
Upstream Docs had this exact bug also and the author of this commit will have to go and fix upstream docs also to resolve this bug.
* Draupnir for all Role
* Draupnir for all Documentation
* Pin D4A to Develop until D4A patches are in a release.
* Update D4A Docs to mention pros and cons of D4A mode compared to normal
* Change Documentation to mention a fixed simpler provisioning flow.
Use of /plain allows us to bypass the bugs encountered during the development of this role with clients attempting to escape our wildcards causing the grief that led to using curl.
This reworded commit does still explain you can automatically inject stuff into the room if you wanted to.
* Emphasise the State of D4A mode
* Link to Draupnir-for-all docs and tweak the docs some
* Link to Draupnir-for-all from Draupnir documentation page
* Announce Draupnir-for-all
---------
Co-authored-by: Slavi Pantaleev <slavi@devture.com>
* feat: auto-accept-invite module and docs
* fix: name typos and some forgot to adjust variables
* fix: accept only direct messages should work now and better wording
* changed: only_direct_messages variable naming
* feat: add logger, add synapse workers config
* Fix typo and add details about synapse-auto-acccept-invite
* Add newline at end of file
* Fix alignment
* Fix logger name for synapse_auto_accept_invite
The name of the logger needs to match the name of the Python module.
Ref: d673c67678/synapse_auto_accept_invite/__init__.py (L20)
* Add missing document start YAML annotation
* Remove trailing spaces
---------
Co-authored-by: Slavi Pantaleev <slavi@devture.com>
I've just tested Rocky Linux v9 and it seems to work.
I suppose the Docker situation
(https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/300)
on RHEL v8 has improved, so it probably works too.
I see no reason AlmaLinux and other RHEL derivatives wouldn't work,
but I have neither tested them, nor have confirmation from others about
it.
It's mostly a matter of us being able to install:
- Docker, via https://github.com/geerlingguy/ansible-role-docker which
seems to support various distros
- a few other packages (systemd-timesyncd, etc).
The list of supported distros has been reordered alphabetically.
I've heard reports of SUSE Linux working well too, so it may also be added
if confirmed again.
Closes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/300
Fixup for https://github.com/spantaleev/matrix-docker-ansible-deploy/pull/3017
This reverts 1cd82cf068 and also multiplies results by `1024`
so as to pass bytes to Synapse, not KB (as done before).
1cd82cf068 was correctly documenting what we were doing (passing KB values),
but that's incorrect.
Synapse's Config Conventions
(https://matrix-org.github.io/synapse/latest/usage/configuration/config_documentation.html#config-conventions)
are supposed to clear it up, but they don't currently state what happens when you pass a plain number (without a unit suffix).
Thankfully, the source code tells us:
bc1db16086/synapse/config/_base.py (L181-L206)
> If an integer is provided it is treated as bytes and is unchanged.
>
> String byte sizes can have a suffix of ...
> No suffix is understood as a plain byte count.
We were previously passing strings, but that has been improved in 3d73ec887a.
Regardless, non-suffixed values seem to be treated as bytes by Synapse,
so this patch changes the variables to use bytes.
Moreover, we're moving from `matrix_synapse_memtotal_kb` to
`matrix_synapse_cache_size_calculations_memtotal_bytes` as working with
the base unit everywhere is preferrable.
Here, we also introduce 2 new variables to allow for the caps to be
tweaked:
- `matrix_synapse_cache_size_calculations_max_cache_memory_usage_cap_bytes`
- `matrix_synapse_cache_size_calculations_target_cache_memory_usage_cap_bytes`
* Modify Synapse Cache Factor to use Auto Tune
Synapse has the ability to as it calls in its config auto tune caches.
This ability lets us set very high cache factors and then instead limit our resource use.
Defaults for this commit are 1/10th of what Element apparently runs for EMS stuff and matrix.org on Cache Factor and upstream documentation defaults for auto tune.
* Add vars to Synapse main.yml to control cache related config
This commit adds various cache related vars to main.yml for Synapse.
Some are auto tune and some are just adding explicit ways to control upstream vars.
* Updated Auto Tune figures
Autotuned figures have been bumped in consultation with other community members as to a reasonable level. Please note these defaults are more on the one of each workers side than they are on the monolith Side.
* Fix YML Error
The playbook is not happy with the previous state of this patch so this commit hopefully fixes it
* Add to_json to various Synapse tuning related configs
* Fix incorrect indication in homeserver.yaml.j2
* Minor cleanups
* Synapse Cache Autotuning Documentation
* Upgrade Synapse Cache Autotune to auto configure memory use
* Update Synapse Tuning docs to reflect automatic memory use configuration
* Fix Linting errors in synapses main.yml
* Rename variables for consistency (matrix_synapse_caches_autotuning_* -> matrix_synapse_cache_autotuning_*)
* Remove FIX ME comment about Synapse's `cache_autotuning`
`docs/maintenance-synapse.md` and `roles/custom/matrix-synapse/defaults/main.yml`
already contains documentation about these variables and the default values we set.
* Improve "Tuning caches and cache autotuning" documentation for Synapse
* Announce larger Synapse caches and cache auto-tuning
---------
Co-authored-by: Slavi Pantaleev <slavi@devture.com>
Adds a Draupnir mention to the list and as for why we pull from Gnuxie its because that is the official source of docker images as Draupnir used to be Gnuxie/Draupnir before it moved to The Draupnir Project.
The path rule was not working because for federation fo work it needs several endpoints.
Two of them are not under /_matrix/federation :
- /_matrix/key
- /_matrix/media
* Update configuring-playbook-traefik.md
Added docu on how to host another server behind traefik.
* Added MASH and docker options
Added the link to mash and the compatibility adjustments.
Mentioned the prefered method with docker containers.
Some rephrasing to make clear, the intended guide ios for reverse proxying non-docker services.
* Improve wording in configuring-playbook-traefik.md
---------
Co-authored-by: Slavi Pantaleev <slavi@devture.com>
After some checking, it seems like there's `/_synapse/client/oidc`,
but no such thing as `/_synapse/oidc`.
I'm not sure why we've been reverse-proxying these paths for so long
(even in as far back as the `matrix-nginx-proxy` days), but it's time we
put a stop to it.
The OIDC docs have been simplified. There's no need to ask people to
expose the useless `/_synapse/oidc` endpoint. OIDC requires
`/_synapse/client/oidc` and `/_synapse/client` is exposed by default
already.
Issues and Pull Requests were not migrated to the new
organization/repository, so `matrix-org/synapse/pull` and
`matrix-org/synapse/issues` references were kept as-is.
`matrix-org/synapse-s3-storage-provider` references were also kept,
as that module still continues living under the `matrix-org` organization.
This patch mainly aims to change documentation-related things, not actual
usage in full yet. For polish that, another more comprehensive patch is coming later.
The old variables still work. The global lets us avoid
auto-detection logic like we're currently doing for
`matrix_nginx_proxy_proxy_matrix_federation_api_enabled`.
In the future, we'd just be able to reference
`matrix_homeserver_federation_enabled` and know the up-to-date value
regardless of homeserver.
This was meant to serve as an intermediary for services needing to reach
the homeserver. It was used like that for a while in this
`bye-bye-nginx-proxy` branch, but was never actually public.
It has recently been superseded by homeserver-like services injecting
themselves into a new internal Traefik entrypoint
(see `matrix_playbook_internal_matrix_client_api_traefik_entrypoint_*`),
so `matrix-homeserver-proxy` is no longer necessary.
---
This is probably a good moment to share some benchmarks and reasons
for going with the internal Traefik entrypoint as opposed to this nginx
service.
1. (1400 rps) Directly to Synapse (`ab -n 1000 -c 100 http://matrix-synapse:8008/_matrix/client/versions`
2. (~900 rps) Via `matrix-homeserver-proxy` (nginx) proxying to Synapse (`ab -n 1000 -c 100 http://matrix-homeserver-proxy:8008/_matrix/client/versions`)
3. (~1200 rps) Via the new internal entrypoint of Traefik (`matrix-internal-matrix-client-api`) proxying to Synapse (`ab -n 1000 -c 100 http://matrix-traefik:8008/_matrix/client/versions`)
Besides Traefik being quicker for some reason, there are also other
benefits to not having this `matrix-homeserver-proxy` component:
- we can reuse what we have in terms of labels. Services can register a few extra labels on the new Traefik entrypoint
- we don't need services (like `matrix-media-repo`) to inject custom nginx configs into `matrix-homeserver-proxy`. They just need to register labels, like they do already.
- Traefik seems faster than nginx on this benchmark for some reason, which is a nice bonus
- no need to run one extra container (`matrix-homeserver-proxy`) and execute one extra Ansible role
- no need to maintain a setup where some people run the `matrix-homeserver-proxy` component (because they have route-stealing services like `matrix-media-repo` enabled) and others run an optimized setup without this component and everything needs to be rewired to talk to the homeserver directly. Now, everyone can go through Traefik and we can all run an identical setup
Downsides of the new Traefik entrypoint setup are that:
- all addon services that need to talk to the homeserver now depend on Traefik
- people running their own Traefik setup will be inconvenienced - they
need to manage one additional entrypoint
We'd be adding integration with an internal Traefik entrypoint
(`matrix_playbook_internal_matrix_client_api_traefik_entrypoint`),
so renaming helps disambiguate things.
There's no need for deperecation tasks, because the old names
have only been part of this `bye-bye-nginx-proxy` branch and not used by
anyone publicly.
This reverts commit bf95ad2235.
This was a bad idea.
It's better to have people manually define the password.
Otherwise, `matrix_homeserver_generic_secret_key` changing some day in
the future would break the bot and one would have to figure out how to
reset its password manually.
Using an explicit password is more stable.
This also updates validation tasks and documentation, pointing to
variables in the matrix-synapse role which don't currently exist yet
(e.g. `matrix_synapse_container_labels_client_synapse_admin_api_enabled`).
These variables will be added soon, as Traefik labels are added to the
`matrix-synapse` role. At that point, the `matrix-synapse-reverse-proxy-companion` role
will be updated to also use them.
matrix-nginx-proxy is going away and this is one of the features it
offered.
This feature will have no equivalent in our new Traefik-only
setup, although it's possible to implement it manually by using
`matrix_client_element_container_labels_additional_labels`
In nginx reverse-proxy, when the upstream server relies on SNI, the reverser-proxy may return 502 by follow error:
```
*10 SSL_do_handshake() failed (SSL: error:0A000410:SSL routines::sslv3 alert handshake failure:SSL alert number 40) while SSL handshaking to upstream, client: 172.19.0.1, server: example.host, request: "GET /.well-known/matrix/client HTTP/2.0", upstream: "https://<ip>/.well-known/matrix/client", host: "<domain>"
```
This problem often arises when the upstream server is behind the CDN, setting `proxy_ssl_server_name` to `on` will solve it.