When the Postgres role updates database passwords (e.g., due to a
change in the secret derivation method), the Matrix Authentication
Service container may still be running with old configuration that
references the previous password. This causes mas-cli to fail with
"password authentication failed" when the matrix-user-creator role
tries to register users.
Rather than adding config-change detection or eager restarts to the
MAS role, this adds targeted retry logic: if the initial registration
attempt fails with a database authentication error, restart the MAS
service (which picks up the new config with the updated password),
wait for it to start, and retry. The restart usually only triggers
once per run since subsequent user registrations succeed after the restart.
Related to c21a80d232
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
Replace password_hash('sha512', rounds=655555) with hash('sha512')
for all 114 secret derivations in group_vars/matrix_servers.
The old method (655k rounds of SHA-512) was designed for protecting
low-entropy human passwords in /etc/shadow. For deriving secrets
from a high-entropy secret key, a single hash round is equally
secure - the security comes from the key's entropy, not the
computational cost. SHA-512 remains preimage-resistant regardless
of rounds.
This yields a major performance improvement: evaluating
postgres_managed_databases (which references multiple derived
database passwords) dropped from ~10.7s to ~0.6s on a fast mini
PC. The Postgres role evaluates this variable multiple times, and
other roles reference derived passwords too, so the cumulative
savings across a full playbook run are substantial.
All derived service passwords (database passwords, appservice
tokens, etc.) will change on the next run. The main/superuser
database password is not affected (it's hardcoded in inventory
variables). All services receive their new passwords in the same
run, so this should be seamless.
Co-Authored-By: Claude Opus 4.6 <noreply@anthropic.com>
* Fetch ansible-role-ddclient from MASH project
Signed-off-by: Suguru Hirahara <did:key:z6MkvVZk1A3KBApWJXv2Ju4H14ErDfRGxh8zxdXSZ4vACDg5>
* Replace `matrix_dynamic_dns` with `ddclient`
Signed-off-by: Suguru Hirahara <did:key:z6MkvVZk1A3KBApWJXv2Ju4H14ErDfRGxh8zxdXSZ4vACDg5>
* Set `matrix-dynamic-dns` to `ddclient_identifier`
Signed-off-by: Suguru Hirahara <did:key:z6MkvVZk1A3KBApWJXv2Ju4H14ErDfRGxh8zxdXSZ4vACDg5>
* Remove `ddclient_container_network` in favor of the role's configuration
On the role the value of `ddclient_container_network` is set to `ddclient_identifier`, which is set to `matrix-dynamic-dns` on the playbook.
Signed-off-by: Suguru Hirahara <did:key:z6MkvVZk1A3KBApWJXv2Ju4H14ErDfRGxh8zxdXSZ4vACDg5>
* Replace `matrix-dynamic-dns` with `ddclient` on matrix_servers
Signed-off-by: Suguru Hirahara <did:key:z6MkvVZk1A3KBApWJXv2Ju4H14ErDfRGxh8zxdXSZ4vACDg5>
* Replace `ddclient_docker_image_*` with `ddclient_container_image_*`
Signed-off-by: Suguru Hirahara <did:key:z6MkvVZk1A3KBApWJXv2Ju4H14ErDfRGxh8zxdXSZ4vACDg5>
* Update `ddclient_container_image_*`
Signed-off-by: Suguru Hirahara <did:key:z6MkvVZk1A3KBApWJXv2Ju4H14ErDfRGxh8zxdXSZ4vACDg5>
* Move `ddclient_base_path` to matrix_servers
Signed-off-by: Suguru Hirahara <did:key:z6MkvVZk1A3KBApWJXv2Ju4H14ErDfRGxh8zxdXSZ4vACDg5>
* Move `ddclient_web_*` to matrix_servers
Signed-off-by: Suguru Hirahara <did:key:z6MkvVZk1A3KBApWJXv2Ju4H14ErDfRGxh8zxdXSZ4vACDg5>
* Remove `matrix-dynamic-dns` directory
Signed-off-by: Suguru Hirahara <did:key:z6MkvVZk1A3KBApWJXv2Ju4H14ErDfRGxh8zxdXSZ4vACDg5>
* Update configuring-playbook-dynamic-dns.md
Reuse 75e264f538/docs/services/ddclient.md
Signed-off-by: Suguru Hirahara <did:key:z6MkvVZk1A3KBApWJXv2Ju4H14ErDfRGxh8zxdXSZ4vACDg5>
* Fix a typo
Signed-off-by: Suguru Hirahara <did:key:z6MkvVZk1A3KBApWJXv2Ju4H14ErDfRGxh8zxdXSZ4vACDg5>
---------
Signed-off-by: Suguru Hirahara <did:key:z6MkvVZk1A3KBApWJXv2Ju4H14ErDfRGxh8zxdXSZ4vACDg5>
Co-authored-by: Suguru Hirahara <did:key:z6MkvVZk1A3KBApWJXv2Ju4H14ErDfRGxh8zxdXSZ4vACDg5>
The cli-non-interactive script passes arguments directly to psql, which
interprets positional arguments as database names, not SQL commands.
Without the -c flag, commands like:
/matrix/postgres/bin/cli-non-interactive 'DROP DATABASE foo;'
fail with: FATAL: database "DROP DATABASE foo;" does not exist
The correct syntax requires -c to pass a command:
/matrix/postgres/bin/cli-non-interactive -c 'DROP DATABASE foo;'
This mistake was originally introduced in c399992542
when the matrix-bridge-mautrix-hangouts role was removed. That commit's
uninstallation docs were then used as a template and the error propagated
to subsequent removal documentation for other bridges and components.
The whoami-based approach is now the only implementation for sync worker routing.
It works with all token types (native Synapse, MAS, etc.) and is automatically
enabled when sync workers exist.
The old map-based approach only worked with native Synapse tokens (syt_<b64>_...)
and would give poor results with MAS or other auth systems.
This adds a new routing mechanism for sync workers that resolves access tokens
to usernames via Synapse's whoami endpoint, enabling true user-level sticky
routing regardless of which device or token is used.
Previously, sticky routing relied on parsing the username from native Synapse
tokens (`syt_<base64 username>_...`), which only works with native Synapse auth
and provides device-level stickiness at best. This new approach works with any
auth system (native Synapse, MAS, etc.) because Synapse handles token validation
internally.
Implementation uses nginx's auth_request module with an njs script because:
- The whoami lookup requires an async HTTP subrequest (ngx.fetch)
- js_set handlers must return synchronously and don't support async operations
- auth_request allows the async lookup to complete, then captures the result
via response headers into nginx variables
The njs script:
- Extracts access tokens from Authorization header or query parameter
- Calls Synapse's whoami endpoint to resolve token -> username
- Caches results in a shared memory zone to minimize latency
- Returns the username via a `X-User-Identifier` header
The username is then used by nginx's upstream hash directive for consistent
worker selection. This leverages nginx's built-in health checking and failover.
- Upgrade from 1.0.1 to 4.0.0
- Add ircService.mediaProxy configuration for authenticated Matrix media
- Add Traefik integration for media proxy endpoint
- Generate signing key for authenticated media
Closes#3512
Co-authored-by: Jade Ellis <jade@ellis.link>
Co-authored-by: Slavi Pantaleev <slavi@devture.com>