People often report and ask about these "failures".
More-so previously, when the `docker kill/rm` output was collected,
but it still happens now when people do `systemctl status
matrix-something` and notice that it says "FAILURE".
Suppressing to avoid further time being wasted on saying "this is
expected".
Reverts b1b4ba501fdfaa, 90c9801c560b6, a3c84f78ca9c65a, ..
I haven't really traced it (yet), but on some servers, I'm observing
`ansible-playbook ... --tags=start` completing very slowly, waiting
to stop services. I can't reproduce this on all Matrix servers I manage.
I suspect that either the systemd version is to blame or that some
specific service is not responding well to some `docker kill/rm` command.
`ExecStop` seems to work great in all cases and it's what we've been
using for a very long time, so I'm reverting to that.
Until now, we were leaving services "enabled"
(symlinks in /etc/systemd/system/multi-user.target.wants/).
We clean these up now. Broken symlinks may still exist in older
installations that enabled/disabled services. We're not taking care
to fix these up. It's just a cosmetic defect anyway.
In short, the problem is that older Postgres versions store passwords
hashed as md5. When you dump such a database, the dump naturally also
contains md5-hashed passwords.
Restoring from that dump used to create users and updates their passwords
with these md5 hashes.
However, Postgres v14 prefers does not like md5-hashed passwords now (by default),
which breaks connectivity. Postgres v14 prefers `scram-sha-256` for
authentication.
Our solution is to just ignore setting passwords (`ALTER ROLE ..`
statements) when restoring dumps. We don't need to set passwords as
defined in the dump anyway, because the playbook creates users
and manages their passwords by itself.
Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/1340
It's mildly annoying when trying to execute these scripts while logged
in as a regular user, as the missing execute permissions will hinder
autocompletion even when trying to use with sudo.
These shell scripts don't contain secrets, but may fail when ran by a
regular user. The failure is due to the lack of access to the /matrix
directory, and does not result in any damage.
This passes any arguments given to 'matrix-postgres-cli' to the 'psql' command.
Examples:
$ # start an interactive shell connected to a given db
$ sudo matrix-postgres-cli -d synapse
$ # run a query, non-interactively
$ sudo matrix-postgres-cli -d synapse -c 'SELECT group_id FROM groups;'
These are just defensive cleanup tasks that we run.
In the good case, there's nothing to kill or remove, so they trigger an
error like this:
> Error response from daemon: Cannot kill container: something: No such container: something
and:
> Error: No such container: something
People often ask us if this is a problem, so instead of always having to
answer with "no, this is to be expected", we'd rather eliminate it now
and make logs cleaner.
In the event that:
- a container is really stuck and needs cleanup using kill/rm
- and cleanup fails, and we fail to report it because of error
suppression (`2>/dev/null`)
.. we'd still get an error when launching ("container name already in use .."),
so it shouldn't be too hard to investigate.
Fixes a regression introduced in 95346f3117f2a3a67a52.
Fixes https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/814
Using `matrix_synapse_` variables in the `matrix-postgres` role is not
ideal, but.. this script belongs neither here, nor there.
We'll have it be like that for now.
In short, this makes Synapse a 2nd class citizen,
preparing for a future where it's just one-of-many homeserver software
options.
We also no longer have a default Postgres superuser password,
which improves security.
The changelog explains more as to why this was done
and how to proceed from here.
We need to suppress systemd service-stopping requests in certain rare
cases like https://github.com/spantaleev/matrix-docker-ansible-deploy/issues/771
That issue seems to describe a case, where a migration from mxisd to
ma1sd was happening (DB files had just been moved), and then we were
attemping to stop `matrix-ma1sd.service` so we could import that database into
Postgres. However, there's neither `matrix-mxisd.service`, nor
`matrix-ma1sd.service` after `migrate_mxisd.yml` had just run, so
stopping `matrix-ma1sd.service` was failing.