-
-
Notifications
You must be signed in to change notification settings - Fork 1.1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
postgres: create databases for all services #740
postgres: create databases for all services #740
Conversation
31d2f2c
to
fc66ab3
Compare
@@ -162,3 +162,33 @@ | |||
- matrix-change-user-admin-status | |||
- matrix-postgres-update-user-password-hash | |||
when: "not matrix_postgres_enabled|bool" | |||
|
|||
# Create additional databases | |||
- name: Retrieve IP of postgres container |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why not use the database server's host name? The Docker DNS server should resolve it for you - {{ matrix_synapse_database_host }}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you reach a docker container by it's hostname on the host? Last time i checked, a similar feature was only available for docker-on-mac - and ansible (i.e.: this module) runs on the host. The IP however should be routable from the host, regardless of docker version / OS distribution
Bridges compatible with postgres DB store:
(incomplete) |
16145a1
to
4d3ed47
Compare
If a service is enabled, a database for it is created in postgres with a uniqque password. The service can then use this database for data storage instead of relying on sqlite.
4d3ed47
to
d9f4914
Compare
Moving it above the "uninstalling" set of tasks is better. Extracting it out to another file at the same time, for readability, especially given that it will probably have to become more complex in the future (potentially installing `jq`, etc.)
We'll most likely use one that matches the database name, but it's better to have it configurable.
Quotes are necessary around dictionary field names. There was a missing `}` as well.
To avoid needing to have `jq` installed on the machine, we could: - try to run jq in a Docker container using some small image providing that - better yet, avoid `jq` altogether
> Invalid data passed to 'loop', it requires a list, got this instead: matrix_postgres_additional_databases. Hint: If you passed a list/dict of just one element, try adding wantlist=True to your lookup invocation or use q/query instead of lookup. Well, or working around it, as I've done in this commit (which seems more sane than `wantlist=True` stuff).
The tasks in `create_additional_databases.yml` will likely ensure `matrix-postgres.service` is started, etc. If no additional databases are defined, we'd rather not execute that file and all these tasks that it may do in the future.
While these modules are really nice and helpful, we can't use them for at least 2 reasons: - for us, Postgres runs in a container on a private Docker network (`--network=matrix`) without usually being exposed to the host. These modules execute on the host so they won't be able to reach it. - these modules require `psycopg2`, so we need to install it before using it. This might or might not be its own can of worms.
Being more explicit sounds better.
People can toggle between them now. The playbook also defaults to using SQLite if an external Postgres server is used. Ideally, we'd be able to create databases/users in external Postgres servers as well, but our initialization logic (and `docker run` command, etc.) hardcode too many things right now.
Using the result of `password_hash` works for creating them, but authentication seems to be failing with some tools like pgloader. It's possible that we're not escaping things properly somewhere. Ideally, it'd be nice to solve that. But the easier (and still relatively safe/good) solution is to just turn that password hash into a UUID that's safe for passing around without worrying about escaping.
This can be used by various bridges, etc., to import an SQLite (or some other supported) database into Postgres.
Thank you for the great start on this! 👍 It's been a good inspiration and I've spent a lot of time progressing tonight based on it. I believe to have made some significant progress. As an overview of all the changes:
As mentioned in the summary above, most of these things are only implemented for the Appservice Discord bridge. I'm yet to spend time on doing the same for the other bridges. Well, if someone else wants to help with that, that'd be great. Now that the complicated part is done, it should be relatively easy to fix these things up. Talking about the Appservice Discord bridge, I've already used the playbook's instructions to migrate my data from SQLite to Postgres. It seems like it worked, but I'm yet to see any Discord activity and confirm whether it all works. I'll likely spend time migrating the Facebook bridge tomorrow. At least that's a bridge I can test easily. |
We were running into conflicts, because having initialized the roles (users) and databases, trying to import leads to errors (role XXX already exists, etc.). We were previously ignoring the Synapse database (`homeserver`) when upgrading/importing, because that one gets created by default whenever the container starts. For our additional databases, it's a similar situation now. It's not created by default as soon as Postgres starts with an empty database, but rather we create it as part of running the playbook. So we either need to skip those role/database creation statements while upgrading/importing, or to avoid creating the additional database and rely on the import for that. I've gone for the former, because it's already similar to what we were doing and it's simpler (it lets `setup_postgres.yml` be the same in all scenarios).
I've added (SQLite + Postgres) support to the remaining services. In dd797ba, I've also fixed I think that what remains to be done before we can merge this, is:
It might also be nice to get pgloader running on ARM, but we can also leave this for another time. |
In cases where pgloader is not enough and we need to do some additional migration work after it, we can now use `additional_psql_statements_list` and `additional_psql_statements_db_name`. This is to be used when migrating `matrix-registration`'s data at the very least.
Now that 0.7.2 is out, the Docker image supports Postgres and we can do the (SQLite -> Postgres) migration. I've also found out that we needed to fix up the `tokens.ex_date` column data type a bit to prevent matrix-registration from raising exceptions when comparing `datetime.now()` with `ex_date` coming from the database. Example: > File "/usr/local/lib/python3.8/site-packages/matrix_registration/tokens.py", line 58, in valid > expired = self.ex_date < datetime.now() > TypeError: can't compare offset-naive and offset-aware datetimes
Should have been part of 149872e
…gration Our old (base-path -> data-path) SQLite migration can't work otherwise. It's probably not necessary to keep it anymore, but since we still do, at least we should take care to ensure it works.
Lately, I've:
I've tested as much as I can, especially given that I don't use most of these bridges. It's somewhat unfortunate that we didn't get much outside testing for this, so I guess we'll have to test it upon "release". I'll put up a warning about it in the changelog. |
I tested this again today and I messed up a few things so it was not a standard test - it worked for a moment fine, then I added the relaybot for whatsapp by changing the template and changed the template of telegram as well to allow all to use it, got kicked out everywhere and the bridges are now regenerating rooms but it might be my fault for using the wrong template at the beginning which made them use sqlite because I used my old templates that had sqlite databases in them. Also facing a problem still with the WhatsApp bridge and haven't tested the slack bridge yet... |
If a service is enabled, a database for it is created in postgres with a uniqque password. The service can then use this database for data storage instead of relying on sqlite.
matrix_postgres
is enabled)sqlite
, so should the playbook as a whole ifpostgres
is running externally--tags=import-generic-sqlite-db
(from thematrix-postgres
role) to import sqlite DBs (we still have this tag, even though we now do automatic migration for everything and don't advertise it at all)Implementation:
defaulting to Postgres blocked by Postgres not supported when using the container image (cannot import psycopg2) zeratax/matrix-registration#44; automatic migration supported)Feel free to pick a bridge and implement the migration :)