This repository has been archived by the owner on Apr 26, 2024. It is now read-only.

1.52.0 fails to start with prometheus_client version error #11942

Closed
olmari opened this issue Feb 8, 2022 · 20 comments
Labels
X-Needs-Info This issue is blocked awaiting information from the reporter

Comments

@olmari
Contributor

olmari commented Feb 8, 2022

1.52.0 Debian Bullseye matrix.org repo, HS hacklab.fi

After upgrading our workerized Synapse to 1.52.0, Synapse won't start. Nothing appears in the logs either, so it fails very early, at least without debug logging enabled. I emergency-downgraded our HS to 1.51.0 and it started right up again.

While I promised on #synapse:matrix.org to provide an strace too, that will come later, when I can afford to halt our ~100-user homeserver. However, a few others have experienced a similar issue, with venv pip installations too, and the culprit is likely that the treq package also needs updating alongside the new Twisted.

Our issue seems having originated from an incompatibility between twisted 22.1.0 and treq which was too old but did not get pulled in as a dependency during pip --upgrade.

Manually upgrading treq and twisted to 22.1.0 has fixed the issue for us.
Related: RedHat Bug 1953535 - ImportError: cannot import name '_PY3' from 'twisted.python.compat'

So the root cause is very likely there.

I'm filing this issue as early as possible so everyone knows; I'll add the strace later if this isn't confirmed before then.

@richvdh
Member

richvdh commented Feb 8, 2022

yeah, treq 20.4.0 and earlier are incompatible with recent versions of twisted (twisted/treq#288). You'll need to upgrade that as well.

@richvdh
Member

richvdh commented Feb 8, 2022

I just realised that @olmari is reporting this as a problem with the matrix.org debian packages, which should include up-to-date versions of both Twisted and treq, so this is very much unexpected.

@richvdh
Member

richvdh commented Feb 8, 2022

Strongly suspect @olmari's issue is unrelated to treq. We'll need logs to investigate further.

If nothing is written to the log file, it is almost certain something will have been written to stderr. Assuming you're using systemd for process management, you should find something from journalctl.

@richvdh richvdh added the X-Needs-Info This issue is blocked awaiting information from the reporter label Feb 8, 2022
@olmari
Contributor Author

olmari commented Feb 8, 2022

Indeed, it might not be treq-related after all, but prometheus_client. We do have metrics enabled, should that be the key factor.

journalctl -eu matrix-synapse gives:

Feb 08 16:06:58 morpheus systemd[1]: Starting Synapse master...
Feb 08 16:06:59 morpheus matrix-synapse[3385]: ERROR:root:Needed prometheus_client>=0.4.0,<0.13.0, got prometheus-client==0.13.1
Feb 08 16:06:59 morpheus matrix-synapse[3385]: Missing Requirements: "prometheus_client>=0.4.0,<0.13.0"
Feb 08 16:06:59 morpheus matrix-synapse[3385]: To install run:
Feb 08 16:06:59 morpheus matrix-synapse[3385]:     pip install --upgrade --force "prometheus_client>=0.4.0,<0.13.0"
Feb 08 16:06:59 morpheus systemd[1]: matrix-synapse.service: Control process exited, code=exited, status=1/FAILURE
Feb 08 16:06:59 morpheus systemd[1]: matrix-synapse.service: Failed with result 'exit-code'.
Feb 08 16:06:59 morpheus systemd[1]: Failed to start Synapse master.
Feb 08 16:07:02 morpheus systemd[1]: matrix-synapse.service: Scheduled restart job, restart counter is at 128.
Feb 08 16:07:02 morpheus systemd[1]: Stopped Synapse master.
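For readers unfamiliar with the requirement string in that error, here is a minimal sketch of how a range such as ">=0.4.0,<0.13.0" rejects 0.13.1. It is not Synapse's actual dependency checker; the helper names parse_version and satisfies are made up for illustration, and only plain dotted-integer versions are handled. A real tool would more likely lean on the packaging library.

```python
import re


def parse_version(text):
    """Turn '0.13.1' into (0, 13, 1). Only plain dotted integers are handled."""
    return tuple(int(part) for part in text.split("."))


def satisfies(version, spec):
    """Check a version against a comma-separated range like '>=0.4.0,<0.13.0'.

    Hypothetical helper for illustration; not part of Synapse or pip.
    """
    ops = {
        ">=": lambda a, b: a >= b,
        "<=": lambda a, b: a <= b,
        "==": lambda a, b: a == b,
        ">": lambda a, b: a > b,
        "<": lambda a, b: a < b,
    }
    have = parse_version(version)
    for clause in spec.split(","):
        match = re.fullmatch(r"\s*(>=|<=|==|>|<)\s*([\d.]+)\s*", clause)
        if match is None:
            raise ValueError(f"unsupported clause: {clause!r}")
        op, wanted = match.groups()
        # Every clause must hold for the version to be acceptable.
        if not ops[op](have, parse_version(wanted)):
            return False
    return True


SPEC = ">=0.4.0,<0.13.0"
print(satisfies("0.13.1", SPEC))  # the version reported in the journal
print(satisfies("0.12.0", SPEC))
```

The first call fails on the upper bound, which matches the "got prometheus-client==0.13.1" message above.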

@DMRobertson
Contributor

Sounds like #11832?

@DMRobertson
Contributor

Ah, I misspoke. #11834 introduced the change.

@olmari
Contributor Author

olmari commented Feb 8, 2022

I do wonder what would make a deb-packaged setup end up this way, especially as I don't think I'm the only one using metrics :)

I know I could run that pip install command inside the deb-provided venv, but I feel uneasy running anything that messes with things the deb package would/should/could normally provide...

@richvdh
Member

richvdh commented Feb 8, 2022

matrix-synapse-py3_1.52.0+bullseye1_amd64.deb contains prometheus_client 0.12.0, so should not cause this error. It's hard to see how you could have ended up with a different version other than by running pip inside the deb-provided venv - which, as you note, is something that should be done with caution.

I'm a bit mystified. Could you share the result of /opt/venvs/matrix-synapse/bin/pip freeze ?

@richvdh richvdh changed the title from "1.52.0, w/ workers, fails to start, likely needs new treq with new twisted" to "1.52.0 fails to start with prometheus_client version error" Feb 8, 2022
@MacLemon

MacLemon commented Feb 8, 2022

We've seen this issue on FreeBSD with pip.

synapse crashes upon launch with twisted 22.1.0 if treq is too old. The trace looks like this for synapse and the worker processes.

/usr/local/lib/python3.7/site-packages/twisted/conch/ssh/common.py:14: CryptographyDeprecationWarning: int_from_bytes is deprecated, use int.from_bytes instead
  from cryptography.utils import int_from_bytes, int_to_bytes
Traceback (most recent call last):
  File "/usr/local/lib/python3.7/runpy.py", line 193, in _run_module_as_main
    "__main__", mod_spec)
  File "/usr/local/lib/python3.7/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/usr/local/lib/python3.7/site-packages/synapse/app/homeserver.py", line 71, in <module>
    from synapse.server import HomeServer
  File "/usr/local/lib/python3.7/site-packages/synapse/server.py", line 44, in <module>
    from synapse.appservice.api import ApplicationServiceApi
  File "/usr/local/lib/python3.7/site-packages/synapse/appservice/api.py", line 24, in <module>
    from synapse.http.client import SimpleHttpClient
  File "/usr/local/lib/python3.7/site-packages/synapse/http/client.py", line 33, in <module>
    import treq
  File "/usr/local/lib/python3.7/site-packages/treq/__init__.py", line 5, in <module>
    from treq.api import head, get, post, put, patch, delete, request
  File "/usr/local/lib/python3.7/site-packages/treq/api.py", line 5, in <module>
    from treq.client import HTTPClient
  File "/usr/local/lib/python3.7/site-packages/treq/client.py", line 11, in <module>
    from twisted.python.compat import _PY3, unicode
ImportError: cannot import name '_PY3' from 'twisted.python.compat' (unknown location)

This is also mentioned in this older bug from Red Hat, which even mentions synapse.
Bug 1953535 - ImportError: cannot import name '_PY3' from 'twisted.python.compat'

The update instructions suggest updating twisted to 21.1.0, but don't mention that this also requires updating treq, which is not automatically pulled in as a dependency. I'd like to suggest adding that information so people don't run into easily avoidable issues.

After explicitly updating treq to the latest version the problem was fixed.

@richvdh
Member

richvdh commented Feb 8, 2022

synapse crashes upon launch with twisted 22.1.0 if treq is too old. The trace looks like this for synapse and the worker processes.

This is known, and unrelated to @olmari's issue. See #11943.

@olmari
Contributor Author

olmari commented Feb 9, 2022

@richvdh @ version 1.51.0:

olmari@morpheus:~$ /opt/venvs/matrix-synapse/bin/pip freeze
attrs==21.4.0
Authlib==0.15.5
Automat==20.2.0
bcrypt==3.2.0
bleach==4.1.0
canonicaljson==1.5.0
certifi==2021.10.8
cffi==1.15.0
charset-normalizer==2.0.10
constantly==15.1.0
cryptography==36.0.1
defusedxml==0.7.1
elementpath==2.4.0
frozendict==2.1.1
hiredis==2.0.0
hyperlink==21.0.0
idna==3.3
ijson==3.1.4
importlib-metadata==4.10.1
incremental==21.3.0
jaeger-client==4.8.0
Jinja2==3.0.3
jsonschema==4.4.0
ldap3==2.9.1
lxml==4.7.1
MarkupSafe==2.0.1
matrix-common==1.0.0
matrix-synapse @ file:///synapse/build
matrix-synapse-ldap3==0.1.5
mock==4.0.3
msgpack==1.0.3
netaddr==0.8.0
opentracing==2.4.0
packaging==21.3
parameterized==0.8.1
phonenumbers==8.12.41
Pillow==9.0.0
pkg_resources==0.0.0
prometheus-client==0.13.1
psycopg2==2.9.3
pyasn1==0.4.8
pyasn1-modules==0.2.8
pycparser==2.21
PyJWT==2.3.0
pymacaroons==0.13.0
Pympler==1.0.1
PyNaCl==1.5.0
pyOpenSSL==21.0.0
pyparsing==3.0.7
pyrsistent==0.18.1
pysaml2==7.1.0
python-dateutil==2.8.2
pytz==2021.3
PyYAML==6.0
requests==2.27.1
sentry-sdk==1.5.3
service-identity==21.1.0
shared-secret-authenticator @ git+https://github.com/devture/matrix-synapse-shared-secret-auth@865c95650792b1613ad50edfc69b616bea8116d7
signedjson==1.1.1
simplejson==3.17.6
six==1.16.0
sortedcontainers==2.4.0
systemd-python==234
threadloop==1.0.2
thrift==0.15.0
tornado==6.1
treq==21.5.0
Twisted==21.7.0
txredisapi==1.4.7
typing_extensions==4.0.1
unpaddedbase64==2.1.0
urllib3==1.26.8
webencodings==0.5.1
xmlschema==1.9.2
zipp==3.7.0
zope.interface==5.4.0
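A long freeze listing like this is easy to misread, so here is a small sketch for pulling one pin out of it. The function names normalise and parse_freeze are hypothetical, and the sample text is just four lines copied from the output above. Names are normalised so that prometheus_client (as written in the error message) and prometheus-client (as printed by pip freeze) compare equal.

```python
import re


def normalise(name):
    """Lower-case a project name and collapse runs of '-', '_' and '.' into '-'."""
    return re.sub(r"[-_.]+", "-", name).lower()


def parse_freeze(text):
    """Map normalised package names to pinned versions from `pip freeze` output.

    Lines without '==' (blank lines, 'name @ url' direct references) are skipped.
    """
    pins = {}
    for line in text.splitlines():
        line = line.strip()
        if "==" not in line or " @ " in line:
            continue
        name, version = line.split("==", 1)
        pins[normalise(name)] = version
    return pins


# A few lines taken from the listing above.
sample = """\
prometheus-client==0.13.1
matrix-synapse @ file:///synapse/build
treq==21.5.0
Twisted==21.7.0
"""

pins = parse_freeze(sample)
print(pins[normalise("prometheus_client")])  # prints 0.13.1
```

In practice the text could come from running /opt/venvs/matrix-synapse/bin/pip freeze and feeding its output to parse_freeze.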

@richvdh
Member

richvdh commented Feb 10, 2022

@olmari and to confirm, this is matrix-synapse-py3_1.52.0+bullseye1_amd64.deb ? It looks like you've installed at least shared-secret-authenticator on top of the basic image. Please could you share the logs from when you ran pip to install the extras.

@olmari
Contributor Author

olmari commented Feb 10, 2022

As mentioned, this is 1.51.0, since 1.52.0 will not run on the production server until a resolution is reached. shared-secret-authenticator was indeed installed inside the venv with pip, for bridge use. I'll try to dig up logs about that.

Edit: pip doesn't seem to generate any specific logs, and the console output from weeks ago is long gone, as nothing suspicious happened right after the install.

Edit2: I followed the instructions from https://github.com/devture/matrix-synapse-shared-secret-auth and the command itself was pip install git+https://github.com/devture/matrix-synapse-shared-secret-auth

I suppose I could update synapse to 1.52.0 again, run the pip install --upgrade --force "prometheus_client>=0.4.0,<0.13.0" inside the venv as the error suggests, and hope for the best (in the future too). At least now I know much better where to dig if this happens again for any reason.

@richvdh
Member

richvdh commented Feb 10, 2022

Currently I'm inclined to consider this a bug in matrix-synapse-shared-secret-auth, since it doesn't happen in an unmodified system.

If you'd like to pursue it here, could you install 1.52.0 and matrix-synapse-shared-secret-auth as you normally would, and share the output from the installation process. Suggest using a test VM or container rather than your production server.

@olmari
Contributor Author

olmari commented Feb 10, 2022

Mm, I have to admit I didn't even think that an age-old pip install of a tool would have this effect after such a long time (or that it specifically wanted a new prometheus client, if it even did). There's also the theoretical possibility of me running some other command that has such an effect, though I can't remember doing so... Above all I'd love to resolve this so that things keep working in the future, whether the root cause is in some software's definitions or in my own actions. I'll poke "devture" to look at this report and maybe figure out whether that was even it, and so on...

I'll try that fix command now as an immediate test and leave the additional test for later; I'll report whether it works ASAP.

@richvdh
Member

richvdh commented Feb 10, 2022

There's also the theoretical possibility of me running some other command that has such an effect, though I can't remember doing so

since prometheus-client is contained within the matrix-synapse-py3 package (and is hence overwritten each time you upgrade), note that such a command would have to have been done since the most recent upgrade (or downgrade).

@olmari
Contributor Author

olmari commented Feb 10, 2022

At least that pip install --upgrade --force "prometheus_client>=0.4.0,<0.13.0" made 1.52.0 work, so in that sense things work again.

since prometheus-client is contained within the matrix-synapse-py3 package (and is hence overwritten each time you upgrade), note that such a command would have to have been done since the most recent upgrade (or downgrade).

Well, here I know definitely that the only commands run were apt update and apt upgrade; no pip shenanigans between 1.51.0 running happily and the apt upgrade, after which kaboom.

@richvdh
Member

richvdh commented Feb 10, 2022

well, that's simply not the way dpkg works, so there must be something very odd going on here.

For example, on a clean bullseye container, I install matrix-synapse-py3 1.45.1:

root@7c6a11b42458:/# apt update
root@7c6a11b42458:/# apt install -y python3 wget
...
root@7c6a11b42458:/# wget -q https://packages.matrix.org/debian/pool/main/m/matrix-synapse-py3/matrix-synapse-py3_1.45.1+bullseye1_amd64.deb
root@7c6a11b42458:/# dpkg -i matrix-synapse-py3_1.45.1+bullseye1_amd64.deb
...
root@24ff53e37819:/#  ls -1d /opt/venvs/matrix-synapse/lib/python3.9/site-packages/prometheus_client*
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/prometheus_client
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/prometheus_client-0.11.0.dist-info

Note the old version of prometheus_client.

If I now upgrade to 1.52.0:

root@24ff53e37819:/# wget -O /usr/share/keyrings/matrix-org-archive-keyring.gpg https://packages.matrix.org/debian/matrix-org-archive-keyring.gpg
root@24ff53e37819:/# echo "deb [signed-by=/usr/share/keyrings/matrix-org-archive-keyring.gpg] https://packages.matrix.org/debian/ bullseye main" > /etc/apt/sources.list.d/matrix-org.list
root@24ff53e37819:/# apt update
...
root@24ff53e37819:/# apt install matrix-synapse-py3
...
root@24ff53e37819:/#  ls -1d /opt/venvs/matrix-synapse/lib/python3.9/site-packages/prometheus_client*
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/prometheus_client
/opt/venvs/matrix-synapse/lib/python3.9/site-packages/prometheus_client-0.12.0.dist-info

Given this is unreproducible and you've solved your problem, I'm going to close this. If you can give us reproduction steps, we can reopen.

@richvdh richvdh closed this as completed Feb 10, 2022
@olmari
Contributor Author

olmari commented Feb 10, 2022

All I can think of at this stage: will dpkg / apt / deb downgrade prometheus_client if the installed one is already newer than the one included in the .deb? My reasoning is that, for whatever reason, I managed to end up with a newer prometheus_client, and then 1.52.0 apparently included the earlier-mentioned prometheus_client version pin that 1.51.0 didn't have, re #11834.

@richvdh
Member

richvdh commented Feb 10, 2022

yes. dpkg doesn't know anything about python package versions. It just removes the files in the old debian package and installs the ones in the new one.
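Since dpkg only deals in files, the ls check on the prometheus_client* entries used earlier in the thread is a direct way to see what is actually on disk. A rough Python equivalent is sketched below; installed_versions is a hypothetical helper, and the temporary directory merely stands in for the venv's site-packages so the example runs anywhere.

```python
import tempfile
from pathlib import Path


def installed_versions(site_packages, package):
    """Return the versions named by <package>-<version>.dist-info directories."""
    prefix = package + "-"
    suffix = ".dist-info"
    versions = []
    for entry in Path(site_packages).glob(f"{package}-*{suffix}"):
        # Strip the 'name-' prefix and '.dist-info' suffix to leave the version.
        versions.append(entry.name[len(prefix):-len(suffix)])
    return sorted(versions)


# Stand-in for /opt/venvs/matrix-synapse/lib/python3.9/site-packages
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "prometheus_client").mkdir()
    (Path(tmp) / "prometheus_client-0.12.0.dist-info").mkdir()
    found = installed_versions(tmp, "prometheus_client")

print(found)  # prints ['0.12.0']
```

Pointing it at the real site-packages path after an upgrade should show which version's metadata is present, without running pip inside the venv.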
