
Exception in mmap_dict.py with multiprocess in 0.5.0 #357

Closed
nonspecialist opened this issue Dec 10, 2018 · 6 comments

nonspecialist commented Dec 10, 2018

What's the issue?

JSON unicode decode error using multiprocess with moderate throughput

UnicodeDecodeError: 'utf-8' codec can't decode byte 0x9f in position 166: invalid start byte
  File "django/core/handlers/exception.py", line 34, in inner
    response = get_response(request)
  File "django/core/handlers/base.py", line 126, in _get_response
    response = self.process_exception_by_middleware(e, request)
  File "django/core/handlers/base.py", line 124, in _get_response
    response = wrapped_callback(request, *callback_args, **callback_kwargs)
  File "newrelic/hooks/framework_django.py", line 544, in wrapper
    return wrapped(*args, **kwargs)
  File "django_prometheus/exports.py", line 115, in ExportToDjangoView
    metrics_page = prometheus_client.generate_latest(registry)
  File "prometheus_client/exposition.py", line 89, in generate_latest
    for metric in registry.collect():
  File "prometheus_client/registry.py", line 75, in collect
    for metric in collector.collect():
  File "prometheus_client/multiprocess.py", line 30, in collect
    return self.merge(files, accumulate=True)
  File "prometheus_client/multiprocess.py", line 44, in merge
    for key, value in d.read_all_values():
  File "prometheus_client/mmap_dict.py", line 100, in read_all_values
    for k, v, _ in self._read_all_values():
  File "prometheus_client/mmap_dict.py", line 95, in _read_all_values
    yield encoded.decode('utf-8'), value, pos
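
For reference, here is roughly the path the traceback goes through: a minimal sketch of the standard multiprocess collection, assuming prometheus_multiproc_dir is set and the workers have already written their per-process files (an illustration, not django_prometheus's exact code).

  # Rough sketch of the code path in the traceback above.
  from prometheus_client import CollectorRegistry, generate_latest, multiprocess

  registry = CollectorRegistry()
  multiprocess.MultiProcessCollector(registry)  # reads the per-process mmap files on collect()

  # generate_latest() calls registry.collect(), which merges every file via
  # MmapedDict.read_all_values(); the UnicodeDecodeError above is raised there
  # while decoding a key.
  metrics_page = generate_latest(registry)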

What configuration?

  • prometheus-client==0.5.0
  • django-prometheus==1.0.15
  • Apache 2.4 with mod_wsgi
  • Kubernetes 1.11.3

Metrics are exported via HTTP.

What happens?

At low throughput (from single requests per minute up to tens of requests per minute) there's no problem. Once throughput increases to around 200 requests/minute (still quite moderate), we start seeing the exceptions above.

These are containerised workloads, so there are no metrics files that need to be cleared away when the main process starts.
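
To be explicit about what "cleared away" means: the usual non-container advice is to empty the multiprocess directory before any worker starts. A minimal sketch of that cleanup, assuming prometheus_multiproc_dir points at the metrics directory:

  # Startup cleanup that a non-containerised deployment would run before any
  # worker starts; not needed here because each container begins with an
  # empty directory.
  import glob
  import os

  for path in glob.glob(os.path.join(os.environ['prometheus_multiproc_dir'], '*.db')):
      os.remove(path)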

I'm rebuilding against 0.4.2 to validate that the issue was introduced in 0.5.0.

@nonspecialist changed the title from "Crash in mmap_dict.py with multiprocess in 0.5.0" to "Exception in mmap_dict.py with multiprocess in 0.5.0" on Dec 10, 2018
@nonspecialist (Author)

Confirmed -- although, of course, 0.4.2 still has #328 and #329

@brian-brazil (Contributor)

What exactly did you confirm?

@nonspecialist (Author)

Ah, sorry: I confirmed that 0.5.0 has the issue. 0.4.2 does not have the same problem, but of course it still has the metrics-db corruption problems at process startup.

@brian-brazil (Contributor)

#328 is the only change touching that code, though the big refactor might also have broken something. Can you test a build at 38e9f48?

Also, what Python version are you using?

@nonspecialist (Author)

Python 3.6.7 on Debian 9.6 (in a Docker container)

I was setting up to test against the commit that you referenced above, but while doing so accidentally installed the most recent commit first (with daemon_threads = True enabled). With that setting, I'm unable to reproduce the problem, even under load. Reverting to 0.5.0 allows me to blow it all up very rapidly.
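
For context, daemon_threads is the stock socketserver.ThreadingMixIn flag that makes handler threads daemonic; the sketch below shows what enabling it on a threaded WSGI server looks like, purely as an illustration rather than the library's actual code.

  # Illustration only: daemon_threads comes from socketserver.ThreadingMixIn
  # and makes each handler thread daemonic, so in-flight requests do not block
  # interpreter shutdown.
  from socketserver import ThreadingMixIn
  from wsgiref.simple_server import WSGIServer

  class ThreadingWSGIServer(ThreadingMixIn, WSGIServer):
      daemon_threads = True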

So it looks like 5aa256d fixes this.

@brian-brazil (Contributor)

Huh, that's weird but at least it's fixed.
