Add extra statsd metrics for normal and/or abnormal worker exits #2126

dnlserrano · 2019-10-07T09:51:01Z

Hey and thanks for all the hard work on gunicorn! ❤️

Would it be possible to have the metric for gunicorn.workers be brought back to INFO level instead of DEBUG? I'm very much interested in understanding the variations in number of workers my web server goes through while serving requests. Particularly, I'm interested in understanding when a given worker has been killed.

Furthermore, I think this is a good monitoring practice that shouldn't be kept around just during debugging, hence my suggestion of increasing the log level for it.

Is there a better way to achieve this?

If the proposed change is considered interesting, I may take a stab at it. It should be just a matter of changing debug to info. 🤔

Thanks in advance! 🙇

tilgovi · 2019-10-11T21:55:15Z

My understanding of that change is that the metrics should be sent regardless of enabled log level.

Are you asking to log a message when the number of workers changes, or are you finding that StatsD metrics for gunicorn.workers are not sent?

tilgovi · 2019-10-11T21:58:03Z

> Particularly, I'm interested in understanding when a given worker has been killed.

There is an INFO level log when a worker exits.

dnlserrano · 2019-10-13T23:18:56Z

> My understanding of that change is that the metrics should be sent regardless of enabled log level.

Weird... you seem to be correct, but...

> (...) or are you finding that StatsD metrics for gunicorn.workers are not sent?

Yes. 😐 I think I misinterpreted the code. It seems like you only log (and emit the metric) after you've spawned workers as needed, which is a bit puzzling to me (I haven't had time to look into commit history, which might explain this decision - sorry).

Would it make more sense to log the number of workers before spawning needed workers? In this way, if some worker segfaulted and was re-spawned by the main master loop, I could get metrics before the process of restoring them kicks in.
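To illustrate the ordering question, here is a toy model of one master-loop pass. It is not gunicorn's actual arbiter code; `master_iteration`, `run`, and the `gauge_before_spawn` flag are hypothetical names made up for this sketch. It only shows why a gauge emitted after respawning never records the dip.

```python
def master_iteration(alive, target, emit, gauge_before_spawn):
    """One pass of a simplified master loop; returns the new worker count."""
    if gauge_before_spawn:
        emit(alive)             # sees the dip left by the dead worker
    alive = max(alive, target)  # respawn up to the configured number
    if not gauge_before_spawn:
        emit(alive)             # only ever sees the restored count
    return alive


def run(gauge_before_spawn):
    """Simulate one worker dying between two passes of the loop."""
    samples = []
    alive = 4
    alive -= 1  # one worker segfaults
    master_iteration(alive, 4, samples.append, gauge_before_spawn)
    return samples


print(run(False))  # [4]
print(run(True))   # [3]
```

With the gauge emitted after spawning, the dashboard would show a flat line at 4 even though a worker died.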

> There is an INFO level log when a worker exits.

Yes, indeed. These are useful when sporadically looking at logs. But it'd be ideal if this was mirrored as a metric that I can track as part of my monitoring dashboard (e.g., in Datadog) without having to parse the logs using some external process (in the administrative sense of the word).

tilgovi · 2019-10-14T00:39:44Z

> It seems like you only log (and emit the metric) after you've spawned workers as needed, which is a bit puzzling to me (I haven't had time to look into commit history, which might explain this decision - sorry).

Ahh, yes. I don't recall the history, but it could be that the intention was to just log the number of configured workers. This number can change during a configuration reload or by using the TTIN and TTOU signals.
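For reference, a minimal sketch of changing the worker count with those signals from Python. `resize_workers` and its injectable `kill` parameter are hypothetical helpers for illustration; the only assumption about gunicorn is the documented behaviour that TTIN adds one worker and TTOU removes one.

```python
import os
import signal


def resize_workers(master_pid, delta, kill=os.kill):
    """Send TTIN (delta > 0) or TTOU (delta < 0) to the master, once per worker."""
    sig = signal.SIGTTIN if delta > 0 else signal.SIGTTOU
    for _ in range(abs(delta)):
        kill(master_pid, sig)


# Example (hypothetical PID): ask the master for one more worker.
# resize_workers(12345, +1)
```

The `kill` argument exists only so the helper can be exercised without a running master process.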

> Would it make more sense to log the number of workers before spawning needed workers?

Maybe it would make sense to have extra gauges for normal and/or abnormal worker exits.
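Until something like that exists, one possible workaround is a sketch along these lines in a gunicorn config file, using the `child_exit` server hook, which runs in the master after a worker has exited. Everything else here is an assumption: the metric name `gunicorn.worker.exits`, the StatsD address, and the helper functions are made up for illustration. It uses a counter rather than a gauge, and as far as I can tell the hook does not receive the exit status, so this counts all exits without splitting normal from abnormal.

```python
import socket

# Assumption: a StatsD agent listening on the conventional local UDP port.
STATSD_ADDR = ("127.0.0.1", 8125)


def counter_datagram(name, value=1):
    """Build a plain StatsD counter datagram, e.g. b'some.name:1|c'."""
    return f"{name}:{value}|c".encode("ascii")


def send_counter(name, addr=None):
    """Fire-and-forget a counter increment over UDP."""
    with socket.socket(socket.AF_INET, socket.SOCK_DGRAM) as sock:
        sock.sendto(counter_datagram(name), addr or STATSD_ADDR)


def child_exit(server, worker):
    """gunicorn server hook: called in the master after a worker exits."""
    # Hypothetical metric name; pick whatever fits your dashboard.
    send_counter("gunicorn.worker.exits")
```

Placed in the file passed via `-c`, this would emit one increment per reaped worker, which can then be graphed or alerted on.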

dnlserrano · 2019-10-14T21:41:52Z

> Maybe it would make sense to have extra gauges for normal and/or abnormal worker exits.

Indeed. I've changed the issue title to reflect that, and I'll try to take a look at a possible PR for it soon. Thanks for the feedback @tilgovi! 🙇

dnlserrano · 2021-09-02T13:15:22Z

In hindsight, I think what I need is #2407, so I'm closing this in favor of that PR.

dnlserrano changed the title ~~Bring back INFO level metric collection for changes in number of workers~~ Add extra statsd metrics for normal and/or abnormal worker exits Oct 14, 2019

dnlserrano closed this as completed Sep 2, 2021