-
Notifications
You must be signed in to change notification settings - Fork 1.5k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add GAE Shutdown Handler #2976
Add GAE Shutdown Handler #2976
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm hoping someone who owns the repo also reviews this for style/testing.
|
||
|
||
def _get_gae_instance(): | ||
return _fetch_metadata('instance/attributes/gae_backend_instance') |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
||
|
||
def _get_gae_backend(): | ||
return _fetch_metadata('instance/attributes/gae_backend_name') |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
||
|
||
def _get_gae_version(): | ||
return _fetch_metadata('instance/attributes/gae_backend_version') |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
||
entries = [entry] | ||
|
||
client.logging_api.write_entries( |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
So, if I understand this correctly, the idea is that you can call client.enable_shutdown_logging()
, and if you are on AppEngine, logs get sent on SIGTERM?
If that is correct, then the overall direction seems fine to me. I have a couple nitpicks with particulars of your implementation (see comments) but no major issues with what you are trying to do.
:param log_level: (Optional) When true, on SIGTERM the application | ||
will log the stacktrace of all running threads. | ||
""" | ||
if thread_dump: |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
:param client: Stackdriver logging client. | ||
""" | ||
if not _is_on_appengine(): | ||
raise Exception('Shutdown reporting is only supported on App Engine ' |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
576cfd4
to
4985c5b
Compare
It looks fine to me, but hoping @dhermes will take a pass also before we merge it. |
Addresses #2819 |
Officially assigning @dhermes for review; I forgot to the first time. |
@dhermes ping |
@@ -0,0 +1,26 @@ | |||
# Copyright 2017 Google Inc. |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
||
|
||
# Maximum size in bytes to send to Stackdriver Logging in one entry | ||
MAX_PAYLOAD_SIZE = 1024 * 100 |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
||
def _get_gae_instance(): | ||
"""Returns the App Engine Flexible instance.""" | ||
return os.getenv('GAE_INSTANCE') |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
||
def _get_gae_service(): | ||
"""Returns the App Engine Flexible service.""" | ||
return os.getenv('GAE_SERVICE') |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
||
def _get_gae_version(): | ||
"""Returns the App Engine Flexible version.""" | ||
return os.getenv('GAE_VERSION') |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
||
|
||
def _report_stacktraces( | ||
client, signal_, frame): # pylint: disable=unused-argument |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
:param frame: The current stack frame. | ||
""" | ||
traces = '' | ||
for thread_id, stack in sys._current_frames().items(): |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
'appengine.googleapis.com/module_id': gae_service, | ||
} | ||
|
||
split_payloads = _split_entry(text_payload.encode('utf-8')) |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
:type frame: frame object | ||
:param frame: The current stack frame. | ||
""" | ||
traces = '' |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
raise RuntimeError('Shutdown reporting is only supported on App ' | ||
'Engine flexible environment.') | ||
signal.signal( | ||
signal.SIGTERM, functools.partial(_report_stacktraces, client)) |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
wow good to know difference between += and append there. Others comments were just me forgetting to run lint all done. |
Agreed I can add a test for that but have to head to GDG at the moment. |
@dhermes sorry for delay got pulled into that conference rest of yesterday. So it turns out that my tests weren't that bad because the function behavior was still correct, since the last line of the function joins the entire list into one big string at the end, it's irrelevant whether the strings get broken up in the interim. Obviously sloppy to mix += and append like that but just noting end result was the same. Since I took a look at the test anyway I did try to make it check a bit more with a regex rather than just looking for a function name. |
@dhermes ping |
"""Environment variable set in App Engine when env:flex is set.""" | ||
|
||
|
||
_CONTAINER_ENGINE_ENV = 'KUBERNETES_SERVICE' |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
||
|
||
def _report_stacktraces( | ||
client, signal_, frame): # pylint: disable=unused-argument |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
raise RuntimeError('Shutdown reporting is only supported on App ' | ||
'Engine flexible environment.') | ||
signal.signal( | ||
signal.SIGTERM, functools.partial(_report_stacktraces, client)) |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
logging/unit_tests/test__shutdown.py
Outdated
import mock | ||
|
||
|
||
class Test_setup_shutdown_stracktrace_reporting(unittest.TestCase): |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
logging/unit_tests/test__shutdown.py
Outdated
setup_stacktrace_crash_report) | ||
return setup_stacktrace_crash_report(request) | ||
|
||
def test_setup_shutdown_stacktrace_reporting_no_gae(self): |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
logging/unit_tests/test__shutdown.py
Outdated
|
||
expected_payload = 'myversion\nThread traces\n{}'.format( | ||
trace).encode('utf-8') | ||
self.assertEqual(called[0][0], [{'text_payload': expected_payload}]) |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
logging/unit_tests/test__shutdown.py
Outdated
self.assertEqual(called[0][0], [{'text_payload': expected_payload}]) | ||
|
||
|
||
class Test_report_stacktraces(unittest.TestCase): |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
logging/unit_tests/test__shutdown.py
Outdated
patch = mock.patch( | ||
'google.cloud.logging._shutdown._write_stacktrace_log') | ||
with patch as write_log_mock: | ||
self._call_fut(mock.Mock(), mock.Mock(), mock.Mock()) |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
logging/unit_tests/test__shutdown.py
Outdated
) | ||
return _report_stacktraces(client, signal, frame) | ||
|
||
def test_report_stacktraces(self): |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
@@ -632,6 +632,20 @@ def test_setup_logging(self): | |||
|
|||
setup_logging.assert_called() | |||
|
|||
def test_enable_shutdown_logging(self): |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
After talking with @dhermes, I actually do have a few concerns here. My understanding (correct me if I am wrong on this) is that this only works in GAE, and throws an error elsewhere? That seems to violate principle of least surprise for the majority of users, who will be using this outside GAE. At minimum, can we set this up in a way to hold non-GAE users harmless? |
from google.cloud.logging._environment_vars import GAE_VERSION | ||
from google.cloud import _helpers | ||
|
||
import six |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
|
||
:type thread_dump: bool | ||
:param thread_dump: (Optional) When true, on SIGTERM the application | ||
will log the stacktrace of all running threads. |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
.. note:: This method is only supported on App Engine Flexible. | ||
|
||
This method installs a SIGTERM handler that will report various | ||
application metrics to Stackdriver Logging. |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
handler. | ||
""" | ||
traces = [] | ||
for thread_id, stack in sys._current_frames().items(): |
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
This comment was marked as spam.
This comment was marked as spam.
Sorry, something went wrong.
Based on offline convo going to close this PR and relocate it to https://github.com/googlecloudplatform/stackdriver-python (OOO this week but starting that migration when I return). |
Very rough draft (no tests/docs, it's messy) etc, to get feedback on overall direction from @karan and whoever will review for this repo.
This is a feature for App Engine Flex customers to get more post-mortem logs on VM shutdown. In theory it could be generalized but it's been spec-ed out to be very GAE specific so currently code just checks for that.
Due to lack of MonitoredResource support (see #2673) this currently uses the
client.logging_api
directly. This could be changed if the ability to specific monitoredresource gets added to the higher level API.