Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add GAE Shutdown Handler #2976

Closed
wants to merge 12 commits into from
Closed

Conversation

waprin
Copy link
Contributor

@waprin waprin commented Jan 31, 2017

Very rough draft (no tests/docs, it's messy) etc, to get feedback on overall direction from @karan and whoever will review for this repo.

This is a feature for App Engine Flex customers to get more post-mortem logs on VM shutdown. In theory it could be generalized but it's been spec-ed out to be very GAE specific so currently code just checks for that.

Due to lack of MonitoredResource support (see #2673) this currently uses the client.logging_api directly. This could be changed if the ability to specific monitoredresource gets added to the higher level API.

@googlebot googlebot added the cla: yes This human has signed the Contributor License Agreement. label Jan 31, 2017
Copy link

@karan karan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm hoping someone who owns the repo also reviews this for style/testing.



def _get_gae_instance():
return _fetch_metadata('instance/attributes/gae_backend_instance')

This comment was marked as spam.



def _get_gae_backend():
return _fetch_metadata('instance/attributes/gae_backend_name')

This comment was marked as spam.



def _get_gae_version():
return _fetch_metadata('instance/attributes/gae_backend_version')

This comment was marked as spam.


entries = [entry]

client.logging_api.write_entries(

This comment was marked as spam.

Copy link
Contributor

@lukesneeringer lukesneeringer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, if I understand this correctly, the idea is that you can call client.enable_shutdown_logging(), and if you are on AppEngine, logs get sent on SIGTERM?

If that is correct, then the overall direction seems fine to me. I have a couple nitpicks with particulars of your implementation (see comments) but no major issues with what you are trying to do.

:param log_level: (Optional) When true, on SIGTERM the application
will log the stacktrace of all running threads.
"""
if thread_dump:

This comment was marked as spam.

:param client: Stackdriver logging client.
"""
if not _is_on_appengine():
raise Exception('Shutdown reporting is only supported on App Engine '

This comment was marked as spam.

@waprin waprin force-pushed the shutdown_logs branch 2 times, most recently from 576cfd4 to 4985c5b Compare February 23, 2017 23:26
@waprin waprin changed the title [ROUGH DRAFT] Add GAE Shutdown Handler Add GAE Shutdown Handler Feb 24, 2017
@waprin
Copy link
Contributor Author

waprin commented Feb 24, 2017

@dhermes @karan ready for review. I didn't put anything in the usage docs yet, felt a little too niche.

@lukesneeringer
Copy link
Contributor

It looks fine to me, but hoping @dhermes will take a pass also before we merge it.

@waprin
Copy link
Contributor Author

waprin commented Feb 28, 2017

Addresses #2819

@lukesneeringer
Copy link
Contributor

Officially assigning @dhermes for review; I forgot to the first time.

@waprin
Copy link
Contributor Author

waprin commented Mar 1, 2017

@dhermes ping

@@ -0,0 +1,26 @@
# Copyright 2017 Google Inc.

This comment was marked as spam.



# Maximum size in bytes to send to Stackdriver Logging in one entry
MAX_PAYLOAD_SIZE = 1024 * 100

This comment was marked as spam.

This comment was marked as spam.


def _get_gae_instance():
"""Returns the App Engine Flexible instance."""
return os.getenv('GAE_INSTANCE')

This comment was marked as spam.

This comment was marked as spam.


def _get_gae_service():
"""Returns the App Engine Flexible service."""
return os.getenv('GAE_SERVICE')

This comment was marked as spam.

This comment was marked as spam.


def _get_gae_version():
"""Returns the App Engine Flexible version."""
return os.getenv('GAE_VERSION')

This comment was marked as spam.

This comment was marked as spam.



def _report_stacktraces(
client, signal_, frame): # pylint: disable=unused-argument

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

:param frame: The current stack frame.
"""
traces = ''
for thread_id, stack in sys._current_frames().items():

This comment was marked as spam.

This comment was marked as spam.

'appengine.googleapis.com/module_id': gae_service,
}

split_payloads = _split_entry(text_payload.encode('utf-8'))

This comment was marked as spam.

This comment was marked as spam.

:type frame: frame object
:param frame: The current stack frame.
"""
traces = ''

This comment was marked as spam.

This comment was marked as spam.

raise RuntimeError('Shutdown reporting is only supported on App '
'Engine flexible environment.')
signal.signal(
signal.SIGTERM, functools.partial(_report_stacktraces, client))

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

@waprin
Copy link
Contributor Author

waprin commented Mar 2, 2017

wow good to know difference between += and append there. Others comments were just me forgetting to run lint all done.

@waprin
Copy link
Contributor Author

waprin commented Mar 2, 2017

Agreed I can add a test for that but have to head to GDG at the moment.

@waprin
Copy link
Contributor Author

waprin commented Mar 3, 2017

@dhermes sorry for delay got pulled into that conference rest of yesterday.

So it turns out that my tests weren't that bad because the function behavior was still correct, since the last line of the function joins the entire list into one big string at the end, it's irrelevant whether the strings get broken up in the interim. Obviously sloppy to mix += and append like that but just noting end result was the same.

Since I took a look at the test anyway I did try to make it check a bit more with a regex rather than just looking for a function name.

@waprin
Copy link
Contributor Author

waprin commented Mar 6, 2017

@dhermes ping

"""Environment variable set in App Engine when env:flex is set."""


_CONTAINER_ENGINE_ENV = 'KUBERNETES_SERVICE'

This comment was marked as spam.

This comment was marked as spam.



def _report_stacktraces(
client, signal_, frame): # pylint: disable=unused-argument

This comment was marked as spam.

raise RuntimeError('Shutdown reporting is only supported on App '
'Engine flexible environment.')
signal.signal(
signal.SIGTERM, functools.partial(_report_stacktraces, client))

This comment was marked as spam.

import mock


class Test_setup_shutdown_stracktrace_reporting(unittest.TestCase):

This comment was marked as spam.

This comment was marked as spam.

setup_stacktrace_crash_report)
return setup_stacktrace_crash_report(request)

def test_setup_shutdown_stacktrace_reporting_no_gae(self):

This comment was marked as spam.

This comment was marked as spam.


expected_payload = 'myversion\nThread traces\n{}'.format(
trace).encode('utf-8')
self.assertEqual(called[0][0], [{'text_payload': expected_payload}])

This comment was marked as spam.

This comment was marked as spam.

self.assertEqual(called[0][0], [{'text_payload': expected_payload}])


class Test_report_stacktraces(unittest.TestCase):

This comment was marked as spam.

This comment was marked as spam.

patch = mock.patch(
'google.cloud.logging._shutdown._write_stacktrace_log')
with patch as write_log_mock:
self._call_fut(mock.Mock(), mock.Mock(), mock.Mock())

This comment was marked as spam.

This comment was marked as spam.

)
return _report_stacktraces(client, signal, frame)

def test_report_stacktraces(self):

This comment was marked as spam.

This comment was marked as spam.

@@ -632,6 +632,20 @@ def test_setup_logging(self):

setup_logging.assert_called()

def test_enable_shutdown_logging(self):

This comment was marked as spam.

@waprin
Copy link
Contributor Author

waprin commented Mar 7, 2017

@dhermes comments addressed and rebased

re: it belonging in the library, I could see it going either way, hopefully in future we can figure those things out in issue tracker ( #2819 ) before PR.

@lukesneeringer
Copy link
Contributor

lukesneeringer commented Mar 8, 2017

I'm on the fence on how to feel about a "very specific use case" as it pertains to this library. Would be interested in what @jonparrott and @lukesneeringer think.

My feeling on monitoring is basically "defer to @waprin when it comes to what he thinks he needs".

After talking with @dhermes, I actually do have a few concerns here. My understanding (correct me if I am wrong on this) is that this only works in GAE, and throws an error elsewhere? That seems to violate principle of least surprise for the majority of users, who will be using this outside GAE.

At minimum, can we set this up in a way to hold non-GAE users harmless?

from google.cloud.logging._environment_vars import GAE_VERSION
from google.cloud import _helpers

import six

This comment was marked as spam.


:type thread_dump: bool
:param thread_dump: (Optional) When true, on SIGTERM the application
will log the stacktrace of all running threads.

This comment was marked as spam.

.. note:: This method is only supported on App Engine Flexible.

This method installs a SIGTERM handler that will report various
application metrics to Stackdriver Logging.

This comment was marked as spam.

handler.
"""
traces = []
for thread_id, stack in sys._current_frames().items():

This comment was marked as spam.

This comment was marked as spam.

This comment was marked as spam.

@tseaver tseaver added the api: logging Issues related to the Cloud Logging API. label Mar 15, 2017
@waprin
Copy link
Contributor Author

waprin commented Mar 20, 2017

Based on offline convo going to close this PR and relocate it to https://github.com/googlecloudplatform/stackdriver-python (OOO this week but starting that migration when I return).

@waprin waprin closed this Mar 20, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: logging Issues related to the Cloud Logging API. cla: yes This human has signed the Contributor License Agreement.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

6 participants