Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Python Logging API always defaults to Custom logs #2673

Closed
laturner42 opened this issue Nov 3, 2016 · 24 comments
Closed

Python Logging API always defaults to Custom logs #2673

laturner42 opened this issue Nov 3, 2016 · 24 comments
Assignees
Labels
api: logging Issues related to the Cloud Logging API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: question Request for information or clarification. Not an issue.

Comments

@laturner42
Copy link

When using the Python Stackdriver Logging API, there is not a well-defined way to specify where you want your logs to land. All of them will go into Custom Logs. The way around this is to use a Logging Handler, but this is not well written anywhere except for this one particular blog post:

https://medium.com/google-cloud/cloud-logging-though-a-python-log-handler-a3fbeaf14704#.39a97qpib

The reason it cannot be done through the standard API is because this line

https://github.com/GoogleCloudPlatform/google-cloud-python/blob/master/logging/google/cloud/logging/logger.py#L127

overwrites whatever resource you would rather write to. No parameter I could find (or the Google engineer I asked) would set the resource in order to change the destination of the logs. The closest we could find is here:

https://github.com/GoogleCloudPlatform/google-cloud-python/blob/master/logging/google/cloud/logging/connection.py#L131

but this is set by the previous line stated above. Changing this resource manually is a pretty important feature to us that we are surprised is not already there.

@daspecster daspecster added the api: logging Issues related to the Cloud Logging API. label Nov 3, 2016
@tseaver
Copy link
Contributor

tseaver commented Nov 3, 2016

We would need to add a MonitoredResource type (including mapping the list API method for them), and then allow passing an instance of it through to the logger, either as an optional constructor paramater or as a keyword parameter for log_text and log_struct.

@waprin Any thoughts?

@waprin
Copy link
Contributor

waprin commented Nov 3, 2016

yes @HiroLord, you are definitely right, you should be able to specify a monitored resource.

As a stopgap to at least get the resources types right on GAE/GKE I have #2668 which uses the fluentd handlers. However you are right that in general the client library should allow you to specify a monitored resource of your choice.

I played around with it a bit and found that besides the resource type, I also needed to set the right labels in order for the logs to show up in places where I expect. But there are different labels for different monitored resource types. So I'm not totally sure about the scope of the change necessary.

Out of curiosity, what types of resources are you trying to write to?

@waprin
Copy link
Contributor

waprin commented Nov 3, 2016

Also @HiroLord just a small correction - the logging handler is totally orthogonal to this issue. Handlers are just a way of mapping Python logging calls to API calls, but we have logging handlers in this repo as well, though at the moment they only write to custom logs.

The real reason you can accomplish what you want with that blog post is it's using https://github.com/google/google-api-python-client, which is a lower level auto-generated library, and exposes all the possible ways to write to the API. This library tries to be a more convenient and idiomatic, but in this case it missed the use case of writing to specific resources, which we can fix. Once it's fixed, you can use the handlers or just write to the API directly.

@laturner42
Copy link
Author

We are trying to write logs from our compute engine instances using this API. Our end goal is to pipe std_out from different programs into this Python logger to get all of our logs in the same location on stack driver for each instance.

@laturner42
Copy link
Author

@waprin Ahh, that makes sense. I was curious how we were able to work around it with handlers, but the lower level API makes sense. Thanks!

@waprin
Copy link
Contributor

waprin commented Nov 3, 2016

@HiroLord you might want to consider installing the fluentd agent and just writing your logs to a file it can pickup from...admittedly the documentation on doing that could also use some work:

https://cloud.google.com/logging/docs/agent/installation

@tseaver the short version of what I'm saying is I agree with you suggested as a way to enable people to accomplish this, though I'm not totally sure we need the MonitoredResource objects and the list method, since I think it will be very uncommon for people to programmatically pick the resource type, so just letting people pass in the resource type as a string might be sufficient.

We might want to consider ways to make getting the right labels/log names together a bit more usable.

@waprin
Copy link
Contributor

waprin commented Jan 31, 2017

@tseaver @dhermes circling back on this, I think this does need to be added the logger surface because it's pretty basic functionality to the api .

I think @tseaver's proposed plan to add the MonitoredResource type does make sense.

However, the code already exists in google.cloud.monitoring.resource.py and google.cloud.monitoring.label.py. And if you look at the Logging API and the Monitoring API protos, you will see that they do share the MonitoredResource as common object.

So I think (but I'm not sure) that google.cloud.monitoring.resource.py should be in some sort of common package. But it doesn't feel like it belongs in core, and a new package feels like overkill. But I think a new package like common makes the most sense.

So @dhermes @tseaver WDYT about moving resource.py to a common package?

@karan since we want to write to the correct monitored resource this is blocking the shutdown handler (although we could temporarily use the HTTP or Gax client direclty)

@lukesneeringer
Copy link
Contributor

But it doesn't feel like it belongs in core, and a new package feels like overkill. But I think a new package like common makes the most sense.

I am confused. A new package feels like overkill, and is most sensible? (Apologies if I am not understanding, those two sentences just seem contradictory to me.)

I would be fine with just putting it in core. I am nervous about the idea of having core and common as I cannot think of any defining rule of what belongs in which of them.

@waprin
Copy link
Contributor

waprin commented Jan 31, 2017

Thanks for chiming in. Yes I was thinking out loud and was a bit unclear. In my head core involves code relating to the client but common would be shared API data objects. But I agree it's a blurry distinction so moving to core sounds good to me.

@daiyueweng
Copy link

Hi, I see that in _make_entry_resource of logger.py, there is a resource dictionary that has a key -
resource which specifies resource type, so maybe I can pass in/define whatever resource I like to write to in there?

Or, instead, I have to use write_entries in _gax.py and __http.py?

@waprin
Copy link
Contributor

waprin commented Mar 28, 2017

@daiyueweng I might be missing something, but even if you use _make_entry_resource there's no method that accepts the object that returns. All of the log_* methods call _make_entry_resource themselves. Plus it's not ideal to use a private method.

I think I explained it slightly poorly in the previous issue. You should be able to use the public method client.logging_api, which will pick the underlying transports between _gax.py and _httpy.py the same way it does for the other methods, so you shouldn't have to choose between them. You can look at those files to get an idea of their methods signatures though, write_entries being the one you probably want.

@daiyueweng
Copy link

daiyueweng commented Mar 29, 2017

@waprin So I cloned the google.cloud.logging library into a directory/package(called stackdriver_logging) of my project and changed all the references to google.cloud.logging to stackdriver_logging.google.cloud.logging. Slightly modified the client.py and logger.py in order to use Monitored Resource as follows (moved to a gist by @dhermes, with diffs provided as well).

When I unit tested the code, I got the following error when I tried to use project resource type,

Traceback (most recent call last):
  File "/home/user_name/.local/lib/python3.5/site-packages/google/protobuf/json_format.py", line 441, in _ConvertFieldValuePair
    message_descriptor.full_name, name))
google.protobuf.json_format.ParseError: Message type "google.api.MonitoredResource" has no field named "p".

@daiyueweng
Copy link

A simple unit test code is as follows,

from stackdriver_logging.google.cloud import logging

class Severity(Enum):
    """
    log severity levels
    """
    DEFAULT = 0
    DEBUG = 1
    INFO = 2
    NOTICE = 3
    WARNING = 4
    ERROR = 5
    CRITICAL = 6
    ALERT = 7
    EMERGENCY = 8
   
d = dict()
d["file"] = "file_name"
d["function"] = "function_name"
d["line"] = "line_number"
d["msg"] = "critical error msg"
d["detail"] = "the critical error message detail"

client = logging.Client()


logger = client.logger("log_test", resource="project")
            
logger.log_struct(d, severity=Severity.CRITICAL)

@daiyueweng
Copy link

I have fixed the above error, this was due to the way I define resource, it should be

resource = {
            'logName': self.full_name,
            'resource': {'type': resource},
        }

instead of

resource = {
            'logName': self.full_name,
            'resource': resource,
        }

in method _make_entry_resource of logger.py.

@daiyueweng
Copy link

after fixing the error, re-run the unit test, got the following errors,

Traceback (most recent call last):
  File "/home/user_name/PycharmProjects/gc_logging/stackdriver_logging.py", line 90, in log
    logger.log_struct(msg, severity=sev.name)
  File "/home/user_name/PycharmProjects/gc_logging/stackdriver_logging/google/cloud/logging/logger.py", line 251, in log_struct
    client.logging_api.write_entries([entry_resource], resource=resource)
  File "/home/user_name/PycharmProjects/gc_logging/stackdriver_logging/google/cloud/logging/_gax.py", line 130, in write_entries
    partial_success=partial_success, options=options)
  File "/home/user_name/.local/lib/python3.5/site-packages/google/cloud/gapic/logging/v2/logging_service_v2_client.py", line 352, in write_log_entries
    partial_success=partial_success)
TypeError: Parameter to MergeFrom() must be instance of same class: expected google.api.MonitoredResource got str.

@daiyueweng
Copy link

So, I have tried to use Monitored Resource code work, no luck. Tried to pass in an argument for resource parameter in method write_entries of _gax.py, caused

Traceback (most recent call last):
  File "/home/user_name/PycharmProjects/gc_logging/stackdriver_logging.py", line 90, in log
    logger.log_struct(msg, severity=sev.name)
  File "/home/user_name/PycharmProjects/gc_logging/stackdriver_logging/google/cloud/logging/logger.py", line 251, in log_struct
    client.logging_api.write_entries([entry_resource], resource=resource)
  File "/home/user_name/PycharmProjects/gc_logging/stackdriver_logging/google/cloud/logging/_gax.py", line 130, in write_entries
    partial_success=partial_success, options=options)
  File "/home/user_name/.local/lib/python3.5/site-packages/google/cloud/gapic/logging/v2/logging_service_v2_client.py", line 352, in write_log_entries
    partial_success=partial_success)
TypeError: Parameter to MergeFrom() must be instance of same class: expected google.api.MonitoredResource got str.

this is probably because in write_entries,

self._gax_api.write_log_entries(
            entry_pbs, log_name=logger_name, resource=resource, labels=labels,
            partial_success=partial_success, options=options)

requires resource to be None, so that in method write_log_entries, resource can be initliaze to a monitored_resource_pb2.MonitoredResource() in logging_service_v2_client.py. So if I pasded in a string project to resource, it will generate the above errors. And it seems there is no way to specify resource type in monitored_resource_pb2.MonitoredResource(), otherwise I could have instantiated it and pass to resource in write_entries.

@daiyueweng
Copy link

If I don't pass in any argument for resource in write_entries, and just specify a resource type other than global (e.g. project in this case) in method _make_entry_resource of logger.py, I will get the following errors,

Traceback (most recent call last):
  File "/home/user_name/.local/lib/python3.5/site-packages/google/gax/retry.py", line 120, in inner
    return to_call(*args)
  File "/home/user_name/.local/lib/python3.5/site-packages/google/gax/retry.py", line 68, in inner
    return a_func(*updated_args, **kwargs)
  File "/usr/local/lib/python3.5/dist-packages/grpc/_channel.py", line 507, in __call__
    return _end_unary_response_blocking(state, call, False, deadline)
  File "/usr/local/lib/python3.5/dist-packages/grpc/_channel.py", line 455, in _end_unary_response_blocking
    raise _Rendezvous(state, None, None, deadline)
grpc._channel._Rendezvous: <_Rendezvous of RPC that terminated with (StatusCode.DEADLINE_EXCEEDED, Deadline Exceeded)>

@lukesneeringer lukesneeringer added priority: p2 Moderately-important priority. Fix may not be included in next release. type: question Request for information or clarification. Not an issue. labels Mar 31, 2017
@theacodes
Copy link
Contributor

Closed by #3386 and should be in the next release.

@dhermes dhermes closed this as completed May 11, 2017
@daiyueweng
Copy link

I am just wondering when the next release will be available and how can I get a notification about it.

cheers

@theacodes
Copy link
Contributor

@daiyueweng you can subscribe to notifications on this repo, but it'll probably be noisy.

Unless @dhermes wants to go ahead and release now, we'll probably wait until #3359 is complete.

@dhermes
Copy link
Contributor

dhermes commented May 11, 2017

I'm fine doing a release if @lukesneeringer is

@daiyueweng
Copy link

has the fix been released yet and is it also reflected in the documentation now? is there any example for putting logs to different monitored resources?

@waprin
Copy link
Contributor

waprin commented Jun 1, 2017

Probably a good idea to add example to docstring but in meantime look at the tests. Pretty self explanatory, build a MonitoredResource object with type and labels, then pass that into the write calls.

Doesn't look like it's been released yet though you could always depend on the latest in this repo if you really wanted to.

@daiyueweng
Copy link

daiyueweng commented Jun 2, 2017

I downloaded the google-cloud-python/logging/ and put it as a package in my project root in pycharm, I am wondering what is the best way to updating the whole logging library without downloading it again from the repo.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
api: logging Issues related to the Cloud Logging API. priority: p2 Moderately-important priority. Fix may not be included in next release. type: question Request for information or clarification. Not an issue.
Projects
None yet
Development

No branches or pull requests

8 participants