Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

path in batch/secondary service logs #320

Closed
soxofaan opened this issue Jul 15, 2020 · 5 comments
Closed

path in batch/secondary service logs #320

soxofaan opened this issue Jul 15, 2020 · 5 comments
Milestone

Comments

@soxofaan
Copy link
Member

soxofaan commented Jul 15, 2020

GET /services/{service_id}/logs:

the log records have a path field, which is required and should reference process graph node_ids (and optionally process_ids). But what about logging from places that are not directly tied to a process graph node? For example: general secondary service aspects (start up, request handling, ...), the logic that transforms service parameters to process graph parameters, ...

@m-mohr
Copy link
Member

m-mohr commented Jul 15, 2020

An empty path would work, right? But indeed, it could also just be left out.

@soxofaan
Copy link
Member Author

Indeed an empty path would work as workaround, but I think it should also be possible to provide an alternative way to describe the "location" of the log, e.g. just with a simple string (instead of a list of structured node_id objects).

In general (so also in the case of batch job logs), from the standpoint of the VITO backend implementation there is quite a mismatch between this structured node_id listing and what is feasible/possible for us. I guess it is also quite challenging for other backends to achieve that level of location granularity at the logging layer. Especially because the log handling can be very implementation specific and might involve attempts to parse out-of-band log files. So I am wondering if it wouldn't be better to keep the "location" property of a log (currently the path field) simple, like a single string

@soxofaan soxofaan changed the title path in secondary service logs path in batch/secondary service logs Jul 15, 2020
@m-mohr
Copy link
Member

m-mohr commented Jul 15, 2020

Indeed an empty path would work as workaround

I'll make the path optional.

but I think it should also be possible to provide an alternative way to describe the "location" of the log, e.g. just with a simple string (instead of a list of structured node_id objects).

The path is aiming to allow a precise mapping so that clients could actually point to the code/block where the error happend. It's a bit like a stack trace. The JS parser is able to track the (parent) node IDs, I'm curious why that's a problem in Python? All others can probably just put it in the message.

@soxofaan
Copy link
Member Author

well, I'm not talking about errors, but logging in general. For example with the VITO geopyspark backend, we (try to) fetch the logs of Spark/YARN, which are loosely structured log files. So there is no real guarantee that we can extract some kind of process graph path from them; let alone the logging messages are related to a process graph in some way.

So making the path field optional would be indeed useful in this context.

Another possible improvement would be adding a field to describe some kind of location/origin/source, so we can flag for example that a log entry is coming from Spark, YARN logs or from our own Python/Scala/Java logic. Again, this is probably highly implementation specific, and thus hard to standardize

m-mohr added a commit that referenced this issue Jul 15, 2020
@m-mohr
Copy link
Member

m-mohr commented Jul 15, 2020

As of ddfde37 the path is no longer required.

There's probably room for improvement, but that feels like a API 1.1 addition once we have more implementations. I'm not sure yet how to best expose other information commonly across back-ends.

@m-mohr m-mohr closed this as completed Jul 15, 2020
@m-mohr m-mohr added this to the v1.0.0-final milestone Jul 15, 2020
soxofaan added a commit to Open-EO/openeo-python-driver that referenced this issue Jul 15, 2020
also: `path` is not required anymore in logs (Open-EO/openeo-api#320)
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants