Resource usage and time metrics #370

m-mohr · 2021-04-02T11:10:40Z

Could be done via:

via .../logs? (works for all processing modes; note: for sync. processing, via a header)
via the resource details (GET /jobs/:id and GET /services/:id)? (doesn't work for sync. processing)
via a new endpoint?

Could include:

total batch job duration (wall time)
total batch job cpu hours
total batch job memory hours

These metrics can be visualized in a web interface, which should give the user an idea of scalability.

m-mohr · 2021-04-09T15:21:15Z

Use Case:

The API shall provide capabilities for visual process monitoring in the distributed compute environment. Note: This shall provide relevant information through graphs/dashboards on e.g. orchestration, data throughput, CPU workloads, etc. so that users can optimise the efficiency of their code and thus reduce the costs required when switching to bulk processing. (req. 60)

soxofaan · 2021-04-12T07:58:47Z

> .../logs? (works for all processing modes)

.../logs is not available for sync processing, only batch and services, right?

Another solution could be custom response headers. This is probably the only option for sync processing (unless you make sync processing stateful).

For batch and services, something like /logs is probably a lot cleaner than custom response headers.

m-mohr · 2021-04-12T09:06:03Z

That was not clearly described: for synchronous processing, the metrics would just be added to the log file that can be returned via the header, which is the equivalent of /logs for batch jobs and services.

Sync. processing is stateful anyway if you implement either billing (which most services will do) or return log "files" via the header.

The advantage of log "files" would be that they can also carry intermediate time metrics, e.g. after each process. That would certainly help to find bottlenecks. It would also be a reason against custom headers, especially as you'll often use sync processing for debugging and trying things out on smaller chunks, where you'd probably like to get the metrics as detailed as possible.

So overall, it seems like the best option would indeed be ".../logs" (batch/services) / log-header (sync).
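
To illustrate why per-process metrics in the logs help with finding bottlenecks, here is a minimal sketch. The log entry layout is an assumption made for this example only: the fields `process` and `duration_seconds` and the function `slowest_entry` are hypothetical and not part of any proposal in this thread.

```python
from typing import List, Optional


def slowest_entry(entries: List[dict]) -> Optional[dict]:
    """Return the log entry with the largest duration, or None if no entry has one.

    Entries without the (hypothetical) "duration_seconds" field are ignored,
    since ordinary log messages would not carry timing information.
    """
    timed = [e for e in entries if "duration_seconds" in e]
    if not timed:
        return None
    return max(timed, key=lambda e: e["duration_seconds"])


# Hypothetical log entries, one per executed process plus a plain message.
log_entries = [
    {"message": "Job started"},
    {"process": "load_collection", "duration_seconds": 4.2},
    {"process": "resample_spatial", "duration_seconds": 31.0},
    {"process": "save_result", "duration_seconds": 2.5},
]

slowest = slowest_entry(log_entries)
print(slowest["process"], slowest["duration_seconds"])
```

A single set of response headers could only carry the totals, so a comparison like this would not be possible for sync processing without the log.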

m-mohr · 2021-04-15T13:51:14Z

This is the last issue for API v1.1.0 right now that doesn't have a corresponding PR.
Before starting a PR, I'd need feedback on two details:

Where to specify the metrics (see above)? It seems that there's a slight tendency towards the logs, which I think I'm also in favor of.
Which metrics to specify and which units they should have. Some examples are also mentioned above. This may need some investigation into what software actually supports...

m-mohr · 2021-04-16T09:28:50Z

Instead of specifying a fixed unit for each metric, we could also make it self-descriptive:

{
	"metrics": {
		"cpu": {
			"value": 123,
			"unit": "cpu-seconds" // default
		},
		"memory": {
			"value": 123,
			"unit": "mb-seconds" // or "gb-hours", ...; default: ?
		},
		"time": {
			"value": 123,
			"unit": "seconds" // or "minutes", "hours"; default: minutes?
		},
		"network": {
			"value": 123,
			"unit": "b" // or "kb", ..., "tb"; default: mb?
		},
		"storage": {
			"value": 123,
			"unit": "b" // or "kb", ..., "tb"; default: mb?
		}
	}
}

see PR: #383
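
A client that receives such self-descriptive metrics would probably want to convert them to common base units before comparing or plotting them. The following is only a sketch under stated assumptions: the conversion tables (decimal byte prefixes, the listed time units), the base unit names and the functions `to_base` and `normalize` are all hypothetical and not defined by the proposal above.

```python
from typing import Dict, Tuple

# Assumed conversion factors (not from the proposal): decimal (SI) byte
# prefixes and plain time units. A specification would have to pin these down.
TIME_FACTORS = {"seconds": 1, "minutes": 60, "hours": 3600}
SIZE_FACTORS = {"b": 1, "kb": 10**3, "mb": 10**6, "gb": 10**9, "tb": 10**12}


def to_base(value: float, unit: str) -> Tuple[float, str]:
    """Convert a (value, unit) pair to an assumed base unit.

    Durations become "seconds", sizes become "b", "cpu-<time>" becomes
    "cpu-seconds" and "<size>-<time>" becomes "b-seconds".
    """
    unit = unit.lower()
    if unit in TIME_FACTORS:
        return value * TIME_FACTORS[unit], "seconds"
    if unit in SIZE_FACTORS:
        return value * SIZE_FACTORS[unit], "b"
    prefix, sep, time_unit = unit.partition("-")
    if sep and time_unit in TIME_FACTORS:
        if prefix == "cpu":
            return value * TIME_FACTORS[time_unit], "cpu-seconds"
        if prefix in SIZE_FACTORS:
            factor = SIZE_FACTORS[prefix] * TIME_FACTORS[time_unit]
            return value * factor, "b-seconds"
    raise ValueError(f"unknown unit: {unit!r}")


def normalize(metrics: Dict[str, dict]) -> Dict[str, dict]:
    """Apply to_base to every metric in a {"name": {"value", "unit"}} mapping."""
    result = {}
    for name, metric in metrics.items():
        value, unit = to_base(metric["value"], metric["unit"])
        result[name] = {"value": value, "unit": unit}
    return result


# Example input in the self-descriptive shape, with made-up values.
example = {
    "cpu": {"value": 2, "unit": "cpu-hours"},
    "memory": {"value": 1, "unit": "gb-hours"},
    "time": {"value": 3, "unit": "minutes"},
    "network": {"value": 5, "unit": "mb"},
}
normalized = normalize(example)
print(normalized["cpu"])  # {'value': 7200, 'unit': 'cpu-seconds'}
```

Rejecting unknown units with an error, rather than passing them through, keeps a client from silently comparing values that are not comparable.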

jdries · 2021-04-16T11:27:01Z

I'll try to figure out if access to /logs can be made user-friendly enough. Maybe we can even do both: use /logs for all kinds of advanced use cases, and put the same 'usage' properties inside the job metadata, where users can immediately see them?
Keep in mind that this information will also be the basis for accounting, so if we log usage for different steps, we also need a way to find the total usage, in an unambiguous way.
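
One way to picture the "unambiguous total" concern is the sketch below. It assumes each step reports usage as a mapping of metric name to `{"value", "unit"}`, and it refuses to add values reported in different units. The function `total_usage` and the sample data are hypothetical, for illustration only.

```python
from typing import Dict, Iterable


def total_usage(steps: Iterable[Dict[str, dict]]) -> Dict[str, dict]:
    """Sum per-step usage into one total per metric.

    Raises ValueError if a metric appears with different units, because
    adding such values would make the total ambiguous.
    """
    totals: Dict[str, dict] = {}
    for usage in steps:
        for name, metric in usage.items():
            if name not in totals:
                totals[name] = {"value": 0, "unit": metric["unit"]}
            elif totals[name]["unit"] != metric["unit"]:
                raise ValueError(
                    f"metric {name!r} reported in both "
                    f"{totals[name]['unit']!r} and {metric['unit']!r}"
                )
            totals[name]["value"] += metric["value"]
    return totals


# Made-up usage for three steps; the last step reports no wall time.
steps = [
    {"cpu": {"value": 40, "unit": "cpu-seconds"},
     "time": {"value": 12, "unit": "seconds"}},
    {"cpu": {"value": 100, "unit": "cpu-seconds"},
     "time": {"value": 30, "unit": "seconds"}},
    {"cpu": {"value": 10, "unit": "cpu-seconds"}},
]
totals = total_usage(steps)
print(totals)
```

Whether the total is computed like this by the client or reported directly by the back-end is a design decision; for accounting, a total reported by the back-end would presumably be the authoritative one.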

m-mohr · 2021-04-16T11:32:51Z

@jdries

> I'll try to figure out if access to /logs can be made user-friendly enough.

In the Web Editor it's implemented already with a UI. That could also be integrated into Jupyter.

> Maybe we can even do both, using /logs for all kinds of advanced use cases, and the same 'usage' properties inside the job metadata, where users can immediately see them?

That is actually what PR #383 proposes.

> Keep in mind that this information will also be the basis for accounting, so if we log usage for different steps, we also need a way to find the total usage, in an unambiguous way.

But that seems to be a back-end / implementation issue?! Nevertheless, it should be solved in the mentioned PR.

jdries · 2021-04-19T05:52:57Z

Apologies, I see it now indeed in the diff, so that seems reasonable.
Total usage needs to be unambiguous for the users as well, as it is the basis for their bill.

m-mohr · 2021-04-19T08:14:57Z

> Total usage needs to be unambiguous for the users as well, as it is the basis for their bill.

@jdries I understand that this is important, but I'm not sure whether this is just a general remark or you'd want something specific to be changed/added/... in the current proposal? Is the proposal unambiguous?

jdries · 2021-04-19T10:48:50Z

Was just a general remark, no changes needed!

kempenep · 2021-04-21T14:46:34Z

Shouldn't the units of memory and network be exchanged?

m-mohr · 2021-04-22T10:51:35Z

@kempenep I don't think so, I've usually only seen those units in use. If you have more details on where this is handled differently, please let me know.

m-mohr · 2021-05-05T09:09:22Z

Merged.

m-mohr added the platform label Apr 2, 2021

m-mohr changed the title ~~User visible overview of processing time metrics~~ Overview of processing time metrics Apr 9, 2021

m-mohr added this to the 1.1.0 milestone Apr 9, 2021

m-mohr added feedback required help wanted data processing labels Apr 15, 2021

m-mohr self-assigned this Apr 15, 2021

m-mohr added a commit that referenced this issue Apr 16, 2021

Overview of processing time metrics #370

f17aa7c

m-mohr added a commit that referenced this issue Apr 16, 2021

Overview of processing time metrics #370

0a60b94

m-mohr mentioned this issue Apr 16, 2021

Resource usage and time metrics #370 #383

Merged

m-mohr linked a pull request Apr 16, 2021 that will close this issue

Resource usage and time metrics #370 #383

Merged

m-mohr changed the title ~~Overview of processing time metrics~~ Resource usage and time metrics Apr 16, 2021

m-mohr mentioned this issue May 5, 2021

Support for usage and time metrics (+ time) Open-EO/openeo-vue-components#42

Closed

m-mohr added a commit that referenced this issue May 5, 2021

Merge pull request #383 from Open-EO/usage-metrics

d3ab40e

Resource usage and time metrics #370

m-mohr closed this as completed May 5, 2021

This was referenced May 11, 2021

Release openEO API v1.1.0 #391

Merged

Release openEO API v1.1.0 Open-EO/PSC#11

Closed

zcernigoj mentioned this issue Nov 27, 2023

openeo api version hardcoded to 1.0.0 Open-EO/openeo-sentinelhub-python-driver#56

Open
