-
Notifications
You must be signed in to change notification settings - Fork 38
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add unique identifier and additional information to tasks #204
Comments
@uniqueg what are your thoughts on this? |
I fully agree that there should be a way to address individual tasks by identifier, and I am definitely in favor of any solution that increases consistency/compatibility with TES. So yes, I would model this according to how this info is represented in TES and would therefore also prefer option 1. However, perhaps better to use Of course, in TES there is more nesting of logs vs the flat architecture of WES But on top of adding What do you think, @kellrott and @vsmalladi? |
I agree. I like option 1. I would want to think through the So would a user that has WES and TES endpoints not go to the TES endpoint to understand individual task logs and could find everything at WES? |
@vsmalladi i think Wes should not act as a barrier to getting to the raw tes logs if they are available. I also think in general tes is a lower level abstraction that an end user will likely need to be exposed to. It enables engines to interface with infrastructure. So I think there will always be a need for an independent definition of a WES task which is a lighter weight version of a TES task. If we end up defining the task body in a way that is not compatible with a TES log or pointing to a tes log in some way, I would be dissappointed Another question would be if adding a required field is a breaking change or not... |
@uniqueg good point, it has been merged and is slotted for release, but is not yet part of one. We had originally though it may be better to hold off on adding the identifier now, but I think it's looking like it may be easier to do it now |
For sure it's easier to make it required now and lift that requirement later, if it turns out later that there are good reasons against it. |
This is a great idea @patmagee thanks for the clarification. |
I think what may be a good approach is not changing any of the schema of the We could then make the return type of {
"tasks": [
{
"id": "5216DEE8-4A79-438F-BD91-93FFAD052B7A",
"state": "RUNNING",
"name": "hello_world",
"cmd": [
"echo hello"
],
"start_time": "2023-03-28T12:11:05.123Z",
"end_time": "2023-03-28T12:11:05.123Z",
"stderr": "/executions/5216DEE8-4A79-438F-BD91-93FFAD052B7A/logs/stderr.txt",
"stdout": "/executions/5216DEE8-4A79-438F-BD91-93FFAD052B7A/logs/stderr.txt",
"system_logs": [
"/executions/5216DEE8-4A79-438F-BD91-93FFAD052B7A/logs/backend.txt"
],
"exit_code": 0
}
]
} You may noticed there are a few new fields I added here based on general feedback and existing issues.
|
I think this looks good, but I would include That is, just for the record, in spite of my general preference for a TES-based solution, because I think that everything that is relevant for TES is also of relevance for workflow tasks, even if engines don't make use of TES :) ...and acknowledging that there is room for improvement on the TES side in terms of schema simplification and better support for provenance (just like in WES). |
@uniqueg if we added It has the benefit of not pegging wes to a specific version of tes as well |
Yeah, a coordination of WES and TES would require a lot more planning (and probably some externalization of schemas that both WES and TES depend on). Maybe something for an eventual, coordinated WES/TES 2.0.0 release way down the road. Until then, I'll be more than happy with an optional |
Why
When polling workflow state it is often helpful to monitor the progress of a given task over time. Unfortunately there currently is no guarantee that any identifier present in the
task_logs
or list returned by the/tasks
endpoint and you can never be sure that the task you are looking at is the same one from the preview retrievalAdditionally, Since merging: #177 into the spec the goal is to make individual tasks addressable in the future.
Proposal
task_id
attribute within theLog
object. I prefer this approach since it can be an easy transition to presentingTES
information. If we do not want to return TES objects directly, we could also include a linktes_url
to the task definition if we ever wanted to go that way.. but that is all for future consideration, adding an ID is minor bridge therename
of the task should be Unique across all runs. IMO this is the inferior option, since the name often has some sort of meaning to the end user. Requiring this attribute to be unique now may require the engine to obscure the original nameThe text was updated successfully, but these errors were encountered: