Separate out "job" concept from "workflow" concept #26
@@ -5,25 +5,25 @@
 DEFAULT_PAGE_SIZE = 100


-def GetWorkflowLog(**kwargs):
+def GetJobLog(**kwargs):
     return {}


 def CancelJob(**kwargs):
The `Job` concept has actually been used.
@@ -41,20 +41,20 @@ paths:
 $ref: '#/definitions/ErrorResponse'
 tags:
 - WorkflowExecutionService
-/workflows:
+/workflow-jobs:
 get:
It could simply be just `/jobs` instead of `/workflow-jobs`. I personally don't see any problem either way.
@@ -100,22 +100,22 @@ paths:
 - WorkflowExecutionService
 post:
 summary: |-
-Run a workflow, this endpoint will allow you to create a new workflow request and
+Submit a workflow job, this endpoint will allow you to create a new job request and
As there is likely a queued period before the job gets run, `submit` seems to better reflect the situation.
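To illustrate the point about the queued period, here is a minimal sketch, assuming a hypothetical in-memory service. None of the names below (`JobState`, `Job`, `submit_job`, `start_next_job`) come from the WES specification or this PR; they only show why "submit" fits better than "run": the call returns while the job is still waiting.

```python
from dataclasses import dataclass
from enum import Enum
from itertools import count
from typing import Dict, Optional


class JobState(Enum):
    """Hypothetical lifecycle states; not taken from the WES spec."""
    QUEUED = "QUEUED"
    RUNNING = "RUNNING"


@dataclass
class Job:
    job_id: str
    workflow_url: str
    state: JobState = JobState.QUEUED  # a freshly submitted job is only queued


_ids = count(1)
_jobs: Dict[str, Job] = {}


def submit_job(workflow_url: str) -> Job:
    """Accept a request and queue it; nothing is executed here."""
    job = Job(job_id=f"job-{next(_ids)}", workflow_url=workflow_url)
    _jobs[job.job_id] = job
    return job


def start_next_job() -> Optional[Job]:
    """Stand-in for a scheduler: move the oldest queued job to RUNNING."""
    for job in _jobs.values():  # dicts preserve insertion order
        if job.state is JobState.QUEUED:
            job.state = JobState.RUNNING
            return job
    return None
```

The caller gets a job back immediately in the `QUEUED` state; only the (assumed) scheduler step later moves it to `RUNNING`.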
I don't know if Unfortunately I haven't found many alternatives that aren't already used in the field (such as In any case
IMO this is even more confusing. As already said, job is commonly used as a synonym for a workflow task. Better alternatives could be:
Another option could be to keep
A couple of years ago the then Containers & Workflows group hashed out how we wanted to refer to these things and the results are here. The claim was that APIs from this working group would use those terms, so now we don't need to argue about the terminology here :)
@geoffjentry Re-reading that link, it says that an instance of a Task is a TaskInstance but it seems to punt on what an instance of a workflow is. Out of the proposals so far, I vote for @pditommaso's "workflow-run" as an instance of a "workflow", mostly for old times' sake. "Job" is ok (especially now that Google JES seems to have been retired), but it isn't obvious to me from the naming that a "job" is an instance of a "workflow".
@denis-yuen I noticed that too, I'm 99% sure the decision was to just continue with the If we're going to deviate from this, and since you brought up Google (it's influenced by them), the Biosphere job manager is referring to a job generically as anything which a user might launch. So from our parlance either a workflow or a task. The physical incarnation becomes a job. As that's likely to be a key way in which users interact with these APIs, it is worth keeping that in mind. @denis-yuen - Google JES wasn't retired, it was just long ago renamed to the Pipelines API
@geoffjentry I don't remember it (either way), but it sounds fine. Thanks, I meant that as shorthand for "the name was retired, freeing the term job for us to use" In other words ...
Thanks for sharing your thoughts, @psafont, @pditommaso, @geoffjentry and @denis-yuen. My understanding is that standardizing terminology is an integral part of the process of defining any standard. As @geoffjentry pointed out this had happened for this group: here The current situation is that we identified a potential issue in the current WES specification, i.e., It seems the main sticking point is that: what is the best term to be used for an execution of a Given that in the field of scientific workflow, the usage of typical terms, such as, Whatever the final choice, I hope we can agree that the terms must be well defined and unambiguous within the context of the GA4GH workflow domain, although the usage may differ from other systems. Briefly, here is how I see it with relevant terms put together: A
Hi @junjun-zhang - as mentioned above I'm pretty sure that the If people want to break from that pattern, I don't think that this PR is the place to have that convo. Instead it should be taken to the workstream calls
@geoffjentry, you mentioned you never liked the But if we go with I don't know what others think about this. It's probably good to discuss it on the workstream calls.
@junjun-zhang My main point is that I don't think this PR is the right venue for the conversation at large. Ultimately there's never going to be a good answer for this. Nearly every group I interact with uses different (and often contradictory) terminology and many of them feel strongly that their terminology is the correct one. The one thing I did like about the And ultimately I think that unwieldy endpoint/object names are fine, as presumably the vast majority of interaction with these APIs will be programmatic.
@geoffjentry your point is well taken. I think we still need to address the issue:
One other perspective from this presentation about CWL, provenance, and Research Objects. They use "instance" to refer to a parameterized workflow in contrast to the workflow "definition" as a generic recipe. Workflow "run" is then the execution of a workflow instance. As the Definitely agree this is worth discussing with the Cloud Workstream.
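A rough sketch of that three-level distinction, purely for illustration: the class and field names (`WorkflowDefinition`, `WorkflowInstance`, `WorkflowRun`, `descriptor_url`, `params`, `run_id`) are assumptions made up for this example and are not taken from the presentation or from any GA4GH schema.

```python
from dataclasses import dataclass, field
from typing import Dict


@dataclass
class WorkflowDefinition:
    """The generic recipe, e.g. a CWL or WDL document (hypothetical fields)."""
    name: str
    descriptor_url: str


@dataclass
class WorkflowInstance:
    """A definition bound to concrete parameters, not yet executed."""
    definition: WorkflowDefinition
    params: Dict[str, str] = field(default_factory=dict)


@dataclass
class WorkflowRun:
    """One execution of an instance; the same instance may be run many times."""
    run_id: str
    instance: WorkflowInstance
    state: str = "QUEUED"


# One definition, one parameterized instance, two separate runs of it.
definition = WorkflowDefinition("align", "https://example.org/align.cwl")
instance = WorkflowInstance(definition, {"reference": "GRCh38"})
first_run = WorkflowRun("run-1", instance)
second_run = WorkflowRun("run-2", instance)
```

Under this reading, what a WES-like service tracks is the third class: the definition and its parameters arrive with the request, and each request yields its own run.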
re "tool", yeah - that was one where we decided the world was already in such a state of disarray that frankly there was no helping it, so punted by saying "we're not going to use it officially for the GA4GH APIs, but it can mean what one needs it to mean in the wild" sort of affair. @briandoconnor - Do you think it's worth revisiting the nomenclature debate w/ the larger group?
I am not sure we could get away from the need to well define the term In fact, there is already a definition for As expressed earlier, there is no need for us to harmonize all of the common terms in the broader scientific workflow area, it's already quite messy. However, IMO, all key terms used in the GA4GH Workflow Standards need to be well defined and unambiguous within GA4GH standards. It's totally OK to have different opinions, but within a standard, key terms need to be crystal clear.
@junjun-zhang re TRS note that Dockstore handles both workflows and individual tools (which IMO are just degenerate 1 step workflows, but YMMV)
Yes, @geoffjentry I noticed that in TRS The way I find useful to think about it is that both Just an FYI, I had a feature request in Dockstore about unifying
With respect, I think that the original point of this issue was slightly off-the-mark. The WES, as it stands today, is not intended to store definitions of workflows, only to accept and track run-time requests.
Am I missing something?
Thanks, @delagoya. Looking at @junjun-zhang's commit I think the PR could be better titled "replace Others can correct me, if that wasn't the intention.
@jaeddy This was my larger point. We'd already hashed out most of this terminology 2 years ago, at least in a "these are the words we use in the GA4GH APIs". My concern is that the terminology battle is always a painful one and rekindling it on the same words is just going to be a giant PITA
@jaeddy I do feel terminology in all four GA4GH Workflow API standards should be harmonized. Within WES,
@jaeddy it is indeed possible to replace It's even possible to change
@junjun-zhang will this shift to "WorkflowRun"? I like the idea of not including another term (Job for Workflow that's running).
Riffing on some thoughts while listening to panel discussions in Toronto... I feel like the
Following that idea, we have:
Basically, a task (for the sake of TRS) is a unit of work that has a command (or Thus, TRS is still a tool registry service, but we can be more explicit about tool types. This is analogous to how bio.tools classifies entries, although they have some classes that are obviously incompatible with description in a standardized format (CWL/WDL/NFL). In terms of the executed state of a workflow or task, I'm happy with run for both cases, e.g.:
The |
So equal split between:
There was no objection to choosing workflow + WorkflowRun at the Cloud Workstream meeting on Jun 25, 2018. So closing this PR and moving forward with #32.
The current `workflow` refers to both `workflow` and `job` (an instance of a workflow run), which is somewhat confusing. This pull request introduces the `job` concept; `workflow` remains and only refers to the workflow definition.

The changes were pretty much string replacements, i.e., `workflow` to `job` (in some cases `workflow job`) when appropriate.