Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Async #13

Merged
merged 14 commits into from
Nov 29, 2018
Merged

Async #13

merged 14 commits into from
Nov 29, 2018

Conversation

felipeZ
Copy link
Contributor

@felipeZ felipeZ commented Nov 13, 2018

Create an asynchronous interface to call an lieMD simulation, see #11

API

There are 3 new functions that support the async call:

  • run_async_ligand_solvent_md
  • run_async_ligand_protein_md
  • query_liemd_results

How does it work?

Both the run_async_ligand_solvent_md and run_async_ligand_protein_md take the same input that they blocking counterparts: run_ligand_solvent_md and run_ligand_protein_md, respectively. These async functions submit a job using cerise and return inmediately.

These functions return the a dictionary specified here. The dict returned by these functions can be used together with the query_liemd_results function to retrieve the output. When this last function is invoked, it checks for the status of the job using the job_name, username and other meta-data information provided by either run_async_ligand_solvent_md and run_async_ligand_protein_md .
If the job is still running an empty dictionary is return together with the status. Otherwise if the job has finished successfully a results dictionary is return. In case of failure the status and an empty dictionary are returned.

See the example

@LourensVeen
Copy link
Member

Hey, great to see this being implemented!

Looks good on the whole, but I have a few remarks about the schema:

  • The async response should never return 'completed', because it won't return any results. I'm not sure if it makes sense to forbid this in the schema. Also, is the schema for the result not defined centrally? This reads to me like this is a locally defined result type, but maybe I'm misreading, or it's not a problem.

  • Could the service return a single identifier only (job_name or task_id, not sure which is more appropriate), for passing back to query_liemd_results? I think it would be neater if things like the container name and port are internal to lie_md, because all sorts of crazy stuff could happen if these become corrupted on the client side (including security issues).

@felipeZ felipeZ mentioned this pull request Nov 13, 2018
@felipeZ
Copy link
Contributor Author

felipeZ commented Nov 13, 2018

@LourensVeen you are right about the async function should never return completed. Also, the schemas are using a local type but I have also defined the future type as in common_resources. I am just not sure if we need to defined this future type globally.

Also, job_name = "job_" + task_id, so in principle we need only the task_id. I can also make job_name = task_id.
The problem with having just job_name is that the lie_md module will need to keep track of what containers/username are in used. but I think that is the client who should be in charged.

@LourensVeen
Copy link
Member

The reason I was thinking it would be defined globally is that it carries a particular meaning, namely "this is an asynchronous service, you should call the specified callback again to get the result". The workflow engine needs to be able to distinguish between synchronous and asynchronous calls, and the idea is that it can do so by the return type. If it's a Future, it's asynchronous, otherwise it's synchronous and whatever comes out is the result.

If the workflow engine can distinguish this by itself, then the user doesn't have to think about it. They just specify the step, and the engine takes care of the rest. It's much more user-friendly.

For the same reason, I don't think that the user should have to worry about docker container ids and port names. The user is busy thinking about proteins and ligands and binding sites! They shouldn't have to know that there's a Cerise running in a Docker that connects to a cluster. All they should have to know is that they put a protein-with-ligand-md step in their workflow which will produce the energies as its output. Everything else is overhead that we should try to minimise.

Also, what happens if the user specifies a container name and port number that's in use by another user. Will they be able to steal the other user's core hours?

@marcvdijk
Copy link
Member

@felipeZ @LourensVeen Great to see this materialize so quickly.

As Lourens mentioned, I would try to keep the Future response as generic as possible and indeed have it be part of common_resources. It makes developing the workflow manager easier and more consistent and in the end it makes the life of the user easier.
It is very likely that more services are going to want to implement async endpoints and it would be consistent if these all make use of the same Future schema as much as possible.

That makes me wonder how many of the current parameters in the schema are lie_md specific and if they could be derived from the task_id. A task_id is generic to the Future object regardless the async endpoint implementation but a job_type or a port is not. Why not associate the lie_md specific parameters to the task_id and store them in the lie_md or Cerise database?

@felipeZ
Copy link
Contributor Author

felipeZ commented Nov 14, 2018

I agree with both of you I would return a Future object with only the task_id.

@marcvdijk
Copy link
Member

And probably the URI of the endpoint needed to check progress and retrieve results.

@felipeZ
Copy link
Contributor Author

felipeZ commented Nov 14, 2018

Following the recommendation of both @LourensVeen and @marcvdijk the functions run_async_ligand_solvent_md and run_async_ligand_protein_md returned a future object. This object basically contains the status, task_id and query_url. This last string is the URL to the endpoint to query the status and results of the future.

the function query_liemd_results returns also a future object. But when the status is completed, it also attaches the results to the future object, as specified in the schema.

We only need to store the task_id because we can always retrieve the whole service/job pair from the mongodb and the cerise interface.

The synchronous functions liemd_ligand and liemd_protein also return a future object](https://github.com/MD-Studio/common_resources/blob/master/common_resources/schemas/resources/future.v1.json), but the results are available and the status is completed or failed.

@marcvdijk marcvdijk merged commit 8350a5c into master Nov 29, 2018
@marcvdijk marcvdijk deleted the async branch November 29, 2018 12:39
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants