Issues with the grid_phonon_flow and custom decorators when using Parsl #1882

Merged

tomdemeyere
Contributor

Summary of Changes

It seems that Parsl does not handle the "job within a job" pattern well when custom executors (redecorated jobs) are used, which leads to the problems mentioned in #1852.

Changing this job (_ph_recover_job) to a subflow and removing the strip_decorator call makes things work. We now have to understand how this might affect other workflow engines.

Checklist

  • I have read the "Guidelines" section of the contributing guide. Don't lie! 😉
  • My PR is on a custom branch and is not named main.
  • I have added relevant, comprehensive unit tests.

Notes

  • Your PR will likely not be merged without proper and thorough tests.
  • If you are an external contributor, you will see a comment from @buildbot-princeton. This is solely for the maintainers.
  • When your code is ready for review, ping one of the active maintainers.

@buildbot-princeton
Collaborator

Can one of the admins verify this patch?


codecov bot commented Mar 15, 2024

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 99.29%. Comparing base (3abe881) to head (47196f9).

Additional details and impacted files
@@           Coverage Diff           @@
##             main    #1882   +/-   ##
=======================================
  Coverage   99.29%   99.29%           
=======================================
  Files          81       81           
  Lines        3273     3273           
=======================================
  Hits         3250     3250           
  Misses         23       23           


@Andrew-S-Rosen
Member

Andrew-S-Rosen commented Mar 15, 2024

I understand the bug now. You are updating the decorator to the quacc.recipes.espresso.phonons.phonon_job.ph_recover_job, but that function does not have a decorator associated with it when used in the @flow (it is stripped and thereby a pure function). What really needs to happen is that the decorator to _ph_recover_job needs to be updated.

This is not related to Parsl. It is a general bug.
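A minimal sketch of the failure mode described above, using toy stand-ins rather than quacc's real implementations: the `job`, `strip_decorator`, and `redecorate` functions below, the `executor` label, and the `calls` log are all hypothetical and exist only to show why customizing the inner function has no effect once it is stripped.

```python
import functools

calls = []  # (function name, executor label) for every decorated call


def job(executor="default"):
    """Toy @job decorator: logs which executor label ran the function."""
    def decorator(func):
        @functools.wraps(func)  # sets wrapper.__wrapped__ = func
        def wrapper(*args, **kwargs):
            calls.append((func.__name__, executor))
            return func(*args, **kwargs)
        return wrapper
    return decorator


def strip_decorator(func):
    """Return the undecorated function if one was wrapped."""
    return getattr(func, "__wrapped__", func)


def redecorate(func, decorator):
    """Swap whatever decorator func had for a new one."""
    return decorator(strip_decorator(func))


@job()
def recover_job(x):
    return x + 1


def flow_buggy(x, custom):
    customized = redecorate(recover_job, custom)  # user customization

    @job()  # the engine only ever sees this default decorator
    def _recover(x):
        # The customized decorator is discarded right here.
        return strip_decorator(customized)(x)

    return _recover(x)


def flow_fixed(x, custom):
    @job()
    def _recover(x):
        return strip_decorator(recover_job)(x)

    # Apply the customization to the function that is actually launched.
    return redecorate(_recover, custom)(x)
```

In this toy model, `flow_buggy(1, job(executor="gpu"))` still logs the default label for `_recover`, while `flow_fixed` logs the custom one. Both return the same value, which is why the problem only shows up as the wrong executor being used.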

@tomdemeyere
Contributor Author

> What really needs to happen is that the decorator to _ph_recover_job needs to be updated.

I tried to do this, but somehow it was still throwing the same error?

    pw_job, ph_init_job, ph_job, ph_recover_job = customize_funcs(
        ["relax_job", "ph_init_job", "ph_job", "ph_recover_job"],
        [relax_job, phonon_job, phonon_job, phonon_job],
        parameters=job_params,
        decorators=job_decorators,
    )

    _ph_recover_custom = redecorate(_ph_recover_job, job_decorators.get("ph_recover_job", job()))

@Andrew-S-Rosen
Member

Andrew-S-Rosen commented Mar 15, 2024

It is not immediately clear to me what is wrong with your snippet, but the original code is definitely not correct in that regard.

I can potentially see the argument for a @subflow even though it is returning a RunSchema instead of a list. It is a bit different than most @subflows though, where it is essentially launching multiple @jobs whose results are returned in an iterable (like _grid_phonon_subflow). Here, it's launching one @job and its result directly.
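To make the two shapes concrete, here is a rough sketch with placeholder decorators that simply run eagerly. The `job` and `subflow` functions, `single_calc`, and the plain dict standing in for a RunSchema are all hypothetical and not quacc's API.

```python
def job(func):
    """Placeholder @job: runs the function eagerly."""
    return func


def subflow(func):
    """Placeholder @subflow: runs the function eagerly."""
    return func


@job
def single_calc(x):
    # A plain dict stands in for a RunSchema-like result.
    return {"result": x * 2}


@subflow
def fan_out_subflow(xs):
    # Typical shape: launch many jobs and return their results in an iterable.
    return [single_calc(x) for x in xs]


@subflow
def single_job_subflow(x):
    # Shape discussed here: launch one job and return its result directly.
    return single_calc(x)
```

The second form returns a single result object rather than a list, which is the difference that would need checking against each workflow engine.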

If we go with the @subflow, we'll just have to make sure it works for Covalent. I think it will, but I can confirm. I can also give it some thought to see if there is an obvious route we are missing. I won't be able to do so until next week, however. Note that there are also issues with #1787, but that's for another time.

@tomdemeyere
Contributor Author

It seems like the last job (the recover job) is actually redoing everything instead of reading files. This is probably due to files not being copied.

I modified one test to make sure that Quacc catches this. I will fix it when I have the time.

@Andrew-S-Rosen
Member

Andrew-S-Rosen commented Mar 21, 2024

> It seems like the last job (the recover job) is actually redoing everything instead of reading files. This is probably due to files not being copied.
>
> I modified one test to make sure that Quacc catches this. I will fix it when I have the time.

Good idea for the test. Makes sense as a potential culprit.

The @subflow vs. @job issue is still on my radar. I'm hoping to do a deeper dive on your open issues around this weekend or so. 👍 Currently drowning in a sea of grant proposals, some of which would hopefully secure funds to support quacc if successful!

@Andrew-S-Rosen
Member

Andrew-S-Rosen commented Mar 28, 2024

@tomdemeyere: If you keep the original @job definition (i.e. don't switch to @subflow) and simply remove the strip_decorator() call, does that work? My understanding of the code suggests this would resolve your problems because you would no longer be stripping the decorator that you pre-customized. If that works, let's go with that approach.

@Andrew-S-Rosen
Member

Actually, scratch my comment. That would require the use of the @subflow decorator as you implemented.

I'll go ahead and merge this then, although I may return to it in the future.

@Andrew-S-Rosen Andrew-S-Rosen merged commit d3f30d8 into Quantum-Accelerators:main Mar 28, 2024
20 checks passed