`CalcJob`: allow nested target paths for `local_copy_list` #4373
Codecov Report
@@ Coverage Diff @@
## develop #4373 +/- ##
===========================================
+ Coverage 79.32% 79.33% +0.01%
===========================================
Files 468 468
Lines 34752 34755 +3
===========================================
+ Hits 27564 27569 +5
+ Misses 7188 7186 -2
So, apparently it all worked with the modifications we discussed, just as you had anticipated! I only have two comments: one I already left in the code; the second is whether it would be possible to add a test that checks that this now allows directories to be created by providing full paths in the `local_copy_list`.
aiida/engine/daemon/execmanager.py

    if not dry_run:
        for filename in folder.get_content_list():
            logger.debug('[submission of calculation {}] copying file/folder {}...'.format(node.pk, filename))
            transport.put(folder.get_abs_path(filename), filename)

    if dry_run:
        if remote_copy_list:
So, you moved this here and now you have a `dry_run` check right next to another; would it perhaps be better to unify these two conditional branches? (The first is just `if not dry_run`, but the second has an `else` clause in line 225.)
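As a rough illustration of the suggestion, the two adjacent conditionals could be folded into a single `if dry_run: ... else: ...`. The sketch below is not the code in `execmanager.py`: the function `upload_files` and the callables `put`, `copy_remote` and `note` are hypothetical stand-ins, and what each branch really does is only assumed here.

```python
def upload_files(filenames, remote_copy_list, dry_run, put, copy_remote, note):
    """Handle uploads with one ``dry_run`` branch instead of two conditionals.

    All arguments are hypothetical stand-ins for illustration only.
    """
    if dry_run:
        # Dry run: nothing goes over the transport, only record the requests.
        for entry in remote_copy_list:
            note(entry)
    else:
        # Real run: upload the sandbox content first, then do the remote copies.
        for filename in filenames:
            put(filename)
        for entry in remote_copy_list:
            copy_remote(entry)
```

Because the two branches are mutually exclusive, moving the upload loop into the `else` clause keeps the order of operations of a real run unchanged.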
Done and also added a test for the nested paths
So, I've also noticed that in the `if` section I mentioned in the review there is a part where, if you are doing a dry run, it copies the files requested from the remote machine locally to your sandbox folder. It seems to me that this part would be easy to modify if we wanted to allow a list for copying files from different machines: it would run this code even when it is not a dry run and the machine UUID is different, and then add the files to the provenance exclude list as we are doing in this PR. (aiida-core/aiida/engine/daemon/execmanager.py, lines 226 to 248 in b002a9a)
81213ec to a70ede0
If a `CalcJob` specified a `local_copy_list` containing an entry whose target remote path contains nested subdirectories, `upload_calculation` would raise an exception unless all subdirectories already existed. To solve this, one could have added a transport call that creates the directories if the target path is nested. However, this would risk being very inefficient if there are many local copy list instructions with nested relative paths, since each would incur a command over the transport.

Instead, we change the design and simply apply the local copy list instructions to the sandbox folder on the local file system. At the same time, this allows us to get rid of the inefficient workaround of first writing each file to a temporary file, which was needed because the transport interface doesn't accept filelike objects and the file repository does not expose filepaths on the local file system.

The only additional thing to take care of is to make sure the files from the local copy list do not end up in the repository of the node, which was the whole point of the `local_copy_list`'s existence in the first place. This is solved by simply adding each file that is added to the sandbox to the `provenance_exclude_list` as well.
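The design described above can be sketched with the standard library alone. This is a simplified illustration under stated assumptions, not the actual AiiDA implementation: the helpers `apply_local_copy_list` and `files_to_store` are hypothetical, and each entry is assumed to be a plain `(source_path, relative_target)` pair rather than the real `local_copy_list` format.

```python
import shutil
from pathlib import Path


def apply_local_copy_list(sandbox, local_copy_list):
    """Copy every (source, target) pair into the sandbox.

    Hypothetical helper: ``source`` is a path on the local file system and
    ``target`` a relative path that may contain nested subdirectories.
    Returns the list of targets, to be used as a provenance exclude list.
    """
    sandbox = Path(sandbox)
    provenance_exclude_list = []
    for source, target in local_copy_list:
        destination = sandbox / target
        # Creating parent directories locally needs no transport command.
        destination.parent.mkdir(parents=True, exist_ok=True)
        shutil.copyfile(source, destination)
        provenance_exclude_list.append(target)
    return provenance_exclude_list


def files_to_store(sandbox, provenance_exclude_list):
    """Return the relative sandbox files that are not excluded from storage."""
    sandbox = Path(sandbox)
    excluded = {Path(target) for target in provenance_exclude_list}
    return sorted(
        path.relative_to(sandbox).as_posix()
        for path in sandbox.rglob('*')
        if path.is_file() and path.relative_to(sandbox) not in excluded
    )
```

In this sketch the nested directories exist in the sandbox before anything is uploaded, so a single recursive upload of the sandbox would carry them along, while `files_to_store` shows how the copied files can be kept out of what is stored for the node.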
a70ede0 to 3641fcd
@ramirezfranciscof thanks for the review. I addressed your comments and this should be ready for review.
Cool stuff; good to go for me! I'll see if I can rebase transport after you merge and start finishing that up.
Fixes #4350