upload+download: RF to reuse logic/options/UI #48

Open
yarikoptic opened this issue Mar 13, 2020 · 5 comments

@yarikoptic
Member

ATM the upload and download implementations are separate, and the analysis of whether to download/upload a file happens right before the actual download/upload. That prevents implementing proper "sync" functionality, where the analysis must first determine whether any file needs to be removed (or maybe moved! ;-) ) on the local/server end. So an RF (refactoring) should be done to minimize the difference between the API and implementation of the two:

  • given the path specification, both should first obtain the list of local/remote "assets"
  • given the mode of operation for "existing" files, do the analysis and provide the user with a summary of the upcoming transfer
    • possibly even present the user with a list of files which would not be downloaded/uploaded because they are already newer on the destination
  • pass into the actual download/upload functions only the files which were decided to be acted upon
  • if a full dandiset (or a folder) is to be downloaded/uploaded -- perform the necessary deletions (locally or remotely) (might want to be a dedicated option, e.g. --sync)
  • exit with non-0 if any file which could have been downloaded/uploaded was not (e.g. in case of --sync and --file-mode not being "overwrite" or "force", if some file was already newer and we didn't perform the transfer)
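
A rough sketch of how the planning step above could be separated from the transfer itself. Everything here is hypothetical and only for illustration: `Asset`, `Plan`, `plan_transfer`, and the exact meaning given to each "existing" mode are assumptions, not the current code. It also treats every skipped file as a reason for a non-0 exit, which may be stricter than intended.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Iterable, List


@dataclass(frozen=True)
class Asset:
    """Hypothetical record for one local or remote file."""
    path: str        # path relative to the dandiset root
    mtime: datetime  # last modification time


@dataclass
class Plan:
    """Result of the analysis, produced before any transfer happens."""
    transfer: List[str]  # paths to hand to the actual download/upload
    skipped: List[str]   # paths present on the destination and left alone
    delete: List[str]    # paths to remove from the destination (sync only)

    def exit_code(self) -> int:
        # Assumption: any skipped file makes the run exit with non-0.
        return 1 if self.skipped else 0


def plan_transfer(source: Iterable[Asset], dest: Iterable[Asset],
                  existing: str = "skip", sync: bool = False) -> Plan:
    """Compare two listings and decide what to do, without transferring."""
    src = {a.path: a for a in source}
    dst = {a.path: a for a in dest}
    plan = Plan(transfer=[], skipped=[], delete=[])
    for path, asset in sorted(src.items()):
        other = dst.get(path)
        if other is None or existing in ("overwrite", "force"):
            plan.transfer.append(path)
        elif existing == "error":
            raise FileExistsError(path)
        elif existing == "refresh" and other.mtime < asset.mtime:
            # Assumed meaning of "refresh": transfer only if source is newer.
            plan.transfer.append(path)
        else:
            plan.skipped.append(path)
    if sync:
        # Anything on the destination that the source no longer has.
        plan.delete = sorted(set(dst) - set(src))
    return plan
```

With such a plan in hand, the summary for the user is just a printout of the three lists, and the same function serves both directions by swapping which listing is `source` and which is `dest`.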

For download, we would need:

  • to not use girder's downloadFile, which relies on a context manager to report progress, but which gets only the filename as an ID.
  • Also, it downloads into a temporary directory, which might be on a different partition. The temporary file should be near the target file; we could then detect/continue interrupted downloads, etc.
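
A minimal sketch of the "temporary file near the target" idea, using only the standard library. `download_near_target`, the `chunks_from(offset)` callable, and the `.part` suffix are all made up for illustration; how to actually request bytes from a given offset from the server is left out.

```python
import os
from pathlib import Path
from typing import Callable, Iterable, Optional


def download_near_target(chunks_from: Callable[[int], Iterable[bytes]],
                         target,
                         progress: Optional[Callable[[int], None]] = None) -> Path:
    """Write into '<target>.part' in the same directory, then rename.

    chunks_from(offset) is a hypothetical callable yielding the file's
    bytes starting at `offset`, so an interrupted download can continue.
    """
    target = Path(target)
    target.parent.mkdir(parents=True, exist_ok=True)
    partial = target.with_name(target.name + ".part")
    # A leftover partial file means a previous download was interrupted.
    offset = partial.stat().st_size if partial.exists() else 0
    with open(partial, "ab") as f:
        for chunk in chunks_from(offset):
            f.write(chunk)
            offset += len(chunk)
            if progress is not None:
                progress(offset)  # report total bytes written so far
    # Same directory, hence same partition: the rename does not copy data.
    os.replace(partial, target)
    return target
```

Since progress is reported through a plain callback that receives the byte count, the caller can attach whatever ID it wants to the report instead of being limited to the filename.
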
@yarikoptic
Member Author

A few additional notes:

  • as briefly discussed in #47 (organize: add optional --disappeared=error|remove or --mode=complete|incremental), organize could also be considered a form of upload/download - we are processing from one listing to another
  • as discussed in slack, and echoing user prompts in previous discussions, we might follow the analog suggested by @satra to the kubernetes --yes option for actually performing any destructive operation (replacing or deleting a file, etc.). Not yet sure if a single flag would be sufficient though. E.g. for upload --existing we already have error|skip|force|overwrite|refresh, which cover various use cases. We could of course double it by requiring an additional --yes if an actual force or overwrite or refresh is to annihilate the previous file, but that sounds like a not so useful duplication...
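
To make the duplication concern concrete, here is a small argparse sketch of what "doubling" --existing with --yes might look like. It is not the actual CLI: the `upload-sketch` program name, the default of `error`, and the `may_destroy` helper are assumptions for illustration only.

```python
import argparse

# Modes assumed here to be able to replace an existing file.
DESTRUCTIVE = {"force", "overwrite", "refresh"}


def build_parser() -> argparse.ArgumentParser:
    parser = argparse.ArgumentParser(prog="upload-sketch")
    parser.add_argument(
        "--existing",
        choices=["error", "skip", "force", "overwrite", "refresh"],
        default="error",  # assumed default, for the sketch only
    )
    parser.add_argument(
        "--yes", action="store_true",
        help="confirm destructive operations (hypothetical flag)",
    )
    return parser


def may_destroy(args: argparse.Namespace) -> bool:
    """Allow a destructive mode only when --yes was also given."""
    return args.existing not in DESTRUCTIVE or args.yes
```

Under this scheme `--existing force` alone would be refused and `--existing force --yes` accepted, so the user states the same intent twice, which is the duplication mentioned above.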

@satra
Member

satra commented Apr 9, 2020

so storing our object id --> filepath somewhere on dandi should help with updates.

(i still think we should add some checksum to file metadata)
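
For illustration, a checksum for the file metadata could be computed along these lines. The choice of SHA-256, the chunk size, and the `file_checksum` name are assumptions; nothing in this thread settles which digest would be used.

```python
import hashlib


def file_checksum(path, chunk_size: int = 1 << 20) -> str:
    """Return the hex SHA-256 digest of a file, read in chunks
    so that large files do not have to fit into memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        for chunk in iter(lambda: f.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()
```

Comparing the digest stored in the metadata with one computed after a download would verify the bits independently of paths and modification times.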

@yarikoptic
Member Author

I don't think it would be needed - we will just get the full list of assets on the remote end first, and they will include their remote path and object id. Given there are no duplicates (:-)), we can always figure out renames to update. And in general it would be the same path if the file wasn't renamed locally.

@yarikoptic
Member Author

Or are you aiming to meld organize into upload?

@satra
Member

satra commented Apr 9, 2020

i'm still looking out for verification of bits.
