Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

add support for renaming sources on download #2223

Merged
merged 5 commits into from
Jun 22, 2017

Conversation

boegel
Copy link
Member

@boegel boegel commented May 15, 2017

This adds support for:

sources = ["toy.tar.gz -> toy-0.0.tar.gz"]

With this construct, the source will be downloading using toy.tar.gz, but saved to disk using toy-0.0.tar.gz. If the file is already available as toy-0.0.tar.gz it will be picked up and toy.tar.gz will not be downloaded again.

I'm open to suggestions for alternate notations other than ->, this is just one of the possibilities I went with that make it quite explicit. Another alternative would be something like toy.tar.gz:toy-0.0.tar.gz.

fix for #455

cc @wpoely86, @pneerincx, @fgeorgatos

@boegel boegel added this to the 3.3.0 milestone May 15, 2017
@boegel
Copy link
Member Author

boegel commented May 18, 2017

@wpoely86 thoughts on this?

@akesandgren
Copy link
Contributor

akesandgren commented May 18, 2017

looks interesting and useful. "->" is a good separator, rarely found in actual file names, ":" is far more common.

But wouldn't a list be even better, something like

sources = [
['toy.tgz', 'toy-1.0.tgz'],
'something-else.tgz',
]

Or some other construct instead of putting all of it into a string

@boegel
Copy link
Member Author

boegel commented May 18, 2017

@akesandgren Well, the problem is that we already support specifying a source together with an unpack command, see for example https://github.com/hpcugent/easybuild-easyconfigs/blob/master/easybuild/easyconfigs/a/ABINIT/ABINIT-7.2.1-x86_64_linux_gnu4.5.eb#L29.

We could probably detect whether the 2nd provided string is an extract command or an alternate name for the tarball (e.g. if it includes ' %s'), and even allow specifying all 3 (original name, rename-name and extract command), although that doesn't make much sense since if you specify a new name you might as well do it properly so EB can correctly derive the unpack command...

@akesandgren
Copy link
Contributor

yes, but isn't ['a', 'b'] != ('a', 'b') in python typing? And thus distinguishable?

You might need a awkward unpack command, think lapack in openblas or similar, together with renaming.

@boegel
Copy link
Member Author

boegel commented May 18, 2017

@akesandgren Sure, a list [..., ...] and a tuple (..., ...) are easy to distinguish, but it would make things more confusing imho, I would then prefer either using a single string with -> or auto-detecting whether it's an unpack command or an alternate name.

@wpoely86
Copy link
Member

I'm not sure I like the syntax but it's a great feature to have

@boegel
Copy link
Member Author

boegel commented May 19, 2017

@wpoely86 I'm not sure I'm very fond of it myself, to be honest... I started with this to dance around the issue of already supporting ("<source>", "<extract command>"), but we should find a better solution...

Thoughts on auto-detecting whether the 2nd element specified is an extract command (based on the presence of %s in it) or an alternate filename?

Another option would be allowing to specify sources with a dict value, for example something like:

sources = [{'name': SOURCE_TAR_GZ, 'download_as': 'archive.tar.gz'}]

That's a bit more verbose, but it's explicit...

We can then even consider deprecating ("<source>", "<extract command>") and move to using {'name': 'foo.bz2', 'extract_cmd': "tar xfzj %s"} instead...

@akesandgren
Copy link
Contributor

akesandgren commented May 19, 2017

I like that. But it would be even smoother if one could specify

sources = [
 'oldstyle-simple-source',
('old-style-with-extractcmd', 'cmd-for-extract'),
{'name': 'new-style', 'download_as': 'save-name', 'extract_cmd': "some cmd"},
]

At least under a transition period.

@boegel
Copy link
Member Author

boegel commented May 19, 2017

@akesandgren That would be the idea, but I'm inclined to deprecate the ('old-style-with-extractcmd', 'cmd-for-extract') then, and in the long-term only support 'oldstyle-simple-source' and the dict format (which adds flexibility to support other things at a later stage, maybe even per-source URL/checksums rather than having separate lists...)

@akesandgren
Copy link
Contributor

+1 on deprecating. As long as there is a transition period to rewrite stuff :-)

+1 on chksum in that same dict. It's currently difficult to be sure which one goes where for a newbie.
The URL could of course be per source, but having a common one (or two) to use is often good enough, forcing it into the dict would be awkward probably.

… dict; deprecate 2-tuple list/element source spec to specify custom extract command
@boegel
Copy link
Member Author

boegel commented Jun 22, 2017

Documentation update @ easybuilders/easybuild#345

@boegel
Copy link
Member Author

boegel commented Jun 22, 2017

Thanks for the review @akesandgren and @wpoely86!

I'll go ahead and merge this in for EasyBuild v3.3.0.

It does not support including checksums inside of sources yet, but that can be added later with breaking backward compatibility.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants