Skip to content

Commit

Permalink
Merge pull request #2247 from MatthiasZepper/DownloadForTower
Browse files Browse the repository at this point in the history
After approval by Julia and Matthias, the new Download functionality for Tower is hopefully ready for use!
  • Loading branch information
MatthiasZepper authored Jun 1, 2023
2 parents ef62904 + 259afa6 commit 138339a
Show file tree
Hide file tree
Showing 12 changed files with 1,340 additions and 579 deletions.
2 changes: 1 addition & 1 deletion .github/workflows/pytest-frozen-ubuntu-20.04.yml
Original file line number Diff line number Diff line change
Expand Up @@ -15,7 +15,7 @@ concurrency:
cancel-in-progress: true

jobs:
pytest:
pytest-frozen:
runs-on: ubuntu-20.04
steps:
- uses: actions/checkout@v3
Expand Down
5 changes: 5 additions & 0 deletions CHANGELOG.md
Original file line number Diff line number Diff line change
Expand Up @@ -12,6 +12,11 @@
- Remove `aws_tower` profile ([#2287])(https://github.com/nf-core/tools/pull/2287)
- Fixed the Slack report to include the pipeline name ([#2291](https://github.com/nf-core/tools/pull/2291))

### Download

- Introduce a `--tower` flag for `nf-core download` to obtain pipelines in an offline format suited for [seqeralabs® Nextflow Tower](https://cloud.tower.nf/) ([#2247](https://github.com/nf-core/tools/pull/2247)).
- Refactored the CLI for `--singularity-cache` in `nf-core download` from a flag to an argument. The prior options were renamed to `amend` (container images are only saved in the `$NXF_SINGULARITY_CACHEDIR`) and `copy` (a copy of the image is saved with the download). `remote` was newly introduced and allows to provide a table of contents of a remote cache via an additional argument `--singularity-cache-index` ([#2247](https://github.com/nf-core/tools/pull/2247)).

### Linting

- Warn if container access is denied ([#2270](https://github.com/nf-core/tools/pull/2270))
Expand Down
25 changes: 15 additions & 10 deletions README.md
Original file line number Diff line number Diff line change
Expand Up @@ -20,7 +20,7 @@ A python package with helper tools for the nf-core community.
- [`nf-core` tools update](#update-tools)
- [`nf-core list` - List available pipelines](#listing-pipelines)
- [`nf-core launch` - Run a pipeline with interactive parameter prompts](#launch-a-pipeline)
- [`nf-core download` - Download pipeline for offline use](#downloading-pipelines-for-offline-use)
- [`nf-core download` - Download a pipeline for offline use](#downloading-pipelines-for-offline-use)
- [`nf-core licences` - List software licences in a pipeline](#pipeline-software-licences)
- [`nf-core create` - Create a new pipeline with the nf-core template](#creating-a-new-pipeline)
- [`nf-core lint` - Check pipeline code against nf-core guidelines](#linting-a-workflow)
Expand Down Expand Up @@ -348,13 +348,13 @@ nextflow run /path/to/download/nf-core-rnaseq-dev/workflow/ --input mydata.csv -
### Downloaded nf-core configs

The pipeline files are automatically updated (`params.custom_config_base` is set to `../configs`), so that the local copy of institutional configs are available when running the pipeline.
So using `-profile <NAME>` should work if available within [nf-core/configs](https://github.com/nf-core/configs).
So using `-profile <NAME>` should work if available within [nf-core/configs](https://github.com/nf-core/configs). This option is not available when downloading a pipeline for use with [Nextflow Tower](#adapting-downloads-to-nextflow-tower) because the application manages all configurations separately.

### Downloading singularity containers

If you're using Singularity, the `nf-core download` command can also fetch the required Singularity container images for you.
To do this, select `singularity` in the prompt or specify `--container singularity` in the command.
Your archive / target output directory will then include three folders: `workflow`, `configs` and also `singularity-containers`.
Your archive / target output directory will then also include a separate folder `singularity-containers`.

The downloaded workflow files are again edited to add the following line to the end of the pipeline's `nextflow.config` file:

Expand All @@ -372,10 +372,9 @@ We highly recommend setting the `$NXF_SINGULARITY_CACHEDIR` environment variable
If found, the tool will fetch the Singularity images to this directory first before copying to the target output archive / directory.
Any images previously fetched will be found there and copied directly - this includes images that may be shared with other pipelines or previous pipeline version downloads or download attempts.

If you are running the download on the same system where you will be running the pipeline (eg. a shared filesystem where Nextflow won't have an internet connection at a later date), you can choose to _only_ use the cache via a prompt or cli options `--singularity-cache-only` / `--singularity-cache-copy`.
If you are running the download on the same system where you will be running the pipeline (eg. a shared filesystem where Nextflow won't have an internet connection at a later date), you can choose to _only_ use the cache via a prompt or cli options `--singularity-cache amend`. This instructs `nf-core download` to fetch all Singularity images to the `$NXF_SINGULARITY_CACHEDIR` directory but does _not_ copy them to the workflow archive / directory. The workflow config file is _not_ edited. This means that when you later run the workflow, Nextflow will just use the cache folder directly.

This instructs `nf-core download` to fetch all Singularity images to the `$NXF_SINGULARITY_CACHEDIR` directory but does _not_ copy them to the workflow archive / directory.
The workflow config file is _not_ edited. This means that when you later run the workflow, Nextflow will just use the cache folder directly.
If you are downloading a workflow for a different system, you can provide information about its image cache to `nf-core download`. To avoid unnecessary container image downloads, choose `--singularity-cache remote` and provide a list of already available images as plain text file to `--singularity-cache-index my_list_of_remotely_available_images.txt`. To generate this list on the remote system, run `find $NXF_SINGULARITY_CACHEDIR -name "*.img" > my_list_of_remotely_available_images.txt`. The tool will then only download and copy images into your output directory, which are missing on the remote system.

#### How the Singularity image downloads work

Expand All @@ -391,16 +390,22 @@ Where both are found, the download URL is preferred.

Once a full list of containers is found, they are processed in the following order:

1. If the target image already exists, nothing is done (eg. with `$NXF_SINGULARITY_CACHEDIR` and `--singularity-cache-only` specified)
2. If found in `$NXF_SINGULARITY_CACHEDIR` and `--singularity-cache-only` is _not_ specified, they are copied to the output directory
1. If the target image already exists, nothing is done (eg. with `$NXF_SINGULARITY_CACHEDIR` and `--singularity-cache amend` specified)
2. If found in `$NXF_SINGULARITY_CACHEDIR` and `--singularity-cache copy` is specified, they are copied to the output directory
3. If they start with `http` they are downloaded directly within Python (default 4 at a time, you can customise this with `--parallel-downloads`)
4. If they look like a Docker image name, they are fetched using a `singularity pull` command
- This requires Singularity to be installed on the system and is substantially slower
- This requires Singularity/Apptainer to be installed on the system and is substantially slower

Note that compressing many GBs of binary files can be slow, so specifying `--compress none` is recommended when downloading Singularity images.
Note that compressing many GBs of binary files can be slow, so specifying `--compress none` is recommended when downloading Singularity images that are copied to the output directory.

If the download speeds are much slower than your internet connection is capable of, you can set `--parallel-downloads` to a large number to download loads of images at once.

### Adapting downloads to Nextflow Tower

[seqeralabs® Nextflow Tower](https://cloud.tower.nf/) provides a graphical user interface to oversee pipeline runs, gather statistics and configure compute resources. While pipelines added to _Tower_ are preferably hosted at a Git service, providing them as disconnected, self-reliant repositories is also possible for premises with restricted network access. Choosing the `--tower` flag will download the pipeline in an appropriate form.

Subsequently, the `*.git` folder can be moved to it's final destination and linked with a pipeline in _Tower_ using the `file:/` prefix.

## Pipeline software licences

Sometimes it's useful to see the software licences of the tools used in a pipeline.
Expand Down
44 changes: 39 additions & 5 deletions nf_core/__main__.py
Original file line number Diff line number Diff line change
Expand Up @@ -209,21 +209,46 @@ def launch(pipeline, id, revision, command_only, params_in, params_out, save_all
# nf-core download
@nf_core_cli.command()
@click.argument("pipeline", required=False, metavar="<pipeline name>")
@click.option("-r", "--revision", type=str, help="Pipeline release")
@click.option(
"-r",
"--revision",
multiple=True,
help="Pipeline release to download. Multiple invocations are possible, e.g. `-r 1.1 -r 1.2`",
)
@click.option("-o", "--outdir", type=str, help="Output directory")
@click.option(
"-x", "--compress", type=click.Choice(["tar.gz", "tar.bz2", "zip", "none"]), help="Archive compression type"
)
@click.option("-f", "--force", is_flag=True, default=False, help="Overwrite existing files")
@click.option("-t", "--tower", is_flag=True, default=False, help="Download for seqeralabs® Nextflow Tower")
@click.option(
"-c", "--container", type=click.Choice(["none", "singularity"]), help="Download software container images"
)
@click.option(
"--singularity-cache-only/--singularity-cache-copy",
help="Don't / do copy images to the output directory and set 'singularity.cacheDir' in workflow",
"-s",
"--singularity-cache",
type=click.Choice(["amend", "copy", "remote"]),
help="Utilize the 'singularity.cacheDir' in the download process, if applicable.",
)
@click.option(
"-i",
"--singularity-cache-index",
type=str,
help="List of images already available in a remote 'singularity.cacheDir', imposes --singularity-cache=remote",
)
@click.option("-p", "--parallel-downloads", type=int, default=4, help="Number of parallel image downloads")
def download(pipeline, revision, outdir, compress, force, container, singularity_cache_only, parallel_downloads):
def download(
pipeline,
revision,
outdir,
compress,
force,
tower,
container,
singularity_cache,
singularity_cache_index,
parallel_downloads,
):
"""
Download a pipeline, nf-core/configs and pipeline singularity images.
Expand All @@ -233,7 +258,16 @@ def download(pipeline, revision, outdir, compress, force, container, singularity
from nf_core.download import DownloadWorkflow

dl = DownloadWorkflow(
pipeline, revision, outdir, compress, force, container, singularity_cache_only, parallel_downloads
pipeline,
revision,
outdir,
compress,
force,
tower,
container,
singularity_cache,
singularity_cache_index,
parallel_downloads,
)
dl.download_workflow()

Expand Down
Loading

0 comments on commit 138339a

Please sign in to comment.