Manage your conda environment(s) with Ansible. Create new conda environments, install, update, and remove packages.
Similar to the pip module, this supports passing a list into the name
field. This results in fast and efficient use of running conda commands.
The role is designed to make the conda
module available for use in subsequent tasks.
- conda (tested on
4.5.0
and higher)
---
- name: Test evandam.conda
hosts: all
roles:
- role: evandam.conda
tasks:
- name: Update conda
conda:
name: conda
state: latest
executable: /opt/conda/bin/conda
- name: Create a conda environment
conda:
name: python
version: 3.7
environment: python3
state: present
- name: Install some packages in the environment
conda:
name:
- pandas
- numpy
- tensorflow
environment: python3
- name: Install R, using a versioned name
conda:
name: r-base=3.5.0
Let's assume we have a yaml-dict/list in a file (./software/conda.yaml), defining all our environments and their packages staged for installation like this:
Quality_control:
- afterqc
- multiqc
- trimmomatic
Assembly:
- spades
- megahit
This can be beneficial for several reasons. Firstly, you can separate different deployments and software setups in discrete git branches referencing only this file. Secondly, if you have collaborators that are not git savvy, they only have to worry about editing this specific file in order to update any software deployments they need, and it can be done easily f.ex by editing this raw file in the browser. Thirdly, using a vars-loop, it simplifies our code a lot. As seen in the next example.
The following playbook consists of 3 discrete parts (and assumes anaconda is already installed)
- It parses the previously mentioned conda.yaml so our target environments and packages are readily defined. (optimally, channels should also be parsed like this, but in this example we are lazy and hard code them in)
- Installs Mamba in the base environment (see bottom for explanation).
- Loops through our "envs" var creating every defined environment and installs its packages using Mamba.
---
- name: Install Conda envs and packages
hosts: my_remote_machine
roles:
- role: evandam.conda
remote_user: admin
become: yes
tasks:
- include_vars:
file: ./software/conda.yaml
name: envs
- name: Install Mamba in base env using standard conda
become_user: user
conda:
environment: base
name: mamba
state: latest
channels:
- conda-forge
executable: /opt/miniconda3/bin/conda
- name: Create Conda environments and install packages using Mamba instead of default Conda
become_user: user
conda:
environment: "{{item.key}}"
name: "{{item.value}}"
state: latest
channels:
- bioconda
- conda-forge
- defaults
executable: /opt/miniconda3/bin/mamba
loop: "{{ envs | dict2items }}"
One of the downsides of using Anaconda as a package manager is the bloated channels. For certain packages, especially from conda-forge, the environment solver can use hours of computation to figure out dependencies before any installation even starts. The solution: Mamba/Micromamba. Mamba wraps the executable conda, implementing the solver in C++, making it blazingly fast (notice the executable: directive change from conda to mamba in the last task). As a real world example, with some 30+ environments for a student course deployment, this reduced deployment time from 2,5 hours to 15 minutes!
BSD