A more flexible Hydra Submitit Launcher plugin
This launcher is a extension of Submiti Launcher Plugin.
- Iterative scheduling of jobs:
- Max Number for Active and Pending Jobs
- Automatically Schedule Jobs and Wait depending on Limits
- Single Jobs are submitted instead of Job arrays
- Hydra must be active to schedule jobs
- No Naming Scheme for Slurm jobs. This makes manual canceling difficult
pip install hydra-submitit-extension
python main.py hydra/launcher=submitit_slurm_extended -m
All Parameter:
# @package hydra.launcher
# No changes to Submitit Launcher plugin
submitit_folder: ${hydra.sweep.dir}/.submitit/%j
timeout_min: 30
cpus_per_task: 1
gpus_per_node: null
tasks_per_node: 1
mem_gb: 1
nodes: 1
name: ${hydra.job.name}
partition: "dev_single"
qos: null
comment: null
constraint: null
exclude: null
#gres: "gpu:1"
cpus_per_gpu: null
gpus_per_task: null
mem_per_gpu: null
mem_per_cpu: null
account: null
signal_delay_s: 120
max_num_timeout: 0
additional_parameters: {}
array_parallelism: 256
setup: null
# Hydra Submitit Extension
_target_: hydra_plugins.hydra_submitit_extension.submitit_launcher.ExtendedSlurmLauncher
# Time between reschedule tries in s.
# Min is 60s
reschedule_interval: 60
# Maximum number of total active jobs in slurm account.
max_jobs_in_total: 5
# Maximum number of active jobs in current partition (e.g dev_single).
max_jobs_in_partition: 4
# Maximum number of active jobs in current sweep.
max_jobs_in_sweep: 3
- Iterative Scheduling
- Greedy Partition Selection