Spot instances with Snakemake and Nextflow
Spot instances are up to 80% cheaper than standard on-demand instances, but they can be terminated – or "preempted" – at any time. The engineering effort needed to make pipelines tolerant of those terminations often cancels out the cost savings.
FlowDeploy fixes this by adding a layer of error handling for spot instances, but the syntax is a bit different.
Usage
Both Nextflow and Snakemake support spot instances through FlowDeploy, but you'll have to enable spot instances first. Once you've enabled spot instances, you can request them for the entire workflow or specific steps.
- Snakemake
- Nextflow
Snakemake
Basic usage
Using spot instances by default
You can enable spot instances by default on the command line:
snakemake [...] --default-resources "preemptible=True"
Configuring spot instances for a specific rule
Resources set on an individual rule take precedence over command-line settings. That means you can set preemptible=False for a specific rule to override default usage of spot instances.
You can enable or disable spot instances for a specific rule:
rule RULE_NAME:
    resources:
        preemptible=False
    ...
Advanced usage
Retrying after termination
Snakemake retries are supported for preempted runs.
rule RULE_NAME:
    resources:
        preemptible=True
    retries: 2
    ...
Requesting an on-demand instance after spot instance termination
Spot instance termination rates are highly variable, so it sometimes makes sense to switch from spot instances to on-demand instances after a couple of spot instance terminations.
To do this, set a preemptible function. You can use the "attempt" argument as passed by Snakemake to switch from spot instances to on-demand instances after two attempts:
def set_preemptible(wildcards, attempt):
    return attempt < 3

rule RULE_NAME:
    resources:
        preemptible=set_preemptible
    retries: 2
    ...
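To check what the function above returns on each attempt, here is a minimal sketch that calls it in plain Python, outside of Snakemake. It relies on two facts about Snakemake: attempts are numbered from 1, and retries: 2 allows up to three attempts in total. The helper preemptible_schedule and the None passed in place of wildcards are illustrative assumptions, not part of Snakemake or FlowDeploy.

```python
def set_preemptible(wildcards, attempt):
    # Same function as above: spot for attempts 1 and 2, on-demand from attempt 3.
    return attempt < 3


def preemptible_schedule(resource_fn, retries):
    """Hypothetical helper: the value resource_fn yields on each attempt.

    Snakemake numbers attempts from 1, and `retries: N` allows N + 1 attempts.
    """
    return [resource_fn(None, attempt) for attempt in range(1, retries + 2)]


if __name__ == "__main__":
    schedule = preemptible_schedule(set_preemptible, retries=2)
    for attempt, use_spot in enumerate(schedule, start=1):
        kind = "spot" if use_spot else "on-demand"
        print(f"attempt {attempt}: {kind}")
```

With retries: 2, the first two attempts request a spot instance and the final attempt falls back to on-demand. If you raise retries, adjust the threshold in set_preemptible as well, otherwise every additional attempt also runs on-demand.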
Nextflow
Basic usage
Using spot instances by default
Add the preemptible directive to the process scope in your Nextflow config:
process {
    preemptible = true
    ...
}
Configuring spot instances for a specific process
Resources set on an individual process take precedence over config files. That means you can set preemptible = false for a specific process to override default usage of spot instances.
Add the preemptible directive to the process definition:
process {
    preemptible = false
    ...
}
Advanced usage
Retrying after termination
You can set the Nextflow errorStrategy directive to "retry" to respawn preempted processes.
process {
    preemptible = true
    errorStrategy 'retry'
    maxRetries 3
    ...
}