Command-line interface

How to use the FlowDeploy command-line interface.

Installation

Get started by installing the flowdeploy package via pip:

pip install flowdeploy

pip will automatically add the command-line executable to your environment's path. For a global install, use pipx.
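
For example, a global install with pipx (a minimal sketch – it assumes pipx itself is already installed) looks like:

pipx install flowdeploy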

(Want another install method like brew? Let us know!)

Basic usage

flowdeploy run nextflow nf-core/fetchngs --release 1.11.0 --outdir fd://shared/outdir --profile docker --profile test --input-file ./input.json

Where input.json contains

[
  {
    "source": "s3://flowdeploy-public-demo/ids.csv",
    "destination": "fd://shared/ids.csv",
    "arg": "--input"
  }
]

Once a FlowDeploy instance has spawned, the command-line interface displays the status of the run until it finishes. If you terminate the command, the FlowDeploy instance continues running in the cloud – no need to keep your computer running.

info

Terminating the command after spawning a run with FlowDeploy does not cancel the run; it keeps running until it finishes. You must cancel runs through the FlowDeploy app.

Use the FlowDeploy app to monitor, terminate, retry, and debug the run.
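
For a fully detached workflow, you can combine the --is-async flag with flowdeploy status (both documented below). A minimal sketch, assuming the run ID is shown when the run is spawned:

flowdeploy run nextflow nf-core/fetchngs --release 1.11.0 --outdir fd://shared/outdir --is-async
flowdeploy status YOUR_RUN_ID_HERE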

Background

Running a pipeline with FlowDeploy creates an automatically scaling cluster. Each time you use the FlowDeploy CLI to run a pipeline, it creates a new cluster. Once the pipeline is finished – or cancelled – the cluster scales down.

FlowDeploy clusters use a shared file system. To reference a file in the shared file system, add an fd:// prefix – e.g. fd://shared/file.txt.

You can access the shared file system directly, but sometimes it's easier to import and export files from somewhere else, like S3. The --input-file flag is used to import files, and the --export-location flag is used to export files.
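
For example, a run that imports the files listed in input.json and exports results to S3 might look like this (the bucket name is a placeholder):

flowdeploy run nextflow nf-core/fetchngs --release 1.11.0 --outdir fd://shared/outdir --input-file ./input.json --export-location s3://example-bucket/results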

FlowDeploy automatically checkpoints your run with the workflow manager, and resumes from the last checkpoint if a run fails. The workflow manager (e.g. Nextflow) determines what can be resumed – usually by checking input files and parameters. You can set the --run-location flag to demarcate runs.

The run location is also used as the starting point for any relative file paths used in the pipeline.

Raw arguments for the workflow manager are set with --cli-args.
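
Putting these together, a sketch of a resumable run – re-running the identical command with the same --run-location lets the workflow manager pick up from its last checkpoint:

flowdeploy run nextflow nf-core/fetchngs --release 1.11.0 --outdir fd://shared/outdir --run-location fd://shared/project_one

If that run fails partway through, re-running the same command resumes from the last checkpoint instead of starting over.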

Commands

flowdeploy set-key

Authorizes your environment by setting your FlowDeploy key in an environment configuration file.

flowdeploy_key (required)

Define your FlowDeploy key here.

Example usage

flowdeploy set-key YOUR_FLOWDEPLOY_KEY_HERE

--config, -c (optional)

Specify the path to the configuration file where the key should be set. It defaults to ~/.zshenv on macOS and ~/.bashrc on Linux.

Example usage
flowdeploy set-key YOUR_FLOWDEPLOY_KEY_HERE --config /path/to/config

flowdeploy run

Use flowdeploy run [parameters] to spawn a new pipeline.

workflow_manager (required)

The workflow manager to run the pipeline. Supports snakemake and nextflow.

Example usage
flowdeploy run snakemake [..]

pipeline (required)

The name of the pipeline.

Example usage
flowdeploy run nextflow nf-core/fetchngs [..]

--release, -r (required if --branch is not set)

The git release to use for execution of the pipeline. Either --release or --branch must be set.

Example usage
flowdeploy run [..] --release 1.10.0

--branch (required if --release is not set)

The git branch to use for execution of the pipeline. Either --release or --branch must be set.

Example usage
flowdeploy run [..] --branch main

--outdir, -o (required, nextflow only)

Set the output directory to store the results from the pipeline run. Note that the path must be a FlowDeploy file path.

Not applicable for Snakemake.

Example usage
flowdeploy run nextflow [..] --outdir fd://shared/outdir

--input-file, -i (optional)

The path to a JSON file that contains all inputs.

FlowDeploy accepts a list of input objects, which are transferred to the cluster file system and passed to the workflow manager.

If a file already exists on the FlowDeploy file system, you don't need to include it in the inputs – unless it's passed directly to the workflow manager with an argument. In other words, always include files that would normally be passed to the workflow manager on the command line. For example, you would always include "samplesheet.csv" as used in:

nextflow [...] --input samplesheet.csv

in the FlowDeploy inputs.

The input object has three attributes: source, destination, and arg.

source is transferred to destination. If arg is set, the file path is passed to the workflow manager with this argument (e.g. --input in the example above).

In a bit more detail:

  • source: where the input file is located (S3 or FlowDeploy)
  • destination: if the input file is remote, the absolute path to download to on the FlowDeploy file system
  • arg: optionally, set a workflow manager command line argument associated with the file
Example usage
flowdeploy run [..] --input-file ./input.json
Example input files
An S3 file that's passed as an argument
[{
  "source": "s3://broad-references/hg19/v0/Homo_sapiens_assembly19.fasta",
  "destination": "fd://shared/Homo_sapiens_assembly19.fasta",
  "arg": "--fasta"
}]

s3://broad-references/hg19/v0/Homo_sapiens_assembly19.fasta is transferred to /shared/Homo_sapiens_assembly19.fasta in the shared file system.

A FlowDeploy file that's passed as an argument
[{
  "source": "fd://shared/samplesheet.csv",
  "arg": "--input"
}]
Both of the above in one input file
[{
  "source": "s3://broad-references/hg19/v0/Homo_sapiens_assembly19.fasta",
  "destination": "fd://shared/Homo_sapiens_assembly19.fasta",
  "arg": "--fasta"
}, {
  "source": "fd://shared/samplesheet.csv",
  "arg": "--input"
}]

--cli-args (optional)

Use --cli-args to pass raw command line arguments to the workflow manager.

Avoid setting these, which produce undefined behavior:

  • pipeline names (use the pipeline argument)
  • profiles (use --profile)
  • input files (use --input-file)
  • output locations (use --outdir)
  • export locations (use --export-location)
  • working directory (use --run-location)
  • settings for logging, resumption, configuration files, or run naming (FlowDeploy handles these)

Anything else passed to --cli-args is forwarded as raw arguments to the workflow manager.

Example usage
flowdeploy run [..] --cli-args "--pseudo_aligner salmon"

--export-location (optional)

An S3 location to export results.

Example usage
flowdeploy run [..] --export-location s3://example/project_one

--export-location-source (optional, snakemake only)

Exports the contents of this path to the --export-location destination. Must be a shared file system path (e.g. fd://...).

The --export-location-source is only used for Snakemake runs with the --export-location destination set.

Example usage
flowdeploy run snakemake [..] --export-location s3://example/project_one --export-location-source fd://shared/snakemake/outputs

--profile (optional)

Workflow manager configuration profiles. To add multiple profiles, add --profile multiple times.

The FlowDeploy configuration profile is automatically added.

Example usage
flowdeploy run [..] --profile "docker" --profile "test"

--run-location (optional)

A FlowDeploy path to use as the working directory for the pipeline.

The run location is used for workflow manager caching. Re-running the same command with the same run-location allows the workflow manager to resume from cached results.

Defaults to fd://shared/work/${TASK_ID} for Nextflow, but is required for Snakemake.

Example usage
flowdeploy run [..] --run-location fd://shared/project_one

--snakemake-folder (optional)

Snakemake's working directory, relative to the run location. Defaults to the pipeline name.

Example usage
flowdeploy run snakemake [..] --run-location fd://shared/project_one --snakemake-folder snakemake_pipeline

The command in the example runs from /shared/project_one/snakemake_pipeline/.

--snakefile-location (optional)

Snakefile location, relative to the Snakemake folder name. Defaults to 'workflow/Snakefile'.

Example usage
flowdeploy run snakemake [..] --run-location fd://shared/project_one --snakemake-folder snakemake_pipeline --snakefile-location workflow/Snakefile

The command in the example uses the Snakefile at /shared/project_one/snakemake_pipeline/workflow/Snakefile.

--target (optional, snakemake only)

Snakemake targets. To add multiple targets, add --target multiple times.

Example usage
flowdeploy run snakemake [..] --target results/mapped/A.bam --target results/mapped/B.bam

--is-async (optional)

If set, exits immediately after spawning a FlowDeploy instance.

Default: False.

Example usage
flowdeploy run [..] --is-async

--flowdeploy-key (optional)

Provide a FlowDeploy API key to authenticate the run.

Alternatively, you can run flowdeploy set-key [..] to authenticate your environment.

Example usage
flowdeploy run [..] --flowdeploy-key YOUR_API_KEY_HERE

flowdeploy transfer

Use flowdeploy transfer [..] to spawn a new transfer run.

--input-file, -i (required)

The path to a JSON file that contains all transfers.

FlowDeploy accepts a list of transfer objects, which describe files and folders to transfer between the cluster file system and external sources.

The transfer object has three attributes: source, destination, and destination_name.

source is transferred to destination. If destination_name is set, that name is used as the file name at the destination. destination_name is required when transferring individual files.

In a bit more detail:

  • source: where the source file or folder is located (S3, HTTP(S), or FlowDeploy)
  • destination: the absolute path to the folder where the transfer should end up (S3 or FlowDeploy)
  • destination_name: the name of the destination file, if applicable
Example usage
flowdeploy transfer [..] --input-file ./transfers.json
Example input files
An individual S3 file
[{
  "source": "s3://broad-references/hg19/v0/Homo_sapiens_assembly19.fasta",
  "destination": "fd://shared/",
  "destination_name": "Homo_sapiens_assembly19.fasta"
}]

s3://broad-references/hg19/v0/Homo_sapiens_assembly19.fasta is transferred to /shared/Homo_sapiens_assembly19.fasta in the shared file system.

A FlowDeploy directory that's transferred to S3
[{
  "source": "fd://shared/outputs/",
  "destination": "s3://my-bucket/outputs/"
}]

The files inside /shared/outputs/ are transferred to s3://my-bucket/outputs/.

Both of the above in one input file
[{
  "source": "s3://broad-references/hg19/v0/Homo_sapiens_assembly19.fasta",
  "destination": "fd://shared/",
  "destination_name": "Homo_sapiens_assembly19.fasta"
}, {
  "source": "fd://shared/outputs/",
  "destination": "s3://my-bucket/outputs/"
}]

--is-async (optional)

If set, exits immediately after spawning a FlowDeploy instance.

Default: False.

Example usage
flowdeploy transfer [..] --is-async

--flowdeploy-key (optional)

Provide a FlowDeploy API key to authenticate the run.

Alternatively, you can run flowdeploy set-key [..] to authenticate your environment.

Example usage
flowdeploy transfer [..] --flowdeploy-key YOUR_API_KEY_HERE

flowdeploy status

Checks the state of a FlowDeploy pipeline or transfer run.

run_id (required)

The FlowDeploy ID for the run.

Example usage

flowdeploy status YOUR_RUN_ID_HERE

--flowdeploy-key (optional)

Provide a FlowDeploy API key to authenticate the run.

Alternatively, you can run flowdeploy set-key [..] to authenticate your environment.

Example usage

flowdeploy status YOUR_RUN_ID_HERE --flowdeploy-key YOUR_FLOWDEPLOY_KEY_HERE