Python package

How to use the Python package.

Installation

The flowdeploy Python package is hosted on PyPI. Install it with your favorite package manager, like pip or Poetry.

pip install flowdeploy

Basic usage

Examples for Nextflow, Snakemake, and file transfer, in that order:

flowdeploy.nextflow(
    pipeline="nf-core/fetchngs",
    release="1.10.0",
    outdir="fd://shared/outdir",
    profiles=["docker", "test"],
)

flowdeploy.snakemake(
    pipeline='trytoolchest/snakemake-testing',
    release='v0.1.0',
    targets=["results/mapped/A.bam"],
    run_location='fd://shared/',
    cli_args='--use-conda',
)

flowdeploy.transfer(
    transfers=[{
        "source": "s3://flowdeploy-public-demo/ids.csv",
        "destination": "fd://shared/project_one/",
        "destination_name": "ids.csv",
    }, {
        "source": "https://osf.io/f92qd/download",
        "destination": "fd://shared/project_one/",
        "destination_name": "test.txt",
    }],
)

Once a FlowDeploy instance is spawned, the Python package shows the updated status of the run until it finishes. If you terminate the Python process, the FlowDeploy instance continues running in the cloud, so there is no need to keep your computer running.

Info: Terminating the Python process after spawning a run with FlowDeploy does not cancel the run; it keeps running until it finishes. You must cancel runs through the FlowDeploy app.

Use the FlowDeploy app to monitor, terminate, retry, and debug the run.

Background

Running a pipeline with FlowDeploy creates an automatically scaling cluster. Each time the FlowDeploy Python package runs a pipeline, it creates a new cluster. Once the pipeline is finished – or cancelled – the cluster scales down.

FlowDeploy clusters use a shared file system. To reference a file in the shared file system from the Python package, use an fd:// prefix, e.g. fd://shared/file.txt.
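Judging from the examples on this page (fd://shared/project_one corresponds to /shared/project_one on the cluster), an fd:// reference appears to be the absolute shared-file-system path with its leading slash replaced by the prefix. A minimal sketch of that conversion follows; to_fd_uri and from_fd_uri are hypothetical helpers written for illustration, not part of the flowdeploy package.

```python
FD_PREFIX = "fd://"


def to_fd_uri(absolute_path: str) -> str:
    """Turn an absolute shared-file-system path into an fd:// reference."""
    if not absolute_path.startswith("/"):
        raise ValueError(f"expected an absolute path, got {absolute_path!r}")
    # Assumption: the prefix simply replaces the leading slash.
    return FD_PREFIX + absolute_path.lstrip("/")


def from_fd_uri(fd_uri: str) -> str:
    """Turn an fd:// reference back into an absolute path."""
    if not fd_uri.startswith(FD_PREFIX):
        raise ValueError(f"expected an fd:// reference, got {fd_uri!r}")
    return "/" + fd_uri[len(FD_PREFIX):]


print(to_fd_uri("/shared/file.txt"))
print(from_fd_uri("fd://shared/project_one"))
```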

You can access the shared file system, but sometimes it's easier to import and export files from somewhere else, like S3. The inputs parameter is used to import files, and the export_location parameter is used to export files.

FlowDeploy automatically checkpoints your run with the workflow manager, and resumes from the last checkpoint if a run fails. The workflow manager (e.g. Nextflow) determines what can be resumed – usually by checking input files and parameters. You can set the run_location parameter to demarcate runs.

The run_location is also used as the starting point for any relative file paths used in the pipeline.

Raw arguments for the workflow manager are set with cli_args.

Authentication

Create a key in the FlowDeploy app developer settings.

Use the set_key function, or set the FLOWDEPLOY_KEY environment variable.

Using a key string, a key file, and an environment variable, in that order:

flowdeploy.set_key("YOUR_KEY")

flowdeploy.set_key("~/your_key.txt")

export FLOWDEPLOY_KEY="YOUR_KEY"

Nextflow and Snakemake

Examples for Nextflow and Snakemake, in that order:

flowdeploy.nextflow(
    pipeline="nf-core/fetchngs",
    release="1.10.0",
    outdir="fd://shared/outdir",
    profiles=["docker", "test"],
)

flowdeploy.snakemake(
    pipeline='trytoolchest/snakemake-testing',
    release='v0.1.0',
    targets=["results/mapped/A.bam"],
    run_location='fd://shared/',
    cli_args='--use-conda',
)

`cli_args`

Use cli_args to pass raw command line arguments to the workflow manager.

Avoid using cli_args to set any of the following, which produces undefined behavior:

pipeline names (use pipeline)
profiles (use profiles)
input files (use inputs)
output locations (use outdir)
export locations (use export_location)
working directory (use run_location)
settings for logging, resumption, configuration files, or run naming (FlowDeploy handles these)

Anything in cli_args is passed to the workflow manager unchanged.

Example usage

flowdeploy.nextflow(
    pipeline="nf-core/rnaseq",
    cli_args="--pseudo_aligner salmon",
    ...
)

translates to:

nextflow run nf-core/rnaseq --pseudo_aligner salmon [..]
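Because the settings listed above produce undefined behavior when set through cli_args, it can help to scan the string before submitting a run. The sketch below is one possible check for Nextflow; the flag table is an assumed, non-exhaustive mapping chosen for illustration, and reserved_flags_in is a hypothetical helper rather than anything the flowdeploy package provides.

```python
import shlex

# Assumed, non-exhaustive table of Nextflow flags that overlap with settings
# FlowDeploy manages. The value names the FlowDeploy parameter to use instead
# (None means FlowDeploy handles the setting itself).
RESERVED_FLAGS = {
    "-profile": "profiles",
    "--outdir": "outdir",
    "-resume": None,
    "-c": None,
    "-name": None,
}


def reserved_flags_in(cli_args: str) -> list[str]:
    """Return the reserved flags found in cli_args, in order of appearance."""
    found = []
    for token in shlex.split(cli_args):
        flag = token.split("=", 1)[0]  # also catches the --outdir=path form
        if flag in RESERVED_FLAGS and flag not in found:
            found.append(flag)
    return found


print(reserved_flags_in("--pseudo_aligner salmon"))
print(reserved_flags_in("-profile docker --outdir=results -resume"))
```

An empty list means none of the flags in the table were found; it does not guarantee that the arguments are otherwise safe.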

`export_location` (recommended)

An S3 location to export results.

Example usage

flowdeploy.nextflow(
    export_location="s3://example/project_one",
    ...
)

`export_location_source` (Snakemake only, recommended)

Exports the contents of this path to the export_location destination. Must be a shared file system path (e.g. fd://...).

The export_location_source is only used for Snakemake runs with the export_location destination set.

Example usage

flowdeploy.snakemake(
    export_location="s3://example/project_one",
    export_location_source="fd://shared/project_one/outputs",
    ...
)
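The two export parameters have different path requirements: export_location is an S3 location, while export_location_source must be a shared file system path and only matters when export_location is set. A small sketch of a pre-flight check under those rules; export_problems is a hypothetical helper, and the wording of the messages is illustrative.

```python
def export_problems(export_location=None, export_location_source=None) -> list[str]:
    """Collect problems with the export settings; an empty list means none found."""
    problems = []
    if export_location is not None and not export_location.startswith("s3://"):
        problems.append("export_location should be an S3 location (s3://...)")
    if export_location_source is not None:
        if not export_location_source.startswith("fd://"):
            problems.append("export_location_source must be an fd:// path")
        if export_location is None:
            problems.append("export_location_source has no effect without export_location")
    return problems


print(export_problems("s3://example/project_one", "fd://shared/project_one/outputs"))
print(export_problems(None, "/tmp/outputs"))
```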

`inputs`

FlowDeploy accepts a list of input objects, which are transferred to the cluster file system and passed to the workflow manager.

If a file already exists on the FlowDeploy file system, you don't need to include it in inputs, unless it's passed directly to the workflow manager with an argument. In other words, always include files that would normally be passed to the workflow manager on the command line. For example, you would always include "samplesheet.csv" in the FlowDeploy inputs when it is used as in:

nextflow [...] --input samplesheet.csv

The input object has three attributes: source, destination, and arg.

source is transferred to destination. If arg is set, the file path is passed to the workflow manager with this argument (e.g. --input in the example above).

In a bit more detail:

source: where the input file is located (S3 or FlowDeploy)
destination: if the input file is remote, the absolute path to download to on the FlowDeploy file system
arg: optionally, set a workflow manager command line argument associated with the file

Example usage

flowdeploy.nextflow(
    inputs=inputs,
    ...
)

Example input objects

An S3 file that's passed as an argument

inputs = [{
  "source": "s3://broad-references/hg19/v0/Homo_sapiens_assembly19.fasta",
  "destination": "fd://flowdeploy-shared/Homo_sapiens_assembly19.fasta",
  "arg": "--fasta"
}]

s3://broad-references/hg19/v0/Homo_sapiens_assembly19.fasta is transferred to /flowdeploy-shared/Homo_sapiens_assembly19.fasta in the shared file system.

A FlowDeploy file that's passed as an argument

inputs = [{
    "source": "fd://flowdeploy-shared/samplesheet.csv",
    "arg": "--input"
}]

Both of the above in one inputs list

inputs = [{
  "source": "s3://broad-references/hg19/v0/Homo_sapiens_assembly19.fasta",
  "destination": "fd://flowdeploy-shared/Homo_sapiens_assembly19.fasta",
  "arg": "--fasta",
}, {
  "source": "fd://flowdeploy-shared/samplesheet.csv",
  "arg": "--input",
}]
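The rules for input objects described above (a source on S3 or FlowDeploy, a destination when the source is remote, and an optional arg) can be checked before a run is submitted. The following is a minimal sketch under those rules; input_problems is a hypothetical helper, and requiring the destination to use the fd:// prefix is an assumption based on the examples here.

```python
ALLOWED_KEYS = {"source", "destination", "arg"}


def input_problems(entry: dict) -> list[str]:
    """Collect problems with one input object; an empty list means none found."""
    problems = []
    source = entry.get("source", "")
    destination = entry.get("destination")

    if not source.startswith(("s3://", "fd://")):
        problems.append("source must start with s3:// or fd://")
    if source.startswith("s3://"):
        if not destination:
            problems.append("a remote source needs a destination")
        elif not destination.startswith("fd://"):
            # Assumption: destinations are written with the fd:// prefix.
            problems.append("destination must be an fd:// path")

    unknown = sorted(set(entry) - ALLOWED_KEYS)
    if unknown:
        problems.append("unknown attributes: " + ", ".join(unknown))
    return problems


inputs = [{
    "source": "s3://broad-references/hg19/v0/Homo_sapiens_assembly19.fasta",
    "destination": "fd://flowdeploy-shared/Homo_sapiens_assembly19.fasta",
    "arg": "--fasta",
}, {
    "source": "fd://flowdeploy-shared/samplesheet.csv",
    "arg": "--input",
}]
for entry in inputs:
    print(entry["source"], input_problems(entry))
```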

`is_async`

If true, exits immediately after spawning a FlowDeploy instance. The run will continue, and can be monitored in the FlowDeploy app.

Default: False.

Example usage

flowdeploy.nextflow(
    is_async=True,
    ...
)

`pipeline` (required)

The name of the pipeline.

Example usage

flowdeploy.nextflow(
    pipeline="nf-core/fetchngs",
    ...
)

`release` (required if `branch` is not set)

The git release to use for execution of the pipeline. Either release or branch must be set.

Example usage

flowdeploy.nextflow(
    release="1.1.0",
    ...
)

`branch` (required if `release` is not set)

The git branch to use for execution of the pipeline. Either release or branch must be set.

Example usage

flowdeploy.snakemake(
    branch="main",
    ...
)

`profiles`

Workflow manager configuration profiles.

The FlowDeploy configuration profile is automatically added.

Example usage

flowdeploy.nextflow(
    profiles=["docker"],
    ...
)

`run_location` (recommended)

A FlowDeploy path to use as the working directory for the pipeline.

The run location is used for workflow manager caching. Re-running the same command with the same run_location allows the workflow manager to resume from cached results.

Defaults to /work/${TASK_ID}.

Example usage

flowdeploy.nextflow(
    run_location="fd://shared/project_one",
    ...
)

translates to:

cd /shared/project_one && nextflow run [..] -resume

`outdir` (Nextflow only, required)

Where to place pipeline outputs. Must be a FlowDeploy file path.

Example usage

flowdeploy.nextflow(
    outdir="fd://shared/project_one",
    ...
)
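This reference marks pipeline and outdir as required for Nextflow runs, along with one of release or branch. As a rough sketch, those markers can be checked on a dictionary of keyword arguments before the call is made; missing_nextflow_params is a hypothetical helper, not part of the flowdeploy package.

```python
def missing_nextflow_params(kwargs: dict) -> list[str]:
    """List the required Nextflow parameters that are absent or empty."""
    missing = [name for name in ("pipeline", "outdir") if not kwargs.get(name)]
    if not kwargs.get("release") and not kwargs.get("branch"):
        missing.append("release or branch")
    return missing


run_kwargs = {
    "pipeline": "nf-core/fetchngs",
    "release": "1.10.0",
    "outdir": "fd://shared/outdir",
    "profiles": ["docker", "test"],
}
print(missing_nextflow_params(run_kwargs))
print(missing_nextflow_params({"pipeline": "nf-core/fetchngs"}))
```

When the list is empty, the same dictionary can be unpacked into the call as flowdeploy.nextflow(**run_kwargs).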

`snakemake_folder` (Snakemake only, recommended)

The name of the folder in which Snakemake will run.

FlowDeploy creates this folder if it does not exist, and clones the pipeline into this directory. The run location for Snakemake is computed as run_location joined with snakemake_folder.

Example usage

flowdeploy.snakemake(
    run_location="fd://shared/",
    snakemake_folder="project_one",
    ...
)

translates to:

<Create "/shared/project_one" and set up pipeline> && snakemake -d /shared/project_one [...]

`snakefile_location` (Snakemake only, recommended)

The path to the Snakefile, relative to the snakemake_folder. Defaults to workflow/Snakefile.

If your Snakefile is at the base of your project, set snakefile_location="Snakefile".

Example usage

flowdeploy.snakemake(
    run_location="fd://shared/",
    snakemake_folder="project_one",
    snakefile_location="pipeline_a.snakefile",
    ...
)

translates to:

[...] cd project_one && snakemake -s "pipeline_a.snakefile" [...]
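To see which directory and Snakefile a given combination of run_location, snakemake_folder, and snakefile_location points at, the paths can be worked out with a plain POSIX join. This is only a sketch consistent with the examples above; how FlowDeploy joins the paths internally is not shown here, and snakemake_paths is a hypothetical helper.

```python
import posixpath


def snakemake_paths(run_location: str, snakemake_folder: str,
                    snakefile_location: str = "workflow/Snakefile") -> tuple[str, str]:
    """Return (working directory, Snakefile path) on the shared file system."""
    if not run_location.startswith("fd://"):
        raise ValueError("run_location must be an fd:// path")
    base = "/" + run_location[len("fd://"):]
    # run_location joined with snakemake_folder, as described above.
    workdir = posixpath.normpath(posixpath.join(base, snakemake_folder))
    # snakefile_location is relative to the snakemake_folder.
    snakefile = posixpath.join(workdir, snakefile_location)
    return workdir, snakefile


print(snakemake_paths("fd://shared/", "project_one"))
print(snakemake_paths("fd://shared/", "project_one", "pipeline_a.snakefile"))
```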

`targets`

Snakemake targets, as a list.

Example usage

flowdeploy.snakemake(
    targets=["results/mapped/A.bam", "results/mapped/B.bam"],
    ...
)

Transfer

`transfers` (required)

A list of the files or folders to transfer, with each entry as a dictionary containing:

source (required): where the file is currently located (s3://, fd://, or https://)
destination (required): where to transfer the file (s3:// or fd://)
destination_name (required for files): the name of the destination file, if applicable

Example usage

flowdeploy.transfer(
    transfers=[{
        "source": "s3://flowdeploy-public-demo/ids.csv",
        "destination": "fd://shared/project_one/",
        "destination_name": "ids.csv",
    }, {
        "source": "https://osf.io/f92qd/download",
        "destination": "fd://shared/project_one/",
        "destination_name": "test.txt",
    }],
)
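The constraints on transfer entries listed above (allowed schemes for source and destination, and destination_name for files) lend themselves to a quick check before calling flowdeploy.transfer. A minimal sketch, assuming the caller knows whether each entry is a file or a folder; transfer_problems and its is_file argument are hypothetical.

```python
SOURCE_SCHEMES = ("s3://", "fd://", "https://")
DESTINATION_SCHEMES = ("s3://", "fd://")


def transfer_problems(entry: dict, is_file: bool = True) -> list[str]:
    """Collect problems with one transfer entry; an empty list means none found."""
    problems = []
    if not entry.get("source", "").startswith(SOURCE_SCHEMES):
        problems.append("source must start with s3://, fd://, or https://")
    if not entry.get("destination", "").startswith(DESTINATION_SCHEMES):
        problems.append("destination must start with s3:// or fd://")
    if is_file and not entry.get("destination_name"):
        problems.append("destination_name is required for files")
    return problems


transfers = [{
    "source": "s3://flowdeploy-public-demo/ids.csv",
    "destination": "fd://shared/project_one/",
    "destination_name": "ids.csv",
}, {
    "source": "https://osf.io/f92qd/download",
    "destination": "fd://shared/project_one/",
    "destination_name": "test.txt",
}]
for entry in transfers:
    print(entry["source"], transfer_problems(entry))
```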

`is_async`

If true, exits immediately after spawning. The transfer will continue, and can be monitored in the FlowDeploy app.

Default: False.

Example usage

flowdeploy.transfer(
    is_async=True,
    ...
)

Restrictions

Running more than one run concurrently with the same run_location has undefined behavior.
Each pipeline is limited to 20 concurrent subtasks (can be increased on request).
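When several runs are launched together (for example with is_async=True), the first restriction can be guarded against by checking the batch for repeated run_location values before submitting anything. The sketch below assumes each planned run is described by a dictionary of keyword arguments and treats paths that differ only by a trailing slash as the same location; both are assumptions, and duplicated_run_locations is a hypothetical helper.

```python
from collections import Counter


def duplicated_run_locations(runs: list[dict]) -> list[str]:
    """Return run_location values that appear more than once in the batch."""
    counts = Counter(
        run["run_location"].rstrip("/")  # assumption: a trailing slash is not significant
        for run in runs
        if run.get("run_location")
    )
    return sorted(location for location, count in counts.items() if count > 1)


planned_runs = [
    {"pipeline": "nf-core/fetchngs", "run_location": "fd://shared/project_one"},
    {"pipeline": "nf-core/rnaseq", "run_location": "fd://shared/project_one/"},
    {"pipeline": "nf-core/rnaseq", "run_location": "fd://shared/project_two"},
]
print(duplicated_run_locations(planned_runs))
```

This only covers runs planned in the same script; runs started elsewhere against the same run_location would not be detected.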
