Importing a Nextflow Pipeline into Quark

This guide provides step-by-step instructions for onboarding a Nextflow pipeline.

It is designed to help bioinformaticians configure the technical architecture and expose a user-friendly interface on Quark for the rest of the team.


Overview

Quark transforms command-line Nextflow pipelines into accessible applications by mapping your Git repository to Quark.

This walkthrough uses nf-core/bactmap as a reference, though the flow applies to any Nextflow-based repository.

Before You Start

Ensure you have the following components ready:

  • Repository Access: HTTPS URL for public repos; SSH URL for private repos.
  • Revision: The specific branch, tag, commit SHA, or HEAD you want to deploy.
  • Entry Point: The primary execution file (usually main.nf).
  • Reference Datasets: Any static data (e.g., GRCh38, adapter lists) must already exist in Quark.
  • User Inputs: A list of variables from your nextflow.config that should be exposed as UI fields.

For private repos, Quark must have clone access. If an import fails at the clone stage, verify that your Git credentials or SSH keys are correctly configured within Quark.
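As a concrete sketch, the User Inputs you list typically correspond to entries in your `nextflow.config` like these (parameter names below are illustrative, not taken from nf-core/bactmap):

```groovy
// Illustrative nextflow.config fragment; each entry below could be
// exposed as a UI field during import. Names are hypothetical.
params {
    input         = null   // sample sheet (CSV): File parameter
    reference     = null   // reference FASTA: File parameter
    min_depth     = 10     // Integer parameter
    skip_trimming = false  // Boolean toggle
}
```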


Step-by-Step Instructions

Start the Import

  1. Navigate to the My Pipelines tab on your dashboard.
  2. Click the Import Pipeline button in the top-right corner.
  3. In the Select Pipeline Type window, choose Nextflow and click Continue.


Step 1: General Pipeline Details

Define how the pipeline is identified and categorized for your team.

  • Name: Provide a short, unique name for the pipeline.
  • Summary: Provide a one-sentence description of the biological objective.

  • Badge: Defaults to "Nextflow."
  • About (Optional): Upload a Markdown file for a longer overview.
    • Bioinformatician Tip: Use this file to list specific container versions, tool citations, and expected output structures, helping wet-lab users interpret their data.

  • Category: Choose the appropriate match (e.g., Genomics, Transcriptomics).

  • Tags: Add metadata tags used to filter pipelines. Provide at least one key/value pair (e.g., pipeline: nf-core/bactmap, organism: bacteria).

  • Review and click Next.

Step 2: Pipeline Source

Configure where Quark clones the code from.

  • Repository: A short label for this specific import (e.g., bactmap-v1).
  • Source: Defaults to Git.
  • Repository URL:
    • Public: https://github.com/my-org/my-repo.git
    • Private: git@github.com:my-org/my-repo.git

  • Revision: Set to HEAD, a branch name, a tag (e.g., 1.0.0), or a specific commit SHA.
  • Entry Point: The file Nextflow runs first (commonly main.nf).

  • Nextflow Version: Select the version required by your pipeline. Matching your local development version is recommended for consistency. Ensure the selected version is compatible with your pipeline's syntax (e.g., ensuring DSL2 support if using recent nf-core templates).

  • Review and click Next.
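For reference, the Entry Point is simply the script Nextflow executes first. A minimal DSL2 `main.nf` sketch (the module and process names are placeholders, not part of any real pipeline):

```groovy
// Illustrative DSL2 entry point; ALIGN is a hypothetical process.
nextflow.enable.dsl = 2

include { ALIGN } from './modules/align'   // hypothetical module path

workflow {
    reads_ch = Channel.fromPath(params.input)
    ALIGN(reads_ch)
}
```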

Step 3: Mount Required Datasets

Mounting is used for stable, pre-existing datasets that the pipeline requires for every run. You can connect either object-store (S3/Azure/GCP) or filesystem (NFS/Lustre) datasets to your pipeline.

  • Click Add New Dataset Mount.

  • Search for and select the required dataset (e.g., refdata-grch38, or an object-store dataset such as refdata-1).

  • Repeat for all necessary files by clicking Add New Dataset Mount.
  • If you select a filesystem dataset, you will be prompted to add one or more directories of the dataset as mount paths. Ensure these paths match the hard-coded paths in your Nextflow scripts if you aren't using parameters for reference files.

  • Once all datasets required for the pipeline are mounted, click Next.
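If your scripts hard-code reference paths rather than exposing them as parameters, those paths must match the dataset's mount point. A hypothetical sketch, assuming refdata-1 is mounted at `/mnt/refdata-1`:

```groovy
// Hypothetical: this path must match the mount point configured in Quark.
params.reference = '/mnt/refdata-1/GRCh38/genome.fasta'
```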

Step 4: Define Pipeline Parameters

Map your Nextflow parameters to UI elements: for every params.<name> in your code, create a corresponding field. This creates the form users will fill out at runtime.

For each parameter, define:

  • Name: Must match the variable name in your Nextflow script.
  • Type: String, Integer, Float, Boolean, or File.

| Type | Technical Behavior | Use Case |
| --- | --- | --- |
| Boolean | Maps to a binary flag. In Nextflow, "On" passes `--param true` and "Off" passes `--param false`. | `--skip_trimming`, `--save_intermediates` |
| String | Passes a text string. | Specific genome IDs or sample names |
| Integer/Float | Passes numeric values. Quark validates that the input is a number before launching. | `--max_cpus`, `--min_depth`, `--threshold` |
| File | Handles the staging of data into the Nextflow work directory. | FASTQ files, sample sheets (CSV) |

  • Help Text: Write a clear prompt to guide the user (e.g., "Enter the minimum depth for variant calling").
  • Check Optional Field, or Hide Field as required for the parameter.
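The mapping from UI values to command-line flags described above can be sketched as follows. This is an illustration of the behavior, not Quark's actual implementation, and `to_cli_args` is a hypothetical helper:

```python
def to_cli_args(params: dict) -> list[str]:
    """Illustrative mapping of UI parameter values to Nextflow CLI flags."""
    args = []
    for name, value in params.items():
        if isinstance(value, bool):
            # Booleans map to a binary flag: --param true / --param false
            args += [f"--{name}", str(value).lower()]
        else:
            # Strings, numbers, and file paths are passed through as text
            args += [f"--{name}", str(value)]
    return args

print(to_cli_args({"skip_trimming": True, "min_depth": 10}))
# -> ['--skip_trimming', 'true', '--min_depth', '10']
```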

File Parameters: Upload vs. Mounted Data

If Type = File, choose how the user provides data:

  • Browse: The user uploads a file from their local machine. Set Supported File Types (e.g., .csv or .fastq.gz) for validation.

  • Directory Only: The user selects from a dataset already mounted to the pipeline in Quark.

Scalar Parameters: Input vs. Dropdown

For parameters with String, Float, or Integer Type, you will be prompted to specify a Field Type. Choose based on whether users should type their own value (Input) or select from a list of allowed values that you specify (Dropdown).

  • Input: The user types a value manually.
  • Dropdown: The user chooses from a restricted list of allowed values (prevents typos). For example, if a pipeline only supports BWA or Bowtie2, a dropdown prevents the user from entering an unsupported aligner.

  • Choose Boolean to create a simple toggle (e.g., --skip_trimming).
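The restriction a Dropdown enforces in the UI can also be guarded inside the pipeline itself. A hedged Nextflow sketch, assuming a hypothetical `aligner` parameter:

```groovy
// Illustrative guard in main.nf: fail fast on unsupported aligner values.
def allowed = ['bwa', 'bowtie2']
if (!allowed.contains(params.aligner)) {
    error "Unsupported aligner '${params.aligner}'. Allowed: ${allowed.join(', ')}"
}
```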

Example parameter inputs for nf-core/bactmap

  • For the nf-core/bactmap example, your first parameter may be:

    • Name: Input CSV
    • Type: File

  • Select Add New Parameter to add a reference parameter.

    • Name: Reference
    • Type: File
    • Browse or Directory Only: Directory Only
    • Dataset: Name of your attached dataset (e.g. refdata-1)
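With the two parameters above, the launch command Quark assembles would look roughly like this (file paths are illustrative):

```
nextflow run main.nf --input samples.csv --reference /mnt/refdata-1/reference.fasta
```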

Conditional Parameters (Optional)

Configure fields to appear only when a specific condition is met. (Use one parameter to enable/disable another parameter.)

  • Example: Only show the "Phred Score" input if the "Quality Trimming" (Boolean) toggle is enabled. Or, for instance, if a user selects Type: Paired-end, you can trigger a second File input field for Read 2.

  • Review all parameters and click Next.

Step 5: Advanced Validation (Env + Args)

Use this section to define how Quark validates the inputs before the run starts.

  • Env: Define environment variables required for validation. Set keys like NXF_DEBUG or specific API tokens required during the initialization phase.
  • Args: Map UI parameters to the argument names your validation logic expects.
  • Example: Map Input CSV (UI) → input (Nextflow).

  • Nextflow Config: Optionally paste a nextflow.config fragment that acts as a `-c` override. For example, set resource profiles (e.g., process.executor = 'awsbatch') or hard-code parameters you don't want the wet-lab team to change.
  • Review and click Next.
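For instance, an override fragment pasted into Nextflow Config might look like this (the values are illustrative):

```groovy
// Illustrative -c override: pin the executor and a parameter that
// wet-lab users should not change.
process.executor = 'awsbatch'
params.min_depth = 10
```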

Step 6: Visualization App

Attach a viewer so that wet-lab scientists can interpret results directly in Quark.

  • Click Add New Visualization App.

  • Choose the App Name (e.g., IGV for genomic alignment browsing).

  • Set a Display Name, which will become a tab on the results page or the "Vizapp" dashboard in Quark.


Step 7: Review and Submit

Check your settings for accuracy. Once satisfied, click Submit.

Your pipeline will now appear under My Pipelines.

Next Step: Version and Publish your pipeline.

Troubleshooting

| Problem | Cause | Resolution |
| --- | --- | --- |
| Import fails at "Clone" | Authentication | Check the URL format. For private repos, ensure Quark's SSH key is added to your Git provider. |
| Pipeline fails immediately | Entry point / version | Confirm `main.nf` is in the repository root and that the selected Nextflow version is compatible with your code. |
| Input data not found | Pathing / mounts | Ensure "Directory Only" parameters point to the correct mounted dataset. |
| Parameters missing in UI | UI configuration | Check whether "Hide Field" is toggled or a "Conditional Parameter" rule is hiding the field. |
| Validation failure | Arg mapping | Ensure the Args names exactly match the `params` names in your `nextflow.config`. |