Importing a WDL Pipeline into Quark

This guide provides step-by-step instructions for onboarding a WDL (Workflow Description Language) pipeline. By mapping your WDL source code to Quark, you transform complex command-line scripts into reproducible, accessible applications.

Overview

Quark supports WDL workflows by wrapping the underlying execution engine (typically Cromwell). This allows you to manage tasks across different cloud or on-prem environments while providing a simplified front-end for researchers.

Before You Start

Ensure the following technical components are ready:

  • Repository Access:
      • Public: Use HTTPS URLs.
      • Private: Use SSH URLs. Ensure Quark’s public SSH key is added to your Git provider to authorize cloning.
  • Revision Management: For production-grade reproducibility, pin the pipeline to a Commit SHA or a Tag (e.g., v2.1.0). Avoid HEAD and moving branches like main, because the code they point to changes over time.
  • Workflow Entry Point: The relative path to the primary .wdl file (e.g., workflows/germline/main.wdl).
  • Input Schema: A list of the input variables defined in your workflow block. Having an example inputs.json file on hand is highly recommended for reference.
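As a rough pre-flight check for the revision guidance above, the sketch below sorts a revision string into "commit", "tag", "moving", or "unknown". The patterns and the list of moving refs are assumptions chosen for illustration (SHA-1 style hashes, semantic-version tags); they are not rules that Quark enforces, and a string check cannot prove that a ref is actually immutable in your repository.

```python
import re

# Heuristic patterns; assumptions for illustration, not Quark rules.
SHA_RE = re.compile(r"^[0-9a-f]{7,40}$")      # abbreviated or full SHA-1
TAG_RE = re.compile(r"^v?\d+\.\d+\.\d+$")     # e.g. v2.1.0 or 2.1.0
MOVING_REFS = {"HEAD", "main", "master", "develop"}


def classify_revision(revision: str) -> str:
    """Return 'commit', 'tag', 'moving', or 'unknown' for a Git revision string."""
    rev = revision.strip()
    if rev in MOVING_REFS:
        return "moving"
    if SHA_RE.match(rev.lower()):
        return "commit"
    if TAG_RE.match(rev):
        return "tag"
    return "unknown"


if __name__ == "__main__":
    for rev in ("v2.1.0", "main", "3f2a9c1", "feature/trim"):
        print(rev, "->", classify_revision(rev))
```

Anything reported as "moving" or "unknown" deserves a second look before it is used for a production pipeline.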

Note on Naming: WDL is strictly namespaced. Your input names in Quark must match the exact string expected by the runner: WorkflowName.Variable for a workflow-level input, or WorkflowName.TaskName.Variable for a task-level input.
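To make the naming rule concrete, here is a minimal sketch that builds qualified input names and flags keys in an example inputs.json that lack the workflow prefix. The workflow, task, and variable names (Germline, Align, threads) are hypothetical placeholders.

```python
from typing import Optional


def qualified_name(workflow: str, variable: str, task: Optional[str] = None) -> str:
    """Join the parts of an input name with dots, skipping the task when absent."""
    parts = [workflow, task, variable] if task else [workflow, variable]
    return ".".join(parts)


def unqualified_keys(inputs: dict, workflow: str) -> list:
    """Return the keys that do not start with '<workflow>.'."""
    prefix = workflow + "."
    return sorted(key for key in inputs if not key.startswith(prefix))


if __name__ == "__main__":
    print(qualified_name("Germline", "input_bam"))         # workflow-level input
    print(qualified_name("Germline", "threads", "Align"))  # task-level input
    example = {"Germline.input_bam": "a.bam", "input_bam": "b.bam"}
    print(unqualified_keys(example, "Germline"))
```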


Step-by-Step Instructions

Start the Import

  1. Navigate to the My Pipelines tab.
  2. Click Import Pipeline in the top-right corner.
  3. In the Select Pipeline Type window, choose WDL and click Continue.

Step 1: General Pipeline Details

This metadata helps organize your team's workspace and documents the underlying science.

  • Name & Summary: These are searchable within Quark. Use clear, descriptive titles (e.g., "GATK HaplotypeCaller - Germline").
  • About (Optional): Supports Markdown.
      • Bioinformatician Tip: Use this to document the WDL version (e.g., 1.0 or development), tool versions, and citations for algorithms used in the tasks.
  • Tags: Add key/value pairs to facilitate pipeline discovery (e.g., sequencing: Illumina, species: Human).

Step 2: Pipeline Source (Git)

Configure the connection between Quark and your WDL source code.

  • Repository URL: Ensure the URL format matches your access level (HTTPS vs. SSH).
  • Workflow Entry Point:
      • Must not start with a /
      • Must end with .wdl
      • If your workflow imports other WDL files using relative paths, ensure the repository structure is maintained during the clone.
  • Execution Engine: Select the runner (e.g., Cromwell) and the specific version.
      • Note: Cromwell versions differ in which WDL versions, language features, and runtime attributes they support.
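The two entry-point rules above are easy to check before you fill in the form. The helper below is an illustrative sketch only; its name and error messages are made up here, and Quark's own validation may differ.

```python
def entry_point_errors(path: str) -> list:
    """Return a list of rule violations for a workflow entry point path."""
    errors = []
    if path.startswith("/"):
        errors.append("must not start with '/'")
    if not path.endswith(".wdl"):
        errors.append("must end with '.wdl'")
    return errors


if __name__ == "__main__":
    for candidate in ("workflows/germline/main.wdl", "/main.wdl", "main.txt"):
        print(candidate, "->", entry_point_errors(candidate) or "ok")
```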

Step 3: Parameters (WDL Inputs)

This is the most critical step for bioinformaticians. You are defining the JSON input mapping that Quark will pass to the WDL engine.

For each parameter, define:

  • Name: This must match the input name as the engine expects it in the inputs JSON. If your WDL workflow declares input { File input_bam }, the name is input_bam, or the fully qualified WorkflowName.input_bam when the engine requires the workflow prefix (Cromwell does).
  • Type:
      • String/Integer/Float: Standard scalar types (WDL itself names the integer type Int).
      • Boolean: Creates a UI toggle. In WDL, this maps to the Boolean type, allowing you to control conditional execution blocks (e.g., if (run_stats) { call samtools_stats }).
      • File: Handles data staging.
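The following sketch shows one way the name and type of each parameter could turn into an inputs JSON document. It assumes a Cromwell-style prefix (WorkflowName.variable) and that file inputs are passed as path or URI strings; the function names, workflow name, and bucket path are hypothetical, and this is not a description of Quark's internal implementation.

```python
import json


def coerce(type_name, value):
    """Convert a form value to the JSON value for the given parameter type."""
    if type_name == "Boolean":
        # bool("false") is True in Python, so insist on a real bool here.
        if not isinstance(value, bool):
            raise TypeError(f"expected a bool, got {value!r}")
        return value
    if type_name == "Integer":
        return int(value)
    if type_name == "Float":
        return float(value)
    if type_name in ("String", "File"):
        return str(value)
    raise ValueError(f"unknown parameter type: {type_name}")


def build_inputs(workflow, params):
    """params is a list of (name, type_name, value) tuples."""
    return {f"{workflow}.{name}": coerce(type_name, value)
            for name, type_name, value in params}


if __name__ == "__main__":
    inputs = build_inputs("Germline", [
        ("input_bam", "File", "s3://example-bucket/sample.bam"),
        ("threads", "Integer", "4"),
        ("run_stats", "Boolean", True),
    ])
    print(json.dumps(inputs, indent=2, sort_keys=True))
```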

File Inputs: Upload vs. Browse

Choose how the user provides data to the workflow:

  • Upload: Best for small metadata files or individual samples. Quark will handle the localization of these files to the execution environment.
  • Browse: Use this for high-throughput data.
      • Directory Only: Select this if your WDL task expects a Directory type (not available in WDL 1.0 or 1.1; check which WDL version your engine requires for it) or a String path to a folder.

Scalar Inputs: Input vs. Dropdown

  • Dropdown: Highly recommended for "guardrailing" parameters like genome_build (e.g., GRCh37, GRCh38). This prevents users from entering strings that your WDL tasks aren't configured to handle.
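The guardrail a dropdown provides is equivalent to checking a value against a fixed set of choices. A minimal sketch, using the genome builds from the example above and a hypothetical helper name:

```python
# Allowed values, taken from the genome_build example above.
GENOME_BUILDS = ("GRCh37", "GRCh38")


def check_choice(value, allowed=GENOME_BUILDS):
    """Return the value unchanged if it is an allowed choice, else raise."""
    if value not in allowed:
        raise ValueError(f"{value!r} is not one of {', '.join(allowed)}")
    return value


if __name__ == "__main__":
    print(check_choice("GRCh38"))
    try:
        check_choice("hg19")
    except ValueError as err:
        print("rejected:", err)
```

With a free-text field, a value such as hg19 would reach the WDL task unchecked; with a dropdown it can never be submitted.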

Boolean Inputs

Configure the display labels for your toggle. In the backend, Quark will pass true or false to the WDL input JSON.
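As an illustration of how display labels relate to the JSON value, the sketch below maps a selected label to a Python bool and serializes it. The labels "Yes" and "No", the workflow name, and the helper are assumptions for this example.

```python
import json

# Hypothetical display labels for a toggle and the values they stand for.
LABELS = {"Yes": True, "No": False}


def toggle_to_json(workflow, name, label):
    """Serialize the toggle state as a one-entry inputs JSON document."""
    return json.dumps({f"{workflow}.{name}": LABELS[label]})


if __name__ == "__main__":
    print(toggle_to_json("Germline", "run_stats", "Yes"))
```

The serialized value is the JSON literal true or false, not a quoted string, which is what a WDL Boolean input expects.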

Conditional Inputs (Optional)

Hide or show technical parameters based on user selection.

  • Example: Only show "Advanced Trimming Options" if the "Enable Quality Control" toggle is set to True.
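Conditional display can be thought of as a small rule attached to each parameter. The sketch below is one possible model, not Quark's data format: the show_if field and the parameter names are hypothetical.

```python
def visible_parameters(params, values):
    """Return the names of parameters whose display condition is satisfied.

    params: list of dicts with 'name' and an optional 'show_if' entry of the
            form (controlling_parameter, required_value).
    values: current form values keyed by parameter name.
    """
    visible = []
    for param in params:
        condition = param.get("show_if")
        if condition is None or values.get(condition[0]) == condition[1]:
            visible.append(param["name"])
    return visible


if __name__ == "__main__":
    params = [
        {"name": "enable_qc"},
        {"name": "advanced_trimming", "show_if": ("enable_qc", True)},
    ]
    print(visible_parameters(params, {"enable_qc": True}))
    print(visible_parameters(params, {"enable_qc": False}))
```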

Step 4: Review and Submit

Review the input mapping. Errors here usually result in "Workflow Submission Failures" due to JSON schema mismatches.
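One way to catch such mismatches before submitting is to compare the parameter names you entered against the keys of a known-good example inputs.json. The helper below is a sketch with a hypothetical name and sample data.

```python
def diff_names(parameter_names, example_inputs):
    """Compare declared parameter names with the keys of an example inputs dict."""
    declared = set(parameter_names)
    expected = set(example_inputs)
    return {
        "missing": sorted(expected - declared),      # in inputs.json, not declared
        "unexpected": sorted(declared - expected),   # declared, not in inputs.json
    }


if __name__ == "__main__":
    example = {"Germline.input_bam": "sample.bam", "Germline.run_stats": True}
    declared = ["Germline.input_bam", "Germline.runStats"]
    print(diff_names(declared, example))
```

A name that shows up in both lists with slightly different spelling, as run_stats and runStats do here, is a typical cause of a submission failure.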

  1. Click Submit.
  2. The pipeline appears in My Pipelines.
  3. Next Step: You must Version and Publish before other users can execute the workflow.

Troubleshooting

| Problem | Technical Context | Resolution |
| --- | --- | --- |
| Clone Failure | Authentication/Path | Verify SSH keys. Check that the Entry Point path is relative to the repo root and spelled correctly. |
| Submission Failure | Namespace Mismatch | Ensure parameter Names in Step 3 match the WDL file's input block exactly, including workflow name prefixes. |
| Staging Failure | File Localization | If your WDL expects a specific directory structure, ensure you are using Directory Only or providing the correct base path. |
| Invalid Engine Version | Syntax Support | If your workflow uses features newer than WDL 1.0 (e.g., WDL 1.1 standard-library additions), ensure the selected Cromwell version supports them. |