Key Concepts
This page defines the core terms and concepts used throughout the Quark V3 platform and documentation. Familiarising yourself with these will help you navigate the platform and follow the guides more easily.
Platform Concepts
Pipeline
A pipeline is an automated, multi-step bioinformatics workflow that processes scientific data and produces analytical outputs. In Quark V3, pipelines can be:
- Launched directly from the Launchpad (ready-to-run, no setup required)
- Imported from external workflow systems, such as Nextflow or WDL
Pipelines in Quark support multiple workflow formats including nf-core, AWS HealthOmics, Nextflow, WDL, Snakemake, and custom scripts.
Run
A run is a single execution of a pipeline with a specific set of inputs and parameters. Every time a pipeline is launched, a new run is created and tracked independently. Each run has:
- A unique Run Name (assigned by the user at submission)
- A Status (e.g. Running, Completed, Failed)
- Recorded Start time, Finish time, and Total Runtime
- Associated Results accessible from My Files
Runs are listed and searchable under Pipelines → Runs.
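The run attributes above can be sketched in plain Python. The record fields and values below are illustrative only, mirroring the bullet list; they are not the Quark V3 API:

```python
from datetime import datetime, timedelta

# Hypothetical run records; field names mirror the attributes listed
# above (Run Name, Status, Start/Finish time) but are not a real schema.
runs = [
    {"run_name": "rnaseq-batch-01", "status": "Completed",
     "start": datetime(2024, 5, 1, 9, 0), "finish": datetime(2024, 5, 1, 11, 30)},
    {"run_name": "rnaseq-batch-02", "status": "Failed",
     "start": datetime(2024, 5, 1, 9, 5), "finish": datetime(2024, 5, 1, 9, 20)},
    {"run_name": "variant-call-07", "status": "Running",
     "start": datetime(2024, 5, 2, 8, 0), "finish": None},
]

def total_runtime(run):
    """Total Runtime = Finish time - Start time (None while still running)."""
    if run["finish"] is None:
        return None
    return run["finish"] - run["start"]

# Filter runs by status, as the Pipelines → Runs list does.
completed = [r["run_name"] for r in runs if r["status"] == "Completed"]
print(completed)               # ['rnaseq-batch-01']
print(total_runtime(runs[0]))  # 2:30:00
```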
Launchpad
The Launchpad is Quark's curated catalogue of ready-to-run pipelines. Pipelines on the Launchpad are pre-configured and can be launched without any coding expertise. They are organised into categories including Transcriptomics, Genomics, Proteomics, Immunology, and others, and can be searched by name or filtered by type and category.
My Pipelines
My Pipelines is the section within the Pipelines module where bioinformaticians can import, build, configure, version, and publish their own pipelines. Pipelines created here can be made available to the wider team once published.
Workspace
A Workspace is a configurable, cloud-based compute environment within Quark V3. It provides bioinformaticians with an IDE for writing and running analysis code directly in the platform, without managing local infrastructure.
Each workspace has a defined:
- Environment preset
- Compute specification (CPU cores, memory in GB, optional GPU and Spot instance settings)
- Package dependencies
- Environment variables
A workspace follows a simple lifecycle: Connect → Run → Stop.
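As a rough sketch, a workspace request covering the fields above might be represented as follows. The keys and values are hypothetical, not the actual Quark V3 configuration schema:

```python
# Hypothetical workspace configuration; keys mirror the fields listed
# above but do not reflect the real Quark V3 schema.
workspace_config = {
    "environment_preset": "Basic JupyterLab with GPU",
    "compute": {"cpu_cores": 8, "memory_gb": 32, "gpu": True, "spot": False},
    "packages": ["pandas", "scanpy"],
    "env_vars": {"DATA_DIR": "/mnt/data"},
}

def validate(config):
    """Basic sanity checks before submitting a workspace request."""
    required = {"environment_preset", "compute", "packages", "env_vars"}
    missing = required - config.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    compute = config["compute"]
    if compute["cpu_cores"] < 1 or compute["memory_gb"] < 1:
        raise ValueError("must request at least 1 CPU core and 1 GB of memory")
    return True

print(validate(workspace_config))  # True
```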
Project
A Project is the organisational unit in Quark V3 for grouping related pipelines, workspaces, datasets, and team members.
Project types:
- Developer — for engineering and pipeline development use cases
- Data Science — for analytical and research use cases
- All — unrestricted access across both types
App (Visualisation App)
An App in Quark V3 is a user-configured visualisation tool that allows researchers to explore and interpret pipeline outputs. Apps are created by specifying a container image, compute resources, dataset mounts, and access permissions. Each app can be connected, re-run, started, edited, or deleted from the Apps module.
My Files
My Files is the central file management area in Quark V3. It is organised into three tabs:
- Data — input files and datasets available for use in pipeline runs
- Results — output files and artefacts generated by completed runs
- Activity — a log of file-level actions and changes across the platform
Pipeline & Workflow Concepts
nf-core
nf-core is a community-curated collection of peer-reviewed bioinformatics pipelines built using the Nextflow workflow management system. Quark V3 provides direct access to nf-core pipelines via the Launchpad.
Nextflow
Nextflow is a workflow management system that enables scalable and reproducible scientific pipelines using software containers. It is one of the pipeline formats supported for import into Quark V3 via My Pipelines.
WDL (Workflow Description Language)
WDL is a declarative language for defining data processing pipelines. WDL pipelines can be imported into Quark V3 via My Pipelines.
Snakemake
Snakemake is a Python-based workflow management system in which analyses are defined as rules that link input files to output files. Snakemake pipelines can be imported into Quark V3 via My Pipelines.
AWS HealthOmics
AWS HealthOmics is an Amazon Web Services platform for storing, querying, and analysing genomics and biological data at scale. Quark V3 integrates with HealthOmics through two pipeline types:
- Private Workflow — your organisation's own private workflows
- Ready2Run Workflow — pre-validated, production-ready AWS workflows
Compute Concepts
CPU and Memory
When configuring a Workspace or App, users specify the compute requirements in terms of CPU (number of cores) and Memory (in GB). These values determine how much compute resource is allocated to the job.
GPU
A GPU (Graphics Processing Unit) can be enabled for compute-intensive tasks such as deep learning inference or protein structure prediction. GPU support is available in specific workspace environment presets (e.g. Basic JupyterLab with GPU, Boltzgen-JupyterLab).
Spot Instance
A Spot instance uses spare cloud compute capacity at a reduced cost. Enabling Spot in a workspace configuration can significantly lower compute costs, but Spot instances may be interrupted if capacity is reclaimed. Spot is best suited for fault-tolerant, non-time-critical jobs.
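The fault-tolerance requirement can be met with a checkpoint/resume pattern: persist progress after each unit of work so that a restarted job skips everything already done. This is a generic Python sketch, not a Quark V3 feature:

```python
import json
import os
import tempfile

def process(item):
    return item * item  # stand-in for real per-item work

def run_job(items, checkpoint_path):
    """Process items, persisting results after each one so an
    interrupted (e.g. Spot-reclaimed) run can resume where it left off."""
    done = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)  # resume from a previous, interrupted run
    for item in items:
        key = str(item)
        if key in done:
            continue  # already processed before the interruption
        done[key] = process(item)
        with open(checkpoint_path, "w") as f:
            json.dump(done, f)  # checkpoint after every unit of work
    return done

ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.json")
results = run_job([1, 2, 3], ckpt)
# A rerun (e.g. after a Spot interruption) only processes the new item.
results2 = run_job([1, 2, 3, 4], ckpt)
```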
CUDA
CUDA is NVIDIA's parallel computing platform and programming model, required to run GPU-accelerated workloads.
Workspace Environment Presets
| Environment | Ubuntu | Python | JupyterLab | Extras |
|---|---|---|---|---|
| Boltzgen-JupyterLab | 22.04 | 3.12.3 | 4.5.5 | Boltzgen 0.3.0, CUDA 13.1.1 |
| Basic JupyterLab | 20.04 | 3.11 | 4.0.3 | — |
| Basic JupyterLab with GPU | 20.04 | 3.11 | 4.0.3 | CUDA 11.6 |
| Admin Terminal | 22.04 | — | — | Admin terminal access only |
| Ubuntu Terminal | 22.04 | — | — | Ubuntu terminal access only |