Key Concepts
This page defines the core terms and concepts used throughout the Quark V3 platform and documentation. Familiarising yourself with these will help you navigate the platform and follow the guides more easily.
Platform Concepts
Pipeline
A pipeline is an automated, multi-step bioinformatics workflow that processes scientific data and produces analytical outputs. In Quark V3, pipelines can be:
- Launched directly from the Launchpad (ready-to-run, no setup required)
- Imported from external workflow systems, such as Nextflow or WDL
Pipelines in Quark support multiple workflow formats including nf-core, AWS HealthOmics, Nextflow, WDL, Snakemake, and custom scripts.
Run
A run is a single execution of a pipeline with a specific set of inputs and parameters. Every time a pipeline is launched, a new run is created and tracked independently. Each run has:
- A unique Run Name (assigned by the user at submission)
- A Status (e.g. Running, Completed, Failed)
- Recorded Start time, Finish time, and Total Runtime
- Associated Results accessible from My Files
Runs are listed and searchable under Pipelines → Runs.
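The run attributes above can be sketched in plain Python. The record fields and values below are illustrative only, mirroring the bullet list; they are not the Quark V3 API:

```python
from datetime import datetime, timedelta

# Hypothetical run records; field names mirror the attributes listed
# above (Run Name, Status, Start/Finish time) but are not a real schema.
runs = [
    {"run_name": "rnaseq-batch-01", "status": "Completed",
     "start": datetime(2024, 5, 1, 9, 0), "finish": datetime(2024, 5, 1, 11, 30)},
    {"run_name": "rnaseq-batch-02", "status": "Failed",
     "start": datetime(2024, 5, 1, 9, 5), "finish": datetime(2024, 5, 1, 9, 20)},
    {"run_name": "variant-call-07", "status": "Running",
     "start": datetime(2024, 5, 2, 8, 0), "finish": None},
]

def total_runtime(run):
    """Total Runtime = Finish time - Start time (None while still running)."""
    if run["finish"] is None:
        return None
    return run["finish"] - run["start"]

# Filter runs by status, as the Pipelines → Runs list does.
completed = [r["run_name"] for r in runs if r["status"] == "Completed"]
print(completed)               # ['rnaseq-batch-01']
print(total_runtime(runs[0]))  # 2:30:00
```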
Launchpad
The Launchpad is Quark's curated catalogue of ready-to-run pipelines. Pipelines on the Launchpad are pre-configured and can be launched without any coding expertise. They are organised into categories including Transcriptomics, Genomics, Proteomics, Immunology, and others, and can be searched by name or filtered by type and category.
My Pipelines
My Pipelines is the section within the Pipelines module where bioinformaticians can import, build, configure, version, and publish their own pipelines. Pipelines created here can be made available to the wider team once published.
Workspace
A Workspace is a configurable, cloud-based compute environment within Quark V3. It provides bioinformaticians with an IDE for writing and running analysis code directly in the platform, without managing local infrastructure.
Each workspace has a defined:
- Environment preset
- Compute specification (CPU cores, memory in GB, optional GPU and Spot instance settings)
- Package dependencies
- Environment variables
A workspace follows a simple lifecycle: Connect → Run → Stop.
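As a rough sketch, a workspace request covering the fields above might be represented as follows. The keys and values are hypothetical, not the actual Quark V3 configuration schema:

```python
# Hypothetical workspace configuration; keys mirror the fields listed
# above but do not reflect the real Quark V3 schema.
workspace_config = {
    "environment_preset": "Basic JupyterLab with GPU",
    "compute": {"cpu_cores": 8, "memory_gb": 32, "gpu": True, "spot": False},
    "packages": ["pandas", "scanpy"],
    "env_vars": {"DATA_DIR": "/mnt/data"},
}

def validate(config):
    """Basic sanity checks before submitting a workspace request."""
    required = {"environment_preset", "compute", "packages", "env_vars"}
    missing = required - config.keys()
    if missing:
        raise ValueError(f"missing fields: {sorted(missing)}")
    compute = config["compute"]
    if compute["cpu_cores"] < 1 or compute["memory_gb"] < 1:
        raise ValueError("must request at least 1 CPU core and 1 GB of memory")
    return True

print(validate(workspace_config))  # True
```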
Project
A Project is the organisational unit in Quark V3 for grouping related pipelines, workspaces, datasets, and team members.
Project types:
- Developer — for engineering and pipeline development use cases
- Data Science — for analytical and research use cases
- All — unrestricted access across both types
App (Visualisation App)
An App in Quark V3 is a user-configured visualisation tool that allows researchers to explore and interpret pipeline outputs. Apps are created by specifying a container image, compute resources, dataset mounts, and access permissions. Each app can be connected, re-run, started, edited, or deleted from the Apps module.
My Files
My Files is the central file management area in Quark V3. It is organised into three tabs:
- Data — input files and datasets available for use in pipeline runs
- Results — output files and artefacts generated by completed runs
- Activity — a log of file-level actions and changes across the platform
Pipeline & Workflow Concepts
nf-core
nf-core is a community-curated collection of peer-reviewed bioinformatics pipelines built using the Nextflow workflow management system. Quark V3 provides direct access to nf-core pipelines via the Launchpad.
Nextflow
Nextflow is a workflow management system that enables scalable and reproducible scientific pipelines using software containers. It is one of the pipeline formats supported for import into Quark V3 via My Pipelines.
WDL (Workflow Description Language)
WDL is a declarative language for defining data processing pipelines. WDL pipelines can be imported into Quark V3 via My Pipelines.
Snakemake
Snakemake is a Python-based workflow management system in which analyses are defined as rules that link input files to output files. Snakemake pipelines can be imported into Quark V3 via My Pipelines.
AWS HealthOmics
AWS HealthOmics is an Amazon Web Services platform for storing, querying, and analysing genomics and biological data at scale. Quark V3 integrates with HealthOmics through two pipeline types:
- Private Workflow — your organisation's own private workflows
- Ready2Run Workflow — pre-validated, production-ready AWS workflows
Compute Concepts
CPU and Memory
When configuring a Workspace or App, users specify the compute requirements in terms of CPU (number of cores) and Memory (in GB). These values determine how much compute resource is allocated to the job.
GPU
A GPU (Graphics Processing Unit) can be enabled for compute-intensive tasks such as deep learning inference or protein structure prediction. GPU support is available in specific workspace environment presets (e.g. Basic JupyterLab with GPU, Boltzgen-JupyterLab).
Spot Instance
A Spot instance uses spare cloud compute capacity at a reduced cost. Enabling Spot in a workspace configuration can significantly lower compute costs, but Spot instances may be interrupted if capacity is reclaimed. Spot is best suited for fault-tolerant, non-time-critical jobs.
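The fault-tolerance requirement can be met with a checkpoint/resume pattern: persist progress after each unit of work so that a restarted job skips everything already done. This is a generic Python sketch, not a Quark V3 feature:

```python
import json
import os
import tempfile

def process(item):
    return item * item  # stand-in for real per-item work

def run_job(items, checkpoint_path):
    """Process items, persisting results after each one so an
    interrupted (e.g. Spot-reclaimed) run can resume where it left off."""
    done = {}
    if os.path.exists(checkpoint_path):
        with open(checkpoint_path) as f:
            done = json.load(f)  # resume from a previous, interrupted run
    for item in items:
        key = str(item)
        if key in done:
            continue  # already processed before the interruption
        done[key] = process(item)
        with open(checkpoint_path, "w") as f:
            json.dump(done, f)  # checkpoint after every unit of work
    return done

ckpt = os.path.join(tempfile.mkdtemp(), "ckpt.json")
results = run_job([1, 2, 3], ckpt)
# A rerun (e.g. after a Spot interruption) only processes the new item.
results2 = run_job([1, 2, 3, 4], ckpt)
```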
CUDA
CUDA is NVIDIA's parallel computing platform and programming model, required to run GPU-accelerated workloads.
Workspace Environment Presets
| Environment | Ubuntu | Python | JupyterLab | Extras |
|---|---|---|---|---|
| Boltzgen-JupyterLab | 22.04 | 3.12.3 | 4.5.5 | Boltzgen 0.3.0, CUDA 13.1.1 |
| Basic JupyterLab | 20.04 | 3.11 | 4.0.3 | — |
| Basic JupyterLab with GPU | 20.04 | 3.11 | 4.0.3 | CUDA 11.6 |
| Admin Terminal | 22.04 | — | — | Admin terminal access only |
| Ubuntu Terminal | 22.04 | — | — | Ubuntu terminal access only |