DNA Sarek Run

Step 1: Launch Quark

Open quark.invisibl.io and click on the Quark button, which opens a new tab.

Quark Page

The landing page will display a Dashboard that gives statistics about jobs run on the platform and their respective costs.

Landing Page

Click on the Launchpad tab displayed next to Runs.

Launchpad

Step 2: Access the Sarek Pipeline

The Launchpad page will show all templates available on the Quark platform.

Launchpad

Navigate to the Search bar and type the name of the required pipeline, e.g. sarek.

Sarek Run

Step 3: Select Pipeline Version

The UI will show the Sarek pipelines.

Template

Select the required pipeline version by clicking the small yellow box at the top right corner of the search result.

Quark Page

Select version 3.5.1 and click Run.

Step 4: Fill Run Parameters

A blank template loads on the right side of the page.

Template

Fill in the fields specified in the template. Parameters marked Mandatory below are required to start the job run.

Example of a Blank Template:

Template

Name (Mandatory): <Job run Name>

Parameters:

Input CSV (Mandatory): Takes a CSV samplesheet as input. The samplesheet format follows the nf-core specification. Select Files -> My files -> Directory where the samplesheet is present -> Samplesheet.csv

More details may be found here: Sarek: Usage
Tools (Mandatory) (fill as required): A comma-separated list of tools. Tools currently supported on Quark include vep, strelka, and haplotypecaller.
split_fastq (Mandatory): An integer specifying how many reads each split of a FASTQ file contains. Set it to 0 to turn off splitting. The default value is 10000000.
intervals (optional): Tick when using WES. It takes a target BED file (for whole exome or targeted sequencing) or an intervals file.
IGenome Reference (Mandatory): Select the igenome-reference-refdata
Genome (Mandatory): Select the reference genome
Caches for annotation (optional)
Snpeff: Select the path from the UI
vep: Select the top-level directory from vep_cache2. Do NOT select a specific path inside the top-level directory. The pipeline will automatically detect the organism and build based on the inputs.

Template Vep

enable analytics (optional): Select this if analytics should be triggered after the job run. If it is not selected, the data will not be indexed and cannot be used for cohort analysis.
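The Tools and split_fastq values can be sanity-checked before they are typed into the template. The following is a minimal sketch, not part of Quark: the function names are hypothetical, and the set of supported tools contains only the three tools named in this guide, so a tool missing from it is not necessarily unsupported on the platform.

```python
# Hypothetical pre-submission checks; none of these names are a Quark API.
# Only the tools named in this guide are listed here (assumption: the
# platform may support more).
SUPPORTED_TOOLS = {"vep", "strelka", "haplotypecaller"}


def parse_tools(tools_field: str) -> list[str]:
    """Split a comma-separated Tools value and reject empty entries."""
    tools = [tool.strip() for tool in tools_field.split(",")]
    if any(tool == "" for tool in tools):
        raise ValueError(f"Tools must be comma separated, got {tools_field!r}")
    return tools


def unknown_tools(tools: list[str]) -> list[str]:
    """Return the tools that are not in the list named in this guide."""
    return [tool for tool in tools if tool.lower() not in SUPPORTED_TOOLS]


def check_split_fastq(value) -> int:
    """Return split_fastq as a non-negative integer (0 turns splitting off)."""
    reads = int(str(value))  # raises ValueError for "1.5", "ten", etc.
    if reads < 0:
        raise ValueError("split_fastq must be 0 or a positive integer")
    return reads
```

For example, parse_tools("strelka, vep") returns the two tool names with whitespace removed, and check_split_fastq("10000000") accepts the default value used in this guide.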

An example of a filled template is shown below:

Template

Step 5: Review and Run the Sarek Pipeline

Click Review and review the parameters.

Review Page

To submit the job, click Submit. If any changes are required, click Edit.

Creating a Samplesheet

The samplesheet can be created in any text editor or spreadsheet application (Notepad, Excel, Notepad++, vim, etc.), but the saved file must have the .csv extension.

(For this documentation we will use MS Excel to easily visualize different columns.)

As per nf-core standards, some columns are mandatory in the CSV file.

In our current example, we are using a Normal vs Tumour sample pair.

Patient	Sex	Status	Sample	Lane	fastq_1	fastq_2
p1	NA	0	HG008_N	lane_1	s3://quark-demo-platform-data/artifacts/mpsdimamay-stlukes-com-ph/slmc/uploads/sarek/HG008-N_Illumina_R1.fastq.gz	s3://quark-demo-platform-data/artifacts/mpsdimamay-stlukes-com-ph/slmc/uploads/sarek/HG008-N_Illumina_R2.fastq.gz
p1	NA	1	HG008_T	lane_1	s3://quark-demo-platform-data/artifacts/mpsdimamay-stlukes-com-ph/slmc/uploads/sarek/HG008-T_Illumina_R1.fastq.gz	s3://quark-demo-platform-data/artifacts/mpsdimamay-stlukes-com-ph/slmc/uploads/sarek/HG008-T_Illumina_R2.fastq.gz

Patient: This column should have patient IDs. Since we are comparing Normal vs Tumour samples, the Patient ID must be the same for both samples. (This is a requirement for Strelka.)
Sex: If the sex of the patient is known, it can be filled in as male or female; otherwise use NA, as in the example above.
Status: This can be either 0 or 1. Normal = 0, Tumour = 1.
Sample: This column should have the sample ID (HG008_N and HG008_T in the example above).
Lane: This column should include the lane number from the sequencing experiment.
fastq_1: Location of the forward read.
fastq_2: Location of the reverse read.
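The column rules above can be checked automatically before the samplesheet is uploaded. The sketch below is an illustrative helper, not a Quark or nf-core tool: the function name is hypothetical, header names are compared case-insensitively, and the rule that every tumour sample needs a normal sample with the same patient ID is assumed to apply only to paired Normal vs Tumour runs such as the one in this guide.

```python
import csv
import io

REQUIRED = ["patient", "sex", "status", "sample", "lane", "fastq_1", "fastq_2"]


def check_samplesheet(csv_text):
    """Return a list of problems found in the samplesheet text (empty if none)."""
    rows = list(csv.reader(io.StringIO(csv_text)))
    if not rows:
        return ["samplesheet is empty"]

    header = [name.strip().lower() for name in rows[0]]
    missing = [column for column in REQUIRED if column not in header]
    if missing:
        return ["missing columns: " + ", ".join(missing)]

    problems = []
    statuses_by_patient = {}
    for line_no, values in enumerate(rows[1:], start=2):
        if not values:  # skip blank lines
            continue
        if len(values) != len(header):
            problems.append(
                f"line {line_no}: expected {len(header)} fields, found {len(values)}"
            )
            continue
        row = dict(zip(header, (value.strip() for value in values)))
        if row["status"] not in ("0", "1"):
            problems.append(
                f"line {line_no}: status must be 0 or 1, found {row['status']!r}"
            )
        else:
            statuses_by_patient.setdefault(row["patient"], set()).add(row["status"])
        for column in ("fastq_1", "fastq_2"):
            if not row[column]:
                problems.append(f"line {line_no}: {column} is empty")

    # Paired analysis: a tumour (1) needs a normal (0) with the same patient ID.
    for patient, statuses in statuses_by_patient.items():
        if "1" in statuses and "0" not in statuses:
            problems.append(f"patient {patient}: tumour sample has no matching normal")
    return problems
```

To use it, read the saved file as text and pass the contents to check_samplesheet; an empty list means none of these checks failed.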

Getting the location/S3 Path for the fastq files

Step 1: Click on “My Files” in the sidebar.

My Files

Step 2: Navigate to the required directory/folder.

My Files

Step 3: Locate the file within the directory/folder and click on the 3-dot icon. Click Copy Path.

My Files

Step 4: Paste the copied path in the relevant column of the samplesheet file.

My Files

Once all the required columns have been filled in the samplesheet, save it as a .csv file.
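A pasted path is easy to truncate or mistype, so it can help to confirm that each copied value has the shape of an S3 path before saving. The sketch below uses only Python's standard library. It assumes that Copy Path returns paths of the form s3://bucket/key, as in the example samplesheet, and the accepted file suffixes are an assumption as well. It checks the format only and does not contact S3.

```python
from urllib.parse import urlsplit

# Assumption: read files end in one of these suffixes, as in the example rows.
FASTQ_SUFFIXES = (".fastq.gz", ".fq.gz")


def split_s3_path(path):
    """Split 's3://bucket/key' into (bucket, key), or raise ValueError."""
    parts = urlsplit(path.strip())
    key = parts.path.lstrip("/")
    if parts.scheme != "s3" or not parts.netloc or not key:
        raise ValueError(f"not a complete S3 path: {path!r}")
    return parts.netloc, key


def looks_like_fastq(path):
    """True if the path is a well-formed S3 path ending in a FASTQ suffix."""
    try:
        _, key = split_s3_path(path)
    except ValueError:
        return False
    return key.endswith(FASTQ_SUFFIXES)
```

For the normal sample's forward read in the example, split_s3_path returns the bucket quark-demo-platform-data and the remainder of the path as the key.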