Dataset Analysis using Workstations
Overview
Quark's Trusted Research Environment provides researchers with secure, cloud-based Workstations for performing data analysis. A workstation is a fully provisioned virtual machine equipped with the operating system and analytical tools required for your research — including JupyterLab, RStudio, and other software.
Workstations operate within the TRE's governance framework: every workstation must be approved by the Project Administrator before it can be launched, and all data flowing in and out of the workstation is subject to controlled access and audit logging.
The platform supports both Windows and Linux (Ubuntu DCV) workstation images, allowing researchers to choose the operating system that best suits their analytical workflows.
Workstation Capabilities
Quark workstations provide the following capabilities:
- File Management — Upload data files and organise them into folders. Access output files generated by your analysis.
- Request Approval Workflows — All workstation provisioning, file uploads, and file downloads are governed by an approval workflow.
- Resource Monitoring — Monitor CPU usage, memory consumption, disk space, and other performance metrics in real time.
- Lifecycle Management — Define automated schedules for starting and stopping workstations to optimise resource usage and cost.
- Cost Tracking — View daily, weekly, and monthly cost breakdowns for your workstation usage.
- Event Logging — All workstation-related activities (creation, launch, start, stop, termination) are automatically logged with timestamps and user information.
Requesting a Workstation
Before you can use a workstation, you must submit a request for provisioning, which will be reviewed by the Project Administrator.
Submitting a Workstation Request
- Select Workstations from the Navigation Menu on the left.
- Click the Add New button in the top-right corner of the screen.

- Select the appropriate workstation image from the available options (e.g., Ubuntu DCV for a Linux workstation).
- In the right-hand pop-up pane, fill in the following details:
  - Workstation Name — Enter a descriptive name for your workstation (e.g., ws-ubuntu-cohortanalysis).
  - Description — Provide a brief description of the workstation's intended purpose.
  - Storage — Specify the disk storage size in GB (e.g., 30).
  - Capacity — Select the appropriate compute capacity (CPU and memory). Your Project Administrator will advise on the available capacity options.

- After verifying all parameters, click the Request icon.
- Click Submit to send the provisioning request.
The workstation request will now appear on the Workstations screen with a status of Pending Approval.

Note: Notify your Project Administrator if your request requires expedited review.
Launching a Workstation
Once the Project Administrator approves your workstation request, you can launch and connect to it.
Starting the Workstation
- Navigate to Workstations from the Navigation Menu.
- Confirm that your workstation status has been updated to Approved.
- Click the Launch icon next to your workstation.
- A confirmation prompt will appear. Copy the prompt text, paste it into the verification box, and click Start.

- The workstation status will update to Progressing. Wait approximately 5–7 minutes for the workstation to be provisioned.
- Once the status updates to Running, the workstation is ready for use.

Connecting to a Workstation
Once the workstation is in a Running state, you can connect to it remotely.
Retrieving the Password
- Click on your workstation name in the Workstations list.
- A pop-up pane will open on the right, displaying all the details of your workstation.
- Scroll to the last row of the details pane to locate the Password. Copy this password.

Connecting via the Browser
- Click the Connect icon next to your workstation.
- Enter the username ubuntu (for Linux workstations) and paste the copied password.
- Click Sign in.
- If prompted again, re-enter the password to complete the connection.
You will now see the workstation's desktop environment in your browser.

Using JupyterLab
Workstations come pre-configured with JupyterLab for running analysis notebooks. To open JupyterLab:
- Once inside the workstation, click Activities in the top-left corner of the desktop screen.
- In the Activities overview, click the JupyterLab icon.
- JupyterLab will open in a browser window within the workstation. Navigate to the notebook or script you wish to run.

Running Analysis Scripts
To execute code within a JupyterLab notebook:
- Scroll to the first code block in the notebook.
- Click within the code block to select it.
- Click the Run (Play) icon at the top of the screen, or press Shift + Enter to execute the selected block.
- Continue executing each subsequent code block step by step until the analysis is complete.

Tip: Review each block's output before proceeding to the next, to identify any errors or unexpected results early in the workflow.
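To make the step-by-step execution concrete, the cell below shows the kind of code a first notebook block might contain. The data and names (such as `cohort`) are invented for illustration only; a real cell would load your project's own dataset. It uses just the Python standard library.

```python
# Minimal example of a first notebook cell: summarise a small dataset
# before running the rest of the analysis. The records here are invented
# sample data, not anything from your project.
from statistics import mean

cohort = [
    {"id": 1, "group": "control",   "age": 54},
    {"id": 2, "group": "treatment", "age": 61},
    {"id": 3, "group": "control",   "age": 48},
    {"id": 4, "group": "treatment", "age": 59},
]

# Mean age per group -- the sort of quick sanity check worth reviewing
# before executing the next block (see the tip above).
by_group = {}
for row in cohort:
    by_group.setdefault(row["group"], []).append(row["age"])

summary = {g: mean(ages) for g, ages in by_group.items()}
print(summary)  # {'control': 51, 'treatment': 60}
```

Running this with Shift + Enter prints the per-group summary beneath the cell, which you can check before moving on.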
Managing Workstation Files
Uploading Data Files
Researchers can upload data files to their workstation for analysis:
- Click on your workstation name to open the details pane.
- Navigate to the Data tab.
- Create folders to organise your data, and upload files as needed.
Uploaded files undergo a vulnerability scan and integrity check before they are transferred to the workstation. The upload status can be tracked under My Requests in the Navigation Menu.
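Because uploads pass through an integrity check, it can be useful to record your own checksum before uploading, then recompute it inside the workstation to confirm the file arrived intact. A minimal sketch using only the Python standard library (`cohort_data.csv` is a hypothetical filename):

```python
# Compute a SHA-256 checksum of a data file before uploading it, so the
# copy on the workstation can be verified independently after transfer.
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`,
    reading in chunks so large data files don't exhaust memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

# Usage: run this locally before the upload, then again on the
# workstation, and compare the two digests.
# print(sha256_of("cohort_data.csv"))  # hypothetical filename
```

If the two digests match, the file was transferred without corruption.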
Accessing Output Files
Output files generated by your analysis within the workstation are available under the Results tab:
- Click on your workstation name to open the details pane.
- Navigate to the Results tab.
- Your output files will be listed with options for downloading. See Downloading Data Results for the download workflow.
Changing the Instance Type
If your workstation requires more (or fewer) compute resources after it has been created, you can change the instance type:
- Navigate to the Workstations tab and select your workstation.
- In the right-hand details pane, navigate to the Summary tab.
- Locate the pencil (edit) icon next to the instance type.
- Click the pencil icon and select a new instance type from the available options.
- The workstation will reflect the updated instance type.
Starting and Stopping a Workstation
Researchers can manually start or stop their workstations at any time to manage resource usage and costs:
- Stop a workstation when it is not actively in use to reduce costs.
- Start a stopped workstation when you are ready to resume work.
These controls are available from the Workstations interface.
Lifecycle Management
Workstations support automated lifecycle management, enabling researchers to define schedules for automatic starting and stopping:
- Open your workstation's details pane.
- Navigate to the lifecycle management section.
- Define a schedule or set of conditions (e.g., start at 9:00 AM, stop at 6:00 PM on weekdays).
- The workstation will automatically start and stop according to the defined schedule.
This feature helps optimise resource consumption and manage costs by ensuring workstations are not left running outside of active research hours.
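The platform applies the schedule for you; the sketch below simply makes the example rule above (start 9:00 AM, stop 6:00 PM on weekdays) explicit, so you can reason about when a scheduled workstation will be running. Standard library only.

```python
# Illustration of the example lifecycle schedule above: running 09:00-18:00
# on weekdays only. The TRE enforces the actual schedule -- this function
# just expresses the same rule in code.
from datetime import datetime, time

def should_be_running(now: datetime) -> bool:
    """True if a workstation on the weekday 09:00-18:00 schedule
    would be running at `now`."""
    is_weekday = now.weekday() < 5                      # Mon=0 .. Fri=4
    in_hours = time(9, 0) <= now.time() < time(18, 0)   # stop at 18:00
    return is_weekday and in_hours

print(should_be_running(datetime(2024, 1, 15, 10, 30)))  # Monday 10:30 -> True
print(should_be_running(datetime(2024, 1, 13, 10, 30)))  # Saturday -> False
```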
Monitoring Resources and Costs
Resource Monitoring
The Monitoring tab within the workstation details pane provides real-time metrics on:
- CPU usage
- Memory consumption
- Disk space utilisation
- Other relevant performance statistics
Use these metrics to identify resource bottlenecks and optimise your analysis workflows.
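Alongside the Monitoring tab, you can take a quick resource snapshot from a terminal inside a Linux workstation. The sketch below uses only the Python standard library; note that `os.getloadavg` is Unix-specific.

```python
# Quick resource snapshot from inside a Linux workstation, complementing
# the Monitoring tab. Standard library only.
import os
import shutil

cpus = os.cpu_count()
load1, load5, load15 = os.getloadavg()   # 1/5/15-minute load averages (Unix)
disk = shutil.disk_usage("/")

print(f"CPUs: {cpus}, 1-min load: {load1:.2f}")
print(f"Disk: {disk.used / 1e9:.1f} GB used of {disk.total / 1e9:.1f} GB")
```

A sustained 1-minute load well above the CPU count, or a nearly full disk, is the kind of bottleneck the Monitoring tab will also surface.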
Cost Tracking
The Cost Chart within the workstation details pane provides detailed cost breakdowns:
- Daily, weekly, and monthly cost views based on resource consumption.
- Cost estimations for different configuration options, helping researchers manage budgets.
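As a back-of-envelope illustration of how configuration choices translate into cost, the sketch below combines a compute rate with a storage rate. Both rates are invented placeholders; use the figures shown in your project's Cost Chart.

```python
# Back-of-envelope monthly cost estimate for a workstation configuration.
# The rates below are hypothetical placeholders, not Quark pricing.
HOURLY_RATE = 0.40      # hypothetical compute cost per running hour
STORAGE_RATE = 0.00015  # hypothetical storage cost per GB-hour

def monthly_cost(running_hours_per_day, storage_gb, days=30):
    compute = HOURLY_RATE * running_hours_per_day * days
    storage = STORAGE_RATE * storage_gb * 24 * days  # storage accrues 24/7
    return round(compute + storage, 2)

# A 9:00-18:00 weekday schedule averages roughly 6.4 running hours/day,
# so a 30 GB workstation on that schedule might cost about:
print(monthly_cost(running_hours_per_day=6.4, storage_gb=30))  # 80.04
```

Note that storage is billed whether or not the workstation is running, which is why stopping idle workstations reduces but does not eliminate cost.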
Event Logging
The Event Log automatically records all workstation activities:
- Workstation creation, launch, start, stop, and termination events.
- Each event includes a timestamp, user information, and status updates.
- The event log is available for auditing and troubleshooting purposes.
What's Next
- Download your analysis results — See Downloading Data Results.
- Run a bioinformatics pipeline — See Dataset Analysis using Pipelines.