Dataset Analysis using Workstations
Overview
Quark's Trusted Research Environment provides researchers with secure, cloud-based Workstations for performing data analysis. A workstation is a fully provisioned virtual machine equipped with the operating system and analytical tools required for your research — including JupyterLab, RStudio, and other software.
Workstations operate within the TRE's governance framework: every workstation must be approved by the Project Administrator before it can be launched, and all data flowing in and out of the workstation is subject to controlled access and audit logging.
The platform supports both Windows and Linux (Ubuntu DCV) workstation images, allowing researchers to choose the operating system that best suits their analytical workflows.
Workstation Capabilities
Quark workstations provide the following capabilities:
- File Management — Upload data files and organise them into folders. Access output files generated by your analysis.
- Request Approval Workflows — All workstation provisioning, file uploads, and file downloads are governed by an approval workflow.
- Resource Monitoring — Monitor CPU usage, memory consumption, disk space, and other performance metrics in real time.
- Lifecycle Management — Define automated schedules for starting and stopping workstations to optimise resource usage and cost.
- Cost Tracking — View daily, weekly, and monthly cost breakdowns for your workstation usage.
- Event Logging — All workstation-related activities (creation, launch, start, stop, termination) are automatically logged with timestamps and user information.
Requesting a Workstation
Before you can use a workstation, you must submit a request for provisioning, which will be reviewed by the Project Administrator.
Submitting a Workstation Request
- Select Workstations from the Navigation Menu on the left.
- Click the Add New button in the top-right corner of the screen.

- Select the appropriate workstation image from the available options (e.g., Ubuntu DCV for a Linux workstation).
- In the right-hand pop-up pane, fill in the following details:
  - Workstation Name — Enter a descriptive name for your workstation (e.g., ws-ubuntu-cohortanalysis).
  - Description — Provide a brief description of the workstation's intended purpose.
  - Storage — Specify the disk storage size in GB (e.g., 30).
  - Capacity — Select the appropriate compute capacity (CPU and memory). Your Project Administrator will advise on the available capacity options.

- After verifying all parameters, click the Request icon.
- Click Submit to send the provisioning request.
The workstation request will now appear on the Workstations screen with a status of Pending Approval.

Note: Notify your Project Administrator if your request requires expedited review.
Launching a Workstation
Once the Project Administrator approves your workstation request, you can launch and connect to it.
Starting the Workstation
- Navigate to Workstations from the Navigation Menu.
- Confirm that your workstation status has been updated to Approved.
- Click the Launch icon next to your workstation.
- A confirmation prompt will appear. Copy the prompt text, paste it into the verification box, and click Start.

- The workstation status will update to Progressing. Wait approximately 5–7 minutes for the workstation to be provisioned.
- Once the status updates to Running, the workstation is ready for use.

Connecting to a Workstation
Once the workstation is in a Running state, you can connect to it remotely.
Retrieving the Password
- Click on your workstation name in the Workstations list.
- A pop-up pane will open on the right, displaying all the details of your workstation.
- Scroll to the last row of the details pane to locate the Password. Copy this password.

Connecting via the Browser
- Click the Connect icon next to your workstation.
- Enter the username ubuntu (for Linux workstations) and paste the copied password.
- Click Sign in.
- If prompted again, re-enter the password to complete the connection.
You will now see the workstation's desktop environment in your browser.

Using JupyterLab
Workstations come pre-configured with JupyterLab for running analysis notebooks. To open JupyterLab:
- Once inside the workstation, click Activities in the top-left corner of the desktop screen.
- In the Activities overview, click the JupyterLab icon.
- JupyterLab will open in a browser window within the workstation. Navigate to the notebook or script you wish to run.

Running Analysis Scripts
To execute code within a JupyterLab notebook:
- Scroll to the first code block in the notebook.
- Click within the code block to select it.
- Click the Run (Play) icon at the top of the screen, or press Shift + Enter to execute the selected block.
- Continue executing each subsequent code block step by step until the analysis is complete.

Tip: Review each block's output before proceeding to the next, to identify any errors or unexpected results early in the workflow.
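To make the step-by-step execution concrete, the cell below shows the kind of code a first notebook block might contain. The data and names (such as `cohort`) are invented for illustration only; a real cell would load your project's own dataset. It uses just the Python standard library.

```python
# Minimal example of a first notebook cell: summarise a small dataset
# before running the rest of the analysis. The records here are invented
# sample data, not anything from your project.
from statistics import mean

cohort = [
    {"id": 1, "group": "control",   "age": 54},
    {"id": 2, "group": "treatment", "age": 61},
    {"id": 3, "group": "control",   "age": 48},
    {"id": 4, "group": "treatment", "age": 59},
]

# Mean age per group -- the sort of quick sanity check worth reviewing
# before executing the next block (see the tip above).
by_group = {}
for row in cohort:
    by_group.setdefault(row["group"], []).append(row["age"])

summary = {g: mean(ages) for g, ages in by_group.items()}
print(summary)  # {'control': 51, 'treatment': 60}
```

Running this with Shift + Enter prints the per-group summary beneath the cell, which you can check before moving on.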
Managing Workstation Files
Uploading Data Files
Researchers can upload data files to their workstation for analysis:
- Click on your workstation name to open the details pane.
- Navigate to the Data tab.
- Create folders to organise your data, and upload files as needed.
Uploaded files undergo a vulnerability scan and integrity check before they are transferred to the workstation. The upload status can be tracked under My Requests in the Navigation Menu.
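Because uploads pass through an integrity check, it can be useful to record your own checksum before uploading, then recompute it inside the workstation to confirm the file arrived intact. A minimal sketch using only the Python standard library (`cohort_data.csv` is a hypothetical filename):

```python
# Compute a SHA-256 checksum of a data file before uploading it, so the
# copy on the workstation can be verified independently after transfer.
import hashlib

def sha256_of(path, chunk_size=1 << 20):
    """Return the hex SHA-256 digest of the file at `path`,
    reading in chunks so large data files don't exhaust memory."""
    digest = hashlib.sha256()
    with open(path, "rb") as f:
        while chunk := f.read(chunk_size):
            digest.update(chunk)
    return digest.hexdigest()

# Usage: run this locally before the upload, then again on the
# workstation, and compare the two digests.
# print(sha256_of("cohort_data.csv"))  # hypothetical filename
```

If the two digests match, the file was transferred without corruption.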
Accessing Output Files
Output files generated by your analysis within the workstation are available under the Results tab:
- Click on your workstation name to open the details pane.
- Navigate to the Results tab.
- Your output files will be listed with options for downloading. See Downloading Data Results for the download workflow.
Changing the Instance Type
If your workstation requires more (or fewer) compute resources after it has been created, you can change the instance type:
- Navigate to the Workstations tab and select your workstation.
- In the right-hand details pane, navigate to the Summary tab.
- Locate the pencil (edit) icon next to the instance type.
- Click the pencil icon and select a new instance type from the available options.
- The workstation will reflect the updated instance type.
Starting and Stopping a Workstation
Researchers can manually start or stop their workstations at any time to manage resource usage and costs:
- Stop a workstation when it is not actively in use to reduce costs.
- Start a stopped workstation when you are ready to resume work.
These controls are available from the Workstations interface.
Lifecycle Management
Workstations support automated lifecycle management, enabling researchers to define schedules for automatic starting and stopping:
- Open your workstation's details pane.
- Navigate to the lifecycle management section.
- Define a schedule or set of conditions (e.g., start at 9:00 AM, stop at 6:00 PM on weekdays).
- The workstation will automatically start and stop according to the defined schedule.
This feature helps optimise resource consumption and manage costs by ensuring workstations are not left running outside of active research hours.
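The platform applies the schedule for you; the sketch below simply makes the example rule above (start 9:00 AM, stop 6:00 PM on weekdays) explicit, so you can reason about when a scheduled workstation will be running. Standard library only.

```python
# Illustration of the example lifecycle schedule above: running 09:00-18:00
# on weekdays only. The TRE enforces the actual schedule -- this function
# just expresses the same rule in code.
from datetime import datetime, time

def should_be_running(now: datetime) -> bool:
    """True if a workstation on the weekday 09:00-18:00 schedule
    would be running at `now`."""
    is_weekday = now.weekday() < 5                      # Mon=0 .. Fri=4
    in_hours = time(9, 0) <= now.time() < time(18, 0)   # stop at 18:00
    return is_weekday and in_hours

print(should_be_running(datetime(2024, 1, 15, 10, 30)))  # Monday 10:30 -> True
print(should_be_running(datetime(2024, 1, 13, 10, 30)))  # Saturday -> False
```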
Monitoring Resources and Costs
Resource Monitoring
The Monitoring tab within the workstation details pane provides real-time metrics on:
- CPU usage
- Memory consumption
- Disk space utilisation
- Other relevant performance statistics
Use these metrics to identify resource bottlenecks and optimise your analysis workflows.
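Alongside the Monitoring tab, you can take a quick resource snapshot from a terminal inside a Linux workstation. The sketch below uses only the Python standard library; note that `os.getloadavg` is Unix-specific.

```python
# Quick resource snapshot from inside a Linux workstation, complementing
# the Monitoring tab. Standard library only.
import os
import shutil

cpus = os.cpu_count()
load1, load5, load15 = os.getloadavg()   # 1/5/15-minute load averages (Unix)
disk = shutil.disk_usage("/")

print(f"CPUs: {cpus}, 1-min load: {load1:.2f}")
print(f"Disk: {disk.used / 1e9:.1f} GB used of {disk.total / 1e9:.1f} GB")
```

A sustained 1-minute load well above the CPU count, or a nearly full disk, is the kind of bottleneck the Monitoring tab will also surface.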
Cost Tracking
The Cost Chart within the workstation details pane provides detailed cost breakdowns:
- Daily, weekly, and monthly cost views based on resource consumption.
- Cost estimations for different configuration options, helping researchers manage budgets.
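As a back-of-envelope illustration of how configuration choices translate into cost, the sketch below combines a compute rate with a storage rate. Both rates are invented placeholders; use the figures shown in your project's Cost Chart.

```python
# Back-of-envelope monthly cost estimate for a workstation configuration.
# The rates below are hypothetical placeholders, not Quark pricing.
HOURLY_RATE = 0.40      # hypothetical compute cost per running hour
STORAGE_RATE = 0.00015  # hypothetical storage cost per GB-hour

def monthly_cost(running_hours_per_day, storage_gb, days=30):
    compute = HOURLY_RATE * running_hours_per_day * days
    storage = STORAGE_RATE * storage_gb * 24 * days  # storage accrues 24/7
    return round(compute + storage, 2)

# A 9:00-18:00 weekday schedule averages roughly 6.4 running hours/day,
# so a 30 GB workstation on that schedule might cost about:
print(monthly_cost(running_hours_per_day=6.4, storage_gb=30))  # 80.04
```

Note that storage is billed whether or not the workstation is running, which is why stopping idle workstations reduces but does not eliminate cost.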
Event Logging
The Event Log automatically records all workstation activities:
- Workstation creation, launch, start, stop, and termination events.
- Each event includes a timestamp, user information, and status updates.
- The event log is available for auditing and troubleshooting purposes.
What's Next
- Download your analysis results — See Downloading Data Results.
- Run a bioinformatics pipeline — See Dataset Analysis using Pipelines.