Datasets
Overview
The Datasets section gives the DS Administrator visibility into all cohorts that researchers have requested for their projects, and the full catalog of datasets registered on the platform.
This section is central to data governance. The DS Administrator can monitor what sensitive data is being surfaced, review the status and history of cohort access, revoke access where appropriate, and browse the underlying dataset catalog that cohorts are built from.
Datasets is organised into two tabs: Cohorts and Catalog. The section opens on the Cohorts tab by default.
Navigation: Select Datasets from the left-hand navigation pane.
Catalog Tab
Select the Catalog tab to view the full list of datasets registered on the platform.

Each card shows:
| Field | Description |
|---|---|
| Name | The dataset's identifier, with a copy icon to copy it to the clipboard. This is useful when supplying the identifier as a pipeline input. |
| Description | A plain-language summary of the dataset's contents and scope. |
| Tags | Key-value metadata describing the dataset (e.g., type: genomics, condition: lung cancer, nsclc, gene: EGFR). Tags vary by dataset and reflect the dataset's catalog metadata. |
| Last Updated | The date the dataset entry was last updated, shown with a clock icon. |
Use the search bar to find a dataset by name or description. Once every entry has been loaded, the footer displays "All catalog items loaded".
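
For readers who think in code, the search behaviour described above can be pictured as a filter over catalog entries. The following is a minimal sketch only: `CatalogEntry` and `search_catalog` are hypothetical names, the sample descriptions are invented for illustration, and case-insensitive substring matching is an assumption, since the platform's actual matching rules are not documented here.

```python
from dataclasses import dataclass, field


@dataclass
class CatalogEntry:
    """Hypothetical stand-in for one dataset card in the Catalog tab."""
    name: str
    description: str
    # Key-value tags; a key may carry several values (e.g. condition).
    tags: dict[str, list[str]] = field(default_factory=dict)


def search_catalog(entries: list[CatalogEntry], query: str) -> list[CatalogEntry]:
    """Return entries whose name or description contains the query.

    Assumption: matching is a case-insensitive substring test.
    An empty query returns every entry.
    """
    needle = query.strip().lower()
    if not needle:
        return list(entries)
    return [
        e for e in entries
        if needle in e.name.lower() or needle in e.description.lower()
    ]


# Illustrative data only; not taken from a real catalog.
catalog = [
    CatalogEntry(
        name="impact",
        description="Genomic profiling of lung cancer samples",
        tags={"type": ["genomics"], "condition": ["lung cancer", "nsclc"], "gene": ["EGFR"]},
    ),
    CatalogEntry(name="registry", description="Outpatient visit records"),
]

matches = search_catalog(catalog, "Lung")
print([e.name for e in matches])
```

Tags are deliberately left out of the match, because the search bar is described as covering name and description only.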
Note: The DS Administrator can view dataset catalog entries and their summaries. Publishing new datasets to the platform is the responsibility of the Infrastructure Administrator.
Dataset Summary Dashboard
Click on any dataset card to open its Cohort Summary dashboard. For a catalog-level dataset, this displays:
- The dataset name, with a copy icon, and its description.
- Number of Persons and Total Records aggregate tiles.
- Distribution charts for Gender, Race, Ethnicity, and Top Conditions across the entire dataset.

This view gives the DS Administrator a high-level picture of a dataset's scope and population — useful for assessing its relevance to active projects before users build cohorts from it.
Cohorts Tab
The Cohorts tab lists every cohort that has been requested across your projects, displayed as cards.

Searching and Filtering
Use the search bar to find a cohort by name. Use the status dropdown (defaults to All) to filter the list to a specific cohort status.
Reading a Cohort Card
Each card shows:
| Field | Description |
|---|---|
| Name | The cohort's name, with a copy icon to copy it to the clipboard. |
| Status badge | The current status of the cohort access request — see Cohort Statuses below. |
| Description | A summary of the cohort's composition, where provided. |
| Requesting User | The user who submitted the cohort request. |
| Created Date | The date the cohort request was created. |
| Expiry Tag | Indicates how much longer the cohort's access grant remains valid — Expires in X day(s), Expires today, or Expired. |
Tip: The icon next to the cohort name indicates the type of access request — a single-user icon typically represents a request for an entire dataset, while a group icon represents a derived cohort built from a subset of a dataset.
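
The three Expiry Tag wordings listed in the table above can be read as a simple function of the expiry date. The sketch below is an illustration under stated assumptions: `expiry_tag` is a hypothetical helper, and it assumes the tag is computed at whole-day granularity, with a grant counting as "Expires today" on its expiry date. The platform's exact cut-off time is not specified here.

```python
from datetime import date


def expiry_tag(expiry: date, today: date) -> str:
    """Map an expiry date to one of the three tag wordings.

    Assumption: comparison is by calendar day, not by time of day.
    """
    days_left = (expiry - today).days
    if days_left < 0:
        return "Expired"
    if days_left == 0:
        return "Expires today"
    return f"Expires in {days_left} day(s)"


today = date(2024, 5, 7)
for expiry in (date(2024, 5, 10), date(2024, 5, 7), date(2024, 5, 1)):
    print(expiry.isoformat(), "->", expiry_tag(expiry, today))
```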
Cohort Statuses
| Status | Description |
|---|---|
| Pending Approval | The request has been submitted and is awaiting review. |
| Approved | The request has been approved and the requesting user has access to the cohort data. |
| Revoked | Access to the cohort has been manually revoked by an administrator. |
| Expired | The cohort's access grant has passed its expiry date. |
Card Actions
Each cohort card has one or two action icons in the bottom-right corner:
| Icon | Action | Available When |
|---|---|---|
| Timeline | Open the Access Timeline panel for this cohort. | Always |
| Revoke (red, crossed-out file) | Revoke the user's access to this cohort. | Approved or Expired |
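
The "one or two action icons" rule follows directly from the cohort's status. As a rough sketch, with `CohortStatus` and `card_actions` as hypothetical names that are not part of any platform API, the table above can be expressed as:

```python
from enum import Enum


class CohortStatus(Enum):
    """The four statuses listed under Cohort Statuses."""
    PENDING_APPROVAL = "Pending Approval"
    APPROVED = "Approved"
    REVOKED = "Revoked"
    EXPIRED = "Expired"


# Statuses for which the Revoke icon is shown, per the Card Actions table.
REVOCABLE = {CohortStatus.APPROVED, CohortStatus.EXPIRED}


def card_actions(status: CohortStatus) -> list[str]:
    """Return the action icons shown on a cohort card for a given status."""
    actions = ["Timeline"]  # always available
    if status in REVOCABLE:
        actions.append("Revoke")
    return actions


for status in CohortStatus:
    print(f"{status.value}: {card_actions(status)}")
```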
Access Timeline
Click the Timeline icon on any cohort card to open the Access Timeline panel on the right side of the screen.


The panel shows a reverse-chronological history of status changes for the cohort, including:
- Approved — when the request was approved, and by whom.
- Approval Requested — when the request was originally submitted, and by whom.

Each entry shows the user responsible for the action and the exact timestamp. Use the search field at the top of the panel to filter the timeline if it contains a long history.
Revoking Cohort Access
If a user no longer requires access to a cohort — for example, their project has concluded, or a governance concern has been identified — the DS Administrator can revoke access directly.
To revoke access:
- Locate the cohort card with Approved or Expired status.
- Click the red Revoke icon in the bottom-right corner of the card.
- Confirm the action when prompted.

Once revoked:
- The cohort's status badge updates to Revoked.
- The revocation is recorded in the Access Timeline for the cohort.
- The requesting user loses access to the cohort's underlying data.

Note: Revoking access does not delete the cohort definition itself — it only removes the user's access to the underlying data. The cohort remains visible in the Cohorts tab for audit purposes.
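
The effects of a revocation described above (status change, a new timeline entry, and a cohort definition that stays in place) can be modelled in a few lines. This is a conceptual sketch rather than the platform's implementation: `Cohort`, `TimelineEvent`, `revoke`, and `timeline_view` are hypothetical names, and the user names in the example are invented.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from typing import Optional


@dataclass
class TimelineEvent:
    action: str
    user: str
    timestamp: datetime


@dataclass
class Cohort:
    name: str
    search_query: str  # the cohort definition; untouched by revocation
    status: str
    timeline: list[TimelineEvent] = field(default_factory=list)


def revoke(cohort: Cohort, admin: str, now: Optional[datetime] = None) -> None:
    """Revoke access: update the status and record the event."""
    if cohort.status not in ("Approved", "Expired"):
        raise ValueError(f"cannot revoke a cohort in status {cohort.status!r}")
    cohort.status = "Revoked"
    when = now or datetime.now(timezone.utc)
    cohort.timeline.append(TimelineEvent("Revoked", admin, when))


def timeline_view(cohort: Cohort) -> list[TimelineEvent]:
    """Return events newest first, as the Access Timeline panel lists them."""
    return sorted(cohort.timeline, key=lambda e: e.timestamp, reverse=True)


cohort = Cohort(
    name="example-cohort",
    search_query="Dataset = impact",
    status="Approved",
    timeline=[
        TimelineEvent("Approval Requested", "researcher", datetime(2024, 5, 1, tzinfo=timezone.utc)),
        TimelineEvent("Approved", "ds-admin", datetime(2024, 5, 2, tzinfo=timezone.utc)),
    ],
)
revoke(cohort, "ds-admin", now=datetime(2024, 5, 7, tzinfo=timezone.utc))
print(cohort.status, [e.action for e in timeline_view(cohort)])
```

Note that `revoke` never modifies `search_query`, which mirrors the point that the cohort definition survives revocation and remains available for audit.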
Approved Cohort Summary
Click on any approved cohort card (other than its action icons) to open the Cohort Summary dashboard for that cohort.

The dashboard displays:
- The cohort name, with a copy icon.
- Status badges, including the cohort's Status (e.g., Approved, Revoked), Auto Updates (On/Off — indicates whether the cohort's results refresh automatically as the underlying dataset is updated), and its expiry tag.
- The Search Query used to define the cohort (e.g., Dataset = impact).
- The requesting user.
- A Timeline icon to open the Access Timeline directly from this view.
Aggregate Statistics
Below the header, summary tiles show the Number of Persons and Total Records in the requested cohort. The dashboard also includes a set of distribution charts that visualise the composition of the cohort, such as:
- Year of Birth
- Gender Distribution
- Race Distribution
- Ethnicity Distribution
- Top Conditions
- Top Drugs (the most prevalent drug exposures in the cohort)
Age Plot and KM Survival
For approved cohorts with active data access, two additional interactive charts are available:
- Age Plot — age distribution filterable by Type (e.g., Drug, Condition) and Value (e.g., a specific drug name).
- KM Survival — a Kaplan-Meier survival curve, filterable by the same Type and Value selectors, with Upper CI, Lower CI, and All series. Use the download icon to export the chart.
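
For readers unfamiliar with the KM Survival chart, the central "All" curve is a Kaplan-Meier estimate: at each time where an event occurs, the running survival probability is multiplied by the fraction of at-risk people who did not have the event. The sketch below shows the standard product-limit calculation on invented data. It is not the platform's code, `kaplan_meier` is a hypothetical helper, and the Upper CI and Lower CI series are omitted.

```python
from collections import Counter


def kaplan_meier(durations, observed):
    """Return [(time, survival_probability)] at each distinct event time.

    durations: follow-up time for each person
    observed:  True if the event occurred at that time, False if censored
    """
    events = Counter(t for t, e in zip(durations, observed) if e)
    exits = Counter(durations)  # everyone leaves the risk set at their own time

    at_risk = len(durations)
    survival = 1.0
    curve = []
    for t in sorted(exits):
        d = events.get(t, 0)
        if d:
            # People censored at time t still count as at risk at t.
            survival *= 1 - d / at_risk
            curve.append((t, survival))
        at_risk -= exits[t]
    return curve


# Invented follow-up data: five people, two of them censored.
curve = kaplan_meier([5, 6, 6, 8, 10], [True, False, True, True, False])
for t, s in curve:
    print(t, round(s, 3))
```

Censored observations never lower the curve themselves; they only shrink the risk set used at later event times.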
Cohort Table
For approved cohorts, a Person Table lists the individual person-level records, which are not available to the user before approval.
Each column has a search field to filter the table. Use the Person / Specimen dropdown to switch the table between person-level and specimen-level views. Use the filter and download icons to refine and export the table data.
Note: For cohorts with Revoked status, the Cohort Summary dashboard may display its charts and tables as empty placeholders, reflecting that the underlying data is no longer accessible.
What's Next
- Metadata — The quality and consistency of metadata directly affects how discoverable and queryable datasets and cohorts are. Configure validation rules to improve data findability.
- Requests — New cohort and dataset access requests are reviewed and actioned here. The Cohorts tab in Datasets reflects the resulting status and provides ongoing oversight, including revocation.
- Audit Logs — Review access events for datasets and cohorts, including approvals and revocations, to support compliance and data governance reporting.