Merge pull request #1766 from alan-turing-institute/processes

Restructure processes section of docs
alan-turing-institute · Mar 28, 2024 · dc06f48
2 parents 0a9d037 + fbab6b6
commit dc06f48
Showing 26 changed files with 151 additions and 407 deletions.
diff --git a/docs/source/design/security/index.md b/docs/source/design/security/index.md
@@ -6,6 +6,7 @@
 
 objectives.md
 technical_controls.md
+software_package_approval.md
 reference_configuration.md
 ```
 
@@ -15,5 +16,8 @@ reference_configuration.md
 [Built-in technical controls](technical_controls.md)
 : Default technical controls built-in to the Data Safe Haven
 
+[Software package approval](software_package_approval.md)
+: Policy and process for adding packages to the _core_ allowlists
+
 [Turing security configuration](reference_configuration.md)
 : Security configuration used at The Alan Turing Institute to try to meet these objectives
diff --git a/...ce/processes/software_package_approval.md → ...ign/security/software_package_approval.md b/...ce/processes/software_package_approval.md → ...ign/security/software_package_approval.md
diff --git a/docs/source/design/security/technical_controls.md b/docs/source/design/security/technical_controls.md
@@ -89,22 +89,22 @@ Note that this means that eg. password managers cannot be used to autofill a {re
 
 ### Sign-off on bringing data into the environment:
 
-- **{ref}`policy_tier_2` and {ref}`policy_tier_3`:** At the Alan Turing Institute all three of {ref}`role_investigator`, {ref}`role_data_provider_representative` and {ref}`role_referee` must agree the data is suitable for the environment Tier.
+- **{ref}`policy_tier_2` and {ref}`policy_tier_3`:** At the Alan Turing Institute all three of {ref}`role_investigator`, {ref}`role_data_provider_representative` and referee must agree the data is suitable for the environment Tier.
 - **{ref}`policy_tier_0` and {ref}`policy_tier_1`:** At the Alan Turing Institute both {ref}`role_investigator` and {ref}`role_data_provider_representative` must agree the data is suitable for the environment Tier.
 
 ### Sign-off on bringing data out of the environment:
 
-- **{ref}`policy_tier_2` and {ref}`policy_tier_3`:** At the Alan Turing Institute all three of {ref}`role_investigator`, {ref}`role_data_provider_representative` and {ref}`role_referee` must agree the data is suitable for its destination before it is egressed from an SRE.
+- **{ref}`policy_tier_2` and {ref}`policy_tier_3`:** At the Alan Turing Institute all three of {ref}`role_investigator`, {ref}`role_data_provider_representative` and referee must agree the data is suitable for its destination before it is egressed from an SRE.
 - **{ref}`policy_tier_0` and {ref}`policy_tier_1`:** At the Alan Turing Institute both {ref}`role_investigator` and {ref}`role_data_provider_representative` must agree the data is suitable for its destination before it is egressed from an SRE.
 
 ### Sign-off on adding new users:
 
-- **{ref}`policy_tier_3`:** At the Alan Turing Institute the {ref}`role_investigator` and {ref}`role_referee` must both authorise access to an SRE at {ref}`policy_tier_3`.
+- **{ref}`policy_tier_3`:** At the Alan Turing Institute the {ref}`role_investigator` and referee must both authorise access to an SRE at {ref}`policy_tier_3`.
 - **{ref}`policy_tier_0` to {ref}`policy_tier_2`:** At the Alan Turing Institute the {ref}`role_investigator` can authorise access to an SRE at {ref}`policy_tier_0` to {ref}`policy_tier_2`.
 
 ### Sign-off on bringing external code/software into the environment:
 
-- **{ref}`policy_tier_3`:** At the Alan Turing Institute both the {ref}`role_investigator` and {ref}`role_referee` must authorise the ingress of code or software to an SRE at {ref}`policy_tier_3`.
+- **{ref}`policy_tier_3`:** At the Alan Turing Institute both the {ref}`role_investigator` and referee must authorise the ingress of code or software to an SRE at {ref}`policy_tier_3`.
 - **{ref}`policy_tier_0` to {ref}`policy_tier_2`:** At the Alan Turing Institute the {ref}`role_investigator` can authorise ingress of code or software to an SRE at {ref}`policy_tier_0` to {ref}`policy_tier_2`.
 
 ### Python/R package availability:

diff --git a/docs/source/index.md b/docs/source/index.md
@@ -7,7 +7,6 @@
 overview/index.md
 design/index.md
 deployment/index.md
-processes/index.md
 roles/index.md
 contributing/index.md
 ```
@@ -35,20 +34,15 @@ The documentation for this project covers several different topics.
 You can read them through in order or simply jump to the section that you are most interested in.
 
 - [**Overview**](overview/index.md)
-    - if you want an overview of what the Data Safe Haven project is about.
-
+    - If you want an overview of what the Data Safe Haven project is about.
 - [**Design**](design/index.md)
-    - if you want details about the technical design of the Data Safe Haven.
-
+    - If you want details about the technical design of the Data Safe Haven.
 - [**Deployment**](deployment/index.md)
-    - if you want to deploy your own Data Safe Haven.
-
-- [**Processes**](processes/index.md)
-    - processes necessary to use the Data Safe Haven.
-
+    - If you want to deploy your own Data Safe Haven.
 - [**Roles**](roles/index.md)
-    - information about the different user roles in the Data Safe Haven.
-    - if you're using a Data Safe Haven that someone else has deployed then start here.
+    - Information about the different user roles in the Data Safe Haven.
+    - Instructions and advice for the actions of different user roles.
+    - If you're using a Data Safe Haven that someone else has deployed then start here.
 
 ## Legal
 

diff --git a/docs/source/overview/index.md b/docs/source/overview/index.md
@@ -7,6 +7,7 @@
 what_is_dsh.md
 why_use_dsh.md
 sensitivity_tiers.md
+using_dsh.md
 ```
 
 ## Background and concepts
@@ -20,6 +21,9 @@ sensitivity_tiers.md
 [Sensitivity tiers](sensitivity_tiers.md)
 : Details of the five sensitivity tiers that projects are classified into.
 
+[Before using the Data Safe Haven](using_dsh.md)
+: Important considerations for using the Data Safe Haven to enable secure research.
+
 ## Further resources
 
 [One-page poster](https://doi.org/10.6084/m9.figshare.11815224)

diff --git a/docs/source/overview/using_dsh.md b/docs/source/overview/using_dsh.md
@@ -0,0 +1,86 @@
+(using_dsh)=
+
+# Before using the Data Safe Haven
+
+This page contains some important considerations you must take into account before using the Data Safe Haven.
+Where appropriate, there are links to external resources, including policies and processes used at The Turing.
+
+```{warning}
+Use of a Data Safe Haven is not by itself sufficient to guarantee the security of your data! It must be paired with appropriate information governance requirements and user agreements.
+```
+
+```{warning}
+Each organisation deploying their own instance of the Data Safe Haven is responsible for verifying their Data Safe Haven instance is deployed as expected and that the deployed configuration effectively supports their own information governance policies and processes.
+
+Each organisation deploying their own instance of the Data Safe Haven is responsible for verifying that the instance is configured as expected. The organisation is also responsible for confirming that the deployed configuration is appropriate for their purposes and effectively supports their own information governance policies and processes. We provide the Data Safe Haven code and material on an ‘as is’ basis without warranties of any kind and you use the code and supporting materials at your own cost and risk.
+```
+
+```{tip}
+In terms of the [Five Safes framework](https://ukdataservice.ac.uk/help/secure-lab/what-is-the-five-safes-framework/) the Data Safe Haven is aiming to be a Safe Setting.
+```
+
+## What is needed to run a Data Safe Haven?
+
+The code of this project is not on its own sufficient to operate a secure environment for research on sensitive data.
+In fact, _any_ functional TRE is not just code and infrastructure, but also people, policies, and processes.
+This project provides code to deploy a TRE with a particular architecture and the documentation gives instructions and advice for operating an instance.
+Every group deploying the Data Safe Haven will need to provide the rest, including:
+
+- Information governance processes
+    - How to approve data ingress and egress
+    - Classifying work into Data Safe Haven tiers
+- Mapping internal roles or people to Data Safe Haven roles
+- Staffing essential roles such as system and programme managers
+- Data security incident handling procedures
+- Financial planning
+- Supporting infrastructure such as
+    - Communication channels
+    - Domains and DNS
+    - Secure methods to share SAS tokens
+    - Programme management tools
+
+The [Standard Architecture for Trusted Research Environments (SATRE)](https://satre-specification.readthedocs.io) project is a useful reference for TRE design.
+It features a comprehensive set of requirements, technical and non-technical, that a TRE operator should meet.
+An evaluation of the Data Safe Haven production instances at the Turing against SATRE can be found [here](https://satre-specification.readthedocs.io/evaluations/alan_turing_institute).
+
+## Tiering
+
+[Tiering](sensitivity_tiers.md) is a fundamental part of DSH.
+The code deploys Secure Research Environments with four levels of technical control to meet five tiers of sensitivity classification.
+These tiers are explained in the section [](design_security_objectives).
+
+Each organisation will need to decide how to use the available tiers and establish a process for deciding which tier is appropriate for each project.
+This will require careful consideration of the organisation's risk appetite, balancing the value of enabling work against the risks of data disclosure.
+
+The project classification process used at the Turing is described [here](https://alan-turing-institute.github.io/trusted-research/tasks/setting_up_tre/project_initialisation/project_classification.html).
+This process considers work packages, which cover the combination of all input data and the planned work when making a classification.
+That approach better captures the risks associated with merging data sets and also considers the sensitivity of intended outputs.
+
+## Role mapping
+
+The Data Safe Haven is designed with a number of [roles](roles) required for secure operation.
+Importantly, some of these roles are mutually exclusive.
+That is because one person holding multiple roles may circumvent security controls.
+For example, a Researcher should not also be a System Manager as they would be able to conduct data ingress and egress, mock other users or create new user accounts.
+
+These roles are specific to the Data Safe Haven.
+Organisations will need to decide how to map their existing roles to Data Safe Haven roles or how to otherwise populate them.
+
+## Bad actors
+
+The technical controls of the Data Safe Haven cannot protect you completely against bad actors.
+There are sensible restrictions to limit what users are able to do.
+For example, outbound network connections are strictly controlled to prevent users uploading sensitive data to the public internet.
+However, the design assumes that users are generally trustworthy and good actors.
+It is therefore necessary to have confidence in the identity of users and make a decision on whether to trust them.
+
+TRE operators should consider how they balance different types of risk.
+The [Five Safes framework](https://ukdataservice.ac.uk/help/secure-lab/what-is-the-five-safes-framework/) is useful for addressing this.
+If you have low confidence in the safety of people, for example untrusted users, you will need to compensate in other areas.
+
+## At the Turing
+
+Our production instances of the Data Safe Haven are managed by a dedicated team at the Turing.
+Their processes and policies are open and can be read [here](https://alan-turing-institute.github.io/trusted-research).
+The Turing provides no guarantee for anyone following its processes and assumes no responsibility for others running a Data Safe Haven instance.
+Organisations must carefully consider the risks themselves and decide what is acceptable to them.
diff --git a/docs/source/overview/why_use_dsh.md b/docs/source/overview/why_use_dsh.md
@@ -4,20 +4,6 @@ The Data Safe Haven is our implementation of a TRE following the principles we l
 We provide a set of instructions that will allow you to set up your own secure environment with some default security controls.
 Our aim throughout has been to make the environments [reproducible](why_reproducible), [usable](why_usable), [secure](why_secure), [cloud-native](why_cloud_native) and [open source](why_open_source).
 
-```{warning}
-Use of a Data Safe Haven is not by itself sufficient to guarantee the security of your data! It must be paired with appropriate information governance requirements and user agreements.
-```
-
-```{warning}
-Each organisation deploying their own instance of the Data Safe Haven is responsible for verifying their Data Safe Haven instance is deployed as expected and that the deployed configuration effectively supports their own information governance policies and processes.
-
-Each organisation deploying their own instance of the Data Safe Haven is responsible for verifying that the instance is configured as expected. The organisation is also reponsible for confirming that the deployed configuration is appropriate for their purposes and effectively supports their own information governance policies and processes. We provide the Data Safe Haven code and material on an ‘as is’ basis without warranties of any kind and you use the code and supporting materials at your own cost and risk.
-```
-
-```{tip}
-In terms of the [Five Safes framework](https://ukdataservice.ac.uk/help/secure-lab/what-is-the-five-safes-framework/) the Data Safe Haven is aiming to be a Safe Setting.
-```
-
 (why_reproducible)=
 
 ## Reproducible
@@ -65,5 +51,5 @@ We also hope that you will contribute any improvements back to the main project.
 You are responsible for verifying the Data Safe Haven is appropriate for your purposes and effectively supports your own information governance policies and processes.
 
 ```{warning}
-The Data Safe Haven is not a managed service offered by the Alan Turing Institute. It is a set of instructions enabling you to set up your own secure environment
+The Data Safe Haven is not a managed service offered by the Alan Turing Institute. It is a set of instructions enabling you to set up your own secure environment.
 ```
diff --git a/docs/source/processes/data_access_controls.md b/docs/source/processes/data_access_controls.md