diff --git a/docs/dbgap_info.md b/docs/dbgap_info.md new file mode 100644 index 000000000..8cc92e763 --- /dev/null +++ b/docs/dbgap_info.md @@ -0,0 +1,69 @@ +# dbGaP Information (as understood by Gen3) + +The [Database for Genotypes and Phenotypes (dbGaP)](https://www.ncbi.nlm.nih.gov/gap/) is used to "archive and distribute the data and results from studies that have investigated the interaction of genotype and phenotype in Humans". + +> NOTE: For official details about dbGaP please visit the official site/documentation. + +The largest unit of data that can be submitted to dbGaP is a *Study*. Studies can have sub-studies. Each study is identified by a unique study number (AKA phsid AKA study accession) and additional information (like version), which may look something like `phs001826.v1.p1.c1`. The `.` delimits the individual pieces of information. + +* `phs001826`: unique study identifier +* `v1`: data version +* `p1`: participant set version +* `c1`: consent group version + +The combination of these fields is known as a *dbGaP Accession Number*. + +More information about this can be found in [this NCBI article](https://www.ncbi.nlm.nih.gov/pmc/articles/PMC2031016/). + +## Authorization + +Fence is capable of syncing user access information from dbGaP's *Telemetry Files* (AKA study whitelists). These are typically provided on an SFTP server as CSV or TXT files. You can see an example of the format in the Fence unit tests (`/tests/dbgap_sync`). + +A single *Telemetry File* represents the access allowed for a given dbGaP study accession. The file will contain rows of eRA Commons user IDs and other information. Fence is able to parse all *Telemetry Files* on an SFTP server (given credentials for the SFTP server). + +## Consent Groups + +Fence contains a configuration for whether or not to parse consent codes (at the time of writing, it is `parse_consent_code` in the `dbGaP` block). 
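As a rough illustration of what consent-code parsing does to an accession number, the sketch below mirrors the branch in `_parse_csv` from this change. The function name `project_from_accession` is hypothetical and not part of Fence, and the sketch assumes the accession has already been read from a telemetry file as a dot-delimited string.

```python
def project_from_accession(accession, parse_consent_code=True):
    """Reduce a dbGaP accession number to the project name used for authorization.

    Hypothetical helper: it keeps the study identifier and, when consent-code
    parsing is enabled, appends the last dot-delimited field as the consent group.
    """
    parts = accession.split(".")
    project = parts[0]  # e.g. "phs001826"
    if len(parts) > 1 and parse_consent_code:
        # The final field is treated as the consent group (e.g. "c1");
        # the version and participant-set fields are dropped.
        project += "." + parts[-1]
    return project


print(project_from_accession("phs001826.v1.p1.c1"))
print(project_from_accession("phs001826.v1.p1.c1", parse_consent_code=False))
```

With parsing enabled the first call yields `phs001826.c1`; with it disabled the second yields `phs001826`. Like the code it imitates, the sketch does not check that the last field actually starts with `c`.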
+ +> NOTE: Reference the `config-default.yaml` for current configuration options and further details. + +When parsing consent codes, the authorization resources a user is given access to will be in the form `study_id.consent_group` (ex: `phs001826.c2`). + +### Consent Group `c999` Handling + +The consent group `c999` is interpreted as meaning the user should implicitly have access +to **all** available consent groups within the study. It can additionally be interpreted +as providing access to that study's "exchange area" (in addition to the parent study's +"common exchange area"). + +Fence will consolidate all known consents for a given study and then provide any user +with `c999` access to all those consents (including `.c999` explicitly, to represent +that study's exchange area). + +Fence allows configuring whether or not you want to handle the "common exchange area" logic +mentioned above (at the time of writing, it is `enable_common_exchange_area_access` in the `dbGaP` block). + +When turned on, you can provide a mapping from study identifiers (ex: `phs000123`, `phs000456`) to the resource you want to represent their parent study's common exchange area (ex: `123_and_456_common_exchange_area`) in Fence's configuration file (at the time of writing, it is `study_common_exchange_areas` in the `dbGaP` block). + +> NOTE: Again, please see the `config-default.yaml` for more information about available configurations. + +For example, `c999` would be handled slightly differently based on configuration. 
Below, assume a user has access to the `c999` consent group: + +|| **Consent Codes Cfg == True** | **Consent Codes Cfg == False** | +|---| ------------- | ------------- | +| **Common Exchange Cfg == True** | access to: common exchange area (if phsid in cfg mapping) + study-specific exchange area + all consent codes | c999 ignored, access to phsid w/o consent | +| **Common Exchange Cfg == False** | access to: study-specific exchange area + all consent codes | c999 ignored, access to phsid w/o consent | + +So the user access granted in a situation with `phs000123.c999` (assuming there exists a +`phs000123.c1` and `phs000123.c2`): + +|| **Consent Codes Cfg == True** | **Consent Codes Cfg == False** | +|---| ------------- | ------------- | +| **Common Exchange Cfg == True** | `test_common_exchange_area` + `phs000123.c999` + `phs000123.c1`, `phs000123.c2` | `phs000123` | +| **Common Exchange Cfg == False** | `phs000123.c999` + `phs000123.c1`, `phs000123.c2` | `phs000123` | + +> NOTE: On the resource level, `phs000123.c999` should refer to resources that exist in that study's specific exchange area. Resources in the parent's common exchange area should be controlled via `test_common_exchange_area`. + +## Version Updates + +When a study is updated, its participants and consent groups may change, and the version number (`v1`) is bumped. At the moment, Fence does not handle these versions, so authorization is effectively either study level, or study+consent level. \ No newline at end of file diff --git a/fence/config-default.yaml b/fence/config-default.yaml index 66f1715b9..7e94ea3b5 100644 --- a/fence/config-default.yaml +++ b/fence/config-default.yaml @@ -383,15 +383,51 @@ dbGaP: proxy_user: '' protocol: 'sftp' decrypt_key: '' + # parse out the consent from the dbgap accession number such that something + # like "phs000123.v1.p1.c2" becomes "phs000123.c2". 
+ # + # NOTE: when this is "false" the above would become "phs000123" parse_consent_code: true - # A mapping from the dbgap study to which authorization namespaces the actual data - # lives in. For example, `studyX` data may exist in multiple organizations, so - # we need to know to map authorization to all orgs resources + # A consent of "c999" can indicate access to that study's "exchange area data" + # and when a user has access to one study's exchange area data, they + # have access to the parent study's "common exchange area data" that is not study + # specific. The following config is whether or not to parse/handle "c999" codes + # for access to the common exchange area data + # + # NOTE: When enabled you MUST also provide a mapping to the + # `study_common_exchange_areas` from study -> parent common exchange area resource + enable_common_exchange_area_access: false + # The below configuration is a mapping from studies to their "common exchange area data" + # Fence project name a user gets access to when parsing c999 exchange area codes (and + # subsequently gives access to an Arborist resource representing this common area + # as well) + study_common_exchange_areas: + 'example': 'test_common_exchange_area' + # 'studyX': 'test_common_exchange_area' + # 'studyY': 'test_common_exchange_area' + # 'studyZ': 'test_common_exchange_area' + # A mapping from the dbgap study / Fence project to which authorization namespaces the + # actual data lives in. 
For example, `studyX` data may exist in multiple organizations, so + # we need to know how to map authorization to all orgs resources study_to_resource_namespaces: '_default': ['/'] - 'studyX': ['/orgA/', '/orgB/'] - 'studyY': ['/orgB/', '/orgC/'] - 'studyZ': ['/orgD/'] + 'test_common_exchange_area': ['/dbgap/'] + # above are for default support and exchange area support + # below are further examples + # + # 'studyX': ['/orgA/', '/orgB/'] + # 'studyX.c2': ['/orgB/', '/orgC/'] + # 'studyZ': ['/orgD/'] + +# Regex to match an assession number that has consent information in forms like: +# phs00301123.c999 +# phs000123.v3.p1.c3 +# phs000123.c3 +# phs00301123.v3.p4.c999 +# Will NOT MATCH forms like: phs000123 +# +# WARNING: Do not change this without consulting the code that uses it +DBGAP_ACCESSION_WITH_CONSENT_REGEX: '(?Pphs[0-9]+)(.(?Pv[0-9]+)){0,1}(.(?Pp[0-9]+)){0,1}.(?Pc[0-9]+)' # ////////////////////////////////////////////////////////////////////////////////////// # STORAGE BACKENDS AND CREDENTIALS @@ -667,4 +703,4 @@ DREAM_CHALLENGE_TEAM: 'DREAM' DREAM_CHALLENGE_GROUP: 'DREAM' SYNAPSE_URI: 'https://repo-prod.prod.sagebase.org/auth/v1/' SYNAPSE_DISCOVERY_URL: -SYNAPSE_AUTHZ_TTL: 86400 \ No newline at end of file +SYNAPSE_AUTHZ_TTL: 86400 diff --git a/fence/sync/sync_users.py b/fence/sync/sync_users.py index cbd0ad1a1..07aaf6f00 100644 --- a/fence/sync/sync_users.py +++ b/fence/sync/sync_users.py @@ -6,6 +6,7 @@ import subprocess as sp import tempfile import yaml +import copy from contextlib import contextmanager from csv import DictReader from io import StringIO @@ -299,6 +300,10 @@ def __init__( self.protocol = dbGaP["protocol"] self.dbgap_key = dbGaP["decrypt_key"] self.parse_consent_code = dbGaP.get("parse_consent_code", True) + self.enable_common_exchange_area_access = dbGaP.get( + "enable_common_exchange_area_access", False + ) + self.study_common_exchange_areas = dbGaP.get("study_common_exchange_areas", {}) self.session = db_session self.driver = 
SQLAlchemyDriver(DB) self.project_mapping = project_mapping or {} @@ -457,43 +462,62 @@ def _parse_csv(self, file_dict, sess, encrypted=True): dbgap_project = phsid[0] if len(phsid) > 1 and self.parse_consent_code: consent_code = phsid[-1] - if consent_code != "c999": - dbgap_project += "." + consent_code + + # c999 indicates full access to all consents and access + # to a study-specific exchange area + # access to at least one study-specific exchange area implies access + # to the parent study's common exchange area + # + # NOTE: Handling giving access to all consents is done at + # a later time, when we have full information about possible + # consents + self.logger.debug( + f"got consent code {consent_code} from dbGaP project " + f"{dbgap_project}" + ) + if ( + consent_code == "c999" + and self.enable_common_exchange_area_access + and dbgap_project in self.study_common_exchange_areas + ): + self.logger.info( + "found study with consent c999 and Fence " + "is configured to parse exchange area data. Giving user " + f"{username} {privileges} privileges in project: " + f"{self.study_common_exchange_areas[dbgap_project]}." + ) + self._add_dbgap_project_for_user( + self.study_common_exchange_areas[dbgap_project], + privileges, + username, + sess, + user_projects, + ) + + dbgap_project += "." 
+ consent_code display_name = row.get("user name", "") + tags = {"dbgap_role": row.get("role", "")} + + # some dbgap telemetry files have information about a researchers PI + if "downloader for" in row: + tags["pi"] = row["downloader for"] + + # prefer name over previous "downloader for" if it exists + if "downloader for names" in row: + tags["pi"] = row["downloader for names"] + user_info[username] = { "email": row.get("email", ""), "display_name": display_name, "phone_number": row.get("phone", ""), - "tags": {"dbgap_role": row.get("role", "")}, + "tags": tags, } if dbgap_project not in self.project_mapping: - if dbgap_project not in self._projects: - - self.logger.debug( - "creating Project in fence from dbGaP study: {}".format( - dbgap_project - ) - ) - - project = self._get_or_create( - sess, Project, auth_id=dbgap_project - ) - - # need to add dbgap project to arborist - if self.arborist_client: - self._add_dbgap_study_to_arborist(dbgap_project) - - if project.name is None: - project.name = dbgap_project - self._projects[dbgap_project] = project - phsid_privileges = {dbgap_project: set(privileges)} - if username in user_projects: - user_projects[username].update(phsid_privileges) - else: - user_projects[username] = phsid_privileges - + self._add_dbgap_project_for_user( + dbgap_project, privileges, username, sess, user_projects + ) for element_dict in self.project_mapping.get(dbgap_project, []): try: phsid_privileges = { @@ -513,6 +537,33 @@ def _parse_csv(self, file_dict, sess, encrypted=True): self.logger.info(e) return user_projects, user_info + def _add_dbgap_project_for_user( + self, dbgap_project, privileges, username, sess, user_projects + ): + """ + Helper function for csv parsing that adds a given dbgap project to Fence/Arborist + and then updates the dictionary containing all user's project access + """ + if dbgap_project not in self._projects: + self.logger.debug( + "creating Project in fence for dbGaP study: {}".format(dbgap_project) + ) + + project 
= self._get_or_create(sess, Project, auth_id=dbgap_project) + + # need to add dbgap project to arborist + if self.arborist_client: + self._add_dbgap_study_to_arborist(dbgap_project) + + if project.name is None: + project.name = dbgap_project + self._projects[dbgap_project] = project + phsid_privileges = {dbgap_project: set(privileges)} + if username in user_projects: + user_projects[username].update(phsid_privileges) + else: + user_projects[username] = phsid_privileges + @staticmethod def sync_two_user_info_dict(user_info1, user_info2): """ @@ -934,6 +985,10 @@ def _sync(self, sess): self.logger.info("dbgap files: {}".format(dbgap_file_list)) permissions = [{"read-storage"} for _ in dbgap_file_list] + if self.parse_consent_code and self.enable_common_exchange_area_access: + self.logger.info( + f"using study to common exchange area mapping: {self.study_common_exchange_areas}" + ) user_projects, user_info = self._parse_csv( dict(list(zip(dbgap_file_list, permissions))), encrypted=True, sess=sess ) @@ -977,10 +1032,15 @@ def _sync(self, sess): self.sync_two_phsids_dict(user_projects_csv, user_projects) self.sync_two_user_info_dict(user_info_csv, user_info) - # privilleges in yaml files overide ones in csv files + # privileges in yaml files overide ones in csv files self.sync_two_phsids_dict(user_yaml.projects, user_projects) self.sync_two_user_info_dict(user_yaml.user_info, user_info) + if self.parse_consent_code: + self._grant_all_consents_to_c999_users( + user_projects, user_yaml.project_to_resource + ) + if user_projects: self.logger.info("Sync to db and storage backend") self.sync_to_db_and_storage_backend(user_projects, user_info, sess) @@ -1017,6 +1077,53 @@ def _sync(self, sess): ) exit(1) + def _grant_all_consents_to_c999_users( + self, user_projects, user_yaml_project_to_resources + ): + access_number_matcher = re.compile(config["DBGAP_ACCESSION_WITH_CONSENT_REGEX"]) + # combine dbgap/user.yaml projects into one big list (in case not all consents + # are in 
either) + all_projects = set( + list(self._projects.keys()) + list(user_yaml_project_to_resources.keys()) + ) + + self.logger.debug(f"all projects: {all_projects}") + + # construct a mapping from phsid (without consent) to all accessions with consent + consent_mapping = {} + for project in all_projects: + phs_match = access_number_matcher.match(project) + if phs_match: + accession_number = phs_match.groupdict() + + # TODO: This is not handling the .v1.p1 at all + consent_mapping.setdefault(accession_number["phsid"], set()).add( + ".".join([accession_number["phsid"], accession_number["consent"]]) + ) + + self.logger.debug(f"consent mapping: {consent_mapping}") + + # go through existing access and find any c999's and make sure to give access to + # all accessions with consent for that phsid + for username, user_project_info in copy.deepcopy(user_projects).items(): + for project, _ in user_project_info.items(): + phs_match = access_number_matcher.match(project) + if phs_match and phs_match.groupdict()["consent"] == "c999": + # give access to all consents + all_phsids_with_consent = consent_mapping.get( + phs_match.groupdict()["phsid"], [] + ) + self.logger.info( + f"user {username} has c999 consent group for: {project}. 
" + f"Granting access to all consents: {all_phsids_with_consent}" + ) + # NOTE: Only giving read-storage at the moment (this is same + # permission we give for other dbgap projects) + for phsid_with_consent in all_phsids_with_consent: + user_projects[username].update( + {phsid_with_consent: {"read-storage"}} + ) + def _update_arborist(self, session, user_yaml): """ Create roles, resources, policies, groups in arborist from the information in @@ -1294,6 +1401,8 @@ def _add_dbgap_study_to_arborist(self, dbgap_study): .get(dbgap_study, default_namespaces) ) + self.logger.debug(f"dbgap study namespaces: {namespaces}") + arborist_resource_namespaces = [ namespace.rstrip("/") + "/programs/" for namespace in namespaces ] diff --git a/tests/dbgap_sync/conftest.py b/tests/dbgap_sync/conftest.py index 43ed2a423..b5d3bf389 100644 --- a/tests/dbgap_sync/conftest.py +++ b/tests/dbgap_sync/conftest.py @@ -120,7 +120,14 @@ def syncer(db_session, request): "phstest": [{"name": "Test", "auth_id": "Test"}], } - dbGap = {} + dbGap = yaml_load( + open( + os.path.join( + os.path.dirname(os.path.dirname(os.path.abspath(__file__))), + "test-fence-config.yaml", + ) + ) + ).get("dbGaP") test_db = yaml_load( open( os.path.join( diff --git a/tests/dbgap_sync/test_user_sync.py b/tests/dbgap_sync/test_user_sync.py index 85c4692fe..fc93ffdc3 100644 --- a/tests/dbgap_sync/test_user_sync.py +++ b/tests/dbgap_sync/test_user_sync.py @@ -4,7 +4,7 @@ from fence import models from fence.sync.sync_users import _format_policy_id - +from fence.config import config from tests.dbgap_sync.conftest import LOCAL_YAML_DIR @@ -40,37 +40,58 @@ def test_sync_incorrect_user_yaml_file(syncer, monkeypatch, db_session): @pytest.mark.parametrize("syncer", ["google", "cleversafe"], indirect=True) -def test_sync(syncer, db_session, storage_client): +@pytest.mark.parametrize("parse_consent_code_config", [False, True]) +def test_sync( + syncer, db_session, storage_client, parse_consent_code_config, monkeypatch +): + # 
patch the sync to use the parameterized config value + monkeypatch.setattr(syncer, "parse_consent_code", parse_consent_code_config) syncer.sync() users = db_session.query(models.User).all() assert len(users) == 11 - tags = db_session.query(models.Tag).all() - assert len(tags) == 7 + if parse_consent_code_config: + user = models.query_for_user(session=db_session, username="USERC") + assert user.project_access == { + "phs000178.c1": ["read-storage"], + "phs000178.c2": ["read-storage"], + "phs000178.c999": ["read-storage"], + "phs000179.c1": ["read-storage"], + } - proj = db_session.query(models.Project).all() - assert len(proj) == 9 + user = models.query_for_user(session=db_session, username="USERF") + assert user.project_access == { + "phs000178.c1": ["read-storage"], + "phs000178.c2": ["read-storage"], + } - user = models.query_for_user(session=db_session, username="USERC") - assert user.project_access == { - "phs000178": ["read-storage"], - "TCGA-PCAWG": ["read-storage"], - "phs000179.c1": ["read-storage"], - } + user = models.query_for_user(session=db_session, username="TESTUSERB") + assert user.project_access == { + "phs000179.c1": ["read-storage"], + "phs000178.c1": ["read-storage"], + } + else: + user = models.query_for_user(session=db_session, username="USERC") + assert user.project_access == { + "phs000178": ["read-storage"], + "TCGA-PCAWG": ["read-storage"], + "phs000179": ["read-storage"], + } - user = models.query_for_user(session=db_session, username="USERF") - assert user.project_access == { - "phs000178.c1": ["read-storage"], - "phs000178.c2": ["read-storage"], - } + user = models.query_for_user(session=db_session, username="USERF") + assert user.project_access == { + "phs000178": ["read-storage"], + "TCGA-PCAWG": ["read-storage"], + } - user = models.query_for_user(session=db_session, username="TESTUSERB") - assert user.project_access == { - "phs000179.c1": ["read-storage"], - "phs000178.c1": ["read-storage"], - } + user = 
models.query_for_user(session=db_session, username="TESTUSERB") + assert user.project_access == { + "phs000178": ["read-storage"], + "TCGA-PCAWG": ["read-storage"], + "phs000179": ["read-storage"], + } user = models.query_for_user(session=db_session, username="TESTUSERD") assert user.display_name == "USER D" @@ -95,6 +116,133 @@ def test_sync(syncer, db_session, storage_client): assert not user_access +@pytest.mark.parametrize("syncer", ["google"], indirect=True) +@pytest.mark.parametrize("enable_common_exchange_area", [False, True]) +@pytest.mark.parametrize("parse_consent_code_config", [False, True]) +def test_dbgap_consent_codes( + syncer, + db_session, + storage_client, + enable_common_exchange_area, + parse_consent_code_config, + monkeypatch, +): + # patch the sync to use the parameterized value for whether or not to parse exchange + # area data + monkeypatch.setattr( + syncer, "enable_common_exchange_area_access", enable_common_exchange_area + ) + monkeypatch.setattr(syncer, "parse_consent_code", parse_consent_code_config) + monkeypatch.setattr(syncer, "project_mapping", {}) + + syncer.sync() + + user = models.query_for_user(session=db_session, username="USERC") + if parse_consent_code_config: + if enable_common_exchange_area: + # b/c user has c999, ensure they have access to all consents, study-specific + # exchange area (via .c999) and the common exchange area configured + assert user.project_access == { + "phs000179.c1": ["read-storage"], + "phs000178.c1": ["read-storage"], + "phs000178.c2": ["read-storage"], + "phs000178.c999": ["read-storage"], + # should additionally include the study-specific exchange area access and + # access to the common exchange area + "test_common_exchange_area": ["read-storage"], + } + else: + # b/c user has c999 but common exchange area is disabled, ensure they have + # access to all consents, study-specific exchange area (via .c999) + assert user.project_access == { + "phs000179.c1": ["read-storage"], + # c999 gives access to 
all consents + "phs000178.c1": ["read-storage"], + "phs000178.c2": ["read-storage"], + "phs000178.c999": ["read-storage"], + } + else: + # with consent code parsing off, ensure users have access to just phsids + assert user.project_access == { + "phs000178": ["read-storage"], + "phs000179": ["read-storage"], + } + + user = models.query_for_user(session=db_session, username="USERF") + if parse_consent_code_config: + assert user.project_access == { + "phs000178.c1": ["read-storage"], + "phs000178.c2": ["read-storage"], + } + else: + assert user.project_access == {"phs000178": ["read-storage"]} + + user = models.query_for_user(session=db_session, username="TESTUSERB") + if parse_consent_code_config: + assert user.project_access == { + "phs000178.c1": ["read-storage"], + "phs000179.c1": ["read-storage"], + } + else: + assert user.project_access == { + "phs000178": ["read-storage"], + "phs000179": ["read-storage"], + } + + user = models.query_for_user(session=db_session, username="TESTUSERD") + if parse_consent_code_config: + assert user.project_access == {"phs000179.c1": ["read-storage"]} + else: + assert user.project_access == {"phs000179": ["read-storage"]} + + resource_to_parent_paths = {} + for call in syncer.arborist_client.update_resource.call_args_list: + args, kwargs = call + parent_path = args[0] + resource = args[1].get("name") + resource_to_parent_paths.setdefault(resource, []).append(parent_path) + + if parse_consent_code_config: + if enable_common_exchange_area: + # b/c user has c999, ensure they have access to all consents, study-specific + # exchange area (via .c999) and the common exchange area configured + assert "phs000178.c999" in resource_to_parent_paths + assert resource_to_parent_paths["phs000178.c999"] == ["/orgA/programs/"] + + assert "test_common_exchange_area" in resource_to_parent_paths + assert resource_to_parent_paths["test_common_exchange_area"] == [ + "/dbgap/programs/" + ] + + assert "phs000178.c1" in resource_to_parent_paths + assert 
resource_to_parent_paths["phs000178.c1"] == ["/orgA/programs/"] + + # NOTE: this study+consent is configured to have multiple names in the dbgap config + assert "phs000178.c2" in resource_to_parent_paths + assert resource_to_parent_paths["phs000178.c2"] == [ + "/orgA/programs/", + "/orgB/programs/", + "/programs/", + ] + + assert "phs000178.c999" in resource_to_parent_paths + assert resource_to_parent_paths["phs000178.c999"] == ["/orgA/programs/"] + + assert "phs000179.c1" in resource_to_parent_paths + assert resource_to_parent_paths["phs000179.c1"] == ["/orgA/programs/"] + else: + assert "phs000178" in resource_to_parent_paths + # NOTE: this study is configured to have multiple names in the dbgap config + assert resource_to_parent_paths["phs000178"] == [ + "/orgA/programs/", + "/orgB/programs/", + "/programs/", + ] + + assert "phs000179" in resource_to_parent_paths + assert resource_to_parent_paths["phs000179"] == ["/orgA/programs/"] + + @pytest.mark.parametrize("syncer", ["google", "cleversafe"], indirect=True) def test_sync_from_files(syncer, db_session, storage_client): sess = db_session @@ -301,14 +449,13 @@ def test_update_arborist(syncer, db_session): # one project is configured to point to two different arborist resource # parent paths (/orgA/ and /orgB/ and /) - project_with_mult_namespaces = "phs000178" + projects_with_mult_namespaces = ["phs000178.c2"] expect_resources = [ "phs000179.c1", "phs000178.c1", "phs000178.c2", - "TCGA-PCAWG", + "phs000178.c999", "data_file", # comes from user.yaml file - project_with_mult_namespaces, ] resource_to_parent_paths = {} @@ -322,7 +469,7 @@ def test_update_arborist(syncer, db_session): assert resource in list(resource_to_parent_paths.keys()) if resource == "data_file": assert resource_to_parent_paths[resource] == ["/"] - elif resource == project_with_mult_namespaces: + elif resource in projects_with_mult_namespaces: assert resource_to_parent_paths[resource] == [ "/orgA/programs/", "/orgB/programs/", diff --git 
a/tests/test-fence-config.yaml b/tests/test-fence-config.yaml index a5eb5d5a1..2b27be184 100644 --- a/tests/test-fence-config.yaml +++ b/tests/test-fence-config.yaml @@ -283,13 +283,43 @@ dbGaP: proxy_user: '' protocol: 'sftp' decrypt_key: '' + # parse out the consent from the dbgap accession number such that something + # like "phs000123.v1.p1.c2" becomes "phs000123.c2". + # + # NOTE: when this is "false" the above would become "phs000123" parse_consent_code: true - # A mapping from the dbgap study to which authorization namespaces the actual data - # lives in. For example, `studyX` data may exist in multiple organizations, so + # A consent of "c999" can indicate access to that study's "exchange area data" + # and when a user has access to one study's exchange area data, they + # have access to the parent study's "common exchange area data" that is not study + # specific. The following config is whether or not to parse/handle "c999" codes + # for exchange area data + enable_common_exchange_area_access: false + # The below configuration is a mapping from studies to their "common exchange area data" + # Fence project name a user gets access to when parsing c999 exchange area codes (and + # subsequently gives access to an arborist resource representing this common area + # as well) + study_common_exchange_areas: + 'phs000178': 'test_common_exchange_area' + # A mapping from the dbgap study / Fence project to which authorization namespaces the + # actual data lives in. 
For example, `studyX` data may exist in multiple organizations, so # we need to know to map authorization to all orgs resources study_to_resource_namespaces: '_default': ['/orgA/'] + 'test_common_exchange_area': ['/dbgap/'] + # study when not parsing consent codes 'phs000178': ['/orgA/', '/orgB/', '/'] + # study when parsing consent codes + 'phs000178.c2': ['/orgA/', '/orgB/', '/'] + +# Regex to match an assession number that has consent information in forms like: +# phs00301123.c999 +# phs000123.v3.p1.c3 +# phs000123.c3 +# phs00301123.v3.p4.c999 +# Will NOT MATCH forms like: phs000123 +# +# WARNING: Do not change this without consulting the code that uses it +DBGAP_ACCESSION_WITH_CONSENT_REGEX: '(?Pphs[0-9]+)(.(?Pv[0-9]+)){0,1}(.(?Pp[0-9]+)){0,1}.(?Pc[0-9]+)' # ////////////////////////////////////////////////////////////////////////////////////// # STORAGE BACKENDS AND CREDENTIALS
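The `c999` consolidation that `_grant_all_consents_to_c999_users` performs can be sketched in isolation as follows. This is a simplified illustration under stated assumptions, not the Fence implementation: the pattern is a reconstruction in which `phsid` and `consent` are the group names the sync code reads from `groupdict()`, while `version` and `participant_set` are assumed names for the two optional groups. The function `grant_all_consents` and its plain-dict inputs are hypothetical.

```python
import re

# Assumed reconstruction of DBGAP_ACCESSION_WITH_CONSENT_REGEX.
# `phsid` and `consent` are the keys the sync code looks up; `version` and
# `participant_set` are hypothetical names for the optional middle groups.
# The "." separators are unescaped, as in the config, so they match any character.
ACCESSION_WITH_CONSENT = re.compile(
    r"(?P<phsid>phs[0-9]+)"
    r"(.(?P<version>v[0-9]+)){0,1}"
    r"(.(?P<participant_set>p[0-9]+)){0,1}"
    r".(?P<consent>c[0-9]+)"
)


def grant_all_consents(user_projects, known_projects):
    """Return a copy of user_projects in which c999 holders get every known consent."""
    # Map each study id (no consent) to all "study.consent" projects seen for it.
    consent_mapping = {}
    for project in known_projects:
        match = ACCESSION_WITH_CONSENT.match(project)
        if match:
            parts = match.groupdict()
            consent_mapping.setdefault(parts["phsid"], set()).add(
                f"{parts['phsid']}.{parts['consent']}"
            )

    # Copy first so the input is not mutated while it is being iterated.
    expanded = {user: dict(access) for user, access in user_projects.items()}
    for user, access in user_projects.items():
        for project in access:
            match = ACCESSION_WITH_CONSENT.match(project)
            if match and match.group("consent") == "c999":
                for with_consent in consent_mapping.get(match.group("phsid"), ()):
                    # Only read-storage is granted, as in the diff above.
                    expanded[user][with_consent] = {"read-storage"}
    return expanded


known = ["phs000178", "phs000178.c1", "phs000178.c2", "phs000178.c999", "phs000179.c1"]
users = {
    "USERC": {"phs000178.c999": {"read-storage"}, "phs000179.c1": {"read-storage"}},
    "USERF": {"phs000178.c1": {"read-storage"}},
}
result = grant_all_consents(users, known)
print(sorted(result["USERC"]))
```

In this example `USERC` ends up with `phs000178.c1`, `phs000178.c2`, `phs000178.c999` and `phs000179.c1`, while `USERF` is unchanged. That matches the expectations in `test_dbgap_consent_codes` when consent parsing is on and the common exchange area is off.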