Allow configuration to support data staging #8

dosumis · 2020-10-06T11:13:37Z

We originally discussed supporting a parallel staging pipeline.

This doesn't seem to have happened, but it looks like it would be relatively straightforward to lightly edit the SPARQL used for filtering out embargoed data: https://github.com/VirtualFlyBrain/vfb-pipeline-collectdata/tree/master/sparql

We would need a config for the parallel staging pipeline that would allow through DataSets where `production: False, staging: True`.

CC @Robbie1977 - please check my spec here.

matentzn · 2020-10-06T11:24:48Z

I can start working on this by the end of the week, but the more work-intensive part will be setting up the parallel pipeline physically (a parallel triple store, pdb, owlery etc.). Maybe the first step would be to mirror the existing pipeline physically on Jenkins (vfb-pipeline2-devstage) and then start working with the config to allow unpublished data to seep through?

Robbie1977 · 2020-10-06T11:43:59Z

I will set up a dual pipeline. Is the whole thing needed, or only the dump stage and beyond?

matentzn · 2020-10-06T11:50:00Z

Everything except for KB is needed, unfortunately, because the embargoing happens pre-triplestore.

This pull implements the blocking and staging logic, which are totally independent.

- Blocking is implemented through Cypher queries; see process.sh, lines 56-57.
- Staging is the rest. Here we only care about the embargo logic of staging.

If we embargo, then what gets embargoed depends on whether we are in prod or dev stage mode (see Dockerfile). The logic corresponds to what was discussed [here](VirtualFlyBrain/neo4j2owl#52). It is implemented through two different sets of SPARQL queries (one for prod, one for dev), which apply embargo rules of differing strictness. The prod queries are unchanged, and the embargo rules in dev are looser (i.e. less gets embargoed).

See #8 and VirtualFlyBrain/neo4j2owl#52.
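To make the prod/dev difference concrete, here is a minimal Python sketch of the embargo decision. It is only an illustration: the actual implementation lives in the SPARQL queries, and the exact rules are the ones agreed in VirtualFlyBrain/neo4j2owl#52. The `production` and `staging` flags come from the spec earlier in this thread; the `DataSet` class, the mode strings `"prod"` and `"dev"`, the helper names, and the precise rules below are all assumptions.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DataSet:
    # Hypothetical stand-in for a DataSet node with the two flags from the spec.
    name: str
    production: bool
    staging: bool


def is_embargoed(ds: DataSet, mode: str) -> bool:
    """Return True if the dataset would be filtered out in the given stage mode."""
    if mode == "prod":
        # Assumed prod rule: only datasets cleared for production pass.
        return not ds.production
    if mode == "dev":
        # Assumed dev rule: staging datasets pass as well, so less is embargoed.
        return not (ds.production or ds.staging)
    raise ValueError(f"unknown stage mode: {mode!r}")


# Made-up example datasets, one per interesting flag combination.
datasets = [
    DataSet("published", production=True, staging=True),
    DataSet("staged_only", production=False, staging=True),
    DataSet("private", production=False, staging=False),
]


def embargoed_names(mode: str) -> list:
    """List the names of the example datasets embargoed in the given mode."""
    return [ds.name for ds in datasets if is_embargoed(ds, mode)]
```

Under these assumed rules, a dataset with `production: False, staging: True` is embargoed in prod but let through in dev, which matches the spec above.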

matentzn · 2020-10-23T10:48:56Z

Fixed in #9

dosumis assigned matentzn Oct 6, 2020

matentzn mentioned this issue Oct 15, 2020

Blocking and Staging logic #9

Merged

matentzn closed this as completed Oct 23, 2020
