Update of previously imported observed data from another data source #785

Yuri05 · 2020-05-26T16:18:34Z

When importing observed data sets, one import process results in (generally) N observed data sets imported into a project. N>=1 (based on the grouping information).

It should be possible to update observed data sets imported from one data source by selecting another data source.

Proposed workflow:

1. User selects ONE observed data set in a project and then "Update from new data source".
2. Software detects whether there are other observed data sets in the project that were imported from the same data source. If YES: user is informed that those data sets will be updated as well.
3. User defines a new data source.
4. Software checks whether the new data source has the same structure (e.g. data columns used by the import configuration) as the original one. If NOT: ERROR (update from the new data source is not possible).
5. Software checks whether the new data source has the same combinations of metadata relevant for grouping. If NOT: software checks whether there are data sets that were imported previously but are not available in the new data source. If YES: software checks whether any of these data sets is used in the project (e.g. in a simulation, parameter identification, ...).
   - If this is the case: ERROR, import cannot be performed. User is informed that those data sets must be manually deleted from all simulations/PIs/... first.
   - Otherwise: all those data sets will be deleted from the project when the new import is finished.
6. Preview is shown (see "Preview of data in the import configuration editor" #625).
   - In the preview in this use case it will not be possible to change the import configuration.
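
The checks in this workflow could be sketched roughly as follows. This is only an illustration, not the importer's actual code: `plan_update`, `UpdateError`, and all parameter names are hypothetical, and "same structure" is interpreted here as "every column used by the import configuration is present in the new source". The text does not say how data sets that appear only in the new source are handled, so the sketch merely reports them.

```python
class UpdateError(Exception):
    """Raised when the update from the new data source cannot be performed."""


def plan_update(used_columns, new_columns, old_keys, new_keys, keys_in_use):
    """Return which data sets would be updated, deleted, or newly found.

    All names are hypothetical; keys are grouping strings such as
    'Human|Brain|Plasma'.
    """
    # Structure check: every column the import configuration used must exist.
    missing_columns = set(used_columns) - set(new_columns)
    if missing_columns:
        raise UpdateError(f"missing columns: {sorted(missing_columns)}")

    # Data sets imported previously but absent from the new data source.
    missing_keys = set(old_keys) - set(new_keys)
    blocked = missing_keys & set(keys_in_use)
    if blocked:
        raise UpdateError(f"remove from simulations/PIs first: {sorted(blocked)}")

    return {
        "update": sorted(set(old_keys) & set(new_keys)),
        "delete": sorted(missing_keys),
        "new": sorted(set(new_keys) - set(old_keys)),
    }


columns = ["Time [min]", "Organ", "Compartment", "Species", "Concentration [mg/ml]"]
old = ["Human|Brain|Plasma", "Human|Liver|Plasma"]
plan = plan_update(columns, columns, old, ["Human|Brain|Plasma"], keys_in_use=[])
print(plan)
```

With `keys_in_use=["Human|Liver|Plasma"]` the same call raises `UpdateError` instead, which corresponds to the case where the user must first remove the data set from all simulations/PIs.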

Example:

observed data is imported from

Time [min]	Organ	Compartment	Species	Concentration [mg/ml]
1	Brain	Plasma	Human	0,1
2	Brain	Plasma	Human	12
3	Brain	Plasma	Human	2
0	Liver	Plasma	Human	0,2
1	Liver	Plasma	Human	8
2	Liver	Plasma	Human	2

which results in 2 data sets: Human|Brain|Plasma and Human|Liver|Plasma
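
To make the grouping concrete, here is a minimal sketch of how the rows above fall into two data sets. It assumes the grouping key is built as Species|Organ|Compartment, as the data set names suggest; `group_rows` and the tuple layout are hypothetical, and the decimal commas of the table are written as decimal points.

```python
from collections import defaultdict

# Rows from the example table: (time, organ, compartment, species, concentration)
ROWS = [
    (1, "Brain", "Plasma", "Human", 0.1),
    (2, "Brain", "Plasma", "Human", 12.0),
    (3, "Brain", "Plasma", "Human", 2.0),
    (0, "Liver", "Plasma", "Human", 0.2),
    (1, "Liver", "Plasma", "Human", 8.0),
    (2, "Liver", "Plasma", "Human", 2.0),
]


def group_rows(rows):
    """Group rows into data sets keyed by 'Species|Organ|Compartment'."""
    data_sets = defaultdict(list)
    for time, organ, compartment, species, concentration in rows:
        key = f"{species}|{organ}|{compartment}"
        data_sets[key].append((time, concentration))
    return dict(data_sets)


grouped = group_rows(ROWS)
print(sorted(grouped))
```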

Use case 1

New data source contains the same grouping data (Human|Brain|Plasma and Human|Liver|Plasma), e.g.

Time [min]	Organ	Compartment	Species	Concentration [mg/ml]
1	Brain	Plasma	Human	0,1
2	Brain	Plasma	Human	12
3	Brain	Plasma	Human	2
4	Brain	Plasma	Human	1
0	Liver	Plasma	Human	0,2
1	Liver	Plasma	Human	8
2	Liver	Plasma	Human	2
3	Liver	Plasma	Human	1

In this case, the time and concentration values of the previously imported data sets (Human|Brain|Plasma and Human|Liver|Plasma) will simply be updated with the new values.

Use case 2

Information about some previously available data sets is not available in the new data source, e.g.

Time [min]	Organ	Compartment	Species	Concentration [mg/ml]
1	Brain	Plasma	Human	0,1
2	Brain	Plasma	Human	12
3	Brain	Plasma	Human	2
4	Brain	Plasma	Human	1

Software checks if Human|Liver|Plasma is used in the project.

If YES: ERROR, user must delete it from all simulations/PIs/etc. first and then repeat the import procedure.
If NO: Human|Brain|Plasma will be updated with the new data and Human|Liver|Plasma will be deleted from the project

Use case 3

Information about all previously available data sets is available in the new data source; ADDITIONALLY, information about new data sets was added, e.g.

Time [min]	Organ	Compartment	Species	Concentration [mg/ml]
1	Brain	Plasma	Human	0,1
2	Brain	Plasma	Human	12
3	Brain	Plasma	Human	2
4	Brain	Plasma	Human	1
0	Liver	Plasma	Human	0,2
1	Liver	Plasma	Human	8
2	Liver	Plasma	Human	2
3	Liver	Plasma	Human	1
0	Heart	Plasma	Human	1
1	Heart	Plasma	Human	3
2	Heart	Plasma	Human	4
3	Heart	Plasma	Human	5

Previous state of the discussion

When importing observed data sets, one import process results in (generally) N observed data sets imported into a project. N>=1 (based on the grouping information).

It should be possible to update observed data sets imported from one data source by selecting another data source.

Proposed workflow:

1. User selects ONE observed data set in a project and then "Update from new data source".
2. Software detects whether there are other observed data sets in the project that were imported from the same data source. If YES: user is asked whether ALL those data sets should be updated or only the selected one (to be discussed: do we need this step or should ALL datasets be updated automatically?).
3. User defines a new data source.
4. Software checks whether the new data source has the same structure (e.g. data columns used by the import configuration) as the original one. If NOT: ERROR (update from the new data source is not possible).
5. Software checks whether the new data source has the same combinations of metadata relevant for grouping. If NOT: to be discussed. The following scenarios are possible:
   a) Option 1: ERROR (update from the new data source is not possible).
   b) Option 2: Observed data sets not available in the new data source are removed from the project (complicated if some of the observed data sets are used in simulations/PIs, etc!).
   c) Option 3: Data sets available in the new data source are updated with the new data. Data sets not available in the new data source are kept AS IS.
   d) Option 4: User can choose between Error/Delete(?)/Keep (the previous options).
6. Preview is shown (see "Preview of data in the import configuration editor" #625).
   - In the preview in this use case it will not be possible to change the import configuration.
   - NEWLY ADDED datasets can be selected/deselected (see also "Preview of data in the import configuration editor" #625) (to be discussed: should this happen automatically?).

Example:

observed data is imported from

Time [min]	Organ	Compartment	Species	Concentration [mg/ml]
1	Brain	Plasma	Human	0,1
2	Brain	Plasma	Human	12
3	Brain	Plasma	Human	2
0	Liver	Plasma	Human	0,2
1	Liver	Plasma	Human	8
2	Liver	Plasma	Human	2

which results in 2 data sets: Human|Brain|Plasma and Human|Liver|Plasma

Use case 1

New data source contains the same grouping data (Human|Brain|Plasma and Human|Liver|Plasma), e.g.

Time [min]	Organ	Compartment	Species	Concentration [mg/ml]
1	Brain	Plasma	Human	0,1
2	Brain	Plasma	Human	12
3	Brain	Plasma	Human	2
4	Brain	Plasma	Human	1
0	Liver	Plasma	Human	0,2
1	Liver	Plasma	Human	8
2	Liver	Plasma	Human	2
3	Liver	Plasma	Human	1

In this case, the time and concentration values of the previously imported data sets (Human|Brain|Plasma and Human|Liver|Plasma) will simply be updated with the new values.

Use case 2

Information about some previously available data sets is not available in the new data source, e.g.

Time [min]	Organ	Compartment	Species	Concentration [mg/ml]
1	Brain	Plasma	Human	0,1
2	Brain	Plasma	Human	12
3	Brain	Plasma	Human	2
4	Brain	Plasma	Human	1

if we decide to go with 5a): ERROR
if we decide to go with 5b): Human|Brain|Plasma will be updated with the new data and Human|Liver|Plasma will be deleted from the project (complicated if some of observed data sets are used in simulations/PIs, etc!)
if we decide to go with 5c): Human|Brain|Plasma will be updated with the new data and Human|Liver|Plasma will be kept in the project AS IS

Use case 3

Information about all previously available data sets is available in the new data source; ADDITIONALLY, information about new data sets was added, e.g.

Time [min]	Organ	Compartment	Species	Concentration [mg/ml]
1	Brain	Plasma	Human	0,1
2	Brain	Plasma	Human	12
3	Brain	Plasma	Human	2
4	Brain	Plasma	Human	1
0	Liver	Plasma	Human	0,2
1	Liver	Plasma	Human	8
2	Liver	Plasma	Human	2
3	Liver	Plasma	Human	1
0	Heart	Plasma	Human	1
1	Heart	Plasma	Human	3
2	Heart	Plasma	Human	4
3	Heart	Plasma	Human	5

In this case, the time and concentration values of the previously imported data sets (Human|Brain|Plasma and Human|Liver|Plasma) will be updated with the new values. Whether the new data set for Human|Heart|Plasma is added automatically depends on the decision for step 6 above.


ju-rgen · 2021-01-13T12:48:59Z

Before the update is finally performed, a confirmation dialog should be displayed that shows the user how many new datasets will be added, how many old datasets will be deleted, how many datasets will be updated, and how many old datasets remain identical.
This allows the user to stop before something unintended happens, e.g. the majority of datasets being deleted because a wrong data source was selected.
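
The four counts for such a dialog could be derived along these lines. This is a rough sketch only: `summarize_update` is a hypothetical helper, and data sets are modelled as plain lists of (time, concentration) pairs keyed by their grouping name.

```python
def summarize_update(old, new):
    """Count added, deleted, updated, and identical data sets.

    `old` and `new` map a grouping key to a list of (time, concentration) pairs.
    """
    old_keys, new_keys = set(old), set(new)
    common = old_keys & new_keys
    identical = {key for key in common if old[key] == new[key]}
    return {
        "added": len(new_keys - old_keys),
        "deleted": len(old_keys - new_keys),
        "updated": len(common - identical),
        "identical": len(identical),
    }


old = {
    "Human|Brain|Plasma": [(1, 0.1), (2, 12.0), (3, 2.0)],
    "Human|Liver|Plasma": [(0, 0.2), (1, 8.0), (2, 2.0)],
}
new = {
    "Human|Brain|Plasma": [(1, 0.1), (2, 12.0), (3, 2.0), (4, 1.0)],  # one new point
    "Human|Liver|Plasma": [(0, 0.2), (1, 8.0), (2, 2.0)],             # unchanged
    "Human|Heart|Plasma": [(0, 1.0), (1, 3.0)],                       # new data set
}
summary = summarize_update(old, new)
print(summary)
```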

ju-rgen · 2021-01-18T16:43:14Z

Doesn't the use of an invisible data source grouping bear the risk that, with any non-trivial distribution of a data source's datasets across multiple folders, the user loses track of what s/he is updating?

I personally find it clearer to have a view where the datasets (= timeseries = curves) are grouped into imports (based on an import configuration and a data source), which could perhaps be done in a hierarchical view like in #786.
There I would start the workflow at an Import and call it "update data source".

But I admit that the impact of such an update, e.g. the update of plots, is in any case somewhat hidden from the user and requires careful follow-up in non-trivial cases.

So the users should reflect on this feature carefully.

georgeDaskalakis · 2021-03-09T09:39:45Z

The reload process currently has two options: reloading one specific dataset or reloading all the datasets that come from an Excel file. The first option should no longer be available; only reloading a whole file should be possible.

To do this, the reload process should not delete the old datasets as is currently done, but should load all the datasets from the file again and then present the user with an overview of what is currently loaded, what will be overwritten, and what will be loaded as a new dataset (because of changes/additions to the Excel file). Afterwards, the data in the datasets that are going to be overwritten has to be edited in place; that way the simulations using those datasets will not lose their references to them.
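
The point about keeping references can be illustrated with a small sketch. It is not the importer's implementation: `DataSet`, `reload_in_place`, and the dictionary-based project are hypothetical stand-ins. The idea shown is that an existing data set object is overwritten with the reloaded values instead of being deleted and recreated, so anything holding a reference to it still points at the same object.

```python
class DataSet:
    """Hypothetical stand-in for an observed data set in a project."""

    def __init__(self, name, points):
        self.name = name
        self.points = list(points)


def reload_in_place(project, reloaded):
    """Overwrite existing data sets in place; add unknown ones as new."""
    overwritten, added = [], []
    for name, points in reloaded.items():
        existing = project.get(name)
        if existing is not None:
            existing.points = list(points)  # same object, new data
            overwritten.append(name)
        else:
            project[name] = DataSet(name, points)
            added.append(name)
    return overwritten, added


brain = DataSet("Human|Brain|Plasma", [(1, 0.1), (2, 12.0), (3, 2.0)])
project = {brain.name: brain}
simulation_refs = [brain]  # a simulation that uses the data set

overwritten, added = reload_in_place(project, {
    "Human|Brain|Plasma": [(1, 0.1), (2, 12.0), (3, 2.0), (4, 1.0)],
    "Human|Heart|Plasma": [(0, 1.0), (1, 3.0)],
})
print(overwritten, added)
```

After the call, `simulation_refs[0]` is still the object stored in `project` and already carries the fourth data point.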

Yuri05 · 2022-11-09T13:37:59Z

Implemented as part of Importer Redesign

Yuri05 added type: feature RFC Request For Comments Importer Observed data importer labels May 26, 2020

Yuri05 mentioned this issue Jun 25, 2020

Feature request: Update function for Observed Data Open-Systems-Pharmacology/PK-Sim#740

Closed

Yuri05 mentioned this issue Jan 14, 2021

Importer-Redesign: Filter #880

Closed

georgeDaskalakis self-assigned this Mar 9, 2021

Yuri05 closed this as completed Nov 9, 2022
