Merge pull request #308 from esqLABS/documentation

Documenting plotting and sensitivity functions
esqLABS · Aug 18, 2022 · febe50a
2 parents 54aa2f2 + 4999616
commit febe50a

Showing 9 changed files with 68,187 additions and 20 deletions.
diff --git a/README.md b/README.md
@@ -1,6 +1,6 @@
 # esqlabsR <img src="man/figures/logo.png" align="right" width="240" />
 
-Utilities functions for modeling and simulation workflows within *esqLABS*. 
+The **`{esqlabsR}`** R-package is designed to facilitate and standardize **modeling and simulation** (M&S) of PBPK and QSP models implemented in [Open Systems Pharmacology Software](https://www.open-systems-pharmacology.org/) (OSPS) and executed from R. The package provides functions to read and run scenarios, workflows, and simulations. Furthermore, it creates visualizations based on non-code input from Excel files. The package is based on R functions in the [`ospsuite` package](https://github.com/Open-Systems-Pharmacology/OSPSuite-R).
 
 <!-- badges: start -->
 

diff --git a/tests/data/TestProject/Code/ProjectConfiguration.xlsx b/tests/data/TestProject/Code/ProjectConfiguration.xlsx
diff --git a/tests/data/TestProject/Models/Aciclovir.pkml b/tests/data/TestProject/Models/Aciclovir.pkml
diff --git a/tests/data/TestProject/Results/aciclovir_time_profile.png b/tests/data/TestProject/Results/aciclovir_time_profile.png
diff --git a/vignettes/data-handling.Rmd b/vignettes/data-handling.Rmd
@@ -3,7 +3,6 @@ title: "Data handling"
 output: 
   rmarkdown::html_vignette:
     toc: true
-#output: pdf_document
 vignette: >
   %\VignetteIndexEntry{Data handling}
   %\VignetteEngine{knitr::rmarkdown}
@@ -15,30 +14,38 @@ knitr::opts_chunk$set(
   collapse = TRUE,
   comment = "#>"
 )
+knitr::opts_knit$set(
+  root.dir = "../tests/data/TestProject/Code/"
+)
 ```
 
 ```{r, echo = FALSE, results = "hide", message = FALSE}
 library(esqlabsR)
 ```
 
-The workflow of modeling and simulation involves running simulations and comparing simulated data against observed data. Functionalities of `esqlabsR` require observed data to be present as [`DataSet`](https://www.open-systems-pharmacology.org/OSPSuite-R/reference/DataSet.html) objects. Please refer to the article [Observed data](https://www.open-systems-pharmacology.org/OSPSuite-R/articles/observed-data.html) for information on how to load data from excel or *.pkml files.
+### Workflow
+
+The workflow of modeling and simulation involves running simulations and comparing simulated data against observed data. Functionalities of `esqlabsR` require observed data to be present as [`DataSet`](https://www.open-systems-pharmacology.org/OSPSuite-R/reference/DataSet.html) objects. Please refer to the article [Observed data](https://www.open-systems-pharmacology.org/OSPSuite-R/articles/observed-data.html) for information on how to load data from Excel or `*.pkml` files.
 
-The function `loadObservedData()` facilitates loading data in standard esqlabs projects. Assuming the standard project folder structure with observed data being present in the "Data" folder, loading data can be done with the following code:
+The `createDefaultProjectConfiguration()` function expects a master configuration file called `ProjectConfiguration.xlsx` in the `/Code/` folder of the work package.
+
+The function `loadObservedData()` facilitates loading data in standard *esqLABS* projects. Assuming the standard project folder structure is followed, and Excel files with observed data are present in the `projectConfiguration$dataFolder` folder, the following chunk of code loads the data:
 
 ```{r loadObservedData}
-setwd("../tests/data/TestProject/Code/")
-# Create a ProjectConfiguration based on the file located in `tests/data/TestProject/Code`
 projectConfiguration <- createDefaultProjectConfiguration()
 dataSheets <- "Laskin 1982.Group A"
 observedData <- loadObservedData(projectConfiguration = projectConfiguration, sheets = dataSheets)
 
 print(names(observedData))
 ```
-The function will load from the excel file specified in the `ProjectConfiguration.xlsx`. 
-The function returns a list of `DataSet` class instances. The resulting object 
-will be used later to [run simulations](example-workflow.html), 
-[plot results](example-visualization.html) and 
-[run a sensitivity analysis](example-sensitivity.html).
+
+The `loadObservedData()` function loads the data from the path stored in `projectConfiguration$dataFile` and returns a list of `DataSet` class instances. The resulting object will be used later to [run simulations](example-workflow.html), [plot results](example-visualization.html) and [run a sensitivity analysis](sensitivity-analysis.html).
+
+### Troubleshooting
+
+* The `sheets` argument of the `loadObservedData()` function should be a string or a list of strings. If a specified sheet is not found in the file, it will be omitted with a warning; the `observedData` variable may then be an empty named list.
+
+* If the data file specified in `projectConfiguration$dataFile` is missing in the filesystem, the `loadObservedData()` function will fail with an `Invalid File` message.
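+
+Both failure modes can be checked explicitly. The following is a minimal sketch in base R; `hasDataFile()` and `hasLoadedData()` are hypothetical helpers that are not part of `esqlabsR`, and the stand-in list only mimics the `dataFile` field of a `ProjectConfiguration` object:
+
+```r
+# Hypothetical helper: TRUE if the configured data file exists on disk
+hasDataFile <- function(configuration) {
+  file.exists(configuration$dataFile)
+}
+
+# Hypothetical helper: TRUE if at least one data set was loaded
+hasLoadedData <- function(loadedData) {
+  length(loadedData) > 0
+}
+
+# Stand-in for a project configuration that points to a file that does not exist
+standInConfiguration <- list(dataFile = file.path(tempdir(), "missing-data-file.xlsx"))
+hasDataFile(standInConfiguration)
+
+# Stand-in for the result of requesting only sheets that were not found
+hasLoadedData(list())
+```
+
+In a real project, the first check would be run on `projectConfiguration` before calling `loadObservedData()`, and the second on `observedData` afterwards.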
 
 More detailed information on function signatures can be found in:
 

diff --git a/vignettes/example-visualization.Rmd b/vignettes/example-visualization.Rmd
@@ -0,0 +1,97 @@
+---
+title: "Visualization and reporting"
+output: rmarkdown::html_vignette
+#output: pdf_document
+vignette: >
+  %\VignetteIndexEntry{Visualization and reporting}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+---
+
+```{r, include = FALSE}
+knitr::opts_chunk$set(
+  collapse = TRUE,
+  comment = "#>"
+)
+knitr::opts_knit$set(
+  root.dir = "../tests/data/TestProject/Code/"
+)
+```
+
+```{r, echo = FALSE, results = "hide", message = FALSE}
+library(esqlabsR)
+projectConfiguration <- createDefaultProjectConfiguration()
+dataSheets <- "Laskin 1982.Group A"
+observedData <- loadObservedData(projectConfiguration = projectConfiguration, sheets = dataSheets)
+scenarioConfiguration <- ScenarioConfiguration$new(projectConfiguration)
+OutputPaths <- enum(list(
+  Aciclovir_PVB = "Organism|PeripheralVenousBlood|Aciclovir|Plasma (Peripheral Venous Blood)"
+))
+simulatedScenarios <- runScenarios(
+  scenarioNames = c("TestScenario"),
+  scenarioConfiguration = scenarioConfiguration,
+  customParams = NULL, saveSimulationsToPKML = FALSE
+)
+```
+
+### Workflow
+
+Plotting the simulation results is an important part of model diagnostics and quality control. Simulated modeling scenarios can be passed to plotting functions from the `{ospsuite}` package to create plots with a consistent look.
+
+`DataCombined` is a class used to store matching observed and simulated data. Initialize a new class instance and populate it with data with the following code:
+
+```{r datacombined}
+dataCombined <- DataCombined$new()
+dataCombined$addDataSets(observedData, names = "Observed", groups = "Aciclovir")
+dataCombined$addSimulationResults(simulatedScenarios$TestScenario$results, names = "Simulated", groups = "Aciclovir")
+dataCombined$toDataFrame()
+```
+
+The simulation results are stored in a list returned by the `runScenarios()` function. Plotting and visualization are performed by storing these results along with matching observed data in a `DataCombined` object and passing it to plotting functions.
+
+Plotting functions in the `{ospsuite}` package are wrappers around `{tlf}` plotting functions that provide default plot configuration options. All of them accept instances of `DataCombined` class as the data source. 
+
+Time profile plots visualize the pharmacokinetics of the drug in question and help assess if the observed data (represented by points and error bars) match the simulated data (represented by lines).
+
+
+```{r plot-time-profile, fig.width=7, fig.height=4}
+plotIndividualTimeProfile(dataCombined)
+```
+
+Observed versus simulated plots show whether the simulated values and the observed values at matching time points follow a linear trend.
+
+```{r plot-obs-vs-pred, fig.width=7, fig.height=4}
+plotObservedVsSimulated(dataCombined)
+```
+
+Residual plots show if there is a systematic bias in how the simulation represents values either in high-concentration or low-concentration regions, or, alternatively, in early or late time periods.
+
+```{r residuals-vs-simulated, fig.width=7, fig.height=4}
+plotResidualsVsSimulated(dataCombined)
+```
+
+```{r residuals-vs-time, fig.width=7, fig.height=4}
+plotResidualsVsTime(dataCombined)
+```
+
+The plots returned by plotting functions are `ggplot` objects. They can be modified further and saved to files with `{ggplot2}` functions:
+
+```{r save-plots}
+plotObject <- plotIndividualTimeProfile(dataCombined)
+ggplot2::ggsave(filename = "../Results/aciclovir_time_profile.png", plotObject, width = 8, height = 4)
+```
+
+### Troubleshooting
+
+* At any time, you can check the groups assigned to the datasets in the `DataCombined` object by examining the output of `dataCombined$toDataFrame()`.
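+
+A minimal sketch of such a check is shown below. To keep it self-contained, it uses a hand-written stand-in data frame instead of the real output; the column names `name`, `group` and `dataType` are assumptions about the structure returned by `dataCombined$toDataFrame()` and should be verified against the actual output:
+
+```r
+# Stand-in for the output of dataCombined$toDataFrame();
+# the column names and values are assumed for illustration
+combinedData <- data.frame(
+  name = c("Observed", "Observed", "Simulated", "Simulated"),
+  group = c("Aciclovir", "Aciclovir", "Aciclovir", "Aciclovir"),
+  dataType = c("observed", "observed", "simulated", "simulated")
+)
+
+# Keep one row per data set together with its assigned group
+groupSummary <- unique(combinedData[, c("name", "group", "dataType")])
+groupSummary
+```
+
+Data sets that should be plotted together are expected to share the same value in the `group` column.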
+
+More detailed information on function signatures can be found in:
+
+* `ospsuite` documentation on:
+    * [Simulation class](https://www.open-systems-pharmacology.org/OSPSuite-R/reference/Simulation.html)
+    * [SimulationResults class](https://www.open-systems-pharmacology.org/OSPSuite-R/reference/SimulationResults.html)
+    * [DataCombined class](https://www.open-systems-pharmacology.org/OSPSuite-R/reference/DataCombined.html)
+    * [plotIndividualTimeProfile()](https://www.open-systems-pharmacology.org/OSPSuite-R/reference/plotIndividualTimeProfile.html)
+    * [plotObservedVsSimulated()](https://www.open-systems-pharmacology.org/OSPSuite-R/reference/plotObservedVsSimulated.html)
+    * [plotResidualsVsSimulated()](https://www.open-systems-pharmacology.org/OSPSuite-R/reference/plotResidualsVsSimulated.html)
+    * [plotResidualsVsTime()](https://www.open-systems-pharmacology.org/OSPSuite-R/reference/plotResidualsVsTime.html)
diff --git a/vignettes/example-workflow.Rmd b/vignettes/example-workflow.Rmd
@@ -1,7 +1,6 @@
 ---
 title: "Running simulations"
 output: rmarkdown::html_vignette
-#output: pdf_document
 vignette: >
   %\VignetteIndexEntry{Example workflow}
   %\VignetteEngine{knitr::rmarkdown}
@@ -11,20 +10,35 @@ vignette: >
 ```{r, include = FALSE}
 knitr::opts_chunk$set(
   collapse = TRUE,
-  comment = "#>",
-  eval = FALSE
+  comment = "#>"
+)
+knitr::opts_knit$set(
+  root.dir = "../tests/data/TestProject/Code/"
 )
 ```
 
 ```{r, echo = FALSE, results = "hide", message = F}
 library(esqlabsR)
+projectConfiguration <- createDefaultProjectConfiguration()
+dataSheets <- "Laskin 1982.Group A"
+observedData <- loadObservedData(projectConfiguration = projectConfiguration, sheets = dataSheets)
 ```
 
-Within the `esqlabsR` framework, the simulations are run by defining and executing multiple *scenarios*. To define a scenario, create an Excel file located in `projectConfiguration$scenarioDefinitionFile` that lists the scenario name, involved individuals, application protocols, simulation time, PKML model and model parameters. An example of the scenario definition used in this example is available [on github](https://github.com/esqLABS/esqlabsR/blob/HEAD/tests/data/TestProject/Parameters/Scenarios.xlsx).
+### Workflow 
+
+Within the `esqlabsR` framework, the simulations are run by defining and executing multiple *scenarios*. To define a scenario, create an Excel file located in `projectConfiguration$scenarioDefinitionFile` that lists the scenario name, involved individuals, application protocols, simulation time, PKML model and model parameters. 
+
+The package includes [an example scenario](https://github.com/esqLABS/esqlabsR/blob/HEAD/tests/data/TestProject/Parameters/Scenarios.xlsx) that models the intravenous administration of a single 250 mg dose of aciclovir to an individual with an estimated glomerular filtration rate (GFR) of 90 mL/min. To define this scenario, the example configuration includes several Excel configuration files:
 
-Scenarios are then executed by creating a `ScenarioConfiguration` class instance and calling a `runScenarios()` function:
+* the `ProjectConfiguration.xlsx` stores paths to four configuration files: `Parameters/IndividualPhysiology.xlsx`, `Parameters/ApplicationParameters.xlsx`, `Parameters/Scenarios.xlsx` and `Parameters/IndividualParameters.xlsx`
+* the `IndividualPhysiology.xlsx` file defines an individual called `Indiv1`: a 30-year-old Caucasian human male with a weight of 73 kg and a height of 176 cm.
+* the `ApplicationParameters.xlsx` defines a protocol called `Aciclovir_iv_250mg`: a single dose of 250 mg delivered to the `Applications|IV 250mg 10min|Application_1|ProtocolSchemaItem` container.
+* the `Scenarios.xlsx` defines a scenario called `TestScenario`, using `Indiv1` as an individual and `Aciclovir_iv_250mg` as the application protocol. The model file `Aciclovir.pkml` is also referenced and is expected to exist in the `projectConfiguration$modelFolder` folder. 
+* the `IndividualParameters.xlsx` file redefines the GFR to be 90 mL/min
 
-```{r}
+Scenarios are then executed by creating a `ScenarioConfiguration` object and calling the `runScenarios()` function with a specific scenario name (`TestScenario` in this case):
+
+```{r runScenarios}
 scenarioConfiguration <- ScenarioConfiguration$new(projectConfiguration)
 OutputPaths <- enum(list(
   Aciclovir_PVB = "Organism|PeripheralVenousBlood|Aciclovir|Plasma (Peripheral Venous Blood)"
@@ -38,7 +52,19 @@ simulatedScenarios <- runScenarios(
 
 The `runScenarios()` function relies on an `OutputPaths` variable being defined in the current environment. The variable is expected to contain [references to an entity](https://docs.open-systems-pharmacology.org/working-with-mobi/mobi-documentation/model-building-components) that will be used as an output variable for the simulation.
 
-The `runScenarios()` function returns a named list of lists with `Simulation` class instances, `SimulationResults` class instances, and vectors of output values.
+The `runScenarios()` function returns a named list of lists with `Simulation` objects, `SimulationResults` objects, and vectors of output values.
+
+```{r simulatedScenarios}
+print(simulatedScenarios$TestScenario$results)
+```
+
+```{r outputValues}
+print(head(simulatedScenarios$TestScenario$outputValues$data))
+```
+
+### Troubleshooting 
+
+* The `runScenarios()` function will fail if the `OutputPaths` variable is missing in the current environment. Make sure to have this variable set up immediately before calling the `runScenarios()` function.
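+
+One way to surface this problem early is to test for the variable before running the scenarios. The sketch below uses base R only; `requireVariable()` is a hypothetical helper that is not part of `esqlabsR`, and a plain list stands in for the `enum()` call used above:
+
+```r
+# Hypothetical helper: stop with a clear message if a required variable
+# is not visible from the calling environment
+requireVariable <- function(name, envir = parent.frame()) {
+  if (!exists(name, envir = envir)) {
+    stop("Variable `", name, "` must be defined before running scenarios.")
+  }
+  invisible(TRUE)
+}
+
+# Plain-list stand-in for the enum defined in the vignette
+OutputPaths <- list(
+  Aciclovir_PVB = "Organism|PeripheralVenousBlood|Aciclovir|Plasma (Peripheral Venous Blood)"
+)
+
+# Passes silently because `OutputPaths` is defined above
+requireVariable("OutputPaths")
+```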
 
 More detailed information on function signatures can be found in:
 

diff --git a/vignettes/introduction-esqlabsr.Rmd b/vignettes/introduction-esqlabsr.Rmd
@@ -3,7 +3,6 @@ title: "Introduction to esqlabsR"
 output: 
   rmarkdown::html_vignette:
     toc: true
-#output: pdf_document
 vignette: >
   %\VignetteIndexEntry{Introduction to esqlabsR}
   %\VignetteEngine{knitr::rmarkdown}
@@ -32,7 +31,7 @@ The articles below covers several aspects of the software usage:
 * [Data handling](data-handling.html)
 * [Running simulations](example-workflow.html)
 * [Visualization and reporting](example-visualization.html)
-* [Sensitivity analysis](example-sensitivity.html)
+* [Sensitivity analysis](sensitivity-analysis.html)
 
 ## Object-oriented approach
 
@@ -50,3 +49,12 @@ resultOfAFunction <- object1$multiply(1,2)
 ```
 
 Important information about the object can be printed out by calling `print(object)`.
+
+## Configuring a project with no-code Excel files
+
+To facilitate modeling activities in a team with diverse backgrounds, `esqlabsR` employs Excel configuration files that define every aspect of a modeling project. The modeling projects are expected to follow a strict folder structure: 
+
+* the `Code/ProjectConfiguration.xlsx` file is a cornerstone Excel file that stores relative paths to all project components, including modeling scenarios, data files and parameter files. This Excel file is expected to contain fields that populate an instance of a [`ProjectConfiguration` class](../reference/ProjectConfiguration.html).
+* by default, the `Code` folder contains R script files
+* by default, the `Data` folder contains Excel files with observed data
+* by default, the `Parameters` folder contains Excel files with global, population and individual model parameters, definitions of application protocols and modeling scenarios. Examples of configuration files are included in the `/tests/data/TestProject` folder of the repository.
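+
+The folder layout above can be verified with a few lines of base R. This is a minimal sketch that assumes the default folder names listed above; `checkProjectStructure()` is a hypothetical helper that is not part of `esqlabsR`:
+
+```r
+# Hypothetical helper: reports which of the expected project components
+# exist under a project root folder
+checkProjectStructure <- function(projectRoot) {
+  expected <- c(
+    "Code/ProjectConfiguration.xlsx",
+    "Data",
+    "Parameters"
+  )
+  found <- file.exists(file.path(projectRoot, expected))
+  names(found) <- expected
+  found
+}
+
+# Usage with a temporary project that lacks the `Parameters` folder
+projectRoot <- file.path(tempdir(), "ExampleProject")
+dir.create(file.path(projectRoot, "Code"), recursive = TRUE, showWarnings = FALSE)
+dir.create(file.path(projectRoot, "Data"), showWarnings = FALSE)
+file.create(file.path(projectRoot, "Code", "ProjectConfiguration.xlsx"))
+checkProjectStructure(projectRoot)
+```
+
+The helper returns a named logical vector, so missing components can be listed with `names(which(!checkProjectStructure(projectRoot)))`.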
diff --git a/vignettes/sensitivity-analysis.Rmd b/vignettes/sensitivity-analysis.Rmd
@@ -0,0 +1,79 @@
+---
+title: "Sensitivity analysis"
+output: rmarkdown::html_vignette
+vignette: >
+  %\VignetteIndexEntry{Sensitivity analysis}
+  %\VignetteEngine{knitr::rmarkdown}
+  %\VignetteEncoding{UTF-8}
+---
+
+```{r, include = FALSE}
+knitr::opts_chunk$set(
+  collapse = TRUE,
+  comment = "#>"
+)
+knitr::opts_knit$set(
+  root.dir = "../tests/data/TestProject/Code/"
+)
+```
+
+```{r, echo = FALSE, results = "hide", message = FALSE}
+library(esqlabsR)
+projectConfiguration <- createDefaultProjectConfiguration()
+dataSheets <- "Laskin 1982.Group A"
+observedData <- loadObservedData(projectConfiguration = projectConfiguration, sheets = dataSheets)
+scenarioConfiguration <- ScenarioConfiguration$new(projectConfiguration)
+OutputPaths <- enum(list(
+  Aciclovir_PVB = "Organism|PeripheralVenousBlood|Aciclovir|Plasma (Peripheral Venous Blood)"
+))
+simulatedScenarios <- runScenarios(
+  scenarioNames = c("TestScenario"),
+  scenarioConfiguration = scenarioConfiguration,
+  customParams = NULL, saveSimulationsToPKML = FALSE
+)
+```
+
+### Workflow
+
+Sensitivity analysis quantifies how the pharmacokinetics of the drug changes when simulation parameters are varied. This is particularly important when the values of simulation parameters are uncertain.
+
+In the aciclovir simulation example, the lipophilicity of aciclovir was assumed to be -0.100 in log units. In the sensitivity analysis, we want to quantify how much the pharmacokinetic parameters change if the lipophilicity of aciclovir varies between -0.01 and -1.00 log units.
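+
+The end points of this range correspond to scaling the default value by factors of 0.1 and 10. A minimal sketch of that arithmetic in base R follows; the three scaling factors are illustrative only and are not necessarily the values that `sensitivityCalculation()` uses internally:
+
+```r
+# Default lipophilicity of aciclovir in the example (log units)
+defaultLipophilicity <- -0.1
+
+# Illustrative scaling factors: lower bound, default, upper bound
+scalingFactors <- c(0.1, 1, 10)
+
+# Parameter values that result from scaling the default
+variedLipophilicity <- defaultLipophilicity * scalingFactors
+variedLipophilicity
+```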
+
+The `sensitivityCalculation()` function in the `{esqlabsR}` package does that by re-running the simulation with a set of updated parameter values. The function returns a list with output paths, parameter paths, a `SimulationResults` object, and a data frame with computed pharmacokinetic parameters for each of the input parameter values.
+
+```{r analysis}
+simulation <- simulatedScenarios$TestScenario$simulation
+OutputPaths <- enum(list(
+  Aciclovir_PVB = "Organism|PeripheralVenousBlood|Aciclovir|Plasma (Peripheral Venous Blood)"
+))
+analysis <- sensitivityCalculation(simulation, OutputPaths, parameterPaths = "Aciclovir|Lipophilicity")
+head(analysis$pkData)
+```
+
+In the aciclovir example, the default lipophilicity of -0.100 log units corresponds to an area under the curve (AUC) of 3915.87 µmol×min/L. Changing the lipophilicity to -0.01 log units decreases the AUC to 3895.43 µmol×min/L (a change of -0.52%), while changing the lipophilicity to -1.00 log units increases the AUC to 4015.73 µmol×min/L (a change of 2.55%).
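+
+The percentages quoted above are relative changes with respect to the AUC at the default parameter value. They can be reproduced with a short base R sketch; `percentChange()` is a hypothetical helper written for this illustration:
+
+```r
+# Relative change of a value with respect to a reference, in percent
+percentChange <- function(newValue, referenceValue) {
+  100 * (newValue - referenceValue) / referenceValue
+}
+
+# AUC values quoted in the text (µmol×min/L)
+aucDefault <- 3915.87
+aucLowerBound <- 3895.43 # lipophilicity of -0.01 log units
+aucUpperBound <- 4015.73 # lipophilicity of -1.00 log units
+
+changeLowerBound <- round(percentChange(aucLowerBound, aucDefault), 2)
+changeUpperBound <- round(percentChange(aucUpperBound, aucDefault), 2)
+c(changeLowerBound, changeUpperBound)
+```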
+
+The results of the sensitivity analysis can be plotted with two functions:
+
+* the `sensitivitySpiderPlot()` function shows a separate plot for each of the pharmacokinetic parameters under investigation. By default, the `sensitivityCalculation()` function computes the changes in the area under the curve (`AUC_inf`), the maximum concentration (`C_max`) and the time at which the maximum concentration is reached (`t_max`).
+
+```{r spider-plot, fig.width=7, fig.height=4}
+sensitivitySpiderPlot(analysis)
+```
+
+* the `sensitivityTimeProfiles()` function plots the concentration profile for each of the input parameter values:
+
+```{r time-profile, fig.width=7, fig.height=4}
+sensitivityTimeProfiles(analysis)
+```
+
+### Troubleshooting
+
+* The `SensitivityPKParameter` column in the output of `analysis$pkData` will be `NA` in the rows corresponding to the initial parameter values.
+
+More detailed information on function signatures can be found in:
+
+* `esqlabsR` documentation on:
+    * [sensitivityCalculation()](https://esqlabs.github.io/esqlabsR/reference/sensitivityCalculation.html)
+    * [sensitivitySpiderPlot()](https://esqlabs.github.io/esqlabsR/reference/sensitivitySpiderPlot.html)
+    * [sensitivityTimeProfiles()](https://esqlabs.github.io/esqlabsR/reference/sensitivityTimeProfiles.html)
+