use readme Rmd
sckott committed Jan 5, 2024
1 parent 0fcbf0f commit 2e4704d
Showing 3 changed files with 230 additions and 42 deletions.
1 change: 1 addition & 0 deletions .Rbuildignore
@@ -9,3 +9,4 @@ docker
^pkgdown$
^.lintr$
man-roxygen/
^README\.Rmd$
144 changes: 144 additions & 0 deletions README.Rmd
@@ -0,0 +1,144 @@
<!-- README.md is generated from README.Rmd. Please edit that file -->

```{r include=FALSE}
knitr::opts_chunk$set(
comment = "#>",
collapse = TRUE,
warning = FALSE,
fig.path = "man/figures/README-",
out.width = "100%"
)
```

# rcromwell

<!-- badges: start -->
[![Project Status: Experimental – Useable, some support, not open to feedback, unstable API.](https://getwilds.github.io/badges/badges/experimental.svg)](https://getwilds.github.io/badges/#experimental)
[![R-CMD-check](https://github.com/getwilds/rcromwell/actions/workflows/R-CMD-check.yaml/badge.svg?branch=dev)](https://github.com/getwilds/rcromwell/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->

Convenience Tools for Managing WDL Workflows via Cromwell

## Installation

You can install the development version of `rcromwell` from [GitHub](https://github.com/) with:

```r
# install.packages("remotes")
remotes::install_github("getwilds/rcromwell")
```

Install a specific release version (in this case v1.0) by:

```r
remotes::install_github('getwilds/rcromwell@v1.0')
```

## Set up your Cromwell Server

Follow the instructions in the [diy-cromwell-server repo](https://github.com/FredHutch/diy-cromwell-server) to get the required configuration files and some test workflows.


## Example workflow process

Set your Cromwell URL

```r
cromwell_config("http://gizmoXXX:20202")
```

### Validate your workflow formatting

```r
list.files(pattern = "\\.wdl$")  # `pattern` is a regular expression, not a glob
valid <- cromwell_validate(WDL = "myworkflow.wdl")
valid[["errors"]]
```
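
One way to act on the validation result is to stop before submitting whenever errors come back. The sketch below is only an illustration under stated assumptions: it uses hand-built lists in place of a real `cromwell_validate()` response, assumes the response carries an `errors` element (as the example above suggests), and `stop_if_invalid()` is a hypothetical helper, not part of `rcromwell`. The error text is made up.

```r
# Hypothetical helper (not part of rcromwell): halt when validation reported errors.
stop_if_invalid <- function(valid) {
  errs <- unlist(valid[["errors"]])
  if (length(errs) > 0) {
    stop("WDL validation failed:\n", paste0("- ", errs, collapse = "\n"), call. = FALSE)
  }
  invisible(TRUE)
}

# Mock responses standing in for cromwell_validate() output (shape assumed).
ok  <- list(valid = TRUE,  errors = list())
bad <- list(valid = FALSE, errors = list("Missing input 'sample_id'"))

stop_if_invalid(ok)                               # passes silently
res <- try(stop_if_invalid(bad), silent = TRUE)   # captures the error instead of halting
inherits(res, "try-error")
```

In a real session you would pass the object returned by `cromwell_validate()` and only call the submit function if no error is raised.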

### Go fix your issues, then send your workflow to Cromwell

```r
thisJob <- cromwell_submit_batch(WDL = "myworkflow.wdl",
                                 Params = "myworkflow-parameters.json",
                                 Batch = "myworkflow-batch.json",
                                 Options = "workflow-options.json")
(thisOne <- thisJob$id)
```

`thisJob$id` is the unique Cromwell ID for your entire workflow - you can use that to request all sorts of metadata!!!

### Now get all your metadata and track the workflow!!

Return a data frame of all jobs run in the past `days` days (uses your database)

```r
jobs <- cromwell_jobs(days = 2)
```

Return a data frame (one row if you only submit one workflow ID) containing workflow-level metadata

```r
w <- cromwell_workflow(thisOne)
```

Print the current status of the workflow(s)

```r
w$status
```
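
If you would rather wait for a workflow to finish than re-run the status check by hand, a small polling loop is one option. This is a minimal sketch, not part of `rcromwell`: `wait_for_workflow()` and its arguments are hypothetical, it assumes a single workflow whose status is one string, and it treats `Succeeded`, `Failed` and `Aborted` as Cromwell's terminal states. A stand-in status function replays a fixed sequence so the example runs without a server.

```r
# Hypothetical polling helper (not part of rcromwell).
# `get_status` is any zero-argument function returning the current status string,
# e.g. function() cromwell_workflow(thisOne)$status
wait_for_workflow <- function(get_status, interval = 30, max_checks = 120) {
  terminal <- c("Succeeded", "Failed", "Aborted")  # assumed terminal states
  for (check in seq_len(max_checks)) {
    status <- get_status()
    message("check ", check, ": ", status)
    if (status %in% terminal) return(status)
    Sys.sleep(interval)
  }
  stop("workflow did not finish after ", max_checks, " checks", call. = FALSE)
}

# Stand-in status source that replays a fixed sequence, so the sketch runs offline.
fake_statuses <- c("Submitted", "Running", "Running", "Succeeded")
i <- 0
fake_status <- function() {
  i <<- i + 1
  fake_statuses[[i]]
}

final <- wait_for_workflow(fake_status, interval = 0)
final
```

Passing the status lookup in as a function keeps the loop independent of any particular client call, which also makes it easy to test offline.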

Return a data frame containing all call level metadata

```r
c <- cromwell_call(thisOne)
```

Handy set of dplyr commands to tell you about how the various calls are doing

```r
library(dplyr)  # provides %>%, group_by(), summarize(), arrange()
c %>% group_by(callName, executionStatus) %>% summarize(status = n()) %>% arrange(executionStatus)
```
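
To make the shape of that summary concrete, here is a small offline sketch. The data frame is made up (only the `callName` and `executionStatus` column names mirror the example above), and it uses base R's `aggregate()` rather than dplyr so it runs without extra packages.

```r
# Made-up call metadata; real cromwell_call() output is assumed to have many more columns.
calls <- data.frame(
  callName        = c("align", "align", "align", "sort", "sort"),
  executionStatus = c("Done", "Done", "Running", "Done", "Failed"),
  stringsAsFactors = FALSE
)

# Count calls per (callName, executionStatus), then order by status,
# mirroring group_by() %>% summarize(n()) %>% arrange().
counts <- aggregate(
  data.frame(status = rep(1L, nrow(calls))),
  by  = calls[c("callName", "executionStatus")],
  FUN = sum
)
counts <- counts[order(counts$executionStatus), ]
counts
```

Each row of `counts` is one call name and status combination, with `status` holding the number of calls in that state.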

Returns a data frame containing call level call caching metadata

```r
ca <- cromwell_cache(thisOne)
```

Handy set of dplyr commands to tell you about what sort of call caching is happening

```r
ca %>% group_by(callCaching.hit, callName) %>% summarize(hits = n())
```
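
Beyond raw counts, it can help to look at the fraction of calls served from the cache. The sketch below uses made-up data and assumes `callCaching.hit` is a logical column; if it arrives as text in your data, convert it first. It uses base R only.

```r
# Made-up caching metadata; `callCaching.hit` is assumed to be logical here.
cache <- data.frame(
  callName        = c("align", "align", "sort", "sort", "sort"),
  callCaching.hit = c(TRUE, FALSE, TRUE, TRUE, TRUE)
)

# Fraction of calls served from the cache, per call name
# (the mean of a logical vector is the share of TRUE values).
hit_rate <- tapply(cache$callCaching.hit, cache$callName, mean)
hit_rate
```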

Opens up a popup in your browser with a timing diagram in it.

```r
cromwell_timing(thisOne)
```

Returns a data frame containing call level failure metadata

```r
f <- cromwell_failures(thisOne)
```

Tell Cromwell to abort the current workflow. Note that this cannot be undone, and it will take a while to stop all the jobs.

```r
abort <- cromwell_abort(thisOne)
```

When a workflow is done, request information about the workflow outputs.

```r
out <- cromwell_outputs(thisOne)
```

### Misc stuff

Ugly list of raw metadata should you need it for workflow troubleshooting

```r
WTF <- cromwell_glob(thisOne)
WTF[["failures"]]
```
127 changes: 85 additions & 42 deletions README.md
@@ -1,93 +1,136 @@
<!-- README.md is generated from README.Rmd. Please edit that file -->



# rcromwell

R package for using Cromwell with WDL workflows.
<!-- badges: start -->
[![Project Status: Experimental – Useable, some support, not open to feedback, unstable API.](https://getwilds.github.io/badges/badges/experimental.svg)](https://getwilds.github.io/badges/#experimental)
[![R-CMD-check](https://github.com/getwilds/rcromwell/actions/workflows/R-CMD-check.yaml/badge.svg?branch=dev)](https://github.com/getwilds/rcromwell/actions/workflows/R-CMD-check.yaml)
<!-- badges: end -->

Convenience Tools for Managing WDL Workflows via Cromwell

## Install from GitHub
## Installation

Install the most recent version of `rcromwell`:
You can install the development version of `rcromwell` from [GitHub](https://github.com/) with:

```r
require(remotes)
remotes::install_github('getwilds/rcromwell')
# install.packages("remotes")
remotes::install_github("getwilds/rcromwell")
```

Install a specific release version (in this case v1.0) by:

```r
require(remotes)
remotes::install_github('getwilds/rcromwell@v1.0')
```

## Set up your Cromwell Server

Use instructions over in the [diy-cromwell-server repo](https://github.com/FredHutch/diy-cromwell-server) to get the configuration files needed and some testing workflows.
Follow the instructions in the [diy-cromwell-server repo](https://github.com/FredHutch/diy-cromwell-server) to get the required configuration files and some test workflows.


## Example workflow process

```{r}
## Set your Cromwell URL
setCromwellURL(nodeAndPort = "gizmoXXX:20202")
Set your Cromwell URL

```r
cromwell_config("http://gizmoXXX:20202")
```

### Validate your workflow formatting

```{r}
```r
list.files(pattern = "\\.wdl$")  # `pattern` is a regular expression, not a glob
valid <- cromwellValidate(WDL = "myworkflow.wdl"); valid[["errors"]]
valid <- cromwell_validate(WDL = "myworkflow.wdl")
valid[["errors"]]
```
## Go fix your issues, now send your workflow to Cromwell

```{r}
thisJob <- cromwellSubmitBatch(WDL = "myworkflow.wdl",
### Go fix your issues, then send your workflow to Cromwell

```r
thisJob <- cromwell_submit_batch(WDL = "myworkflow.wdl",
                                 Params = "myworkflow-parameters.json",
                                 Batch = "myworkflow-batch.json",
                                 Options = "workflow-options.json")
(thisOne <- thisJob$id)
```

`thisJob$id` is the unique Cromwell ID for your entire workflow - you can use that to request all sorts of metadata!!!

# thisJob$id is now the unique Cromwell ID for your entire workflow - you can use that to request all sorts of metadata!!!
thisOne<- thisJob$id; thisOne
### Now get all your metadata and track the workflow!!

Return a data frame of all jobs run in the past `days` days (uses your database)

```r
jobs <- cromwell_jobs(days = 2)
```

## Now get all your metadata and track the workflow!!
Return a data frame (one row if you only submit one workflow ID) containing workflow-level metadata

```{r}
# Returns a data frame of all jobs run in the past number of days (uses your database)
jobs <- cromwellJobs(days = 2)
```r
w <- cromwell_workflow(thisOne)
```

# Returns a data frame (one line if you only submit one workflow id) containing workflow level metadata
w <- cromwellWorkflow(thisOne)
Print the current status of the workflow(s)

# This is handy to print the current status of the workflow(s) is(are)
```r
w$status
```

Return a data frame containing all call level metadata

```r
c <- cromwell_call(thisOne)
```

# Returns a data frame containing all call level metadata
c <- cromwellCall(thisOne)
Handy set of dplyr commands to tell you about how the various calls are doing

# Handy set of dplyr commands to tell you about how the various calls are doing
```r
library(dplyr)  # provides %>%, group_by(), summarize(), arrange()
c %>% group_by(callName, executionStatus) %>% summarize(status = n()) %>% arrange(executionStatus)
```

Returns a data frame containing call level call caching metadata

```r
ca <- cromwell_cache(thisOne)
```

# Returns a data frame containing call level call caching metadata
ca <- cromwellCache(thisOne)
Handy set of dplyr commands to tell you about what sort of call caching is happening

# Handy set of dplyr commands to tell you about what sort of call caching is happening
```r
ca %>% group_by(callCaching.hit, callName) %>% summarize(hits = n())
```

# Opens up a popup in your browser with a timing diagram in it.
cromwellTiming(thisOne)
Opens up a popup in your browser with a timing diagram in it.

# Returns a data frame containing call level failure metadata
f <- cromwellFailures(thisOne)
```r
cromwell_timing(thisOne)
```

# Will tell Cromwell to abort the current workflow - note this cannot be undone and it will take a while to stop all the jobs.
abort <- cromwellAbort(thisOne)
Returns a data frame containing call level failure metadata

# When a workflow is done, request information about the workflow outputs.
out <- cromwellOutputs(thisOne)
```r
f <- cromwell_failures(thisOne)
```

## Misc stuff
Tell Cromwell to abort the current workflow. Note that this cannot be undone, and it will take a while to stop all the jobs.

```{r}
# Ugly list of raw metadata should you need it for workflow troubleshooting
WTF <- cromwellGlob(thisOne); WTF[["failures"]]
```r
abort <- cromwell_abort(thisOne)
```

When a workflow is done, request information about the workflow outputs.

```r
out <- cromwell_outputs(thisOne)
```

### Misc stuff

Ugly list of raw metadata should you need it for workflow troubleshooting

```r
WTF <- cromwell_glob(thisOne)
WTF[["failures"]]
```
