Hello! Hopefully you're here because you're interested in the R package psupertime. psupertime is an R package which analyses single cell RNA-seq ("scRNAseq") data where groups of cells have labels following a known or expected sequence (for example, samples from a time series experiment: day1, day2, ..., day5). It uses ordinal logistic regression to identify a small set of genes which recapitulate the group-level sequence for individual cells. It can be used for discovery of relevant genes, for exploration of unlabelled data, and for assessment of one dataset with respect to the labels known for another dataset. You can find the psupertime package here and read the pre-print here.
psupplementary is a package for replicating the analyses in the psupertime paper, and allowing users to play with psupertime themselves and see what it can do. The psupertime package has everything you need to do your own analysis; splitting the heavy datasets off into the psupplementary package keeps the main package light.
If you haven't already, you'll need to install psupertime, as follows:
remotes::install_github('wmacnair/psupertime', build = TRUE, build_opts = c("--no-resave-data", "--no-manual"))
library('psupertime')
(You may need to install the package remotes first, with install.packages('remotes'). Installation took <90s on a MacBook Pro.)
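If you'd like to check whether remotes is already there before installing, a minimal sketch along these lines might help. The helper missing_pkgs is hypothetical (it is not part of psupertime), and the install calls are left commented out so that nothing gets installed by accident.

```r
# Hypothetical helper: return the names of packages that are not yet installed.
missing_pkgs <- function(pkgs) {
  is_installed <- vapply(pkgs, requireNamespace, logical(1), quietly = TRUE)
  pkgs[!is_installed]
}

# Check the package needed for the installation step above
to_install <- missing_pkgs(c('remotes'))
if (length(to_install) > 0) {
  message('Missing packages: ', paste(to_install, collapse = ', '))
  # install.packages(to_install)
}

# Then install psupertime as shown above:
# remotes::install_github('wmacnair/psupertime', build = TRUE,
#   build_opts = c("--no-resave-data", "--no-manual"))
```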
Due to the large files, installing the psupplementary package is slightly more complicated. You first need to clone the package:
cd /path/to/packages
git clone https://github.com/wmacnair/psupplementary.git
Then you have two options.
Fast option (~7 minutes): run this line in R to install it without building the vignettes:
devtools::install('/path/to/packages/psupplementary')
Slower option (~15 minutes): run this line in R to install it with the vignettes:
devtools::install('/path/to/packages/psupplementary', build_vignettes=TRUE)
This gives you a couple of webpages which walk you through some of the analyses done in the paper.
Once installed, you can call library('psupplementary') to load the package.
There are six scRNAseq datasets included in this package:
- acinar_sce, consisting of aging acinar cells, with sequential labels corresponding to age of donor in years, stored in the variable 'donor_age' (from here, GSE81547)
- germ_sce, consisting of developing human female germline cells, with sequential labels corresponding to age in weeks, stored in the variable 'time' (from here, GSE86146)
- beta_sce, consisting of developing beta cells, with sequential labels corresponding to developmental stage, stored in the variable 'age' (from here, GSE87375)
- hesc_sce, consisting of human embryonic stem cells, with sequential labels corresponding to embryonic day, stored in the variable 'esc_day' (from here, E-MTAB-3929)
- mef_sce, consisting of MEFs reprogrammed to neurons, with sequential labels corresponding to days since induction, stored in the variable 'time_point' (from here, GSE67310)
- colon_sce, consisting of human colon cells, where the sequential labels are derived from unsupervised clustering (from here, GSE102698)
To load a given dataset this_sce, just call data(this_sce), for example data(acinar_sce).
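As a rough sketch of how a loaded dataset might then be handed to psupertime: the label variable names below are the ones listed above, while the helper run_example is hypothetical. It assumes that the labels are stored in the colData of a SingleCellExperiment object, and that psupertime() takes the object and the labels as its first two arguments; please check ?psupertime before relying on this.

```r
# Label variables for each dataset, as listed above
# (colon_sce is left out because its labels come from unsupervised clustering)
label_vars <- c(
  acinar_sce = 'donor_age',
  germ_sce   = 'time',
  beta_sce   = 'age',
  hesc_sce   = 'esc_day',
  mef_sce    = 'time_point'
)

# Hypothetical helper: load one dataset and run psupertime on it
run_example <- function(sce_name) {
  stopifnot(sce_name %in% names(label_vars))
  # Load the dataset into this function's environment
  data(list = sce_name, package = 'psupplementary', envir = environment())
  sce <- get(sce_name)
  # Assumption: labels live in colData; they may need converting to a factor
  y <- SummarizedExperiment::colData(sce)[[label_vars[[sce_name]]]]
  # Assumption: psupertime() accepts the object and the labels in this order
  psupertime::psupertime(sce, y)
}

# Needs psupertime and psupplementary installed, and may take a while:
# psuper_obj <- run_example('acinar_sce')
```

Keeping the dataset-to-label mapping in one named vector makes it easy to try the same call on each of the labelled datasets in turn.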
There are two vignettes included in this package:
- Analysis of acinar cells labelled with donor ages (replicates Fig 1C, Fig 1D, Fig 1E, Supp Fig 01, Supp Fig 02, Supp Fig 03, Supp Fig 04, Supp Fig 05)
- Exploratory data analysis of unlabelled colon data (replicates Fig 1F, Supp Fig 15, Supp Fig 16, Supp Fig 17, Supp Fig 18)
(Our intention in future is to allow replication of all figures in the manuscript. At present the code allows replication of all analysis relating to Figure 1, including multiple supplementary figures; replication of the remaining figures will follow!)
To view the vignettes, run this code:
browseVignettes(package = 'psupplementary')
Probably the best thing to do is to run some of the analyses in the vignettes, then make copies of the code yourself to explore the possibilities of psupertime further.
Please add any issues or requests to the Issues page. All feedback enthusiastically received.
Thanks!
Will