docs: big refactoring

rajewsky-lab · Apr 19, 2024 · e947f03
1 parent 0045e68
Showing 16 changed files with 878 additions and 688 deletions.
diff --git a/docs/computational/generate_expression_matrix.md b/docs/computational/generate_expression_matrix.md
@@ -1,49 +1,58 @@
-# Generating a cell-by-gene matrix
-After pairwise alignment, the same coordinate system is shared between the spatial barcodes and the
-staining images. 
+# Segmentation and single-cell quantification
+Once the ST and imaging modalities [have been aligned](pairwise_alignment.md#option-a-automated-alignment), 
+you can segment the images into single cells/nuclei, and then aggregate the spot locations into individual cells 
+for subsequent analysis.
 
-However, analysis (e.g., clustering, pseudotime, DGE...) is performed on single cells, not on individual capture areas 
-(0.6 μm in the current version of the protocol).
+## Cell segmentation from tissue image
+Let's create a new Open-ST h5 object containing a cell-by-gene expression matrix. First, you will need a cell 
+(or nuclear) segmentation mask.
 
-So, we show how to aggregate the $N\times G$ matrix ($N$ spots; $G$ genes)
-into a $M\times G$ matrix ($M$ segmented cells; $G$ genes), where $N$ maps to $M$ via the segmentation mask.
+```sh
+openst from_spacemake \
+     --project-id openst_demo_project \
+     --sample-id openst_demo_sample \
+     segment \
+     --model HE_cellpose_rajewsky # default model for segmentation of H&E images
+```
 
-## Segmentation of staining image
-To create such a spatial cell-by-gene ($M\times G$) expression matrix, you will first need a segmentation mask.
+=== "From (semi)automatic alignment"
 
-We efficiently segment cells (or nuclei) from staining images using [cellpose](https://github.com/MouseLand/cellpose).
-We provide a model that we fine-tuned for segmentation of fresh-frozen, H&E-stained tissue,
-[here](http://bimsbstatic.mdc-berlin.de/rajewsky/openst-public-data/models/HE_cellpose_rajewsky).
-You can specify any other model that works best for your data -
-refer to the [cellpose](https://cellpose.readthedocs.io/en/latest/index.html) documentation.
+     ```sh
+     openst from_spacemake \
+          --project-id openst_demo_project \
+          --sample-id openst_demo_sample \
+          segment \
+          --model HE_cellpose_rajewsky \
+          --image-in uns/spatial_pairwise_aligned/staining_image_transformed
+     ```
 
-```sh
-openst segment \
-    --h5-in spatial_stitched_spots.h5ad \ # after running the pairwise alignment
-    --image-in uns/spatial/staining_image \
-    --mask-out uns/spatial/staining_image_mask \
-    --model HE_cellpose_rajewsky
-    # --device cuda \ # uses GPU for segmentation, if available
-    # --chunked \ # specify if you run out of GPU memory - segments in chunks
-```
-By default, segmentation is extended radially 10 pixels. This can be changed with the argument `--dilate-px`.
+=== "From manual alignment"
 
-Make sure to populate the arguments with the values specific to your dataset. Here, we provide `--h5-in` consistent
-with the previous steps, `--image-in` and `--mask-out` will read and write the staining and mask inside the Open-ST h5 object,
-and `--model` is `HE_cellpose_rajewsky`, the default used in our manuscript. This is the model we recommend for H&E images, and
-weights are automatically downloaded. It is also [provided in our repo](http://bimsbstatic.mdc-berlin.de/rajewsky/openst-public-data/models/HE_cellpose_rajewsky).
-The rest of parameters can be checked with `openst segment --help`.
+     ```sh
+     openst from_spacemake \
+          --project-id openst_demo_project \
+          --sample-id openst_demo_sample \
+          segment \
+          --model HE_cellpose_rajewsky
+     ```
 
-!!! tip
-     **If your sample also contains very large cells** (e.g., adipocytes) that are not segmented with the previous parameters,
-     you can perform a second segmentation with a cellpose model, adjusting the diameter parameter.
+We segment cells (or nuclei) from staining images using [cellpose](https://github.com/MouseLand/cellpose).
+We provide a model that we fine-tuned for segmentation of fresh-frozen, H&E-stained tissue,
+[here](http://bimsbstatic.mdc-berlin.de/rajewsky/openst-public-data/models/HE_cellpose_rajewsky), but you can use
+any other model (e.g., pretrained from cellpose, like `cyto2` or `nuclei`, or your own).
+By default, the segmentation mask is dilated radially by 10 pixels (see `--dilate-px`) to account for cytoplasm surrounding
+the nucleus, as a first approximation of cell shape; whether this approximation holds depends on the tissue.
+
+??? question "I want to segment very small and very large cells..."
+
+     You can perform an additional round of segmentation by, e.g., adjusting the diameter parameter.
 
      ```sh
-     openst segment \
-          --h5-in spatial_stitched_spots.h5ad \
-          --image-in uns/spatial/staining_image \
+     openst from_spacemake \
+          --project-id openst_demo_project \
+          --sample-id openst_demo_sample \
+          segment \
           --mask-out uns/spatial/staining_image_mask_large \
-          --model HE_cellpose_rajewsky \
           --dilate-px 50 \
           --diameter 50 # diameter for the larger cell type
      ```
@@ -61,41 +70,81 @@ The rest of parameters can be checked with `openst segment --help`.
           --mask-out uns/spatial/staining_image_mask_combined
      ```
 
-You can assess the quality of the segmentation mask using `openst preview`, which leverages `napari` for visualization:
+## Quality control of segmentation
+You can assess the quality of segmentation with `openst preview`:
 
-```sh
-openst preview \
-     --h5-in spatial_stitched_spots.h5ad \
-     --image-key uns/spatial/staining_image uns/spatial/staining_image_mask
-```
+=== "From (semi)automatic alignment"
+
+     ```sh
+     openst from_spacemake \
+          --project-id openst_demo_project \
+          --sample-id openst_demo_sample \
+          preview \
+          --image-key uns/spatial_pairwise_aligned/staining_image_transformed uns/spatial/staining_image_mask
+     ```
+
+=== "From manual alignment"
+
+     ```sh
+     openst from_spacemake \
+          --project-id openst_demo_project \
+          --sample-id openst_demo_sample \
+          preview \
+          --image-key uns/spatial/staining_image uns/spatial/staining_image_mask
+     ```
 
-This will create a `napari` window with two image layers. We recommend changing the mask _image_ layer into a
+This will create a `napari` window with two image layers. Change the mask _image_ layer into a
 [_label_ layer](https://napari.org/stable/howtos/layers/labels.html), which is designed for _displaying each integer (ID
 from the segmentation mask) as a different random color, with background rendered as transparent_.
 
-## Assigning transcripts to segmented cells
-Now, we aggregate the initial $N\times G$ matrix into an $M\times G$ matrix,
-where $N$ maps to $M$ via the segmentation mask.
+If you are satisfied with the quality of the segmentation, **you are all set to continue with single-cell quantification**.
 
-This step allows you to associate capture spots with segmented cells.
+## Single-cell quantification
 
-```sh
-openst transcript_assign \
-    --h5-in spatial_stitched_spots.h5ad \
-    --spatial-key obsm/spatial_pairwise_aligned_fine \
-    --mask-in uns/spatial/staining_image_mask \
-    --h5-out spatial_per_segmented_cell.h5ad
-```
+Then, you can create a single file containing the transcriptomic information aggregated into (segmented) single cells.
+
+=== "From automatic alignment"
 
-In this case, the argument `--mask-in` was set to a single mask, but it can be set to the previously 
-introduced _combined_ masks (e.g., `--mask-in uns/spatial/staining_image_mask_combined`).
+     ```sh
+     openst from_spacemake \
+          --project-id openst_demo_project \
+          --sample-id openst_demo_sample \
+          transcript_assign \
+          --spatial-key obsm/spatial_pairwise_aligned_fine \
+          --image-key uns/spatial_pairwise_aligned/staining_image_transformed
+     ```
+
+=== "From semiautomatic alignment"
+
+     ```sh
+     openst from_spacemake \
+          --project-id openst_demo_project \
+          --sample-id openst_demo_sample \
+          transcript_assign \
+          --spatial-key obsm/spatial_manual_fine \
+          --image-key uns/spatial_pairwise_aligned/staining_image_transformed
+     ```
+
+=== "From manual alignment"
+
+     ```sh
+     openst from_spacemake \
+          --project-id openst_demo_project \
+          --sample-id openst_demo_sample \
+          transcript_assign \
+          --spatial-key obsm/spatial_manual_fine
+     ```
 
 ## Expected output
-After running the steps above, you will have a single `h5ad` file, containing the transcriptomic information per segmented cell,
-with spatial coordinates compatible with the staining image. The staining image and the segmented image are provided in this object,
+After the steps above, you will have a single `h5ad` file with transcriptomic information per segmented cell,
+with spatial coordinates aligned to the staining image. The staining image and the segmented image are provided in this object,
 so it is possible to visualize it with [squidpy](https://github.com/scverse/squidpy) or [spatialdata](https://github.com/scverse/spatialdata),
 among other tools.
 
-So, this concludes the preprocessing of 2D spatial transcriptomics and imaging data
+!!! warning
+     In the Open-ST h5 object, the cell with ID 0 will correspond to the background. Please remove it before
+     proceeding with analysis.
+
+This concludes the preprocessing of 2D spatial transcriptomics and imaging data
 of the Open-ST protocol. Next steps include 3D reconstruction, and
 downstream analysis of nD data.
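
As a rough illustration of what single-cell quantification does conceptually, the sketch below sums an $N\times G$ spot-by-gene matrix into an $M\times G$ cell-by-gene matrix by looking up the segmentation label under each spot, and then drops label 0 (background). It is not the `openst transcript_assign` implementation: the function name `aggregate_spots_to_cells`, the (x, y) column order of the coordinates, and the toy data are assumptions made for this example.

```python
import numpy as np
from scipy import sparse


def aggregate_spots_to_cells(counts, coords, mask):
    """Sum an N x G spot-by-gene matrix into an M x G cell-by-gene matrix.

    counts: (N, G) array or sparse matrix of transcript counts per spot.
    coords: (N, 2) spot coordinates as (x, y) in mask pixel units (assumed order).
    mask:   2D integer label image; 0 is background, k > 0 is cell k.
    """
    counts = sparse.csr_matrix(counts)
    n_spots = counts.shape[0]

    # Pixel under each spot: row index comes from y, column index from x.
    rows = np.round(coords[:, 1]).astype(int)
    cols = np.round(coords[:, 0]).astype(int)

    # Spots that fall outside the image are treated as background (label 0).
    inside = (rows >= 0) & (rows < mask.shape[0]) & (cols >= 0) & (cols < mask.shape[1])
    labels = np.zeros(n_spots, dtype=mask.dtype)
    labels[inside] = mask[rows[inside], cols[inside]]

    # Map each label to a consecutive row index of the output matrix.
    cell_ids, cell_index = np.unique(labels, return_inverse=True)
    cell_index = np.asarray(cell_index).ravel()

    # (cells x spots) indicator matrix; multiplying sums the spots of each cell.
    assign = sparse.csr_matrix(
        (np.ones(n_spots), (cell_index, np.arange(n_spots))),
        shape=(len(cell_ids), n_spots),
    )
    cell_counts = assign @ counts

    # Drop the background "cell" (label 0) before any downstream analysis.
    keep = np.flatnonzero(cell_ids != 0)
    return cell_ids[keep], cell_counts[keep]


# Toy data: a 2 x 2 mask with two cells, five spots, two genes.
mask = np.array([[0, 1],
                 [2, 2]])
coords = np.array([[1, 0], [0, 1], [1, 1], [0, 0], [5, 5]], dtype=float)
counts = np.array([[1, 0], [0, 2], [3, 1], [7, 7], [9, 9]])

cell_ids, cell_counts = aggregate_spots_to_cells(counts, coords, mask)
print(cell_ids)               # [1 2]
print(cell_counts.toarray())  # rows: cell 1 -> [1, 0], cell 2 -> [3, 3]
```

The indicator-matrix product keeps the aggregation sparse, which matters when $N$ is in the millions. Spots over background pixels or outside the image are grouped under label 0 and removed at the end, mirroring the warning above about the cell with ID 0.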
diff --git a/docs/computational/getting_started.md b/docs/computational/getting_started.md
@@ -2,28 +2,19 @@
 
 After following the [experimental protocol](../experimental/getting_started.md), we provide the [`openst`](https://pypi.org/project/openst/)
 python package for transforming the raw sequencing data into objects that can be used for spatial, single-cell
-analysis.
-
-More specifically, our pipeline consists of the following steps:
+analysis, in four steps:
 
 1. [Preprocessing of sequencing](preprocessing_sequencing.md)
-2. [Preprocessing of imaging](preprocessing_imaging.md)
-3. [Align image to transcriptome](pairwise_alignment.md): the spatial coordinates of transcripts are aligned
-    to the imaging modality.
-4. [Generating a cell-by-gene matrix](generate_expression_matrix.md): transcripts
-    are quantified per cell using the segmentation information.
-5. [3D reconstruction](threed_reconstruction.md) of tissue imaging and transcriptome from serial sections.
+2. [Pairwise alignment](pairwise_alignment.md): the spatial coordinates of transcriptomics data are aligned
+    to tissue imaging.
+3. [Segmentation and single-cell quantification](generate_expression_matrix.md): transcriptomic data
+    are aggregated into single cells using the information from cell segmentation of tissue images.
+4. [3D reconstruction](threed_reconstruction.md) of tissue imaging and transcriptome from serial sections.
    *We provide tutorials for interactive visualization of 3D data.*
 
-If you're familiar with Python, you can install `openst` with [pip], the Python package manager.
-If not, we recommend using [docker].
-
-[pip]: #with-pip
-[docker]: #with-docker
-
 ## Installation
 
-### with pip <small>recommended</small>
+### with pip, <small>recommended</small>
 
 The computational tools of the Open-ST workflow are published as a [Python package]
 and can be installed with `pip`, ideally by using a [virtual environment].
@@ -66,6 +57,16 @@ pip install openst
     pip install napari
     ```
 
+### from git
+
+`openst` can be directly installed from the source [GitHub repository]:
+```
+git clone https://github.com/rajewsky-lab/openst.git
+pip install -e openst
+```
+
+  [GitHub repository]: https://github.com/rajewsky-lab/openst
+
 ### with docker
 
 The official [Docker image] is a great way to get up and running in a few
@@ -107,22 +108,4 @@ Now, you can execute PyQt5-based applications, and the GUI will be displayed on
       xhost -
       ```
 
-    [Docker image]: https://hub.docker.com/r/rajewsky/openst/
-
-### with git
-
-`openst` can be directly used from [GitHub] by cloning the
-repository into a subfolder of your project root which might be useful if you
-want to use the very latest version:
-
-```
-git clone https://github.com/rajewsky-lab/openst.git
-```
-
-Next, install the theme and its dependencies with:
-
-```
-pip install -e openst
-```
-
-  [GitHub]: https://github.com/rajewsky-lab/openst
+    [Docker image]: https://hub.docker.com/r/rajewsky/openst/