
Commit

Merge pull request #32 from rajewsky-lab/fast_cmdline
Fast cmdline
danilexn authored Apr 12, 2024
2 parents 13ab58e + ea3ec41 commit a8687b1
Showing 28 changed files with 1,742 additions and 1,604 deletions.
30 changes: 13 additions & 17 deletions docs/computational/generate_expression_matrix.md
@@ -19,15 +19,15 @@ refer to the [cellpose](https://cellpose.readthedocs.io/en/latest/index.html) documentation

```sh
openst segment \
--h5-in <path_to_aligned_h5ad> \
--image-in <image_in_path> \
--mask-out <mask_out_path> \
--model <path>/HE_cellpose_rajewsky
# --device cuda \ # uses GPU for segmentation, if available
# --chunked \ # specify if you run out of GPU memory - segments in chunks
```
By default, segmentation is extended radially 10 pixels. This can be changed with the argument `--dilate-px`.
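For example, a sketch reusing the placeholders above to extend the mask by 20 pixels instead of the default 10:

```sh
# Same command as above, overriding the default dilation radius
openst segment \
    --h5-in <path_to_aligned_h5ad> \
    --image-in <image_in_path> \
    --mask-out <mask_out_path> \
    --model <path>/HE_cellpose_rajewsky \
    --dilate-px 20
```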

Make sure to replace the placeholders (`<...>`). For instance,
`<path_to_aligned_h5ad>` is the full path to the `h5ad` file [after pairwise alignment](pairwise_alignment.md#expected-output);
`<image_in_path>` is the path to the image - a path to a file, or a location inside the `h5ad` file,
@@ -45,13 +45,10 @@ for segmentation of H&E images. The rest of parameters can be checked with `openst segment --help`.

```sh
openst segment \
--h5-in <path_to_aligned_h5ad> \
--image-in <image_in_path> \
--mask-out <mask_out_path_larger> \
--model <path>/HE_cellpose_rajewsky \
--dilate-px 50 \
--diameter 50 # diameter for the larger cell type
```
@@ -65,7 +62,7 @@ for segmentation of H&E images. The rest of parameters can be checked with `openst segment --help`.

```sh
openst segment_merge \
--h5-in <path_to_aligned_h5ad> \
--mask-in <mask_a> <mask_b> \
--mask-out <mask_combined>
```
@@ -81,11 +78,10 @@ This step allows you to associate capture spots with segmented cells.

```sh
openst transcript_assign \
--h5-in <path_to_aligned_h5ad> \
--spatial-key spatial_pairwise_aligned_fine \
--mask-in <mask_out_path> \
--h5-out <path_to_sc_h5ad>
```

Replace the placeholders (`<...>`) as before; in this case, the placeholder `<mask_out_path>` must be set to
152 changes: 61 additions & 91 deletions docs/computational/pairwise_alignment.md
@@ -18,7 +18,7 @@ The alignment workflow consists of two steps, that can be performed [automatical
## Required input data
For automatic and manual alignment, two inputs are required: (1) a stitched tile-scan of
the staining image (see [Preprocessing of imaging](preprocessing_imaging.md)), and (2) a
single [h5ad] file containing all the [barcoded tiles](preprocessing_sequencing.md#flow-cell-related-terms) of a sample
(see [Preprocessing of sequencing](preprocessing_sequencing.md)).

!!! warning
@@ -56,135 +56,105 @@ spots.
[h5ad]: https://anndata.readthedocs.io/en/latest/fileformat-prose.html

## Automated workflow
If you want to save time, we provide a script that performs the coarse and fine steps of alignment
automatically, by leveraging computer vision algorithms. To do so, make sure that you have the [necessary
input data](#required-input-data); then, open a terminal and run the following command (just an example):

```bash
openst pairwise_aligner \
--image-in Image_Stitched_Composite.tif \
--h5-in spatial_stitched_spots.h5ad \
--metadata alignment_metadata.json
# --device cuda # by default, cpu. Only specify if you have a CUDA-compatible GPU
# --only-coarse # skips the fine fiducial registration, in case you want to do that manually
```

Make sure to specify a path where the metadata output file should be created via `--metadata`;
this will be useful for a later visual assessment of whether the automated alignment worked.

If you want to run only the coarse phase of the pairwise alignment (i.e., to run the fine
alignment [yourself](#manual-workflow)), you can specify the argument `--only-coarse`.

!!! note
    For aligning STS to H&E-stained tissues, **we recommend** leaving the arguments with the default values.
    In any case, you can get a full list of configurable parameters by running `openst pairwise_aligner --help`.

!!! tip
    Right after automatic alignment, and before proceeding to segmentation and
    aggregation into a cell-by-gene matrix, **we strongly recommend** visually
    assessing the alignment results: specifically, that the tissue is overall
    well-aligned in both modalities, and that the fiducial markers are
    overlapping across all tiles.


### Visual assessment with report
You can generate an HTML report that contains a qualitative summary of the alignment (images, parameters...)

```sh
openst report --metadata=alignment_metadata.json --html-out=alignment_report.html
```


### Visual assessment with GUI
Alternatively, you can visualize the images & ST data interactively with the GUI:
```sh
openst manual_pairwise_aligner_gui
```

We provide a Graphical User Interface (GUI) for selecting keypoints between imaging & ST modalities,
for visualization and refinement of automatic results. This GUI requires a single Open-ST h5 object,
the output of `openst pairwise_aligner`.

---
## Manual/semiautomatic workflow
We provide a Graphical User Interface (GUI) for selecting keypoints between imaging & ST modalities,
for full manual alignment or refinement of automatic results. This GUI requires a single Open-ST h5 object
(after spatial stitching). There are two kinds of workflow:

=== "Fully manual alignment"

    ``` sh
    # Add the image data to the Open-ST h5 object
    openst merge_modalities \
        --h5-in spatial_stitched_spots.h5ad \
        --image-in Image_Stitched_Composite.tif

    # Use the GUI to select the keypoints.json file
    openst manual_pairwise_aligner_gui

    # Compute a rigid transformation from keypoints.json and apply it to the data
    openst manual_pairwise_aligner \
        --keypoints-in keypoints.json \
        --h5-in spatial_stitched_spots.h5ad \
        --spatial-key-in 'obsm/spatial' \
        --spatial-key-out 'obsm/spatial_manual_transformed'
        ## --per-tile # when specified, there's one rigid transform per tile
    ```

=== "Semiautomatic alignment"

    ``` sh
    # Use the GUI to select the keypoints.json file
    openst manual_pairwise_aligner_gui

    # Compute a rigid transformation from keypoints.json and apply it to the data
    openst manual_pairwise_aligner \
        --keypoints-in keypoints.json \
        --h5-in spatial_stitched_spots.h5ad \
        --spatial-key-in 'obsm/spatial_pairwise_aligned_fine' \
        --spatial-key-out 'obsm/spatial_pairwise_aligned_refined'
        ## --per-tile # when specified, there's one rigid transform per tile
    ```

We provide a video showcasing the GUI, with an illustrative example of refinement from (coarse) automated alignment.

---

:fontawesome-brands-youtube:{ style="color: #EE0F0F" }
__[Walkthrough of the GUI for manual alignment]__ by @danilexn – :octicons-clock-24:
5m – Learn how to visualize and align STS and imaging data in a step-by-step guide.

[Walkthrough of the GUI for manual alignment]: https://www.youtube.com
---

## Expected output
After running the automatic or manual alignment, you must have a single `h5ad` file, containing the transformed spatial coordinates.
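As an optional sanity check — a sketch, not part of the official pipeline — you can list the `obsm` layers of the output to confirm the transformed coordinates were written. This assumes Python 3 with the `h5py` package, and that alignment wrote back into `spatial_stitched_spots.h5ad` (adjust the filename to your sample):

```sh
# Sketch: list the obsm layers of the aligned Open-ST h5 object.
# Assumes Python 3 with h5py installed; the filename is illustrative.
python - <<'EOF'
import h5py

with h5py.File("spatial_stitched_spots.h5ad", "r") as f:
    # After fine alignment, expect a key such as 'spatial_pairwise_aligned_fine'
    print(list(f["obsm"].keys()))
EOF
```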
4 changes: 2 additions & 2 deletions docs/computational/preprocessing_imaging.md
@@ -23,10 +23,10 @@ Open a terminal, and run the following command:
openst image_stitch \
--microscope='keyence' \
--imagej-bin=<path_to_fiji_or_imagej> \
--image-indir=<path_to_tiles> \
--tiles-prefix=<to_read> \
--tmp-dir=<tmp_dir> \
--image-out=<output_image>
```
Make sure to replace the placeholders (`<...>`). For instance,
`<path_to_fiji_or_imagej>` is the path where the [Fiji](https://imagej.net/software/fiji/downloads) executable is;
22 changes: 10 additions & 12 deletions docs/computational/preprocessing_sequencing.md
@@ -48,8 +48,8 @@ encoded in the `bcl` and `fastq` files. To obtain per-tile barcodes and coordinates:

```sh
openst barcode_preprocessing \
--fastq-in <fastq_of_tile> \
--tilecoords-out <out_path> \
--out-suffix <out_suffix> \
--out-prefix <out_prefix> \
--crop-seq <len_int> \
@@ -65,7 +65,7 @@ files will be written; `<out_suffix>` and `<out_prefix>` are suffixes and prefix
must be written into the `csv` as their reverse-complementary; `--single-tile` argument is provided when the `fastq` file only contains data for
a single tile (**our recommendation**).

The code above will generate a file in `<out_path>` per tile. Only a single fastq file can be provided at a time via `--fastq-in`. To
process this in parallel, you can run the following snippets (in Linux, assuming you start from the `fastq` files). We assume that
you have a file `lanes_and_tiles.txt`, that contains the tile identifiers that you want to process; you can generate this file with:
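For instance, assuming the standard Illumina `RunInfo.xml` layout — where tiles are listed as, e.g., `<Tile>1_1101</Tile>` — one possible sketch is:

```sh
# Sketch (assumes standard Illumina RunInfo.xml): extract "<lane>_<tile>"
# identifiers such as 1_1101 into lanes_and_tiles.txt, one per line
grep -oE '<Tile>[^<]+</Tile>' RunInfo.xml \
    | sed -E 's|</?Tile>||g' \
    > lanes_and_tiles.txt
```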

@@ -77,11 +77,10 @@ where `RunInfo.xml` is a file contained in the basecalls directory. *We don't ensure that
this code snippet works* 🙈. Then, you can process various `fastq` files in the basecalls directory as follows:

```sh
cat lanes_and_tiles.txt | xargs -n 1 -P <parallel_processes> -I {} \
sh -c 'openst barcode_preprocessing \
--fastq-in <fastq_dir>/{}/Undetermined_S0_R1_001.fastq.gz \
--tilecoords-out <out_path> \
--out-suffix .txt \
--out-prefix <out_prefix>"{}" \
--crop-seq <len_int> \
@@ -99,14 +98,13 @@ Otherwise, if you start from `bcl` files (raw basecalls), you can run demultiplexing
simultaneously to generating the barcode spatial coordinate file:

```sh
cat lanes_and_tiles.txt | xargs -n 1 -P <parallel_processes> -I {} \
sh -c 'bcl2fastq -R <bcl_in> --no-lane-splitting \
-o <bcl_out>/"{}" --tiles s_"{}"; \
openst barcode_preprocessing \
--fastq-in <bcl_out>/{}/Undetermined_S0_R1_001.fastq.gz \
--tilecoords-out <out_path> \
--out-suffix .txt \
--out-prefix <out_prefix>"{}" \
--crop-seq <len_int> \
@@ -161,7 +159,7 @@ To manually create 'puck_collection' files, you can run the following in a terminal:
openst spatial_stitch \
--tiles <space_separated_list_or_wildcards_to_h5ad> \
--tile-coordinates <path_to_coordinate_system> \
--h5-out <output_puck_collection_h5ad>
```
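For instance, a sketch assuming per-tile files named `fc_1_<lane>_<tile>.h5ad` under a `tiles/` directory (illustrative names, not from the original docs):

```sh
# Hypothetical example: stitch all per-tile h5ad files via a shell wildcard
openst spatial_stitch \
    --tiles tiles/fc_1_*.h5ad \
    --tile-coordinates coordinate_system.csv \
    --h5-out spatial_stitched_spots.h5ad
```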

This program has additional arguments that are explained when running `openst spatial_stitch --help`. Make sure to replace
15 changes: 7 additions & 8 deletions docs/examples/adult_mouse/generate_expression_matrix.md
@@ -20,16 +20,16 @@ contains the spatial transcriptome coordinates and staining image after coarse+fine alignment:

```sh
openst segment \
--h5-in alignment/openst_demo_adult_mouse_spatial_beads_puck_collection_aligned.h5ad \
--image-in 'uns/spatial_pairwise_aligned/staining_image_transformed' \
--mask-out 'uns/spatial_pairwise_aligned/mask_transformed_0px' \
--model models/HE_cellpose_rajewsky \
--chunked \
--device cuda \
--num-workers 8
```

After running this command, the segmentation mask is created and stored in the same `--h5-in` file, under
the dataset `uns/spatial_pairwise_aligned/mask_transformed_0px`.

## Assigning transcripts to segmented cells
Expand All @@ -44,11 +44,10 @@ This step allows you to aggregate capture spots by segmented cells:

```sh
openst transcript_assign \
--h5-in alignment/openst_demo_adult_mouse_spatial_beads_puck_collection_aligned.h5ad \
--spatial-key spatial_pairwise_aligned_fine \
--mask-in 'uns/spatial_pairwise_aligned/mask_transformed_0px' \
--h5-out alignment/openst_demo_adult_mouse_by_cell.h5ad
```

## Expected output
2 changes: 1 addition & 1 deletion docs/examples/adult_mouse/pairwise_alignment.md
@@ -66,7 +66,7 @@ coarse alignment, and the keypoints file, to perform the fine alignment:

```sh
openst manual_pairwise_aligner \
--keypoints-in alignment/openst_adult_demo_fine_keypoints.json \
--h5-in alignment/openst_demo_adult_mouse_spatial_beads_puck_collection_aligned.h5ad \
--fine
```
2 changes: 1 addition & 1 deletion docs/examples/adult_mouse/preprocessing_sequencing.md
@@ -162,4 +162,4 @@ If you specified options for *meshing* in the `run_mode`, there will be a file c
This contains *approximate* cell-by-gene information, as the transcripts are aggregated by a regular lattice and not by the true spatial arrangement of
cells. This might be already enough for some analyses.

Anyway... keep going with the tutorial if you want to unleash the full potential of Open-ST.
