This document describes the essentials needed to run the main looptrace
processing pipeline. This is primarily for end-users and does not describe much about the software design or packaging.
To be able to run this pipeline on the lab machine, these are the basic requirements:
- Have an account on the machine: you should be able to authenticate with your normal username/password combination used for most authentication within the institute.
- Be in the `docker` group: if you've not run something with `docker` on this machine previously, you're most likely not in this group. Ask Vince or Chris to add you.
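To check whether you're already in the group, you can list your group memberships with standard Linux tooling:

```shell
# List your groups, one per line, and look for "docker".
id -nG | tr ' ' '\n' | grep -x docker || echo "not in docker group"
```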
- Main experiment folder (`CURR_EXP_HOME` environment variable): on the cluster and on the lab machine, this is often something like `/path/to/experiments/folder/Experiments_00XXXX/00XXXX`, but it could be anything so long as the substructure matches what's expected / defined in the config file.
    - Images subfolder (created on lab machine or cluster): something like `images_all`, but it just needs to match the value you'll give with the `--images-folder` argument when running the pipeline.
        - Raw nuclei images subfolder: something like `nuc_images_raw`, though it just must match the corresponding key in the config file
        - FISH images subfolder: something like `seq_images_raw`, though it just must match the corresponding key in the config file
    - Pypiper subfolder (created on lab machine or on cluster): something like `pypiper_output`, where pipeline logs and checkpoints are written; you will pass this to the pipeline runner through the `--pypiper-folder` argument when you run the pipeline.
    - Analysis subfolder (created on lab machine or on cluster): something like `2023-08-10_Analysis01`, though it can be anything and just must match the name of the subfolder you supply in the config file, as the leaf of the path in the `analysis_path` value
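As a sketch only, the layout above could be created like this. The folder names are examples and must match your config file and command-line arguments; note also that placing the raw nuclei and FISH subfolders inside the images folder is an assumption here — confirm the nesting your own config expects.

```shell
# Hypothetical demo path; on the real machine this would be your
# experiment folder (CURR_EXP_HOME).
CURR_EXP_HOME=/tmp/looptrace_demo/Experiments_00XXXX/00XXXX
mkdir -p "$CURR_EXP_HOME/images_all/nuc_images_raw" \
         "$CURR_EXP_HOME/images_all/seq_images_raw" \
         "$CURR_EXP_HOME/pypiper_output" \
         "$CURR_EXP_HOME/2023-08-10_Analysis01"
# Inspect the result:
find "$CURR_EXP_HOME" -type d | sort
```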
The imaging rounds configuration file declares the imaging rounds executed over the course of an experiment. These data should take the form of a mapping, stored as a JSON object; the sections are described in the separate document for the imaging rounds configuration file. An example is available in the test data.
Looptrace uses a configuration file to define values for processing parameters and pointers to places on disk from which to read and write data. The path to the configuration file is a required parameter in order to run the pipeline, and it should be created (or copied and edited) before running anything. For requirements and suggestions on settings, refer to the separate documentation for the parameters configuration file.
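As an illustration only — the two keys below are the only ones named in this document, and any real config will have more, per the separate parameters documentation:

```yaml
# Illustrative fragment, not a complete parameters config.
analysis_path: /home/experiment/2023-08-10_Analysis01  # leaf must match the analysis subfolder name
decon_iter: 0  # 0 skips deconvolution, so plain docker (not nvidia-docker) suffices
```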
The signal analysis configuration file is optional; if present, it tells `looptrace` how to analyze signal from imaging channels other than the one(s) in which spots were detected, in regions defined by those detected spots. This is a JSON file and can be passed as the argument to the `--signal-config` option when running the pipeline. For more about this file format and what's required, refer to the separate documentation for the signal analysis configuration file.
Once you have the minimal requirements, this will be the typical workflow for running the pipeline:
- Login to the machine: something like `ssh username@ask-Vince-or-Chris-for-the-machine-domain`
- Path creation: ensure that the necessary filepaths exist; particularly easy to forget are the path to the folder in which analysis output will be placed (the value of `analysis_path` in the parameters config file), and the path to the folder in which the pipeline will place its own files (the `--pypiper-folder` argument at the command-line). See the data layout section.
- `tmux`: attach to an existing `tmux` session, or start a new one. See the tmux section for more info.
- Docker: start the relevant Docker container, using just `docker` rather than `nvidia-docker` if you don't want to run deconvolution (setting `decon_iter` to 0; see the parameters configuration file):
  ```shell
  nvidia-docker run --rm -it -u root -v '/groups/gerlich/experiments/.../00XXXX':/home/experiment looptrace:2024-04-05b bash
  ```
- Run pipeline: once in the Docker container, run the pipeline, replacing the file and folder names as needed / desired:
  ```shell
  python /looptrace/bin/cli/run_processing_pipeline.py \
      --rounds-config /home/experiment/looptrace_00XXXX__rounds.json \
      --params-config /home/experiment/looptrace_00XXXX__params.yaml \
      --signal-config /home/experiment/looptrace_00XXXX__signal.json \
      --images-folder /home/experiment/images_all \
      --pypiper-folder /home/experiment/pypiper_output
  ```
- Detach: `Ctrl+b d`; for more, see the tmux section.
NB: Make each path absolute. Relative paths may not behave well; try to always use absolute file/folder paths for each command-line argument.
NB: Remember to place single quotes around the filepath (experiment folder) you're making available as a volume (`-v`) to the Docker container. While often not necessary, this will protect you if your filepath contains spaces.
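To see why the quotes matter, here's a small illustration (the path is hypothetical) of how an unquoted value containing a space word-splits into two separate arguments:

```shell
EXP='/groups/example lab/experiments/00XXXX'  # hypothetical path with a space
# Unquoted, the shell splits the value at the space (two words);
# quoted, it stays a single argument:
printf '%s\n' $EXP | wc -l    # two lines: two separate words
printf '%s\n' "$EXP" | wc -l  # one line: one argument
```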
In this context, for running the pipeline, think of the terminal multiplexer (`tmux`) as a way to start a long-running process and be assured that an interruption in Internet connectivity (e.g., computer sleep or network failure) won't also interrupt that process. If your connection to the remote machine is interrupted but you've started your long-running process in `tmux`, that process will keep running.
- Start a session: `tmux`
- Detach from the active session: `Ctrl+b d` (i.e., press `Ctrl` and `b` at the same time, and then `d` afterward)
- List sessions: `tmux list-sessions`
- Attach to a session: `tmux attach -t <session-number>`
- Detach again: `Ctrl+b d`
- (Eventually) stop a session: `tmux kill-session -t <session-number>`

For more, search for "tmux key bindings" or similar.