diff --git a/src/main/java/org/broadinstitute/hellbender/tools/copynumber/GermlineCNVCaller.java b/src/main/java/org/broadinstitute/hellbender/tools/copynumber/GermlineCNVCaller.java
index 301e2d7e275..418e95296ae 100644
--- a/src/main/java/org/broadinstitute/hellbender/tools/copynumber/GermlineCNVCaller.java
+++ b/src/main/java/org/broadinstitute/hellbender/tools/copynumber/GermlineCNVCaller.java
@@ -97,6 +97,32 @@
  * https://theano-pymc.readthedocs.io/en/latest/library/config.html.
  *
  *
+ * Runtime and memory usage for {@link GermlineCNVCaller} can be impacted by (1) the number of input samples, (2) the
+ * number of intervals, (3) the highest allowed copy-number state (set using the {@code max-copy-number} argument),
+ * (4) the number of bias factors (set using the {@code max-bias-factors} argument), and (5) the convergence criteria.
+ *
+ * We recommend running {@link GermlineCNVCaller} in COHORT mode for approximately 200 samples at a time, processing
+ * between 5k and 12.5k intervals, with {@code max-copy-number} set to 5 across all analyses. For 200 samples and
+ * 5k intervals, approximately 16GB of memory should be sufficient; for the same
+ * analysis at 12.5k intervals, we recommend 32GB of memory. Runtimes are on the order of a few hours.
+ *
+ * Note that {@link GermlineCNVCaller} can be run on larger interval sets by scattering them into smaller "shards."
+ * In cloud and HPC environments, the shards can be processed in parallel. The per-shard results can subsequently
+ * be merged together by the {@link PostprocessGermlineCNVCalls} tool.
+ *
+ * By default, {@link GermlineCNVCaller} will attempt to use all CPU cores accessible to it within the runtime
+ * environment. Two environment variables - {@code MKL_NUM_THREADS} and {@code OMP_NUM_THREADS} - control the
+ * parallelism of the underlying linear algebra libraries.
+ * Runtime is also affected by how fast the inference procedure converges. Multiple tool arguments can
+ * be used to adjust the convergence criteria and thereby speed up convergence, including but not limited to
+ * {@code caller-update-convergence-threshold}, {@code convergence-snr-averaging-window},
+ * {@code convergence-snr-countdown-window}, and {@code convergence-snr-trigger-threshold}. However, changing these
+ * arguments from their default settings might affect the final results, so please exercise caution when
+ * modifying any of them.
+ * *
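
As an illustration of the threading note in the Javadoc above, here is a minimal sketch of how a wrapper could cap `MKL_NUM_THREADS` and `OMP_NUM_THREADS` before launching the tool as a child process. The class `ThreadCappedLauncher` and the method `withThreadCap` are hypothetical names that are not part of GATK, and the command shown (`gatk GermlineCNVCaller --help`) is only a placeholder. The sketch builds the process configuration without starting it.

```java
import java.util.List;
import java.util.Map;

// Hypothetical helper, not part of GATK: prepares a child process whose
// environment limits the threads used by MKL- and OpenMP-backed libraries.
final class ThreadCappedLauncher {

    // Returns a configured (but not started) ProcessBuilder for the given command.
    public static ProcessBuilder withThreadCap(List<String> command, int threads) {
        if (threads < 1) {
            throw new IllegalArgumentException("threads must be >= 1");
        }
        ProcessBuilder pb = new ProcessBuilder(command);
        Map<String, String> env = pb.environment();
        // The two variables named in the Javadoc; both are set to the same cap here.
        env.put("MKL_NUM_THREADS", Integer.toString(threads));
        env.put("OMP_NUM_THREADS", Integer.toString(threads));
        // Forward the child's stdout/stderr to this process once it is started.
        return pb.inheritIO();
    }

    public static void main(String[] args) {
        // Placeholder command; a real run would pass the full tool arguments.
        ProcessBuilder pb = withThreadCap(List.of("gatk", "GermlineCNVCaller", "--help"), 4);
        System.out.println("command: " + pb.command());
        System.out.println("OMP_NUM_THREADS=" + pb.environment().get("OMP_NUM_THREADS"));
        // Call pb.start() to actually launch the process.
    }
}
```

Setting both variables to the same value is a simplifying assumption; on a shared node, the cap would typically match the number of cores allocated to the job.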