From 07fcd7148e40da1586beee9dfcf2696caa116731 Mon Sep 17 00:00:00 2001
From: Min Xu
Date: Fri, 26 May 2017 00:42:34 -0400
Subject: [PATCH 1/2] Add jobmax and jobmin to the Titan's batch queue

Add jobmin=0 and jobmax=299008 to Titan's default batch queue in
config_batch.xml. 299008 is the total number of physical cores on
Titan. This change lets all jobs run in the default batch queue and
avoids the failure caused by submitting more than one job to the debug
queue simultaneously, since Titan allows only one debug job at a time.
This is a better solution than the one in the reverted PR #1534.

[BFB]
---
 config/acme/machines/config_batch.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/config/acme/machines/config_batch.xml b/config/acme/machines/config_batch.xml
index 39420fc7f29..4869c3d76db 100644
--- a/config/acme/machines/config_batch.xml
+++ b/config/acme/machines/config_batch.xml
@@ -392,7 +392,7 @@
       -l nodes={{ num_nodes }}
-      batch
+      batch
       debug

From 42fa25049e7022c4ebb1b1c4c1fadeab448da2d6 Mon Sep 17 00:00:00 2001
From: Min Xu
Date: Tue, 30 May 2017 22:04:16 -0400
Subject: [PATCH 2/2] Change jobmax for Titan debug queue

Since there is no limit on how many cores a debug job can use on Titan,
the jobmax for the debug queue is changed from 64 to the maximum number
of cores available (299008).

[BFB]
---
 config/acme/machines/config_batch.xml | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/config/acme/machines/config_batch.xml b/config/acme/machines/config_batch.xml
index 4869c3d76db..8fa77abdfd9 100644
--- a/config/acme/machines/config_batch.xml
+++ b/config/acme/machines/config_batch.xml
@@ -393,7 +393,7 @@
       batch
-      debug
+      debug