RELEASE NOTES FOR SLURM VERSION 13.12
29 August 2013
IMPORTANT NOTE:
If using the slurmdbd (SLURM DataBase Daemon) you must update this first.
The 13.12 slurmdbd will work with SLURM daemons of version 2.5 and above.
You will not need to update all clusters at the same time, but it is very
important to update the slurmdbd first and have it running before updating
any other clusters making use of it. No real harm will come from updating
your systems before the slurmdbd, but they will not talk to each other
until you do. Also, at least the first time you run the slurmdbd, make sure
your my.cnf file has innodb_buffer_pool_size set to at least 64M. You can
accomplish this by adding the line
innodb_buffer_pool_size=64M
under the [mysqld] section of the my.cnf file and restarting mysqld.
This is needed when converting large tables over to the new database schema.
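
For example, a minimal my.cnf fragment with this setting (the file location
varies by distribution) would be:

    [mysqld]
    innodb_buffer_pool_size=64M
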
SLURM can be upgraded from version 2.5 or 2.6 to version 13.12 without loss of
jobs or other state information. Upgrading directly from an earlier version of
SLURM will result in loss of state information.
HIGHLIGHTS
==========
-- Added partition configuration parameters AllowAccounts, AllowQOS,
DenyAccounts and DenyQOS to provide greater control over use.
-- Added the ability to perform load-based scheduling, allocating resources
   to jobs on the nodes with the largest number of idle CPUs.
-- Added a mechanism for a job_submit plugin to generate an error message
   for srun, salloc or sbatch on stderr. A new argument was added to the
   job_submit function in the plugin (see the sketch after this list).
-- Support for the Postgres database has long been out of date and
   problematic, so it has been removed entirely. If you would like to use
   it, the code still exists in version 2.6 and earlier, but it will not be
   included in this or future versions of the code.
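
Below is a minimal sketch of the changed job_submit hook in a C job_submit
plugin. The required plugin boilerplate symbols are omitted, the validation
rule is purely illustrative, and it assumes the plugin has access to Slurm's
xstrdup() (slurmctld frees the returned message):

    #include <slurm/slurm.h>
    #include <slurm/slurm_errno.h>
    #include "src/common/xstring.h"

    extern int job_submit(struct job_descriptor *job_desc, uint32_t submit_uid,
                          char **err_msg)
    {
        /* The new third argument: a message set here is printed on stderr
         * by srun, salloc or sbatch when the job is rejected. */
        if (job_desc->time_limit == NO_VAL) {
            *err_msg = xstrdup("a time limit must be specified");
            return ESLURM_INVALID_TIME_LIMIT;
        }
        return SLURM_SUCCESS;
    }
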
RPMBUILD CHANGES
================
-- The rpmbuild option for a Cray system with ALPS has changed from
%_with_cray to %_with_cray_alps.
-- CRAY - Do not package Slurm's libpmi or libpmi2 libraries. The Cray version
of those libraries must be used.
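
For example, rpmbuild's standard --with mechanism can be used to set the new
macro (the tarball name is illustrative):

    rpmbuild -ta slurm-13.12.0.tar.bz2 --with cray_alps
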
CONFIGURATION FILE CHANGES (see "man slurm.conf" for details)
=============================================================
-- Added JobContainerPlugin configuration parameter and plugin infrastructure.
-- Added partition configuration parameters AllowAccounts, AllowQOS,
DenyAccounts and DenyQOS to provide greater control over use.
-- SelectType: select/cray has been renamed select/alps.
-- Added SelectTypeParameters option of "CR_LLN" and partition parameter of
   "LLN=yes|no" to enable load-based scheduling (see the example after this
   list).
-- In "SchedulerParameters", changed the default max_job_bf value from 50
   to 100.
-- Added "JobAcctGatherParams" configuration parameter. Value of "NoShare"
disables accounting for shared memory.
-- Added FairShareDampeningFactor configuration parameter to offer a greater
priority range based upon utilization.
-- Changed MaxArraySize from a 16-bit to a 32-bit field.
-- Added fields to "scontrol show job" output: boards_per_node,
sockets_per_board, ntasks_per_node, ntasks_per_board, ntasks_per_socket,
ntasks_per_core, and nice.
-- Added new configuration parameter of AuthInfo to specify port used by
authentication plugin.
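
For example, a slurm.conf fragment exercising several of the options above
might look like the following (the node, account and QOS names and all
numeric values are purely illustrative):

    SelectType=select/cons_res
    SelectTypeParameters=CR_Core,CR_LLN
    JobAcctGatherParams=NoShare
    FairShareDampeningFactor=5
    MaxArraySize=10001
    PartitionName=debug Nodes=tux[0-31] LLN=yes AllowAccounts=physics,chemistry DenyQOS=low
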
DBD CONFIGURATION FILE CHANGES (see "man slurmdbd.conf" for details)
====================================================================
COMMAND CHANGES (see man pages for details)
===========================================
-- Added sbatch --signal option of "B:" to signal the batch shell rather than
only the spawned job steps.
-- Added sinfo and squeue format option of "%all" to print all fields available
for the data type with a vertical bar separating each field.
-- Add StdIn, StdOut, and StdErr paths to job information dumped with
"scontrol show job".
-- Permit Slurm administrator to submit a batch job as any user.
-- Added -I|--item-extract option to sh5util to extract data item from series.
-- Add squeue output format options for job command and working directory
(%o and %Z respectively).
-- Add stdin/out/err to sview job output.
-- Added a new option to the scontrol command to view licenses that are
   configured, in use and available: 'scontrol show licenses'.
-- Permit jobs in finished state to be requeued.
-- Added a new option to scontrol to put a requeued job on hold. A requeued
job can be put in a new special state called SPECIAL_EXIT indicating
the job has exited with a special value.
"scontrol requeuehold state=SpecialExit 123".
-- Pending job steps will have step_id of INFINITE rather than NO_VAL and
will be reported as "TBD" by scontrol and squeue commands.
-- Added sgather tool to gather files from a job's compute nodes into a
central location.
-- Added -S/--core-spec option to salloc, sbatch and srun commands to reserve
specialized cores for system use.
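
Illustrative invocations of several of the new command options (script and
file names are hypothetical; sgather must be run from within an allocation):

    sbatch --signal=B:USR1@60 job.sh    # signal the batch shell, not just the
                                        # job steps, 60s before the time limit
    squeue -o "%all"                    # print every available field, bar separated
    squeue -o "%.18i %o %Z"             # job id, command and working directory
    sbatch --core-spec=2 job.sh         # reserve two specialized cores per node
    sgather /tmp/step.out $HOME/step.out  # collect a file from every compute node
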
OTHER CHANGES
=============
-- Added job_info() and step_info() functions to the gres plugins to extract
plugin specific fields from the job's or step's GRES data structure.
API CHANGES
===========
Changed members of the following structs
========================================
-- partition_info, added: allow_accounts, allow_qos, deny_accounts,
deny_groups and deny_qos.
-- Struct job_info / slurm_job_info_t: Changed exc_node_inx, node_inx, and
req_node_inx from type int to type int32_t
-- Struct job_info / slurm_job_info_t: Changed array_task_id from type
uint16_t to type uint32_t
-- job_step_info_t: Changed node_inx from type int to type int32_t
-- Struct partition_info / partition_info_t: Changed node_inx from type int
to type int32_t
-- block_job_info_t: Changed cnode_inx from type int to type int32_t
-- block_info_t: Changed ionode_inx and mp_inx from type int to type int32_t
-- Struct reserve_info / reserve_info_t: Changed node_inx from type int to
type int32_t
Added the following struct definitions
======================================
-- Struct job_info / slurm_job_info_t: Added std_err, std_in and std_out.
-- Struct job_info / slurm_job_info_t: Added core_spec.
-- Struct job_descriptor / job_desc_msg_t: Added core_spec.
Changed the following enums and #defines
========================================
-- Added a new job_state of JOB_BOOT_FAIL for job terminations due to a
   failure to boot its allocated nodes or BlueGene block.
Added the following APIs
========================
-- API for retrieving license information.
extern int slurm_load_licenses PARAMS((license_info_msg_t **));
extern void slurm_free_license_info_msg PARAMS((license_info_msg_t *));
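
A minimal usage sketch follows. Note that the prototypes listed above are
abbreviated; the sketch assumes the full slurm.h signature also takes an
update time and a show-flags argument, and that license_info_msg_t exposes
num_lic and lic_array with per-license name, in_use and total counts:

    #include <stdio.h>
    #include <time.h>
    #include <slurm/slurm.h>
    #include <slurm/slurm_errno.h>

    int main(void)
    {
        license_info_msg_t *msg = NULL;
        uint32_t i;

        /* Assumed signature: (time_t update_time, msg **, uint16_t show_flags) */
        if (slurm_load_licenses((time_t) 0, &msg, 0) != SLURM_SUCCESS) {
            slurm_perror("slurm_load_licenses");
            return 1;
        }
        for (i = 0; i < msg->num_lic; i++)
            printf("%s: %u of %u in use\n", msg->lic_array[i].name,
                   msg->lic_array[i].in_use, msg->lic_array[i].total);
        slurm_free_license_info_msg(msg);
        return 0;
    }
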
Changed the following APIs
==========================
-- Add task pointer to the task_post_term() function in task plugins. The
terminating task's PID is available in task->pid.
DBD API Changes
===============