This form can be used to create a Slurm configuration file while giving you control over many of the important configuration parameters.
This is the full version of the Slurm configuration tool. It exposes all of the configuration options used to create a Slurm configuration file. A simplified version of the tool is available at configurator.easy.html.
This tool supports Slurm version 23.11 only. Configuration files for other versions of Slurm should be built using the tool distributed with that version in doc/html/configurator.html. Some parameters will be set to default values, but you can manually edit the resulting slurm.conf as desired for greater flexibility. See man slurm.conf for more details about the configuration parameters.
Note that while Slurm daemons create log files and other files as needed, they treat the lack of parent directories as a fatal error. This prevents the daemons from running if critical file systems are not mounted and minimizes the risk of cold-starting (starting without preserving jobs).
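Because a missing parent directory is fatal to the daemons, it can help to check the directories before starting them. The following is a minimal sketch of such a check, not part of Slurm itself. The helper name missing_parent_dirs and the example paths are hypothetical; substitute the file locations from your own slurm.conf.

```python
from pathlib import Path


def missing_parent_dirs(paths):
    """Return the parent directories that do not exist for the given file paths."""
    missing = []
    for p in paths:
        parent = Path(p).parent
        # Report each missing directory once, in the order first seen.
        if not parent.is_dir() and str(parent) not in missing:
            missing.append(str(parent))
    return missing


if __name__ == "__main__":
    # Hypothetical example paths (assumption, not defaults from this page).
    candidates = [
        "/var/log/slurm/slurmctld.log",
        "/var/run/slurm/slurmctld.pid",
    ]
    for directory in missing_parent_dirs(candidates):
        print(f"missing parent directory: {directory}")
```

Running the script on each node before starting slurmctld or slurmd lists any directories that still need to be created or mounted.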
Note that this configuration file must be installed on all nodes in your cluster.
After you have filled in the fields of interest, use the "Submit" button at the bottom of the page to build the slurm.conf file. It will appear in your web browser. Save the file in text format as slurm.conf for use by Slurm.
For more information about Slurm, see https://slurm.schedmd.com/slurm.html
SlurmctldHost: Primary Controller Hostname
BackupController: Backup Controller Hostname (optional)
NodeName: Compute nodes
NodeAddr: Compute node addresses (optional)
PartitionName: Name of the one partition to be created
MaxTime: Maximum time limit of jobs in minutes or INFINITE
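As an illustration of how the fields above map onto slurm.conf lines, here is a minimal sketch that renders a controller, node, and partition definition. The function names are hypothetical, the host and partition names in the usage note are placeholders, and the MaxTime check (non-negative minutes or INFINITE) is an assumption based on the field description above rather than Slurm's full parsing rules.

```python
def format_max_time(value):
    """Accept a non-negative number of minutes or the string INFINITE."""
    if isinstance(value, str) and value.strip().upper() == "INFINITE":
        return "INFINITE"
    minutes = int(value)  # raises ValueError for non-numeric input
    if minutes < 0:
        raise ValueError("MaxTime must be a non-negative number of minutes or INFINITE")
    return str(minutes)


def render_basic_config(controller, node_name, partition,
                        max_time="INFINITE", cpus=1, node_addr=None):
    """Render the controller, node, and partition lines of a slurm.conf."""
    node_line = f"NodeName={node_name}"
    if node_addr:
        # NodeAddr is optional and only written when supplied.
        node_line += f" NodeAddr={node_addr}"
    node_line += f" CPUs={cpus} State=UNKNOWN"
    partition_line = (
        f"PartitionName={partition} Nodes=ALL Default=YES "
        f"MaxTime={format_max_time(max_time)} State=UP"
    )
    return "\n".join([f"SlurmctldHost={controller}", node_line, partition_line])


if __name__ == "__main__":
    print(render_basic_config("linux0", "linux[1-32]", "debug"))
```

For example, render_basic_config("linux0", "linux[1-32]", "debug", max_time=60) produces a partition line containing MaxTime=60.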
The following parameters describe a node's configuration. Set a value for CPUs. The other parameters are optional, but provide more control over scheduled resources:
CPUs: Count of processors on each compute node. If CPUs is omitted, it will be inferred from: Sockets, CoresPerSocket, and ThreadsPerCore.
Sockets: Number of physical processor sockets/chips on the node. If Sockets is omitted, it will be inferred from: CPUs, CoresPerSocket, and ThreadsPerCore.
CoresPerSocket: Number of cores in a single physical processor socket. The CoresPerSocket value describes physical cores, not the logical number of processors per socket.
ThreadsPerCore: Number of logical threads in a single physical core.
RealMemory: Amount of real memory. This parameter is required when specifying Memory as a consumable resource with the select/cons_tres plug-in. See below under Resource Selection.
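The relationship between CPUs, Sockets, CoresPerSocket, and ThreadsPerCore can be sketched as simple arithmetic. This is a simplified illustration that assumes a single board per node and that every omitted value defaults to 1; the function names are hypothetical and are not part of Slurm.

```python
def infer_cpus(sockets, cores_per_socket, threads_per_core=1):
    """CPUs as the product of the topology values (single board assumed)."""
    return sockets * cores_per_socket * threads_per_core


def infer_sockets(cpus, cores_per_socket, threads_per_core=1):
    """Sockets derived from CPUs when the per-socket layout is known."""
    per_socket = cores_per_socket * threads_per_core
    if per_socket <= 0 or cpus % per_socket != 0:
        raise ValueError("CPUs is not a whole multiple of CoresPerSocket * ThreadsPerCore")
    return cpus // per_socket


if __name__ == "__main__":
    # A node with 2 sockets, 16 cores per socket, and 2 threads per core.
    print(infer_cpus(2, 16, 2))
    print(infer_sockets(64, 16, 2))
```

A node with 2 sockets, 16 physical cores per socket, and 2 threads per core therefore presents 64 logical processors under these assumptions.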
SlurmUser
SlurmctldPort
SlurmdPort
StateSaveLocation: Slurmctld state save directory. Must be writable by all SlurmctldHost nodes.
SlurmdSpoolDir: Slurmd state save directory
Define when a non-responding (DOWN) node is returned to service. Select one value for ReturnToService:
0: When explicitly restored to service by an administrator.
1: Upon registration with a valid configuration only if it was set DOWN due to being non-responsive.
2: Upon registration with a valid configuration.
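The three ReturnToService values can be read as a small decision rule. The sketch below only restates the descriptions above and is not Slurm's implementation; the function and argument names are hypothetical.

```python
def returns_to_service(return_to_service, registered_valid_config,
                       down_for_non_response):
    """Decide whether a DOWN node returns to service without administrator action."""
    if return_to_service not in (0, 1, 2):
        raise ValueError("ReturnToService must be 0, 1, or 2")
    if return_to_service == 0:
        # Only an administrator can restore the node.
        return False
    if not registered_valid_config:
        # Values 1 and 2 both require registration with a valid configuration.
        return False
    if return_to_service == 1:
        # Only nodes set DOWN for being non-responsive come back automatically.
        return down_for_non_response
    return True


if __name__ == "__main__":
    # A node set DOWN for another reason registers with a valid configuration.
    for value in (0, 1, 2):
        print(value, returns_to_service(value, True, False))
```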
Prolog/Epilog: Path of a program that will be executed as root on every node of a user's job, before the job's tasks are initiated there (Prolog) and after the job has terminated (Epilog). These parameters are optional.
SrunProlog/Epilog: Fully qualified path of a program to be executed by srun at job step initiation and termination. These parameters may be overridden by srun's --prolog and --epilog options. These parameters are optional.
TaskProlog/Epilog: Fully qualified path of a program to be executed as the user before each task begins execution and after each task terminates. These parameters are optional.
SlurmctldDebug (default is info)
SlurmctldLogFile (if empty, log goes to syslog)
SlurmdDebug (default is info)
SlurmdLogFile (if empty, log goes to syslog. String "%h" in name gets replaced with hostname)
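The "%h" substitution in SlurmdLogFile can be illustrated with a short sketch. It mimics the replacement described above for preview purposes only. The function name and the example path are hypothetical, and using socket.gethostname() as the hostname is an assumption that may differ from the name Slurm substitutes on a given node.

```python
import socket


def expand_log_path(template, hostname=None):
    """Replace every "%h" in the template with the hostname."""
    if hostname is None:
        # Assumption: the local hostname stands in for the node's name.
        hostname = socket.gethostname()
    return template.replace("%h", hostname)


if __name__ == "__main__":
    # Hypothetical log file template.
    print(expand_log_path("/var/log/slurm/slurmd.%h.log", hostname="linux1"))
```

This makes it easy to confirm that nodes sharing a log directory would each write to a distinct file.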
JobCompLoc: The location of the text file to be written to (if JobCompType=filetxt), the script to be run (if JobCompType=script), the URL of the Elasticsearch server (if JobCompType=elasticsearch), the file containing librdkafka parameters (if JobCompType=kafka), or the database name (for other values of JobCompType).
Options below are for use with a database to specify where the database is running and how to connect to it:
JobCompHost: Host the database is running on for Job completion
JobCompPort: Port the database server is listening on for Job completion
JobCompUser: User we are to use to talk to the database for Job completion
JobCompParams: Pass arbitrary text string to Job completion plugin
JobCompPass: Password we are to use to talk to the database for Job completion
Options below are for use with a database to specify where the database is running and how to connect to it:
AccountingStorageHost: Host the database is running on for Job Accounting
AccountingStoragePort: Port the database server is listening on for Job Accounting
AccountingStorageUser: User we are to use to talk to the database for Job Accounting
AccountingStoragePass: Password we are to use to talk to the database for Job Accounting. In the case of SlurmDBD, this will be an alternate socket name for use with a Munge daemon providing enterprise-wide authentication (while the default Munge socket would provide cluster-wide authentication only).
AccountingStoreFlags: Comma separated list. Options are:
'job_comment' - store the job comment field in the database;
'job_env' - store a batch job's env in the database;
'job_extra' - store a batch job's extra field in the database;
'job_script' - store the job batch script in the database.
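Since AccountingStoreFlags is a comma separated list, a quick check before saving the file can catch misspelled options. The sketch below validates a value against the four options listed on this page only. The function name is hypothetical, and treating any other flag as an error is an assumption; consult man slurm.conf for the complete list accepted by your version.

```python
# Options named on this page; other Slurm versions may accept more.
KNOWN_FLAGS = ("job_comment", "job_env", "job_extra", "job_script")


def parse_store_flags(value):
    """Split a comma separated AccountingStoreFlags value and validate it."""
    flags = [item.strip() for item in value.split(",") if item.strip()]
    unknown = [flag for flag in flags if flag not in KNOWN_FLAGS]
    if unknown:
        raise ValueError("unknown AccountingStoreFlags: " + ", ".join(unknown))
    return flags


if __name__ == "__main__":
    print(parse_store_flags("job_comment, job_script"))
```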
SlurmctldPidFile
SlurmdPidFile
SlurmctldTimeout: How many seconds the backup controller waits before becoming the active controller
SlurmdTimeout: How many seconds the Slurm controller waits for the slurmd to respond to a request before considering the node DOWN
InactiveLimit: How many seconds the Slurm controller waits for srun commands to respond before considering the job or job step inactive and terminating it. A value of zero indicates unlimited wait
MinJobAge: How many seconds the Slurm controller waits after a job terminates before purging its record. A record of the job will persist in job completion and/or accounting records indefinitely, but will no longer be visible with the squeue command after purging
KillWait: How many seconds a job is given to gracefully terminate after reaching its time limit and being sent SIGTERM before being sent SIGKILL
WaitTime: How many seconds after a job step's first task terminates before terminating all remaining tasks. A value of zero indicates unlimited wait
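Two of the behaviors above can be sketched as small helper functions: the "zero means unlimited" convention used by InactiveLimit and WaitTime, and the SIGTERM to SIGKILL sequence governed by KillWait. This is a simplified model of the descriptions above, not Slurm's implementation; the function names are hypothetical and times are plain seconds.

```python
def limit_exceeded(elapsed_seconds, limit_seconds):
    """Return True once the elapsed time reaches the limit; 0 never expires."""
    if limit_seconds < 0:
        raise ValueError("limit must be zero (unlimited) or a positive number of seconds")
    if limit_seconds == 0:
        # A value of zero indicates unlimited wait.
        return False
    return elapsed_seconds >= limit_seconds


def kill_schedule(time_limit_reached_at, kill_wait):
    """Times at which SIGTERM and then SIGKILL are sent to a job."""
    return {
        "SIGTERM": time_limit_reached_at,
        "SIGKILL": time_limit_reached_at + kill_wait,
    }


if __name__ == "__main__":
    print(limit_exceeded(3600, 0))
    print(kill_schedule(time_limit_reached_at=600, kill_wait=30))
```

With a KillWait of 30, a job that reaches its time limit at second 600 receives SIGTERM at that moment and SIGKILL at second 630 in this model.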
Last modified 04 January 2024