scrun
Section: Slurm Commands (1)  Updated: Slurm Commands
NAME
scrun - an OCI runtime proxy for Slurm.
SYNOPSIS
Create Operation
-
Prepares a new container with container-id in current working directory.
Start Operation
-
Request to start and run container in job.
Query State Operation
-
Output OCI defined JSON state of container.
Kill Operation
-
Send signal (default: SIGTERM) to container.
Delete Operation
-
Release any resources held by container locally and remotely.
Perform OCI runtime operations against container-id per:
https://github.com/opencontainers/runtime-spec/blob/main/runtime.md
scrun attempts to mimic the command line behavior of crun and runc as closely as possible in order to maintain drop-in replacement compatibility with DOCKER and podman. All command line arguments for crun and runc will be accepted for compatibility but may be ignored depending on their applicability.
DESCRIPTION
scrun is an OCI runtime proxy for Slurm. It acts as a common interface to DOCKER or podman to allow container operations to be executed under Slurm as jobs. scrun will accept all commands as an OCI compliant runtime but will proxy the container and all STDIO to Slurm for scheduling and execution. The containers will be executed remotely on Slurm compute nodes according to settings in oci.conf(5).
scrun requires all containers to be OCI image compliant per:
https://github.com/opencontainers/image-spec/blob/main/spec.md
RETURN VALUE
On successful operation, scrun will return 0. For any other condition, scrun will return a non-zero number to denote an error.
GLOBAL OPTIONS
- --cgroup-manager
- Ignored.
-
- --debug
- Activate debug level logging.
-
- -f <slurm_conf_path>
- Use specified slurm.conf for configuration.
Default: sysconfdir from configure during compilation
-
- --usage
- Show quick help on how to call scrun.
-
- --log-format=<json|text>
- Optionally select the format used for logging. May be "json" or "text".
Default: text
-
- --root=<root_path>
- Path to the spool directory used for communication sockets and temporary
directories and files. This should be a tmpfs and should be cleared on reboot.
Default: /run/user/{user_id}/scrun/
-
- --rootless
- Ignored. All scrun commands are always rootless.
-
- --systemd-cgroup
- Ignored.
-
- -v
- Increase logging verbosity. Multiple -v's increase verbosity.
-
- -V, --version
- Print version information and exit.
-
CREATE OPTIONS
- -b <bundle_path>, --bundle=<bundle_path>
- Path to the root of the bundle directory.
Default: caller's working directory
-
- --console-socket=<console_socket_path>
- Optional path to an AF_UNIX socket which will receive a file descriptor
referencing the master end of the console's pseudoterminal.
Default: ignored
-
- --no-pivot
- Ignored.
-
- --no-new-keyring
- Ignored.
-
- --pid-file=<pid_file_path>
- Specify the file to lock and populate with process ID.
Default: ignored
-
- --preserve-fds
- Ignored.
-
DELETE OPTIONS
INPUT ENVIRONMENT VARIABLES
- SCRUN_DEBUG=<quiet|fatal|error|info|verbose|debug|debug2|debug3|debug4|debug5>
- Set logging level.
-
- SCRUN_STDERR_DEBUG=<quiet|fatal|error|info|verbose|debug|debug2|debug3|debug4|debug5>
- Set logging level for standard error output only.
-
- SCRUN_SYSLOG_DEBUG=<quiet|fatal|error|info|verbose|debug|debug2|debug3|debug4|debug5>
- Set logging level for syslogging only.
-
- SCRUN_FILE_DEBUG=<quiet|fatal|error|info|verbose|debug|debug2|debug3|debug4|debug5>
- Set logging level for log file only.
-
JOB INPUT ENVIRONMENT VARIABLES
- SCRUN_ACCOUNT
- See SLURM_ACCOUNT from srun(1).
-
- SCRUN_ACCTG_FREQ
- See SLURM_ACCTG_FREQ from srun(1).
-
- SCRUN_BURST_BUFFER
- See SLURM_BURST_BUFFER from srun(1).
-
- SCRUN_CLUSTER_CONSTRAINT
- See SLURM_CLUSTER_CONSTRAINT from srun(1).
-
- SCRUN_CLUSTERS
- See SLURM_CLUSTERS from srun(1).
-
- SCRUN_CONSTRAINT
- See SLURM_CONSTRAINT from srun(1).
-
- SCRUN_CORE_SPEC
- See SLURM_CORE_SPEC from srun(1).
-
- SCRUN_CPU_BIND
- See SLURM_CPU_BIND from srun(1).
-
- SCRUN_CPU_FREQ_REQ
- See SLURM_CPU_FREQ_REQ from srun(1).
-
- SCRUN_CPUS_PER_GPU
- See SLURM_CPUS_PER_GPU from srun(1).
-
- SCRUN_CPUS_PER_TASK
- See SRUN_CPUS_PER_TASK from srun(1).
-
- SCRUN_DELAY_BOOT
- See SLURM_DELAY_BOOT from srun(1).
-
- SCRUN_DEPENDENCY
- See SLURM_DEPENDENCY from srun(1).
-
- SCRUN_DISTRIBUTION
- See SLURM_DISTRIBUTION from srun(1).
-
- SCRUN_EPILOG
- See SLURM_EPILOG from srun(1).
-
- SCRUN_EXACT
- See SLURM_EXACT from srun(1).
-
- SCRUN_EXCLUSIVE
- See SLURM_EXCLUSIVE from srun(1).
-
- SCRUN_GPU_BIND
- See SLURM_GPU_BIND from srun(1).
-
- SCRUN_GPU_FREQ
- See SLURM_GPU_FREQ from srun(1).
-
- SCRUN_GPUS
- See SLURM_GPUS from srun(1).
-
- SCRUN_GPUS_PER_NODE
- See SLURM_GPUS_PER_NODE from srun(1).
-
- SCRUN_GPUS_PER_SOCKET
- See SLURM_GPUS_PER_SOCKET from salloc(1).
-
- SCRUN_GPUS_PER_TASK
- See SLURM_GPUS_PER_TASK from srun(1).
-
- SCRUN_GRES_FLAGS
- See SLURM_GRES_FLAGS from srun(1).
-
- SCRUN_GRES
- See SLURM_GRES from srun(1).
-
- SCRUN_HINT
- See SLURM_HINT from srun(1).
-
- SCRUN_JOB_NAME
- See SLURM_JOB_NAME from srun(1).
-
- SCRUN_JOB_NODELIST
- See SLURM_JOB_NODELIST from srun(1).
-
- SCRUN_JOB_NUM_NODES
- See SLURM_JOB_NUM_NODES from srun(1).
-
- SCRUN_LABELIO
- See SLURM_LABELIO from srun(1).
-
- SCRUN_MEM_BIND
- See SLURM_MEM_BIND from srun(1).
-
- SCRUN_MEM_PER_CPU
- See SLURM_MEM_PER_CPU from srun(1).
-
- SCRUN_MEM_PER_GPU
- See SLURM_MEM_PER_GPU from srun(1).
-
- SCRUN_MEM_PER_NODE
- See SLURM_MEM_PER_NODE from srun(1).
-
- SCRUN_MPI_TYPE
- See SLURM_MPI_TYPE from srun(1).
-
- SCRUN_NCORES_PER_SOCKET
- See SLURM_NCORES_PER_SOCKET from srun(1).
-
- SCRUN_NETWORK
- See SLURM_NETWORK from srun(1).
-
- SCRUN_NSOCKETS_PER_NODE
- See SLURM_NSOCKETS_PER_NODE from srun(1).
-
- SCRUN_NTASKS
- See SLURM_NTASKS from srun(1).
-
- SCRUN_NTASKS_PER_CORE
- See SLURM_NTASKS_PER_CORE from srun(1).
-
- SCRUN_NTASKS_PER_GPU
- See SLURM_NTASKS_PER_GPU from srun(1).
-
- SCRUN_NTASKS_PER_NODE
- See SLURM_NTASKS_PER_NODE from srun(1).
-
- SCRUN_NTASKS_PER_TRES
- See SLURM_NTASKS_PER_TRES from srun(1).
-
- SCRUN_OPEN_MODE
- See SLURM_OPEN_MODE from srun(1).
-
- SCRUN_OVERCOMMIT
- See SLURM_OVERCOMMIT from srun(1).
-
- SCRUN_OVERLAP
- See SLURM_OVERLAP from srun(1).
-
- SCRUN_PARTITION
- See SLURM_PARTITION from srun(1).
-
- SCRUN_POWER
- See SLURM_POWER from srun(1).
-
- SCRUN_PROFILE
- See SLURM_PROFILE from srun(1).
-
- SCRUN_PROLOG
- See SLURM_PROLOG from srun(1).
-
- SCRUN_QOS
- See SLURM_QOS from srun(1).
-
- SCRUN_REMOTE_CWD
- See SLURM_REMOTE_CWD from srun(1).
-
- SCRUN_REQ_SWITCH
- See SLURM_REQ_SWITCH from srun(1).
-
- SCRUN_RESERVATION
- See SLURM_RESERVATION from srun(1).
-
- SCRUN_SIGNAL
- See SLURM_SIGNAL from srun(1).
-
- SCRUN_SLURMD_DEBUG
- See SLURMD_DEBUG from srun(1).
-
- SCRUN_SPREAD_JOB
- See SLURM_SPREAD_JOB from srun(1).
-
- SCRUN_TASK_EPILOG
- See SLURM_TASK_EPILOG from srun(1).
-
- SCRUN_TASK_PROLOG
- See SLURM_TASK_PROLOG from srun(1).
-
- SCRUN_THREAD_SPEC
- See SLURM_THREAD_SPEC from srun(1).
-
- SCRUN_THREADS_PER_CORE
- See SLURM_THREADS_PER_CORE from srun(1).
-
- SCRUN_THREADS
- See SLURM_THREADS from srun(1).
-
- SCRUN_TIMELIMIT
- See SLURM_TIMELIMIT from srun(1).
-
- SCRUN_TRES_BIND
- Same as --tres-bind
-
- SCRUN_TRES_PER_TASK
- See SLURM_TRES_PER_TASK from srun(1).
-
- SCRUN_UNBUFFEREDIO
- See SLURM_UNBUFFEREDIO from srun(1).
-
- SCRUN_USE_MIN_NODES
- See SLURM_USE_MIN_NODES from srun(1).
-
- SCRUN_WAIT4SWITCH
- See SLURM_WAIT4SWITCH from srun(1).
-
- SCRUN_WCKEY
- See SLURM_WCKEY from srun(1).
-
- SCRUN_WORKING_DIR
- See SLURM_WORKING_DIR from srun(1).
-
OUTPUT ENVIRONMENT VARIABLES
- SCRUN_OCI_VERSION
- Advertised version of OCI compliance of container.
-
- SCRUN_CONTAINER_ID
- Value passed as container_id during the create operation.
-
- SCRUN_PID
- PID of process used to monitor and control container on allocation node.
-
- SCRUN_BUNDLE
- Path to container bundle directory.
-
- SCRUN_SUBMISSION_BUNDLE
- Path to container bundle directory before modification by Lua script.
-
- SCRUN_ANNOTATION_*
- List of annotations from container's config.json.
-
- SCRUN_PID_FILE
- Path to pid file that is locked and populated with PID of scrun.
-
- SCRUN_SOCKET
- Path to control socket for scrun.
-
- SCRUN_SPOOL_DIR
- Path to workspace for all temporary files for current container. Purged by deletion operation.
-
- SCRUN_SUBMISSION_CONFIG_FILE
- Path to container's config.json file at time of submission.
-
- SCRUN_USER
- Name of user that called create operation.
-
- SCRUN_USER_ID
- Numeric ID of user that called create operation.
-
- SCRUN_GROUP
- Name of the primary group of the user that called the create operation.
-
- SCRUN_GROUP_ID
- Numeric ID of the primary group of the user that called the create operation.
-
- SCRUN_ROOT
- See --root.
-
- SCRUN_ROOTFS_PATH
- Path to container's root directory.
-
- SCRUN_SUBMISSION_ROOTFS_PATH
- Path to container's root directory at submission time.
-
- SCRUN_LOG_FILE
- Path to scrun's log file during create operation.
-
- SCRUN_LOG_FORMAT
- Log format type during create operation.
-
JOB OUTPUT ENVIRONMENT VARIABLES
- SLURM_*_HET_GROUP_#
- For a heterogeneous job allocation, the environment variables are set separately for each component.
-
- SLURM_CLUSTER_NAME
- Name of the cluster on which the job is executing.
-
- SLURM_CONTAINER
- OCI Bundle for job.
-
- SLURM_CONTAINER_ID
- OCI id for job.
-
- SLURM_CPUS_PER_GPU
- Number of CPUs requested per allocated GPU.
-
- SLURM_CPUS_PER_TASK
- Number of CPUs requested per task.
-
- SLURM_DIST_PLANESIZE
- Plane distribution size. Only set for plane distributions.
-
- SLURM_DISTRIBUTION
- Distribution type for the allocated jobs.
-
- SLURM_GPU_BIND
- Requested binding of tasks to GPU.
-
- SLURM_GPU_FREQ
- Requested GPU frequency.
-
- SLURM_GPUS
- Number of GPUs requested.
-
- SLURM_GPUS_PER_NODE
- Requested GPU count per allocated node.
-
- SLURM_GPUS_PER_SOCKET
- Requested GPU count per allocated socket.
-
- SLURM_GPUS_PER_TASK
- Requested GPU count per allocated task.
-
- SLURM_HET_SIZE
- Set to count of components in heterogeneous job.
-
- SLURM_JOB_ACCOUNT
- Account name associated with the job allocation.
-
- SLURM_JOB_CPUS_PER_NODE
- Count of CPUs available to the job on the nodes in the allocation, using the format CPU_count[(xnumber_of_nodes)][,CPU_count [(xnumber_of_nodes)] ...]. For example: SLURM_JOB_CPUS_PER_NODE='72(x2),36' indicates that on the first and second nodes (as listed by SLURM_JOB_NODELIST) the allocation has 72 CPUs, while the third node has 36 CPUs. NOTE: The select/linear plugin allocates entire nodes to jobs, so the value indicates the total count of CPUs on allocated nodes. The select/cons_tres plugin allocates individual CPUs to jobs, so this number indicates the number of CPUs allocated to the job.
-
- SLURM_JOB_END_TIME
- The UNIX timestamp for a job's projected end time.
-
- SLURM_JOB_GPUS
- The global GPU IDs of the GPUs allocated to this job. The GPU IDs are not relative to any device cgroup, even if devices are constrained with task/cgroup. Only set in batch and interactive jobs.
-
- SLURM_JOB_ID
- The ID of the job allocation.
-
- SLURM_JOB_NODELIST
- List of nodes allocated to the job.
-
- SLURM_JOB_NUM_NODES
- Total number of nodes in the job allocation.
-
- SLURM_JOB_PARTITION
- Name of the partition in which the job is running.
-
- SLURM_JOB_QOS
- Quality Of Service (QOS) of the job allocation.
-
- SLURM_JOB_RESERVATION
- Advanced reservation containing the job allocation, if any.
-
- SLURM_JOB_START_TIME
- UNIX timestamp for a job's start time.
-
- SLURM_MEM_BIND
- Bind tasks to memory.
-
- SLURM_MEM_BIND_LIST
- Set to bit mask used for memory binding.
-
- SLURM_MEM_BIND_PREFER
- Set to "prefer" if the SLURM_MEM_BIND option includes the prefer option.
-
- SLURM_MEM_BIND_SORT
- Sort free cache pages (run zonesort on Intel KNL nodes)
-
- SLURM_MEM_BIND_TYPE
- Set to the memory binding type specified with the SLURM_MEM_BIND option. Possible values are "none", "rank", "map_map", "mask_mem" and "local".
-
- SLURM_MEM_BIND_VERBOSE
- Set to "verbose" if the SLURM_MEM_BIND option includes the verbose option. Set to "quiet" otherwise.
-
- SLURM_MEM_PER_CPU
- Minimum memory required per usable allocated CPU.
-
- SLURM_MEM_PER_GPU
- Requested memory per allocated GPU.
-
- SLURM_MEM_PER_NODE
- Specify the real memory required per node.
-
- SLURM_NTASKS
- Specify the number of tasks to run.
-
- SLURM_NTASKS_PER_CORE
- Request the maximum ntasks be invoked on each core.
-
- SLURM_NTASKS_PER_GPU
- Request that there are ntasks tasks invoked for every GPU.
-
- SLURM_NTASKS_PER_NODE
- Request that ntasks be invoked on each node.
-
- SLURM_NTASKS_PER_SOCKET
- Request the maximum ntasks be invoked on each socket.
-
- SLURM_OVERCOMMIT
- Overcommit resources.
-
- SLURM_PROFILE
- Enables detailed data collection by the acct_gather_profile plugin.
-
- SLURM_SHARDS_ON_NODE
- Number of GPU Shards available to the step on this node.
-
- SLURM_SUBMIT_HOST
- The hostname of the computer from which scrun was invoked.
-
- SLURM_TASKS_PER_NODE
- Number of tasks to be initiated on each node. Values are comma separated and in the same order as SLURM_JOB_NODELIST. If two or more consecutive nodes are to have the same task count, that count is followed by "(x#)" where "#" is the repetition count. For example, "SLURM_TASKS_PER_NODE=2(x3),1" indicates that the first three nodes will each execute two tasks and the fourth node will execute one task.
-
- SLURM_THREADS_PER_CORE
- This is only set if --threads-per-core or SCRUN_THREADS_PER_CORE were specified. The value will be set to the value specified by --threads-per-core or SCRUN_THREADS_PER_CORE. This is used by subsequent srun calls within the job allocation.
-
- SLURM_TRES_PER_TASK
- Set to the value of --tres-per-task. If --cpus-per-task or --gpus-per-task is specified, it is also set in SLURM_TRES_PER_TASK as if it were specified in --tres-per-task.
-
SCRUN.LUA
/etc/slurm/scrun.lua must be present on any node where scrun will be invoked. scrun.lua must be a valid Lua script.
Required functions
The following functions must be defined. A minimal skeleton is sketched after the argument descriptions below.
- • function slurm_scrun_stage_in(id, bundle, spool_dir, config_file, job_id, user_id, group_id, job_env)
-
Called right after job allocation to stage the container onto the job node(s).
Must return slurm.SUCCESS or the job will be cancelled. The function must
prepare the container for execution on the job node(s) as required to run as
configured in oci.conf(5). The function may block as long as required until
the container has been fully prepared (up to the job's maximum wall time).
-
- id
- Container ID
- bundle
- OCI bundle path
- spool_dir
- Temporary working directory for container
- config_file
- Path to config.json for container
- job_id
- jobid of job allocation
- user_id
- Resolved numeric user id of job allocation. It is generally expected that the lua script will be executed inside of a user namespace running under the root(0) user.
- group_id
- Resolved numeric group id of job allocation. It is generally expected that the lua script will be executed inside of a user namespace running under the root(0) group.
- job_env
- Table containing the job's environment variables, with each entry given as Key=Value or as a bare Value.
-
- • function slurm_scrun_stage_out(id, bundle, orig_bundle, root_path, orig_root_path, spool_dir, config_file, jobid, user_id, group_id)
-
Called right after the container step completes to stage out files from the job
node(s). Must return slurm.SUCCESS or the job will be cancelled. The function
must pull back any changes and clean up the container on the job node(s). The
function may block as long as required until the container has been fully
staged out (up to the job's maximum wall time).
-
- id
- Container ID
- bundle
- OCI bundle path
- orig_bundle
- Originally submitted OCI bundle path before modification by set_bundle_path().
- root_path
- Path to directory root of container contents.
- orig_root_path
- Original path to directory root of container contents before modification by set_root_path().
- spool_dir
- Temporary working directory for container
- config_file
- Path to config.json for container
- job_id
- jobid of job allocation
- user_id
- Resolved numeric user id of job allocation. It is generally expected that the lua script will be executed inside of a user namespace running under the root(0) user.
- group_id
- Resolved numeric group id of job allocation. It is generally expected that the lua script will be executed inside of a user namespace running under the root(0) group.
-
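As a point of reference, a minimal scrun.lua skeleton satisfying this interface could look like the following sketch. It assumes the container bundle is already reachable from the job node(s) over a shared filesystem, so no staging work is performed; a full staging example is shown later in this page.
-- Minimal sketch of /etc/slurm/scrun.lua, assuming the bundle already
-- resides on a filesystem shared with the job node(s) (no staging needed).

function slurm_scrun_stage_in(id, bundle, spool_dir, config_file, job_id, user_id, group_id, job_env)
    slurm.log_info(string.format("stage_in: container %s using bundle %s", id, bundle))
    -- Nothing to copy; accept the bundle and config.json as submitted.
    return slurm.SUCCESS
end

function slurm_scrun_stage_out(id, bundle, orig_bundle, root_path, orig_root_path, spool_dir, config_file, jobid, user_id, group_id)
    slurm.log_info(string.format("stage_out: container %s finished", id))
    -- Nothing to pull back; leave the bundle in place.
    return slurm.SUCCESS
end

slurm.log_info("initialized minimal scrun.lua")

return slurm.SUCCESS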
Provided functions
The following functions are provided for any Lua function to call as needed. A brief usage sketch follows this list.
- • slurm.set_bundle_path(PATH)
-
Called to notify scrun to use PATH as new OCI container bundle
path. Depending on the filesystem layout, cloning the container bundle may be
required to allow execution on job nodes.
- • slurm.set_root_path(PATH)
-
Called to notify scrun to use PATH as new container root filesystem
path. Depending on the filesystem layout, cloning the container bundle may be
required to allow execution on job nodes. Script must also update #/root/path
in config.json when changing root path.
- • STATUS,OUTPUT = slurm.remote_command(SCRIPT)
-
Run SCRIPT in new job step on all job nodes. Returns numeric job status
as STATUS and job stdio as OUTPUT. Blocks until SCRIPT exits.
- • STATUS,OUTPUT = slurm.allocator_command(SCRIPT)
-
Run SCRIPT as forked child process of scrun. Returns numeric job status
as STATUS and job stdio as OUTPUT. Blocks until SCRIPT exits.
- • slurm.log(MSG, LEVEL)
-
Log MSG at log LEVEL. Valid range of values for LEVEL is [0,
4].
- • slurm.error(MSG)
-
Log error MSG.
- • slurm.log_error(MSG)
-
Log error MSG.
- • slurm.log_info(MSG)
-
Log MSG at log level INFO.
- • slurm.log_verbose(MSG)
-
Log MSG at log level VERBOSE.
- • slurm.log_debug(MSG)
-
Log MSG at log level DEBUG.
- • slurm.log_debug2(MSG)
-
Log MSG at log level DEBUG2.
- • slurm.log_debug3(MSG)
-
Log MSG at log level DEBUG3.
- • slurm.log_debug4(MSG)
-
Log MSG at log level DEBUG4.
- • MINUTES = slurm.time_str2mins(TIME_STRING)
-
Parse TIME_STRING into number of minutes as MINUTES. Valid formats:
-
- • days-[hours[:minutes[:seconds]]]
- • hours:minutes:seconds
- • minutes[:seconds]
- • -1
- • INFINITE
- • UNLIMITED
-
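As an illustration only, the sketch below combines several of the provided helpers inside a hypothetical helper function (example_probe is not part of the scrun API); the time string, commands and log levels are arbitrary choices for the example.
-- Hypothetical fragment intended to be called from slurm_scrun_stage_in();
-- commands and values are illustrative only.
local function example_probe()
    -- 1 day and 12 hours parses to 2160 minutes
    local minutes = slurm.time_str2mins("1-12:00:00")
    slurm.log_info(string.format("parsed time limit: %d minutes", minutes))

    -- Run a trivial command in a new job step on every job node.
    local status, output = slurm.remote_command("hostname")
    if (status ~= 0) then
        slurm.log_error("hostname failed on at least one job node")
        return slurm.ERROR
    end
    slurm.log_debug("job node output: "..output)

    -- Run a command locally as a forked child process of scrun.
    status, output = slurm.allocator_command("id -un")
    if (status == 0) then
        slurm.log_verbose("submitting user: "..output)
    end

    return slurm.SUCCESS
end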
Example scrun.lua scripts
- Full Container staging example using rsync:
-
This full example will stage a container as given by docker or
podman. The container's config.json is modified to remove unwanted
functionality that may cause the container to fail when run under crun or
runc.
The script uses rsync to move the container to a shared filesystem
under the scratch_path variable.
NOTE: Support for JSON in liblua must generally be installed before Slurm is compiled. scrun.lua's syntax and ability to load JSON support should be tested by directly calling the script using lua outside of Slurm.
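For example, a standalone check along the following lines (a sketch; check_json.lua is a hypothetical file name, and the 'json' module name matches the example script below but may differ for your Lua JSON package) can confirm that JSON support loads before the script is wired into Slurm:
-- check_json.lua (hypothetical): run outside of Slurm with "lua check_json.lua"
local ok, json = pcall(require, 'json')
if not ok then
    print("unable to load the 'json' module: "..tostring(json))
    os.exit(1)
end
print("JSON round trip: "..json.encode(json.decode('{"checked": true}')))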
local json = require 'json'
local open = io.open

local scratch_path = "/run/user/"

local function read_file(path)
    local file = open(path, "rb")
    if not file then return nil end
    local content = file:read "*all"
    file:close()
    return content
end

local function write_file(path, contents)
    local file = open(path, "wb")
    if not file then return nil end
    file:write(contents)
    file:close()
    return
end

function slurm_scrun_stage_in(id, bundle, spool_dir, config_file, job_id, user_id, group_id, job_env)
    slurm.log_debug(string.format("stage_in(%s, %s, %s, %s, %d, %d, %d)",
        id, bundle, spool_dir, config_file, job_id, user_id, group_id))

    local status, output, user, rc
    local config = json.decode(read_file(config_file))
    local src_rootfs = config["root"]["path"]
    rc, user = slurm.allocator_command(string.format("id -un %d", user_id))
    user = string.gsub(user, "%s+", "")
    local root = scratch_path..math.floor(user_id).."/slurm/scrun/"
    local dst_bundle = root.."/"..id.."/"
    local dst_config = root.."/"..id.."/config.json"
    local dst_rootfs = root.."/"..id.."/rootfs/"

    if string.sub(src_rootfs, 1, 1) ~= "/" then
        -- always use absolute path
        src_rootfs = string.format("%s/%s", bundle, src_rootfs)
    end

    status, output = slurm.allocator_command("mkdir -p "..dst_rootfs)
    if (status ~= 0) then
        slurm.log_info(string.format("mkdir(%s) failed %u: %s", dst_rootfs, status, output))
        return slurm.ERROR
    end

    status, output = slurm.allocator_command(string.format("/usr/bin/env rsync --exclude sys --exclude proc --numeric-ids --delete-after --ignore-errors --stats -a -- %s/ %s/", src_rootfs, dst_rootfs))
    if (status ~= 0) then
        -- rsync can fail due to permissions which may not matter
        slurm.log_info(string.format("WARNING: rsync failed: %s", output))
    end

    slurm.set_bundle_path(dst_bundle)
    slurm.set_root_path(dst_rootfs)

    config["root"]["path"] = dst_rootfs

    -- Always force user namespace support in container or runc will reject
    local process_user_id = 0
    local process_group_id = 0

    if ((config["process"] ~= nil) and (config["process"]["user"] ~= nil)) then
        -- resolve out user in the container
        if (config["process"]["user"]["uid"] ~= nil) then
            process_user_id=config["process"]["user"]["uid"]
        else
            process_user_id=0
        end

        -- resolve out group in the container
        if (config["process"]["user"]["gid"] ~= nil) then
            process_group_id=config["process"]["user"]["gid"]
        else
            process_group_id=0
        end

        -- purge additionalGids as they are not supported in rootless
        if (config["process"]["user"]["additionalGids"] ~= nil) then
            config["process"]["user"]["additionalGids"] = nil
        end
    end

    if (config["linux"] ~= nil) then
        -- force user namespace to always be defined for rootless mode
        local found = false
        if (config["linux"]["namespaces"] == nil) then
            config["linux"]["namespaces"] = {}
        else
            for _, namespace in ipairs(config["linux"]["namespaces"]) do
                if (namespace["type"] == "user") then
                    found=true
                    break
                end
            end
        end
        if (found == false) then
            table.insert(config["linux"]["namespaces"], {type= "user"})
        end

        -- Provide default user map as root if one not provided
        if (true or config["linux"]["uidMappings"] == nil) then
            config["linux"]["uidMappings"] = {{containerID=process_user_id, hostID=math.floor(user_id), size=1}}
        end

        -- Provide default group map as root if one not provided
        -- mappings fail with build???
        if (true or config["linux"]["gidMappings"] == nil) then
            config["linux"]["gidMappings"] = {{containerID=process_group_id, hostID=math.floor(group_id), size=1}}
        end

        -- disable trying to use a specific cgroup
        config["linux"]["cgroupsPath"] = nil
    end

    if (config["mounts"] ~= nil) then
        -- Find and remove any user/group settings in mounts
        for _, mount in ipairs(config["mounts"]) do
            local opts = {}
            if (mount["options"] ~= nil) then
                for _, opt in ipairs(mount["options"]) do
                    if ((string.sub(opt, 1, 4) ~= "gid=") and (string.sub(opt, 1, 4) ~= "uid=")) then
                        table.insert(opts, opt)
                    end
                end
            end

            if (opts ~= nil and #opts > 0) then
                mount["options"] = opts
            else
                mount["options"] = nil
            end
        end

        -- Remove all bind mounts by copying files into rootfs
        local mounts = {}
        for i, mount in ipairs(config["mounts"]) do
            if ((mount["type"] ~= nil) and (mount["type"] == "bind") and
                (string.sub(mount["source"], 1, 4) ~= "/sys") and
                (string.sub(mount["source"], 1, 5) ~= "/proc")) then
                status, output = slurm.allocator_command(string.format("/usr/bin/env rsync --numeric-ids --ignore-errors --stats -a -- %s %s", mount["source"], dst_rootfs..mount["destination"]))
                if (status ~= 0) then
                    -- rsync can fail due to permissions which may not matter
                    slurm.log_info("rsync failed")
                end
            else
                table.insert(mounts, mount)
            end
        end
        config["mounts"] = mounts
    end

    -- Force version to one compatible with older runc/crun at risk of new features silently failing
    config["ociVersion"] = "1.0.0"

    -- Merge in Job environment into container -- this is optional!
    if (config["process"]["env"] == nil) then
        config["process"]["env"] = {}
    end
    for _, env in ipairs(job_env) do
        table.insert(config["process"]["env"], env)
    end

    -- Remove all prestart hooks to squash any networking attempts
    if ((config["hooks"] ~= nil) and (config["hooks"]["prestart"] ~= nil)) then
        config["hooks"]["prestart"] = nil
    end

    -- Remove all rlimits
    if ((config["process"] ~= nil) and (config["process"]["rlimits"] ~= nil)) then
        config["process"]["rlimits"] = nil
    end

    write_file(dst_config, json.encode(config))
    slurm.log_info("created: "..dst_config)

    return slurm.SUCCESS
end

function slurm_scrun_stage_out(id, bundle, orig_bundle, root_path, orig_root_path, spool_dir, config_file, jobid, user_id, group_id)
    if (root_path == nil) then
        root_path = ""
    end

    slurm.log_debug(string.format("stage_out(%s, %s, %s, %s, %s, %s, %s, %d, %d, %d)",
        id, bundle, orig_bundle, root_path, orig_root_path, spool_dir, config_file, jobid, user_id, group_id))

    if (bundle == orig_bundle) then
        slurm.log_info(string.format("skipping stage_out as bundle=orig_bundle=%s", bundle))
        return slurm.SUCCESS
    end

    status, output = slurm.allocator_command(string.format("/usr/bin/env rsync --numeric-ids --delete-after --ignore-errors --stats -a -- %s/ %s/", root_path, orig_root_path))
    if (status ~= 0) then
        -- rsync can fail due to permissions which may not matter
        slurm.log_info("rsync failed")
    else
        -- cleanup temporary files after they have been synced back to source
        slurm.allocator_command(string.format("/usr/bin/rm --preserve-root=all --one-file-system -dr -- %s", bundle))
    end

    return slurm.SUCCESS
end

slurm.log_info("initialized scrun.lua")

return slurm.SUCCESS
SIGNALS
When scrun receives SIGINT, it will attempt to gracefully cancel any related job (if any) and clean up.
COPYING
Copyright (C) 2023 SchedMD LLC. This file is part of Slurm, a resource management program. For details, see <https://slurm.schedmd.com/>.
Slurm is free software; you can redistribute it and/or modify it under the terms of the GNU General Public License as published by the Free Software Foundation; either version 2 of the License, or (at your option) any later version.
Slurm is distributed in the hope that it will be useful, but WITHOUT ANY WARRANTY; without even the implied warranty of MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the GNU General Public License for more details.
SEE ALSO
slurm(1), oci.conf(5), srun(1), crun, runc, DOCKER and podman