NAME
scontrol - Used to view and modify Slurm configuration and state.
SYNOPSIS
scontrol [OPTIONS...] [COMMAND...]
DESCRIPTION
scontrol is used to view or modify Slurm configuration including: job,
job step, node, partition, reservation, and overall system
configuration. Most of the commands can only be executed by user root.
If an attempt to view or modify configuration information is made by an
unauthorized user, an error message will be printed and the requested
action will not occur. If no command is entered on the command
line, scontrol will operate in an interactive mode and prompt for
input. It will continue prompting for input and executing
commands until explicitly terminated. If a command is entered on
the command line, scontrol will execute that command and
terminate. All commands and
options are case-insensitive, although node names, partition names, and
reservation names are case-sensitive (node names "LX" and "lx" are
distinct). All commands and options can be abbreviated to the extent
that the specification is unique.
OPTIONS
-a, --all
When the show command is used, display all partitions,
their jobs and job steps. This causes information to be
displayed about partitions that are configured as hidden and
partitions that are unavailable to the user’s group.
-d, --detail
Causes the show command to provide additional details where
available.
-h, --help
Print a help message describing the usage of scontrol.
--hide Do not display information about hidden partitions, their
jobs and job steps. Neither partitions that are configured
as hidden nor partitions unavailable to the user’s group
will be displayed. This is the default behavior.
-o, --oneliner
Print information one line per record.
-Q, --quiet
Print no warning or informational messages, only fatal error
messages.
-v, --verbose
Print detailed event logging. Multiple -v’s will further
increase the verbosity of logging. By default only errors will
be displayed.
-V, --version
Print version information and exit.
COMMANDS
all Show all partitions, their jobs and job steps. This causes
information to be displayed about partitions that are configured
as hidden and partitions that are unavailable to the user’s
group.
abort Instruct the Slurm controller to terminate immediately and
generate a core file. See "man slurmctld" for information about
where the core file will be written.
checkpoint CKPT_OP ID
Perform a checkpoint activity on the job step(s) with the
specified identification. ID can be used to identify a specific
job (e.g. "<job_id>", which applies to all of its existing
steps) or a specific job step (e.g. "<job_id>.<step_id>").
Acceptable values for CKPT_OP include:
disable (disable future checkpoints)
enable (enable future checkpoints)
able (test if presently not disabled, report start time if
checkpoint in progress)
create (create a checkpoint and continue the job step)
vacate (create a checkpoint and terminate the job step)
error (report the result for the last checkpoint request, error
code and message)
restart (restart execution of the previously checkpointed job
steps)
The following additional options may also be specified:
MaxWait=<seconds> maximum time for checkpoint to be written.
Default value is 10 seconds. Valid with create and
vacate options only.
ImageDir=<directory_name> Location of checkpoint file.
Valid with create, vacate and restart options only. This
value takes precedence over any --checkpoint-dir value
specified at job submission time.
StickToNodes If set, resume the job on the same nodes as
previously used.
Valid with the restart option only.
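For example, to checkpoint all steps of a job and let it continue
running (the job id and checkpoint directory are illustrative):
scontrol checkpoint create 1234 MaxWait=60 ImageDir=/tmp/ckpt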
create SPECIFICATION
Create a new partition or reservation. See the full list of
parameters below. Include the tag "res" to create a reservation
without specifying a reservation name.
completing
Display all jobs in a COMPLETING state along with associated
nodes in either a COMPLETING or DOWN state.
delete SPECIFICATION
Delete the entry with the specified SPECIFICATION. The two
SPECIFICATION choices are PartitionName=<name> and
Reservation=<name>. On dynamically laid-out BlueGene systems,
BlockName=<name> also works.
detail Causes the show command to provide additional details where
available, namely the specific CPUs and NUMA memory allocated on
each node. Note that on computers with hyperthreading enabled
and SLURM configured to allocate cores, each listed CPU
represents one physical core. Each hyperthread on that core can
be allocated a separate task, so a job’s CPU count and task
count may differ. See the --cpu_bind and --mem_bind option
descriptions in srun man pages for more information. The detail
option is currently only supported for the show job command.
exit Terminate the execution of scontrol. This command has no
options and is meant for use in interactive mode.
help Display a description of scontrol options and commands.
hide Do not display partition, job or job step information for
partitions that are configured as hidden or partitions that are
unavailable to the user’s group. This is the default behavior.
notify job_id message
Send a message to standard error of the srun command associated
with the specified job_id.
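For example (the job id and message text are illustrative):
scontrol notify 1234 "Disk quota nearly exceeded"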
oneliner
Print information one line per record.
pidinfo proc_id
Print the Slurm job id and scheduled termination time
corresponding to the supplied process id, proc_id, on the
current node. This will work only with processes on the node on
which scontrol is run, and only for those processes spawned by
SLURM and their descendants.
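For example (the process id is illustrative):
scontrol pidinfo 5678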
listpids [job_id[.step_id]] [NodeName]
Print a listing of the process IDs in a job step (if
JOBID.STEPID is provided), or all of the job steps in a job (if
job_id is provided), or all of the job steps in all of the jobs
on the local node (if job_id is not provided or job_id is "*").
This will work only with processes on the node on which scontrol
is run, and only for those processes spawned by SLURM and their
descendants. Note that some SLURM configurations (ProctrackType
value of pgid or aix) are unable to identify all processes
associated with a job or job step.
Note that the NodeName option is only really useful when you
have multiple slurmd daemons running on the same host machine.
Multiple slurmd daemons on one host are, in general, only used
by SLURM developers.
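For example, to list the processes of step 0 of job 1234 (an
illustrative job step id):
scontrol listpids 1234.0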
ping Ping the primary and secondary slurmctld daemon and report if
they are responding.
quiet Print no warning or informational messages, only fatal error
messages.
quit Terminate the execution of scontrol.
reconfigure
Instruct all Slurm daemons to re-read the configuration file.
This command does not restart the daemons. This mechanism can
be used to modify configuration parameters (Epilog, Prolog,
SlurmctldLogFile, SlurmdLogFile, etc.), register the physical
addition or removal of nodes from the cluster, or recognize a
change in a node’s configuration, such as the addition of memory
or processors. The Slurm controller (slurmctld) forwards the
request to all other daemons (the slurmd daemon on each compute
node). Running jobs continue execution. Most configuration
parameters can be changed by just running this command; however,
SLURM daemons should be shut down and restarted if any of these
parameters are to be changed: AuthType, BackupAddr,
BackupController, ControlAddr, ControlMach, PluginDir,
StateSaveLocation, SlurmctldPort or SlurmdPort.
resume job_id
Resume a previously suspended job.
requeue job_id
Requeue a running or pending SLURM batch job.
setdebug LEVEL
Change the debug level of the slurmctld daemon. LEVEL may be an
integer value between zero and nine (using the same values as
SlurmctldDebug in the slurm.conf file) or the name of the most
detailed message type to be printed: "quiet", "fatal", "error",
"info", "verbose", "debug", "debug2", "debug3", "debug4", or
"debug5". This value is temporary and will be overwritten
whenever the slurmctld daemon reads the slurm.conf configuration
file (e.g. when the daemon is restarted or scontrol reconfigure
is executed).
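For example, to raise the logging level until the next
reconfiguration:
scontrol setdebug debug2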
show ENTITY ID
Display the state of the specified entity with the specified
identification. ENTITY may be config, daemons, job, node,
partition, reservation, slurmd, step, topology, hostlist or
hostnames (also block or subbp on BlueGene systems). ID can be
used to identify a specific element of the identified entity:
the configuration parameter name, job ID, node name, partition
name, reservation name, or job step ID for config, job, node,
partition, or step respectively. For an ENTITY of topology, the
ID may be a node or switch name. If one node name is specified,
all switches connected to that node (and their parent switches)
will be shown. If more than one node name is specified, only
switches that connect to all named nodes will be shown.
hostnames takes an optional hostlist expression as input and
writes a list of individual host names to standard output (one
per line). If no hostlist expression is supplied, the contents
of the SLURM_NODELIST environment variable are used. For example,
"tux[1-3]" is mapped to "tux1", "tux2" and "tux3" (one hostname
per line). hostlist takes a list of host names and prints the
hostlist expression for them (the inverse of hostnames).
hostlist can also take the absolute pathname of a file
(beginning with the character ’/’) containing a list of
hostnames. Multiple node names may be specified using simple
node range expressions (e.g. "lx[10-20]"). All other ID values
must identify a single element. The job step ID is of the form
"job_id.step_id", (e.g. "1234.1"). slurmd reports the current
status of the slurmd daemon executing on the same node from
which the scontrol command is executed (the local host). It can
be useful to diagnose problems. By default, all elements of the
entity type specified are printed.
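For example, hostlist performs the inverse mapping of the
hostnames example above (the host names are illustrative):
scontrol show hostlist tux1,tux2,tux3
tux[1-3]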
shutdown OPTION
Instruct Slurm daemons to save current state and terminate. By
default, the Slurm controller (slurmctld) forwards the request
to all other daemons (the slurmd daemon on each compute node).
An OPTION of slurmctld or controller results in only the
slurmctld daemon being shut down and the slurmd daemons
remaining active.
suspend job_id
Suspend a running job. Use the resume command to resume its
execution. User processes must stop upon receipt of the SIGSTOP
signal and resume upon receipt of SIGCONT for this operation to
be effective. Not all architectures and configurations support
job suspension.
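For example (the job id is illustrative):
scontrol suspend 1234
scontrol resume 1234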
takeover
Instruct SLURM’s backup controller (slurmctld) to take over
system control. SLURM’s backup controller requests control from
the primary and waits for its termination. After that, it
switches from backup mode to controller mode. If the primary
controller cannot be contacted, it directly switches to
controller mode. This can be used to speed up the SLURM
controller fail-over mechanism when the primary node is down and
to minimize disruption if the computer executing the primary
SLURM controller is scheduled down. (Note: SLURM’s primary
controller will take control back at startup.)
update SPECIFICATION
Update job, node, partition, or reservation configuration per
the supplied specification. SPECIFICATION is in the same format
as the Slurm configuration file and the output of the show
command described above. It may be desirable to execute the show
command on the specific entity you wish to update, then use
cut-and-paste tools to supply updated configuration values to
the update command. Note that while most
configuration values can be changed using this command, not all
can be changed using this mechanism. In particular, the hardware
configuration of a node or the physical addition or removal of
nodes from the cluster may only be accomplished through editing
the Slurm configuration file and executing the reconfigure
command (described above).
verbose
Print detailed event logging. This includes time-stamps on data
structures, record counts, etc.
version
Display the version number of scontrol being executed.
!! Repeat the last command executed.
SPECIFICATIONS FOR UPDATE COMMAND, JOBS
Account=<account>
Account name to be changed for this job’s resource use. Value
may be cleared with blank data value, "Account=".
Conn-Type=<type>
Reset the node connection type. Possible values on Blue Gene
are "MESH", "TORUS" and "NAV" (mesh else torus).
Contiguous=<yes|no>
Set the job’s requirement for contiguous (consecutive) nodes to
be allocated. Possible values are "YES" and "NO".
Dependency=<dependency_list>
Defer job’s initiation until specified job dependency
specification is satisfied. Cancel dependency with an empty
dependency_list (e.g. "Dependency="). <dependency_list> is of
the form <type:job_id[:job_id][,type:job_id[:job_id]]>. Many
jobs can share the same dependency and these jobs may even
belong to different users.
after:job_id[:jobid...]
This job can begin execution after the specified jobs
have begun execution.
afterany:job_id[:jobid...]
This job can begin execution after the specified jobs
have terminated.
afternotok:job_id[:jobid...]
This job can begin execution after the specified jobs
have terminated in some failed state (non-zero exit code,
node failure, timed out, etc).
afterok:job_id[:jobid...]
This job can begin execution after the specified jobs
have successfully executed (ran to completion with a zero
exit code).
singleton
This job can begin execution after any previously
launched jobs sharing the same job name and user have
terminated.
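For example, to defer a pending job until two other jobs complete
successfully (all job ids are illustrative):
scontrol update JobId=1236 Dependency=afterok:1234:1235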
EligibleTime=<time_spec>
See StartTime.
ExcNodeList=<nodes>
Set the job’s list of excluded nodes. Multiple node names may be
specified using simple node range expressions (e.g.
"lx[10-20]"). Value may be cleared with blank data value,
"ExcNodeList=".
Features=<features>
Set the job’s required node features. The list of features may
include multiple feature names separated by ampersand (AND)
and/or vertical bar (OR) operators. For example:
Features="opteron&video" or Features="fast|faster". In the
first example, only nodes having both the feature "opteron" AND
the feature "video" will be used. There is no mechanism to
specify that you want one node with feature "opteron" and
another node with feature "video" in case no node has both
features. If only one of a set of possible options should be
used for all allocated nodes, then use the OR operator and
enclose the options within square brackets. For example:
"Features=[rack1|rack2|rack3|rack4]" might be used to specify
that all nodes must be allocated on a single rack of the
cluster, but any of those four racks can be used. A request can
also specify the number of nodes needed with some feature by
appending an asterisk and count after the feature name. For
example "Features=graphics*4" indicates that at least four
allocated nodes must have the feature "graphics." Constraints
with node counts may only be combined with AND operators. Value
may be cleared with blank data value, for example "Features=".
Geometry=<geo>
Reset the required job geometry. On Blue Gene the value should
be three digits separated by "x" or ",". The digits represent
the allocation size in X, Y and Z dimensions (e.g. "2x3x4").
JobId=<id>
Identify the job to be updated. This specification is required.
Licenses=<name>
Specification of licenses (or other resources available on all
nodes of the cluster) as described in salloc/sbatch/srun man
pages.
MinCPUsNode=<count>
Set the job’s minimum number of CPUs per node to the specified
value.
MinMemoryCPU=<megabytes>
Set the job’s minimum real memory required per allocated CPU to
the specified value. Either MinMemoryCPU or MinMemoryNode may
be set, but not both.
MinMemoryNode=<megabytes>
Set the job’s minimum real memory required per node to the
specified value. Either MinMemoryCPU or MinMemoryNode may be
set, but not both.
MinTmpDiskNode=<megabytes>
Set the job’s minimum temporary disk space required per node to
the specified value.
Name=<name>
Set the job’s name to the specified value.
Nice[=delta]
Adjust job’s priority by the specified value. Default value is
100. The adjustment range is from -10000 (highest priority) to
10000 (lowest priority). Nice value changes are not additive,
but overwrite any prior nice value and are applied to the job’s
base priority. Only privileged users can specify a negative
adjustment.
NumNodes=<min_count>[-<max_count>]
Set the job’s minimum and optionally maximum count of nodes to
be allocated.
NumTasks=<count>
Set the job’s count of required tasks to the specified value.
Partition=<name>
Set the job’s partition to the specified value.
Priority=<number>
Set the job’s priority to the specified value. Note that a job
priority of zero prevents the job from ever being scheduled. By
setting a job’s priority to zero it is held. Set the priority
to a non-zero value to permit it to run. Explicitly setting a
job’s priority clears any previously set nice value.
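For example, to hold a job and later release it (the job id and
priority value are illustrative):
scontrol update JobId=1234 Priority=0
scontrol update JobId=1234 Priority=500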
ReqCores=<count>
Set the job’s count of minimum cores per socket to the specified
value.
ReqNodeList=<nodes>
Set the job’s list of required nodes. Multiple node names may be
specified using simple node range expressions (e.g.
"lx[10-20]"). Value may be cleared with blank data value,
"ReqNodeList=".
ReqSockets=<count>
Set the job’s count of minimum sockets per node to the specified
value.
ReqThreads=<count>
Set the job’s count of minimum threads per core to the specified
value.
Requeue=<0|1>
Stipulates whether a job should be requeued after a node
failure: 0 for no, 1 for yes.
ReservationName=<name>
Set the job’s reservation to the specified value.
Rotate=<yes|no>
Permit the job’s geometry to be rotated. Possible values are
"YES" and "NO".
Shared=<yes|no>
Set the job’s ability to share nodes with other jobs. Possible
values are "YES" and "NO".
StartTime=<time_spec>
Set the job’s earliest initiation time. It accepts times of the
form HH:MM:SS to run a job at a specific time of day (seconds
are optional). (If that time is already past, the next day is
assumed.) You may also specify midnight, noon, or teatime (4pm)
and you can have a time-of-day suffixed with AM or PM for
running in the morning or the evening. You can also say what
day the job will be run, by specifying a date of the form MMDDYY
or MM/DD/YY or MM.DD.YY, or a date and time as
YYYY-MM-DD[THH:MM[:SS]]. You can also give times like now +
count time-units, where the time-units can be minutes, hours,
days, or weeks and you can tell SLURM to run the job today with
the keyword today and to run the job tomorrow with the keyword
tomorrow.
Notes on date/time specifications:
- although the ’seconds’ field of the HH:MM:SS time
specification is allowed by the code, note that the poll time of
the SLURM scheduler is not precise enough to guarantee dispatch
of the job on the exact second. The job will be eligible to
start on the next poll following the specified time. The exact
poll interval depends on the SLURM scheduler (e.g., 60 seconds
with the default sched/builtin).
- if no time (HH:MM:SS) is specified, the default is
(00:00:00).
- if a date is specified without a year (e.g., MM/DD) then the
current year is assumed, unless the combination of MM/DD and
HH:MM:SS has already passed for that year, in which case the
next year is used.
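For example (the job id and times are illustrative):
scontrol update JobId=1234 StartTime=now+1hour
scontrol update JobId=1234 StartTime=2010-02-01T12:00:00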
TimeLimit=<time>
The job’s time limit. Output format is
[days-]hours:minutes:seconds or "UNLIMITED". Input format (for
the update command) is minutes, minutes:seconds,
hours:minutes:seconds, days-hours, days-hours:minutes or
days-hours:minutes:seconds. Time resolution is one minute and
second values are rounded up to the next minute.
WCKey=<key>
Set the job’s workload characterization key to the specified
value.
NOTE: The "show" command, when used with the "job" or "job <jobid>"
entity displays detailed information about a job or jobs. Much
of this information may be modified using the "update job"
command as described above. However, the following fields
displayed by the show job command are read-only and cannot be
modified:
AllocNode:Sid
Local node and system id making the resource allocation.
EndTime
The time the job is expected to terminate based on the job’s
time limit. When the job ends sooner, this field will be
updated with the actual end time.
ExitCode=<exit>:<sig>
Exit status reported for the job by the wait() function. The
first number is the exit code, typically as set by the exit()
function. The second number is the signal that caused the
process to terminate, if it was terminated by a signal.
JobState
The current state of the job.
NodeList
The list of nodes allocated to the job.
NodeListIndices
The NodeIndices expose the internal indices into the node table
associated with the node(s) allocated to the job.
PreSusTime
Time the job ran prior to last suspend.
Reason The reason the job is not running, e.g. waiting for "Resources".
SuspendTime
Time the job was last suspended or resumed.
UserId GroupId
The user and group under which the job was submitted.
NOTE on information displayed for various job states:
When you submit a request for the "show job" function, the
scontrol process makes an RPC call to slurmctld with a
REQUEST_JOB_INFO message type. If the state of the job is
PENDING, then it returns some detail information such as:
min_nodes, min_procs, cpus_per_task, etc. If the state is other
than PENDING the code assumes that it is in a further state such
as RUNNING, COMPLETE, etc. In these cases the code explicitly
returns zero for these values. These values are meaningless once
the job resources have been allocated and the job has started.
SPECIFICATIONS FOR UPDATE COMMAND, NODES
NodeName=<name>
Identify the node(s) to be updated. Multiple node names may be
specified using simple node range expressions (e.g.
"lx[10-20]"). This specification is required.
Features=<features>
Identify feature(s) to be associated with the specified node.
Any previously defined feature(s) will be overwritten with the
new value. NOTE: Features assigned via scontrol do not survive
the restart of the slurmctld nor will they survive scontrol
reconfigure if Features are defined in slurm.conf. Update
slurm.conf with any changes meant to be persistent.
Reason=<reason>
Identify the reason the node is in a "DOWN", "DRAINED",
"DRAINING", "FAILING" or "FAIL" state. Use quotes to enclose a
reason having more than one word.
State=<state>
Identify the state to be assigned to the node. Possible values
are "NoResp", "ALLOC", "ALLOCATED", "DOWN", "DRAIN", "FAIL",
"FAILING", "IDLE", "MIXED", "MAINT", "POWER_DOWN", "POWER_UP",
or "RESUME". If a node is in a "MIXED" state it usually means
the node is in multiple states. For instance if only part of
the node is "ALLOCATED" and the rest of the node is "IDLE" the
state will be "MIXED". If you want to remove a node from
service, you typically want to set its state to "DRAIN".
"FAILING" is similar to "DRAIN" except that some applications
will seek to relinquish those nodes before the job completes.
"RESUME" is not an actual node state, but will return a
"DRAINED", "DRAINING", or "DOWN" node to service, either "IDLE"
or "ALLOCATED" state as appropriate. Setting a node "DOWN" will
cause all running and suspended jobs on that node to be
terminated. "POWER_DOWN" and "POWER_UP" will use the configured
SuspendProg and ResumeProg programs to explicitly place a node
in or out of a power saving mode. The "NoResp" state will only
set the "NoResp" flag for a node without changing its underlying
state. While all of the above states are valid, some of them
are not valid new node states given their prior state.
Generally only "DRAIN", "FAIL" and "RESUME" should be used.
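For example, to remove a node from service (the node name and
reason are illustrative):
scontrol update NodeName=lx10 State=DRAIN Reason="memory errors"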
Weight=<weight>
Identify weight to be associated with specified nodes. This
allows dynamic changes to weight associated with nodes, which
will be used for subsequent node allocation decisions. Any
previously identified weight will be overwritten with the new
value. NOTE: The Weight associated with nodes will be reset to
the values specified in slurm.conf (if any) upon slurmctld
restart or reconfiguration. Update slurm.conf with any changes
meant to be persistent.
SPECIFICATIONS FOR CREATE, UPDATE, AND DELETE COMMANDS, PARTITIONS
AllowGroups=<name>
Identify the user groups which may use this partition. Multiple
groups may be specified in a comma separated list. To permit
all groups to use the partition specify "AllowGroups=ALL".
AllocNodes=<name>
Comma separated list of nodes from which users can execute jobs
in the partition. Node names may be specified using the node
range expression syntax described above. The default value is
"ALL".
Default=<yes|no>
Specify if this partition is to be used by jobs which do not
explicitly identify a partition to use. Possible output values
are "YES" and "NO". In order to change the default partition of
a running system, use the scontrol update command and set
Default=yes for the partition that you want to become the new
default.
DefaultTime=<time>
Run time limit used for jobs that don’t specify a value. If not
set then MaxTime will be used. Format is the same as for
MaxTime.
Hidden=<yes|no>
Specify if the partition and its jobs should be hidden from
view. Hidden partitions will by default not be reported by
SLURM APIs or commands. Possible values are "YES" and "NO".
MaxNodes=<count>
Set the maximum number of nodes which will be allocated to any
single job in the partition. Specify a number, "INFINITE" or
"UNLIMITED". (On a Bluegene type system this represents a
c-node count.)
MaxTime=<time>
The maximum run time for jobs. Output format is
[days-]hours:minutes:seconds or "UNLIMITED". Input format (for
update command) is minutes, minutes:seconds,
hours:minutes:seconds, days-hours, days-hours:minutes or
days-hours:minutes:seconds. Time resolution is one minute and
second values are rounded up to the next minute.
MinNodes=<count>
Set the minimum number of nodes which will be allocated to any
single job in the partition. (On a Bluegene type system this
represents a c-node count.)
Nodes=<name>
Identify the node(s) to be associated with this partition.
Multiple node names may be specified using simple node range
expressions (e.g. "lx[10-20]"). Note that jobs may only be
associated with one partition at any time. Specify a blank data
value to remove all nodes from a partition: "Nodes=".
PartitionName=<name>
Identify the partition to be updated. This specification is
required.
Priority=<count>
Jobs submitted to a higher priority partition will be dispatched
before pending jobs in lower priority partitions and if possible
they will preempt running jobs from lower priority partitions.
Note that a partition’s priority takes precedence over a job’s
priority. The value may not exceed 65533.
RootOnly=<yes|no>
Specify if only allocation requests initiated by user root will
be satisfied. This can be used to restrict control of the
partition to some meta-scheduler. Possible values are "YES" and
"NO".
Shared=<yes|no|exclusive|force>[:<job_count>]
Specify if nodes in this partition can be shared by multiple
jobs. Possible values are "YES", "NO", "EXCLUSIVE" and "FORCE".
An optional job count specifies how many jobs can be allocated
to use each resource.
State=<up|down>
Specify if jobs can be allocated nodes in this partition.
Possible values are "UP" and "DOWN". If a partition has nodes
allocated to running jobs, those jobs will continue execution
even after the partition’s state is set to "DOWN". The jobs must
be explicitly canceled to force their termination.
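For example, to create a partition and later adjust its limits
(the partition name and node names are illustrative):
scontrol create PartitionName=batch Nodes=lx[10-20] MaxTime=60 State=UP
scontrol update PartitionName=batch MaxNodes=4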
SPECIFICATIONS FOR CREATE, UPDATE, AND DELETE COMMANDS, RESERVATIONS
Reservation=<name>
Identify the name of the reservation to be created,
updated, or deleted. This parameter is required for
update and is the only parameter for delete. For create,
if you do not want to give a reservation name, use
"scontrol create res ..." and a name will be created
automatically.
Licenses=<license>
Specification of licenses (or other resources available
on all nodes of the cluster) which are to be reserved.
License names can be followed by an asterisk and count
(the default count is one). Multiple license names
should be comma separated (e.g. "Licenses=foo*4,bar").
NodeCnt=<num>
Identify number of nodes to be reserved. A new
reservation must specify either NodeCnt or Nodes.
Nodes=<name>
Identify the node(s) to be reserved. Multiple node names
may be specified using simple node range expressions
(e.g. "Nodes=lx[10-20]"). Specify a blank data value to
remove all nodes from a reservation: "Nodes=". A new
reservation must specify either NodeCnt or Nodes.
StartTime=<time_spec>
The start time for the reservation. A new reservation
must specify a start time. It accepts times of the form
HH:MM:SS for a specific time of day (seconds are
optional). (If that time is already past, the next day
is assumed.) You may also specify midnight, noon, or
teatime (4pm) and you can have a time-of-day suffixed
with AM or PM for running in the morning or the evening.
You can also say what day the reservation will start, by
specifying a date of the form MMDDYY or MM/DD/YY or
MM.DD.YY, or a date and time as YYYY-MM-DD[THH:MM[:SS]].
You can also give times like now + count time-units,
where the time-units can be minutes, hours, days, or
weeks, and you can use the keyword today or tomorrow to
have the reservation start today or tomorrow.
EndTime=<time_spec>
The end time for the reservation. A new reservation must
specify an end time or a duration. Valid formats are the
same as for StartTime.
Duration=<time>
The length of a reservation. A new reservation must
specify an end time or a duration. Valid formats are
minutes, minutes:seconds, hours:minutes:seconds,
days-hours, days-hours:minutes,
days-hours:minutes:seconds, or UNLIMITED. Time
resolution is one minute and second values are rounded up
to the next minute.
PartitionName=<name>
Identify the partition to be reserved.
Flags=<flags>
Flags associated with the reservation. In order to
remove a flag with the update option, precede the name
with a minus sign. For example: Flags=-DAILY (NOTE: this
option is not supported for all flags). Currently
supported flags include:
MAINT Maintenance mode, receives special accounting
treatment. This reservation is permitted to
use resources that are already in another
reservation.
OVERLAP This reservation can be allocated resources
that are already in another reservation.
IGNORE_JOBS Ignore currently running jobs when creating
the reservation. This can be especially
useful when reserving all nodes in the system
for maintenance.
DAILY Repeat the reservation at the same time every
day
WEEKLY Repeat the reservation at the same time every
week
SPEC_NODES Reservation is for specific nodes (output
only)
Features=<features>
Set the reservation’s required node features. Multiple
values may be "&" separated if all features are required
(AND operation) or separated by "|" if any of the
specified features are required (OR operation). Value
may be cleared with blank data value, "Features=".
Users=<user list>
List of users permitted to use the reserved nodes. E.g.
Users=jones1,smith2. A new reservation must specify
Users and/or Accounts.
Accounts=<account list>
List of accounts permitted to use the reserved nodes.
E.g. Accounts=physcode1,physcode2. A user in any of the
accounts may use the reserved nodes. A new reservation
must specify Users and/or Accounts.
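For example, to reserve specific nodes for users of one account
(the names and times are illustrative):
scontrol create res Reservation=acct_res Nodes=lx[10-20] StartTime=now Duration=120 Accounts=physcode1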
SPECIFICATIONS FOR UPDATE, BLOCK
Bluegene systems only!
BlockName=<name>
Identify the bluegene block to be updated. This
specification is required.
State=<free|error|remove>
This will update the state of a bluegene block to either
FREE or ERROR (e.g. "update BlockName=RMP0 State=ERROR").
A block in the ERROR state will not allow jobs to run on it.
WARNING: This will cancel any running job on the
block! On dynamically laid out systems, REMOVE will free
and remove the block from the system. If the block is
smaller than a midplane, every block on that midplane will
be removed.
SubBPName=<name>
Identify the bluegene ionodes to be updated (e.g.
bg000[0-3]). This specification is required.
ENVIRONMENT VARIABLES
Some scontrol options may be set via environment variables.
These environment variables, along with their corresponding
options, are listed below. (Note: Command-line options will
always override these settings.)
SCONTROL_ALL -a, --all
SLURM_CONF The location of the SLURM configuration
file.
EXAMPLES
# scontrol
scontrol: show part debug
PartitionName=debug
AllocNodes=ALL AllowGroups=ALL Default=YES
DefaultTime=NONE DisableRootJobs=NO Hidden=NO
MaxNodes=UNLIMITED MaxTime=UNLIMITED MinNodes=1
Nodes=snowflake[0-48]
Priority=1 RootOnly=NO Shared=YES:4
State=UP TotalCPUs=694 TotalNodes=49
scontrol: update PartitionName=debug MaxTime=60:00 MaxNodes=4
scontrol: show job 71701
JobId=71701 Name=hostname
UserId=da(1000) GroupId=da(1000)
Priority=66264 Account=none QOS=normal WCKey=*123
JobState=COMPLETED Reason=None Dependency=(null)
TimeLimit=UNLIMITED Requeue=1 Restarts=0 BatchFlag=0
ExitCode=0:0
SubmitTime=2010-01-05T10:58:40
EligibleTime=2010-01-05T10:58:40
StartTime=2010-01-05T10:58:40 EndTime=2010-01-05T10:58:40
SuspendTime=None SecsPreSuspend=0
Partition=debug AllocNode:Sid=snowflake:4702
ReqNodeList=(null) ExcNodeList=(null)
NodeList=snowflake0
NumNodes=1 NumCPUs=10 CPUs/Task=2 ReqS:C:T=1:1:1
MinCPUsNode=2 MinMemoryNode=0 MinTmpDiskNode=0
Features=(null) Reservation=(null)
Shared=OK Contiguous=0 Licenses=(null) Network=(null)
scontrol: update JobId=71701 TimeLimit=30:00 Priority=500
scontrol: show hostnames tux[1-3]
tux1
tux2
tux3
scontrol: create res StartTime=2009-04-01T08:00:00
Duration=5:00:00 Users=dbremer NodeCnt=10
Reservation created: dbremer_1
scontrol: update Reservation=dbremer_1 Flags=Maint NodeCnt=20
scontrol: delete Reservation=dbremer_1
scontrol: quit
COPYING
Copyright (C) 2002-2007 The Regents of the University of
California. Produced at Lawrence Livermore National Laboratory
(cf, DISCLAIMER). CODE-OCEC-09-009. All rights reserved.
This file is part of SLURM, a resource management program. For
details, see <https://computing.llnl.gov/linux/slurm/>.
SLURM is free software; you can redistribute it and/or modify it
under the terms of the GNU General Public License as published
by the Free Software Foundation; either version 2 of the
License, or (at your option) any later version.
SLURM is distributed in the hope that it will be useful, but
WITHOUT ANY WARRANTY; without even the implied warranty of
MERCHANTABILITY or FITNESS FOR A PARTICULAR PURPOSE. See the
GNU General Public License for more details.
FILES
/etc/slurm.conf
SEE ALSO
scancel(1), sinfo(1), squeue(1), slurm_checkpoint(3),
slurm_create_partition(3), slurm_delete_partition(3),
slurm_load_ctl_conf(3), slurm_load_jobs(3), slurm_load_node(3),
slurm_load_partitions(3), slurm_reconfigure(3),
slurm_requeue(3), slurm_resume(3), slurm_shutdown(3),
slurm_suspend(3), slurm_takeover(3), slurm_update_job(3),
slurm_update_node(3), slurm_update_partition(3), slurm.conf(5),
slurmctld(8)