squeue - view information about jobs located in the SLURM scheduling queue

NAME

       squeue  -  view  information about jobs located in the SLURM scheduling
       queue.

SYNOPSIS

       squeue [OPTIONS...]

DESCRIPTION

       squeue is used to view job and job step information for jobs managed by
       SLURM.

OPTIONS

       -A <account_list>, --account=<account_list>
              Specify  the  accounts  of  the  jobs  to  view. Accepts a comma
              separated list of account names. This has no effect when listing
              job steps.

       -a, --all
              Display information about jobs and job steps in all partitions.
              This causes information to be displayed about partitions that
              are configured as hidden and partitions that are unavailable to
              the user’s group.

       -h, --noheader
              Do not print a header on the output.

       --help Print a help message describing all squeue options.

       --hide Do not display information about jobs and job steps in all
              partitions. Information about partitions that are configured
              as hidden or are not available to the user’s group is not
              displayed. This is the default behavior.

       -i <seconds>, --iterate=<seconds>
              Repeatedly  gather  and  report the requested information at the
              interval specified (in seconds).   By  default,  prints  a  time
              stamp with the header.

       -j <job_id_list>, --jobs=<job_id_list>
              Requests a comma separated list of job ids to display.  Defaults
              to all jobs.  The --jobs=<job_id_list> option  may  be  used  in
              conjunction  with  the  --steps option to print step information
              about specific jobs.

       -l, --long
              Report more of the available information for the  selected  jobs
              or job steps, subject to any constraints specified.

       -n <hostlist>, --nodes=<hostlist>
              Report  only  on jobs allocated to the specified node or list of
              nodes.  This may either  be  the  NodeName  or  NodeHostname  as
              defined  in  slurm.conf(5)  in  the  event  that they differ.  A
              node_name of localhost is mapped to the current host name.

       -o <output_format>, --format=<output_format>
              Specify the information to be displayed, its size  and  position
              (right  or  left  justified).   The default formats with various
              options are

              default        "%.7i %.9P %.8j %.8u %.2t %.10M %.6D %R"

              -l, --long     "%.7i %.9P %.8j %.8u %.8T %.10M %.9l %.6D %R"

              -s, --steps    "%10i %.8j %.9P %.8u %.9M %N"

              The format of each field is "%[.][size]type".

              size    is the minimum field size.  If  no  size  is  specified,
                      whatever  is  needed  to  print  the information will be
                      used.

               .      indicates the output should be right justified and  size
                      must   be   specified.    By  default,  output  is  left
                      justified.

              Note that many of these type specifications are valid  only  for
              jobs  while  others  are  valid  only for job steps.  Valid type
              specifications include:

              %a  Account associated with the job.  (Valid for jobs only)

              %A  Number of tasks created by a job  step.   This  reports  the
                  value  of  the  srun  --ntasks option.  (Valid for job steps
                  only)

              %c  Minimum number of CPUs (processors) per  node  requested  by
                  the  job.   This  reports  the  value  of the srun --mincpus
                  option with a default value of zero.  (Valid for jobs only)

              %C  Number  of  CPUs  (processors)  requested  by  the  job   or
                  allocated  to it if already running.  As a job is completing
                  this  number  will  reflect  the  current  number  of   CPUs
                  allocated.  (Valid for jobs only)

              %d  Minimum  size  of  temporary disk space (in MB) requested by
                  the job.  (Valid for jobs only)

              %D  Number of nodes allocated to the job or the  minimum  number
                  of  nodes  required  by  a pending job. The actual number of
                  nodes allocated to a pending job may exceed this  number  if
                  the  job  specified  a  node  range count (e.g.  minimum and
                  maximum node counts) or the job specifies a processor
                  count instead of a node count and the cluster contains nodes
                  with varying processor counts. As a job is  completing  this
                  number  will  reflect the current number of nodes allocated.
                  (Valid for jobs only)

              %e  Time at which the job ended or is  expected  to  end  (based
                  upon its time limit).  (Valid for jobs only)

              %E  Job dependency. This job will not begin execution until the
                  job it depends on completes. A value of zero implies this
                  job has no dependencies. (Valid for jobs only)

              %f  Features required by the job.  (Valid for jobs only)

              %g  Group name of the job.  (Valid for jobs only)

              %G  Group ID of the job.  (Valid for jobs only)

              %h  Whether the nodes allocated to the job can be shared with
                  other jobs. (Valid for jobs only)

              %H  Minimum number of sockets per node  requested  by  the  job.
                  This  reports  the  value  of  the  srun  --sockets-per-node
                  option.  (Valid for jobs only)

              %i  Job or job step id.  (Valid for jobs and job steps)

              %I  Minimum number of cores per socket  requested  by  the  job.
                  This  reports  the  value  of  the  srun  --cores-per-socket
                  option.  (Valid for jobs only)

              %j  Job or job step name.  (Valid for jobs and job steps)

              %J  Minimum number of threads per core  requested  by  the  job.
                  This  reports  the  value  of  the  srun  --threads-per-core
                  option.  (Valid for jobs only)

              %k  Comment associated with the job.  (Valid for jobs only)

              %l  Time    limit    of    the    job    or    job    step    in
                  days-hours:minutes:seconds.   The  value may be "NOT_SET" if
                  not yet established or "UNLIMITED" for no limit.  (Valid for
                  jobs and job steps)

              %L  Time     left     for     the     job    to    execute    in
                  days-hours:minutes:seconds.  This  value  is  calculated  by
                  subtracting  the  job’s  time used from its time limit.  The
                  value may be "NOT_SET" if not yet established or "UNLIMITED"
                  for no limit.  (Valid for jobs only)

              %m  Minimum size of memory (in MB) requested by the job.  (Valid
                  for jobs only)

              %M  Time    used    by    the    job    or    job    step     in
                  days-hours:minutes:seconds.   The days and hours are printed
                  only as needed.  For job steps this field shows the  elapsed
                  time  since  execution began and thus will be inaccurate for
                  job steps which have been  suspended.   Clock  skew  between
                  nodes  in  the cluster will cause the time to be inaccurate.
                  If the time is obviously wrong (e.g. negative), it  displays
                  as "INVALID".  (Valid for jobs and job steps)

              %n  List  of node names (or base partitions on BlueGene systems)
                  explicitly requested by the job.  (Valid for jobs only)

              %N  List of nodes allocated to the job or job step. In the  case
                  of  a  COMPLETING  job, the list of nodes will comprise only
                  those nodes that have not  yet  been  returned  to  service.
                  (Valid for jobs and job steps)

              %O  Whether contiguous nodes are requested by the job. (Valid
                  for jobs only)

              %p  Priority of the job (converted to a  floating  point  number
                  between 0.0 and 1.0).  Also see %Q.  (Valid for jobs only)

              %P  Partition  of  the job or job step.  (Valid for jobs and job
                  steps)

              %q  Quality of service associated with the job.  (Valid for jobs
                  only)

              %Q  Priority  of  the  job  (generally  a  very  large  unsigned
                  integer).  Also see %p.  (Valid for jobs only)

              %r  The reason a job is in  its  current  state.   See  the  JOB
                  REASON CODES section below for more information.  (Valid for
                  jobs only)

              %R  For pending jobs: the reason a job is waiting for execution
                  is printed within parentheses. For terminated jobs with
                  failure: an explanation as to why the job failed is printed
                  within parentheses. For all other job states: the list of
                  allocated nodes. See the JOB REASON CODES section below for
                  more information. (Valid for jobs only)

              %s  Node selection plugin specific data for a job. Possible data
                  includes: Geometry requirement of resource allocation (X,Y,Z
                  dimensions),  Connection  type (TORUS, MESH, or NAV == torus
                  else mesh), Permit rotation of geometry (yes  or  no),  Node
                  use (VIRTUAL or COPROCESSOR), etc.  (Valid for jobs only)

              %S  Actual  or  expected  start  time  of  the  job or job step.
                  (Valid for jobs and job steps)

              %t  Job state, compact form: PD (pending), R (running), S
                  (suspended), CA (cancelled), CF (configuring), CG
                  (completing), CD (completed), F (failed), TO (timeout),
                  and NF (node failure). See the JOB STATE CODES section
                  below for more information. (Valid for jobs only)

              %T  Job  state,  extended  form:  PENDING,  RUNNING,  SUSPENDED,
                  CANCELLED,   COMPLETING,   COMPLETED,  CONFIGURING,  FAILED,
                  TIMEOUT, and NODE_FAIL.  See the  JOB  STATE  CODES  section
                  below for more information.  (Valid for jobs only)

              %u  User  name  for  a job or job step.  (Valid for jobs and job
                  steps)

              %U  User ID for a job or job step.   (Valid  for  jobs  and  job
                  steps)

              %v  Reservation for the job.  (Valid for jobs only)

              %x  List  of  node names explicitly excluded by the job.  (Valid
                  for jobs only)

              %z  Number of requested sockets, cores, and threads (S:C:T)  per
                  node for the job.  (Valid for jobs only)
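
              The "%[.][size]type" field syntax described above for --format
              can be illustrated with a small Python sketch. This is not
              squeue's own parser: the names FieldSpec, parse_format and
              render are hypothetical, and the sketch only pads values to
              the minimum size as described here, without modeling any
              other behavior of the real command.

```python
import re
from typing import List, NamedTuple, Optional


class FieldSpec(NamedTuple):
    """One "%[.][size]type" field (hypothetical helper type)."""
    right_justified: bool
    size: Optional[int]
    type_char: str


# "%", an optional ".", an optional decimal size, then one type letter.
_FIELD_RE = re.compile(r"%(\.?)(\d*)([A-Za-z])")


def parse_format(fmt: str) -> List[FieldSpec]:
    """Split an output format string into its field specifications."""
    specs = []
    for dot, size, type_char in _FIELD_RE.findall(fmt):
        if dot and not size:
            # The text above says size must be given when "." is used.
            raise ValueError(f"'%.{type_char}' needs a size")
        specs.append(FieldSpec(bool(dot), int(size) if size else None, type_char))
    return specs


def render(spec: FieldSpec, value: str) -> str:
    """Pad a value to the minimum field size, left justified by default."""
    if spec.size is None:
        return value
    if spec.right_justified:
        return value.rjust(spec.size)
    return value.ljust(spec.size)


default_fields = parse_format("%.7i %.9P %.8j %.8u %.2t %.10M %.6D %R")
print(default_fields[0])                    # the job id field
print(repr(render(default_fields[0], "65543")))
```

              Keeping the parsed fields as tuples makes it easy to check
              which type letters a format string uses before passing it to
              squeue.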

       -p <part_list>, --partition=<part_list>
              Specify  the  partitions of the jobs or steps to view. Accepts a
              comma separated list of partition names.

       -q <qos_list>, --qos=<qos_list>
              Specify the quality of service (QOS) values of the jobs or
              steps to view. Accepts a comma separated list of QOS names.

       -s, --steps
              Specify the job steps to view.  This flag indicates that a comma
              separated list of job steps to view  follows  without  an  equal
              sign  (see  examples).  The job step format is "job_id.step_id".
              Defaults to all job steps.

       -S <sort_list>, --sort=<sort_list>
              Specification of the order in which records should be  reported.
              This uses the same field specification as the <output_format>.
              Multiple sorts may be performed by listing multiple sort  fields
              separated  by  commas.  The field specifications may be preceded
              by "+" or "-"  for  ascending  (default)  and  descending  order
              respectively.   For example, a sort value of "P,U" will sort the
              records by partition name then by user id.  The default value of
              sort for jobs is "P,t,-p" (increasing partition name then within
              a given partition by increasing job state and then decreasing
              priority).   The  default  value  of sort for job steps is "P,i"
              (increasing partition name then  within  a  given  partition  by
              increasing step id).
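
              A minimal Python sketch of how a <sort_list> such as "P,t,-p"
              could be interpreted follows. The function names and the
              columns mapping (type letter to record key) are hypothetical,
              and values are compared with ordinary Python ordering, which
              may differ from how squeue itself orders fields such as job
              state.

```python
from typing import Dict, List, Tuple


def parse_sort(sort_list: str) -> List[Tuple[str, bool]]:
    """Return (field, descending) pairs for a comma separated sort list."""
    keys = []
    for item in sort_list.split(","):
        item = item.strip()
        if not item:
            continue
        descending = item.startswith("-")   # "+" or no prefix means ascending
        keys.append((item.lstrip("+-"), descending))
    return keys


def sort_records(records: List[dict], sort_list: str,
                 columns: Dict[str, str]) -> List[dict]:
    """Sort records by several fields, the first listed field taking priority."""
    ordered = list(records)
    # Python's sort is stable, so sorting by the last key first and the
    # first key last yields the usual multi-key ordering.
    for field, descending in reversed(parse_sort(sort_list)):
        ordered.sort(key=lambda rec: rec[columns[field]], reverse=descending)
    return ordered


jobs = [
    {"id": 1, "partition": "debug", "priority": 1},
    {"id": 2, "partition": "batch", "priority": 5},
    {"id": 3, "partition": "debug", "priority": 9},
]
columns = {"P": "partition", "p": "priority"}   # hypothetical mapping
print([job["id"] for job in sort_records(jobs, "P,-p", columns)])
```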

       --start
              Report  the  expected  start  time  of  pending jobs in order of
              increasing start time.  This  is  equivalent  to  the  following
              options:  --format="%.7i  %.9P  %.8j  %.8u %.2t  %.19S %.6D %R",
              --sort=S and --states=PENDING.  Any  of  these  options  may  be
              explicitly  changed  as  desired by combining the --start option
              with other  option  values  (e.g.  to  use  a  different  output
              format).   The  expected  start  time  of  pending  jobs is only
              available if SLURM is configured to use the backfill
              scheduling plugin.
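
              The equivalence described for --start can be sketched in
              Python as a set of defaults that explicitly supplied options
              replace. START_DEFAULTS and start_args are hypothetical names
              used only for illustration, and the format string is written
              here with single spaces between fields.

```python
from typing import Dict, List, Optional

# The options that --start is documented above as being equivalent to.
START_DEFAULTS = {
    "--format": "%.7i %.9P %.8j %.8u %.2t %.19S %.6D %R",
    "--sort": "S",
    "--states": "PENDING",
}


def start_args(overrides: Optional[Dict[str, str]] = None) -> List[str]:
    """Build an argument list equivalent to --start plus explicit changes."""
    options = dict(START_DEFAULTS)
    options.update(overrides or {})       # explicit options replace defaults
    return ["squeue"] + [f"{name}={value}" for name, value in options.items()]


print(start_args())
print(start_args({"--format": "%i %S"}))
```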

       -t <state_list>, --states=<state_list>
              Specify  the  states of jobs to view.  Accepts a comma separated
              list of state names or "all". If "all" is specified then jobs of
              all  states  will  be  reported.  If  no state is specified then
              pending, running, and completing jobs are reported. Valid states
              (in  both  extended  and  compact  form)  include: PENDING (PD),
              RUNNING (R), SUSPENDED (S),  COMPLETING  (CG),  COMPLETED  (CD),
              CONFIGURING  (CF), CANCELLED (CA), FAILED (F), TIMEOUT (TO), and
              NODE_FAIL (NF). Note that the <state_list> supplied is case
              insensitive ("pd" and "PD" work the same). See the JOB STATE
              CODES section below for more information.
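
              As an illustration of the rules above (case insensitivity,
              compact and extended forms, "all", and the default selection),
              here is a small Python sketch. The table and the function
              normalize_states are hypothetical helpers for scripts that
              wrap squeue, not part of SLURM.

```python
from typing import List, Optional

# Compact code -> extended name, as listed for --states above.
STATE_CODES = {
    "PD": "PENDING", "R": "RUNNING", "S": "SUSPENDED", "CG": "COMPLETING",
    "CD": "COMPLETED", "CF": "CONFIGURING", "CA": "CANCELLED",
    "F": "FAILED", "TO": "TIMEOUT", "NF": "NODE_FAIL",
}
# States reported when no state is specified.
DEFAULT_STATES = ["PENDING", "RUNNING", "COMPLETING"]


def normalize_states(state_list: Optional[str] = None) -> List[str]:
    """Turn a comma separated state list into extended state names."""
    if not state_list:
        return list(DEFAULT_STATES)
    names = [s.strip().upper() for s in state_list.split(",") if s.strip()]
    if "ALL" in names:
        return list(STATE_CODES.values())
    result = []
    for name in names:
        extended = STATE_CODES.get(name, name)   # accept either form
        if extended not in STATE_CODES.values():
            raise ValueError(f"unknown job state: {name}")
        if extended not in result:
            result.append(extended)
    return result


print(normalize_states("pd,R"))
print(normalize_states())
```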

       -u <user_list>, --user=<user_list>
              Request jobs or job steps from a comma separated list of  users.
              The list can consist of user names or user id numbers.

       --usage
              Print a brief help message listing the squeue options.

       -v, --verbose
              Report details of squeue’s actions.

       -V, --version
              Print version information and exit.

JOB REASON CODES

       These codes identify the reason that a job is waiting for execution.  A
       job may be waiting for more than one reason, in which case only one  of
       those reasons is displayed.

       Dependency          This job is waiting for a job it depends on to
                           complete.

       None                No reason is set for this job.

       PartitionDown       The partition required by this job  is  in  a  DOWN
                           state.

       PartitionNodeLimit  The number of nodes required by this job is outside
                           of its partition’s current limits. Can also
                           indicate that required nodes are DOWN or DRAINED.

       PartitionTimeLimit  The job’s time limit exceeds its partition’s
                           current time limit.

       Priority            One or more higher priority  jobs  exist  for  this
                           partition.

       Resources           The   job   is  waiting  for  resources  to  become
                           available.

       NodeDown            A node required by the job is down.

       BadConstraints      The job’s constraints cannot be satisfied.

       SystemFailure       Failure of the SLURM system,  a  file  system,  the
                           network, etc.

       JobLaunchFailure    The  job could not be launched.  This may be due to
                           a file system problem, invalid program name, etc.

       NonZeroExitCode     The job terminated with a non-zero exit code.

       TimeLimit           The job exhausted its time limit.

       InactiveLimit       The job reached the system InactiveLimit.

JOB STATE CODES

       Jobs typically pass through several  states  in  the  course  of  their
       execution.    The  typical  states  are  PENDING,  RUNNING,  SUSPENDED,
       COMPLETING, and COMPLETED.  An explanation of each state follows.

       CA  CANCELLED       Job was explicitly cancelled by the user or  system
                           administrator.   The  job  may or may not have been
                           initiated.

       CD  COMPLETED       Job has terminated all processes on all nodes.

       CF  CONFIGURING     Job has been allocated resources, but is waiting
                           for them to become ready for use (e.g. booting).

       CG  COMPLETING      Job is in the process of completing. Some processes
                           on some nodes may still be active.

       F   FAILED          Job terminated with non-zero  exit  code  or  other
                           failure condition.

       NF  NODE_FAIL       Job  terminated  due  to  failure  of  one  or more
                           allocated nodes.

       PD  PENDING         Job is awaiting resource allocation.

       R   RUNNING         Job currently has an allocation.

       S   SUSPENDED       Job has  an  allocation,  but  execution  has  been
                           suspended.

       TO  TIMEOUT         Job terminated upon reaching its time limit.

ENVIRONMENT VARIABLES

       Some  squeue  options  may  be  set  via  environment  variables. These
       environment variables, along  with  their  corresponding  options,  are
       listed below. (Note: Command line options will always override these
       settings.)

       SLURM_CONF          The location of the SLURM configuration file.

       SQUEUE_ACCOUNT      -A <account_list>, --account=<account_list>

       SQUEUE_ALL          -a, --all

       SQUEUE_FORMAT       -o <output_format>, --format=<output_format>

       SQUEUE_PARTITION    -p <part_list>, --partition=<part_list>

       SQUEUE_QOS          -q <qos_list>, --qos=<qos_list>

       SQUEUE_SORT         -S <sort_list>, --sort=<sort_list>

       SQUEUE_STATES       -t <state_list>, --states=<state_list>

       SQUEUE_USERS        -u <user_list>, --user=<user_list>
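
       The precedence rule stated above (command line options override
       environment variables) can be sketched in Python as follows. The
       mapping mirrors the list in this section; the function
       effective_value and the idea of passing already parsed options as a
       dictionary are assumptions made for illustration only.

```python
import os
from typing import Mapping, Optional

# Long option -> environment variable, following the list above.
ENV_FOR_OPTION = {
    "--account": "SQUEUE_ACCOUNT",
    "--all": "SQUEUE_ALL",
    "--format": "SQUEUE_FORMAT",
    "--partition": "SQUEUE_PARTITION",
    "--qos": "SQUEUE_QOS",
    "--sort": "SQUEUE_SORT",
    "--states": "SQUEUE_STATES",
    "--user": "SQUEUE_USERS",
}


def effective_value(option: str,
                    cli_options: Mapping[str, str],
                    environ: Mapping[str, str] = os.environ) -> Optional[str]:
    """Return the value an option would take: command line first, then env."""
    if option in cli_options:
        return cli_options[option]          # command line always wins
    env_name = ENV_FOR_OPTION.get(option)
    if env_name is None:
        return None                         # no environment fallback exists
    return environ.get(env_name)


env = {"SQUEUE_PARTITION": "debug", "SQUEUE_SORT": "P,i"}
print(effective_value("--partition", {"--partition": "batch"}, env))
print(effective_value("--sort", {}, env))
```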

EXAMPLES

       Print the jobs scheduled in the debug partition and  in  the  COMPLETED
       state  in  the  format  with  six right justified digits for the job id
       followed by the priority with an arbitrary field size:
       # squeue -p debug -t COMPLETED -o "%.6i %p"
        JOBID PRIORITY
        65543 99993
        65544 99992
        65545 99991

       Print the job steps in the debug partition sorted by user:
       # squeue -s -p debug -S u
         STEPID        NAME PARTITION     USER      TIME NODELIST
        65552.1       test1     debug    alice      0:23 dev[1-4]
        65562.2     big_run     debug      bob      0:18 dev22
        65550.1      param1     debug  candice   1:43:21 dev[6-12]

       Print information only about jobs 12345, 12346, and 12348:
       # squeue --jobs 12345,12346,12348
        JOBID PARTITION NAME USER ST  TIME  NODES NODELIST(REASON)
        12345     debug job1 dave  R   0:21     4 dev[9-12]
        12346     debug job2 dave PD   0:00     8 (Resources)
        12348     debug job3 ed   PD   0:00     4 (Priority)
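
       Output such as the listing above is often post-processed by scripts.
       The following Python sketch parses that sample text by splitting on
       whitespace; parse_table and pending_reasons are hypothetical helper
       names, and the approach assumes no field other than the last one
       contains spaces, which may not hold for arbitrary job names.

```python
from typing import Dict, List

# Sample output copied from the example above.
SAMPLE = """\
 JOBID PARTITION NAME USER ST  TIME  NODES NODELIST(REASON)
 12345     debug job1 dave  R   0:21     4 dev[9-12]
 12346     debug job2 dave PD   0:00     8 (Resources)
 12348     debug job3 ed   PD   0:00     4 (Priority)
"""


def parse_table(text: str) -> List[Dict[str, str]]:
    """Turn whitespace separated squeue output into a list of records."""
    lines = [line for line in text.splitlines() if line.strip()]
    header = lines[0].split()
    rows = []
    for line in lines[1:]:
        # Limit the split so that the last column keeps any embedded spaces.
        values = line.strip().split(None, len(header) - 1)
        rows.append(dict(zip(header, values)))
    return rows


def pending_reasons(rows: List[Dict[str, str]]) -> Dict[str, str]:
    """Map the id of each pending job to its reason, without parentheses."""
    return {
        row["JOBID"]: row["NODELIST(REASON)"].strip("()")
        for row in rows
        if row["ST"] == "PD"
    }


rows = parse_table(SAMPLE)
print(rows[0]["NODELIST(REASON)"])
print(pending_reasons(rows))
```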

       Print information only about job step 65552.1:
       # squeue --steps 65552.1
         STEPID     NAME PARTITION    USER    TIME  NODELIST
        65552.1    test2     debug   alice   12:49  dev[1-4]
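
       The TIME column in these examples uses the
       days-hours:minutes:seconds form described for %M, with days and
       hours printed only as needed. A minimal Python sketch for converting
       such values to seconds is shown below; time_to_seconds is a
       hypothetical helper, and returning None for "NOT_SET", "UNLIMITED"
       and "INVALID" is a design choice of this sketch.

```python
from typing import Optional


def time_to_seconds(text: str) -> Optional[int]:
    """Convert a [days-][hours:]minutes:seconds value to seconds."""
    text = text.strip()
    if text in {"NOT_SET", "UNLIMITED", "INVALID"}:
        return None                       # no numeric value to report
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = [int(piece) for piece in text.split(":")]
    if len(parts) > 3:
        raise ValueError(f"unexpected time value: {text}")
    while len(parts) < 3:
        parts.insert(0, 0)                # missing leading units count as zero
    hours, minutes, seconds = parts
    return ((days * 24 + hours) * 60 + minutes) * 60 + seconds


for sample in ("0:23", "12:49", "1:43:21", "UNLIMITED"):
    print(sample, time_to_seconds(sample))
```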

COPYING

       Copyright (C) 2002-2007 The Regents of the  University  of  California.
       Copyright (C) 2008-2009 Lawrence Livermore National Security.  Produced
       at   Lawrence   Livermore   National   Laboratory   (cf,   DISCLAIMER).
       CODE-OCEC-09-009. All rights reserved.

       This  file  is  part  of  SLURM,  a  resource  management program.  For
       details, see <https://computing.llnl.gov/linux/slurm/>.

       SLURM is free software; you can redistribute it and/or modify it  under
       the  terms  of  the GNU General Public License as published by the Free
       Software Foundation; either version 2  of  the  License,  or  (at  your
       option) any later version.

       SLURM  is  distributed  in the hope that it will be useful, but WITHOUT
       ANY WARRANTY; without even the implied warranty of  MERCHANTABILITY  or
       FITNESS  FOR  A PARTICULAR PURPOSE.  See the GNU General Public License
       for more details.
