NAME

       sacct  -  displays  accounting  data  for all jobs and job steps in the
       SLURM job accounting log or SLURM database

SYNOPSIS

       sacct [OPTIONS...]

DESCRIPTION

       Accounting information for jobs invoked with SLURM is either logged in
       the job accounting log file or saved to the SLURM database.

       The  sacct  command  displays  job  accounting  data  stored in the job
       accounting log file or SLURM database in a variety of  forms  for  your
       analysis.   The  sacct command displays information on jobs, job steps,
       status, and exitcodes by default.  You can tailor the output  with  the
       use of the --fields= option to specify the fields to be shown.

       For  the  root user, the sacct command displays job accounting data for
       all users, although there are options to filter the  output  to  report
       only the jobs from a specified user or group.

       For  the  non-root  user,  the  sacct command limits the display of job
       accounting data  to  jobs  that  were  launched  with  their  own  user
       identifier  (UID)  by  default.   Data for other users can be displayed
       with the --allusers, --user, or --uid options.

       Note:     Much of the data reported by sacct has been generated by  the
                 wait3() and getrusage() system calls. Some systems gather and
                 report incomplete information for these calls; sacct  reports
                 values of 0 for this missing data.  See your system’s
                 getrusage(3) man page for information about which data are
                 actually available on your system.

                 If --dump is specified, the field selection options (--brief,
                 --format, ...) have no effect.

                 If --dump is specified, elapsed time fields are presented as
                 2 fields, integral seconds and integral microseconds.

                 If --dump is not specified, elapsed time fields are presented
                 as [[days-]hours:]minutes:seconds.hundredths.

                 The  default  input  file  is   the   file   named   in   the
                 jobacct_logfile parameter in slurm.conf.

OPTIONS

       -a , --allusers
                 Displays the current user’s jobs.  Displays all users’ jobs
                 when run by root.

       -A account_list , --accounts=account_list
                 Displays jobs that ran under the accounts given in
                 account_list, a comma-separated list of accounts.

       -b , --brief
                 Displays a brief listing, which includes the following data:

                 ·  jobid

                 ·  status

                 ·  exitcode

                 This option has no effect when the --dump option is also
                 specified.

       -C cluster_list,  --cluster=cluster_list
                 Displays the statistics only for  the  jobs  started  on  the
                 clusters  specified  by  the cluster_list operand, which is a
                 comma-separated list of clusters.  Space characters are not
                 allowed in the cluster_list.  Use -1 for all clusters.  The
                 default is the cluster on which the sacct command is run.

       -c , --completion
                 Use job completion instead of job accounting.

       -d , --dump
                 Dumps the raw data records.

                 The section titled "INTERPRETING THE  --dump  OPTION  OUTPUT"
                 describes the data output when this option is used.

       --duplicates
                 If  SLURM  job ids are reset, but the job accounting log file
                 isn’t reset at the same time (with -e, for example), some job
                 numbers will probably appear more than once in the accounting
                 log file to  refer  to  different  jobs;  such  jobs  can  be
                 distinguished by the "submit" time stamp in the data records.

                 When data for specific jobs are  requested  with  the  --jobs
                 option,  we  assume  that the user wants to see only the most
                 recent job with that number. This behavior can be  overridden
                 by  specifying  --duplicates,  in which case all records that
                 match the selection criteria will be returned.
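
                 The default selection rule described above can be sketched
                 in Python.  This is only an illustration, not sacct's
                 implementation: the record layout (dictionaries with
                 "jobid" and "submit" keys) and the function name are
                 assumptions made for the example.

```python
def latest_per_jobid(records):
    """Keep only the most recently submitted record for each job ID."""
    latest = {}
    for rec in records:
        seen = latest.get(rec["jobid"])
        if seen is None or rec["submit"] > seen["submit"]:
            latest[rec["jobid"]] = rec
    # Return in ascending job ID order so the result is stable.
    return [latest[jobid] for jobid in sorted(latest)]


# Hypothetical records: job 7 appears twice because job IDs were reset.
records = [
    {"jobid": 7, "submit": 1200000000, "jobname": "old_run"},
    {"jobid": 8, "submit": 1200000500, "jobname": "other"},
    {"jobid": 7, "submit": 1300000000, "jobname": "new_run"},
]
```

                 Here latest_per_jobid(records) keeps "new_run" for job 7,
                 while returning the full list unchanged corresponds to
                 specifying --duplicates.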

       -e , --helpformat

                 Print a list  of  fields  that  can  be  specified  with  the
                 --format option.

                 Fields available:

                 AllocCPUS     Account       AssocID       AveCPU
                 AvePages      AveRSS        AveVMSize     BlockID
                 Cluster       CPUTime       CPUTimeRAW    Elapsed
                 Eligible      End           ExitCode      GID
                 Group         JobID         JobName       Layout
                 MaxPages      MaxPagesNode  MaxPagesTask  MaxRSS
                 MaxRSSNode    MaxRSSTask    MaxVMSize     MaxVMSizeNode
                 MaxVMSizeTask MinCPU        MinCPUNode    MinCPUTask
                 NCPUS         NNodes        NodeList      NTasks
                 Priority      Partition     QOS           QOSRAW
                 ReqCPUS       Reserved      ResvCPU       ResvCPURAW
                 Start         State         Submit        Suspended
                 SystemCPU     Timelimit     TotalCPU      UID
                 User          UserCPU       WCKey         WCKeyID

                 The  section  titled  "Job Accounting Fields" describes these
                 fields.

       -E end_time, --endtime=end_time

                 Select jobs eligible before the specified time.  If states
                 are given with the -s option, return jobs in those states
                 before this time.

                 Valid time formats are:
                 HH:MM[:SS] [AM|PM]
                 MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
                 MM/DD[/YY]-HH:MM[:SS]
                 YYYY-MM-DD[THH:MM[:SS]]

       -f file,  --file=file
                 Causes the sacct command to read job accounting data from the
                 named file instead of the current SLURM  job  accounting  log
                 file. Only applicable when running the filetxt plugin.

       -g gid_list,  --gid=gid_list --group=group_list
                 Displays  the  statistics  only for the jobs started with the
                 GID or the GROUP specified by the gid_list or the group_list
                 operand, which is a comma-separated list.  Space characters
                 are not allowed.  Default is no restrictions.

       -h , --help
                 Displays a general help message.

       -j job(.step) ,  --jobs=job(.step)
                 Displays information about the specified job(.step) or
                 list of job(.step)s.

                 The  job(.step) parameter is a comma-separated list of
                 jobs.  Space characters  are  not  permitted  in  this
                 list.

                 The default is to display information on all jobs.

       -l, --long
                 Equivalent to specifying:

                 ´--fields=jobid,jobname,partition,maxvsize,maxvsizenode,maxvsizetask,avevsize,maxrss,maxrssnode,maxrsstask,averss,maxpages,maxpagesnode,maxpagestask,avepages,mincpu,mincpunode,mincputask,avecpu,ntasks,alloccpus,elapsed,state,exitcode´

       -L, --allclusters
                 Display jobs that ran on all clusters.  By default,
                 only jobs that ran on the cluster from which sacct is
                 called are displayed.

       -n, --noheader
                 No heading will be added to the  output.  The  default
                 action is to display a header.

                 This  option  has  no effect when used with the --dump
                 option.

       -N, --nodelist
                 Display jobs that ran on any of the given nodes.  One
                 or more nodes can be given using a ranged string.

       -o , --format
                 Comma-separated list of fields (use --helpformat for a
                 list of available fields).

                 NOTE: When using the format option for listing various
                 fields you can put a %NUMBER afterwards to specify how
                 many characters should be printed.

                 For example, format=name%30 will print 30 characters of
                 field name right justified, and format=name%-30 will
                 print 30 characters left justified.
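
                 A small Python helper can assemble such a field list.
                 This is a sketch only: the helper name and its
                 (name, width) input are assumptions, and it reads the
                 left-justified form in the text above as a negative
                 number after the % sign.

```python
def build_format(fields):
    """Build a --format argument from (name, width) pairs.

    width None -> no %NUMBER suffix
    width > 0  -> right justified, for example jobname%30
    width < 0  -> left justified, for example jobname%-30
    """
    parts = []
    for name, width in fields:
        parts.append(name if width is None else f"{name}%{width}")
    return "--format=" + ",".join(parts)


spec = build_format([("jobid", None), ("jobname", 30), ("state", -12)])
print(spec)  # --format=jobid,jobname%30,state%-12
```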

       -O , --formatted_dump
                 Dumps accounting records in an easy-to-read format.

                 This option is provided for debugging.

       -p , --parsable
                 Output will be ’|’ delimited, with a ’|’ at the end of
                 each line.

       -P , --parsable2
                 Output will be ’|’ delimited, without a ’|’ at the end
                 of each line.
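
                 Delimited output of this kind is straightforward to read
                 from a script.  The Python sketch below assumes that a
                 header line is present (no --noheader) and that field
                 values contain no ’|’ characters; the sample text is
                 made up for illustration and is not real sacct output.

```python
def parse_parsable(text, trailing_delimiter=False):
    """Parse '|'-delimited output into a list of dictionaries.

    Set trailing_delimiter=True for --parsable output, where every
    line ends with an extra '|'.
    """
    rows = []
    header = None
    for line in text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        if trailing_delimiter and line.endswith("|"):
            line = line[:-1]
        cells = line.split("|")
        if header is None:
            header = cells  # first non-blank line names the columns
        else:
            rows.append(dict(zip(header, cells)))
    return rows


# Hypothetical samples in the --parsable2 and --parsable layouts.
sample_p2 = "JobID|State|ExitCode\n3|COMPLETED|0\n3.0|COMPLETED|0\n"
sample_p = "JobID|State|ExitCode|\n3|COMPLETED|0|\n"
```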

       -r , --partition

                 Comma separated list of partitions to select jobs  and
                 job steps from. The default is all partitions.

       -s state_list ,  --state=state_list
                 Selects jobs based on their current state or the state
                 they were in during the time period given,  which  can
                 be designated with the following state designators:

                 r         running

                 s         suspended

                 ca        cancelled

                 cd        completed

                 pd        pending

                 f         failed

                 to        timed out

                 nf        node_fail

                 The state_list operand is a comma-separated list of
                 these state designators.  Space characters are not
                 allowed in the state_list.

                 NOTE: When states are specified and no start time is
                 given, the default start time is ’now’.
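
                 The designator table above can be captured in a lookup
                 used to validate a state_list before calling sacct.
                 This is a Python sketch; the dictionary and function
                 names are assumptions for the example.

```python
# Designators and meanings as listed in the table above.
STATE_DESIGNATORS = {
    "r": "running",
    "s": "suspended",
    "ca": "cancelled",
    "cd": "completed",
    "pd": "pending",
    "f": "failed",
    "to": "timed out",
    "nf": "node_fail",
}


def build_state_list(designators):
    """Return a comma-separated state_list, rejecting unknown designators."""
    unknown = [d for d in designators if d not in STATE_DESIGNATORS]
    if unknown:
        raise ValueError("unknown state designator(s): " + ", ".join(unknown))
    # No spaces: the state_list must not contain space characters.
    return ",".join(designators)


state_arg = "--state=" + build_state_list(["cd", "f", "to"])
print(state_arg)  # --state=cd,f,to
```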

       -S , --starttime
                 Select jobs eligible after the specified time.  The
                 default is midnight of the current day.  If states are
                 given with the -s option, return jobs in those states
                 at this time; in that case the default time is ’now’.

                 Valid time formats are:
                 HH:MM[:SS] [AM|PM]
                 MMDD[YY] or MM/DD[/YY] or MM.DD[.YY]
                 MM/DD[/YY]-HH:MM[:SS]
                 YYYY-MM-DD[THH:MM[:SS]]
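
                 When a script generates the time arguments, the
                 YYYY-MM-DD[THH:MM[:SS]] form is the easiest to produce
                 unambiguously.  A Python sketch follows; the function
                 names and the example dates are assumptions.

```python
from datetime import datetime


def sacct_time(moment):
    """Format a datetime in the YYYY-MM-DDTHH:MM:SS form."""
    return moment.strftime("%Y-%m-%dT%H:%M:%S")


def time_window_args(start, end):
    """Return --starttime and --endtime arguments for a time window."""
    return ["--starttime=" + sacct_time(start), "--endtime=" + sacct_time(end)]


args = time_window_args(datetime(2010, 3, 1, 8, 0, 0),
                        datetime(2010, 3, 1, 17, 30, 0))
print(args)
```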

       -T , --truncate
                 Truncate time.  If a job started before --starttime,
                 its start time is truncated to --starttime.  Likewise,
                 an end time after --endtime is truncated to --endtime.
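
                 The truncation amounts to clamping the job's interval
                 to the requested window.  A minimal Python sketch,
                 assuming all times are given as seconds since the Epoch
                 (the function name is made up for the example):

```python
def truncate_interval(job_start, job_end, window_start, window_end):
    """Clamp a job's start and end times to the reporting window."""
    return max(job_start, window_start), min(job_end, window_end)


# A job running from 100 to 500, reported for the window 200 to 400.
print(truncate_interval(100, 500, 200, 400))  # (200, 400)
```

                 A job that lies entirely inside the window is returned
                 unchanged.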

       -u uid_list,  --uid=uid_list --user=user_list
                 Use this comma separated list of uids or user names to
                 select  jobs  to  display.   By  default,  the running
                 user’s uid is used.

       --usage   Displays a help message.

       -v , --verbose
                 Primarily for debugging.  Reports the state of certain
                 variables during processing.

       -V , --version
                 Print version.

       -W wckey_list,  --wckeys=wckey_list
                 Displays  the  statistics only for the jobs started on
                 the wckeys specified by the wckey_list operand,  which
                 is  a  comma-separated  list  of  wckey  names.  Space
                 characters are not allowed in the wckey_list.  Default
                 is all wckeys.

       -x assoc_list, --associations=assoc_list
                 Displays  the  statistics  only  for  the jobs running
                 under the association ids specified by the  assoc_list
                 operand,   which   is   a   comma-separated   list  of
                 association ids.  Space characters are not allowed  in
                 the assoc_list. Default is all associations.

       -X , --allocations
                 Only  show cumulative statistics for each job, not the
                 intermediate steps.

   Job Accounting Fields
       The following describes each job accounting field:

              alloccpus Count of allocated processors.

              account   Account the job ran under.

              associd   Reference to the association of  user,  account
                        and cluster.

              avecpu    Average CPU time of a process.

              avepages  Average pages of a process.

              averss    Average resident set size of a process.

              avevsize  Average Virtual Memory size of a process.

              blockid   Block  ID,  applicable  to  BlueGene  computers
                        only.

              cluster   Cluster name.

              cputime   Formatted number of cpu seconds a  process  was
                        allocated.

              cputimeraw
                        CPU time a process was allocated, in seconds,
                        without the formatting applied to cputime.

              elapsed   The job’s elapsed time.

                        The format of this field’s output is as follows:
                        [DD-[hh:]]mm:ss

                        as defined by the following:

                        DD        days

                        hh        hours

                        mm        minutes

                        ss        seconds
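
                        A script that needs elapsed times as numbers
                        can convert this format to seconds.  The Python
                        sketch below is an illustration with an assumed
                        function name; it also accepts a fractional
                        seconds part, as in the
                        [[days-]hours:]minutes:seconds.hundredths form
                        mentioned in the DESCRIPTION.

```python
def elapsed_to_seconds(text):
    """Convert [DD-[hh:]]mm:ss (optionally with .hundredths) to seconds."""
    days = 0
    if "-" in text:
        day_part, text = text.split("-", 1)
        days = int(day_part)
    parts = text.split(":")
    if len(parts) == 3:
        hours, minutes, seconds = parts
    elif len(parts) == 2:
        hours = "0"
        minutes, seconds = parts
    else:
        raise ValueError("unrecognized elapsed time: " + text)
    return days * 86400 + int(hours) * 3600 + int(minutes) * 60 + float(seconds)


print(elapsed_to_seconds("00:01:30"))    # 90.0
print(elapsed_to_seconds("1-02:03:04"))  # 93784.0
```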

              eligible  When the job became eligible to run.

              end       Termination  time  of the job. Format output is
                        as follows:
                        MM/DD-hh:mm:ss

                        as defined by the following:

                        MM        month

                        DD        day

                        hh        hours

                        mm        minutes

                        ss        seconds

              exitcode  The first non-zero error code returned  by  any
                        job step.

              gid       The  group  identifier  of the user who ran the
                        job.

              group     The group name of the user who ran the job.

              jobid     The number of the job or job step.   It  is  in
                        the form: job.jobstep.
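
                        Splitting this form into its two parts is a
                        one-line operation.  A Python sketch, with an
                        assumed function name, that returns None for
                        the step when the value names a whole job:

```python
def split_jobid(jobid):
    """Split 'job.jobstep' into (job, step); step is None for a bare job."""
    job, _, step = jobid.partition(".")
    return int(job), (int(step) if step else None)


print(split_jobid("4.0"))  # (4, 0)
print(split_jobid("4"))    # (4, None)
```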

              jobname   The name of the job or job step.  Because the
                        slurm_accounting.log file is space delimited,
                        any space in a jobname is replaced with an
                        underscore before the record is written to the
                        accounting file.  sacct therefore displays such
                        a jobname with underscores in place of the
                        spaces.

              layout    What  the  layout  of  a  step  was when it was
                        running.  This can be used to give you an  idea
                        of which node ran which rank in your job.

              maxpages  Maximum page faults of a process.

              maxpagesnode
                        The node where the maxpages occurred.

              maxpagestask
                        The task on maxpagesnode where the maxpages
                        occurred.

              maxrss    Maximum resident set size of a process.

              maxrssnode
                        The node where the maxrss occurred.

              maxrsstask
                        The task on maxrssnode where the maxrss
                        occurred.

              maxvmsize Maximum  Virtual  Memory size of any process.

              maxvmsizenode
                        The node where the maxvmsize occurred.

              maxvmsizetask
                        The task on maxvmsizenode where the maxvmsize
                        occurred.

              mincpu    Minimum cpu of any process.

              mincpunode
                        The node where the mincpu occurred.

              mincputask
                        The task on mincpunode where the mincpu
                        occurred.

              ncpus     Total number of CPUs allocated to the job.

              nodelist  List of nodes in job/step.

              nnodes    Number of nodes in a job or step.

              ntasks    Total number of tasks in a job or step.

              priority  Slurm priority.

              partition Identifies  the partition on which the job ran.

              qos       Name of Quality of Service.

              qosraw    Id of Quality of Service.

              reqcpus   Required CPUs.

              reserved  How much wall clock time was used  as  reserved
                        time  for  this  job.  This is derived from how
                        long a job was waiting from  eligible  time  to
                        when it actually started.

              resvcpu   Formatted  time  for  how long (cpu secs) a job
                        was reserved for.

              resvcpuraw
                        Reserved CPU time in seconds, not formatted.

              start     Initiation  time  of the job in the same format
                        as end.

              state     Displays the job status, or state.

                        Output can be  RUNNING,  SUSPENDED,  COMPLETED,
                        CANCELLED, FAILED, TIMEOUT, or NODE_FAIL.

              submit    The time and date stamp (in Coordinated Universal
                        Time, UTC) the job was submitted.  The
                        format  of  the  output is identical to that of
                        the end field.

              suspended How long the job was suspended for.

              SystemCPU The amount of system CPU time used by  the  job
                        or  job  step.   The  format  of  the output is
                        identical to that of the elapsed field.

                        NOTE:  SystemCPU  provides  a  measure  of  the
                        task’s  parent process and does not include CPU
                        time of child processes.

              timelimit What the timelimit was/is for the job.

              TotalCPU  The sum of the SystemCPU and UserCPU time  used
                        by  the job or job step.  The total CPU time of
                        the job may exceed the job’s elapsed  time  for
                        jobs  that  include  multiple  job  steps.  The
                        format of the output is identical  to  that  of
                        the elapsed field.

                        NOTE: TotalCPU provides a measure of the task’s
                        parent process and does not include CPU time of
                        child processes.

              uid       The  user  identifier  of  the user who ran the
                        job.

              user      The user name of the user who ran the job.

              UserCPU   The amount of user CPU time used by the job  or
                        job   step.    The  format  of  the  output  is
                        identical to that of the elapsed field.

                        NOTE: UserCPU provides a measure of the  task’s
                        parent process and does not include CPU time of
                        child processes.

              wckey     Workload   Characterization   Key.    Arbitrary
                        string   for   grouping   orthogonal   accounts
                        together.

              wckeyid   Reference to the wckey.

INTERPRETING THE --dump OPTION OUTPUT

       The sacct command’s --dump option displays data in a horizontal
       list  of  fields  depending  on the record type; there are three
       record types: JOB_START, JOB_STEP, and JOB_TERMINATED.  There is
       a subsection that describes the output for each record type.

       When  the data output is a job accounting field, as described in
       the section titled "Job Accounting Fields", only the name of the
       job   accounting   field   is   listed.   Otherwise,  additional
       information is provided.

       Note:     The output for the JOB_STEP and JOB_TERMINATED record
                 types presents a pair of fields for the following data:
                 Total CPU time, Total User CPU time, and Total  System
                 CPU time.  The first field of each pair is the time in
                 seconds expressed as an integer.  The second field  of
                 each   pair   is  the  fractional  number  of  seconds
                 multiplied by one million.  Thus,  a  pair  of  fields
                 output  as  "1 024315" means that the time is 1.024315
                 seconds.  The least significant digits in  the  second
                 field are truncated in formatted displays.
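
                 The arithmetic in this note can be written out in
                 Python.  The function name is an assumption; the input
                 is the pair quoted above.

```python
def pair_to_seconds(seconds_field, microseconds_field):
    """Combine an integer-seconds field and a microseconds field."""
    return int(seconds_field) + int(microseconds_field) / 1_000_000


# The pair quoted in the note: "1 024315".
first, second = "1 024315".split()
total = pair_to_seconds(first, second)
print(f"{total:.6f}")  # 1.024315
```

                 Converting the second field with int() discards its
                 leading zero, so "024315" contributes 24315
                 microseconds.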

   Output for the JOB_START Record Type
       The  following  describes  the  horizontal  fields output by the
       sacct --dump option for the JOB_START record type.

              Field #   Field

              1         job

              2         partition

              3         submitted

              4         The job’s start time; this value is the number
                        of  non-leap  seconds since the Epoch (00:00:00
                        UTC, January 1, 1970)

              5         uid.gid

              6         (Reserved)

              7         JOB_START (literal string)

              8         Job Record Version (1)

              9         The number of fields in the record (16)

              10        uid

              11        gid

              12        The job name

              13        Batch Flag (0=no batch)

              14        Relative SLURM priority

              15        ncpus

              16        nodes
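
       The field table above can be turned into a small record parser.
       The Python sketch below rests on several assumptions: that the
       fields are separated by single spaces, that no field is empty,
       and that the key names chosen here are acceptable labels for the
       fields.  The sample line, including the "-" placeholder in the
       reserved field, is invented for illustration and is not real
       sacct output.

```python
from datetime import datetime, timezone

# One key per field number in the JOB_START table (1 through 16).
JOB_START_FIELDS = [
    "job", "partition", "submitted", "start", "uid.gid", "reserved",
    "record_type", "version", "field_count", "uid", "gid", "jobname",
    "batch_flag", "priority", "ncpus", "nodes",
]


def parse_job_start(line):
    """Parse one space-delimited JOB_START record into a dictionary."""
    values = line.split()
    if len(values) != len(JOB_START_FIELDS):
        raise ValueError("expected %d fields, got %d"
                         % (len(JOB_START_FIELDS), len(values)))
    record = dict(zip(JOB_START_FIELDS, values))
    if record["record_type"] != "JOB_START":
        raise ValueError("not a JOB_START record")
    # Field 4 is the number of seconds since the Epoch.
    record["start"] = datetime.fromtimestamp(int(record["start"]),
                                             tz=timezone.utc)
    return record


# Hypothetical record, for illustration only.
sample = ("3 andy 1262304000 1262304060 500.500 - JOB_START 1 16 "
          "500 500 sja_init 0 100 1 node01")
record = parse_job_start(sample)
print(record["jobname"], record["start"].isoformat())
```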

   Output for the JOB_STEP Record Type
       The following describes the  horizontal  fields  output  by  the
       sacct --dump option for the JOB_STEP record type.

              Field #   Field

              1         job

              2         partition

              3         submitted

              4         The job’s start time; this value is the number
                        of non-leap seconds since the  Epoch  (00:00:00
                        UTC, January 1, 1970)

              5         uid.gid

              6         (Reserved)

              7         JOB_STEP (literal string)

              8         Job Record Version (1)

              9         The number of fields in the record (38)

              10        jobid

              11        end

              12        Completion  Status;  the  mnemonics,  which may
                        appear  in  uppercase  or  lowercase,  are   as
                        follows:

                        CA        Cancelled

                        CD        Completed successfully

                        F         Failed

                        NF        Job terminated from node failure

                        R         Running

                        S         Suspended

                        TO        Timed out

              13        exitcode

              14        ntasks

              15        ncpus

              16        elapsed time in seconds expressed as an integer

              17        Integer  portion  of  the  Total  CPU  time  in
                        seconds for all processes

              18        Fractional  portion  of  the Total CPU time for
                        all processes expressed in microseconds

              19        Integer portion of the Total User CPU  time  in
                        seconds for all processes

              20        Fractional  portion  of the Total User CPU time
                        for all processes expressed in microseconds

              21        Integer portion of the Total System CPU time in
                        seconds for all processes

              22        Fractional portion of the Total System CPU time
                        for all processes expressed in microseconds

              23        rss

              24        ixrss

              25        idrss

              26        isrss

              27        minflt

              28        majflt

              29        nswap

              30        inblocks

              31        outblocks

              32        msgsnd

              33        msgrcv

              34        nsignals

              35        nvcsw

              36        nivcsw

              37        vsize

   Output for the JOB_TERMINATED Record Type
       The following describes the horizontal fields output by the
       sacct --dump option for the JOB_TERMINATED (literal string)
       record type.

              Field #   Field

              1         job

              2         partition

              3         submitted

              4         The job’s start time; this value is the number
                        of  non-leap  seconds since the Epoch (00:00:00
                        UTC, January 1, 1970)

              5         uid.gid

              6         (Reserved)

              7         JOB_TERMINATED (literal string)

              8         Job Record Version (1)

              9         The number of fields in the record (38)

                        Although thirty-eight fields are  displayed  by
                        the   sacct   command  for  the  JOB_TERMINATED
                        record, only fields 1 through 12  are  recorded
                        in  the  actual  data  file;  the sacct command
                        aggregates the remainder.

              10        The total elapsed time in seconds for the  job.

              11        end

              12        Completion  Status;  the  mnemonics,  which may
                        appear  in  uppercase  or  lowercase,  are   as
                        follows:

                        CA        Cancelled

                        CD        Completed successfully

                        F         Failed

                        NF        Job terminated from node failure

                        R         Running

                        TO        Timed out

              13        exitcode

              14        ntasks

              15        ncpus

              16        elapsed time in seconds expressed as an integer

              17        Integer  portion  of  the  Total  CPU  time  in
                        seconds for all processes

              18        Fractional  portion  of  the Total CPU time for
                        all processes expressed in microseconds

              19        Integer portion of the Total User CPU  time  in
                        seconds for all processes

              20        Fractional  portion  of the Total User CPU time
                        for all processes expressed in microseconds

              21        Integer portion of the Total System CPU time in
                        seconds for all processes

              22        Fractional portion of the Total System CPU time
                        for all processes expressed in microseconds

              23        rss

              24        ixrss

              25        idrss

              26        isrss

              27        minflt

              28        majflt

              29        nswap

              30        inblocks

              31        outblocks

              32        msgsnd

              33        msgrcv

              34        nsignals

              35        nvcsw

              36        nivcsw

              37        vsize

EXAMPLES

       This example illustrates the default  invocation  of  the  sacct
       command:

              # sacct
              Jobid      Jobname    Partition    Account AllocCPUS State     ExitCode
              ---------- ---------- ---------- ---------- ---------- ---------- --------
              2          script01   srun       acct1               1 RUNNING           0
              3          script02   srun       acct1               1 RUNNING           0
              4          endscript  srun       acct1               1 RUNNING           0
              4.0                   srun       acct1               1 COMPLETED         0

       This  example shows the same job accounting information with the
       brief option.

              # sacct --brief
              Jobid      Status     Exitcode
              ---------- ---------- --------
              2          RUNNING           0
              3          RUNNING           0
              4          RUNNING           0
              4.0        COMPLETED         0

       This example shows cumulative statistics for each job, using the
       allocations option.

              # sacct --allocations
              Jobid      Jobname    Partition Account    AllocCPUS  State     Exitcode
              ---------- ---------- ---------- ---------- ------- ---------- --------
              3          sja_init   andy       acct1            1 COMPLETED         0
              4          sjaload    andy       acct1            2 COMPLETED         0
              5          sja_scr1   andy       acct1            1 COMPLETED         0
              6          sja_scr2   andy       acct1           18 COMPLETED         2
              7          sja_scr3   andy       acct1           18 COMPLETED         0
              8          sja_scr5   andy       acct1            2 COMPLETED         0
              9          sja_scr7   andy       acct1           90 COMPLETED         1
              10         endscript  andy       acct1          186 COMPLETED         0

       This example demonstrates the ability to customize the output of
       the  sacct  command.   The  fields  are  displayed  in the order
       designated on the command line.

              # sacct --fields=jobid,elapsed,ncpus,ntasks,status
              Jobid     Elapsed    Ncpus     Ntasks   Status
              ---------- ---------- ---------- -------- ----------
              3            00:01:30          2        1 COMPLETED
              3.0          00:01:30          2        1 COMPLETED
              4            00:00:00          2        2 COMPLETED
              4.0          00:00:01          2        2 COMPLETED
              5            00:01:23          2        1 COMPLETED
              5.0          00:01:31          2        1 COMPLETED

COPYING

       Copyright (C) 2005-2007 Hewlett-Packard Development Company L.P.

       Copyright  (C)  2008-2009  Lawrence Livermore National Security.
       Produced  at  Lawrence  Livermore   National   Laboratory   (cf,
       DISCLAIMER). CODE-OCEC-09-009. All rights reserved.

       This  file is part of SLURM, a resource management program.  For
       details, see <https://computing.llnl.gov/linux/slurm/>.

       SLURM is free software; you can redistribute it and/or modify it
       under  the  terms of the GNU General Public License as published
       by the  Free  Software  Foundation;  either  version  2  of  the
       License, or (at your option) any later version.

       SLURM  is  distributed  in  the hope that it will be useful, but
       WITHOUT ANY WARRANTY;  without  even  the  implied  warranty  of
       MERCHANTABILITY  or  FITNESS  FOR A PARTICULAR PURPOSE.  See the
       GNU General Public License for more details.

FILES

       /etc/slurm.conf
                 Entries  to  this  file  enable  job  accounting   and
                 designate  the  job  accounting log file that collects
                 system job accounting.

       /var/log/slurm_accounting.log
                 The default job accounting log file.  By default, this
                 file  is  set  to  read  and write permission for root
                 only.

SEE ALSO

       sstat(1), ps(1), srun(1), squeue(1), getrusage(2), time(2)