NAME

       SPANK - SLURM Plug-in Architecture for Node and job (K)control

DESCRIPTION

       This manual briefly describes the capabilities of the SLURM Plug-in
       Architecture for Node and job Kontrol (SPANK), as well as the SPANK
       configuration file (by default, plugstack.conf).

       SPANK  provides  a  very generic interface for stackable plug-ins which
       may be used to dynamically modify the job launch code in  SLURM.  SPANK
       plugins  may  be  built  without access to SLURM source code. They need
       only be compiled against SLURM’s spank.h  header  file,  added  to  the
       SPANK  config  file  plugstack.conf, and they will be loaded at runtime
       during the next job launch. Thus, the SPANK infrastructure gives
       administrators and other developers a low-cost, low-effort way to
       dynamically modify the runtime behavior of SLURM job launch.

SPANK PLUGINS

       SPANK plugins are loaded in up to  three  separate  contexts  during  a
       SLURM job. Briefly, the three contexts are:

       local   In local context, the plugin is loaded by srun (i.e. the
               "local" part of a parallel job).

       remote  In remote context, the plugin is loaded by slurmd (i.e. the
               "remote" part of a parallel job).

       allocator
               In  allocator  context,  the plugin is loaded in one of the job
               allocation utilities sbatch or salloc.

       In local context, only the init, exit, init_post_opt, and
       local_user_init functions are called. In allocator context, only the
       init, exit, and init_post_opt functions are called. Plugins may query
       the context in which they are running with the spank_context and
       spank_remote functions defined in <slurm/spank.h>.
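
       As a quick reference, the rules above can be restated as a lookup
       table. The following is only a sketch: the hook_table array and the
       hook_runs_in function are hypothetical names that are not part of
       spank.h, and the "remote" row is inferred from the "(remote context
       only)" notes in the callback descriptions. The table makes no claim
       about the order in which callbacks run.

```c
#include <string.h>

/* Hypothetical lookup table (not part of spank.h) restating which
 * slurm_spank_* callbacks this manual says run in each context. */
struct ctx_hooks {
    const char *context;
    const char *hooks[9];   /* NULL-terminated list of callback names */
};

static const struct ctx_hooks hook_table[] = {
    { "local",     { "slurm_spank_init", "slurm_spank_init_post_opt",
                     "slurm_spank_local_user_init", "slurm_spank_exit",
                     NULL } },
    { "allocator", { "slurm_spank_init", "slurm_spank_init_post_opt",
                     "slurm_spank_exit", NULL } },
    { "remote",    { "slurm_spank_init", "slurm_spank_init_post_opt",
                     "slurm_spank_user_init",
                     "slurm_spank_task_init_privileged",
                     "slurm_spank_task_init",
                     "slurm_spank_task_post_fork",
                     "slurm_spank_task_exit", "slurm_spank_exit",
                     NULL } },
};

/* Return 1 if the named callback runs in the named context, else 0. */
static int hook_runs_in(const char *context, const char *hook)
{
    size_t i, j;

    for (i = 0; i < sizeof hook_table / sizeof hook_table[0]; i++) {
        if (strcmp(hook_table[i].context, context) != 0)
            continue;
        for (j = 0; hook_table[i].hooks[j] != NULL; j++)
            if (strcmp(hook_table[i].hooks[j], hook) == 0)
                return 1;
    }
    return 0;
}
```

       For instance, hook_runs_in("allocator", "slurm_spank_task_init")
       yields 0, which is why a plugin that only acts on tasks has nothing
       to do under sbatch or salloc.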

       SPANK plugins may be called from multiple points during the  SLURM  job
       launch. A plugin may define the following functions:

       slurm_spank_init
         Called just after plugins are loaded. In remote context, this is just
         after job step is initialized. This function  is  called  before  any
         plugin option processing.

       slurm_spank_init_post_opt
         Called  at  the  same  point  as slurm_spank_init, but after all user
         options to the plugin have been processed. The reason that  the  init
         and  init_post_opt  callbacks  are  separated  is so that plugins can
         process system-wide options specified in plugstack.conf in  the  init
         callback,  then process user options, and finally take some action in
         slurm_spank_init_post_opt if necessary.

       slurm_spank_local_user_init
         Called in local (srun) context  only  after  all  options  have  been
         processed.   This  is  called  after  the  job  ID  and  step IDs are
         available.  This happens in srun after the allocation  is  made,  but
         before tasks are launched.

       slurm_spank_user_init
         Called  after  privileges  are  temporarily  dropped. (remote context
         only)

       slurm_spank_task_init_privileged
         Called for each  task  just  after  fork,  but  before  all  elevated
         privileges are dropped. (remote context only)

       slurm_spank_task_init
         Called for each task just before execve(2). (remote context only)

       slurm_spank_task_post_fork
         Called  for  each task from parent process after fork(2) is complete.
         Due to the fact that slurmd does not exec any tasks until  all  tasks
         have  completed  fork(2),  this  call is guaranteed to run before the
         user task is executed. (remote context only)

       slurm_spank_task_exit
         Called for each task as  its  exit  status  is  collected  by  SLURM.
         (remote context only)

       slurm_spank_exit
         Called once just before slurmstepd exits in remote context.  In local
         context, called before srun exits.

       All of these functions have the same prototype, for example:

          int slurm_spank_init (spank_t spank, int ac, char *argv[])

       Where spank is the SPANK handle which must be passed back to SLURM when
       the  plugin  calls  functions  like  spank_get_item  and  spank_getenv.
       Configured arguments  (See  CONFIGURATION  below)  are  passed  in  the
       argument vector argv with argument count ac.

       SPANK  plugins  can  query  the  current  list of supported slurm_spank
       symbols to determine if the current version  supports  a  given  plugin
       hook.   This  may be useful because the list of plugin symbols may grow
       in the future. The  query  is  done  using  the  spank_symbol_supported
       function, which has the following prototype:

           int spank_symbol_supported (const char *sym);

       The return value is 1 if the symbol is supported, 0 if not.

       SPANK  plugins  do  not  have direct access to internally defined SLURM
       data structures. Instead, information about the currently executing job
       is obtained via the spank_get_item function call.

         spank_err_t spank_get_item (spank_t spank, spank_item_t item, ...);

       The spank_get_item call must be passed the current SPANK handle as well
       as the item requested, which is defined by the passed  spank_item_t.  A
       variable  number  of  pointer  arguments  are also passed, depending on
       which item was requested by the plugin. A list of the valid values  for
       item is kept in the spank.h header file. Some examples are:

       S_JOB_UID
         User id for running job. (uid_t *) is third arg of spank_get_item.

       S_JOB_STEPID
         Job   step  id  for  running  job.  (uint32_t  *)  is  third  arg  of
         spank_get_item.

       S_TASK_EXIT_STATUS
         Exit status for exited task. Only valid  from  slurm_spank_task_exit.
         (int *) is third arg of spank_get_item.

       S_JOB_ARGV
         Complete  job  command  line. Third and fourth args to spank_get_item
         are (int *, char ***).

       See spank.h for more details, and EXAMPLES  below  for  an  example  of
       spank_get_item usage.

       SPANK   plugins  may  also  use  the  spank_getenv,  spank_setenv,  and
       spank_unsetenv functions to view  and  modify  the  job’s  environment.
       spank_getenv   searches  the  job’s  environment  for  the  environment
       variable var and copies the current value into a buffer buf  of  length
       len.  spank_setenv allows a SPANK plugin to set or overwrite a variable
       in the job’s environment,  and  spank_unsetenv  unsets  an  environment
       variable in the job’s environment. The prototypes are:

        spank_err_t spank_getenv (spank_t spank, const char *var,
                            char *buf, int len);
        spank_err_t spank_setenv (spank_t spank, const char *var,
                            const char *val, int overwrite);
        spank_err_t spank_unsetenv (spank_t spank, const char *var);

       These functions are only necessary in remote context, since in local
       context the standard process environment may be read and modified
       directly with getenv(3), setenv(3), and unsetenv(3).
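
       The copy-into-buffer style of spank_getenv can be sketched with a
       stand-alone lookup over a "NAME=value" array. This is an
       illustration under assumptions, not the SPANK implementation:
       env_lookup and the LOOKUP_* codes are hypothetical, and the choice
       to reject a value that does not fit (rather than truncate it) is
       made here only for the sketch.

```c
#include <string.h>

/* Hypothetical result codes for this sketch; the real spank_err_t
 * values are defined in <slurm/spank.h>. */
enum { LOOKUP_OK = 0, LOOKUP_NOT_FOUND = 1, LOOKUP_NO_SPACE = 2 };

/* Search a NULL-terminated "NAME=value" array for var and copy its
 * value into buf, which holds len bytes including the terminator. */
static int env_lookup(char *const env[], const char *var,
                      char *buf, int len)
{
    size_t n = strlen(var);
    int i;

    for (i = 0; env[i] != NULL; i++) {
        /* Match the whole name: "SLURM" must not match "SLURM_RENICE". */
        if (strncmp(env[i], var, n) == 0 && env[i][n] == '=') {
            const char *val = env[i] + n + 1;

            if (len <= 0 || strlen(val) >= (size_t) len)
                return LOOKUP_NO_SPACE;
            strcpy(buf, val);
            return LOOKUP_OK;
        }
    }
    return LOOKUP_NOT_FOUND;
}
```

       A caller sizes buf for the largest value it expects and treats any
       result other than success as "variable not usable".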

       Functions are also available from within the SPANK plugins to establish
       environment variables to be  exported  to  the  SLURM  PrologSlurmctld,
       Prolog,  Epilog and EpilogSlurmctld programs (the so-called job control
       environment). The names of environment variables established by these
       calls will be prepended with the string SPANK_ in order to avoid any
       security implications of arbitrary environment variable control.
       (After all, the job control scripts run as root or the SLURM user.)

       These functions are available from local context only.

         spank_err_t spank_job_control_getenv(spank_t spank, const char *var,
                              char *buf, int len);
         spank_err_t spank_job_control_setenv(spank_t spank, const char *var,
                              const char *val, int overwrite);
         spank_err_t spank_job_control_unsetenv(spank_t spank, const char *var);

       See spank.h for more information, and EXAMPLES below for an example of
       spank_getenv usage.
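
       The SPANK_ prefixing applied to job control environment variables
       can be pictured with a small helper. This is a sketch only:
       job_control_name is a hypothetical function that is not part of
       spank.h, and it shows just the naming rule described above, not how
       SLURM transports the variables to the prolog and epilog programs.

```c
#include <string.h>

#define JOB_CONTROL_PREFIX "SPANK_"

/* Build the name under which a job control variable would appear in
 * the job control environment. Returns 0 on success, or -1 if out
 * (outlen bytes) is too small for the prefixed name. */
static int job_control_name(const char *var, char *out, size_t outlen)
{
    /* sizeof the prefix literal already counts the terminating NUL. */
    if (strlen(var) + sizeof JOB_CONTROL_PREFIX > outlen)
        return -1;
    strcpy(out, JOB_CONTROL_PREFIX);
    strcat(out, var);
    return 0;
}
```

       Under this rule, a variable set by a plugin as RENICE would be
       looked up as SPANK_RENICE by a prolog or epilog script.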

       Many of the described  SPANK  functions  available  to  plugins  return
       errors  via  the  spank_err_t  error type. On success, the return value
       will be set to ESPANK_SUCCESS, while on failure, the return value  will
       be  set to one of many error values defined in slurm/spank.h. The SPANK
       interface provides a simple function

         const char * spank_strerror(spank_err_t err);

       which may be used to translate a  spank_err_t  value  into  its  string
       representation.

SPANK OPTIONS

       SPANK  plugins also have an interface through which they may define and
       implement extra job options. These options are made  available  to  the
       user  through SLURM commands such as srun(1), salloc(1), and sbatch(1).
       If the option is specified by the user, its value is forwarded and
       registered with the plugin in slurmd when the job is run.  In this way,
       SPANK plugins may dynamically provide new options and functionality  to
       SLURM.

       Each  option registered by a plugin to SLURM takes the form of a struct
       spank_option which is declared in <slurm/spank.h> as

          struct spank_option {
             char *         name;
             char *         arginfo;
             char *         usage;
             int            has_arg;
             int            val;
             spank_opt_cb_f cb;
          };

       Where

       name   is  the  name  of  the  option.  Its  length   is   limited   to
              SPANK_OPTION_MAXLEN defined in <slurm/spank.h>.

       arginfo
              is  a  description  of the argument to the option, if the option
              does take an argument.

       usage  is a short description of the option suitable for --help output.

       has_arg
              0  if  option  takes no argument, 1 if option takes an argument,
              and  2  if  the  option  takes  an   optional   argument.   (See
              getopt_long(3)).

       val    A  plugin-local value to return to the option callback function.

       cb     A callback function that is invoked when the  plugin  option  is
              registered   with   SLURM.   spank_opt_cb_f   is   typedef’d  in
              <slurm/spank.h> as

                typedef int (*spank_opt_cb_f) (int val, const char *optarg,
                                         int remote);

              Where val is the value of the  val  field  in  the  spank_option
              struct,  optarg  is  the  supplied  argument  if applicable, and
              remote is 0 if the function is being  called  from  the  "local"
              host (e.g. srun) or 1 from the "remote" host (slurmd).

       Plugin    options    may   be   registered   with   SLURM   using   the
       spank_option_register function. This function is only valid when called
       from the plugin’s slurm_spank_init handler, and registers one option at
       a time. The prototype is

          spank_err_t spank_option_register (spank_t sp,
                    struct spank_option *opt);

       This function will return ESPANK_SUCCESS on successful registration  of
       an  option,  or  ESPANK_BAD_ARG  for  errors  including invalid spank_t
       handle, or when the function is not called  from  the  slurm_spank_init
       function.  All options need to be registered from all contexts in which
       they will be used. For instance, if an option is  only  used  in  local
       (srun)  and remote (slurmd) contexts, then spank_option_register should
       only be called from within those contexts. For example:

          if (spank_context() != S_CTX_ALLOCATOR)
             spank_option_register (sp, opt);

       If, however, the option is used in all contexts, then
       spank_option_register must be called in every context.

       In  addition  to spank_option_register, plugins may also export options
       to SLURM by defining a table of struct  spank_option  with  the  symbol
       name spank_options. This method, however, is not supported for use with
       sbatch   and   salloc   (allocator   context),   thus   the   use    of
       spank_option_register is preferred. When using the spank_options table,
       the  final  element  in  the  array  must  be  filled  with  zeros.   A
       SPANK_OPTIONS_TABLE_END  macro  is provided in <slurm/spank.h> for this
       purpose.

       When an option is provided by the user on the local  side,  SLURM  will
       immediately  invoke  the option’s callback with remote=0. This is meant
       for the plugin to do local sanity checking of  the  option  before  the
       value is sent to the remote side during job launch. If the argument the
       user specified is invalid, the plugin should log an error and return a
       non-zero value from the callback.
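
       A validating callback of this kind might look like the sketch
       below. To keep it buildable without SLURM headers, the typedef and
       struct are stand-in copies of the declarations quoted earlier in
       this section; a real plugin would include <slurm/spank.h> instead
       and would normally report errors with slurm_error rather than
       fprintf. The --level option, its 0-9 range, and the names
       level_opt_cb and level_option are hypothetical.

```c
#include <errno.h>
#include <stdio.h>
#include <stdlib.h>

/* Stand-in declarations mirroring those quoted from <slurm/spank.h>. */
typedef int (*spank_opt_cb_f)(int val, const char *optarg, int remote);
struct spank_option {
    char *name;
    char *arginfo;
    char *usage;
    int has_arg;
    int val;
    spank_opt_cb_f cb;
};

static int level = 0;   /* value of the hypothetical --level option */

/* Callback for a hypothetical --level=<0-9> option. A non-zero return
 * signals that the user-supplied argument was rejected; the stored
 * level is left unchanged in that case. */
static int level_opt_cb(int val, const char *optarg, int remote)
{
    char *end;
    long l;

    (void) val;      /* only one option shares this callback */
    (void) remote;   /* same validation on both sides */

    if (optarg == NULL || *optarg == '\0') {
        fprintf(stderr, "level: missing argument\n");
        return -1;
    }
    errno = 0;
    l = strtol(optarg, &end, 10);
    if (errno != 0 || *end != '\0' || l < 0 || l > 9) {
        fprintf(stderr, "level: invalid argument: %s\n", optarg);
        return -1;
    }
    level = (int) l;
    return 0;
}

/* has_arg = 1: the option requires an argument. */
static struct spank_option level_option = {
    "level", "<0-9>", "Set a hypothetical level.", 1, 0, level_opt_cb
};
```

       Checking the whole string (the *end test) rejects input such as
       "7x" on the local side, before anything is sent to the remote side.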

       On the remote side, options and their arguments are registered just
       after SPANK plugins are loaded and before the slurm_spank_init handler
       is called. This allows plugins to modify the behavior of all plugin
       functionality based on the value of user-provided options. (See
       EXAMPLES below for a plugin that registers an option with SLURM.)

CONFIGURATION

       The default SPANK plug-in stack configuration file is plugstack.conf in
       the same directory as slurm.conf(5), though this may be changed via the
       SLURM  config  parameter  PlugStackConfig.  Normally the plugstack.conf
       file should be identical on all nodes of the cluster.  The config  file
       lists  SPANK  plugins,  one  per line, along with whether the plugin is
       required or optional, and any global arguments that are to be passed to
       the  plugin  for runtime configuration.  Comments are preceded with ’#’
       and extend to the end of  the  line.   If  the  configuration  file  is
       missing or empty, it will simply be ignored.

       The format of each non-comment line in the configuration file is:

         required/optional   plugin   arguments

       For example:

         optional /usr/lib/slurm/test.so

       This tells slurmd to load the plugin test.so, passing no arguments. If
       a SPANK plugin is required, then failure of any of the plugin’s
       functions will cause slurmd to terminate the job, while failures in
       optional plugins only cause a warning.
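
       The line format can be made concrete with a small parser. This is a
       sketch under assumptions rather than SLURM's own parser:
       struct stack_entry and parse_stack_line are hypothetical names, the
       buffer sizes are arbitrary, and quoting and the include keyword are
       not handled.

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical parsed form of one plugstack.conf line. */
struct stack_entry {
    int required;        /* 1 = required, 0 = optional */
    char plugin[256];    /* plugin path or file name */
    char args[256];      /* remaining text, passed to the plugin */
};

/* Parse "required|optional  plugin  [arguments]". Text after '#' is a
 * comment. Returns 1 if an entry was parsed, 0 for a blank or
 * comment-only line, and -1 for a malformed line. */
static int parse_stack_line(const char *line, struct stack_entry *e)
{
    char buf[1024];
    const char *p;
    size_t n;

    snprintf(buf, sizeof buf, "%s", line);
    buf[strcspn(buf, "#\n")] = '\0';      /* drop comment and newline */

    p = buf + strspn(buf, " \t");
    n = strcspn(p, " \t");
    if (n == 0)
        return 0;                         /* blank or comment only */
    if (n == 8 && strncmp(p, "required", 8) == 0)
        e->required = 1;
    else if (n == 8 && strncmp(p, "optional", 8) == 0)
        e->required = 0;
    else
        return -1;                        /* unknown first field */

    p += n;
    p += strspn(p, " \t");
    n = strcspn(p, " \t");
    if (n == 0 || n >= sizeof e->plugin)
        return -1;                        /* plugin missing or too long */
    memcpy(e->plugin, p, n);
    e->plugin[n] = '\0';

    p += n;
    p += strspn(p, " \t");
    snprintf(e->args, sizeof e->args, "%s", p);
    n = strlen(e->args);                  /* trim trailing blanks */
    while (n > 0 && (e->args[n - 1] == ' ' || e->args[n - 1] == '\t'))
        e->args[--n] = '\0';
    return 1;
}
```

       Given the line "optional renice.so min_prio=-10", the sketch fills
       in required = 0, plugin = "renice.so", and args = "min_prio=-10".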

       If a fully-qualified path is not specified for a plugin, then the
       currently configured PluginDir in slurm.conf(5) is searched.

       SPANK  plugins  are stackable, meaning that more than one plugin may be
       placed into the config file. The  plugins  will  simply  be  called  in
       order, one after the other, and appropriate action taken on failure
       given the state of the plugin’s optional flag.
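
       The stacking behavior can be modeled in a few lines. The sketch
       below is an illustration, not SLURM code: struct stack_plugin,
       run_stack, and the two trivial hooks are hypothetical. Whether the
       remaining plugins still run after a required plugin fails is not
       stated above; this sketch assumes the walk stops at that point.

```c
#include <stdio.h>

/* Hypothetical in-memory form of one loaded plugin stack entry. */
struct stack_plugin {
    const char *name;
    int required;          /* 1 = required, 0 = optional */
    int (*hook)(void);     /* stand-in for one slurm_spank_* callback */
};

/* Call each plugin's hook in configuration order. A failing optional
 * plugin only produces a warning; a failing required plugin stops the
 * walk and makes the whole stack fail. Returns 0 or -1. */
static int run_stack(const struct stack_plugin *stack, int count)
{
    int i;

    for (i = 0; i < count; i++) {
        if (stack[i].hook() == 0)
            continue;
        if (stack[i].required) {
            fprintf(stderr, "error: required plugin %s failed\n",
                    stack[i].name);
            return -1;
        }
        fprintf(stderr, "warning: optional plugin %s failed\n",
                stack[i].name);
    }
    return 0;
}

/* Two trivial hooks used to exercise run_stack(). */
static int hook_ok(void)   { return 0; }
static int hook_fail(void) { return -1; }
```

       With an optional plugin using hook_fail followed by a required
       plugin using hook_ok, run_stack returns 0 after printing one
       warning; swapping the two hooks makes it return -1.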

       Additional config files or directories of config files may be  included
       in  plugstack.conf  with  the include keyword. The include keyword must
       appear on its own line, and takes a glob as its parameter, so  multiple
       files may be included from one include line. For example, the following
       syntax will load all config files  in  the  /etc/slurm/plugstack.conf.d
       directory, in local collation order:

         include /etc/slurm/plugstack.conf.d/*

       which  might  be  considered  a  more flexible method for building up a
       spank plugin stack.
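
       To preview which files an include pattern would pick up, the
       pattern can be expanded with the POSIX glob(3) function, which
       returns matches sorted according to the current collation locale
       unless GLOB_NOSORT is given. That SLURM expands include lines in
       exactly this way is an assumption; list_included is a hypothetical
       helper written for this sketch.

```c
#include <glob.h>
#include <stdio.h>

/* Expand an include pattern and print each matching path. Returns the
 * number of matches, or -1 if glob(3) reports an error other than
 * "no match". */
static int list_included(const char *pattern)
{
    glob_t g;
    size_t i;
    int rc, count;

    rc = glob(pattern, 0, NULL, &g);
    if (rc == GLOB_NOMATCH)
        return 0;
    if (rc != 0)
        return -1;
    for (i = 0; i < g.gl_pathc; i++)
        printf("include: %s\n", g.gl_pathv[i]);
    count = (int) g.gl_pathc;
    globfree(&g);
    return count;
}
```

       Calling list_included("/etc/slurm/plugstack.conf.d/*") on a node
       prints the candidate files in the order glob(3) returns them, or
       returns 0 if the directory is empty or absent.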

       The SPANK config file is re-read on each job launch, so editing the
       config file will not affect running jobs. However, care should be
       taken so that a partially edited config file is not read by a
       launching job.

EXAMPLES

       Simple SPANK config file:

       #
       # SPANK config file
       #
       # required?       plugin                     args
       #
       optional          renice.so                  min_prio=-10
       required          /usr/lib/slurm/test.so

       The  following is a simple SPANK plugin to modify the nice value of job
       tasks. This plugin adds a --renice=[prio] option to  srun  which  users
       can  use  to set the priority of all remote tasks. Priority may also be
       specified via a SLURM_RENICE environment variable. A  minimum  priority
       may  be  established  via a "min_prio" parameter in plugstack.conf (See
       above for example).

       /*
        *   To compile (-fPIC is typically needed for a shared object):
        *    gcc -shared -fPIC -o renice.so renice.c
        *
        */
       #include <sys/types.h>
       #include <stdio.h>
       #include <stdlib.h>
       #include <unistd.h>
       #include <string.h>
       #include <sys/resource.h>

       #include <slurm/spank.h>

       /*
        * All spank plugins must define this macro for the SLURM plugin loader.
        */
       SPANK_PLUGIN(renice, 1);

       #define PRIO_ENV_VAR "SLURM_RENICE"
       #define PRIO_NOT_SET 42

       /*
        *  Minimum allowable value for priority. May be set globally
        *   via plugin option min_prio=<prio>
        */
       static int min_prio = -20;

       static int prio = PRIO_NOT_SET;

       static int _renice_opt_process (int val, const char *optarg, int remote);
       static int _str2prio (const char *str, int *p2int);

       /*
        *  Provide a --renice=[prio] option to srun:
        */
       struct spank_option spank_options[] =
       {
           { "renice", "[prio]", "Re-nice job tasks to priority [prio].", 2, 0,
               (spank_opt_cb_f) _renice_opt_process
           },
           SPANK_OPTIONS_TABLE_END
       };

       /*
        *  Called from both srun and slurmd.
        */
       int slurm_spank_init (spank_t sp, int ac, char **av)
       {
           int i;

           /* Don't do anything in sbatch/salloc */
           if (spank_context () == S_CTX_ALLOCATOR)
               return (0);

           for (i = 0; i < ac; i++) {
               if (strncmp ("min_prio=", av[i], 9) == 0) {
                   const char *optarg = av[i] + 9;
                   if (_str2prio (optarg, &min_prio) < 0)
                       slurm_error ("Ignoring invalid min_prio value: %s", av[i]);
               }
               else {
                   slurm_error ("renice: Invalid option: %s", av[i]);
               }
           }

           if (!spank_remote (sp))
               slurm_verbose ("renice: min_prio = %d", min_prio);

           return (0);
       }

       int slurm_spank_task_post_fork (spank_t sp, int ac, char **av)
       {
           pid_t pid;
           int taskid;

           if (prio == PRIO_NOT_SET) {
               /*
                *  See if SLURM_RENICE env var is set by user
                */
               char val [1024];

               if (spank_getenv (sp, PRIO_ENV_VAR, val, 1024) != ESPANK_SUCCESS)
                   return (0);

               if (_str2prio (val, &prio) < 0) {
                   slurm_error ("Bad value for %s: %s", PRIO_ENV_VAR, val);
                   return (-1);
               }

               if (prio < min_prio)
                   slurm_error ("%s=%d not allowed, using min=%d",
                       PRIO_ENV_VAR, prio, min_prio);
           }

           if (prio < min_prio)
               prio = min_prio;

           spank_get_item (sp, S_TASK_GLOBAL_ID, &taskid);
           spank_get_item (sp, S_TASK_PID, &pid);

           /* pid_t is printed as long; prio is a plain int */
           slurm_info ("re-nicing task%d pid %ld to %d",
                       taskid, (long) pid, prio);

           if (setpriority (PRIO_PROCESS, (int) pid, (int) prio) < 0) {
               slurm_error ("setpriority: %m");
               return (-1);
           }

           return (0);
       }

       static int _str2prio (const char *str, int *p2int)
       {
           long int l;
           char *p;

           l = strtol (str, &p, 10);
           /* Reject trailing garbage and values outside the nice range */
           if ((*p != '\0') || (l < -20) || (l > 20))
               return (-1);

           *p2int = (int) l;

           return (0);
       }

       static int _renice_opt_process (int val, const char *optarg, int remote)
       {
           if (optarg == NULL) {
               slurm_error ("renice: invalid argument!");
               return (-1);
           }

           if (_str2prio (optarg, &prio) < 0) {
               slurm_error ("Bad value for --renice: %s", optarg);
               return (-1);
           }

           if (prio < min_prio)
               slurm_error ("--renice=%d not allowed, will use min=%d",
                            prio, min_prio);

           return (0);
       }

COPYING

       Copyright (C)  2006  The  Regents  of  the  University  of  California.
       Produced  at  Lawrence  Livermore National Laboratory (cf, DISCLAIMER).
       CODE-OCEC-09-009. All rights reserved.

       This file is  part  of  SLURM,  a  resource  management  program.   For
       details, see <https://computing.llnl.gov/linux/slurm/>.

       SLURM  is free software; you can redistribute it and/or modify it under
       the terms of the GNU General Public License as published  by  the  Free
       Software  Foundation;  either  version  2  of  the License, or (at your
       option) any later version.

       SLURM is distributed in the hope that it will be  useful,  but  WITHOUT
       ANY  WARRANTY;  without even the implied warranty of MERCHANTABILITY or
       FITNESS FOR A PARTICULAR PURPOSE.  See the GNU General  Public  License
       for more details.

FILES

       /etc/slurm/slurm.conf - SLURM configuration file.
       /etc/slurm/plugstack.conf - SPANK configuration file.
       /usr/include/slurm/spank.h - SPANK header file.

SEE ALSO

       srun(1), slurm.conf(5)