NAME

       ompi-restart, orte-restart - Restart a previously checkpointed parallel
       job using the Open PAL Checkpoint/Restart Service (CRS)

       NOTE: ompi-restart and orte-restart are exact synonyms for each other.
       Using either name results in identical behavior.

SYNOPSIS

       ompi-restart [ options ] <GLOBAL SNAPSHOT HANDLE>

OPTIONS

       ompi-restart attempts to restart a previously checkpointed parallel
       job from the global snapshot handle reference returned by
       ompi-checkpoint(1).

       <GLOBAL SNAPSHOT HANDLE>
                 The global snapshot handle reference returned by
                 ompi-checkpoint(1), used to restart the job. This must be
                 the last argument to this command.

       -h | --help
                 Display help for this command

       -p | --preload
                 Preload the checkpoint files on  the  remote  systems  before
                 restarting the application. Disabled by default.

       --fork    Fork a new process to run the restarted job. By default,
                 the restarted process replaces ompi-restart.

       -s | --seq
                 The sequence number of the checkpoint  to  restart  from.  By
                 default,  the  most recent sequence number is used (specified
                 by -1).

       -hostfile | --hostfile
                 The hostfile from which to restart the application. Useful in
                 unscheduled  environments.  (Same  behavior  as --machinefile
                 option)

       -machinefile | --machinefile
                 The machinefile from which to restart the application. Useful
                 in  unscheduled  environments.  (Same  behavior as --hostfile
                 option)

       -v | --verbose
                 Enable verbose output for debugging.

       -gmca | --gmca <key> <value>
                 Pass  global  MCA  parameters  that  are  applicable  to  all
                 contexts.  <key>  is  the  parameter  name;  <value>  is  the
                 parameter value.

       -mca | --mca <key> <value>
                 Send arguments to various MCA modules.
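
       The ordering rule above (options first, global snapshot handle last)
       can be sketched as a small shell wrapper. This is an illustration
       only: restart_job and RESTART_BIN are hypothetical names, not part
       of ompi-restart, and the sketch assumes that --hostfile takes a file
       path as its value, which the option list implies but does not show.

```shell
#!/bin/sh
# Hypothetical override so the command line can be dry-run,
# e.g. RESTART_BIN=echo. Defaults to the real tool.
RESTART_BIN="${RESTART_BIN:-ompi-restart}"

# restart_job <handle> [seq] [hostfile]
# Builds the option list first, then appends the global snapshot
# handle as the final argument, as the man page requires.
restart_job() {
    handle="${1:-}"
    seq="${2:-}"
    hostfile="${3:-}"

    if [ -z "$handle" ]; then
        echo "usage: restart_job <handle> [seq] [hostfile]" >&2
        return 2
    fi

    # Reuse the function's positional parameters as an argument list.
    set --
    if [ -n "$seq" ]; then
        set -- "$@" --seq "$seq"
    fi
    if [ -n "$hostfile" ]; then
        # Assumption: --hostfile is followed by the path to the hostfile.
        set -- "$@" --hostfile "$hostfile"
    fi

    "$RESTART_BIN" "$@" "$handle"
}
```

       For example, restart_job ompi_global_snapshot_1234.ckpt 2 hosts.txt
       would restart from sequence number 2 using hosts.txt; the handle and
       file names here are placeholders. Omitting the sequence number
       leaves ompi-restart to pick the most recent checkpoint.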

DESCRIPTION

       ompi-restart restarts a previously running parallel job from its
       checkpoint. It can be invoked multiple times, provided the
       invocations do not overlap in time.
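
       One way to keep repeated invocations from overlapping is to run each
       restart in the foreground and start the next only after the previous
       one has exited. The sketch below assumes the default (non --fork)
       behavior, in which the restarted process replaces ompi-restart;
       restart_times and RESTART_BIN are hypothetical names used only for
       this illustration.

```shell
#!/bin/sh
# Hypothetical override so the loop can be dry-run, e.g. RESTART_BIN=echo.
RESTART_BIN="${RESTART_BIN:-ompi-restart}"

# restart_times <count> <handle>
# Restarts the job <count> times, strictly one after another.
restart_times() {
    count="$1"
    handle="$2"
    i=1
    while [ "$i" -le "$count" ]; do
        echo "restart $i of $count" >&2
        # Foreground call: the loop does not advance until this
        # invocation has exited, so restarts never overlap.
        "$RESTART_BIN" "$handle" || return $?
        i=$((i + 1))
    done
}
```

       The loop stops at the first invocation that returns a non-zero
       status and passes that status back to the caller.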

SEE ALSO

       orte-ps(1), orte-clean(1), ompi-checkpoint(1), opal-checkpoint(1),
       opal-restart(1), opal_crs(7)