NAME
lamexec - Run non-MPI programs on LAM nodes.
SYNTAX
lamexec [-fhvD] [-c <#> | -np <#>] [-nw | -w] [-pty] [-s <node>]
[-x VAR1[=VALUE1][,VAR2[=VALUE2],...]] [<where>] <program>
[-- <args>]
OPTIONS
-c <#> Synonym for -np (see below).
-D Use the executable program location as the current working
directory for created processes. The current working
directory of the created processes will be set before the
user's program is invoked.
-f Do not configure standard I/O file descriptors - use
defaults.
-h Print useful information on this command.
-np <#> Run this many copies of the program on the
given nodes. This option indicates that the specified file
is an executable program and not an application schema. If
no nodes are specified, all LAM nodes are considered for
scheduling; LAM will schedule the programs in a round-robin
fashion, "wrapping around" (and scheduling multiple copies on
a single node) if necessary.
-nw Do not wait for all processes to complete before exiting
lamexec. This option is mutually exclusive with -w.
-pty Enable pseudo-tty support. Among other things, this enables
line-buffered output (which is probably what you want). The
only reason that this feature is not enabled by default is
because it is so new and has not been extensively tested yet.
-s <node> Load the program from this node. This option is not valid on
the command line if an application schema is specified.
-v Be verbose; report on important steps as they are done.
-w Wait for all applications to exit before lamexec exits.
-x Export the specified environment variables to the remote
nodes before executing the program. Existing environment
variables can be specified (see the Examples section, below),
or new variable names specified with corresponding values.
The parser for the -x option is not very sophisticated; it
does not even understand quoted values. Users are advised to
set variables in the environment, and then use -x to export
(not define) them.
<where> A set of node and/or CPU identifiers indicating where to
start <program>.
-- <args> Pass these runtime arguments to every new process. This must
always be the last argument to lamexec. This option is not
valid on the command line if an application schema is
specified.
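
The round-robin placement described for -np can be pictured with a short
shell sketch. This is an illustration only, not LAM's scheduler: the
function name round_robin and the node names n0, n1, n2 are made up for
the example, and it assumes at least one node is listed.

```shell
#!/bin/sh
# Illustration only: print which node each copy would land on if COPIES
# processes are dealt out round-robin over the listed nodes.
# Usage: round_robin COPIES NODE [NODE...]
round_robin() {
    copies=$1
    shift
    i=0
    while [ "$i" -lt "$copies" ]; do
        printf 'copy %d -> %s\n' "$i" "$1"
        # Rotate the node list: move the first node to the end.
        first=$1
        shift
        set -- "$@" "$first"
        i=$((i + 1))
    done
}

round_robin 5 n0 n1 n2
```

With five copies and three nodes, the fourth and fifth copies wrap
around to n0 and n1, which is the "wrapping around" case mentioned
under -np.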
DESCRIPTION
lamexec is essentially a clone of mpirun(1), but is intended for
non-MPI programs.
One invocation of lamexec starts a non-MPI application running under
LAM. To start the same program on all LAM nodes, the application can
be specified on the lamexec command line. To start multiple
applications on the LAM nodes, an application schema is required in a
separate file. See appschema(5) for a description of the application
schema syntax, but it essentially contains multiple lamexec command
lines, less the command name itself. The ability to specify different
options for different instantiations of a program is another reason to
use an application schema.
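
As a rough sketch of what such a file might look like, the script below
writes a two-line schema to a temporary file. It relies only on the
statement above that a schema holds lamexec command lines without the
command name; the exact syntax is defined in appschema(5) and should be
checked there. The program names prog1 and prog2 are hypothetical.

```shell
#!/bin/sh
# Write a two-line application schema to a temporary file.
# Assumption: each line is a lamexec command line minus the word
# "lamexec"; prog1 and prog2 are hypothetical program names.
schema=$(mktemp)
cat > "$schema" <<'EOF'
n0 prog1
n1-2 -c 4 prog2
EOF

# On a system with LAM booted, the schema would then be started with:
#   lamexec -v "$schema"
echo "schema written to $schema:"
cat "$schema"
```

Each line carries its own node specification and options, which is
what allows different instantiations to be configured differently.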
Location Nomenclature
The location nomenclature that is used for the <where> clause mentioned
in the SYNTAX section, above, is identical to mpirun(1)'s nomenclature.
See the mpirun(1) man page for a lengthy discussion of the location
nomenclature.
Note that the by-CPU syntax, while valid for lamexec, is not quite as
meaningful because process rank ordering in MPI_COMM_WORLD is
irrelevant. As such, the by-node nomenclature is typically the
preferred syntax for lamexec.
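
To make the by-node notation concrete, the sketch below expands a range
such as "n8-10" (used in the EXAMPLES section) into individual node
identifiers. The function expand_nodes is a hypothetical helper written
for this illustration; it handles only the "nA" and "nA-B" forms and is
not a substitute for the full nomenclature in mpirun(1).

```shell
#!/bin/sh
# Expand a by-node range such as "n8-10" into one node identifier per
# line. Covers only the "nA" and "nA-B" forms; see mpirun(1) for the
# complete syntax.
expand_nodes() {
    range=${1#n}          # strip the leading "n"
    first=${range%-*}     # text before the dash (or the whole thing)
    last=${range#*-}      # text after the dash (or the whole thing)
    i=$first
    while [ "$i" -le "$last" ]; do
        printf 'n%d\n' "$i"
        i=$((i + 1))
    done
}

expand_nodes n8-10
```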
Application Schema or Executable Program?
To distinguish the two different forms, lamexec looks on the command
line for <where> or the -c option. If neither is specified, then the
file named on the command line is assumed to be an application schema.
If either one or both are specified, then the file is assumed to be an
executable program. If <where> and -c are both specified, then copies
of the program are started on the specified nodes according to an
internal LAM scheduling policy. Specifying just one node effectively
forces LAM to run all copies of the program in one place. If -c is
given, but not <where>, then all LAM nodes are used. If <where> is
given, but not -c, then one copy of the program is run on each node.
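
The four cases above can be restated as a small decision table. The
function below is a hypothetical helper for illustration; it does not
parse a real command line, it only takes two yes/no answers (was a node
specification given, was -c given) and prints the outcome the text
describes.

```shell
#!/bin/sh
# Restate the rules above. Arguments: "yes" or "no" for whether nodes
# were named on the command line, then "yes" or "no" for whether -c
# was given.
describe_invocation() {
    has_nodes=$1
    has_c=$2
    case "$has_nodes,$has_c" in
        no,no)   echo "application schema" ;;
        yes,yes) echo "executable: copies scheduled on the given nodes" ;;
        no,yes)  echo "executable: copies scheduled on all LAM nodes" ;;
        yes,no)  echo "executable: one copy on each given node" ;;
        *)       echo "usage: describe_invocation yes|no yes|no" >&2
                 return 2 ;;
    esac
}

describe_invocation no no
describe_invocation yes no
```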
Program Transfer
By default, LAM searches for executable programs on the target node
where a particular instantiation will run. If the file system is not
shared, the target nodes are homogeneous, and the program is frequently
recompiled, it can be convenient to have LAM transfer the program from
a source node (usually the local node) to each target node. The -s
option specifies this behavior and identifies the single source node.
Locating Files
LAM looks for an executable program by searching the directories in the
user's PATH environment variable as defined on the source node(s).
This behavior is consistent with logging into the source node and
executing the program from the shell. On remote nodes, the "." path is
the home directory.
LAM looks for an application schema in three directories: the local
directory, the value of the LAMAPPLDIR environment variable, and
laminstalldir/boot, where "laminstalldir" is the directory where
LAM/MPI was installed.
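
The schema lookup can be imitated with a few lines of shell, which is
sometimes handy for checking which file would be picked up. This is a
sketch, not LAM's code: the function find_schema and the install prefix
/usr/local/lam are hypothetical, and it assumes the three directories
are tried in the order listed above.

```shell
#!/bin/sh
# find_schema NAME INSTALLDIR
# Print the first existing match among: the current directory,
# $LAMAPPLDIR (if set), and INSTALLDIR/boot. Returns 1 if none exists.
# Assumption: the directories are tried in the order the text lists them.
find_schema() {
    name=$1
    installdir=$2
    for dir in . "${LAMAPPLDIR:-}" "$installdir/boot"; do
        if [ -n "$dir" ] && [ -f "$dir/$name" ]; then
            echo "$dir/$name"
            return 0
        fi
    done
    return 1
}

# Hypothetical install prefix; substitute the real LAM installation
# directory.
find_schema myapp /usr/local/lam || echo "myapp: not found in any schema directory"
```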
Standard I/O
LAM directs UNIX standard input to /dev/null on all remote nodes. On
the local node that invoked lamexec, standard input is inherited from
lamexec. The default is what used to be the -w option to prevent
conflicting access to the terminal.
LAM directs UNIX standard output and error to the LAM daemon on all
remote nodes. LAM ships all captured output/error to the node that
invoked lamexec and prints it on the standard output/error of lamexec.
Local processes inherit the standard output/error of lamexec and
transfer to it directly.
Thus it is possible to redirect standard I/O for LAM applications by
using the typical shell redirection procedure on lamexec.
% lamexec N my_app < my_input > my_output
The -f option avoids all the setup required to support standard I/O
described above. Remote processes are completely directed to /dev/null
and local processes inherit file descriptors from lamboot(1).
Pseudo-tty support
The -pty option enables pseudo-tty support for process output. This
allows, among other things, for line-buffered output from remote nodes
(which is probably what you want).
This option is not currently the default for lamexec because it has not
been thoroughly tested on a variety of different Unixes. Users are
encouraged to use -pty and report any problems back to the LAM Team.
Current Working Directory
The current working directory for new processes created on the local
node is inherited from lamexec. The current working directory for new
processes created on remote nodes is the remote user's home directory.
This default behavior is overridden by the -D option.
The -D option will change the current working directory of new
processes to the directory where the executable resides before the
user's program is invoked.
An alternative to the -D option is the -wd option. -wd allows the user
to specify an arbitrary current working directory (vs. the location of
the executable). Note that the -wd option can be used in application
schema files (see appschema(5)) as well.
Process Environment
Processes in the application inherit their environment from the LAM
daemon on the node on which they are running. The environment of a
LAM daemon is fixed upon booting of the LAM with lamboot(1) and is
inherited from the user's shell. On the origin node this will be the
shell from which lamboot(1) was invoked and on remote nodes this will
be the shell started by rsh(1). When running dynamically linked
applications which require the LD_LIBRARY_PATH environment variable to
be set, care must be taken to ensure that it is correctly set when
booting the LAM.
Exported Environment Variables
The -x option to lamexec can be used to export specific environment
variables to the new processes. While the syntax of the -x option
allows the definition of new variables, note that the parser for this
option is currently not very sophisticated - it does not even
understand quoted values. Users are advised to set variables in the
environment and use -x to export them, not to define them.
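
A minimal sketch of that advice: the variable is set and quoted by the
local shell, and -x is given only its name. MY_DATA_DIR and my_app are
hypothetical names chosen for the example. The script runs lamexec only
if it is installed, and otherwise prints the command it would have run.

```shell
#!/bin/sh
# Set the variable in the local environment first; the shell, not the
# -x parser, handles the quoting. MY_DATA_DIR and my_app are
# hypothetical names.
MY_DATA_DIR="/scratch/run 42"
export MY_DATA_DIR

# Name the variable with -x instead of defining it on the command line.
set -- lamexec -x MY_DATA_DIR N my_app

if command -v lamexec >/dev/null 2>&1; then
    "$@"
else
    echo "would run: $*"
fi
```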
EXAMPLES
lamexec N prog1
Load and execute prog1 on all nodes. Search for the executable
file on each node.
lamexec -c 8 prog1
Run 8 copies of prog1 wherever LAM wants to run them.
lamexec n8-10 -v -nw -s n3 prog1 -- -q
Load and execute prog1 on nodes 8, 9, and 10. Search for prog1 on
node 3 and transfer it to the three target nodes. Report as each
process is created. Pass "-q" as a command-line argument to each
new process. Do not wait for the processes to complete before
exiting lamexec.
lamexec -v myapp
Parse the application schema, myapp, and start all processes
specified in it. Report as each process is created.
lamexec N N -pty -wd /workstuff/output -x DISPLAY run_app.csh
Run the application "run_app.csh" (presumably a C shell script)
twice on each node in the system (ideal for 2-way SMPs). Also
enable pseudo-tty support, change directory to /workstuff/output,
and export the DISPLAY variable to the new processes (perhaps the
shell script will invoke an X application such as xv to display
output).
lamexec -np 5 -D `pwd`/my_application
A common usage of lamexec in environments where a filesystem is
shared between all nodes in the multicomputer. Using the
shell-escaped "pwd" command specifies the full path of the
executable to run. This avoids having to put the directory in the
path; the remote nodes will have an absolute filename to execute
(and will change to its directory upon invocation).
DIAGNOSTICS
lamexec: Exec format error
A non-ASCII character was detected in the application schema. This
is usually a command line usage error where lamexec is expecting an
application schema and an executable file was given.
lamexec: syntax error in application schema, line XXX
The application schema cannot be parsed because of a usage or
syntax error on the given line in the file.
<filename>: No such file or directory
This error can occur in two cases. Either the named file cannot be
located or it has been found but the user does not have sufficient
permissions to execute the program or read the application schema.
RETURN VALUE
lamexec returns 0 if all processes started by lamexec exit normally. A
non-zero value is returned if an internal error occurred in lamexec, or
one or more processes exited abnormally. If an internal error occurred
in lamexec, the corresponding error code is returned. In the event
that one or more processes exit with a non-zero exit code, the return
value of the process that lamexec first notices died abnormally will be
returned. Note that, in general, this will be the first process that
died but is not guaranteed to be so.
However, note that if the -nw switch is used, the return value from
lamexec does not indicate the exit status of the processes started by
it.
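
In a script, the return value can be checked like that of any other
command. The sketch below uses a hypothetical wrapper, run_and_report,
and substitutes "true" and "sh -c 'exit 3'" for lamexec so that it runs
on a machine without LAM. It is only meaningful without -nw, for the
reason given above.

```shell
#!/bin/sh
# run_and_report CMD [ARGS...]
# Run a command, print a one-line summary of its exit status, and pass
# the status through to the caller.
run_and_report() {
    "$@"
    status=$?
    if [ "$status" -eq 0 ]; then
        echo "ok: all processes exited normally"
    else
        echo "failed: exit status $status"
    fi
    return "$status"
}

# Stand-ins for lamexec so the sketch runs anywhere; on a LAM system
# use, for example:  run_and_report lamexec N prog1
run_and_report true
run_and_report sh -c 'exit 3' || echo "handling the failure here"
```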
SEE ALSO
mpimsg(1), mpirun(1), mpitask(1), loadgo(1)