NAME

       bhost - LAM boot schema (host file) format

SYNTAX

       #
       # comments
       #
       <machine> [cpu=<cpucount>] [user=<userid>]
       <machine> [cpu=<cpucount>] [user=<userid>]
        ...

DESCRIPTION

       A boot schema describes the machines that combine to form a
       multicomputer running LAM.  It is used by recon(1) to verify initial
       conditions for running LAM, by lamboot(1) to start LAM, and by
       lamhalt(1) to terminate LAM (note that lamwipe(1) has been deprecated
       in favor of lamhalt(1)).

       The syntax of a LAM boot schema is sometimes called the "host file"
       syntax.  It is line oriented.  Each line gives the name of a machine
       (typically its full Internet domain name), an optional number of CPUs
       available on that machine, and an optional userid with which to
       access it.
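
       The line format described above can be illustrated with a small
       parser.  This is only a sketch and not LAM's own parser: the function
       name parse_bhost_line is hypothetical, option values are kept as
       plain strings, and it is an assumption that a comment is any line
       whose first non-blank character is "#".

```python
def parse_bhost_line(line):
    """Return (machine, options) for a host entry, or None for a blank or comment line.

    Hypothetical helper for illustration; it is not part of LAM.
    """
    text = line.strip()
    # Assumption: comments are whole lines that start with "#".
    if not text or text.startswith("#"):
        return None
    # The first whitespace-separated token is the machine name;
    # every remaining token is expected to look like key=value.
    machine, *pairs = text.split()
    options = {}
    for pair in pairs:
        key, sep, value = pair.partition("=")
        if not (key and sep and value):
            raise ValueError(f"expected key=value, got {pair!r}")
        options[key] = value
    return machine, options


if __name__ == "__main__":
    print(parse_bhost_line("beowulf1.cluster.example.com cpu=2"))
    print(parse_bhost_line("somewhere.else.example.com user=guest"))
    print(parse_bhost_line("# example LAM host file"))
```

       The sketch accepts any key=value pair rather than a fixed set of
       keys, so it does not check whether LAM itself would accept the line.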

       Common boot schemas for a particular site may be created by the system
       administrator and placed in the installation directory under etc/.
       Their names typically start with the prefix bhost.  Individual users
       usually create their own boot schemas, especially if the
       configurations are simple.

NAME RESOLUTION

       Note that lamboot resolves all names listed in bhost on the node on
       which lamboot was invoked.  The lamboot(1) man page contains
       information about address resolution, examples of how to handle
       multiple network interface cards (NICs) in a node, etc.

EXAMPLE

       Here is an example boot schema with four nodes:

       #
       # example LAM host file
       #
       server.cluster.example.com schedule=no
       beowulf1.cluster.example.com cpu=2
       beowulf2.cluster.example.com
       beowulf2.cluster.example.com
       somewhere.else.example.com user=guest

       The user=guest entry is needed because the user has a different
       login ID ("guest") on somewhere.else.example.com.  Additionally, note
       that beowulf1 has a CPU count of 2 listed (a CPU count of 1 is assumed
       if none is given).  This value is used by mpirun(1),
       MPI_Comm_spawn(3), and MPI_Comm_spawn_multiple(3) for the "C" (or CPU)
       notation that specifies how many ranks to start.  This is particularly
       useful for running on SMP machines.

       Note the schedule=no clause.  This means that LAM will boot a daemon on
       that node but, by default, will not launch any MPI processes on it.
       This is handy when you want to control your MPI applications from one
       node (e.g., a server) but don't want to run any MPI applications on
       it.  In some environments this is the default (e.g., BProc).  See the
       LAM User's Guide for more details.

       beowulf2  is  listed  twice,  but has no specific CPU count listed.  In
       this case, LAM will keep a running tally of the total  number  of  CPUs
       for  that  host.   Hence, LAM will calculate that beowulf2 has two CPUs
       available  for  use.   Calculating  the  number  of  CPUs  by  counting
       occurrences of a hostname is useful in a batch environment where a
       hostfile may list the same hostname multiple times, indicating that the
       batch scheduler has allocated multiple CPUs for a single job (e.g., PBS
       operates this way).
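
       The running tally can be sketched as follows.  This is an
       illustration only, not LAM's implementation: the function name
       tally_cpus is hypothetical, and it is an assumption that every line
       adds its cpu= value (or 1 when none is given) to the total for its
       hostname.

```python
EXAMPLE = """\
#
# example LAM host file
#
server.cluster.example.com schedule=no
beowulf1.cluster.example.com cpu=2
beowulf2.cluster.example.com
beowulf2.cluster.example.com
somewhere.else.example.com user=guest
"""


def tally_cpus(lines):
    """Sum CPU counts per hostname, in order of first appearance.

    Hypothetical helper; assumes each line contributes cpu=<n>, or 1 by default.
    """
    totals = {}
    for line in lines:
        text = line.strip()
        if not text or text.startswith("#"):
            continue  # skip blank lines and comment lines
        machine, *pairs = text.split()
        cpus = 1  # a CPU count of 1 is assumed if none is given
        for pair in pairs:
            key, _, value = pair.partition("=")
            if key == "cpu":
                cpus = int(value)
        totals[machine] = totals.get(machine, 0) + cpus
    return totals


if __name__ == "__main__":
    for machine, cpus in tally_cpus(EXAMPLE.splitlines()).items():
        print(machine, cpus)
```

       With the example host file, the two beowulf2 lines add up to a total
       of 2 for that host, matching the explicit cpu=2 on beowulf1.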

       For the schema above, the command "mpirun C foo" would start five
       instances of the foo program: two on beowulf1, two on beowulf2, and
       one on somewhere.else.  None would start on server, because of its
       schedule=no clause.
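
       The count of five can be checked with a short sketch.  The node table
       is transcribed by hand from the example schema, and the names NODES
       and place_ranks are hypothetical.  Listing the instances in host-file
       order is an assumption made for illustration; only the per-host
       counts come from the text above.

```python
# (machine, total CPUs, schedulable) transcribed from the example schema.
# server carries schedule=no, so it is marked as not schedulable.
NODES = [
    ("server.cluster.example.com", 1, False),
    ("beowulf1.cluster.example.com", 2, True),
    ("beowulf2.cluster.example.com", 2, True),
    ("somewhere.else.example.com", 1, True),
]


def place_ranks(nodes):
    """Return one machine name per process started by the "C" notation.

    Hypothetical helper: one process per CPU on every schedulable node,
    listed in host-file order (the ordering is an assumption).
    """
    placement = []
    for machine, cpus, schedulable in nodes:
        if schedulable:
            placement.extend([machine] * cpus)
    return placement


if __name__ == "__main__":
    for index, machine in enumerate(place_ranks(NODES)):
        print(index, machine)
```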

FILES

       $LAMHOME/etc/bhost.def            default boot schema file

SEE ALSO

       LAM User's Guide, lamboot(1), lamhalt(1), mpirun(1), MPI_Comm_spawn(3),
       MPI_Comm_spawn_multiple(3), recon(1), lamwipe(1)