pdcp - copy files to groups of hosts in parallel

NAME

       pdcp - copy files to groups of hosts in parallel
       rpdcp - (reverse pdcp) copy files from a group of hosts in parallel

SYNOPSIS

       pdcp [options]... src [src2...] dest
       rpdcp [options]... src [src2...] dir

DESCRIPTION

       pdcp  is  a variant of the rcp(1) command.  Unlike rcp(1), which copies
       files to a single remote host, pdcp can copy files to  multiple  remote
       hosts  in  parallel.   However,  pdcp  does  not recognize files in the
       format ‘‘rname@rhost:path’’; therefore, all source files  must  be  on
       the local host machine.  Destination nodes must be listed on the pdcp
       command line using a suitable target nodelist option (See  the  OPTIONS
       section  below).  Each destination node listed must have pdcp installed
       for the copy to succeed.

       When pdcp receives SIGINT (ctrl-C), it  lists  the  status  of  current
       threads.   A  second  SIGINT  within one second terminates the program.
       Pending threads may be canceled by issuing ctrl-Z within one second  of
       ctrl-C.  Pending threads are those that have not yet been initiated, or
       are still in the process of connecting to the remote host.

       Like  pdsh(1),  the  functionality  of  pdcp  may  be  supplemented  by
       dynamically  loadable  modules.  In pdcp, the modules may provide a new
       connect protocol (replacing the standard  rsh(1)  protocol),  filtering
       options  (e.g.  excluding  hosts  that are down), and/or host selection
       options (e.g. -a selects all nodes  from  a  local  config  file).   By
       default,  pdcp  requires  at  least  one "rcmd" module to be loaded (to
       provide the channel for remote copy).

REVERSE PDCP

       rpdcp performs a reverse parallel copy.  Rather than copying  files  to
       remote hosts, files are retrieved from remote hosts and stored locally.
       All directories or files retrieved will be  stored  with  their  remote
       hostname  appended  to  the  filename.   The  destination  must  be   a
       directory when rpdcp is used.

       In other respects, rpdcp is exactly like pdcp, and  further  statements
       regarding pdcp in this manual also apply to rpdcp.

RCMD MODULES

       The  method  by  which pdcp connects to remote hosts may be selected at
       runtime using the -R option (See OPTIONS below).  This functionality is
       ultimately  implemented  via  dynamically  loadable modules, and so the
       list of  available  options  may  be  different  from  installation  to
       installation.  A  list  of  currently available rcmd modules is printed
       when using any of the -h, -V, or -L options. The  default  rcmd  module
       will also be displayed with the -h and -V options.

       A list of rcmd modules currently distributed with pdcp follows.

       rsh     Uses  an internal, thread-safe implementation of BSD rcmd(3) to
               run commands using the standard rsh(1) protocol.

       ssh     Uses a variant of popen(3) to run multiple copies of the ssh(1)
               command.

       mrsh    This module uses the mrsh(1) protocol to execute jobs on remote
               hosts.    The   mrsh   protocol   uses   a   credential   based
               authentication,  forgoing  the need to allocate reserved ports.
               In other aspects, it acts just like rsh.

       krb4    The krb4 module allows users to execute remote  commands  after
               authenticating  with  Kerberos.  The  remote  rshd daemons must
               be kerberized.

       xcpu    The xcpu  module  uses  the  xcpu  service  to  execute  remote
               commands.

OPTIONS

       The  list  of  available  pdcp  options  is  determined  at  runtime by
       supplementing the list  of  standard  pdcp  options  with  any  options
       provided  by  loaded  rcmd  and  misc  modules.  In some cases, options
       provided by modules may conflict with each other. In these  cases,  the
       modules are incompatible and the first module loaded wins.

Standard target nodelist options

       -w host,host,...
              Target  the  specified  list of hosts. Do not use with any other
              node selection options (e.g. -a, -g if they are  available).  No
              spaces   are  allowed  in  the  comma-separated  list.   A  list
              consisting of a single ‘-’ character causes the target hosts  to
              be  read  from  stdin,  one  per line. The host list may contain
              hostlist expressions  of  the  form  ‘‘host[1-5,7]’’.  For  more
              information   about   the  hostlist  format,  see  the  HOSTLIST
              EXPRESSIONS section below.

       -x host,host,...
              Exclude the specified hosts. May  be  specified  in  conjunction
              with  other  target  node  list  options such as -a and -g (when
              available). Hostlists may also be specified  to  the  -x  option
              (see the HOSTLIST EXPRESSIONS section below).
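
       The rules above for -w and -x (comma-separated,  no  spaces,  or  a
       single  ‘-’  to  read  hosts  from stdin) can be sketched for a wrapper
       script as follows.  This is a minimal illustration only: the  function
       names  nodelist_args  and  stdin_nodelist are hypothetical helpers, not
       part of pdcp, and the code only builds arguments without running pdcp.

```python
def nodelist_args(targets, excluded=()):
    """Build '-w host,host,...' and optional '-x host,host,...' arguments.

    Hypothetical helper: it enforces the documented "no spaces" rule and
    leaves hostlist expressions such as 'foo[1-5,7]' untouched.
    """
    if not targets:
        raise ValueError("at least one target host is required")
    for host in (*targets, *excluded):
        if not host or any(ch.isspace() for ch in host):
            raise ValueError(f"invalid host entry: {host!r}")
    args = ["-w", ",".join(targets)]
    if excluded:
        args += ["-x", ",".join(excluded)]
    return args


def stdin_nodelist(targets):
    """Return the '-w -' arguments plus the text to feed on stdin.

    The stdin text lists one host per line, as the -w description requires.
    """
    return ["-w", "-"], "".join(f"{host}\n" for host in targets)


print(nodelist_args(["foo[1-5,7]", "bar2"], ["foo3"]))
print(stdin_nodelist(["foo1", "foo2"]))
```

       The returned list can be placed directly after the program name in  an
       argument vector, ahead of the source and destination arguments.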

Standard pdcp options

       -h     Output  usage  menu  and  quit. A list of available rcmd modules
              will be printed at the end of the usage message.

       -q     List option values and the  target  nodelist  and  exit  without
              action.

       -b     Disable the ctrl-C status feature so that a single ctrl-C  kills
              the parallel copy (batch mode).

       -r     Copy directories recursively.

       -p     Preserve modification time and modes.

       -e PATH
              Explicitly specify path to remote pdcp binary instead  of  using
              the locally executed path.

       -l user
              This  option  may be used to copy files as another user, subject
              to authorization. For BSD rcmd, this means the invoking user and
              system  must  be  listed  in  the  user's .rhosts file (even for
              root).

       -t seconds
              Set the connect timeout. Default is 10 seconds.

       -f number
              Set the maximum number of simultaneous remote copies to  number.
              The default is 32.

       -R name
              Set  rcmd  module  to  name. This option may also be set via the
              PDSH_RCMD_TYPE environment variable. A list  of  available  rcmd
              modules may be obtained via either the -h or -L options.

       -L     List info on all loaded pdcp modules and quit.

       -d     Include more complete thread status when SIGINT is received, and
              display connect and command time statistics on stderr when done.

       -V     Output pdcp version information, along with a list of  currently
              loaded modules, and exit.
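
       As a rough illustration of how the options above  combine,  the  sketch
       below  assembles  a  pdcp  argument  vector  in the order given in the
       SYNOPSIS (options first, then sources, then the destination).  The
       names  pdcp_command  and  env_with_rcmd  are hypothetical helpers; only
       the flags and the PDSH_RCMD_TYPE variable come from this manual,  and
       the code builds the invocation without executing it.

```python
import os


def pdcp_command(hosts, sources, dest, *, recursive=False, preserve=False,
                 timeout=None, fanout=None, rcmd=None):
    """Build an argv list following 'pdcp [options]... src [src2...] dest'."""
    if not hosts or not sources:
        raise ValueError("need at least one host and one source file")
    argv = ["pdcp", "-w", ",".join(hosts)]
    if recursive:
        argv.append("-r")                # copy directories recursively
    if preserve:
        argv.append("-p")                # preserve modification time and modes
    if timeout is not None:
        argv += ["-t", str(timeout)]     # connect timeout in seconds
    if fanout is not None:
        argv += ["-f", str(fanout)]      # maximum simultaneous remote copies
    if rcmd is not None:
        argv += ["-R", rcmd]             # rcmd module name
    return argv + list(sources) + [dest]


def env_with_rcmd(name, base=None):
    """Return a copy of the environment that selects an rcmd module.

    This is the environment-variable alternative to passing -R.
    """
    env = dict(os.environ if base is None else base)
    env["PDSH_RCMD_TYPE"] = name
    return env


argv = pdcp_command(["foo[1-4]"], ["/etc/hosts"], "/etc",
                    preserve=True, timeout=5, fanout=16, rcmd="ssh")
print(argv)
```

       On a system where pdcp is installed, the resulting list could be passed
       to subprocess.run(argv, env=...) without an intermediate shell.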

HOSTLIST EXPRESSIONS

       As noted in sections above, pdcp accepts ranges  of  hostnames  in  the
       general  form:  prefix[n-m,l-k,...], where n < m and l < k, etc., as an
       alternative to explicit lists  of  hosts.   This  form  should  not  be
       confused  with  regular  expression  character classes (also denoted by
       ‘‘[]’’). For example, foo[19] does not  represent  foo1  or  foo9,  but
       rather represents a degenerate range: foo19.

       This  range  syntax  is  meant only as a convenience on clusters with a
       prefixNN naming convention and specification of ranges  should  not  be
       considered  necessary -- the list foo1,foo9 could be specified as such,
       or by the range foo[1,9].
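
       The range syntax described above can be modelled with a  short  sketch.
       This  is  not  pdcp's  own  parser:  expand_hostlist  is a hypothetical
       function that handles a single prefix[...] expression, and it  assumes
       that  zero  padding  follows  the width of the lower bound (so 01-05
       yields 01, 02, ... 05).

```python
import re


def expand_hostlist(expr):
    """Expand one expression such as 'foo[01-05,7]' into a list of hostnames."""
    match = re.fullmatch(r"([^\[\]]*)\[([^\[\]]+)\]", expr)
    if match is None:
        return [expr]  # plain hostname, nothing to expand
    prefix, body = match.groups()
    hosts = []
    for part in body.split(","):
        low, _, high = part.partition("-")
        high = high or low   # "19" is the degenerate range 19-19
        width = len(low)     # assumption: padding follows the lower bound
        for n in range(int(low), int(high) + 1):
            hosts.append(f"{prefix}{n:0{width}d}")
    return hosts


print(expand_hostlist("foo[19]"))    # a range, not a character class
print(expand_hostlist("foo[1,9]"))
print(expand_hostlist("foo[01-05]"))
```

       Note that foo[19] expands to the single host foo19, which  matches  the
       degenerate-range behaviour described above.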

       Some examples of range usage follow:

       Copy /etc/hosts to foo01,foo02,...,foo05
           pdcp -w foo[01-05] /etc/hosts /etc

       Copy /etc/hosts to foo7,foo9,foo10
           pdcp -w foo[7,9-10] /etc/hosts /etc

       Copy /etc/hosts to foo0,foo4,foo5
           pdcp -w foo[0-5] -x foo[1-3] /etc/hosts /etc
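
       The effect of combining -w with -x in the last example can  be  checked
       with  a  small  sketch.   It assumes that exclusion behaves like an
       order-preserving set difference on the expanded  host  names;
       apply_exclusions  is  a hypothetical name and the host lists are
       written out by hand rather than parsed.

```python
def apply_exclusions(targets, excluded):
    """Drop every excluded host while keeping the original target order."""
    excluded_set = set(excluded)
    return [host for host in targets if host not in excluded_set]


targets = [f"foo{n}" for n in range(0, 6)]    # foo[0-5]
excluded = [f"foo{n}" for n in range(1, 4)]   # foo[1-3]
remaining = apply_exclusions(targets, excluded)
print(",".join(remaining))  # prints "foo0,foo4,foo5"
```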

       As a reminder to the reader, some shells will interpret  brackets  (’[’
       and  ’]’)  for  pattern  matching.   Depending on your shell, it may be
       necessary to enclose ranged lists within quotes.  For example, in tcsh,
       the first example above should be executed as:

           pdcp -w "foo[01-05]" /etc/hosts /etc
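
       When  pdcp  is  driven  from a script rather than typed at a prompt, the
       quoting concern above can be sidestepped by passing arguments as a  list
       (no  shell  is involved), or handled by letting a library do the
       quoting.  The sketch below uses Python's standard shlex module and only
       prints the command line; it does not run pdcp.

```python
import shlex

# Each list element is one argument, so the brackets are never seen by a shell.
argv = ["pdcp", "-w", "foo[01-05]", "/etc/hosts", "/etc"]

# shlex.join quotes any element containing shell metacharacters.
command_line = shlex.join(argv)
print(command_line)  # prints: pdcp -w 'foo[01-05]' /etc/hosts /etc
```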

ORIGIN

       Pdsh/pdcp  was  originally  a  rewrite  of  IBM  dsh(1)  by Jim Garlick
       <garlick@llnl.gov> on LLNL’s ASCI Blue-Pacific IBM SP  system.   It  is
       now also used on Linux clusters at LLNL.

LIMITATIONS

       When using ssh for remote execution, the stderr of ssh is folded in with
       that of the remote command.  When invoked by pdcp, it is  not  possible
       for  ssh  to  prompt for confirmation if a host key changes, prompt for
       passwords if RSA keys are not configured properly, etc.  Finally,  the
       connect  timeout  is  only  adjustable with ssh when the underlying ssh
       implementation supports it, and pdsh has been built to use the  correct
       option.
