NAME
NetPIPE - Network Protocol Independent Performance Evaluator
SYNOPSIS
NPtcp [-h receiver_hostname] [-b TCP_buffer_sizes] [options]
mpirun [-machinefile hostlist] -np 2 NPmpi [-a] [-S] [-z] [options]
mpirun [-machinefile hostlist] -np 2 NPmpi2 [-f] [-g] [options]
NPpvm [options]
See the TESTING sections below for a more complete description of how
to run NetPIPE in each environment. The OPTIONS section describes the
general options available for all modules. See the README file from
the tar-ball at http://www.scl.ameslab.gov/Projects/NetPIPE/ for
documentation on the InfiniBand, GM, SHMEM, LAPI, and memcpy modules.
DESCRIPTION
NetPIPE uses a simple series of ping-pong tests over a range of message
sizes to provide a complete measure of the performance of a network.
It bounces messages of increasing size between two processes, whether
across a network or within an SMP system. Message sizes are chosen at
regular intervals, and with slight perturbations, to provide a complete
evaluation of the communication system. Each data point involves many
ping-pong tests to provide an accurate timing. Latencies are
calculated by dividing the round-trip time in half for small messages
(less than 64 bytes).
The communication time for small messages is dominated by the overhead
in the communication layers, meaning that the transmission is latency
bound. For larger messages, the communication rate becomes bandwidth
limited by some component in the communication subsystem (PCI bus,
network card link, network switch).
These measurements can be done at the message-passing layer (MPI,
MPI-2, and PVM) or at the native communication layers that they run
upon (TCP/IP, GM for Myrinet cards, InfiniBand, SHMEM for Cray T3E
systems, and LAPI for IBM SP systems). Recent work has been aimed at
measuring some internal system properties, such as the memcpy module
that measures internal memory copy rates, or a disk module under
development that measures the performance of various I/O devices.
Some uses for NetPIPE include:
Comparing the latency and maximum throughput of various network
cards.
Comparing the performance of different types of networks.
Looking for inefficiencies in the message-passing layer by
comparing it to the native communication layer.
Optimizing the message-passing layer and tuning OS and driver
parameters for optimal performance of the communication
subsystem.
NetPIPE is provided with many modules allowing it to interface with a
wide variety of communication layers. It is fairly easy to write new
interfaces for other reliable protocols by using the existing modules
as examples.
TESTING TCP
NPtcp can be launched in two ways: by manually starting NPtcp on
both systems or by using the nplaunch script. To manually start NPtcp,
the NetPIPE receiver must be started first on the remote system using
the command:
NPtcp [options]
then the primary transmitter is started on the local system with the
command
NPtcp -h receiver_hostname [options]
Any options used must be the same on both sides.
The nplaunch script uses ssh to launch the remote receiver before
starting the local transmitter. To use rsh, simply change the nplaunch
script.
nplaunch NPtcp -h receiver_hostname [options]
The -b TCP_buffer_sizes option sets the TCP socket buffer size, which
can greatly influence the maximum throughput on some systems. A
throughput graph that flattens out suddenly may be a sign of the
performance being limited by the socket buffer sizes.
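For example, to run a TCP test with larger socket buffers (262144
bytes is used here only as an illustrative value; appropriate sizes
depend on the system) and write the results to tcp.out, start the
receiver with:
NPtcp -b 262144 -o tcp.out
and the transmitter with:
NPtcp -h receiver_hostname -b 262144 -o tcp.out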
TESTING MPI and MPI-2
Use of the MPI interface for NetPIPE depends on the MPI implementation
being used. All will require the number of processes to be specified,
usually with a -np 2 argument. Cluster environments may require a
list of the hosts being used, either during initialization of MPI
(during lamboot for LAM-MPI) or when each job is run (using a
-machinefile argument for MPICH). For LAM-MPI, for example, put the
list of hosts in hostlist then boot LAM and run NetPIPE using:
lamboot -v -b hostlist
mpirun -np 2 NPmpi [NetPIPE options]
For MPICH use a command like:
mpirun -machinefile hostlist -np 2 NPmpi [NetPIPE options]
To test the 1-sided communications of the MPI-2 standard, compile
using:
make mpi2
Run as described above, and MPI will use 1-sided MPI_Put() calls in
both directions, with each receiver blocking until the last byte has
been overwritten before bouncing the message back. Use the -f option
to force the use of a fence to block rather than overwriting the last
byte. The -g option will use MPI_Get() functions to transfer the data
rather than MPI_Put().
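For example, with MPICH the fence and get variants might be run as:
mpirun -machinefile hostlist -np 2 NPmpi2 -f [NetPIPE options]
mpirun -machinefile hostlist -np 2 NPmpi2 -g [NetPIPE options]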
TESTING PVM
Start the PVM system using:
pvm
and add a second machine with the PVM command
add receiver_hostname
Exit the PVM command line interface using quit, then run the PVM
NetPIPE receiver on one system with the command:
NPpvm [options]
and run the TCP NetPIPE transmitter on the other system with the
command:
NPpvm -h receiver_hostname [options]
Any options used must be the same on both sides. The nplaunch script
may also be used with NPpvm as described above for NPtcp.
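For example:
nplaunch NPpvm -h receiver_hostname [options]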
TESTING METHODOLOGY
NetPIPE tests network performance by sending a number of messages at
each block size, starting from the lower bound on the message sizes.
The message size is incremented until the upper bound on the message
size is reached or the time to transmit a block exceeds one second,
whichever occurs first. Message sizes are chosen at regular
intervals, and with slight perturbations from them, to provide a more
complete evaluation of the communication subsystem.
The NetPIPE output file may be graphed using a program such as
gnuplot(1). The output file contains three columns: the number of
bytes in the block, the transfer rate in bits per second, and the time
to transfer the block (half the round-trip time). The first two
columns are normally used to graph the throughput vs block size, while
the third column provides the latency. For example, the throughput
versus block size graph can be created by graphing bytes versus bits
per second. Sample gnuplot(1) commands for such a graph would be:
set logscale x
plot "np.out"
OPTIONS
-a Asynchronous mode: prepost receives (MPI, IB modules).
-b TCP_buffer_sizes
Set the send and receive TCP buffer sizes (TCP module only).
-B Burst mode where all receives are preposted at once (MPI, IB
modules).
-f Use a fence to block for completion (MPI2 module only).
-g Use MPI_Get() instead of MPI_Put() (MPI2 module only).
-h hostname
Specify the name of the receiver host to connect to (TCP, PVM,
IB, GM).
-I Invalidate cache to measure performance without cache effects
(mostly affects IB and memcpy modules).
-i Do an integrity check instead of a performance evaluation.
-l starting_msg_size
Specify the lower bound for the size of messages to be tested.
-n nrepeats
Set the number of repeats for each test to a constant.
Otherwise, the number of repeats is chosen to provide an
accurate timing for each test. Be careful when specifying
a low number: the time for the ping-pong test must exceed
the timer accuracy.
-O source_offset,dest_offset
Specify the source and destination offsets of the buffers
from perfect page alignment.
-o output_filename
Specify the output filename (default is np.out).
-p perturbation_size
NetPIPE chooses the message sizes at regular intervals,
increasing them exponentially from the lower boundary to
the upper boundary. At each point, it also tests
perturbations of 3 bytes above and 3 bytes below each
test point to find idiosyncrasies in the system. This
perturbation value can be changed using the -p option, or
turned off using -p 0.
-r This option resets the TCP sockets after every test (TCP
module only). It is necessary for some streaming tests
to get good measurements since the socket window size may
otherwise collapse.
-s Set streaming mode where data is only transmitted in one
direction.
-S Use synchronous sends (MPI module only).
-u upper_bound
Specify the upper boundary to the size of message being
tested. By default, NetPIPE will stop when the time to
transmit a block exceeds one second.
-z Receive messages using MPI_ANY_SOURCE (MPI module only).
-2 Set bi-directional mode where both sides send and receive
at the same time (supported by most modules). You may
need to use -a to choose asynchronous communications for
MPI to avoid freeze-ups. For TCP, the maximum test size
will be limited by the TCP buffer sizes.
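For example, a bi-directional MPI test using asynchronous
communications might be run as:
mpirun -np 2 NPmpi -a -2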
FILES
np.out Default output file for NetPIPE. Overridden by the -o
option.
AUTHOR
The original NetPIPE core plus TCP and MPI modules were written
by Quinn Snell, Armin Mikler, Guy Helmer, and John Gustafson.
NetPIPE is currently being developed and maintained by Dave
Turner with contributions from many students (Bogdan Vasiliu,
Adam Oline, Xuehua Chen, and Brian Smith).
Send comments/bug-reports to: <netpipe@scl.ameslab.gov>.
Additional information about NetPIPE can be found on the World
Wide Web at http://www.scl.ameslab.gov/Projects/NetPIPE/
BUGS
As of version 3.6.1, there is a bug that causes NetPIPE to
segfault on RedHat Enterprise systems. I will debug this as soon
as I get access to a few such systems. -Dave Turner
(turner@ameslab.gov)