NAME

       bw_mem - time memory bandwidth

SYNOPSIS

       bw_mem  [  -P  <parallelism> ] [ -W <warmups> ] [ -N <repetitions> ]
       size rd|wr|rdwr|cp|frd|fwr|fcp|bzero|bcopy [align]

DESCRIPTION

       bw_mem allocates twice the specified amount of memory,  zeros  it,  and
       then  times  the copying of the first half to the second half.  Results
       are reported in megabytes moved per second.
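
       The procedure described above can be sketched in C roughly as follows.
       This is a minimal illustration, not the lmbench source: the function
       name time_half_copy is hypothetical, and clock() stands in for whatever
       timer the real benchmark uses.

```c
#include <stddef.h>
#include <stdlib.h>
#include <string.h>
#include <time.h>

/* Hypothetical helper (not part of lmbench): allocate 2 * size bytes,
 * zero them, then time `reps` copies of the first half into the second
 * half.  Returns elapsed CPU seconds as measured by clock(), or -1.0 if
 * size is zero or the allocation fails. */
double time_half_copy(size_t size, int reps)
{
    unsigned char *buf;
    volatile unsigned char sink;  /* discourages discarding the copies */
    clock_t start, end;
    int i;

    if (size == 0)
        return -1.0;
    buf = malloc(2 * size);
    if (buf == NULL)
        return -1.0;
    memset(buf, 0, 2 * size);     /* zero both halves before timing */

    start = clock();
    for (i = 0; i < reps; i++)
        memcpy(buf + size, buf, size);  /* first half -> second half */
    end = clock();

    sink = buf[2 * size - 1];     /* read back one copied byte */
    (void)sink;
    free(buf);
    return (double)(end - start) / CLOCKS_PER_SEC;
}
```

       Dividing the bytes moved (size * reps) by the returned time gives a
       bandwidth figure.  An optimizing compiler may still simplify the loop,
       so treat any number from this sketch as indicative only.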

       The size specification may end with "k" or "m" to mean kilobytes  (*
       1024) or megabytes (* 1024 * 1024).
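
       As an illustration of the suffix rule, a size argument could be parsed
       along these lines.  The helper parse_size is hypothetical and handles
       only the two lowercase suffixes named above; the parser actually used
       by bw_mem may accept more forms.

```c
#include <stdlib.h>

/* Hypothetical helper (not part of lmbench): parse "8m", "512k" or
 * "4096" into a byte count.  Returns 0 for malformed input.  Overflow
 * is not checked in this sketch. */
unsigned long parse_size(const char *s)
{
    char *end;
    unsigned long n = strtoul(s, &end, 10);

    if (end == s)
        return 0;                        /* no digits at all */
    if (end[0] == 'k' && end[1] == '\0')
        return n * 1024UL;               /* kilobytes */
    if (end[0] == 'm' && end[1] == '\0')
        return n * 1024UL * 1024UL;      /* megabytes */
    if (end[0] == '\0')
        return n;                        /* plain byte count */
    return 0;                            /* unknown suffix */
}
```

       With this sketch, "8m" yields 8388608 bytes and "512k" yields 524288
       bytes.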

OUTPUT

       Output format is "%0.2f %.2f\n", megabytes, megabytes_per_second, for
       example:

       8.00 25.33
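
       A result line in this format could be produced as sketched below.  The
       helper format_result is hypothetical, and the sketch assumes that one
       megabyte is 1024 * 1024 bytes, matching the size suffix convention;
       the benchmark itself may define the unit differently.

```c
#include <stddef.h>
#include <stdio.h>

/* Hypothetical helper (not part of lmbench): write one result line,
 * "megabytes megabytes_per_second\n", into `out`.
 * Assumes 1 MB = 1024 * 1024 bytes and seconds > 0.
 * Returns the value returned by snprintf. */
int format_result(char *out, size_t outlen, size_t bytes, double seconds)
{
    double megabytes = (double)bytes / (1024.0 * 1024.0);
    double megabytes_per_second = megabytes / seconds;

    return snprintf(out, outlen, "%0.2f %.2f\n",
                    megabytes, megabytes_per_second);
}
```

       For instance, 8388608 bytes moved in 0.5 seconds is formatted as
       "8.00 16.00".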

       There are nine  different  memory  benchmarks  in  bw_mem.   They  each
       measure  slightly  different  methods  for  reading, writing or copying
       data.

       rd     measures the time to read data into the processor.  It  computes
              the sum of an array of integer values.  It accesses every fourth
              word.

       wr     measures the time  to  write  data  to  memory.   It  assigns  a
              constant value to each member of an array of integer values.  It
              accesses every fourth word.

       rdwr   measures the time to read data into the processor and then write
              data back to the same memory location.  For each element  in  an
              array it adds the current value to a running sum before assigning
              a new (constant) value to the element.  It accesses every fourth
              word.

       cp     measures the time to copy data from one location to another.  It
              does  an  array  copy:  dest[i]  = source[i].  It accesses every
              fourth word.

       frd    measures the time to read data into the processor.  It  computes
              the sum of an array of integer values.

       fwr    measures  the  time  to  write  data  to  memory.   It assigns a
              constant value to each member of an array of integer values.

       fcp    measures the time to copy data from one location to another.  It
              does an array copy: dest[i] = source[i].

       bzero  measures how fast the system can bzero memory.

       bcopy  measures how fast the system can bcopy data.
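
       The strided loops described above can be sketched as follows.  This is
       an illustration only: the function names are hypothetical, a "word" is
       assumed to be an int, and the real benchmark loops may be unrolled or
       otherwise arranged differently.  The frd, fwr and fcp variants would
       correspond to the same loops with a stride of 1.

```c
#include <stddef.h>

#define STRIDE 4   /* "every fourth word"; a word is assumed to be an int */

/* rd: sum every fourth element. */
long rd_sum(const int *a, size_t n)
{
    long sum = 0;
    size_t i;

    for (i = 0; i < n; i += STRIDE)
        sum += a[i];
    return sum;
}

/* wr: assign a constant to every fourth element. */
void wr_fill(int *a, size_t n, int value)
{
    size_t i;

    for (i = 0; i < n; i += STRIDE)
        a[i] = value;
}

/* rdwr: add each visited element to a running sum, then overwrite it. */
long rdwr_sum_fill(int *a, size_t n, int value)
{
    long sum = 0;
    size_t i;

    for (i = 0; i < n; i += STRIDE) {
        sum += a[i];
        a[i] = value;
    }
    return sum;
}

/* cp: dest[i] = source[i] for every fourth element. */
void cp_copy(int *dest, const int *source, size_t n)
{
    size_t i;

    for (i = 0; i < n; i += STRIDE)
        dest[i] = source[i];
}
```

       Returning the sum from the read loops gives the caller something to
       consume, which helps keep a compiler from removing the reads.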

MEMORY UTILIZATION

       This  benchmark can move up to three times the requested memory.  Bcopy
       will use 2-3 times as much memory bandwidth: there is one read from the
       source and a write to the destination.  The write usually results in a
       cache line read and then a write back of the cache line at  some  later
       point.   Memory  utilization  might  be reduced by 1/3 if the processor
       architecture implemented "load cache line" and "store  cache  line"
       instructions (as well as "getcachelinesize").
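
       The 2-3 times figure follows from simple accounting, sketched here.
       The helper bcopy_traffic and its write_allocate flag are hypothetical
       names used only to illustrate the arithmetic; real traffic depends on
       the cache hierarchy.

```c
#include <stddef.h>

/* Hypothetical helper (not part of lmbench): estimate the bytes of
 * memory traffic caused by copying `size` bytes.  A nonzero
 * write_allocate models the case where each destination cache line is
 * read before it is written. */
size_t bcopy_traffic(size_t size, int write_allocate)
{
    size_t reads  = size;                       /* read the source        */
    size_t writes = size;                       /* write the destination  */
    size_t fills  = write_allocate ? size : 0;  /* cache line reads caused
                                                   by the writes          */
    return reads + writes + fills;
}
```

       With the extra cache line read the estimate is 3 * size; without it
       the estimate is 2 * size, which is the reduction by 1/3 mentioned
       above.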

SEE ALSO

       lmbench(8).

AUTHOR

       Carl Staelin and Larry McVoy

       Comments, suggestions, and bug reports are always welcome.

(c)1994-2000 Larry McVoy and Carl Staelin