NAME

       bzz - DjVu general-purpose compression utility

SYNOPSIS

   Encoding:
       bzz -e[blocksize] inputfile outputfile

   Decoding:
       bzz -d inputfile outputfile

DESCRIPTION

       The first form of the command line (option -e) compresses the data
       from file inputfile and writes the compressed data into outputfile.
       The second form of the command line (option -d) decompresses file
       inputfile and writes the output to outputfile.

OPTIONS

       -d     Decoding mode.

       -e[blocksize]
              Encoding mode.  The optional argument blocksize specifies the
              size, in kilobytes, of the input file blocks processed by the
              Burrows-Wheeler transform.  The default block size is 2048 KB.
              The maximum block size is 4096 KB.  Specifying a larger block
              size usually produces higher compression ratios and increases
              the memory requirements of both the encoder and the decoder.
              Specifying a block size larger than the input file brings no
              benefit.
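
       As an illustration of the option syntax above, here is a minimal
       Python sketch that builds bzz argument lists.  The helper names
       (bzz_encode_argv, bzz_decode_argv) are hypothetical and not part of
       bzz.  The upper bound comes from this page; rejecting non-positive
       block sizes is an assumption made for the sketch.

```python
from typing import List, Optional

# Maximum block size (in KB) stated in this manual page.
MAX_BLOCKSIZE_KB = 4096


def bzz_encode_argv(inputfile: str, outputfile: str,
                    blocksize_kb: Optional[int] = None) -> List[str]:
    """Build the argument list for `bzz -e[blocksize] inputfile outputfile`."""
    if blocksize_kb is None:
        flag = "-e"  # let bzz use its default block size
    else:
        # Assumption: non-positive sizes are treated as a caller error here.
        if not 0 < blocksize_kb <= MAX_BLOCKSIZE_KB:
            raise ValueError("block size must be between 1 and 4096 KB")
        flag = f"-e{blocksize_kb}"  # the size is attached directly to -e
    return ["bzz", flag, inputfile, outputfile]


def bzz_decode_argv(inputfile: str, outputfile: str) -> List[str]:
    """Build the argument list for `bzz -d inputfile outputfile`."""
    return ["bzz", "-d", inputfile, outputfile]
```

       On a system where bzz is installed, the resulting list could be
       passed to subprocess.run(argv, check=True).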

ALGORITHMS

       The  Burrows-Wheeler  transform is performed using a combination of the
       Karp-Miller-Rosenberg and the  Bentley-Sedgewick  algorithms.  This  is
       comparable  to (Sadakane, DCC 98) with a slightly more flexible ranking
       scheme. Symbols are then ordered according to  a  running  estimate  of
       their  occurrence frequencies.  The symbol ranks are then coded using a
       simple fixed tree and the ZP binary adaptive coder (Bottou, DCC 98).
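
       To make the transform itself concrete, the following Python sketch
       computes a Burrows-Wheeler transform and its inverse by naively
       sorting all rotations of a block.  This is not the method bzz uses
       (the Karp-Miller-Rosenberg and Bentley-Sedgewick combination named
       above); it only shows what the transform produces.  The function
       names and the (last column, primary index) output convention are
       assumptions of the sketch, and the frequency-based ranking and ZP
       coding stages are not shown.

```python
from typing import Tuple


def bwt(block: bytes) -> Tuple[bytes, int]:
    """Naive Burrows-Wheeler transform of one block.

    Returns the last column of the sorted rotation matrix and the row
    index at which the original block appears.
    """
    n = len(block)
    if n == 0:
        return b"", 0
    # Sort rotation start offsets by the rotation they denote.
    rotations = sorted(range(n), key=lambda i: block[i:] + block[:i])
    # The last byte of the rotation starting at i is block[i - 1].
    last = bytes(block[(i - 1) % n] for i in rotations)
    return last, rotations.index(0)


def inverse_bwt(last: bytes, primary: int) -> bytes:
    """Invert bwt() given the last column and the primary row index."""
    n = len(last)
    if n == 0:
        return b""
    # A stable sort of positions by byte value maps each row of the
    # first column to the row holding the same byte in the last column.
    order = sorted(range(n), key=lambda i: last[i])
    out = bytearray()
    row = primary
    for _ in range(n):
        row = order[row]
        out.append(last[row])
    return bytes(out)
```

       The sort above copies every rotation, so its memory use grows
       quadratically with the block size; it is only practical for very
       small inputs.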

       The Burrows-Wheeler transform is also used in the well-known
       compressor bzip2.  What distinguishes bzz is its use of the ZP
       adaptive coder.  The adaptation noise can cost up to 5 percent in
       file size, but this penalty is usually offset by the benefits of
       adaptation.

PERFORMANCE

       The following table shows comparative results (in bits per character)
       on the Canterbury Corpus (http://corpus.canterbury.ac.nz).  The very
       good performance of bzz on the spreadsheet file excl puts its weighted
       average ahead of that of much more sophisticated compressors such as
       fsmx.

+-------------------------------------------------------------------------------------------------------------+
|                                          Compression performance                                            |
|             text   fax    csrc   excl   sprc   tech   poem   html   lisp   man    play   Weighted   Average |
+-------------------------------------------------------------------------------------------------------------+
| compress    3.27   0.97   3.56   2.41   4.21   3.06   3.38   3.68   3.90   4.43   3.51     2.55      3.31   |
| gzip -9     2.85   0.82   2.24   1.63   2.67   2.71   3.23   2.59   2.65   3.31   3.12     2.08      2.53   |
| bzip2 -9    2.27   0.78   2.18   1.01   2.70   2.02   2.42   2.48   2.79   3.33   2.53     1.54      2.23   |
| ppmd        2.31   0.99   2.11   1.08   2.68   2.19   2.48   2.38   2.43   3.00   2.53     1.65      2.20   |
| fsmx        2.10   0.79   1.89   1.48   2.52   1.84   2.21   2.24   2.29   2.91   2.35     1.63      2.06   |
| bzz         2.25   0.76   2.13   0.78   2.67   2.00   2.40   2.52   2.60   3.19   2.52     1.44      2.16   |
+-------------------------------------------------------------------------------------------------------------+

       Note that  DjVu  contributors  have  several  entries  in  this  table.
       Program  compress was written some time ago by Joe Orost.  Program ppmd
       is an improvement of the PPM-C method invented by Paul Howard.
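
       For readers who want to reproduce figures in the unit used by the
       table above, here is a small Python sketch of the bits-per-character
       measure.  It assumes that one character is one 8-bit byte of the
       original file, and that the Average column is the unweighted mean of
       the eleven per-file columns; the rounded table entries appear
       consistent with the latter, but this page does not state it.  The
       function names are hypothetical.

```python
from typing import Sequence

# Per-file bzz results copied from the table (text ... play).
BZZ_ROW = [2.25, 0.76, 2.13, 0.78, 2.67, 2.00, 2.40, 2.52, 2.60, 3.19, 2.52]


def bits_per_character(original_bytes: int, compressed_bytes: int) -> float:
    """Compressed size in bits divided by the original size in bytes."""
    if original_bytes <= 0:
        raise ValueError("original size must be positive")
    return 8.0 * compressed_bytes / original_bytes


def plain_average(values: Sequence[float]) -> float:
    """Unweighted mean, assumed here to correspond to the Average column."""
    return sum(values) / len(values)
```

       A file compressed to one quarter of its size scores 2.0 bits per
       character under this definition.  Because the table entries are
       already rounded, a mean recomputed from them can differ from the
       listed Average in the last digit.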

CREDITS

       Program bzz was written by  Leon  Bottou  <leonb@users.sourceforge.net>
       and  was  then  improved  by Andrei Erofeev <andrew_erofeev@yahoo.com>,
       Bill Riemers <docbill@sourceforge.net> and many others.

SEE ALSO

       djvu(1), compress(1), gzip(1), bzip2(1)