NAME
wzip - lossy data compression and denoising
SYNOPSIS
wzip [ -c | -d | -dn | -hdn ] num sf
DESCRIPTION
This manual page documents the wzip command.
wzip is a program for LOSSY data compression and
denoising. It reads from STDIN and writes to STDOUT. In compression
mode the input is a sequence of ASCII floating-point values, and num
is the number of these values. The output is a sequence of small
integers, most of them zero in typical applications. This sequence
compresses well with a standard lossless compression program such as
gzip.
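The following sketch illustrates why a sequence of mostly-zero small
integers is a good input for a lossless compressor. It does not call
wzip. The integer sequence is a made-up stand-in, and the
one-value-per-line layout is an assumption, since the exact output
layout of wzip -c is not described here.

```python
import gzip

# Hypothetical stand-in for "wzip -c" output: 1000 small integers,
# most of them zero. The one-per-line layout is an assumption.
values = [0] * 1000
for i in range(0, 1000, 50):
    values[i] = (i % 7) - 3  # a few small nonzero entries

raw = "\n".join(str(v) for v in values).encode("ascii")
packed = gzip.compress(raw)

print(len(raw), "bytes before gzip")
print(len(packed), "bytes after gzip")
```

The compressed size depends on the data. The point is only that long
runs of zeros leave little for gzip to store.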
The program can also be used for denoising. In this case both input and
output are sequences of ASCII floating-point values.
The scale factor sf determines the strength of compression or
denoising. A higher scale factor means heavier compression and
stronger denoising. Four times the standard deviation of the noise
content is a good starting value. If the noise level is not known,
5 percent of the overall signal amplitude can serve as a first
estimate of a suitable scale factor.
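The two starting values for sf can be computed as in the following
sketch. The function names are hypothetical. Two assumptions are made:
the noise standard deviation is estimated from a stretch of data that
contains no signal, and "overall signal amplitude" is read as the
peak-to-peak range.

```python
import statistics

def sf_from_noise(flat_region):
    """Four times the sample standard deviation of a signal-free stretch.

    Assumption: flat_region holds only noise around a constant level.
    """
    return 4.0 * statistics.stdev(flat_region)

def sf_from_amplitude(data):
    """Five percent of the peak-to-peak range of the data.

    Assumption: 'overall signal amplitude' means max(data) - min(data).
    """
    return 0.05 * (max(data) - min(data))
```

Either value is only a first guess. It usually needs adjusting after
inspecting the result.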
If the noise in the input data is strongly non-Gaussian, such as
Poisson noise, the input data should first be transformed so that the
noise is approximately Gaussian. Poisson-distributed input values, for
example raw counts per channel in EDX or XPD, can be transformed by
applying y:=2.0*sqrt(x+0.25109) to each data point x before running
wzip. The back transformation, applied to each output value x of wzip,
is y:=(x/2)^2. The summand 0.25109 compensates for the bias caused by
the asymmetry of the Poisson distribution.
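The transformation for Poisson-distributed counts and its back
transformation can be sketched as follows. The function names and the
constant name OFFSET are hypothetical. The formulas are the ones given
in the text.

```python
import math

OFFSET = 0.25109  # summand from the text

def to_gaussian(counts):
    """Forward transformation: y = 2.0 * sqrt(x + 0.25109)."""
    return [2.0 * math.sqrt(x + OFFSET) for x in counts]

def from_gaussian(values):
    """Back transformation: y = (x / 2)^2."""
    return [(y / 2.0) ** 2 for y in values]
```

The forward transformation is applied before the data are piped into
wzip, the back transformation to the values that wzip writes. Applied
directly one after the other, with no wzip step in between, the two
functions return each count plus 0.25109, because the back
transformation as given does not subtract the summand.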
Invoking the program without any options writes examples of the use of
the program to STDERR.
OPTIONS
Exactly one option must be given.
-c Compression, reads num ASCII floating-point values from STDIN
and writes a highly redundant sequence of integers to
STDOUT.
-d Decompression, reads from STDIN and writes a sequence of num
ASCII floating-point values to STDOUT. These approximate the
original data; the compression is lossy.
-dn Denoising, reads num ASCII floating-point values from STDIN and
writes a sequence of num ASCII floating-point values to STDOUT.
These approximate the original data with the noise reduced.
-hdn Denoising with hard thresholding instead of wavelet shrinkage.
Isolated noise peaks may pass through untouched in this mode. On
the other hand, the signal slope is affected much less.
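The difference between wavelet shrinkage (soft thresholding) and hard
thresholding can be sketched on a list of coefficients as follows.
This is not the wzip source. The function names and the threshold t
are hypothetical, and neither the wavelet used by wzip nor the
relation between sf and the threshold is stated here.

```python
def soft_threshold(coeffs, t):
    """Shrinkage: zero small coefficients, pull the rest toward zero by t."""
    out = []
    for c in coeffs:
        mag = abs(c) - t
        if mag <= 0:
            out.append(0.0)
        else:
            out.append(mag if c > 0 else -mag)
    return out

def hard_threshold(coeffs, t):
    """Hard thresholding: zero small coefficients, keep the rest unchanged."""
    return [c if abs(c) > t else 0.0 for c in coeffs]
```

Hard thresholding leaves every surviving coefficient at its full size.
This is why steep signal slopes are affected less, and also why a
noise peak that exceeds the threshold passes through at full height.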
SEE ALSO
Donoho, D.L.; Johnstone, I.M.: Adapting to unknown smoothness via
wavelet shrinkage, technical report 425, Department of Statistics,
Stanford University, Stanford, June 1993,
ftp://playfair.stanford.edu/pub/donoho/ausws.ps.Z
Franzen, A.: Compression of process data with a wavelet method, steel
res. 69 (1998), No. 1, pp. 28-30
Franzen, A.: Non-linear denoising with wavelet transformation, Z.
Metallkd. 89 (1998), No. 4, pp. 297-302
AUTHOR
This manual page was written by Andreas Franzen <anfra@debian.org>, for
the Debian GNU/Linux system (but may be used by others).
Copyright (C) 1997 Andreas Franzen, placed under the GNU General Public
License, see the file copyright for details.
24 December 1997