NAME
unifuzz - Emit strings designed to test Unicode handling
SYNOPSIS
unifuzz [option flags]
DESCRIPTION
unifuzz emits strings designed to test how programs that accept Unicode
input handle unexpected input. These include characters from all Unicode
ranges, Private Use characters, surrogates, undefined characters,
non-characters, control characters, exotic space characters, sequences
violating normalization rules, unexpected sequences (e.g. a base
character from one range followed by a combining character from another
range), and long sequences of combining characters. It can also generate
very long lines, strings containing embedded nulls, and ill-formed UTF-8.
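For readers unfamiliar with these categories, the following Python sketch
labels a few of them and builds sample strings of the kinds described
above. It is an illustration only and not unifuzz's implementation: the
names classify and samples are hypothetical, and the particular characters
chosen are assumptions about what a test string of each kind might look
like.

```python
import unicodedata

def classify(cp: int) -> str:
    """Return a coarse label for a code point (illustrative only)."""
    # Noncharacters: U+FDD0..U+FDEF and the last two code points of each plane.
    if 0xFDD0 <= cp <= 0xFDEF or (cp & 0xFFFE) == 0xFFFE:
        return "noncharacter"
    category = unicodedata.category(chr(cp))
    labels = {
        "Cs": "surrogate",
        "Co": "private use",
        "Cn": "unassigned",
        "Cc": "control",
        "Zs": "space",
    }
    return labels.get(category, "other")

# Hypothetical sample strings, one per kind of awkward input.
samples = {
    # Latin base letter followed by a Hebrew combining point.
    "cross-range combining": "a\u05B4",
    # One base letter carrying 1000 combining acute accents.
    "long combining run": "a" + "\u0301" * 1000,
    # Dot above (class 230) before dot below (class 220): not in canonical order.
    "non-canonical mark order": "q\u0307\u0323",
    # A NUL character in the middle of otherwise ordinary text.
    "embedded null": "abc\x00def",
}

if __name__ == "__main__":
    for cp in (0x0000, 0x2003, 0xD800, 0xE000, 0xFDD0, 0xFFFF):
        print(f"U+{cp:04X}: {classify(cp)}")
    for name, text in samples.items():
        changed = unicodedata.normalize("NFC", text) != text
        print(f"{name}: {len(text)} code points, changed by NFC: {changed}")
```

A program under test would be expected to accept, reject, or sanitize
such strings without crashing or corrupting neighbouring data.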
COMMAND LINE FLAGS
-b Restrict the output to the Basic Multilingual Plane (Plane 0).
-g Do not emit specific characters.
-h Print usage information.
-l Emit very long lines.
-n Emit a string with embedded nulls.
-q Be quiet. Omit commentary.
-r <number> Set the number of random characters to emit.
-S Scan ranges - emit a character from each range.
-s <seed> Set the seed for the random number generator.
-u Emit ill-formed UTF-8.
-v Print version information.
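As background for the -u and -n flags, the Python sketch below lists a
few byte sequences that are ill-formed UTF-8 and checks them with a
strict decoder. The byte sequences are assumptions chosen for
illustration and are not necessarily the ones unifuzz emits; the names
ILL_FORMED and is_well_formed_utf8 are hypothetical.

```python
# Byte sequences that a conforming UTF-8 decoder must reject.
ILL_FORMED = {
    "lone continuation byte": b"\x80",
    "truncated 3-byte sequence": b"\xe2\x82",
    "overlong encoding of '/'": b"\xc0\xaf",
    "encoded surrogate U+D800": b"\xed\xa0\x80",
    "value beyond U+10FFFF": b"\xf4\x90\x80\x80",
}

def is_well_formed_utf8(data: bytes) -> bool:
    """Return True if data decodes under Python's strict UTF-8 decoder."""
    try:
        data.decode("utf-8")  # errors="strict" is the default
        return True
    except UnicodeDecodeError:
        return False

if __name__ == "__main__":
    for name, raw in ILL_FORMED.items():
        print(f"{name}: well-formed = {is_well_formed_utf8(raw)}")
    # An embedded null is well-formed UTF-8; it is troublesome for other
    # reasons, such as code that treats NUL as a string terminator.
    print("embedded null: well-formed =", is_well_formed_utf8(b"abc\x00def"))
```

The same check can be applied to whatever bytes a program under test
writes back, to see whether ill-formed input was passed through unchanged.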
The sequence of random characters is produced by a pseudorandom
number generator, so the same sequence can be reproduced by setting the
seed to the same value. If no seed is given on the command line, one is
chosen based on the time of execution. The seed used is reported in the
output in a line of the form "Seed = NNNNNN" immediately preceding the
random character sequence. To reproduce a sequence, the -b setting
(restriction of output to the BMP) must also match the original run.
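The dependence on both the seed and the BMP setting can be sketched in
Python as follows. This is not unifuzz's generator: random_code_points
and parse_seed are hypothetical helpers, Python's random module stands in
for whatever generator unifuzz uses, and the sketch assumes the seed in
the "Seed = NNNNNN" line is a decimal integer.

```python
import random
import re
from typing import List, Optional

BMP_MAX = 0xFFFF        # last code point of Plane 0
UNICODE_MAX = 0x10FFFF  # last code point of the Unicode code space

def random_code_points(seed: int, count: int, bmp_only: bool) -> List[int]:
    """Draw count code points from a generator seeded with seed."""
    rng = random.Random(seed)
    upper = BMP_MAX if bmp_only else UNICODE_MAX
    # The range drawn from depends on bmp_only, so the same seed yields a
    # different sequence when that setting changes.
    return [rng.randint(0, upper) for _ in range(count)]

def parse_seed(output: str) -> Optional[int]:
    """Extract the seed from a line of the form 'Seed = NNNNNN', if present."""
    match = re.search(r"^Seed = (\d+)$", output, flags=re.MULTILINE)
    return int(match.group(1)) if match else None

if __name__ == "__main__":
    first = random_code_points(12345, 5, bmp_only=True)
    again = random_code_points(12345, 5, bmp_only=True)
    other = random_code_points(12345, 5, bmp_only=False)
    print("same seed, same setting -> identical:", first == again)
    print("same seed, other setting -> identical:", first == other)
    print("recovered seed:", parse_seed("Seed = 12345\n..."))
```

Recording the reported seed together with the BMP setting is therefore
enough to replay a run that exposed a problem.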
REFERENCES
Unicode Standard, version 5.0
AUTHOR
Bill Poser
billposer@alum.mit.edu
LICENSE
GNU General Public License
April, 2008 unifuzz(1)