NAME

       pvanal - Converts a soundfile into a series of short-time Fourier
       transform frames.

DESCRIPTION

       Fourier analysis for the Csound pvoc generator

SYNTAX

           csound -U pvanal [flags] infilename outfilename

           pvanal [flags] infilename outfilename

PVANAL EXTENSION TO CREATE A PVOC-EX FILE

       The standard Csound utility program pvanal has been extended to enable
       a PVOC-EX format file to be created, using the existing interface. To
       create a PVOC-EX file, the file name must be given the required
       extension, “.pvx”, e.g. “test.pvx”. The requirement for the FFT size to
       be a power of two is here relaxed, and any positive value is accepted;
       odd numbers are rounded up internally. However, power-of-two sizes are
       still to be preferred for all normal applications.

       The channel select flags are ignored, and all source channels will be
       analysed and written to the output file, up to a compiler-set limit of
       eight channels. The analysis window size (iwinsize) is set internally
       to double the FFT size.
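
       The sizing rules above can be sketched as follows. This is only an
       illustration, not the pvanal source: the function name is
       hypothetical, and the sketch assumes that "rounded up" means an odd
       size becomes the next even integer.

```python
def pvocex_sizes(requested_fft_size):
    """Return (fft_size, window_size) under the rules described above.

    Assumption: an odd request is rounded up to the next even integer.
    """
    if requested_fft_size <= 0:
        raise ValueError("FFT size must be positive")
    # Adds 1 to odd values and leaves even values unchanged.
    fft_size = requested_fft_size + (requested_fft_size % 2)
    # The analysis window is set to double the FFT size.
    window_size = 2 * fft_size
    return fft_size, window_size


print(pvocex_sizes(1024))
print(pvocex_sizes(999))
```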

INITIALIZATION

       pvanal converts a soundfile into a series of short-time Fourier
       transform (STFT) frames at regular timepoints (a frequency-domain
       representation). The output file can be used by pvoc to generate audio
       fragments based on the original sample, with timescales and pitches
       arbitrarily and dynamically modified. Analysis is conditioned by the
       flags below. A space is optional between the flag and its argument.

       -s srate -- sampling rate of the audio input file. This overrides the
       srate given in the soundfile header, which otherwise applies. If
       neither is present, the default is 10000.

       -c channel -- channel number sought. The default is 1.

       -b begin -- beginning time (in seconds) of the audio segment to be
       analyzed. The default is 0.0.

       -d duration -- duration (in seconds) of the audio segment to be
       analyzed. The default of 0.0 means to the end of the file.

       -n frmsiz -- STFT frame size, the number of samples in each Fourier
       analysis frame. Must be a power of two, in the range 16 to 16384. For
       clean results, a frame must be larger than the longest pitch period of
       the sample. However, very long frames result in temporal "smearing" or
       reverberation. The bandwidth of each STFT bin is determined by sampling
       rate / frame size. The default framesize is the smallest power of two
       that corresponds to more than 20 milliseconds of the source (e.g. 256
       points at 10 kHz sampling, giving a 25.6 ms frame).
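
       The default frame size rule and the bin bandwidth can be worked
       through in a few lines. This is a sketch of the arithmetic described
       above, not code taken from pvanal; the function names are
       hypothetical.

```python
def default_frame_size(srate, min_ms=20.0):
    """Smallest power of two lasting more than min_ms at the given rate."""
    size = 1
    while size * 1000.0 / srate <= min_ms:
        size *= 2
    return size


def bin_bandwidth_hz(srate, frame_size):
    """Bandwidth of each STFT bin: sampling rate / frame size."""
    return srate / frame_size


for srate in (10000, 44100):
    n = default_frame_size(srate)
    frame_ms = 1000.0 * n / srate
    print(srate, n, round(frame_ms, 1), round(bin_bandwidth_hz(srate, n), 2))
```

       At 10 kHz this gives 256 points and a 25.6 ms frame, matching the
       example in the text.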

       -w windfact -- Window overlap factor. This controls the number of
       Fourier transform frames per second. Csound's pvoc will interpolate
       between frames, but too few frames will generate audible distortion;
       too many frames will result in a huge analysis file. A good compromise
       for windfact is 4, meaning that each input point occurs in 4 output
       windows, or conversely that the offset between successive STFT frames
       is framesize/4. The default value is 4. Do not use this flag with -h.

       -h hopsize -- STFT frame offset. The converse of -w: it specifies the
       increment in samples between successive frames of analysis (see also
       lpanal). Do not use with -w.
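
       The relationship between the overlap factor, the hop size and the
       resulting frame rate can be sketched as below. The function names are
       hypothetical, and the integer division is an assumption for the case
       where windfact does not divide the frame size evenly.

```python
def hop_from_overlap(frame_size, windfact=4):
    """Offset in samples between successive frames: framesize / windfact."""
    return frame_size // windfact


def overlap_from_hop(frame_size, hopsize):
    """Number of analysis windows that each input point falls into."""
    return frame_size / hopsize


def frames_per_second(srate, hopsize):
    """One new frame starts every hopsize samples."""
    return srate / hopsize


hop = hop_from_overlap(256, 4)
print(hop, overlap_from_hop(256, hop), frames_per_second(10000, hop))
```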

       -H -- Use a Hamming window instead of the default von Hann window.

       -K -- Use a Kaiser window instead of the default von Hann window. The
       Kaiser parameter default is 6.8, but can be set with the -B option.

       -B beta -- Set the beta parameter for any Kaiser window used to the
       floating point value beta.
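
       For comparison, the three window shapes can be computed as below.
       This page does not give the exact definitions used inside pvanal, so
       the sketch assumes the common textbook forms of the von Hann, Hamming
       and Kaiser windows; all function names are hypothetical.

```python
import math


def bessel_i0(x, terms=30):
    """Zeroth-order modified Bessel function, summed from its power series."""
    total, term = 1.0, 1.0
    for k in range(1, terms):
        term *= (x / (2.0 * k)) ** 2
        total += term
    return total


def hann(n, size):
    """Textbook von Hann window value at sample n of a size-point window."""
    return 0.5 - 0.5 * math.cos(2.0 * math.pi * n / (size - 1))


def hamming(n, size):
    """Textbook Hamming window value at sample n of a size-point window."""
    return 0.54 - 0.46 * math.cos(2.0 * math.pi * n / (size - 1))


def kaiser(n, size, beta=6.8):
    """Textbook Kaiser window; 6.8 is the default beta quoted above."""
    r = 2.0 * n / (size - 1) - 1.0  # position mapped to the range -1..1
    return bessel_i0(beta * math.sqrt(max(0.0, 1.0 - r * r))) / bessel_i0(beta)


size = 9
for n in range(size):
    print(n, round(hann(n, size), 4), round(hamming(n, size), 4),
          round(kaiser(n, size), 4))
```

       A larger beta narrows the Kaiser window, lowering its edge values
       relative to the centre.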

EXAMPLES

           pvanal asound pvfile

       will analyze the soundfile "asound" using the default frmsiz and
       windfact to produce the file "pvfile" suitable for use with pvoc.
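
       When pvanal is driven from a script, the flag rules above can be
       checked before the command is run. The helper below is a hypothetical
       sketch: it only builds an argument list from the flags documented on
       this page and does not invoke pvanal itself.

```python
def is_power_of_two(n):
    return n > 0 and (n & (n - 1)) == 0


def build_pvanal_args(infile, outfile, frmsiz=None, windfact=None,
                      hopsize=None, begin=None, duration=None):
    """Build a pvanal command line as a list of strings."""
    if windfact is not None and hopsize is not None:
        raise ValueError("-w and -h must not be used together")
    if frmsiz is not None:
        if outfile.endswith(".pvx"):
            # PVOC-EX output accepts any positive FFT size.
            if frmsiz <= 0:
                raise ValueError("frame size must be positive")
        elif not (is_power_of_two(frmsiz) and 16 <= frmsiz <= 16384):
            raise ValueError("frame size must be a power of two, 16 to 16384")
    args = ["pvanal"]
    for flag, value in (("-n", frmsiz), ("-w", windfact), ("-h", hopsize),
                        ("-b", begin), ("-d", duration)):
        if value is not None:
            args += [flag, str(value)]
    return args + [infile, outfile]


print(build_pvanal_args("asound", "pvfile"))
print(build_pvanal_args("asound", "pvfile", frmsiz=1024, hopsize=256))
```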

   Files
       The output file has a special pvoc header containing details of the
       source audio file, the analysis frame rate and overlap. Frames of
       analysis data are stored as float, with the magnitude and “frequency”
       (in Hz) for the first N/2 + 1 Fourier bins of each frame in turn.
       “Frequency” encodes the phase increment in such a way that for strong
       harmonics it gives a good indication of the true frequency. For low
       amplitude or rapidly moving harmonics it is less meaningful.
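
       The per-frame data layout can be illustrated as below. The sketch
       covers only the frame body, not the header, and it assumes that each
       bin's magnitude is immediately followed by its frequency; the
       function names are hypothetical.

```python
def bins_per_frame(frame_size):
    """Number of Fourier bins stored per frame: N/2 + 1."""
    return frame_size // 2 + 1


def split_frame(values, frame_size):
    """Split one frame's flat values into (magnitude, frequency_hz) pairs.

    Assumption: values are interleaved as mag0, freq0, mag1, freq1, ...
    """
    bins = bins_per_frame(frame_size)
    if len(values) != 2 * bins:
        raise ValueError("expected %d values per frame" % (2 * bins))
    return [(values[2 * k], values[2 * k + 1]) for k in range(bins)]


# A made-up 8-point frame: 5 bins, so 10 values.
frame = [0.0, 0.0, 0.9, 1250.0, 0.1, 2500.0, 0.0, 3750.0, 0.0, 5000.0]
for magnitude, frequency_hz in split_frame(frame, 8):
    print(magnitude, frequency_hz)
```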

   Diagnostics
       Prints the total number of frames, and the number of frames completed
       at every 20th frame.

CREDITS

       Author: Dan Ellis

       MIT Media Lab

       Cambridge, Massachusetts

       1990

AUTHORS

       Barry Vercoe
       MIT Media Lab

           Author.

       Dan Ellis
       MIT Media Lab,
                 Cambridge
                 Massachusetts

           Author.

COPYRIGHT