sphinx_fe - Convert audio files to acoustic feature files

NAME

       sphinx_fe - Convert audio files to acoustic feature files

SYNOPSIS

       sphinx_fe [ options ]...

DESCRIPTION

       This  program  converts  audio  files  (in  either  Microsoft WAV, NIST
       Sphere, or raw format) to acoustic feature files for  input  to  batch-
       mode  speech  recognition.   The  resulting  files  are also useful for
       various other things.  A list of options follows:

       -alpha Preemphasis parameter

       -blocksize
              Block size, used to limit the number of samples used at  a  time
              when reading very large audio files

       -c     Control file for batch processing

       -cep2spec
              Input is cepstral files, output is log spectral files

       -di    Input directory; input file names are relative to this, if defined

       -dither
              Add 1/2-bit noise

       -do    Output directory; output file names are relative to this

       -doublebw
              Use double bandwidth filters (same center freq)

       -ei    Extension to be applied to all input files

       -eo    Extension to be applied to all output files

       -example
              Shows example of how to use the tool

       -fbtype
              Filter bank type, either mel_scale or log_linear

       -feat  SPHINX format - big endian

       -frate Frame rate

       -help  Shows the usage of the tool

       -i     Audio input file

       -input_endian
              Endianness of input data, big or little; ignored if the input is
              NIST Sphere or Microsoft WAV

       -lifter
              Length of sine curve for liftering, or 0 for no liftering

       -logspec
              Write out logspectral files instead of cepstra

       -lowerf
              Lower edge of filters

       -mach_endian
              Endianness of machine, big or little

       -mswav Defines input format as Microsoft Wav (RIFF)

       -ncep  Number of cepstral coefficients

       -nchans
              Number of channels of data (interleaved samples assumed)

       -nfft  Size of FFT

       -nfilt Number of filter banks

       -nist  Defines input format as NIST sphere

       -nskip If a control file was specified, the number of utterances to skip
              at the head of the file

       -o     Cepstral output file

       -raw   Defines input format as raw binary data

       -remove_dc
              Remove DC offset from each frame

       -round_filters
              Round mel filter frequencies to DFT points

       -runlen
              If a control file was specified, the number of utterances to
              process (see also -nskip)

       -samprate
              Sampling rate

       -seed  Seed for the random number generator; if less than zero, a seed
              is chosen automatically

       -smoothspec
              Write out cepstral-smoothed logspectral files

       -spec2cep
              Input is log spectral files, output is cepstral files

       -transform
              Which  type  of  transform  to use to calculate cepstra (legacy,
              dct, or htk)

       -unit_area
              Normalize mel filters to unit area

       -upperf
              Upper edge of filters

       -verbose
              Show input filenames

       -warp_params
              Parameters defining the warping function

       -warp_type
              Warping function type (or shape)

       -whichchan
              Channel to process

       -wlen  Hamming window length
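
       The batch options above (-c, -di, -ei, -do, -eo, -nskip, -runlen) work
       together.  The following Python sketch is one way to picture how  they
       might combine;  it does not call sphinx_fe.  The function names and the
       path layout <directory>/<entry>.<extension> are  assumptions  made  for
       illustration,  as  is  treating  a  negative  run  length  as "process
       everything that remains".  Check the behaviour of your installed  tool
       before relying on it.

```python
from pathlib import PurePosixPath


def select_utterances(control_lines, nskip=0, runlen=-1):
    """Return the control-file entries left after applying -nskip and -runlen."""
    # Blank lines are ignored in this sketch.
    entries = [line.strip() for line in control_lines if line.strip()]
    entries = entries[nskip:]
    # Assumption: a negative run length means "no limit".
    return entries if runlen < 0 else entries[:runlen]


def resolve(entry, directory=None, extension=None):
    """Build a file name; assumed layout is <directory>/<entry>.<extension>."""
    name = f"{entry}.{extension}" if extension else entry
    return str(PurePosixPath(directory) / name) if directory else name


def plan_batch(control_lines, di=None, ei=None, do=None, eo=None,
               nskip=0, runlen=-1):
    """Pair each selected entry's input path with its output path."""
    return [
        (resolve(entry, di, ei), resolve(entry, do, eo))
        for entry in select_utterances(control_lines, nskip, runlen)
    ]


if __name__ == "__main__":
    # Hypothetical control file contents, one utterance name per line.
    control = ["utt01", "utt02", "utt03", "utt04"]
    for src, dst in plan_batch(control, di="wav", ei="wav",
                               do="feat", eo="mfc", nskip=1, runlen=2):
        print(src, "->", dst)
```

       Under these assumptions, skipping one  entry  and  processing  two  of
       the four listed selects only the second and third utterances.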

       Currently the only kind of feature supported is the MFCC (mel-frequency
       cepstral coefficient).  Numerous options control the properties of  the
       output features.  It is VERY important that you document  the  specific
       set of flags used to create any given set of feature files, since  this
       information is NOT recorded in the files themselves, and  any  mismatch
       between the parameters used to extract  features  for  recognition  and
       those used to extract features for training will cause  recognition  to
       fail.
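
       One way to follow this advice is to store the flags in a small sidecar
       file next to the features and to compare the  training  and  recognition
       settings before decoding.  The Python sketch below is only an illustra-
       tion: the helper names and the JSON sidecar format are assumptions, not
       part of sphinx_fe, and the flag values shown are arbitrary examples.

```python
import json
import os
import tempfile


def _normalize(flags):
    """Store every value as a string so 16000 and "16000" compare equal."""
    return {name: str(value) for name, value in flags.items()}


def build_command(flags, infile, outfile):
    """Build an argument list for sphinx_fe from a {flag: value} mapping."""
    argv = ["sphinx_fe", "-i", infile, "-o", outfile]
    for name in sorted(flags):
        argv += [name, str(flags[name])]
    return argv


def save_flags(flags, path):
    """Write the flags to a JSON sidecar file (a hypothetical convention)."""
    with open(path, "w", encoding="utf-8") as handle:
        json.dump(_normalize(flags), handle, indent=2, sort_keys=True)


def load_flags(path):
    """Read flags back from a sidecar file written by save_flags."""
    with open(path, encoding="utf-8") as handle:
        return json.load(handle)


def mismatches(training, recognition):
    """Return {flag: (training_value, recognition_value)} for each difference."""
    train, recog = _normalize(training), _normalize(recognition)
    return {
        name: (train.get(name), recog.get(name))
        for name in sorted(set(train) | set(recog))
        if train.get(name) != recog.get(name)
    }


if __name__ == "__main__":
    # Example values only; use whatever your models were trained with.
    train_flags = {"-samprate": 16000, "-nfilt": 40, "-ncep": 13}
    with tempfile.TemporaryDirectory() as workdir:
        sidecar = os.path.join(workdir, "train_flags.json")
        save_flags(train_flags, sidecar)
        print(" ".join(build_command(train_flags, "utt01.wav", "utt01.mfc")))
        # A recognition setup that accidentally uses a different sampling rate.
        recog_flags = dict(load_flags(sidecar), **{"-samprate": "8000"})
        print(mismatches(train_flags, recog_flags))
```

       An empty result from the comparison means that the two  flag  sets  are
       identical;  any entry it returns names a flag whose value differs.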

AUTHOR

       Written  by numerous people at CMU from 1994 onwards.  This manual page
       by David Huggins-Daines <dhuggins@cs.cmu.edu>.

COPYRIGHT

       Copyright © 1994-2007 Carnegie Mellon University.  See the file COPYING
       included with this package for more information.

                                  2007-08-27