NAME
pocketsphinx_batch - Run speech recognition in batch mode
SYNOPSIS
pocketsphinx_batch -hmm hmmdir -dict dictfile [ options ]...
DESCRIPTION
Run speech recognition over a list of utterances in batch mode. A list
of arguments follows:
-adcdev
Device name for audio input (platform-specific)
-adchdr
Size of audio file header in bytes (headers are ignored)
-adcin Input is raw audio data
-agc Automatic gain control for c0 (’max’, ’emax’, ’noise’, or
’none’)
-agcthresh
Initial threshold for automatic gain control
-allphone
Do phoneme recognition
-alpha Preemphasis parameter
-backtrace
Print back trace of recognition results
-beam Beam width applied to every frame in Viterbi search (smaller
values mean wider beam)
-bestpath
Run bestpath (Dijkstra) search over word lattice (3rd pass)
-bestpathlw
Language model probability weight for bestpath search
-cachesen
Cache senone scores from first pass search
-cep2spec
Input is cepstral files, output is log spectral files
-cepdir
Input files directory (prefixed to filespecs in control file)
-cepext
Input files extension (suffixed to filespecs in control file)
-ceplen
Number of components in the input feature vector
-cmn Cepstral mean normalization scheme (’current’, ’prior’, or
’none’)
-cmninit
Initial values (comma-separated) for cepstral mean when ’prior’
is used
-compallsen
Compute all senone scores in every frame (can be faster when
there are many senones)
-ctl Control file listing utterances to be processed
-ctlcount
Number of utterances to be processed (after skipping -ctloffset
entries)
-ctlincr
Do every Nth line in the control file
-ctloffset
Number of utterances at the beginning of -ctl file to be skipped
-dict Main pronunciation dictionary (lexicon) input file
-dither
Add 1/2-bit noise
-doublebw
Use double bandwidth filters (same center freq)
-dsratio
Frame GMM computation downsampling ratio
-fbtype
Filter bank type: mel_scale or log_linear
-fdict Noise word pronunciation dictionary input file
-feat Feature stream type, depends on the acoustic model
-fillpen
Filler word transition penalty
-frate Frame rate
-fsg Finite state grammar file
-fsgbfs
Force backtrace from FSG final state
-fsgctlfn
Finite state grammar control file
-fsgusealtpron
Use alternative pronunciations for FSG
-fsgusefiller
(FSG Mode (Mode 2) only) Insert filler words at each state.
-fwd3g Use trigrams in first pass search
-fwdflat
Run forward flat-lexicon search over word lattice (2nd pass)
-fwdflatbeam
Beam width applied to every frame in second-pass flat search
-fwdflatefwid
Minimum number of end frames for a word to be searched in
fwdflat search
-fwdflatlw
Language model probability weight for flat lexicon (2nd pass)
decoding
-fwdflatsfwin
Window of frames in lattice to search for successor words in
fwdflat search
-fwdflatwbeam
Beam width applied to word exits in second-pass flat search
-fwdtree
Run forward lexicon-tree search (1st pass)
-hmm Directory containing acoustic model files.
-hyp Recognition output file name
-hypseg
Recognition output with segmentation file name
-input_endian
Endianness of input data, big or little; ignored for NIST or MS
WAV input
-kdmaxbbi
Maximum number of Gaussians per leaf node in kd-Trees
-kdmaxdepth
Maximum depth of kd-Trees to use
-kdtree
kd-Tree file for Gaussian selection
-latsize
Lattice size
-lifter
Length of sin-curve for liftering, or 0 for no liftering.
-live Get input from audio hardware
-lm Word trigram language model input file
-lmctl Specify a set of language models
The -hmm and -dict arguments are always required. Either -lm or -fsg
is required, depending on whether you are using a statistical language
model or a finite-state grammar. To do batch-mode recognition, you will
need to specify a control file using -ctl. This is a simple text file
containing one entry per line. Each entry is the name of an input file
relative to the -cepdir directory, and without the filename extension
(which is given in the -cepext argument).
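As a rough illustration of the control-file convention described above,
the following Python sketch derives control-file entries from a list of
feature file paths by stripping the -cepdir prefix and the -cepext
extension, then assembles a command line without running it. All paths
(model/hmm, model/words.dict, model/lm.DMP, feats, utts.ctl, utts.hyp)
are hypothetical placeholders, the helper names are invented for this
example, and the extension is assumed to be given with its leading dot.
Only flags documented on this page are used.

```python
import os


def control_entries(paths, cepdir, cepext):
    """Turn feature-file paths into control-file entries.

    Each entry is the path relative to cepdir with cepext removed,
    following the convention described above.
    """
    entries = []
    for path in paths:
        rel = os.path.relpath(path, cepdir)
        if not rel.endswith(cepext):
            continue  # skip files that lack the expected extension
        entries.append(rel[: len(rel) - len(cepext)])
    return sorted(entries)


def batch_command(hmmdir, dictfile, lmfile, ctlfile, cepdir, cepext, hypfile):
    """Build an argument list for pocketsphinx_batch (not executed here)."""
    return [
        "pocketsphinx_batch",
        "-hmm", hmmdir,
        "-dict", dictfile,
        "-lm", lmfile,
        "-ctl", ctlfile,
        "-cepdir", cepdir,
        "-cepext", cepext,
        "-hyp", hypfile,
    ]


# Hypothetical paths, for illustration only.
paths = ["feats/utt2.mfc", "feats/utt1.mfc", "feats/notes.txt"]
entries = control_entries(paths, "feats", ".mfc")
ctl_text = "\n".join(entries) + "\n"  # one entry per line
cmd = batch_command("model/hmm", "model/words.dict", "model/lm.DMP",
                    "utts.ctl", "feats", ".mfc", "utts.hyp")
print(ctl_text, end="")
print(" ".join(cmd))
```

Writing ctl_text to utts.ctl and passing cmd to subprocess.run would
then start the batch run, provided the model files actually exist.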
If you are using acoustic feature files as input (see sphinx_fe(1) for
information on how to generate these), you can also specify a subpart
of a file, using the following format:
FILENAME START-FRAME END-FRAME UTTERANCE-ID
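The two entry forms can be told apart by their field count. The sketch
below parses one control-file line into its parts, assuming that fields
are separated by whitespace and that frame numbers are integers. The
CtlEntry type and the parse_ctl_line name are invented for this
illustration and are not part of pocketsphinx; the sample line is
likewise hypothetical.

```python
from typing import NamedTuple, Optional


class CtlEntry(NamedTuple):
    filename: str
    start_frame: Optional[int]  # None when the whole file is used
    end_frame: Optional[int]    # None when the whole file is used
    utt_id: Optional[str]       # None when the line gives no ID


def parse_ctl_line(line: str) -> CtlEntry:
    """Parse 'FILENAME' or 'FILENAME START-FRAME END-FRAME UTTERANCE-ID'."""
    fields = line.split()
    if len(fields) == 1:
        # Plain entry: the whole input file is one utterance.
        return CtlEntry(fields[0], None, None, None)
    if len(fields) == 4:
        filename, start, end, utt_id = fields
        return CtlEntry(filename, int(start), int(end), utt_id)
    raise ValueError(f"unexpected control-file entry: {line!r}")


whole = parse_ctl_line("meeting01")
part = parse_ctl_line("meeting01 100 450 meeting01_seg1")
print(whole)
print(part)
```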
AUTHOR
Written by numerous people at CMU from 1994 onwards. This manual page
by David Huggins-Daines <dhuggins@cs.cmu.edu>.
COPYRIGHT
Copyright © 1994-2007 Carnegie Mellon University. See the file COPYING
included with this package for more information.
SEE ALSO
pocketsphinx_continuous(1), sphinx_fe(1).
2007-08-27