NAME
sylseg-sk - segments Slovak words into syllables
SYNOPSIS
sylseg-sk [--best] [--color] [--dl <debug_level>] [--help] [--ofile
<file_name>] [<input_file>]
DESCRIPTION
Syllabic segmentation is essential for some linguistic and speech
recognition applications. Depending on the language, either a rule-based
or a statistical approach is used. For Slovak, the statistical approach
seems to be more suitable.
sylseg-sk implements one of the statistical approaches to syllabic
segmentation. Each input word is segmented into syllables. Several
possible segmentations are generated and sorted by likelihood. If no
input file is specified, the words are read from standard input. If an
input file is given, the output is also written to a file. Its name is
the input filename with the extension ".syllables".
The input and output code page is ISO 8859-2. To work with a different
code page, combine sylseg-sk with a code page converter and pipes. For
example, to have input and output in UTF-8, use (for interactive use):
filterm UTF8-iso2 iso2-UTF8 sylseg-sk
or (for batch processing):
iconv -f UTF-8 -t ISO_8859-2 | sylseg-sk | iconv -f ISO_8859-2 -t UTF-8
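The batch pipeline can be wrapped in a small shell function so that the conversion does not have to be retyped. This is only a sketch: the function name sylseg_utf8 and the SYLSEG_CMD variable are hypothetical and not part of sylseg-sk. SYLSEG_CMD exists so that another ISO 8859-2 filter can be substituted.

```shell
#!/bin/sh
# Hypothetical wrapper (not shipped with sylseg-sk): feed UTF-8 text on
# standard input through a filter that expects ISO 8859-2 and convert
# the filter's output back to UTF-8.
# SYLSEG_CMD is an assumption of this sketch; it defaults to sylseg-sk.
sylseg_utf8() {
    iconv -f UTF-8 -t ISO_8859-2 |
        ${SYLSEG_CMD:-sylseg-sk} "$@" |
        iconv -f ISO_8859-2 -t UTF-8
}
```

A possible call is "sylseg_utf8 --best < words.txt". The words must arrive on standard input, because a file named on the command line would be read by sylseg-sk directly, without conversion. iconv stops with an error on characters that have no ISO 8859-2 equivalent.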
The performance of the syllabic segmentation depends on the statistics
used. To improve the quality of the segmentation, it is possible to
train better statistics with the sylseg-sk-training tool and replace the
original file located in /usr/share/sylseg_sk/sylseg-sk.stats.
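When replacing the statistics file, it may be wise to keep a copy of the original. The following sketch assumes that a retrained statistics file already exists; how sylseg-sk-training produces it is not covered here. The function name install_stats and the ".orig" backup suffix are hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: replace the statistics file and keep the original.
# $1 = new statistics file, $2 = target (defaults to the path named above).
install_stats() {
    new_stats=$1
    target=${2:-/usr/share/sylseg_sk/sylseg-sk.stats}
    # Refuse to install a missing or empty file.
    if [ ! -s "$new_stats" ]; then
        echo "install_stats: '$new_stats' is missing or empty" >&2
        return 1
    fi
    # Back up the shipped statistics once; later runs keep that backup.
    if [ -f "$target" ] && [ ! -e "$target.orig" ]; then
        cp -p "$target" "$target.orig" || return 1
    fi
    cp "$new_stats" "$target"
}
```

Writing to /usr/share normally requires root privileges. Copying the ".orig" file back restores the original statistics.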
The design of sylseg-sk is language independent. With retrained
statistics it should, in theory, work for any language.
OPTIONS
--best Print the best result only.
--color
Enable color output.
--dl 1..5
Set the debug level, which controls the amount of displayed
information. Debug level 0 displays nothing. The maximum level 5
displays a full debugging report. The default debug level is 1.
--help Display a short help text.
--ofile <file_name>
Also write the output to the given file.
EXAMPLES
Use standard input and debug level 3:
sylseg-sk --dl 3
Process all the words from the file aaa.txt and print just the best
segmentation:
sylseg-sk --best aaa.txt
EXIT STATUS
sylseg-sk returns zero if it successfully processes all the input words.
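In a script the exit status can be tested directly. The sketch below relies only on the "zero on success" convention stated above; the meaning of particular non-zero values is not documented here. The function name run_segmenter is hypothetical.

```shell
#!/bin/sh
# Hypothetical helper: run the given command and report whether it
# succeeded, passing a non-zero exit status on to the caller.
run_segmenter() {
    if "$@"; then
        echo "segmentation finished: all input words processed"
    else
        status=$?   # exit status of the command tested by "if"
        echo "segmentation failed (exit status $status)" >&2
        return "$status"
    fi
}
```

A possible call is "run_segmenter sylseg-sk --best aaa.txt".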
AUTHOR
Jozef Ivanecky (dodo (at) kanoistika.sk)
SEE ALSO
sylseg-sk-training(1), filterm(1), iconv(1), konwert(1)