NAME

       genpyt - generate the PINYIN lexicon

SYNOPSIS

       genpyt lexicon-file result-file log-file slm-file

DESCRIPTION

       genpyt generates the PINYIN lexicon.  It works only in the
       zh_CN.UTF-8 locale.

ARGUMENTS

       lexicon-file
           Specify a dictionary file. It should be a line-based text file in
           UTF-8 encoding. Each line looks like:

              CCC  id  [pinyin'pinyin'pinyin]*

           A default dictionary file can be found at
           /usr/share/sunpinyin/dict.utf8.
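
           As an illustration of the line format above, the following Python
           sketch splits one line into its word, id, and pronunciations. It
           assumes that fields are separated by whitespace and that syllables
           within one pronunciation are joined by an apostrophe, as the
           pattern suggests. The function name parse_lexicon_line and the
           sample id are hypothetical and are not part of genpyt.

```python
def parse_lexicon_line(line):
    """Parse one lexicon line of the form: CCC  id  [pinyin'pinyin'pinyin]*

    Assumption: fields are whitespace-separated. Returns a tuple
    (word, word_id, pronunciations), where each pronunciation is a
    list of syllables.
    """
    fields = line.split()
    if len(fields) < 2:
        raise ValueError("expected at least a word and an id: %r" % line)
    word = fields[0]
    word_id = int(fields[1])
    # Zero or more pronunciations may follow; split each on the apostrophe.
    pronunciations = [field.split("'") for field in fields[2:]]
    return word, word_id, pronunciations


# Hypothetical sample line; the id 100 is made up for illustration.
sample = parse_lexicon_line("中国  100  zhong'guo")
```

           A line that carries only a word and an id yields an empty list of
           pronunciations.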

       result-file
           The output binary PINYIN lexicon file. This lexicon contains a trie
           representing the key tree of PINYIN, and all of the candidate words
           are sorted using the unigram information in slm-file. This file can
           be used with sunpinyin input method engines.

       log-file
           Specify the file to which the log is written. The log-file can be
           seen as the human-readable representation of the binary output file.

       slm-file
           The language model from which the unigram information is
           retrieved. Typically, the slm-file is generated by slmthread.
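
       The four arguments are positional and follow the order given in the
       SYNOPSIS. The Python sketch below assembles such an invocation. It
       assumes that setting LC_ALL is enough to satisfy the zh_CN.UTF-8
       locale requirement, which this page does not state. The helper name
       build_genpyt_command and all file names other than the default
       dictionary path are hypothetical.

```python
import os


def build_genpyt_command(lexicon_file, result_file, log_file, slm_file):
    """Return (argv, env) for running genpyt.

    The argument order follows the SYNOPSIS:
        genpyt lexicon-file result-file log-file slm-file
    """
    argv = ["genpyt", lexicon_file, result_file, log_file, slm_file]
    # Assumption: genpyt picks up its locale from the environment.
    env = dict(os.environ)
    env["LC_ALL"] = "zh_CN.UTF-8"
    return argv, env


# Hypothetical output, log, and model file names.
argv, env = build_genpyt_command(
    "/usr/share/sunpinyin/dict.utf8",
    "pydict.bin",
    "pydict.log",
    "lm.t3g",
)
```

       The resulting pair can be passed to subprocess.run(argv, env=env,
       check=True) on a system where genpyt is installed.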

AUTHOR

       Originally written by Phill.Zhang <phill.zhang@sun.com>.  Currently
       maintained by Kov.Chai <tchaikov@gmail.com>.

SEE ALSO

       slmthread(1).