NAME

       apertium-tagger - part-of-speech tagger for apertium

       This  tool  is  part  of  the  apertium open-source machine translation
       architecture: http://www.apertium.org.

SYNOPSIS

       apertium-tagger --train|-t {n} DIC CRP TSX PROB [--debug|-d]

       apertium-tagger  --supervised|-s  {n}  DIC  CRP  TSX  PROB  HTAG  UNTAG
       [--debug|-d]

       apertium-tagger --retrain|-r {n} CRP PROB [--debug|-d]

       apertium-tagger   --tagger|-g  [--first|-f]  PROB  [--debug|-d]  [INPUT
       [OUTPUT]]

DESCRIPTION

       apertium-tagger trains the apertium part-of-speech tagger or uses it to
       tag text, depending on the options given. It reads from standard  input
       only when the --tagger (-g) option is used.

OPTIONS

       -t {n}, --train {n}
              Initializes parameters using Kupiec’s method (unsupervised), then
              performs n iterations of the Baum-Welch training algorithm
              (unsupervised).

       -s {n}, --supervised {n}
              Initializes parameters from a hand-tagged text (supervised) using
              maximum likelihood estimation, then performs n iterations of the
              Baum-Welch training algorithm (unsupervised).

       -r {n}, --retrain {n}
              Retrains  the  model  with  n  additional  Baum-Welch iterations
              (unsupervised).

       -g, --tagger
              Tags input text using the Viterbi algorithm.

       -p, --show-superficial
              Prints the superficial form of each word alongside  its  lexical
              form in the output stream.

       -f, --first
              Used in conjunction with -g (--tagger). Makes the tagger  output
              all  lexical  forms  of  each word, with the chosen one in first
              place (after the lemma).

       -d, --debug
              Print error (if any) or debug messages while operating.

       -m, --mark
              Mark disambiguated words.

       -h, --help
              Display a help message.
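
       As an illustration of the decoding step named under -g,  the  following
       is  a  minimal  textbook sketch of Viterbi decoding for a first-order
       hidden Markov model. It is not the apertium implementation: the function
       name  viterbi,  the  tag set and every probability below are invented
       for the example, and the real tagger's model and data  structures  may
       differ.

```python
import math


def viterbi(observations, states, start_p, trans_p, emit_p):
    """Return the most probable state (tag) sequence for the observations."""
    if not observations:
        return []

    def log(p):
        # Log space avoids underflow; zero probability maps to -inf.
        return math.log(p) if p > 0 else float("-inf")

    # best[t][s]: log-probability of the best path that ends in state s at t.
    best = [{s: log(start_p[s]) + log(emit_p[s].get(observations[0], 0.0))
             for s in states}]
    # back[t][s]: predecessor of s on that best path.
    back = [{}]

    for t in range(1, len(observations)):
        best.append({})
        back.append({})
        for s in states:
            prev, score = max(
                ((p, best[t - 1][p] + log(trans_p[p][s])) for p in states),
                key=lambda pair: pair[1],
            )
            best[t][s] = score + log(emit_p[s].get(observations[t], 0.0))
            back[t][s] = prev

    # Follow the back-pointers from the best final state.
    path = [max(states, key=lambda s: best[-1][s])]
    for t in range(len(observations) - 1, 0, -1):
        path.append(back[t][path[-1]])
    path.reverse()
    return path


# Toy model (hypothetical numbers, for illustration only).
STATES = ("NOUN", "VERB")
START_P = {"NOUN": 0.6, "VERB": 0.4}
TRANS_P = {
    "NOUN": {"NOUN": 0.3, "VERB": 0.7},
    "VERB": {"NOUN": 0.8, "VERB": 0.2},
}
EMIT_P = {
    "NOUN": {"fish": 0.5, "swim": 0.1, "water": 0.4},
    "VERB": {"fish": 0.3, "swim": 0.6, "water": 0.1},
}

print(viterbi(["fish", "swim"], STATES, START_P, TRANS_P, EMIT_P))
# prints ['NOUN', 'VERB']
```

       In  this  toy  model the parameters are written by hand; in the tagger
       they would come from the PROB file produced by one  of  the  training
       modes.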

FILES

       These are the kinds of files used with each option:

       DIC Fully expanded dictionary file

       CRP Training text corpus file

       TSX Tagger specification file, in XML format

       PROB Tagger data file, built during training and used for tagging

       HTAG Hand-tagged text corpus

       UNTAG Untagged text corpus: the  morphological  analysis  of  the  HTAG
       corpus, used together with HTAG by the -s option

       INPUT Input file, stdin by default

       OUTPUT Output file, stdout by default
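
       The way these files map onto the four forms in SYNOPSIS can be sketched
       with a few small helpers that only build argument  lists.  The  helper
       names  (train_cmd, supervised_cmd, retrain_cmd, tag_cmd) and any file
       names passed to them are hypothetical; the options and argument  order
       are taken from SYNOPSIS above.

```python
def _with_debug(cmd, debug):
    # --debug is optional in every form of the command.
    return cmd + ["--debug"] if debug else cmd


def train_cmd(n, dic, crp, tsx, prob, debug=False):
    """Unsupervised training: --train {n} DIC CRP TSX PROB."""
    return _with_debug(
        ["apertium-tagger", "--train", str(n), dic, crp, tsx, prob], debug)


def supervised_cmd(n, dic, crp, tsx, prob, htag, untag, debug=False):
    """Supervised training: --supervised {n} DIC CRP TSX PROB HTAG UNTAG."""
    return _with_debug(
        ["apertium-tagger", "--supervised", str(n),
         dic, crp, tsx, prob, htag, untag], debug)


def retrain_cmd(n, crp, prob, debug=False):
    """Retraining: --retrain {n} CRP PROB."""
    return _with_debug(
        ["apertium-tagger", "--retrain", str(n), crp, prob], debug)


def tag_cmd(prob, input_path=None, output_path=None, first=False, debug=False):
    """Tagging: --tagger [--first] PROB [--debug] [INPUT [OUTPUT]]."""
    if output_path is not None and input_path is None:
        # SYNOPSIS nests OUTPUT inside INPUT, so OUTPUT alone is not valid.
        raise ValueError("OUTPUT requires INPUT")
    cmd = ["apertium-tagger", "--tagger"]
    if first:
        cmd.append("--first")
    cmd.append(prob)
    cmd = _with_debug(cmd, debug)
    if input_path is not None:
        cmd.append(input_path)
        if output_path is not None:
            cmd.append(output_path)
    return cmd
```

       Each helper returns a list that could be passed to subprocess.run()  on
       a  system  where  apertium-tagger is installed. When tag_cmd is called
       without input_path, the resulting command reads from  stdin  and  writes
       to stdout.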

SEE ALSO

       lt-proc(1),    lt-comp(1),    lt-expand(1),     apertium-translator(1),
       apertium(1).

BUGS

       Lots of...lurking in the dark and waiting for you!

AUTHOR

       Copyright  (c)  2005,  2006  Universitat  d’Alacant  /  Universidad  de
       Alicante.  This is free software.  You may redistribute  copies  of  it
       under    the    terms    of    the    GNU    General   Public   License
       <http://www.gnu.org/licenses/gpl.html>.

                                  2006-08-30                apertium-tagger(1)