NAME

       apertium - Front-end to the apertium machine translation tools

       This  tool  is  part  of the apertium machine translation architecture:
       http://apertium.sf.net.

SYNOPSIS

       apertium [-d datadir] [-f format]  [-u]  [-a]  {language-pair}  [infile
       [outfile]]

DESCRIPTION

       apertium  is  the  application  that  most  people  will be using as it
       simplifies the use of apertium/lt-toolbox tools for machine translation
       purposes.

       This tool eases the use of lt-toolbox (which contains all the lexical
       processing modules and tools) and apertium (which contains the rest of
       the engine) by providing a single front-end to the end user.

       The modules behind the apertium machine translation architecture are,
       in order:
              · de-formatter: Separates the text to  be  translated  from  the
              format information.

              · morphological-analyser: Tokenizes the text into surface forms.

              · part-of-speech tagger: Chooses one lexical form for each
              homograph (a surface form with more than one analysis).

              · lexical transfer module: Reads  each  source-language  lexical
              form  and delivers a corresponding target-language lexical form.

              · structural transfer module: Detects fixed-length  patterns  of
              lexical forms (chunks or phrases) needing special processing due
              to  grammatical  divergences  between  the  two  languages   and
              performs the corresponding transformations.

              ·  morphological  generator:  Delivers a target-language surface
              form  for  each  target-language  lexical  form,   by   suitably
              inflecting it.

              ·  post-generator:  Performs  orthographical  operations such as
              contractions and apostrophations.

              · re-formatter: Restores the format information encapsulated  by
              the  de-formatter  into  the  translated  text  and  removes the
              encapsulation sequences used to protect  certain  characters  in
              the source text.
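
       The ordering above can be pictured as a chain in which each module
       consumes the output of the previous one. The following is only a toy
       sketch of that chaining: the stage functions are placeholders that pass
       the text through unchanged, and the names STAGE_ORDER,
       make_tracing_stage and run_pipeline are hypothetical, not part of
       apertium.

```python
from functools import reduce
from typing import Callable, List

# Stage names in the order listed above. The callables built from them are
# placeholders (an assumption for illustration), not the real modules.
STAGE_ORDER = [
    "de-formatter",
    "morphological-analyser",
    "part-of-speech tagger",
    "lexical transfer module",
    "structural transfer module",
    "morphological generator",
    "post-generator",
    "re-formatter",
]

Stage = Callable[[str], str]


def make_tracing_stage(name: str, trace: List[str]) -> Stage:
    """Return a placeholder stage that records its name and returns the text as-is."""
    def stage(text: str) -> str:
        trace.append(name)
        return text
    return stage


def run_pipeline(text: str, stages: List[Stage]) -> str:
    """Feed the output of each stage into the next one, left to right."""
    return reduce(lambda acc, stage: stage(acc), stages, text)


trace: List[str] = []
stages = [make_tracing_stage(name, trace) for name in STAGE_ORDER]
result = run_pipeline("Hola", stages)
```

       After the run, trace holds the stage names in the order in which they
       were applied, which mirrors the list above.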

OPTIONS

       -d datadir The directory holding the linguistic data. By default, the
       expected installation path is used.

       language-pair The language pair: LANG1-LANG2 (for instance es-ca or
       ca-es).

       -f  format Specifies the format of the input and output files which can
       have these values:
              · txt (default value) Input and output files are in text format.

              · html Input and output files are in "html" format. This "html"
              is the one accepted by the vast majority of web browsers.

              · rtf Input and output files are in "rtf" format.  The  accepted
              "rtf"  is  the  one  generated  by  Microsoft  WordPad  (C)  and
              Microsoft Office (C) up to and including Office-97.

       -u Disable marking of unknown words with the ’*’ character.

       -a Enable marking of disambiguated words with the ’=’ character.
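
       When -u is not given, unknown words in the output carry a '*' mark. A
       minimal sketch for collecting or removing those marks afterwards is
       shown here. It assumes the '*' is placed directly before the word,
       which this page does not state; the helper names unknown_words and
       strip_unknown_marks are hypothetical.

```python
import re
from typing import List

# Assumption: the '*' mark immediately precedes the unknown word.
UNKNOWN_MARK = re.compile(r"\*(\w+)")


def unknown_words(translated: str) -> List[str]:
    """Return the words that carry the unknown-word mark, without the mark."""
    return UNKNOWN_MARK.findall(translated)


def strip_unknown_marks(translated: str) -> str:
    """Remove the mark but keep the word itself."""
    return UNKNOWN_MARK.sub(r"\1", translated)
```

       For example, under that assumption, unknown_words("la *foobar es roja")
       yields the single word foobar.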

FILES

       These are the two files that can be used with this command:

       infile Input file (stdin by default).

       outfile Output file (stdout by default).
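
       Putting the SYNOPSIS, OPTIONS and FILES sections together, a script
       that drives this command could assemble its argument list as sketched
       below. The function build_command and its parameter names are
       hypothetical; only the flags and their order are taken from this page.
       The sketch builds the list and does not run anything.

```python
from typing import List, Optional

# Format values listed under -f in OPTIONS.
FORMATS = ("txt", "html", "rtf")


def build_command(
    pair: str,
    datadir: Optional[str] = None,
    fmt: Optional[str] = None,
    no_unknown_marks: bool = False,
    mark_disambiguated: bool = False,
    infile: Optional[str] = None,
    outfile: Optional[str] = None,
) -> List[str]:
    """Build an argument list following the SYNOPSIS order."""
    if fmt is not None and fmt not in FORMATS:
        raise ValueError(f"unsupported format: {fmt!r}")
    if outfile is not None and infile is None:
        # The SYNOPSIS nests outfile inside infile: [infile [outfile]].
        raise ValueError("outfile requires infile")

    argv = ["apertium"]
    if datadir is not None:
        argv += ["-d", datadir]
    if fmt is not None:
        argv += ["-f", fmt]
    if no_unknown_marks:
        argv.append("-u")
    if mark_disambiguated:
        argv.append("-a")
    argv.append(pair)
    if infile is not None:
        argv.append(infile)
        if outfile is not None:
            argv.append(outfile)
    return argv
```

       With both files omitted the command reads stdin and writes stdout, so
       the resulting list could be handed to Python's subprocess.run together
       with its input= and capture_output= arguments.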

SEE ALSO

       lt-proc(1), lt-comp(1), lt-expand(1), apertium-tagger(1).

BUGS

       Lots of...lurking in the dark and waiting for you!

AUTHOR

       (c) 2005,2006 Universitat d’Alacant  /  Universidad  de  Alicante.  All
       rights reserved.

                                  2006-03-08                       apertium(1)