lt-proc - This application is part of the lexical processing modules

NAME

       lt-proc  -  This  application is part of the lexical processing modules
       and tools ( lttoolbox )

       This tool is part of the  apertium  machine  translation  architecture:
       http://apertium.sf.net.

SYNOPSIS

       lt-proc  [  -a  |  -g  |  -n | -p | -s | -v | -h ] fst_file [input_file
       [output_file]]

       lt-proc [  --analysis  |  --generation  |  --non-marked-gen  |  --post-
       generation  |  --sao  |  --version  |  --help  ]  fst_file  [input_file
       [output_file]]

DESCRIPTION

       lt-proc is the application responsible for providing the four lexical
       processing functionalities:

              · morphological analyser (option -a)

              · lexical transfer (option -n)

              · morphological generator (option -g)

              · post-generator (option -p)

       It  accomplishes  these  tasks  by  reading  binary  files containing a
       compact and  efficient  representation  of  dictionaries  (a  class  of
       finite-state  transducers  called  augmented letter transducers). These
       files are generated by lt-comp(1).
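
       As a rough illustration of how the modes above map onto the command
       line, the following Python sketch builds an lt-proc invocation from
       the flags and positional arguments shown in the SYNOPSIS. The helper
       names (build_command, run) and the file name "es.bin" are
       hypothetical, and run() assumes that lt-proc reads standard input
       when no input_file is given.

```python
import subprocess

# Short flags taken from the SYNOPSIS section of this page.
MODE_FLAGS = {
    "analysis": "-a",
    "generation": "-g",
    "non-marked-gen": "-n",
    "post-generation": "-p",
}


def build_command(mode, fst_file, input_file=None, output_file=None):
    """Return the lt-proc argument list for one processing mode."""
    if output_file is not None and input_file is None:
        # The SYNOPSIS only allows output_file after input_file.
        raise ValueError("output_file requires input_file")
    command = ["lt-proc", MODE_FLAGS[mode], fst_file]
    if input_file is not None:
        command.append(input_file)
        if output_file is not None:
            command.append(output_file)
    return command


def run(mode, fst_file, text):
    """Run lt-proc on a string (assumes lt-proc reads standard input)."""
    result = subprocess.run(
        build_command(mode, fst_file),
        input=text,
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


# "es.bin" is a hypothetical dictionary compiled with lt-comp(1).
print(" ".join(build_command("analysis", "es.bin")))
```

       Calling run("analysis", "es.bin", "cantar") would only work on a
       system where lt-proc and a compiled dictionary are actually present.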

       Note that some characters (‘[’, ‘]’, ‘$’, ‘^’, ‘/’, ‘+’) are special:
       they are used for format and encapsulation, and must be escaped if
       they are to be used literally. For instance, text enclosed in ‘[’...‘]’
       is ignored, and a lexical unit is delimited as ‘^...$’.
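
       The handling of these special characters can be sketched in Python
       as follows. This is a minimal illustration, not the lt-proc
       implementation: it assumes that a backslash is the escape character
       (this page does not say which character is used), and the function
       names escape and split_unit are hypothetical.

```python
SPECIAL = "[]$^/+"  # the reserved characters listed on this page
ESCAPE = "\\"       # assumption: backslash is the escape character


def escape(text):
    """Escape reserved characters (and the escape character itself)."""
    return "".join(
        ESCAPE + ch if ch in SPECIAL or ch == ESCAPE else ch for ch in text
    )


def split_unit(unit):
    """Split a '^...$' unit on unescaped '/' and return its fields."""
    if len(unit) < 2 or unit[0] != "^" or unit[-1] != "$":
        raise ValueError("expected ^...$ encapsulation")
    fields, current = [], []
    chars = iter(unit[1:-1])
    for ch in chars:
        if ch == ESCAPE:
            # Keep the escaped pair intact so '/' inside it is not a separator.
            current.append(ch + next(chars, ""))
        elif ch == "/":
            fields.append("".join(current))
            current = []
        else:
            current.append(ch)
    fields.append("".join(current))
    return fields


print(escape("a/b [x]"))
print(split_unit("^cantar/cantar<vblex><inf>$"))
```

       The first field returned by split_unit is the surface form; any
       remaining fields are the alternatives that follow it.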

OPTIONS

       -a, --analysis
              Tokenizes the text into surface forms (lexical units as they
              appear in texts) and delivers, for each  surface  form,  one  or
              more  lexical  forms  consisting  of lemma, lexical category and
              morphological  inflection  information.  Tokenization   is   not
              straightforward  due  to  the  existence,  on  the  one hand, of
              contractions, and, on the  other  hand,  of  multi-word  lexical
              units.  For  contractions,  the system reads in a single surface
              form and delivers the corresponding sequence of  lexical  forms.
              Multi-word  surface  forms  are  analysed  in  a  left-to-right,
              longest-match  fashion.  Multi-word   surface   forms   may   be
              invariable  (such as a multi-word preposition or conjunction) or
              inflected (for example, in Spanish, "echaban de menos", "they
              missed", is a form of the imperfect indicative tense of the verb
              "echar de menos", "to miss"). Limited support for some kinds  of
              discontinuous multi-word units is also available. Analysis of
              single-word surface forms produces output such as the following:
              "cantar" -> ‘^cantar/cantar<vblex><inf>$’ or "cantaba" ->
              ‘^cantaba/cantar<vblex><pii><p1><sg>/cantar<vblex><pii><p3><sg>$’.
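
              The left-to-right, longest-match strategy described above can
              be sketched with a toy analyser in Python. This is only an
              illustration under simplifying assumptions: it uses a
              hypothetical in-memory lexicon instead of a compiled
              transducer, splits on whitespace rather than working
              character by character, and marks unknown words with an
              asterisk. The "cantar" entries come from the examples above;
              the other entries and the <pr> tag are illustrative only.

```python
# Hypothetical lexicon: surface form -> list of lexical forms.
LEXICON = {
    "cantar": ["cantar<vblex><inf>"],
    "cantaba": ["cantar<vblex><pii><p1><sg>", "cantar<vblex><pii><p3><sg>"],
    "a": ["a<pr>"],                    # illustrative entry
    "a pesar de": ["a pesar de<pr>"],  # illustrative multi-word entry
}
MAX_WORDS = max(len(key.split()) for key in LEXICON)


def analyse(text):
    """Tokenize left to right, always preferring the longest lexicon match."""
    words = text.split()
    units, i = [], 0
    while i < len(words):
        # Try the longest candidate first, then progressively shorter ones.
        for n in range(min(MAX_WORDS, len(words) - i), 0, -1):
            surface = " ".join(words[i:i + n])
            if surface in LEXICON:
                analyses = "/".join(LEXICON[surface])
                units.append("^" + surface + "/" + analyses + "$")
                i += n
                break
        else:
            # No entry matched: emit the word as unknown (assumed '*' mark).
            units.append("^" + words[i] + "/*" + words[i] + "$")
            i += 1
    return " ".join(units)


print(analyse("cantaba a pesar de cantar"))
```

              Because longer candidates are tried first, "a pesar de" is
              emitted as a single unit even though "a" is also in the
              lexicon.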

       -g, --generation
              Delivers a target-language surface form for each target-language
              lexical form, by suitably inflecting it.

       -n, --non-marked-gen
              Morphological  generation  (like  -g)  but  without unknown word
              marks (asterisk ‘*’).

       -p, --post-generation
              Performs orthographical  operations  such  as  contractions  and
              apostrophations.  The  post-generator  is  usually dormant (just
              copies the input to the output) until  a  special  alarm  symbol
              contained  in  some target-language surface forms wakes it up to
              perform a particular string transformation if necessary; then it
              goes back to sleep.
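
              The dormant/awake behaviour described above can be sketched
              as a small Python loop. This is a toy model, not the actual
              post-generator: the alarm symbol "~" and the two contraction
              rules are assumptions made for illustration, since this page
              names neither the symbol nor any rule.

```python
ALARM = "~"  # hypothetical alarm symbol
# Hypothetical rules: text following the alarm -> replacement.
RULES = {"de el": "del", "a el": "al"}


def post_generate(text):
    """Copy text unchanged, applying a rule only where an alarm appears."""
    out, i = [], 0
    while i < len(text):
        if text[i] != ALARM:
            out.append(text[i])  # dormant: copy input to output
            i += 1
            continue
        i += 1  # awake: consume the alarm symbol
        for source in sorted(RULES, key=len, reverse=True):
            end = i + len(source)
            at_boundary = end == len(text) or not text[end].isalnum()
            if text.startswith(source, i) and at_boundary:
                out.append(RULES[source])
                i = end
                break
        # If no rule matched, nothing is transformed; copying resumes.
    return "".join(out)


print(post_generate("vengo ~de el mercado"))
```

              When the alarm is present but no rule applies, as in
              "vengo ~de la casa", the sketch simply drops the alarm and
              copies the rest of the text.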

       -s, --sao
              Processes input in the orthoepikon (previously ‘sao’) annotation
              system format: http://orthoepikon.sf.net.

       -v, --version
              Display the version number.

       -h, --help
              Display this help.

FILES

       fst_file   The input compiled dictionary, as generated by lt-comp(1).

BUGS

       Lots of...lurking in the dark and waiting for you!

AUTHOR

       (c) 2005,2006 Universitat d’Alacant  /  Universidad  de  Alicante.  All
       rights reserved.

                                  2006-03-23                        lt-proc(1)
