
NAME

       fst-infl, fst-infl2, fst-infl3 - morphological analysers

SYNOPSIS

       fst-infl [ options ] file [ input-file [ output-file ] ]
       fst-infl2 [ options ] file [ input-file [ output-file ] ]
       fst-infl3 [ options ] file [ input-file [ output-file ] ]

OPTIONS

       -t file
              Read  an alternative transducer from file and use it if the main
              transducer fails to find an analysis. By iterating this  option,
              a cascade of transducers may be tried to find an analysis.
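
              The fallback order described above can be sketched as follows.
              This is only an illustration in Python, not the program's own
              code: the Analyser type, the toy lexicons and their tags are all
              hypothetical stand-ins for loaded transducers.

```python
from typing import Callable, Iterable, List

# Hypothetical stand-in for a loaded transducer: maps a word to its analyses.
Analyser = Callable[[str], List[str]]


def analyse_with_cascade(word: str, cascade: Iterable[Analyser]) -> List[str]:
    """Return the analyses of the first analyser that finds any."""
    for analyser in cascade:
        analyses = analyser(word)
        if analyses:
            return analyses
    return []  # no analyser in the cascade produced an analysis


# Toy lexicons, made up for illustration only.
MAIN = {"houses": ["house<N><pl>"]}
FALLBACK = {"houses": ["houses<GUESS>"], "blorks": ["blork<GUESS><pl>"]}


def main_analyser(word: str) -> List[str]:
    return MAIN.get(word, [])


def fallback_analyser(word: str) -> List[str]:
    return FALLBACK.get(word, [])


CASCADE = [main_analyser, fallback_analyser]
```

              Here the fallback is consulted for "blorks" only, because the
              main analyser already returns an analysis for "houses".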

       -b     Print surface and analysis symbols. (fst-infl2 only)

       -n     Print   multi-character  symbols  without  the  enclosing  angle
              brackets.  (fst-infl only)

       -d     Disambiguate the analyses symbolically by returning only the
              analyses with the minimal number of morphemes. This option
              requires that morpheme boundaries are marked with the tag <X>.
              If no <X> tag is found in the analysis string, the program
              instead counts (roughly) the number of multi-character symbols
              consisting entirely of upper-case characters and uses this count
              for disambiguation. The latter heuristic was developed for the
              German SMOR morphology. (fst-infl2 and fst-infl3 only)
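
              A minimal Python sketch of this heuristic, assuming that
              multi-character symbols are written in angle brackets and that
              the score is computed separately for each analysis string. The
              function names and the example tags are hypothetical, and the
              actual implementation may differ in detail.

```python
import re
from typing import List

# Multi-character symbols are assumed to be written in angle brackets.
SYMBOL = re.compile(r"<([^<>]+)>")


def morpheme_score(analysis: str) -> int:
    """Count <X> boundary tags, or all-upper-case symbols if there are none."""
    symbols = SYMBOL.findall(analysis)
    boundaries = symbols.count("X")
    if boundaries > 0:
        return boundaries
    # Fallback: symbols consisting entirely of upper-case letters.
    return sum(1 for s in symbols if s.isalpha() and s.isupper())


def disambiguate(analyses: List[str]) -> List[str]:
    """Keep only the analyses with the minimal score."""
    if not analyses:
        return []
    best = min(morpheme_score(a) for a in analyses)
    return [a for a in analyses if morpheme_score(a) == best]
```

              For example, disambiguate(["ab<X>c", "a<X>b<X>c"]) keeps only
              "ab<X>c", since it contains fewer boundary tags.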

       -e n   If no regular analysis is found, do robust matching and print
              analyses with up to n edit errors. The set of edit operations
              currently includes replacement, insertion and deletion. Each
              operation currently has a fixed error weight of 1. (fst-infl2
              only)
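
              The error measure described above, with replacement, insertion
              and deletion each costing 1, corresponds to the classic edit
              distance between two strings. The Python sketch below computes
              it against a plain word list purely for illustration. The
              program itself matches against the transducer, and how it does
              so is not described here; the function names are hypothetical.

```python
from typing import List


def edit_distance(a: str, b: str) -> int:
    """Edit distance with unit cost for replacement, insertion, deletion."""
    prev = list(range(len(b) + 1))  # distances from the empty prefix of a
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(
                prev[j] + 1,         # delete ca
                curr[j - 1] + 1,     # insert cb
                prev[j - 1] + cost,  # replace ca by cb, or match
            ))
        prev = curr
    return prev[-1]


def within_errors(word: str, candidates: List[str], n: int) -> List[str]:
    """Return the candidates reachable from word with at most n edits."""
    return [c for c in candidates if edit_distance(word, c) <= n]
```

              With n set to 1, the misspelling "hause" would match "house"
              but neither "mouse" nor "horse", which both need two edits.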

       -% f   Disambiguate the analyses statistically and print the most
              likely analyses with at least f % of the total probability mass
              of the analyses. The transducer weights are read from a file
              obtained by appending .prob to the name of the transducer file.
              The weight files are created with fst-train. (fst-infl2 only)
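
              One plausible reading of this selection rule is sketched below
              in Python: sort the analyses by probability and keep the most
              likely ones until their cumulative share reaches f % of the
              total mass. This interpretation is an assumption, as are the
              function name and the (analysis, probability) pair format.

```python
from typing import List, Tuple

Scored = Tuple[str, float]  # (analysis, probability); format is an assumption


def select_by_mass(scored: List[Scored], f: float) -> List[Scored]:
    """Keep the most likely analyses covering at least f % of the mass."""
    total = sum(p for _, p in scored)
    if total <= 0:
        return []
    threshold = total * f / 100.0
    selected: List[Scored] = []
    mass = 0.0
    # Most probable first; ties keep their original order.
    for analysis, p in sorted(scored, key=lambda x: x[1], reverse=True):
        selected.append((analysis, p))
        mass += p
        if mass >= threshold:
            break
    return selected
```

              Under this reading, a larger f lets more low-probability
              analyses through, and f = 100 returns all of them.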

       -p     Print the probability of each analysis. (fst-infl2 only)

       -c     Use this option if the transducer was compiled on a computer
              with a different endianness. For example, a transducer compiled
              on a Sparc computer needs this option in order to be used on a
              Pentium. (fst-infl2 only)
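
              Why byte order matters can be shown with a single 32-bit
              integer. The Python sketch below is generic and says nothing
              about the actual layout of the transducer file, which is not
              described here; swap32 is a hypothetical helper.

```python
import struct


def swap32(value: int) -> int:
    """Reverse the byte order of an unsigned 32-bit integer."""
    return struct.unpack("<I", struct.pack(">I", value))[0]


# The value 1 as a big-endian machine (e.g. Sparc) would write it.
raw = struct.pack(">I", 1)

# Read back as little-endian (e.g. Pentium) without conversion.
wrong = struct.unpack("<I", raw)[0]

# Swapping the bytes recovers the intended value.
fixed = swap32(wrong)
```

              Without the swap the reader sees 16777216 instead of 1, which
              is the kind of corruption the -c option is meant to prevent.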

       -q     Suppress status messages.

       -h     Print usage information.

DESCRIPTION

       fst-infl is a morphological analyser. The first argument is the name of
       a file which was generated by fst-compiler. The second argument is the
       name of the input file. The third argument is the output file. If the
       third argument is missing, output is directed to stdout. If the second
       argument is missing as well, input is read from stdin.

       fst-infl2  is  similar  to  fst-infl  but needs a transducer in compact
       format (see the man pages for fst-compiler and fst-compact).  fst-infl2
       is implemented differently from fst-infl and usually much faster.

       fst-infl3 is also similar to fst-infl but needs a transducer in lowmem
       format (see the man pages for fst-compiler and fst-lowmem). fst-infl3
       accesses the transducer on disc rather than reading it into memory. It
       starts very fast and needs very little memory, but is slower than
       fst-infl2.

       fst-infl  reads  the  transducer  which is stored in the argument file.
       Then it reads the input file line by line. Each line is  analysed  with
       the transducer and all resulting analyses are printed (see also the man
       pages for fst-mor).

BUGS

       No bugs are known so far.

SEE ALSO

       fst-compiler, fst-compact, fst-lowmem, fst-train, fst-mor

AUTHOR

       Helmut Schmid, Institute for Computational Linguistics, University of
       Stuttgart, Email: schmid@ims.uni-stuttgart.de. This software is
       available under the GNU General Public License.

                                 November 2004                     fst-infl(1)