NAME
fst-infl, fst-infl2, fst-infl3 - morphological analysers
SYNOPSIS
fst-infl [ options ] file [ input-file [ output-file ] ]
fst-infl2 [ options ] file [ input-file [ output-file ] ]
fst-infl3 [ options ] file [ input-file [ output-file ] ]
OPTIONS
-t file
Read an alternative transducer from file and use it if the main
transducer fails to find an analysis. By iterating this option,
a cascade of transducers may be tried to find an analysis.
-b Print surface and analysis symbols. (fst-infl2 only)
-n Print multi-character symbols without the enclosing angle
brackets. (fst-infl only)
-d Disambiguate the analyses symbolically by returning only the
analyses with a minimal number of morphemes. This option
requires that morpheme boundaries are marked with the tag <X>.
If no <X> tag is found in the analysis string, the program
instead counts (roughly) the multi-character symbols that
consist entirely of upper-case characters and uses this count
for disambiguation. The latter heuristic was developed for the
German SMOR morphology. (fst-infl2 and fst-infl3 only)
-e n If no regular analysis is found, do robust matching and print
analyses with up to n edit errors. The set of edit operations
currently includes replacement, insertion and deletion. Each
operation currently has a fixed error weight of 1. (fst-infl2
only)
-% f Disambiguate the analyses statistically and print the most
likely analyses with at least f % of the total probability mass
of the analyses. The transducer weights are read from a file
obtained by appending .prob to the name of the transducer file.
The weight files are created with fst-train. (fst-infl2 only)
-p Print the probability of each analysis. (fst-infl2 only)
-c Use this option if the transducer was compiled on a computer
with a different endianness. If you have a transducer which was
compiled on a Sparc computer and you want to use it on a
Pentium, you need to use this option. (fst-infl2 only)
-q Suppress status messages.
-h Print usage information.
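
The symbolic disambiguation described for -d can be sketched in a few lines of Python. This is only an illustration of the rule as stated above, not the actual implementation: the names morpheme_score and disambiguate are hypothetical, "upper-case" is approximated with Python's str.isupper, and the way scores from the two counting methods are compared within one set of analyses is an assumption.

```python
import re

# Matches one multi-character symbol such as <X> or <NN> and captures its name.
SYMBOL = re.compile(r"<([^<>]+)>")


def morpheme_score(analysis):
    """Return the count used to rank one analysis string (hypothetical helper).

    If the analysis contains <X> boundary tags, count those. Otherwise fall
    back to counting multi-character symbols whose names consist entirely of
    upper-case letters.
    """
    boundaries = analysis.count("<X>")
    if boundaries > 0:
        return boundaries
    return sum(
        1
        for name in SYMBOL.findall(analysis)
        if name.isalpha() and name.isupper()
    )


def disambiguate(analyses):
    """Keep only the analyses with the minimal score, preserving their order."""
    if not analyses:
        return []
    scores = [morpheme_score(a) for a in analyses]
    best = min(scores)
    return [a for a, s in zip(analyses, scores) if s == best]


if __name__ == "__main__":
    # Made-up analysis strings, used only to exercise the functions.
    for analysis in disambiguate(["a<X>b<X>c", "ab<X>c"]):
        print(analysis)
```

All analyses that tie for the minimal count are kept, so the result can still contain more than one analysis.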
DESCRIPTION
fst-infl is a morphological analyser. The first argument is the name of
a file which was generated by fst-compiler. The second argument is the
name of the input file. The third argument is the output file. If the
third argument is missing, output is directed to stdout. If the second
argument is also missing, input is read from stdin.
fst-infl2 is similar to fst-infl but needs a transducer in compact
format (see the man pages for fst-compiler and fst-compact). fst-infl2
is implemented differently from fst-infl and is usually much faster.
fst-infl3 is also similar to fst-infl but needs a transducer in lowmem
format (see the man pages for fst-compiler and fst-lowmem). fst-infl3
accesses the transducer on disc rather than reading it into memory. It
starts very fast and needs very little memory, but is slower than
fst-infl2.
fst-infl reads the transducer which is stored in the argument file.
Then it reads the input file line by line. Each line is analysed with
the transducer and all resulting analyses are printed (see also the man
pages for fst-mor).
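
When one of the analysers is driven from a script, the argument order given in the SYNOPSIS can be wrapped as in the following Python sketch. The helper names build_command and analyse are hypothetical, and so are the file names used in the comments; the sketch assumes the chosen program is installed and on the PATH, and it makes no assumption about the format of the output, which is returned unparsed.

```python
import subprocess


def build_command(transducer, input_file=None, output_file=None,
                  options=(), program="fst-infl"):
    """Build an argument list: program [ options ] file [ input [ output ] ].

    An output file can only be given together with an input file, because
    the arguments are positional.
    """
    if output_file is not None and input_file is None:
        raise ValueError("an output file requires an input file")
    command = [program, *options, transducer]
    if input_file is not None:
        command.append(input_file)
        if output_file is not None:
            command.append(output_file)
    return command


def analyse(words, transducer, program="fst-infl"):
    """Send one word per line on stdin and return the raw text from stdout.

    With no input or output file on the command line, the analyser reads
    stdin and writes to stdout. -q suppresses the status messages.
    """
    result = subprocess.run(
        build_command(transducer, options=["-q"], program=program),
        input="\n".join(words) + "\n",
        capture_output=True,
        text=True,
        check=True,
    )
    return result.stdout


if __name__ == "__main__":
    # "morph.ca", "in.txt" and "out.txt" are placeholder file names.
    print(build_command("morph.ca", "in.txt", "out.txt",
                        options=["-d"], program="fst-infl2"))
```

Passing the arguments as a list rather than as a single shell string avoids quoting problems when file names contain spaces.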
BUGS
No bugs are known so far.
SEE ALSO
fst-compiler, fst-compact, fst-lowmem, fst-train, fst-mor
AUTHOR
Helmut Schmid, Institute for Computational Linguistics, University of
Stuttgart, Email: schmid@ims.uni-stuttgart.de. This software is
available under the GNU Public License.
November 2004 fst-infl(1)