hspell - Hebrew spellchecker

NAME

       hspell - Hebrew spellchecker

SYNOPSIS

       hspell [ -acDhHilnsvV ] [file...]

DESCRIPTION

       hspell  tries  to  find  incorrectly  spelled Hebrew words in its input
       files.

       Like the traditional Unix spell(1), hspell outputs the sorted  list  of
       incorrect  words, and does not (yet) have a more friendly interface for
       making corrections  for  you.  However,  unlike  spell(1),  hspell  can
       suggest   possible   corrections   for  some  spelling  errors  -  such
       suggestions are enabled with the -c (correct) and -n (notes) options.

       Hspell currently expects  ISO-8859-8-encoded  input  files.  Non-Hebrew
       characters   in   the  input  files  are  ignored,  allowing  the  easy
       spellchecking of Hebrew-English texts, as well as HTML  or  TeX  files.
       If files using a different encoding (e.g., UTF-8) are to be checked,
       they must first be converted to ISO-8859-8 (e.g., with iconv(1) or
       recode(1)).
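
       As a minimal sketch of such a conversion, the helper below converts a
       UTF-8 file to ISO-8859-8, checks it, and converts the reported words
       back to UTF-8. The name check_utf8 is only illustrative and is not
       part of hspell; the sketch assumes an iconv(1) that knows both
       encoding names.

```shell
# check_utf8 is a hypothetical helper, not something hspell provides.
# It converts the UTF-8 file $1 to ISO-8859-8, runs hspell on the result,
# and converts the words hspell reports back to UTF-8.
check_utf8() {
    iconv -f UTF-8 -t ISO-8859-8 "$1" | hspell | iconv -f ISO-8859-8 -t UTF-8
}
```

       It could then be used as:

              check_utf8 filename | less

       Characters that have no ISO-8859-8 equivalent will typically make
       iconv(1) stop with an error, so such input may need cleaning first.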

       The  output  will also be in ISO-8859-8 encoding, in so-called "logical
       order", so it is normally useful to pipe it to bidiv(1) before viewing,
       as in:

              hspell -c filename | bidiv | less

       If no input file is given, hspell reads from its standard input.

OPTIONS

       -v     If  the -v option is given, hspell prints emacs-oriented version
              information and exits.

       -vv    Repetition of the -v option causes  hspell  to  also  show  some
              information  on  which optional features were enabled at compile
              time.

       -V     With the  -V  option,  hspell  prints  true  and  human-oriented
              version information and exits.

       -c     If  the  -c option is given, hspell will suggest corrections for
              misspelled words, whenever it can  find  such  corrections.  The
              correction  mechanism  in  this  release  is  especially good at
              finding corrections for incorrect  niqqud-less  spellings,  with
              missing or extra ’immot-qri’a.

       -n     The  -n  option  will  give  some  longer  "notes" about certain
              spelling errors, explaining why these are indeed errors  (or  in
              what cases using this word is in fact correct). It is recommended
              to combine the two options, as -cn, for maximal correction help
              from hspell.

       -l     The  -l  (linguistic  information)  option will explain for each
              correct word why it was recognized (show the basic  noun,  verb,
              etc.,  that  this  inflection relates to, and its tense, gender,
              associated Kinnuy, or other relevant information).

              If Hspell was built without morphological analysis support, this
              option  will only show the correct splits of the given word into
              prefix + word, as the full information incurs a 4-fold  increase
              in the installation size.

              Giving  the  -c  option  in  addition  to  -l results in special
              behavior. In that case hspell suggests  "corrections"  to  every
              word (regardless of whether it is in the dictionary or not), and
              shows the linguistic information on all those words. This can be
              useful  for a reader application, which may also want to be able
              to understand misspellings and their possible meanings.

       -s     Normally, the  words  deemed  spelling  mistakes  are  shown  in
              alphabetical  order.   The  -s  option  orders them by severity,
              i.e., the errors that most frequently appear in the document are
              shown first. This option is most useful for people who are helping
              to build hspell’s word list and are looking for common correct
              words that hspell does not know yet.

       -a     With  the  -a  option,  hspell  tries  to  emulate (as little as
              possible of) ispell’s pipe interface. This allows LyX, Emacs,
              Geresh and KDE to use hspell as an external spell-checker.
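
              As a rough sketch of what a client of this interface does, the
              helper below extracts the rejected words from the replies. It
              assumes that hspell’s replies follow ispell’s usual conventions:
              a line starting with "*" for an accepted word, and a line
              starting with "&", "?" or "#" followed by the word for a
              rejected one. The name list_misspelled is hypothetical.

```shell
# list_misspelled is a hypothetical helper, not part of hspell.
# It reads ispell-style "-a" replies on standard input and prints the
# word (second field) of every reply whose first field is "&", "?" or "#",
# the markers ispell uses for words it does not accept.
list_misspelled() {
    awk '$1 == "&" || $1 == "?" || $1 == "#" { print $2 }'
}
```

              It could then be used as:

                     hspell -a < filename | list_misspelled | bidiv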

       -i     This  option  only has any effect when used together with the -a
              option. Normally, hspell -a only checks the spelling  of  Hebrew
              words. If the given file also contains non-Hebrew words (such as
              English words), these are simply ignored. Adding the  -i  option
              tells  hspell  to  pass  the  non-Hebrew words to ispell(1), and
              return its answer as an answer from hspell. This makes it
              convenient to spell-check mixed Hebrew-English documents.

              Running  hspell  with the program name hspell-i also enables the
              -i option. This is a useful trick when  an  application  expects
              just  the  name  of  a spell-checking program, and adds only the
              "-a" option (without giving the  user  an  option  to  also  add
              "-i").  The  multispell  script  supplied  with  hspell serves a
              similar purpose, with more  control  over  encodings  and  which
              spell-checker to run for non-Hebrew words.
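
              One way to make hspell available under that name is a symbolic
              link, sketched below. The helper name install_hspell_i and the
              choice of directory are illustrative, and the sketch assumes
              that a symbolic link is enough for hspell to see the program
              name hspell-i.

```shell
# install_hspell_i is a hypothetical helper, not part of hspell.
# It creates a symbolic link named "hspell-i" in directory $2 that
# points at the hspell executable given as $1.
install_hspell_i() {
    ln -s "$1" "$2/hspell-i"
}
```

              It could then be used as:

                     install_hspell_i "$(command -v hspell)" "$HOME/bin"

              after which an application can be told to run hspell-i as its
              spell-checking program.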

       -H     By  default, Hspell does not allow the He Ha-sh’ela prefix. This
              is because this prefix is not normally used  in  modern  Hebrew,
              and  generates many false-negatives (errors, like He followed by
              a possessed noun, are thought to  be  correct).  The  -H  option
              nevertheless tells Hspell to allow this prefix.

       -D base
              Load  the  word  lists from the given base pathname, rather than
              from the compiled-in default  path.  This  is  mostly  used  for
              testing  Hspell, when the dictionaries have been compiled in the
              current directory and hspell is run as "hspell -Dhebrew.wgz".

       -d, -B, -m, -T, -C, -S, -P, -p, -w, and -W
              These options are passed to hspell by LyX or other applications,
              and are cordially ignored.

SPELLING STANDARD

       Hspell was designed to be fully and strictly compliant with the official
       niqqud-less spelling rules ("Ha-ktiv  Khasar  Ha-niqqud",  colloquially
       known  as "Ktiv Male") published by the Academy of the Hebrew Language.

       This is both  an  advantage  and  a  disadvantage,  depending  on  your
       viewpoint.   It’s  an  advantage  because  it  encourages a correct and
       consistent  spelling  style  throughout   your   writing.   It   is   a
       disadvantage,  because  a  few  of  the  Academia’s  official  spelling
       decisions are relatively unknown to the general public.

       Users  of  Hspell  (and  all  Hebrew  writers,  for  that  matter)  are
       encouraged  to  read the Academia’s official niqqud-less spelling rules
       (which are printed at the end of most modern Hebrew dictionaries; an
       abridged version is available at
       http://hebrew-academy.huji.ac.il/decisio.html). Users are also
       encouraged  to  refer  to Hebrew dictionaries which use the niqqud-less
       spelling (such as Millon Ha-hove, Rav Milim, and the new Even Shoshan).

       Hspell’s distribution (and Web site) also includes a document,
       niqqudless.odt, which explains Hspell’s spelling standard in detail (in
       Hebrew).  It  explains  both  the  overall principles, and why specific
       words are spelled the way they are.

       Future releases  might  include  an  option  for  alternative  spelling
       standards.

BEHIND THE SCENES

       The  hspell  program  itself is mostly a simple (but efficient) program
       that checks input words against a long list of valid  words.  The  real
       "brains"  behind  it  are  the  word lists (dictionary) provided by the
       Hspell project.

       In order for this dictionary to be completely free  of  other  people’s
       copyright   restrictions,   the   Hspell   project   is   a  clean-room
       implementation, not based on pre-existing word lists or spell checkers,
       or on copying of printed dictionaries.

       The  word  list  is  also  not based on automatic scanning of available
       Hebrew documents (such as online newspapers), because there is  no  way
       to   guarantee   that   such  a  list  will  be  correct  (not  contain
       misspellings, useless proper  names,  and  so  on),  complete  (certain
       inflections  might  not  appear  in  the chosen samples), or consistent
       (especially when it comes to niqqud-less spelling rules).

       Instead, our idea was to write programs which  know  how  to  correctly
       inflect  Hebrew  nouns  and  conjugate Hebrew verbs. The input to these
       programs is a list of noun stems and verb roots, plus hints needed  for
       the  correct inflection when these cannot be figured out automatically.
       Most of the effort that went into the Hspell project went into building
       these input files. Then, "word list generators" (written in Perl, and
       also part of the Hspell project) create the complete inflected word
       list  that  will  be  used  by the spellchecking program, hspell.  This
       generation process is only done once, when building hspell from source.
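
       The general shape of this approach (expand a short list of stems into
       every accepted form once, then check words by simple lookup) can be
       shown with a toy sketch. It is not how Hspell’s Perl generators work
       internally: the function names are hypothetical, the suffixes are
       Latin placeholders, and real Hebrew inflection needs far more rules
       and hints.

```shell
# Toy illustration only. The suffix list below is a made-up placeholder,
# not real linguistic data.
# generate_word_list reads one stem per line on standard input and prints
# every stem+suffix combination once, in sorted order.
generate_word_list() {
    while read -r stem; do
        for suffix in "" "im" "ot"; do
            printf '%s%s\n' "$stem" "$suffix"
        done
    done | sort -u
}

# is_known succeeds when the word $1 appears as a whole line in the
# generated list file $2, which is all the checker itself has to do.
is_known() {
    grep -Fxq -- "$1" "$2"
}
```

       It could then be used as:

              generate_word_list < stems > words
              is_known someword words && echo accepted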

       These  lists,  before and after inflection, may be useful for much more
       than spellchecking. Morphological analysis (which hspell provides  with
       the -l option) is one example. For more ideas, see the Hspell project’s
       Web site, at http://ivrix.org.il/projects/spell-checker.

FILES

       ~/.hspell_words, ./hspell_words
              These files, if they exist, should  contain  a  list  of  Hebrew
              words that hspell will also accept as correct words.

              Note that only these exact words are added: they are not
              inflected, and prefixes are not automatically allowed.
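
              A word could be appended to such a file as sketched below. The
              helper name add_personal_word is hypothetical, and the sketch
              assumes that these files use the same ISO-8859-8 encoding as
              hspell’s input, which is not stated here.

```shell
# add_personal_word is a hypothetical helper, not part of hspell.
# It appends the UTF-8 word $1, converted to ISO-8859-8, as one line of
# the word-list file $2. The ISO-8859-8 encoding of that file is an
# assumption made for this sketch.
add_personal_word() {
    printf '%s\n' "$1" | iconv -f UTF-8 -t ISO-8859-8 >> "$2"
}
```

              It could then be used as:

                     add_personal_word someword "$HOME/.hspell_words"

              Because the words are taken exactly as written, each inflected
              or prefixed form that should be accepted needs its own line.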

       /usr/local/share/hspell/*
              The standard Hebrew word lists used by hspell.

EXIT STATUS

       Currently always 0.
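
       A script therefore cannot use the exit status to tell whether
       misspellings were found. One possible workaround, sketched below, is
       to test whether hspell printed anything. The helper name
       has_spelling_errors is hypothetical, and the sketch assumes that
       hspell prints nothing when it finds no misspelled words.

```shell
# has_spelling_errors is a hypothetical helper, not part of hspell.
# Since hspell's exit status is always 0, it succeeds when hspell's
# output for the given arguments is non-empty, and fails otherwise.
# This assumes hspell prints nothing when every word is accepted.
has_spelling_errors() {
    [ -n "$(hspell "$@")" ]
}
```

       It could then be used as:

              if has_spelling_errors filename; then
                      echo "filename needs another look"
              fi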

VERSION

       The version of hspell described by this manual page  is  1.1  (December
       31, 2009).

COPYRIGHT

       Copyright (C) 2000-2009, Nadav Har’El <nyh@math.technion.ac.il> and Dan
       Kenigsberg <danken@cs.technion.ac.il>.

       Hspell is free software, released under the GNU General Public  License
       (GPL).   Note  that not only the programs in the distribution, but also
       the dictionary files and the generated word lists, are  licensed  under
       the GPL.  There is no warranty of any kind.

       See  the LICENSE file for more information and the exact license terms.

       The latest version of this software can be found at
       http://hspell.ivrix.org.il/

ACKNOWLEDGMENTS

       The hspell utility and the linguistic databases behind it (collectively
       called  "the  Hspell   project")   were   created   by   Nadav   Har’El
       <nyh@math.technion.ac.il>       and       by       Dan       Kenigsberg
       <danken@cs.technion.ac.il>.

       Although we wrote all of Hspell’s code ourselves, we are truly indebted
       to  the  old-style  "open  source"  pioneers  -  people who wrote books
       instead of hiding their knowledge  in  proprietary  software.  For  the
       correct  noun inflections, Dr. Shaul Barkali’s "The Complete Noun Book"
       has been a great help. Prof. Uzzi Ornan’s booklet "Verb Conjugation  in
       Flow  Charts"  has  been  instrumental  in  the  implementation of verb
       conjugation, and Barkali’s "The Complete Verb Book" was used too.

       During  our  work  we  have  extensively  used  a  number   of   Hebrew
       dictionaries,  including Even Shoshan, Millon Ha-hove and Rav-Milim, to
       ensure the correctness of certain words. Various Hebrew newspapers  and
       books,  both  printed  and  online,  were  used for inspiration and for
       finding words we still do not recognize.

       We wish to thank Cilla Tuviana and Dr. Zvi Har’El for their  assistance
       with some grammatical questions.

       Several  other  people helped us in various releases, with suggestions,
       fixes or patches -  they  are  listed  in  the  WHATSNEW  file  in  the
       distribution.

BUGS

       This manual page is in English.

       For  GUI-lovers, hspell’s user interface is an abomination. However, as
       more and more applications learn  to  interface  with  hspell,  and  as
       Hspell’s data becomes available in multi-lingual spellcheckers (such as
       aspell  and  hunspell),  this  will  no  longer  be   an   issue.   See
       http://hspell.ivrix.org.il/  for instructions on how to use Hspell in a
       variety of applications.

       hspell’s being limited to the ISO-8859-8 encoding, and not  recognizing
       UTF-8  or  even  CP1255  (including  niqqud),  is almost an anachronism
       today.
