

       estwaver - command line interface of web crawler


       estwaver init rootdir

       estwaver crawl [-restart|-revisit|-revcont] rootdir

       estwaver unittest rootdir

       estwaver fetch [-proxy host port] [-tout num] [-il lang] url


       estwaver is a collection of sub commands.  The name of a sub command
       is specified by the first argument.  Other arguments are parsed
       according to each sub command.  The argument rootdir specifies the
       crawler root directory, which contains the configuration file and
       related files.

       estwaver init rootdir
              Create the crawler root directory.

       estwaver crawl [-restart|-revisit|-revcont] rootdir
              Start crawling.
              If -restart is specified, crawling is restarted from the seed
              documents.
              If -revisit is specified, collected documents are revisited.
              If -revcont is specified, collected documents are revisited and
              then crawling is continued.
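
       The synopsis allows at most one of the three crawl options.  The
       following is a minimal sketch, not part of estwaver itself, of a
       wrapper that enforces this before invoking the command.  The helper
       names crawl_argv and run_crawl are hypothetical, and the sketch
       assumes that estwaver is on the PATH.

```python
import subprocess

# The three crawl options listed in the synopsis; at most one may be given.
CRAWL_MODES = ("-restart", "-revisit", "-revcont")


def crawl_argv(rootdir, mode=None):
    """Build the argument vector for `estwaver crawl` (hypothetical helper)."""
    if mode is not None and mode not in CRAWL_MODES:
        raise ValueError(f"unknown crawl option: {mode!r}")
    argv = ["estwaver", "crawl"]
    if mode is not None:
        argv.append(mode)
    argv.append(rootdir)
    return argv


def run_crawl(rootdir, mode=None):
    """Run the crawler and return its exit status.

    Assumes the estwaver binary is installed and on the PATH.
    """
    return subprocess.run(crawl_argv(rootdir, mode)).returncode
```

       For example, run_crawl("/path/to/rootdir", "-revcont") would revisit
       the collected documents and then continue crawling.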

       estwaver unittest rootdir
              Perform unit tests.

       estwaver fetch [-proxy host port] [-tout num] [-il lang] url
              Fetch a document.
              url specifies the URL of a document.
              -proxy specifies the host name and the port number of the proxy
              server.
              -tout specifies timeout in seconds.
              -il  specifies  the  preferred  language.   By  default,  it  is

       All sub commands return 0 if the operation succeeds, and 1 otherwise.
       A running crawler closes the database and exits when it catches
       signal 1 (SIGHUP), 2 (SIGINT), 3 (SIGQUIT), or 15 (SIGTERM).
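
       The exit status and signal behavior described above can be used
       from a supervising script.  The following is a minimal sketch that
       assumes a POSIX system.  The names stop_gracefully and succeeded
       are hypothetical, and the 60 second timeout is an arbitrary
       assumption, not a value taken from estwaver.

```python
import signal

# Signals that the page says a running crawler catches in order to close
# its database: 1 (SIGHUP), 2 (SIGINT), 3 (SIGQUIT), 15 (SIGTERM).
GRACEFUL_SIGNALS = (
    signal.SIGHUP,
    signal.SIGINT,
    signal.SIGQUIT,
    signal.SIGTERM,
)


def stop_gracefully(proc, sig=signal.SIGTERM, timeout=60):
    """Send a documented signal to a subprocess.Popen object and wait.

    Returns the exit status of the process.  The timeout is an assumed
    value; a large database may need longer to close.
    """
    if sig not in GRACEFUL_SIGNALS:
        raise ValueError(f"not a documented shutdown signal: {sig!r}")
    proc.send_signal(sig)
    return proc.wait(timeout=timeout)


def succeeded(returncode):
    """Interpret an exit status: 0 means success, anything else failure."""
    return returncode == 0
```

       Here proc would be a subprocess.Popen object started with the
       arguments of estwaver crawl.  Avoiding SIGKILL matters because it
       cannot be caught, so the crawler would have no chance to close the
       database.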

       When crawling finishes, the crawler root directory contains a
       directory named _index.  It is an index that can be used by estcmd
       and other commands.


       estconfig(1),   estcmd(1),   estmaster(1),   estcall(1),   estraier(3),