NAME

       ra-retrieve   -  retrieve  files  that  match  a  query  for  use  with
       remembrance agent software

SYNOPSIS

       ra-retrieve [--version] [-v] [-d] <base-dir> [--docnum <docnum>]

DESCRIPTION

       ra-index  and  ra-retrieve  make  up  the  Savant  search  engine,   an
       information retrieval engine designed as a back-end for the Remembrance
       Agent (RA).  Given a collection of the user’s accumulated email, usenet
       news  articles,  papers,  saved HTML files and other text notes, the RA
       attempts to find those documents which are most relevant to the  user’s
       current  context.  That is, it searches this collection of text for the
       documents which bear the highest word-for-word similarity to  the  text
       the  user  is  currently  editing, in the hope that they will also bear
       high conceptual similarity and thus be useful  to  the  user’s  current
       work.   With  the  Emacs  front-end, these suggestions are continuously
       displayed in a small buffer at the bottom of the user’s window.   If  a
       suggestion  looks  useful, the full text can be retrieved with a single
       command.

       The  Remembrance  Agent  works  in  two  stages.   First,  the   user’s
       collection  of  text  documents  is  indexed into a database saved in a
       vector format.  After the database is created, the other stage  of  the
       Remembrance  Agent  is  run  from  emacs, where it periodically takes a
       sample of text from the working buffer and finds those  documents  from
       the  collection that are most similar.  It summarizes the top documents
       in a small emacs window and allows you to retrieve the entire  text  of
       any one with a keystroke.  See the README file for information on using
       the Emacs front-end.

       The RA is primarily designed as a proactive information  provider  that
       continually  gives  you  information  that  might  be  relevant to your
       current environment.  In this mode, ra-retrieve is run by a  front  end
       and  its  output is parsed into a more human-readable format.  However,
       Savant can also be used as a standard text  and  information  retrieval
       search engine.

   USAGE
       The one required argument to ra-retrieve is <base-dir>, the directory
       containing the index files created by ra-index.  This starts an
       interactive process that handles queries and returns the documents
       that best match each query.  When run with the -v option, a menu and
       other information are printed.  Without the -v option it is assumed
       that ra-retrieve has been run from a front-end, and only minimal
       information is printed.  The following commands are available:

       query <num-lines>
              Find the <num-lines> documents most relevant to a query.  The
              default is 5.  Enter the text of the query, followed by a ^D
              (ASCII 04).  If the query matches a predefined template, its
              fields are parsed separately.  For example, in emacs mail mode
              the from, subject, date and body of the message are all parsed
              individually.  If no template matches, the query is assumed to
              be plain text.

              A query outputs up to <num-lines> summary lines, followed by a
              period on a line by itself.  Each summary line contains a line
              number,  relevance  number,  document  number,  and  a series of
              fields describing the document.  The final  field  is  a  comma-
              separated  list of words from the query that most contributed to
              this document being chosen.

       retrieve <document number>
              Retrieve and print the document with the given document  number.
              The document number is the third field output by the query
              command.  The full text of the document is displayed.  If the
              document is part of a larger file, such as an email in an
              archive file, only that one document is shown.

       loc-retrieve <document number>
              Retrieve and print the location of the document with  the  given
              document number.  Three values are displayed, each on its own
              line.  The first is the character offset of the beginning of
              the document.  The second is the character offset of the end
              of the document.  The third is the fully expanded filename of
              the file containing the document.  This is primarily so that
              front-ends can load the document and display it with their
              own formatting.

       info   Display version and database info.

       quit   Quit.

       ?      Display menu.
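
       As a rough illustration of the commands above, the following Python
       sketch frames a query request and unpacks the replies described for
       query and loc-retrieve.  It is not part of the RA distribution.  All
       helper names are hypothetical, and several details are assumptions
       rather than documented behaviour: that the command sits on its own
       line before the query text, that offsets count bytes, and that the
       end offset is exclusive.

```python
def build_query(text, num_lines=5):
    """Frame one query request for ra-retrieve's standard input.

    Assumption: the command is on its own line, followed by the query
    text and a single ^D (ASCII 04) byte that ends the query.
    """
    return f"query {num_lines}\n{text}".encode() + b"\x04"


def read_summaries(lines):
    """Collect summary lines up to the period on a line by itself."""
    summaries = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == ".":
            break
        summaries.append(line)
    return summaries


def parse_location(reply):
    """Split a loc-retrieve reply into (start, end, filename)."""
    start, end, filename = reply.splitlines()[:3]
    return int(start), int(end), filename


def load_document(start, end, filename):
    """Read one document out of its containing file.

    Assumptions: offsets count bytes, and `end` is exclusive.
    """
    with open(filename, "rb") as handle:
        handle.seek(start)
        return handle.read(end - start)
```

       The summary lines are returned unsplit because the field separator
       is not specified here; a front-end would need to check the actual
       output before parsing individual fields.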

   OPTIONS
       -v     Verbose mode.  Print a menu and other info for running without a
              front-end.

       -d     Debug mode.  Print not-so-useful information.

       --docnum <docnum>
              Print the contents of the specified document  number  and  exit.
              This  option doesn’t use as much memory as interactive mode, and
              is useful for scripts that call this program.
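
       A script might wrap the --docnum option roughly as follows.  This is
       a minimal sketch, not code from the RA distribution: the function
       names are hypothetical, and it assumes that ra-retrieve is on the
       PATH, prints the document on standard output, and exits with a
       non-zero status on failure.

```python
import subprocess


def docnum_command(base_dir, docnum, program="ra-retrieve"):
    """Build the argument list, following the SYNOPSIS ordering."""
    return [program, base_dir, "--docnum", str(docnum)]


def fetch_document(base_dir, docnum):
    """Run ra-retrieve once and return whatever it prints, as bytes.

    Assumptions: ra-retrieve is on PATH and writes the document to
    standard output.
    """
    result = subprocess.run(
        docnum_command(base_dir, docnum),
        capture_output=True,
        check=True,  # raise CalledProcessError on a non-zero exit status
    )
    return result.stdout
```

       The output is kept as bytes because the encoding of indexed
       documents is not stated here.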

SEE ALSO

       ra-index(1)

AUTHOR

       Bradley Rhodes, MIT Media Lab.  Please send comments and  questions  to
       ra-bugs@media.mit.edu.   New  versions  and  updates  can  be  found at
       http://www.media.mit.edu/~rhodes/RA/

COPYRIGHT

       All code included in versions up to and including 2.09:
          Copyright (C) 2001 Massachusetts Institute of Technology.

       All modifications subsequent to  version  2.09  are  copyright  Bradley
       Rhodes or their respective authors.

       Developed  by  Bradley  Rhodes at the Media Laboratory, MIT, Cambridge,
       Massachusetts, with support from British Telecom and Merrill Lynch.

       This program is free software; you can redistribute it and/or modify it
       under  the  terms of the GNU General Public License as published by the
       Free Software Foundation; either version 2 of the License, or (at  your
       option) any later version.  For commercial licensing under other terms,
       please consult the MIT Technology Licensing Office.

       This program may be subject to the following US and/or foreign  patents
       (pending):  "Method  and  Apparatus  for  Automated,  Context-Dependent
       Retrieval of Information," MIT Case No. 7870TS. If any of these patents
       are  granted,  royalty-free license to use this and derivative programs
       under the GNU General Public License are hereby granted.

       This program is distributed in the hope that it  will  be  useful,  but
       WITHOUT   ANY   WARRANTY;   without   even   the  implied  warranty  of
       MERCHANTABILITY or FITNESS FOR  A  PARTICULAR  PURPOSE.   See  the  GNU
       General Public License for more details.

       You should have received a copy of the GNU General Public License along
       with this program; if not, write to the Free Software Foundation, Inc.,
       59 Temple Place - Suite 330, Boston, MA 02111-1307, USA.

BUGS

       Dates are not currently indexed, so a query on a date returns no
       suggestions.

       Requires GNU make to compile.

       The template structure isn’t documented.