NAME

       recoll.conf - main personal configuration file for Recoll

DESCRIPTION

       This file defines the indexing configuration for the Recoll full-text
       search system.

       The system-wide configuration file is normally located inside
       /usr/[local]/share/recoll/examples. Any parameter set in the common
       file may be overridden by setting it in the personal configuration
       file, by default: $HOME/.recoll/recoll.conf

       Please note that while we try to keep this manual page reasonably up
       to date, it will frequently lag behind the current state of the
       software. The best source of information about the configuration is
       the comments in the configuration file.

       A short extract of the file might look as follows:

              # Space-separated list of directories to index.
              topdirs =  ~/docs /usr/share/doc

              [~/somedirectory-with-utf8-txt-files]
              defaultcharset = utf-8

       There are three kinds of lines:

              ·      Comment or empty

              ·      Parameter assignment

              ·      Section definition

       Empty lines or lines beginning with # are ignored.

       Assignment lines are in the form ’name = value’.

       Section lines allow redefining a parameter for a directory subtree.
       Some of the parameters used for indexing are looked up
       hierarchically, from the most specific to the least specific. Not all
       parameters can be meaningfully redefined; this is specified for each
       parameter in the next section.

       The tilde character (~) is expanded in file names to the  name  of  the
       user’s home directory.

       Where  values  are  lists,  white  space  is  used  for separation, and
       elements with embedded spaces can be quoted with double-quotes.
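
       As an illustrative sketch (the directory names below are
       hypothetical), a list value mixing plain and quoted elements could be
       written as:

              # The second element contains a space, so it is double-quoted.
              topdirs = ~/docs "/home/me/My Documents" /usr/share/doc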

OPTIONS

       topdirs = directories
              Specifies the list of directories to index (recursively).

       dbdir = directory
              The name of the Xapian database directory. It will be created if
              needed  when  the  database  is  initialized.  If this is not an
              absolute  pathname,  it  will   be   taken   relative   to   the
              configuration directory.

       skippedNames = patterns
              A  space-separated  list  of  patterns  for  names  of  files or
              directories that should be completely ignored. The list  defined
              in the default file is:

              *~ #* bin CVS  Cache caughtspam  tmp

              The list can be redefined for subdirectories, but the change
              only takes effect for the top-level directories listed in
              topdirs.
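
              For instance, assuming that an assignment replaces the list
              rather than appending to it, keeping the default patterns while
              also ignoring object files (the *.o pattern is only an example)
              would mean restating the whole list:

                     skippedNames = *~ #* bin CVS Cache caughtspam tmp *.o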

       skippedPaths = patterns
              A space-separated list of patterns for paths the indexer should
              not descend into. Together with topdirs, this allows pruning the
              indexed tree as desired. daemSkippedPaths can be used to define
              a specific value for the real-time indexing monitor.
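
              A minimal sketch, assuming hypothetical directory names: index
              ~/docs but keep the indexer out of one of its subtrees, and use
              a longer exclusion list for the real-time monitor.

                     topdirs = ~/docs
                     skippedPaths = ~/docs/drafts
                     daemSkippedPaths = ~/docs/drafts ~/docs/incoming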

       followLinks = boolean
              Specifies  if  the  indexer  should  follow symbolic links while
              walking the file tree. The default is to ignore  symbolic  links
              to avoid multiple indexing of linked files. No effort is made to
              avoid duplication when this option is set to true.  This  option
              can be set individually for each of the topdirs members by using
              sections. It cannot be changed below the topdirs level.
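
              For illustration, assuming hypothetical directories and boolean
              values written as 0 and 1, links could be followed under a
              single topdirs member like this:

                     topdirs = ~/docs ~/projects
                     followLinks = 0

                     [~/projects]
                     followLinks = 1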

       loglevel = value
              Verbosity level for recoll and recollindex. A value of  4  lists
              quite  a lot of debug/information messages. 3 lists only errors.
              daemloglevel can be used to specify a different  value  for  the
              real-time indexing daemon.

       logfilename = file
              Where the messages should go. ’stderr’ can be used as a special
              value.  daemlogfilename can be used to specify a different value
              for the real-time indexing daemon.
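
              A possible setup, with a hypothetical log file path, keeps the
              interactive tools at error level on stderr while the real-time
              daemon logs more verbosely to a file:

                     loglevel = 3
                     logfilename = stderr
                     daemloglevel = 4
                     daemlogfilename = /tmp/recollindex-daemon.log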

       indexstemminglanguages = languages
              A  list of languages for which the stem expansion databases will
              be built. See recollindex(1) for possible values.

       defaultcharset = charset
              The name of the character set used for files that do not contain
              a character set definition (i.e. plain text files). This can be
              redefined for any subdirectory.

       maxfsoccuppc = percentnumber
              Maximum file system usage before indexing stops. The value is a
              percentage, corresponding to what the "Capacity" df output
              column shows. The default value is 0, meaning no checking.

       idxflushmb = megabytes
              Threshold (megabytes of new text data) at which the index is
              flushed from memory to disk. Setting this can help control
              memory usage. A value of 0 means no explicit flushing, letting
              Xapian use its own default, which is flushing every 10000
              documents (memory usage then depends on the average document
              size). The default value is 10.
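
              As a sketch with arbitrary example values, the following stops
              indexing once the file system is 90% full (see maxfsoccuppc
              above) and flushes to disk after every 50 megabytes of new
              text:

                     maxfsoccuppc = 90
                     idxflushmb = 50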

       filtersdir = directory
              A directory to search for the external filter  scripts  used  to
              index  some  types  of  files.  The value should not be changed,
              except if you want to modify one of  the  default  scripts.  The
              value can be redefined for any subdirectory.

       iconsdir = directory
              The  name  of  the  directory where recoll result list icons are
              stored. You can change this if you want different images.

       guesscharset = boolean
              Try to guess the character set of files if no internal value is
              available (i.e. for plain text files). This does not work well
              in general, and should probably not be used.

       usesystemfilecommand = boolean
              Decides whether the file -i system command is used as a final
              step for determining the mime type of a file (the main
              procedure uses suffix associations as defined in the mimemap
              file). This can be useful for files with suffixless names, but
              it will also cause the indexing of many bogus "text" files.

       indexedmimetypes = list
              Recoll normally indexes any file which it  knows  how  to  read.
              This  list  lets you restrict the indexed mime types to what you
              specify. If the variable is unspecified or the list  empty  (the
              default), all supported types are processed.
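
              For example, to restrict indexing to HTML and PDF documents
              (the choice of types here is only illustrative):

                     indexedmimetypes = text/html application/pdf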

       compressedfilemaxkbs = value
              Size  limit for compressed (.gz or .bz2) files. These need to be
              decompressed in a temporary directory for identification,  which
              can be very wasteful if ’uninteresting’ big compressed files are
              present.  Negative means no limit, 0 means no processing of  any
              compressed file. Defaults to -1.

       indexallfilenames = boolean
              Recoll indexes file names into a special section of the database
              to allow file name searches using wildcards. This parameter
              decides whether file name indexing is performed only for files
              with mime types that would qualify them for full-text indexing,
              or for all files inside the selected subtrees, independent of
              mime type.

       idxabsmlen = value
              Recoll stores an abstract  for  each  indexed  file  inside  the
              database. The text can come from an actual ’abstract’ section in
              the document or will just be the beginning of the  document.  It
              is  stored  in  the index so that it can be displayed inside the
              result lists without decoding the original file. The  idxabsmlen
              parameter  defines  the size of the stored abstract. The default
              value is 250 bytes.  The search interface gives you  the  choice
              to  display  this  stored  text or a synthetic abstract built by
              extracting text around the search terms. If  you  always  prefer
              the  synthetic  abstract,  you  can reduce this value and save a
              little space.

       aspellLanguage = lang
              Language definitions to use when creating the aspell dictionary.
              The  value must match a set of aspell language definition files.
              You can type "aspell config" to see where  these  are  installed
              (look  for  data-dir). The default if the variable is not set is
              to use your desktop national language environment to  guess  the
              value.

       noaspell = boolean
              If  this is set, the aspell dictionary generation is turned off.
              Useful for cases where you don’t need the functionality or  when
              it   is   unusable  because  aspell  crashes  during  dictionary
              generation.

       nocjk = boolean
              If this is set to true, specific East Asian (Chinese, Japanese,
              Korean) character/word splitting is turned off. This will save
              a small amount of CPU time if you have no CJK documents. If your
              document base does include such text but you are not interested
              in searching it, setting nocjk may be a significant time and
              space saver.

       cjkngramlen = value
              This  lets  you adjust the size of n-grams used for indexing CJK
              text. The default value of 2 is  probably  appropriate  in  most
              cases. A value of 3 would allow more precision and efficiency on
              longer words, but the  index  will  be  approximately  twice  as
              large.
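
              A sketch of the two alternatives (the values are examples, and
              booleans are assumed to be written as 0 and 1):

                     # Keep CJK processing, with longer n-grams.
                     cjkngramlen = 3

                     # Or, instead, turn CJK processing off entirely.
                     # nocjk = 1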

SEE ALSO

       recollindex(1), recoll(1)

                                8 January 2006