NAME

     detoxrc - configuration file for detox(1)

OVERVIEW

     detox allows for configuration of its sequences through config files.
     This document describes how these files work.

IMPORTANT

     When setting up a new set of rules, the safe and wipeup filters must
     always be run after a translating filter (or series thereof), such as the
     utf_8 or the uncgi filters.  Otherwise, illegal characters may be
     introduced into the filename.

SYNTAX

     The format of this configuration file is C-like.  It is loosely based on
     named’s configuration files.  Each statement is semicolon terminated, and
     modifiers on a particular statement are generally contained within
     braces.

     sequence "name" {...};
         Defines a sequence of filters to run a filename through.  "name"
         specifies how the user will refer to the particular sequence during
         runtime.  Quotes around the sequence name are generally optional, but
         should be used if the sequence name does not start with a letter.

         There is a special sequence, named "default", which is the default
         sequence used by detox.  This can be overridden through the command
         line option -s or the environment variable DETOX_SEQUENCE.

         Sequence names are case sensitive and unique across all
         configuration files; that is, if a system wide file defines
         normal_seq and a user has a sequence with the same name in their
         .detoxrc, the user’s normal_seq will take precedence.
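
         As a rough illustration of this precedence rule, the following
         Python sketch models each configuration file as a dictionary keyed
         by sequence name.  The dictionaries and the merge_sequences helper
         are hypothetical and only mirror the behaviour described above; they
         do not reflect how detox itself parses its files.

```python
# Hypothetical model: each config file maps a sequence name to its filters.
system_wide = {
    "default": ["safe", "wipeup"],
    "normal_seq": ["iso8859_1", "safe", "wipeup"],
}
user = {
    "normal_seq": ["utf_8", "safe", "wipeup"],
}


def merge_sequences(system_wide, user):
    """Return one table of sequences in which user entries win on a name clash."""
    merged = dict(system_wide)  # start from the system wide definitions
    merged.update(user)         # same-named user sequences replace them
    return merged


sequences = merge_sequences(system_wide, user)
```

         Because the names are case sensitive, a lookup of "Normal_Seq" in
         this model would not find the "normal_seq" entry.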

     iso8859_1 {filename "/path/to/filename";};
         This translates ISO 8859-1 (aka Latin-1) characters into lower ASCII
         equivalents.  The output is not necessarily safe, and should also be
         run through the safe filter.

         Under normal circumstances, the filename syntax is not needed.  Detox
         looks in several locations for a file called iso8859_1.tbl, which is
         a set of rules defining how an ISO 8859-1 character should be
         translated.

         In the event this table doesn’t exist, you have two options.  You can
         download or create your own and tell detox its location using the
         filename syntax shown above, or you can let detox fall back on its
         internal tables.  The internal tables produce the same translations
         as the stock translation tables.

         You can chain together multiple iso8859_1 translations, as long as
         the default value of all but the last one is set to nothing.  This is
         explained in detox.tbl(5).

         This filter is mutually exclusive with the utf_8 filter.

     utf_8 {filename "/path/to/filename";};
         This translates Unicode characters, encoded as UTF-8, into safe
         equivalents.

         This operates in a manner similar to iso8859_1, except it looks for a
         translation table called unicode.tbl.

         The default internal translation for Unicode characters only contains
         the lower 256 characters of Unicode, which is equivalent to the set
         of Basic Latin and Latin-1 characters.

     uncgi;
         This translates CGI escaped strings into their ASCII equivalents.  The
         output of this is not necessarily safe, and could contain ISO 8859-1
         characters or potentially UTF-8 characters.
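
         The following Python sketch shows why the output of such a decoding
         step cannot be assumed safe.  It is an approximation built on the
         standard urllib.parse module, not detox's own code, and it assumes
         the usual CGI convention that "+" stands for a space and "%XX" for a
         single byte.  The uncgi function name is illustrative.

```python
from urllib.parse import unquote_to_bytes


def uncgi(name: str) -> bytes:
    """Decode a CGI escaped name into raw bytes (assumed convention)."""
    # "+" becomes a space, then every %XX escape becomes the byte 0xXX.
    return unquote_to_bytes(name.replace("+", " "))


plain = uncgi("my+file%20name.txt")   # only ASCII, but contains spaces
latin1 = uncgi("caf%E9.txt")          # a single ISO 8859-1 byte
utf8 = uncgi("caf%C3%A9.txt")         # a two byte UTF-8 sequence
```

         The decoded bytes still contain spaces and non-ASCII characters,
         which is why a translating filter and then safe and wipeup should
         follow.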

     safe {filename "/path/to/filename";};
         This could also be called "safe for UNIX-like operating systems".  It
         translates characters that are difficult to work with in UNIX
         environments into characters that are not.

         In earlier versions this filter was entirely internal.  Starting with
         1.2.0, this filter is controlled by a translation table.  In the
         absence of the translation table, the previous code will be employed
         for the translation.  Also, prior to 1.2.0, the safe filter removed
         leading dashes to prevent the hassle of dealing with a filename in
         the format -filename.  This functionality is exclusively handled by
         the wipeup filter now.

         See the SAFE section for more details on what this filter translates
         by default.

     wipeup {remove_trailing;};
         This wipes up any excessive characters.  For instance, multiple
         underscores or dashes will be converted into a single underscore or
         dash.  Any mixed series of dashes and underscores (e.g. "_-_") will
         be converted into a single dash.

         The remove_trailing option removes a dash or underscore that
         immediately precedes or follows a period.

         See the WIPEUP section for more details on what this filter
         translates.

     max_length {length value;};
         This trims a filename down to the length specified (or less).  It is
         conscious of extensions and attempts to preserve anything following
         the last period in a filename.

         For instance, given a max length of 12, and a filename of
         "this_is_my_file.txt", the filter would output "this_is_.txt".

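         The extension-preserving truncation performed by max_length can be
         sketched in Python as follows.  This is a minimal sketch, not
         detox's implementation: the function name and the handling of edge
         cases (no extension, a leading dot, an extension longer than the
         limit) are assumptions.

```python
def max_length(filename: str, length: int) -> str:
    """Trim filename to at most `length` characters, keeping the extension."""
    if len(filename) <= length:
        return filename
    dot = filename.rfind(".")
    if dot <= 0:
        # Assumption: with no extension (or only a leading dot), cut plainly.
        return filename[:length]
    ext = filename[dot:]  # everything from the last period onwards
    if len(ext) >= length:
        # Assumption: an extension that cannot fit is not preserved.
        return filename[:length]
    # Shorten the part before the extension so the total fits.
    return filename[:length - len(ext)] + ext


trimmed = max_length("this_is_my_file.txt", 12)
```

         With a limit of 12 the extension ".txt" takes four characters,
         leaving eight for the rest of the name.
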
     lower;
         This translates uppercase characters into lowercase characters.

     # Comments
         Anything after a # on any line is ignored.

EXAMPLE

     sequence default {
       uncgi;
       iso8859_1 {
         filename "iso8859_1.tbl";
       };
     # utf_8 {
     #   filename "unicode.tbl";
     # };
       safe {
         filename "safe.tbl";
       };
       wipeup {
         remove_trailing;
       };
     # max_length {
     #   length 128;
     # };
     };

SAFE

     The following characters are translated by the stock safe filter.  They
     can be tuned by updating safe.tbl or creating a copy of safe.tbl and
     updating your rc file.

   Rules that apply anywhere in the filename:
           Safe       Original
           _and_      &
           _          space ` ! @ $ * \ | : ; " ' < > ? /
           -          ( ) [ ] { }
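
     The stock rules above can be modelled as a simple character lookup.
     The Python sketch below is illustrative only: the SAFE_TABLE name and
     the safe function are hypothetical, and it assumes the two quote
     characters in the table are the ASCII backtick and apostrophe.

```python
# Character translations taken from the table above.
SAFE_TABLE = {"&": "_and_"}
SAFE_TABLE.update({ch: "_" for ch in " `!@$*\\|:;\"'<>?/"})
SAFE_TABLE.update({ch: "-" for ch in "()[]{}"})


def safe(name: str) -> str:
    """Replace each listed character; leave all other characters untouched."""
    return "".join(SAFE_TABLE.get(ch, ch) for ch in name)


result = safe("Tom & Jerry (1990).avi")
```

     The result still contains doubled underscores and a dash next to the
     period, which is the kind of residue the wipeup filter is meant to
     tidy.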

WIPEUP

     The following characters are translated by the wipeup filter.

   Rules that apply anywhere in the filename:
           Wipeup    Original
           -         -_
           -         _-
           -         --
           _         __

   Rules that apply only at the beginning of a filename:
     Any leading dashes, underscores and hash marks are removed.  Stripping
     leading dashes prevents programs from interpreting these files as
     command line options.

           Wipeup     Original
           removed    - _ #

   Rules that apply when remove_trailing is enabled:
           Wipeup    Original
           .         .-
           .         -.
           .         ._
           .         _.
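
     The three groups of rules above can be approximated with regular
     expressions.  The Python sketch below is a hedged illustration rather
     than detox's actual algorithm: the wipeup function name, the use of the
     re module and the order in which the rules are applied are all
     assumptions.

```python
import re


def wipeup(name: str, remove_trailing: bool = False) -> str:
    """Approximate the wipeup rules from the tables above."""
    # Anywhere: a run containing a dash collapses to "-", a run of only
    # underscores collapses to "_".
    name = re.sub(r"[-_]{2,}",
                  lambda m: "-" if "-" in m.group() else "_",
                  name)
    if remove_trailing:
        # Dashes and underscores touching a period are dropped.
        name = re.sub(r"[-_]*\.[-_]*", ".", name)
    # Beginning of the filename: leading "-", "_" and "#" are removed.
    return re.sub(r"^[-_#]+", "", name)


cleaned = wipeup("Tom__and__Jerry_-1990-.avi", remove_trailing=True)
```

     Without remove_trailing the dash before ".avi" in this example would
     be kept.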

SEE ALSO

     detox(1), detox.tbl(5).

AUTHORS

     detox was written by Doug Harple.