NAME
archmbox - a simple email archiver
SYNOPSIS
archmbox [ -h | --version ]
archmbox MODE [ OPTIONS ] -d date mailbox [ mailbox ... ]
archmbox MODE [ OPTIONS ] -o days mailbox [ mailbox ... ]
DESCRIPTION
Archmbox is a simple email archiver written in Perl; it parses one or
more mailboxes, selects some or all messages and then performs specific
actions on the selected messages.
Four different MODES are available:
· list mode, which is useful to list all selected messages before
archmbox performs the real operations (archiving or deleting)
· kill mode, if messages should be deleted from the mailbox(es)
rather than archived
· archive mode, to archive the selected messages in a different
mailbox
· copy mode, to copy selected messages from the source mailbox(es)
without modifying them
Message selection is based on a date criterion: either an absolute date
or an offset in days can be specified.
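The date criterion can be pictured with a short Python sketch. This is not archmbox's own code (archmbox is written in Perl); the names threshold_from_offset and is_selected are hypothetical, and the handling of a message dated exactly at the threshold is an assumption.

```python
from datetime import datetime, timedelta

def threshold_from_offset(days, now):
    """Turn an offset in days (as given to -o) into an absolute threshold."""
    return now - timedelta(days=days)

def is_selected(msg_date, threshold, reverse=False):
    """Select messages older than the threshold, or newer ones with reverse (-r)."""
    if reverse:
        return msg_date > threshold
    return msg_date < threshold

now = datetime(2002, 1, 16, 12, 0, 0)
threshold = threshold_from_offset(15, now)      # 2002-01-01 12:00:00
old = datetime(2001, 12, 25, 9, 30, 0)
recent = datetime(2002, 1, 10, 9, 30, 0)
print(is_selected(old, threshold))              # True
print(is_selected(recent, threshold))           # False
print(is_selected(recent, threshold, True))     # True
```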
It is also possible to refine the selection using Perl regular
expressions on the header fields of the message. Remember to escape
with a backslash the so-called metacharacters, which are reserved for
use in Perl’s regex notation, whenever they should match literally. The
metacharacters are
{}[]()^$.|*+?\
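As an illustration of the escaping rule, the following Python sketch puts a backslash in front of each of the metacharacters listed above. The helper name quote_meta is hypothetical and is not part of archmbox.

```python
# The metacharacters listed above, including the backslash itself.
METACHARACTERS = set(r"{}[]()^$.|*+?" + "\\")

def quote_meta(text):
    """Backslash-escape every metacharacter so that it matches literally."""
    return "".join("\\" + ch if ch in METACHARACTERS else ch for ch in text)

print(quote_meta("[SNORT]"))     # \[SNORT\]
print(quote_meta("host.net"))    # host\.net
```

The escaped strings can then be used inside a regexp passed to -x or -X.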
All archived messages are stored in a new mailbox with the same name as
the original one plus the extension .archived (this is the default, but
it can be changed); the archive mailbox can also be saved in gz or bz2
compressed format.
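The naming scheme can be sketched in Python as follows. The function archive_name is hypothetical; it is modelled only on the file names shown in this manual, and the idea that the compression format is appended as a final suffix is an assumption for the bz2 case.

```python
def archive_name(mailbox_name, extension="archived", compress=None):
    """Build an archive file name from a mailbox name.

    extension: suffix appended after a dot; the value "none" means no suffix.
    compress:  None, "gz" or "bz2" (assumed to become the final suffix).
    """
    name = mailbox_name
    if extension and extension != "none":
        name += "." + extension
    if compress:
        name += "." + compress
    return name

print(archive_name("personal-stuff"))                     # personal-stuff.archived
print(archive_name("Mail-personal-stuff", "01", "gz"))    # Mail-personal-stuff.01.gz
print(archive_name("system-msgs", "none"))                # system-msgs
```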
Please note that the archive mailbox format is always mbox, regardless
of original mailbox format. Moreover, mailboxes must be specified using
the full path.
Messages are appended to the archive mailbox to allow multiple
executions of the script against the same mailbox.
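The effect of appending can be reproduced with Python's standard mailbox module. This is only a sketch of the behavior described above, not archmbox's implementation; append_messages and count_messages are hypothetical helpers and the message contents are made up.

```python
import mailbox
import os
import tempfile

def append_messages(archive_path, raw_messages):
    """Append raw messages to an mbox file, creating the file if needed."""
    box = mailbox.mbox(archive_path)   # opens an existing mbox or creates one
    try:
        for raw in raw_messages:
            box.add(raw)               # new messages go to the end of the file
        box.flush()
    finally:
        box.close()

def count_messages(archive_path):
    """Return the number of messages stored in an mbox file."""
    box = mailbox.mbox(archive_path)
    try:
        return len(box)
    finally:
        box.close()

path = os.path.join(tempfile.mkdtemp(), "inbox.archived")
append_messages(path, ["From: a@example.com\nSubject: first run\n\nhello\n"])
append_messages(path, ["From: b@example.com\nSubject: second run\n\nagain\n"])
print(count_messages(path))   # 2
```

Because the second run adds to the file instead of overwriting it, the archive keeps the messages of both runs.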
MODES
-a, --archive
Selected messages are archived in a different mailbox.
-k, --kill
Selected messages are deleted rather than archived.
-l, --list
List all selected messages.
Warnings about skipped mailboxes (in use, empty ...) are printed
to stderr, so redirecting stderr to /dev/null keeps them from
cluttering your list.
-y, --copy
Selected messages are copied from the source mailbox.
OPTIONS
-b, --backup
Creates a backup of the original mailbox before archmbox
runs. The backup is called mailbox.backup.
--bzip2
Use bzip2 to compress the archive mailbox (use with -c).
-c, --compress
Compress the archive mailbox after script execution.
-d, --date <date>
Specifies the threshold date for messages. The date must be
supplied in the following format: yyyy-mm-dd
-D, --date-header
Force the use of the "Date:" header to age a message. If the
header is corrupt, the date/time information is taken from the
first line of the message instead.
-e, --extension <extension>
Specifies the suffix for the archive mailbox; the default is
archived. If the value none is specified, no suffix will be used
(use carefully).
-f, --full-name
Prepends the path of the mailbox to the name of the archive
mailbox. This option overrides -n.
--format
Specifies the format of the mailboxes to parse. Legal values are
mbox and mbx. Defaults to "mbox".
-h, --help
Prints help.
-i, --ignore <regexp>
Any mailbox/directory matching <regexp> will be skipped while
archiving.
--keep-flagged
Flagged messages will not be archived.
--keep-unread
Unread messages will not be archived.
-m, --minsize <minsize>
Specifies the minimum size of the mailbox to be archived.
Mailboxes smaller than <minsize> will not be parsed for
archiving.
-n, --archive-name
Specifies the name of the archive file (default: mailbox name)
--nosymlink
Do not follow symbolic links when processing mailboxes.
--nowarnings
Suppress mailbox related warnings. Use only if you know what
you’re doing!
--omit-prefix <prefix>
Omit <prefix> from the name of the mailbox when full name
(option -f) is required.
-o, --offset <days>
Specifies the offset (in days) from today for the threshold date
of a message. This option replaces -d. If you specify -1,
archmbox will operate on all messages.
-p, --archive-path, --path <directory>
Specifies where to store the archive mailbox (default: ".").
<directory> must be specified using the full path. The --path
option is now deprecated and will be dropped in future releases.
-r, --reverse
Reverses the sense of the offset or date value. By default the
value means "older than"; with this switch it means "newer than".
-R, --recursive
Act recursively on directories. If one or more directories are
specified on the command line, all mailboxes stored in those
directories will be parsed for archiving. Implies option -f.
-t, --tmpdir <directory>
Specifies a temporary working directory. This value overrides the
default one, which is the first defined of the following, in
order: the environment variable $TMPDIR, the environment
variable $TMP, the compiled-in value and, as a fallback, ’/tmp’.
To see the default value used by archmbox, run: archmbox --help.
<directory> must be specified using the full path.
--time <time>
Use <time> in conjunction with <date> (option -d) to refine the
threshold age for archiving. <time> must be specified in the
following format: hh:mm:ss.
--totals
Prints an overall summary of the archiving operations. The
summary contains the number of parsed and skipped mailboxes, the
total number of messages parsed and saved, the total space used
and saved.
-v, --verbose
Verbosity level of the --list output. The default is 1 (one line
per message), which lists only msgid, sender and subject. With
-v=2, the date is printed as well.
--version
Prints version number.
-x, --regexp <header=regexp>
The rule is specified in the form -x field='regexp', where field
can be any header. The header part is case sensitive. The regexp
part is case sensitive if the regexp contains at least one upper
case letter, and case insensitive otherwise.
If a message satisfies the date range but does not match the
regexp on the specified field, it won’t be archived.
The option can be specified more than once; in this case, the
message is matched against all the given rules and, if it
satisfies any of them, it will be archived.
-X, --Regexp <header=regexp>
Same as -x, --regexp, except that a logical ’and’ is used when
matching: a message must satisfy all the given regexps,
including those given with -x, --regexp.
CONFIGURATION
Archmbox is completely written in perl, but it uses some shell helpers
to perform its job (fuser, rm, gzip/gunzip etc.).
The correct path for the helpers (both required and optional ones) is
probed at installation time. If one required helper is missing the
installation will not take place. If one optional helper is missing,
the feature provided using that helper will be unavailable, but the
script will be installed anyway.
All other relevant configuration options can be specified at
installation time or at run time using the command line switches.
USAGE EXAMPLES
A complete example:
archmbox -a -b -c -e 01 -f -d 2002-01-01 -p ~/mail-archive
~/Mail/personal-stuff
This will archive all messages older than (received before) Jan 1st
2002 from the personal-stuff mailbox in the Mail directory. Archived
messages are saved in a mailbox called Mail-personal-stuff.01.gz in the
~/mail-archive directory. After execution, you’ll find a backup mailbox
called personal-stuff.backup in ~/Mail.
Complex examples, using perl regular expressions:
archmbox -a -o 1 --keep-flagged --keep-unread \
-x From='(nagios|arpwatch|logcheck)@host\.net' \
-x Subject='^(Security Events|Syslog Summary|\[SNORT\])' \
~/Mail/inbox
This will archive all unflagged, read messages older than 1 day where
the sender address matches nagios@host.net, arpwatch@host.net or
logcheck@host.net, or whose subject field starts with ’Security
Events’, ’Syslog Summary’ or ’[SNORT]’, from the mailbox ~/Mail/inbox.
Messages will be saved in inbox.archived in the directory from which
archmbox was started.
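The effect of --keep-flagged and --keep-unread can be sketched in Python. This manual does not say how archmbox detects flagged or unread messages; the sketch assumes the common mbox convention, as implemented by Python's mailbox module, where the Status and X-Status headers carry the flags (R for read, F for flagged). The function is_kept is hypothetical.

```python
import mailbox

def is_kept(raw_message, keep_flagged=False, keep_unread=False):
    """Return True if the message must stay in the source mailbox."""
    msg = mailbox.mboxMessage(raw_message)
    flags = msg.get_flags()      # letters from Status and X-Status, e.g. "ROF"
    if keep_flagged and "F" in flags:
        return True              # flagged message, protected by --keep-flagged
    if keep_unread and "R" not in flags:
        return True              # no "read" flag, protected by --keep-unread
    return False

flagged = "From: nagios@host.net\nStatus: RO\nX-Status: F\nSubject: alert\n\nbody\n"
unread = "From: logcheck@host.net\nSubject: Syslog Summary\n\nbody\n"
print(is_kept(flagged, keep_flagged=True))   # True
print(is_kept(flagged, keep_unread=True))    # False
print(is_kept(unread, keep_unread=True))     # True
```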
archmbox --archive --offset 1 --keep-flagged --keep-unread \
--Regexp From='@(host1|host2)\.example\.com' \
--regexp Subject='^(Security Events|Syslog Summary|\[SNORT\])' \
--archive-path ~/Mail/local-network.archive \
--archive-name system-msgs \
--extension 'none' \
~/Mail/inbox
This will archive all unflagged, read messages older than 1 day where
the sender address matches @host1.example.com or @host2.example.com and
whose subject field starts with either ’Security Events’ or ’Syslog
Summary’ or ’[SNORT]’ from the mailbox ~/Mail/inbox. Messages will be
archived to the mbox system-msgs in the directory
~/Mail/local-network.archive.
Some simpler examples:
archmbox -a -o 15 ~/Mail/personal-stuff
This will archive all messages older than 15 days in
personal-stuff.archived (uncompressed mailbox).
archmbox -a -r -o 15 ~/Mail/personal-stuff
The same as above, but only messages newer than 15 days will be
archived.
archmbox -k -o 15 ~/Mail/personal-stuff
This will delete all messages older than 15 days from
~/Mail/personal-stuff.
archmbox -a -o 15 ~/Mail/* -c
This will archive all messages older than 15 days in every mailbox
found in ~/Mail. All the archive mailboxes will be compressed.
archmbox -l -r -c /tmp/mbox -o 20
List all messages in /tmp/mbox which are newer than 20 days. Option -c
is meaningless (and so ignored...).
archmbox -l -r -c /tmp/mbox -o 20 -a --bzip2
Same as above, but archiving is forced (-a) and bzip2 is used for
compression.
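What compressing an archive mailbox amounts to can be sketched with Python's standard library. According to the CONFIGURATION section archmbox relies on external helpers such as gzip, so this is only an illustration; the function compress_file is hypothetical, and removing the uncompressed file afterwards is an assumption modelled on the usual behavior of the gzip and bzip2 commands.

```python
import bz2
import gzip
import os
import shutil

def compress_file(path, use_bzip2=False):
    """Compress path to path.gz (or path.bz2) and remove the original."""
    opener, suffix = (bz2.open, ".bz2") if use_bzip2 else (gzip.open, ".gz")
    target = path + suffix
    with open(path, "rb") as src, opener(target, "wb") as dst:
        shutil.copyfileobj(src, dst)   # stream the mailbox into the archive
    os.remove(path)
    return target
```

For example, compress_file("/tmp/mbox.archived", use_bzip2=True) would return "/tmp/mbox.archived.bz2", provided the input file exists.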
archmbox -a -x Subject='archmbox' -o 7 ~/mbox
Select for archiving all messages older than 7 days whose subject field
satisfies the regexp match Subject =~ /archmbox/ (the header name
Subject is case sensitive, the regexp archmbox is case insensitive).
archmbox -l -x Subject='archmbox' -x From='fritz' -o 7 ~/mbox
List all messages older than 7 days whose subject field contains
archmbox or whose sender matches fritz (matches are case
insensitive).
archmbox -l -x Subject='archmbox' -X From='fritz' -o 7 ~/mbox
List all messages older than 7 days whose subject field contains
archmbox and whose sender matches fritz (matches are case
insensitive).
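The difference between the ’or’ mode of -x and the ’and’ mode of -X, together with the case rule, can be sketched in Python. The functions compile_rule and matches are hypothetical, and Python's re module is used in place of Perl regular expressions, so some advanced patterns may behave differently.

```python
import re

def compile_rule(pattern):
    """Case sensitive only if the pattern contains an upper case letter.

    Naive check: escapes such as \\S also count as upper case here.
    """
    flags = 0 if any(ch.isupper() for ch in pattern) else re.IGNORECASE
    return re.compile(pattern, flags)

def matches(headers, rules, require_all=False):
    """headers: dict of header name -> value (names are case sensitive).
    rules: list of (header, pattern) pairs, as given to -x or -X.
    require_all: True mimics the 'and' mode, False the 'or' mode.
    """
    results = [
        header in headers
        and compile_rule(pattern).search(headers[header]) is not None
        for header, pattern in rules
    ]
    return all(results) if require_all else any(results)

headers = {"Subject": "New Archmbox release", "From": "Hans <hans@example.com>"}
rules = [("Subject", "archmbox"), ("From", "fritz")]
print(matches(headers, rules))                      # True
print(matches(headers, rules, require_all=True))    # False
```

The subject rule matches because an all lower case pattern is compared case insensitively; the sender rule fails, so the ’and’ mode rejects the message.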
archmbox -a -o 5 -R /tmp/mbox ~/Mail
archmbox will archive all messages older than five days in /tmp/mbox.
It then parses all mailboxes stored in ~/Mail (recursion is active,
and ~/Mail is a directory). If one or more directories are found in
~/Mail, they will be explored as well.
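The recursive walk can be sketched in Python. The function collect_mailboxes and its parameters are hypothetical; ignore stands in for the regexp given to -i and follow_symlinks for the absence of --nosymlink. Whether archmbox prunes ignored directories in exactly this way is an assumption.

```python
import os
import re

def collect_mailboxes(paths, ignore=None, follow_symlinks=True):
    """Expand files and directories into a sorted list of candidate mailboxes."""
    skip = re.compile(ignore) if ignore else None
    found = []
    for path in paths:
        if os.path.isdir(path):
            for root, dirs, files in os.walk(path, followlinks=follow_symlinks):
                if skip:
                    # Prune ignored directories so they are never entered.
                    dirs[:] = [d for d in dirs
                               if not skip.search(os.path.join(root, d))]
                found.extend(os.path.join(root, f) for f in files)
        else:
            found.append(path)
    if not follow_symlinks:
        found = [p for p in found if not os.path.islink(p)]
    if skip:
        found = [p for p in found if not skip.search(p)]
    return sorted(found)
```

Called as collect_mailboxes(["/tmp/mbox", "/home/user/Mail"]), where both paths are only examples, it would return /tmp/mbox followed by every file found below /home/user/Mail.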
archmbox -a -o -1 ~/Mail/my_mbx_mailbox --format mbx
archmbox archives all messages stored in my_mbx_mailbox and puts them
into my_mbx_mailbox.archived. The source mailbox is an mbx mailbox
(--format mbx is used). The archive mailbox will be an mbox mailbox.
NOTES
When the script has to decide if a message needs to be selected from
the mailbox, it looks at the From line generated by the mail server
(this is the first line of the message) and ignores the date
specified by the sender’s mail client. This is useful to avoid removing
messages sent from misconfigured mail clients. This behavior can be
changed by forcing the use of the "Date:" header (option -D).
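The choice between the two date sources can be sketched in Python. The functions date_from_from_line and message_date are hypothetical, the From line is assumed to end with a timestamp in the usual mbox layout (for example "Tue Jan  1 10:00:00 2002"), and converting the "Date:" header to UTC is a simplification made for this sketch.

```python
from datetime import datetime, timezone
from email.utils import parsedate_to_datetime

def date_from_from_line(from_line):
    """Parse the timestamp at the end of an mbox 'From ' separator line."""
    _, _, stamp = from_line.split(" ", 2)
    return datetime.strptime(stamp.strip(), "%a %b %d %H:%M:%S %Y")

def message_date(from_line, date_header=None, use_date_header=False):
    """Use the Date: header when requested (-D), else the From line."""
    if use_date_header and date_header:
        try:
            parsed = parsedate_to_datetime(date_header)
            if parsed.tzinfo is not None:
                parsed = parsed.astimezone(timezone.utc).replace(tzinfo=None)
            return parsed
        except (TypeError, ValueError):
            pass   # corrupt header: fall back to the From line
    return date_from_from_line(from_line)

line = "From fritz@example.com Tue Jan  1 10:00:00 2002"
print(message_date(line))                                              # 2002-01-01 10:00:00
print(message_date(line, "Mon, 31 Dec 2001 23:00:00 +0000", True))     # 2001-12-31 23:00:00
print(message_date(line, "not a date", True))                          # 2002-01-01 10:00:00
```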
Not all options are meaningful in all modes, i.e. compression is
meaningless in list or kill mode. If you specify a useless option for a
particular mode, archmbox simply ignores it.
Archmbox uses a working directory to store temporary mailboxes. A
default value for that directory is hard coded in the script, but can
be changed during the configuration/installation process (see INSTALL
for details). Your mailboxes might be too big for the partition
holding this temporary directory, or you might want to archive too
many mailboxes at the same time. In other words, you may run out of
space. Use the -t option to specify a suitable working directory at
runtime.
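The order in which the working directory is chosen (see option -t) can be sketched in Python. The function working_directory and its parameters are hypothetical; compiled_default stands for the value fixed at installation time.

```python
import os

def working_directory(cli_value=None, compiled_default=None, environ=None):
    """Return the first defined candidate for the temporary working directory."""
    env = os.environ if environ is None else environ
    candidates = [
        cli_value,             # value given with -t
        env.get("TMPDIR"),
        env.get("TMP"),
        compiled_default,      # value chosen at installation time
        "/tmp",                # final fallback
    ]
    for candidate in candidates:
        if candidate:
            return candidate

print(working_directory(environ={}))                                   # /tmp
print(working_directory(environ={"TMP": "/var/tmp"}))                  # /var/tmp
print(working_directory("/scratch", environ={"TMPDIR": "/var/tmp"}))   # /scratch
```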
If you see some differences in the mailbox’s size or free space, keep
in mind that your mailbox may contain a special message (512 bytes in
size) with internal information related to the mailbox. This message
is of no use to you, but archmbox recognizes it and reports it. That
message is left untouched in your source mailbox.
A few words about locking. There has been some discussion about how
archmbox handles file locking. The answer is simple: no mailbox is ever
locked. The reason behind this behavior is that I want archmbox to be
as non-invasive as possible, so other kinds of checks are performed to
ensure that no data is lost (mailbox has changed/mailbox is in use by
another program). I will surely add some locking mechanism in the
future.
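A "mailbox has changed" check of the kind mentioned above could look like the following Python sketch. This manual does not describe how archmbox performs its checks; comparing size and modification time before and after processing is an assumption, and snapshot and unchanged_since are hypothetical names.

```python
import os

def snapshot(path):
    """Record the size and modification time of a mailbox file."""
    st = os.stat(path)
    return (st.st_size, st.st_mtime_ns)

def unchanged_since(path, before):
    """Return True if the mailbox still matches an earlier snapshot."""
    return snapshot(path) == before
```

A caller would take a snapshot before parsing the mailbox and refuse to write anything back if unchanged_since returns False afterwards.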
You don’t need to execute archmbox as root... just take care to have
write permissions for the directories you use.
LINKS
Archmbox can be downloaded from:
http://adc-archmbox.sourceforge.net
Archmbox is distributed under the terms of the GPL
AUTHOR(S)
Copyright (C) 2001-2005
Alessandro Dotti Contra <adotti@users.sourceforge.net>
Parts of the code were contributed by:
Alex Aminoff, Brian Medley, Buck Holsinger, Davor Ocelic, Fabrice
Noilhan, Jayanth Varma, Juergen Edner, Laurent Cheylus, Nicolas
Ecarnot, Paco Regodon, Scott Thompson, Juergen Desher.
The FreeBSD port is maintained by Talal Al-Dik.
The OpenDarwin port is maintained by Markus Weissman.
The Debian package is maintained by Alberto Furia <straluna@email.it>
BUGS
Please report bugs to <adotti@users.sourceforge.net>
SEE ALSO
perlrequick(1), perlretut(1), perlre(1)