NAME
bsfilter - Bayesian spam filter
SYNOPSIS
bsfilter [options] [commands] < MAIL
bsfilter [options] [commands] MAIL ...
DESCRIPTION
bsfilter filters out spam mails.
If commands are specified, bsfilter runs in maintenance mode; otherwise
it runs in filtering mode.
In filtering mode, when bsfilter reads a mail from stdin, the exit
status is 0 if the mail is spam and 1 if the mail is clean.
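The exit status convention can be used directly from a shell script. The
following is a minimal sketch, not part of bsfilter: "classify_mail" is a
hypothetical helper name, and the "BSFILTER" variable is an assumption
introduced here only so that the command can be replaced by a stub. Any
exit status other than 0 or 1 is treated as unexpected, since this manual
does not describe other values.

```shell
# Hypothetical helper: read one mail from stdin and print "spam" or "clean"
# according to the exit status described above.
classify_mail() {
    status=0
    # BSFILTER is an assumed override for the command name (default: bsfilter).
    "${BSFILTER:-bsfilter}" >/dev/null || status=$?
    case $status in
        0) echo spam ;;
        1) echo clean ;;
        *) echo "unexpected exit status: $status" >&2; return 2 ;;
    esac
}
```

For example, "classify_mail < ~/Mail/inbox/1" would print "spam" or "clean"
for that mail.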
COMMANDS
--add-clean, -c
add mails into the clean token database.
--add-spam, -s
add mails into the spam token database.
--sub-clean, -C
subtract mails from the clean token database.
--sub-spam, -S
subtract mails from the spam token database.
--update, -u
update the probability table from clean and spam token
databases.
--export-clean
export the clean token database.
--export-spam
export the spam token database.
--import-clean
import the clean token database.
--import-spam
import the spam token database.
--export-probability
export the probability database (for debugging purposes).
OPTIONS
--homedir directory
specify the name of bsfilter’s home directory.
If this option is not used, the directory specified by the
environment variable "BSFILTERHOME" is used.
If the variable "BSFILTERHOME" is not defined, the ".bsfilter"
directory under your home directory is used.
If the variable "HOME" is not defined, the directory in which
bsfilter is located is used.
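The lookup order above can be sketched as a shell function. This is an
illustration only, not bsfilter's own code: "resolve_bsfilter_home" is a
hypothetical name, and treating an empty variable the same as an undefined
one is an assumption.

```shell
# Hypothetical helper mirroring the documented lookup order.
# $1: value given to --homedir (empty if the option is not used)
# $2: directory in which bsfilter is located (last-resort fallback)
resolve_bsfilter_home() {
    if [ -n "$1" ]; then
        printf '%s\n' "$1"                  # --homedir takes precedence
    elif [ -n "${BSFILTERHOME:-}" ]; then
        printf '%s\n' "$BSFILTERHOME"       # then the environment variable
    elif [ -n "${HOME:-}" ]; then
        printf '%s\n' "$HOME/.bsfilter"     # then .bsfilter under home
    else
        printf '%s\n' "$2"                  # finally bsfilter's own directory
    fi
}
```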
--config-file file
specify the name of bsfilter’s configuration file.
"bsfilter.conf" in bsfilter’s home directory is used by
default.
--max-line number
check and/or study only the first number lines of a mail. default is
500. 0 means all lines.
--db sdbm|gdbm|bdb1|bdb|qdbm
specify the database type. "sdbm" by default.
--jtokenizer bigram|block|mecab|chasen|kakasi
-j bigram|block|mecab|chasen|kakasi
specify the tokenizer algorithm for the Japanese language.
"bigram" by default.
--list-clean
print the filenames of clean mails.
--list-spam
print the filenames of spam mails.
--imap
access IMAP server.
--imap-server hostname
specify hostname of IMAP server.
--imap-port number
specify port number of IMAP server. default is 143.
--imap-auth method
specify authentication method. default is "auto". "cram-md5"
uses the "AUTHENTICATE CRAM-MD5" command. "login" uses the
"AUTHENTICATE LOGIN" command. "loginc" uses the "LOGIN" command.
"auto" tries "cram-md5", "login" and "loginc" in this order.
--imap-user name
specify user name of IMAP server.
--imap-password password
specify password of imap-user.
--imap-folder-clean folder
specify destination folder for clean mails. "inbox.clean" for
example.
--imap-folder-spam folder
specify destination folder for spams. "inbox.spam" for
example.
--imap-fetch-unseen
filter or study mails without SEEN flag.
--imap-fetch-unflagged
filter or study mails without "X-Spam-Flag" header.
--imap-reset-seen-flag
reset SEEN flag when bsfilter moves or modifies mails.
--pop
work as POP proxy.
--pid-file file
specify the file in which the process ID of bsfilter is recorded.
"bsfilter.pid" in bsfilter’s home directory is used by
default. this option is valid only when "--pop" is specified.
--tasktray
sit in the tasktray. this is valid with "--pop" on VisualuRuby.
--pop-server hostname
specify hostname of POP server.
--pop-port number
specify port number of POP server. default is 110.
--pop-proxy-if address
specify the address of the interface on which bsfilter listens.
default is 0.0.0.0, which means all interfaces are active.
--pop-proxy-port number
specify port number which bsfilter listens at. default is
10110.
--pop-user name
optional. specify username of POP server.
bsfilter checks that the value of this option matches the
name which the MUA sends.
in case of mismatch, bsfilter closes the sockets.
--pop-proxy-set set[,set...]
specify rules for the POP proxy.
this is an alternative to the pop-server, pop-port, pop-proxy-port and
pop-user options.
the format of "set" is
"pop-server:pop-port:[proxy-interface]:proxy-port[:pop-user]".
If proxy-interface is specified and is not 0.0.0.0, other
interfaces are not used.
"--pop-proxy-set 192.168.1.1:110::10110" is equivalent to
"--pop-server 192.168.1.1 --pop-port 110 --pop-proxy-port
10110".
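The field layout of a "set" can be illustrated by splitting it on colons.
The sketch below is not part of bsfilter: "expand_pop_proxy_set" is a
hypothetical helper that prints the equivalent long options for one set,
omitting optional fields that are empty. It assumes plain hostnames or
IPv4 addresses, because a field that itself contains a colon would be
split incorrectly.

```shell
# Hypothetical helper: expand one --pop-proxy-set entry into the
# equivalent long options.
# Fields: pop-server:pop-port:[proxy-interface]:proxy-port[:pop-user]
expand_pop_proxy_set() {
    # An empty field between two colons is kept as an empty variable.
    IFS=: read -r server port iface proxy_port user <<EOF
$1
EOF
    out="--pop-server $server --pop-port $port"
    if [ -n "$iface" ]; then
        out="$out --pop-proxy-if $iface"
    fi
    out="$out --pop-proxy-port $proxy_port"
    if [ -n "$user" ]; then
        out="$out --pop-user $user"
    fi
    printf '%s\n' "$out"
}

expand_pop_proxy_set 192.168.1.1:110::10110
# prints: --pop-server 192.168.1.1 --pop-port 110 --pop-proxy-port 10110
```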
--pop-max-size number
When a mail is larger than the specified number of bytes, the mail
is not filtered. When 0 is specified, all mails are tested and
filtered. default is 50000.
--ssl
use POP over SSL with --pop option and use IMAP over SSL with
--imap option.
--ssl-cert filename|dirname
specify a filename of a certificate of a trusted CA or a name
of a directory of certificates.
--method g|r|rf
-m g|r|rf
specify filtering method. "rf" by default. "g" means Paul
Graham method, "r" means Gary Robinson method, and "rf" means
Robinson-Fisher method.
--spam-cutoff number
specify spam-cutoff value. 0.9 by default for Paul Graham
method. 0.582 by default for Gary Robinson method. 0.95 by
default for Robinson-Fisher method.
--auto-update, -a
recognize mails, add them into clean or spam token database
and update the probability table.
--disable-degeneration, -D
disable degeneration during probability table lookup.
--disable-utf-8
disable utf-8 support.
--refer-header header[,header...]
refer specified headers of mails.
--ignore-header, -H
ignore headers of mails. (same as --refer-header "".)
--ignore-body, -B
ignore body of mails, except URL or mail address.
--ignore-plain-text-part
ignore plain text part if html part is included in the mail.
--ignore-after-last-atag
ignore text after last "A" tag.
--mark-in-token characters
specify characters which are allowed in a token. "*'!" by
default.
--show-process
show summary of execution.
--show-new-token
show tokens which are newly added into the token database.
--mbox
use "unix from" to divide mbox format file.
--max-mail number
reduce the token database when the number of stored mails is
larger than this number. 10000 by default.
--min-mail number
reduce the token database as if this number of mails were stored.
8000 by default.
--pipe
write a mail to stdout. this option is invalid when
"--imap" or "--pop" is specified.
--insert-revision
insert "X-Spam-Revision: bsfilter release..." into a mail.
--insert-flag
insert "X-Spam-Flag: Yes" or "X-Spam-Flag: No" into a mail.
--insert-probability
insert "X-Spam-Probability: number" into a mail.
--header-prefix string
insert "X-specified_string-..." headers, instead of "Spam".
(it is valid with --insert-flag and/or --insert-probability
option.)
--mark-spam-subject
insert "[SPAM] " at the beginning of Subject header.
--spam-subject-prefix string
insert specified string, instead of "[SPAM] ". (it is valid
with --mark-spam-subject option.)
--show-db-status
show numbers of tokens and mails in databases and quit.
--help, -h
show help message.
--quiet, -q
quiet mode.
--verbose, -v
verbose mode.
--debug, -d
debug mode.
EXAMPLES
% bsfilter -s ~/Mail/spam/* ## add spam
% bsfilter -u -c ~/Mail/job/* ~/Mail/private/* ## add clean mails and update probability table
% bsfilter ~/Mail/inbox/1 ## show spam probability
## procmail recipe: deliver spam to the spam/ folder
:0 HB
* ? bsfilter -a
spam/.
## procmail recipe: pass mail through bsfilter, which inserts headers
:0 fw
| bsfilter -a --pipe --insert-flag --insert-probability
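## checking the inserted flag from a shell script
When "--pipe" and "--insert-flag" are used as in the recipe above, a later
processing step can test the inserted header. The following is a minimal
sketch: "has_spam_flag" is a hypothetical helper, not part of bsfilter. It
reads a mail on stdin, looks only at the header block, and takes an
optional prefix argument for mails produced with "--header-prefix". The
exact header layout is assumed from the "--insert-flag" description.

```shell
# Hypothetical helper: succeed if the mail on stdin carries
# "X-Spam-Flag: Yes" (or "X-<prefix>-Flag: Yes") in its headers.
has_spam_flag() {
    prefix=${1:-Spam}
    # sed stops at the first empty line, so the body is never examined.
    sed '/^$/q' | grep -qi "^X-${prefix}-Flag: Yes"
}
```

For example, "bsfilter --pipe --insert-flag < MAIL | has_spam_flag && echo spam"
would print "spam" only for mails flagged as spam.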
SEE ALSO
http://bsfilter.org/, http://sourceforge.jp/projects/bsfilter/
http://exerb.sourceforge.jp/,
http://www.osk.3web.ne.jp/~nyasu/software/vrproject.html,
http://www.ruby-lang.org/
AUTHOR
The original manual is in the bsfilter command itself, which is written
by NABEYA Kenichi (upstream author). This manual page was translated
from that manual by akira yamada <akira@debian.org> for the Debian
system (but may be used by others). Permission is granted to copy,
distribute and/or modify this document under the terms of the GNU
General Public License, Version 2 or any later version published by the
Free Software Foundation.
On Debian systems, the complete text of the GNU General Public License
can be found in /usr/share/common-licenses/GPL.