NAME
libdspam, dspam_init, dspam_create, dspam_addattribute, dspam_attach,
dspam_process, dspam_getsource, dspam_detach, dspam_clearattributes,
dspam_destroy - DSPAM Core Analysis Engine Functions
SYNOPSIS
#include <libdspam.h>
DSPAM_CTX *dspam_init(const char *user, const char *group,
                      const char *home, int mode, u_int32_t flags);
DSPAM_CTX *dspam_create(const char *user, const char *group,
                        const char *home, int mode, u_int32_t flags);
int dspam_addattribute(DSPAM_CTX *CTX, const char *name,
                       const char *value);
int dspam_clearattributes(DSPAM_CTX *CTX);
int dspam_attach(DSPAM_CTX *CTX, void *dbh);
int dspam_process(DSPAM_CTX *CTX, const char *message);
int dspam_getsource(DSPAM_CTX *CTX, char *buf, size_t size);
int dspam_detach(DSPAM_CTX *CTX);
int dspam_destroy(DSPAM_CTX *CTX);
DESCRIPTION
libdspam provides core message processing and classification
functionality.
The dspam_init() function creates and initializes a new classification
context and attaches the context to whatever backend storage facility
was configured. The user and group arguments provided are used to read
and write information stored for the user and group specified. The home
argument is used to configure libdspam’s storage around the base
directory specified. The mode specifies the operating mode to
initialize the classification context with and may be one of:
    DSM_PROCESS     Process the message and return a result
    DSM_CLASSIFY    Classify message only, no learning
    DSM_TOOLS       No processing, attach to storage only
The flags provided further tune the classification context for a
specific function. Multiple flags may be OR’d together.
    DSF_CHAINED     Use a Chained (Multi-Word) Tokenizer
    DSF_SBPH        Use Sparse Binary Polynomial Hashing Tokenizer
    DSF_SIGNATURE   A binary signature is requested/provided
    DSF_NOISE       Apply Bayesian Noise Reduction logic
    DSF_WHITELIST   Use automatic whitelisting logic
    DSF_MERGED      Merge group metadata with user’s in memory
Upon successful completion, dspam_init() will return a pointer to a new
classification context structure containing a copy of the configuration
passed into dspam_init(), a connected storage driver handle, and a set
of preliminary user control data read from storage.
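A call to dspam_init() with two flags OR'd together might look like the
sketch below. It assumes that dspam_init() returns NULL on failure and
accepts a NULL group, neither of which this page states. HAVE_LIBDSPAM_H
is a hypothetical build macro, open_context() is an illustrative helper
rather than part of libdspam, and the definitions under #else are
stand-ins with placeholder values so the sketch compiles without the
library.

```c
#include <stddef.h>
#include <stdio.h>

#ifdef HAVE_LIBDSPAM_H   /* hypothetical macro: define to use the real library */
#include <libdspam.h>
#else
/* Stand-ins so the sketch compiles on its own. Values and behavior are
 * placeholders, not those of libdspam. */
#include <stdint.h>
#include <stdlib.h>
typedef struct { int mode; uint32_t flags; } DSPAM_CTX;
enum { DSM_PROCESS = 1, DSM_CLASSIFY = 2, DSM_TOOLS = 3 };
enum { DSF_CHAINED = 0x01, DSF_NOISE = 0x02 };

static DSPAM_CTX *dspam_init(const char *user, const char *group,
                             const char *home, int mode, uint32_t flags)
{
    DSPAM_CTX *ctx;
    (void)group;
    (void)home;
    if (user == NULL)
        return NULL;
    ctx = malloc(sizeof *ctx);
    if (ctx != NULL) {
        ctx->mode = mode;
        ctx->flags = flags;
    }
    return ctx;
}

static int dspam_destroy(DSPAM_CTX *CTX)
{
    free(CTX);
    return 0;
}
#endif

/* Illustrative helper: open a classify-only context for one user. */
static DSPAM_CTX *open_context(const char *user, const char *home)
{
    /* Chained tokenizer plus noise reduction, combined with bitwise OR. */
    DSPAM_CTX *ctx = dspam_init(user, NULL, home, DSM_CLASSIFY,
                                DSF_CHAINED | DSF_NOISE);

    if (ctx == NULL)
        fprintf(stderr, "dspam_init failed for user %s\n",
                user != NULL ? user : "(null)");
    return ctx;
}
```

A context obtained this way is already attached to storage, so the
caller would release it with dspam_destroy() once it is no longer
needed.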
The dspam_create() function performs in exactly the same manner as the
dspam_init() function, but does not attach to storage. Instead, the
caller must also call dspam_attach() after setting any storage-
specific attributes using dspam_addattribute(). This is useful for
cases where the implementor would prefer to configure storage
internally rather than having libdspam read a configuration from a
file.
The dspam_addattribute() function is called to set attributes within
the classification context. Some storage drivers accept driver-specific
attributes, such as server connection information. The
driver-independent attributes supported by DSPAM include:
    IgnoreHeader    Specify a header to ignore
    LocalMX         Specify a local mail exchanger to assist in
                    correct results from dspam_getsource().
Only driver-dependent attributes need be set prior to a call to
dspam_attach(). Driver-independent attributes may be set both before
and after storage has been attached.
The dspam_attach() function attaches the storage interface to the
classification context and, if dbh is NULL, establishes an initial
connection with storage. Some storage drivers support
only a NULL value for dbh, while others (such as mysql_drv, pgsql_drv,
and sqlite_drv) allow an open database handle to be attached. This
function should only be called after an initial call to dspam_create()
and should never be called if using dspam_init(), as storage is
automatically attached by a call to dspam_init().
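The create, configure, attach sequence described above can be sketched
as follows. The sketch assumes that dspam_create() returns NULL on
failure and that dspam_addattribute() and dspam_attach() return zero on
success, which this page does not state. HAVE_LIBDSPAM_H is a
hypothetical build macro, create_and_attach() is an illustrative helper,
the header name passed to IgnoreHeader is only an example value, and the
definitions under #else are placeholder stand-ins so the sketch compiles
without the library.

```c
#include <stddef.h>

#ifdef HAVE_LIBDSPAM_H   /* hypothetical macro: define to use the real library */
#include <libdspam.h>
#else
/* Stand-ins so the sketch compiles on its own. Values and behavior are
 * placeholders, not those of libdspam. */
#include <stdint.h>
#include <stdlib.h>
typedef struct { int attached; int nattrs; } DSPAM_CTX;
enum { DSM_PROCESS = 1 };
enum { DSF_CHAINED = 0x01 };

static DSPAM_CTX *dspam_create(const char *user, const char *group,
                               const char *home, int mode, uint32_t flags)
{
    (void)group; (void)home; (void)mode; (void)flags;
    return user != NULL ? calloc(1, sizeof(DSPAM_CTX)) : NULL;
}

static int dspam_addattribute(DSPAM_CTX *CTX, const char *name,
                              const char *value)
{
    if (CTX == NULL || name == NULL || value == NULL)
        return -1;
    CTX->nattrs++;
    return 0;
}

static int dspam_attach(DSPAM_CTX *CTX, void *dbh)
{
    (void)dbh;
    if (CTX == NULL)
        return -1;
    CTX->attached = 1;
    return 0;
}

static int dspam_destroy(DSPAM_CTX *CTX)
{
    free(CTX);
    return 0;
}
#endif

/* Illustrative helper: create a context, set one driver-specific
 * attribute supplied by the caller, then attach to storage. Pass a NULL
 * dbh to let the driver open its own connection. */
static DSPAM_CTX *create_and_attach(const char *user, const char *home,
                                    const char *attr_name,
                                    const char *attr_value, void *dbh)
{
    DSPAM_CTX *ctx = dspam_create(user, NULL, home, DSM_PROCESS, DSF_CHAINED);

    if (ctx == NULL)
        return NULL;

    /* Driver-specific attributes must be in place before dspam_attach(). */
    if (dspam_addattribute(ctx, attr_name, attr_value) != 0 ||
        dspam_attach(ctx, dbh) != 0) {
        dspam_destroy(ctx);
        return NULL;
    }

    /* Driver-independent attributes may also be set after attaching. */
    if (dspam_addattribute(ctx, "IgnoreHeader", "X-Spam-Status") != 0) {
        dspam_destroy(ctx);
        return NULL;
    }
    return ctx;
}
```

The attribute name and value passed in would come from the
documentation of whichever storage driver is in use; none are assumed
here.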
The dspam_process() function performs analysis of the message passed
into it and will return zero on successful completion. If successful,
CTX->result will be set to one of three classification results:
    DSR_ISSPAM          Message was classified as spam
    DSR_ISINNOCENT      Message was classified as nonspam
    DSR_ISWHITELISTED   Recipient was automatically whitelisted
Should the call fail, one of the following errors will be returned:
    EINVAL      An invalid call or invalid parameter used
    EUNKNOWN    Unexpected error, such as malloc() failure
    EFILE       Error opening or writing to a file or file handle
    ELOCK       Locking failure
    EFAILURE    The operation itself has failed
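Checking the return value of dspam_process() before reading CTX->result
might look like the sketch below. HAVE_LIBDSPAM_H is a hypothetical
build macro and classify() is an illustrative helper, not part of
libdspam. The definitions under #else are stand-ins so the sketch
compiles without the library: their numeric values, the error code, and
the fixed verdict are placeholders.

```c
#include <stddef.h>
#include <stdio.h>

#ifdef HAVE_LIBDSPAM_H   /* hypothetical macro: define to use the real library */
#include <libdspam.h>
#else
/* Stand-ins so the sketch compiles on its own. Values and behavior are
 * placeholders, not those of libdspam. */
typedef struct { int result; } DSPAM_CTX;
enum { DSR_ISSPAM = 1, DSR_ISINNOCENT = 2, DSR_ISWHITELISTED = 3 };

static int dspam_process(DSPAM_CTX *CTX, const char *message)
{
    if (CTX == NULL || message == NULL)
        return -1;                    /* placeholder error code */
    CTX->result = DSR_ISINNOCENT;     /* placeholder verdict */
    return 0;
}
#endif

/* Illustrative helper: process a message and return a label for the
 * result, or NULL if dspam_process() reported an error. */
static const char *classify(DSPAM_CTX *ctx, const char *message)
{
    int rc = dspam_process(ctx, message);

    if (rc != 0) {
        /* CTX->result is only meaningful after a zero return. */
        fprintf(stderr, "dspam_process failed with code %d\n", rc);
        return NULL;
    }

    switch (ctx->result) {
    case DSR_ISSPAM:        return "spam";
    case DSR_ISINNOCENT:    return "innocent";
    case DSR_ISWHITELISTED: return "whitelisted";
    default:                return "unrecognized";
    }
}
```

Keeping a default branch means an unexpected result value is reported
rather than silently treated as one of the three documented outcomes.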
The dspam_getsource() function extracts the source sender from the
message passed in during a call to dspam_process() and writes not more
than size bytes to buf.
The dspam_detach() function can be called when a detachment from
storage is desired, but the context is still needed. The storage driver
is closed, leaving the classification context in place. Once the
context is no longer needed, a separate call to dspam_destroy() should be
made. If you are closing storage and destroying the context at the same
time, it is not necessary to call this function. Instead you may call
dspam_destroy() directly.
The dspam_clearattributes() function is called to clear any attributes
previously set using dspam_addattribute() within the classification
context. It is necessary to call this function prior to replacing any
attributes already written.
The dspam_destroy() function should be called when the context is no
longer needed. If a connection was established to storage internally,
the connection is closed and all data is flushed and written. If a
handle was attached, the handle will remain open.
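The difference between dspam_detach() and dspam_destroy() can be
sketched as follows: storage is released first, the result is read from
the context that remains, and the context is destroyed last. The sketch
assumes that dspam_detach() returns zero on success, which this page
does not state. HAVE_LIBDSPAM_H is a hypothetical build macro,
release_then_read() is an illustrative helper, and the definitions under
#else are placeholder stand-ins so the sketch compiles without the
library.

```c
#include <stddef.h>

#ifdef HAVE_LIBDSPAM_H   /* hypothetical macro: define to use the real library */
#include <libdspam.h>
#else
/* Stand-ins so the sketch compiles on its own. Behavior is a
 * placeholder, not that of libdspam. */
#include <stdlib.h>
typedef struct { int result; int attached; } DSPAM_CTX;

static int dspam_detach(DSPAM_CTX *CTX)
{
    if (CTX == NULL)
        return -1;
    CTX->attached = 0;
    return 0;
}

static int dspam_destroy(DSPAM_CTX *CTX)
{
    free(CTX);
    return 0;
}
#endif

/* Illustrative helper: close storage early, copy out the result, then
 * destroy the context. Returns the value returned by dspam_detach(),
 * or -1 if an argument is NULL. */
static int release_then_read(DSPAM_CTX *ctx, int *result_out)
{
    int rc;

    if (ctx == NULL || result_out == NULL)
        return -1;

    rc = dspam_detach(ctx);           /* storage closed, context kept */
    if (rc == 0)
        *result_out = ctx->result;    /* context is still in place */

    dspam_destroy(ctx);               /* context no longer needed */
    return rc;
}
```

When storage and context are released at the same moment, the
dspam_detach() step can be dropped and dspam_destroy() called on its
own.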
AUTHORS
Jonathan A. Zdziarski
For more information, see http://www.nuclearelephant.com.
SEE ALSO
dspam_stats(1), dspam_train(1), dspam_clean(1), dspam_dump(1),
dspam_merge(1)