NAME
queue_splitter - PgQ consumer that transports events from one queue
into several target queues
SYNOPSIS
queue_splitter.py [switches] config.ini
DESCRIPTION
queue_spliter is PgQ consumer that transports events from source queue
into several target queues. ev_extra1 field in each event shows into
which target queue it must go. (pgq.logutriga() puts there the table
name.)
One use case is to move events from OLTP database to batch processing
server. By using queue spliter it is possible to move all kinds of
events for batch processing with one consumer thus keeping OLTP
database less crowded.
QUICK-START
Basic queue_splitter setup and usage can be summarized by the following
steps:
1. pgq must be installed both in source and target databases. See
pgqadm man page for details. Target database must also have pgq_ext
schema installed.
2. edit a queue_splitter configuration file, say
queue_splitter_sourcedb_sourceq_targetdb.ini
3. create source and target queues
$ pgqadm.py ticker.ini create <queue>
4. launch queue splitter in daemon mode
$ queue_splitter.py queue_splitter_sourcedb_sourceq_targetdb.ini -d
5. start producing and consuming events
CONFIG
Common configuration parameters
job_name
Name for particulat job the script does. Script will log under this
name to logdb/logserver. The name is also used as default for PgQ
consumer name. It should be unique.
pidfile
Location for pid file. If not given, script is disallowed to
daemonize.
logfile
Location for log file.
loop_delay
If continuisly running process, how long to sleep after each work
loop, in seconds. Default: 1.
connection_lifetime
Close and reconnect older database connections.
use_skylog
foo.
Common PgQ consumer parameters
pgq_queue_name
Queue name to attach to. No default.
pgq_consumer_id
Consumers ID to use when registering. Default: %(job_name)s
queue_splitter parameters
src_db
Source database.
dst_db
Target database.
Example config file
[queue_splitter]
job_name = queue_spliter_sourcedb_sourceq_targetdb
src_db = dbname=sourcedb
dst_db = dbname=targetdb
pgq_queue_name = sourceq
logfile = ~/log/%(job_name)s.log
pidfile = ~/pid/%(job_name)s.pid
COMMAND LINE SWITCHES
Following switches are common to all skytools.DBScript-based Python
programs.
-h, --help
show help message and exit
-q, --quiet
make program silent
-v, --verbose
make program more verbose
-d, --daemon
make program go background
Following switches are used to control already running process. The
pidfile is read from config then signal is sent to process id specified
there.
-r, --reload
reload config (send SIGHUP)
-s, --stop
stop program safely (send SIGINT)
-k, --kill
kill program immidiately (send SIGTERM)
USECASE
How to to process events created in secondary database with several
queues but have only one queue in primary database. This also shows how
to insert events into queues with regular SQL easily.
CREATE SCHEMA queue;
CREATE TABLE queue.event1 (
-- this should correspond to event internal structure
-- here you can put checks that correct data is put into queue
id int4,
name text,
-- not needed, but good to have:
primary key (id)
);
-- put data into queue in urlencoded format, skip actual insert
CREATE TRIGGER redirect_queue1_trg BEFORE INSERT ON queue.event1
FOR EACH ROW EXECUTE PROCEDURE pgq.logutriga(´singlequeue´, ´SKIP´);
-- repeat the above for event2
-- now the data can be inserted:
INSERT INTO queue.event1 (id, name) VALUES (1, ´user´);
If the queue_splitter is put on "singlequeue", it spreads the event on
target to queues named "queue.event1", "queue.event2", etc. This keeps
PgQ load on primary database minimal both CPU-wise and
maintenance-wise.
09/22/2008