NAME
cdb - Constant DataBase manipulation tool
SYNOPSIS
cdb -q [-m] [-n num] dbname key
cdb -d [-m] [dbname|-]
cdb -l [-m] [dbname|-]
cdb -s [dbname|-]
cdb -c [-m] [-t tmpname|-] [-p perms] [-weru0] dbname [infile...]
DESCRIPTION
cdb is used to query, dump, list, analyze or create CDB (Constant
DataBase) files. The format of cdb files is described in the cdb(5)
manual page. This manual page corresponds to version 0.77 of the
tinycdb package.
Query
cdb -q looks up the given key in the cdb file dbname and writes the
associated value to standard output if found (exiting with zero), or
exits with non-zero if not found. dbname must be a seekable file;
standard input cannot be used as input. By default, cdb prints all
records found. Options recognized in query mode:
-nnum causes cdb to find and write only the record with the given
number num, counting from 1, when there are many records with the
given key.
-m a newline is added after every value printed. By default,
multiple values are written without any delimiter.
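The interaction of the default output, -n and -m can be modelled with a
short Python sketch. It only mirrors the behaviour described above and
is not how cdb itself is implemented; the function name query_output,
the in-memory record list and the exit status 1 for "not found" are
assumptions made for illustration.

```python
def query_output(records, key, num=None, newline=False):
    """Model of the documented `cdb -q` output.

    records is a list of (key, value) byte pairs standing in for a cdb file
    (an assumption; the real tool reads dbname). Returns (output, status).
    """
    values = [v for k, v in records if k == key]
    if num is not None:
        # -n num: keep only the num-th matching record, counting from 1.
        values = values[num - 1:num] if num >= 1 else []
    if not values:
        # Not found: the manual only promises a non-zero status; 1 is assumed.
        return b"", 1
    # -m: a newline after every value; otherwise no delimiter at all.
    sep = b"\n" if newline else b""
    return b"".join(v + sep for v in values), 0


records = [(b"fruit", b"apple"), (b"veg", b"kale"), (b"fruit", b"pear")]
print(query_output(records, b"fruit"))                # all values, no delimiter
print(query_output(records, b"fruit", newline=True))  # as with -m
print(query_output(records, b"fruit", num=2))         # as with -n 2
```

Because values are written without a delimiter by default, a caller
that expects several records per key usually needs either -m or -n to
tell them apart.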
Dump/List
cdb -d dumps the contents, and cdb -l lists the keys, of dbname (or
standard input if not specified) to standard output, in a format
controlled by the presence of the -m option. See subsection
"Input/Output Format" below. Output from cdb -d can be used as input
for cdb -c.
Create
A cdb database is created in two stages: a temporary database is
created, and once it is complete, it is atomically renamed to its
permanent place. This avoids the need for locking between readers and
writers (or creators). cdb -c will attempt to create the cdb in file
tmpname (or dbname with ".tmp" appended if no -t option is given) and
then rename it to dbname. It reads the supplied infiles (or standard
input if none are specified). Options recognized in create mode:
-t tmpname
use the given tmpname as the temporary file. Defaults to dbname.tmp
(i.e. the output file name with .tmp appended). Note that tmpname
must be on the same filesystem as the output file, as cdb uses
rename(2) to finalize the database creation procedure. If tmpname
is a single dash (-), no temporary file is created and the database
is built in place. This mode is useful when the final renaming is
done by the caller.
-p perms
permissions for the newly created file (usually an octal number,
like 0644). By default the permissions are 0666 (with current
process umask applied). If this option is specified, current
umask value has no effect.
-w warn about duplicate keys.
-e abort on duplicate keys (implies -w).
-r replace existing key with new one in case of duplicate. This
may require database file rewrite to remove old records, and can
be slow.
-0 zero-fill existing records when duplicate records are added.
This is faster than -r, but leaves extra zeros in the database
file in case of duplicates.
-u do not add duplicate records.
-m interpret input as a sequence of lines, one record per line,
with value separated from a key by space or tab characters,
instead of native cdb format (see "Input/Output Format" below).
Note that using any option that requires duplicate checking will slow
the creation process significantly, especially for large databases.
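The two-stage creation described in this subsection (write a temporary
file, then rename it over the final name) can be sketched in Python as
follows. This is a minimal illustration of the technique under stated
assumptions, not the code cdb runs: the function name
create_atomically and its parameters are hypothetical, and the payload
is placeholder bytes rather than a real cdb image.

```python
import os
import tempfile


def create_atomically(dbname, data, tmpname=None, perms=None):
    """Write data to a temporary file, then rename it to dbname.

    Mirrors the documented -t and -p options: tmpname defaults to
    dbname + ".tmp", and tmpname == "-" means build the file in place.
    """
    if tmpname == "-":
        target = dbname                      # in place, no rename step
    else:
        target = tmpname or dbname + ".tmp"  # must share dbname's filesystem
    with open(target, "wb") as f:
        f.write(data)
    if perms is not None:
        # An explicit chmod is not affected by the umask, like -p.
        os.chmod(target, perms)
    if target != dbname:
        # Atomic replacement: readers see either the old or the new file.
        os.replace(target, dbname)


with tempfile.TemporaryDirectory() as d:
    db = os.path.join(d, "example.cdb")
    create_atomically(db, b"placeholder bytes, not a real cdb image")
    print(sorted(os.listdir(d)))  # the .tmp file is gone after the rename
```

The rename only stays atomic when both names are on the same
filesystem, which is why the -t description above carries that
restriction.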
Statistics
cdb -s analyzes dbname and prints a summary to standard output.
Statistics include: total number of rows in the file; minimum, average
and maximum key and value lengths; hash tables (max 256) and entries
used; number of hash collisions (that is, cases where more than one
key points to the same hash table entry); minimum, average and maximum
hash table size (of non-empty tables); and the number of keys at each
of 10 different distances from their calculated hash table index. A
key at distance 0 requires only one hash table lookup, a key at
distance 1 requires two, and so on; more keys at greater distances
mean slower database searches.
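To make the notion of "distance" concrete, the following Python sketch
computes it for a set of keys. It assumes the classic cdb layout
documented in cdb(5), none of which is stated on this page: the hash
starts at 5381 and is updated as h = (h * 33) XOR byte, modulo 2^32;
the low 8 bits select one of 256 tables; each table has twice as many
slots as entries; and collisions are resolved by linear probing from
slot (h >> 8) mod table size. The function names are hypothetical, and
the real tool reads these tables from the file rather than rebuilding
them.

```python
def cdb_hash(key: bytes) -> int:
    """Hash function of the classic cdb format (assumed, see cdb(5))."""
    h = 5381
    for c in key:
        h = (((h << 5) + h) ^ c) & 0xFFFFFFFF
    return h


def probe_distances(keys):
    """Return the distance of each stored key from its home slot.

    Distances are reported grouped by table, not in input order.
    """
    tables = {}
    for key in keys:
        h = cdb_hash(key)
        tables.setdefault(h & 255, []).append(h)  # low 8 bits pick the table

    distances = []
    for hashes in tables.values():
        nslots = 2 * len(hashes)                  # assumed table sizing
        slots = [None] * nslots
        for h in hashes:
            start = (h >> 8) % nslots             # home slot for this hash
            d = 0
            while slots[(start + d) % nslots] is not None:
                d += 1                            # linear probing, wrapping
            slots[(start + d) % nslots] = h
            distances.append(d)                   # a lookup needs d + 1 probes
    return distances


keys = [b"key%d" % i for i in range(1000)]
dist = probe_distances(keys)
print(max(dist), sum(dist) / len(dist))  # worst and average distance
```

A key stored at distance d is found after d + 1 slot probes, which is
why a distribution skewed towards larger distances indicates slower
lookups.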
Input/Output Format
By default, cdb expects (for the create operation) or writes (for
dump/list) native cdb format data. The native cdb format is a sequence
of records of the form:
+klen,vlen:key->val\n
where "+", ",", ":", "-", ">" and "\n" (newline) are literal
characters, klen and vlen are the lengths of the key and value as
decimal numbers, and key and val are the key and value themselves. The
series of records is terminated by an empty line. This is the only
format in which the key and value may contain any character, including
newline, zero (\0) and so on.
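The native record format above is simple enough to produce and consume
from a script. The Python sketch below writes and reads it using only
the rules stated here; the function names write_native, read_native
and _read_int are hypothetical, and treating end of input like the
terminating empty line is a lenient choice of this sketch, not
documented behaviour.

```python
import io


def write_native(records, out):
    """Write (key, value) byte pairs as +klen,vlen:key->val\\n records."""
    for key, val in records:
        out.write(b"+%d,%d:" % (len(key), len(val)))
        out.write(key + b"->" + val + b"\n")
    out.write(b"\n")  # an empty line terminates the series


def _read_int(inp, terminator):
    """Read decimal digits up to the given one-byte terminator."""
    digits = b""
    while True:
        c = inp.read(1)
        if c == terminator:
            return int(digits)
        if not c.isdigit():
            raise ValueError("malformed length field")
        digits += c


def read_native(inp):
    """Parse native-format records from a binary stream."""
    records = []
    while True:
        c = inp.read(1)
        if c in (b"\n", b""):  # empty line, or end of input (lenient)
            return records
        if c != b"+":
            raise ValueError("record must start with '+'")
        klen = _read_int(inp, b",")
        vlen = _read_int(inp, b":")
        key = inp.read(klen)   # lengths, not delimiters, bound the data
        arrow = inp.read(2)
        val = inp.read(vlen)
        if len(key) != klen or arrow != b"->" or len(val) != vlen:
            raise ValueError("truncated or malformed record")
        if inp.read(1) != b"\n":
            raise ValueError("record must end with a newline")
        records.append((key, val))


buf = io.BytesIO()
write_native([(b"one", b"Hello"), (b"two\n", b"a\0b")], buf)
print(buf.getvalue())
print(read_native(io.BytesIO(buf.getvalue())))
```

Because the key and value are delimited by their lengths rather than by
the separator characters, the second record round-trips even though it
contains a newline and a zero byte.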
When the -l option is given (list keys mode), cdb produces slightly
modified output of the form:
+klen:key\n
(note that vlen and val are omitted, together with their surrounding
delimiters).
If the -m option is given, cdb expects or produces one line for every
record (newline is the record delimiter), and every line should
contain optional whitespace, key, whitespace and value up to the end
of the line. Lines starting with a hash character (#) and empty lines
are ignored. This is the same format as the mkmap(1) utility expects.
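The "map" format can be read with a few lines of Python. The sketch
below follows the description above and makes three assumptions where
this page is silent: whitespace means space or tab (as stated for -m in
create mode), the comment check is applied after leading whitespace is
skipped, and a line holding only a key yields an empty value. The
function name parse_map is hypothetical.

```python
def parse_map(lines):
    """Parse "map" format lines into a list of (key, value) byte pairs."""
    records = []
    for line in lines:
        # Drop the record delimiter and the optional leading whitespace.
        line = line.rstrip(b"\n").lstrip(b" \t")
        if not line or line.startswith(b"#"):
            continue  # empty lines and comments are ignored
        # The key runs up to the first space or tab.
        i = 0
        while i < len(line) and line[i] not in b" \t":
            i += 1
        key = line[:i]
        # The value is everything after the separating whitespace.
        val = line[i:].lstrip(b" \t")
        records.append((key, val))
    return records


sample = [
    b"# user database\n",
    b"\n",
    b"alice\tAlice Liddell\n",
    b"  bob   Bob the Builder\n",
]
print(parse_map(sample))
```

Since a key ends at the first space or tab and a record ends at the
newline, keys containing whitespace and values containing newlines
cannot be expressed this way; the native format is needed for those.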
OPTIONS SUMMARY
Here is a short summary of all options accepted by the cdb utility:
-0 zero-fill duplicate records in create (-c) mode.
-c create mode.
-d dump mode.
-e abort (error) on duplicate key in create (-c) mode.
-h print short help and exit.
-l list mode.
-m input or output is in "map" format, not in native cdb format.
In query mode, add a newline after every value written.
-nnum find and print numth record in query (-q) mode.
-q query mode.
-r replace duplicate keys in create (-c) mode.
-s statistics mode.
-t tempfile
specify temporary file when creating (-c) cdb file (use single
dash (-) as tempfile to stop using temp file).
-u do not insert duplicate keys (unique) in create (-c) mode.
-w warn about duplicate keys in create (-c) mode.
AUTHOR
The tinycdb package was written by Michael Tokarev <mjt@corpit.ru>.
It is based on the ideas of, and shares its file format with, the
original cdb library by Dan Bernstein.
SEE ALSO
cdb(5), cdb(3).
LICENCE
Public domain.
Jan 2009 cdb(1)