NAME
tchdb - the hash database API
DESCRIPTION
A hash database is a file containing a hash table and is handled with
the hash database API.
To use the hash database API, include ‘tcutil.h’, ‘tchdb.h’, and
related standard header files. Usually, place the following
lines near the top of a source file.
#include <tcutil.h>
#include <tchdb.h>
#include <stdlib.h>
#include <time.h>
#include <stdbool.h>
#include <stdint.h>
Objects whose type is pointer to ‘TCHDB’ are used to handle hash
databases. A hash database object is created with the function
‘tchdbnew’ and is deleted with the function ‘tchdbdel’. To avoid
memory leaks, it is important to delete every object when it is no
longer in use.
Before operations to store or retrieve records, it is necessary to open
a database file and connect the hash database object to it. The
function ‘tchdbopen’ is used to open a database file and the function
‘tchdbclose’ is used to close the database file. To avoid data loss
or corruption, it is important to close every database file when it is
no longer in use. It is forbidden for multiple database objects in a
process to open the same database at the same time.
API
The function ‘tchdberrmsg’ is used in order to get the message string
corresponding to an error code.
const char *tchdberrmsg(int ecode);
‘ecode’ specifies the error code.
The return value is the message string of the error code.
The function ‘tchdbnew’ is used in order to create a hash database
object.
TCHDB *tchdbnew(void);
The return value is the new hash database object.
The function ‘tchdbdel’ is used in order to delete a hash database
object.
void tchdbdel(TCHDB *hdb);
‘hdb’ specifies the hash database object.
If the database is not closed, it is closed implicitly.
Note that the deleted object and its derivatives cannot
be used anymore.
The function ‘tchdbecode’ is used in order to get the most recent
error code of a hash database object.
int tchdbecode(TCHDB *hdb);
‘hdb’ specifies the hash database object.
The return value is the most recent error code.
The following error codes are defined: ‘TCESUCCESS’ for
success, ‘TCETHREAD’ for threading error, ‘TCEINVALID’
for invalid operation, ‘TCENOFILE’ for file not found,
‘TCENOPERM’ for no permission, ‘TCEMETA’ for invalid meta
data, ‘TCERHEAD’ for invalid record header, ‘TCEOPEN’ for
open error, ‘TCECLOSE’ for close error, ‘TCETRUNC’ for
trunc error, ‘TCESYNC’ for sync error, ‘TCESTAT’ for stat
error, ‘TCESEEK’ for seek error, ‘TCEREAD’ for read
error, ‘TCEWRITE’ for write error, ‘TCEMMAP’ for mmap
error, ‘TCELOCK’ for lock error, ‘TCEUNLINK’ for unlink
error, ‘TCERENAME’ for rename error, ‘TCEMKDIR’ for mkdir
error, ‘TCERMDIR’ for rmdir error, ‘TCEKEEP’ for existing
record, ‘TCENOREC’ for no record found, and ‘TCEMISC’ for
miscellaneous error.
The function ‘tchdbsetmutex’ is used in order to set mutual exclusion
control of a hash database object for threading.
bool tchdbsetmutex(TCHDB *hdb);
‘hdb’ specifies the hash database object which is not
opened.
If successful, the return value is true, else, it is
false.
Note that the mutual exclusion control of the database
should be set before the database is opened.
The function ‘tchdbtune’ is used in order to set the tuning parameters
of a hash database object.
bool tchdbtune(TCHDB *hdb, int64_t bnum, int8_t apow, int8_t
fpow, uint8_t opts);
‘hdb’ specifies the hash database object which is not
opened.
‘bnum’ specifies the number of elements of the bucket
array. If it is not more than 0, the default value is
specified. The default value is 16381. The suggested
size of the bucket array is about 0.5 to 4 times the
number of all records to be stored.
‘apow’ specifies the size of record alignment by power of
2. If it is negative, the default value is specified.
The default value is 4 standing for 2^4=16.
‘fpow’ specifies the maximum number of elements of the
free block pool by power of 2. If it is negative, the
default value is specified. The default value is 10
standing for 2^10=1024.
‘opts’ specifies options by bitwise-or: ‘HDBTLARGE’
specifies that the size of the database can be larger
than 2GB by using 64-bit bucket array, ‘HDBTDEFLATE’
specifies that each record is compressed with Deflate
encoding, ‘HDBTBZIP’ specifies that each record is
compressed with BZIP2 encoding, ‘HDBTTCBS’ specifies that
each record is compressed with TCBS encoding.
If successful, the return value is true, else, it is
false.
Note that the tuning parameters should be set before the
database is opened.
The function ‘tchdbsetcache’ is used in order to set the caching
parameters of a hash database object.
bool tchdbsetcache(TCHDB *hdb, int32_t rcnum);
‘hdb’ specifies the hash database object which is not
opened.
‘rcnum’ specifies the maximum number of records to be
cached. If it is not more than 0, the record cache is
disabled. It is disabled by default.
If successful, the return value is true, else, it is
false.
Note that the caching parameters should be set before the
database is opened.
The function ‘tchdbsetxmsiz’ is used in order to set the size of the
extra mapped memory of a hash database object.
bool tchdbsetxmsiz(TCHDB *hdb, int64_t xmsiz);
‘hdb’ specifies the hash database object which is not
opened.
‘xmsiz’ specifies the size of the extra mapped memory.
If it is not more than 0, the extra mapped memory is
disabled. The default size is 67108864.
If successful, the return value is true, else, it is
false.
Note that the mapping parameters should be set before the
database is opened.
The function ‘tchdbsetdfunit’ is used in order to set the unit step
number of auto defragmentation of a hash database object.
bool tchdbsetdfunit(TCHDB *hdb, int32_t dfunit);
‘hdb’ specifies the hash database object which is not
opened.
‘dfunit’ specifies the unit step number. If it is not
more than 0, the auto defragmentation is disabled. It is
disabled by default.
If successful, the return value is true, else, it is
false.
Note that the defragmentation parameters should be set
before the database is opened.
The function ‘tchdbopen’ is used in order to open a database file and
connect a hash database object.
bool tchdbopen(TCHDB *hdb, const char *path, int omode);
‘hdb’ specifies the hash database object which is not
opened.
‘path’ specifies the path of the database file.
‘omode’ specifies the connection mode: ‘HDBOWRITER’ as a
writer, ‘HDBOREADER’ as a reader. If the mode is
‘HDBOWRITER’, the following may be added by bitwise-or:
‘HDBOCREAT’, which means it creates a new database if one
does not exist, ‘HDBOTRUNC’, which means it creates a new
database even if one already exists, ‘HDBOTSYNC’, which
means every transaction synchronizes updated contents with
the device. Both ‘HDBOREADER’ and ‘HDBOWRITER’ can be
combined by bitwise-or with ‘HDBONOLCK’, which means it
opens the database file without file locking, or
‘HDBOLCKNB’, which means locking is performed without
blocking.
If successful, the return value is true, else, it is
false.
The function ‘tchdbclose’ is used in order to close a hash database
object.
bool tchdbclose(TCHDB *hdb);
‘hdb’ specifies the hash database object.
If successful, the return value is true, else, it is
false.
Updates to a database are assured to be written when the
database is closed. If a writer opens a database but
does not close it appropriately, the database will be
broken.
The function ‘tchdbput’ is used in order to store a record into a hash
database object.
bool tchdbput(TCHDB *hdb, const void *kbuf, int ksiz, const void
*vbuf, int vsiz);
‘hdb’ specifies the hash database object connected as a
writer.
‘kbuf’ specifies the pointer to the region of the key.
‘ksiz’ specifies the size of the region of the key.
‘vbuf’ specifies the pointer to the region of the value.
‘vsiz’ specifies the size of the region of the value.
If successful, the return value is true, else, it is
false.
If a record with the same key exists in the database, it
is overwritten.
The function ‘tchdbput2’ is used in order to store a string record into
a hash database object.
bool tchdbput2(TCHDB *hdb, const char *kstr, const char *vstr);
‘hdb’ specifies the hash database object connected as a
writer.
‘kstr’ specifies the string of the key.
‘vstr’ specifies the string of the value.
If successful, the return value is true, else, it is
false.
If a record with the same key exists in the database, it
is overwritten.
The function ‘tchdbputkeep’ is used in order to store a new record into
a hash database object.
bool tchdbputkeep(TCHDB *hdb, const void *kbuf, int ksiz, const
void *vbuf, int vsiz);
‘hdb’ specifies the hash database object connected as a
writer.
‘kbuf’ specifies the pointer to the region of the key.
‘ksiz’ specifies the size of the region of the key.
‘vbuf’ specifies the pointer to the region of the value.
‘vsiz’ specifies the size of the region of the value.
If successful, the return value is true, else, it is
false.
If a record with the same key exists in the database,
this function has no effect.
The function ‘tchdbputkeep2’ is used in order to store a new string
record into a hash database object.
bool tchdbputkeep2(TCHDB *hdb, const char *kstr, const char
*vstr);
‘hdb’ specifies the hash database object connected as a
writer.
‘kstr’ specifies the string of the key.
‘vstr’ specifies the string of the value.
If successful, the return value is true, else, it is
false.
If a record with the same key exists in the database,
this function has no effect.
The function ‘tchdbputcat’ is used in order to concatenate a value at
the end of the existing record in a hash database object.
bool tchdbputcat(TCHDB *hdb, const void *kbuf, int ksiz, const
void *vbuf, int vsiz);
‘hdb’ specifies the hash database object connected as a
writer.
‘kbuf’ specifies the pointer to the region of the key.
‘ksiz’ specifies the size of the region of the key.
‘vbuf’ specifies the pointer to the region of the value.
‘vsiz’ specifies the size of the region of the value.
If successful, the return value is true, else, it is
false.
If there is no corresponding record, a new record is
created.
The function ‘tchdbputcat2’ is used in order to concatenate a string
value at the end of the existing record in a hash database object.
bool tchdbputcat2(TCHDB *hdb, const char *kstr, const char
*vstr);
‘hdb’ specifies the hash database object connected as a
writer.
‘kstr’ specifies the string of the key.
‘vstr’ specifies the string of the value.
If successful, the return value is true, else, it is
false.
If there is no corresponding record, a new record is
created.
The function ‘tchdbputasync’ is used in order to store a record into a
hash database object in asynchronous fashion.
bool tchdbputasync(TCHDB *hdb, const void *kbuf, int ksiz, const
void *vbuf, int vsiz);
‘hdb’ specifies the hash database object connected as a
writer.
‘kbuf’ specifies the pointer to the region of the key.
‘ksiz’ specifies the size of the region of the key.
‘vbuf’ specifies the pointer to the region of the value.
‘vsiz’ specifies the size of the region of the value.
If successful, the return value is true, else, it is
false.
If a record with the same key exists in the database, it
is overwritten. Records passed to this function are
accumulated in an inner buffer and written to the file
in one batch.
The function ‘tchdbputasync2’ is used in order to store a string record
into a hash database object in asynchronous fashion.
bool tchdbputasync2(TCHDB *hdb, const char *kstr, const char
*vstr);
‘hdb’ specifies the hash database object connected as a
writer.
‘kstr’ specifies the string of the key.
‘vstr’ specifies the string of the value.
If successful, the return value is true, else, it is
false.
If a record with the same key exists in the database, it
is overwritten. Records passed to this function are
accumulated in an inner buffer and written to the file
in one batch.
The function ‘tchdbout’ is used in order to remove a record of a hash
database object.
bool tchdbout(TCHDB *hdb, const void *kbuf, int ksiz);
‘hdb’ specifies the hash database object connected as a
writer.
‘kbuf’ specifies the pointer to the region of the key.
‘ksiz’ specifies the size of the region of the key.
If successful, the return value is true, else, it is
false.
The function ‘tchdbout2’ is used in order to remove a string record of
a hash database object.
bool tchdbout2(TCHDB *hdb, const char *kstr);
‘hdb’ specifies the hash database object connected as a
writer.
‘kstr’ specifies the string of the key.
If successful, the return value is true, else, it is
false.
The function ‘tchdbget’ is used in order to retrieve a record in a hash
database object.
void *tchdbget(TCHDB *hdb, const void *kbuf, int ksiz, int *sp);
‘hdb’ specifies the hash database object.
‘kbuf’ specifies the pointer to the region of the key.
‘ksiz’ specifies the size of the region of the key.
‘sp’ specifies the pointer to the variable into which the
size of the region of the return value is assigned.
If successful, the return value is the pointer to the
region of the value of the corresponding record. ‘NULL’
is returned if no record corresponds.
Because an additional zero code is appended at the end of
the region of the return value, the return value can be
treated as a character string. Because the region of the
return value is allocated with the ‘malloc’ call, it
should be released with the ‘free’ call when it is no
longer in use.
The function ‘tchdbget2’ is used in order to retrieve a string record
in a hash database object.
char *tchdbget2(TCHDB *hdb, const char *kstr);
‘hdb’ specifies the hash database object.
‘kstr’ specifies the string of the key.
If successful, the return value is the string of the
value of the corresponding record. ‘NULL’ is returned if
no record corresponds.
Because the region of the return value is allocated with
the ‘malloc’ call, it should be released with the ‘free’
call when it is no longer in use.
The function ‘tchdbget3’ is used in order to retrieve a record in a
hash database object and write the value into a buffer.
int tchdbget3(TCHDB *hdb, const void *kbuf, int ksiz, void
*vbuf, int max);
‘hdb’ specifies the hash database object.
‘kbuf’ specifies the pointer to the region of the key.
‘ksiz’ specifies the size of the region of the key.
‘vbuf’ specifies the pointer to the buffer into which the
value of the corresponding record is written.
‘max’ specifies the size of the buffer.
If successful, the return value is the size of the
written data, else, it is -1. -1 is returned if no
record corresponds to the specified key.
Note that an additional zero code is not appended at the
end of the region of the writing buffer.
The function ‘tchdbvsiz’ is used in order to get the size of the value
of a record in a hash database object.
int tchdbvsiz(TCHDB *hdb, const void *kbuf, int ksiz);
‘hdb’ specifies the hash database object.
‘kbuf’ specifies the pointer to the region of the key.
‘ksiz’ specifies the size of the region of the key.
If successful, the return value is the size of the value
of the corresponding record, else, it is -1.
The function ‘tchdbvsiz2’ is used in order to get the size of the value
of a string record in a hash database object.
int tchdbvsiz2(TCHDB *hdb, const char *kstr);
‘hdb’ specifies the hash database object.
‘kstr’ specifies the string of the key.
If successful, the return value is the size of the value
of the corresponding record, else, it is -1.
The function ‘tchdbiterinit’ is used in order to initialize the
iterator of a hash database object.
bool tchdbiterinit(TCHDB *hdb);
‘hdb’ specifies the hash database object.
If successful, the return value is true, else, it is
false.
The iterator is used in order to access the key of every
record stored in a database.
The function ‘tchdbiternext’ is used in order to get the next key of
the iterator of a hash database object.
void *tchdbiternext(TCHDB *hdb, int *sp);
‘hdb’ specifies the hash database object.
‘sp’ specifies the pointer to the variable into which the
size of the region of the return value is assigned.
If successful, the return value is the pointer to the
region of the next key, else, it is ‘NULL’. ‘NULL’ is
returned when no record remains in the iterator.
Because an additional zero code is appended at the end of
the region of the return value, the return value can be
treated as a character string. Because the region of the
return value is allocated with the ‘malloc’ call, it
should be released with the ‘free’ call when it is no
longer in use. It is possible to access every record by
calling this function repeatedly. Records whose keys
have already been fetched may be updated or removed
during the iteration. However, a complete traversal is
not assured if the database is updated during the
iteration. Besides, the order of this traversal is
arbitrary, so it is not assured that it matches the order
in which the records were stored.
The function ‘tchdbiternext2’ is used in order to get the next key
string of the iterator of a hash database object.
char *tchdbiternext2(TCHDB *hdb);
‘hdb’ specifies the hash database object.
If successful, the return value is the string of the next
key, else, it is ‘NULL’. ‘NULL’ is returned when no
record remains in the iterator.
Because the region of the return value is allocated with
the ‘malloc’ call, it should be released with the ‘free’
call when it is no longer in use. It is possible to
access every record by calling this function repeatedly.
However, a complete traversal is not assured if the
database is updated during the iteration. Besides, the
order of this traversal is arbitrary, so it is not
assured that it matches the order in which the records
were stored.
The function ‘tchdbiternext3’ is used in order to get the next
extensible objects of the iterator of a hash database object.
bool tchdbiternext3(TCHDB *hdb, TCXSTR *kxstr, TCXSTR *vxstr);
‘hdb’ specifies the hash database object.
‘kxstr’ specifies the object into which the next key is
written.
‘vxstr’ specifies the object into which the next value is
written.
If successful, the return value is true, else, it is
false. False is returned when no record remains in the
iterator.
The function ‘tchdbfwmkeys’ is used in order to get forward matching
keys in a hash database object.
TCLIST *tchdbfwmkeys(TCHDB *hdb, const void *pbuf, int psiz, int
max);
‘hdb’ specifies the hash database object.
‘pbuf’ specifies the pointer to the region of the prefix.
‘psiz’ specifies the size of the region of the prefix.
‘max’ specifies the maximum number of keys to be fetched.
If it is negative, no limit is specified.
The return value is a list object of the corresponding
keys. This function never fails. It returns an
empty list even if no key corresponds.
Because the object of the return value is created with
the function ‘tclistnew’, it should be deleted with the
function ‘tclistdel’ when it is no longer in use. Note
that this function may be very slow because every key in
the database is scanned.
The function ‘tchdbfwmkeys2’ is used in order to get forward matching
string keys in a hash database object.
TCLIST *tchdbfwmkeys2(TCHDB *hdb, const char *pstr, int max);
‘hdb’ specifies the hash database object.
‘pstr’ specifies the string of the prefix.
‘max’ specifies the maximum number of keys to be fetched.
If it is negative, no limit is specified.
The return value is a list object of the corresponding
keys. This function never fails. It returns an
empty list even if no key corresponds.
Because the object of the return value is created with
the function ‘tclistnew’, it should be deleted with the
function ‘tclistdel’ when it is no longer in use. Note
that this function may be very slow because every key in
the database is scanned.
The function ‘tchdbaddint’ is used in order to add an integer to a
record in a hash database object.
int tchdbaddint(TCHDB *hdb, const void *kbuf, int ksiz, int
num);
‘hdb’ specifies the hash database object connected as a
writer.
‘kbuf’ specifies the pointer to the region of the key.
‘ksiz’ specifies the size of the region of the key.
‘num’ specifies the additional value.
If successful, the return value is the summation value,
else, it is ‘INT_MIN’.
If the corresponding record exists, the value is treated
as an integer and is added to. If no record corresponds,
a new record of the additional value is stored.
The function ‘tchdbadddouble’ is used in order to add a real number
to a record in a hash database object.
double tchdbadddouble(TCHDB *hdb, const void *kbuf, int ksiz,
double num);
‘hdb’ specifies the hash database object connected as a
writer.
‘kbuf’ specifies the pointer to the region of the key.
‘ksiz’ specifies the size of the region of the key.
‘num’ specifies the additional value.
If successful, the return value is the summation value,
else, it is Not-a-Number.
If the corresponding record exists, the value is treated
as a real number and is added to. If no record
corresponds, a new record of the additional value is
stored.
The function ‘tchdbsync’ is used in order to synchronize updated
contents of a hash database object with the file and the device.
bool tchdbsync(TCHDB *hdb);
‘hdb’ specifies the hash database object connected as a
writer.
If successful, the return value is true, else, it is
false.
This function is useful when another process connects to
the same database file.
The function ‘tchdboptimize’ is used in order to optimize the file of a
hash database object.
bool tchdboptimize(TCHDB *hdb, int64_t bnum, int8_t apow, int8_t
fpow, uint8_t opts);
‘hdb’ specifies the hash database object connected as a
writer.
‘bnum’ specifies the number of elements of the bucket
array. If it is not more than 0, the default value is
specified. The default value is twice the number
of records.
‘apow’ specifies the size of record alignment by power of
2. If it is negative, the current setting is not
changed.
‘fpow’ specifies the maximum number of elements of the
free block pool by power of 2. If it is negative, the
current setting is not changed.
‘opts’ specifies options by bitwise-or: ‘HDBTLARGE’
specifies that the size of the database can be larger
than 2GB by using 64-bit bucket array, ‘HDBTDEFLATE’
specifies that each record is compressed with Deflate
encoding, ‘HDBTBZIP’ specifies that each record is
compressed with BZIP2 encoding, ‘HDBTTCBS’ specifies that
each record is compressed with TCBS encoding. If it is
‘UINT8_MAX’, the current setting is not changed.
If successful, the return value is true, else, it is
false.
This function is useful for reducing the size of a
database file that has become fragmented by successive
updates.
The function ‘tchdbvanish’ is used in order to remove all records of a
hash database object.
bool tchdbvanish(TCHDB *hdb);
‘hdb’ specifies the hash database object connected as a
writer.
If successful, the return value is true, else, it is
false.
The function ‘tchdbcopy’ is used in order to copy the database file of
a hash database object.
bool tchdbcopy(TCHDB *hdb, const char *path);
‘hdb’ specifies the hash database object.
‘path’ specifies the path of the destination file. If it
begins with ‘@’, the trailing substring is executed as a
command line.
If successful, the return value is true, else, it is
false. False is returned if the executed command returns
non-zero code.
The database file is assured to be kept synchronized and
not modified while the copying or executing operation is
in progress. So, this function is useful to create a
backup file of the database file.
The function ‘tchdbtranbegin’ is used in order to begin the transaction
of a hash database object.
bool tchdbtranbegin(TCHDB *hdb);
‘hdb’ specifies the hash database object connected as a
writer.
If successful, the return value is true, else, it is
false.
The database is locked by the thread during the
transaction so that only one transaction can be active
with a database object at a time. Thus, the
serializable isolation level is assumed if every database
operation is performed in the transaction. All updated
regions are tracked by write-ahead logging during
the transaction. If the database is closed during a
transaction, the transaction is aborted implicitly.
The function ‘tchdbtrancommit’ is used in order to commit the
transaction of a hash database object.
bool tchdbtrancommit(TCHDB *hdb);
‘hdb’ specifies the hash database object connected as a
writer.
If successful, the return value is true, else, it is
false.
Updates made in the transaction become permanent when it
is committed successfully.
The function ‘tchdbtranabort’ is used in order to abort the transaction
of a hash database object.
bool tchdbtranabort(TCHDB *hdb);
‘hdb’ specifies the hash database object connected as a
writer.
If successful, the return value is true, else, it is
false.
Updates made in the transaction are discarded when it is
aborted. The state of the database is rolled back to
what it was before the transaction.
The function ‘tchdbpath’ is used in order to get the file path of a
hash database object.
const char *tchdbpath(TCHDB *hdb);
‘hdb’ specifies the hash database object.
The return value is the path of the database file or
‘NULL’ if the object does not connect to any database
file.
The function ‘tchdbrnum’ is used in order to get the number of records
of a hash database object.
uint64_t tchdbrnum(TCHDB *hdb);
‘hdb’ specifies the hash database object.
The return value is the number of records or 0 if the
object does not connect to any database file.
The function ‘tchdbfsiz’ is used in order to get the size of the
database file of a hash database object.
uint64_t tchdbfsiz(TCHDB *hdb);
‘hdb’ specifies the hash database object.
The return value is the size of the database file or 0 if
the object does not connect to any database file.
SEE ALSO
tchtest(1), tchmttest(1), tchmgr(1), tokyocabinet(3)