NAME
pgdbf - convert XBase / FoxPro tables to PostgreSQL
SYNOPSIS
pgdbf [-cCdDeEhtTuU] [-m memofile] filename [indexcolumn ...]
DESCRIPTION
PgDBF is a program for converting XBase databases - particularly FoxPro
tables with memo files - into a format that PostgreSQL can directly
import. It’s a compact C project with no dependencies other than
standard Unix libraries. While the project is relatively tiny and
simple, it’s also heavily optimized via profiling - in routine benchmarks
it ran many times faster than other Open Source programs. In fact,
even on slower systems, conversions are typically limited by hard drive
speed.
Features
PgDBF was designed with a few core principles:
Simplicity. This code should be understandable by anyone who
wants to hack it.
Robustness. Every syscall that might possibly fail is checked
for success.
Speed. PgDBF was born to be the fastest conversion available
anywhere.
Completeness. It has full support for FoxPro memo files.
Portability. PgDBF runs on 32- and 64-bit systems, and on both
little-endian (e.g. x86) and big-endian (e.g. PowerPC)
architectures.
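As an illustration of the portability point, the sketch below reads the
fixed fields at the start of a DBF header using an explicit little-endian
format, so the result does not depend on the byte order of the host. It is
written in Python for brevity and is not PgDBF’s own C code; the function
name parse_dbf_header and the sample bytes are made up for this example.

```python
import struct

def parse_dbf_header(raw: bytes) -> dict:
    """Decode the first 12 bytes of a DBF header (hypothetical helper).

    Layout: 1 version byte, 3 date bytes (YY MM DD), a 32-bit record
    count, a 16-bit header length, and a 16-bit record length. The
    multi-byte fields are stored little-endian in the file.
    """
    if len(raw) < 12:
        raise ValueError("need at least 12 header bytes")
    # "<" forces little-endian with no padding, whatever the host CPU is.
    version, yy, mm, dd, records, header_len, record_len = struct.unpack(
        "<4BIHH", raw[:12]
    )
    return {
        "version": version,
        "last_update": (yy, mm, dd),
        "records": records,
        "header_len": header_len,
        "record_len": record_len,
    }

if __name__ == "__main__":
    # Invented header bytes, built only to exercise the parser.
    sample = (
        bytes([0x30, 24, 1, 15])
        + (1_300_000).to_bytes(4, "little")
        + (296).to_bytes(2, "little")
        + (215).to_bytes(2, "little")
    )
    print(parse_dbf_header(sample))
```

Spelling out the byte order in the format string, rather than relying on
the native layout, is what keeps such a parser identical on x86 and
PowerPC hosts.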
Performance
PgDBF’s speed is generally limited by how fast it can read your hard
drives. A striped RAID of quick disks can keep PgDBF pretty well fed
on a single-processor system. One problem area is with memo files,
which may become very internally fragmented as memo fields are created,
deleted, and updated. For best results, consider placing the DBF and
FPT files on a RAM drive so that there’s no seek penalty as there is
with spinning hard drives, or using a filesystem such as ZFS that
caches aggressively.
One particularly fragmented 160MB table with memo fields used to take
over three minutes on a FreeBSD UFS2 filesystem. Moving the files to a
RAM disk dropped the conversion time to around 1.2 seconds.
A certain test table used during development comprises a 280MB DBF file
and a 660MB memo file. PgDBF converts this to a 1.3 million row
PostgreSQL table in about 11 seconds, or at a rate of almost 120,000
rows per second.
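The figures above can be checked with a few lines of arithmetic. This is
only a worked calculation in Python, not part of PgDBF; the helper names
are invented, and the 180-second value is a lower bound standing in for
"over three minutes".

```python
def rows_per_second(rows: int, seconds: float) -> float:
    """Average conversion rate over the whole run."""
    return rows / seconds

def speedup(before_seconds: float, after_seconds: float) -> float:
    """How many times faster the second run was than the first."""
    return before_seconds / after_seconds

if __name__ == "__main__":
    # 1.3 million rows in about 11 seconds.
    print(f"{rows_per_second(1_300_000, 11):,.0f} rows/s")   # 118,182 rows/s
    # 280MB DBF plus 660MB memo file read in the same 11 seconds.
    print(f"{(280 + 660) / 11:.1f} MB/s read")               # 85.5 MB/s read
    # Fragmented 160MB table: at least 180 s on disk, 1.2 s on a RAM disk.
    print(f"{speedup(180, 1.2):.0f}x faster on a RAM disk")  # 150x faster on a RAM disk
```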
OPTIONS
-c Generate a CREATE TABLE statement to make a table with datatypes
and column names similar to those in the DBF file. Default.
-C Suppress the CREATE TABLE statement.
-d Generate a DROP TABLE statement before the CREATE TABLE
statement. This is useful for replacing the contents of a table
that already exists in PostgreSQL. Default.
-D Suppress the DROP TABLE statement.
-e Change the DROP TABLE statement to DROP TABLE IF EXISTS so that
newer versions of PostgreSQL (8.2+) will only attempt to drop
the table if it’s already defined. PostgreSQL will return an
error when attempting to drop a table that does not exist unless
IF EXISTS is used. Default.
-E Omit the IF EXISTS modifier from the DROP TABLE statement, for
compatibility with versions of PostgreSQL older than 8.2.
-h Print a help message, then exit.
-m memofile
The name of the associated memo file (if necessary).
-t Wrap the entire script in a transaction. Default.
-T Remove the wrapper transaction. This is generally not a good
idea as it can cause the table to appear completely empty to
other clients during the data copying phase. If the entire
process occurs inside a transaction, the update is atomic and
other clients will have full access to all data in the table at
all times.
-u Issue a TRUNCATE TABLE statement to clear the contents of a
table before copying data into it.
-U Suppress the TRUNCATE TABLE statement. Default.
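The option descriptions above imply an order for the statements in the
generated script: the transaction wraps everything, DROP TABLE comes
before CREATE TABLE, and TRUNCATE TABLE comes before the data is copied.
The Python sketch below builds such an outline. It is an illustration
only: script_outline and its parameters are hypothetical names, the column
list is elided, BEGIN and COMMIT are the standard PostgreSQL transaction
statements, and the exact text PgDBF emits may differ.

```python
def script_outline(table, create=True, drop=True, if_exists=True,
                   truncate=False, transaction=True):
    """Return the statements of a conversion script, in order.

    Keyword defaults mirror the options marked "Default" in the list
    above (-c, -d, -e, -t and -U).
    """
    steps = []
    if transaction:                      # -t
        steps.append("BEGIN;")
    if drop:                             # -d, with -e adding IF EXISTS
        modifier = "IF EXISTS " if if_exists else ""
        steps.append(f"DROP TABLE {modifier}{table};")
    if create:                           # -c
        steps.append(f"CREATE TABLE {table} (...);")
    if truncate:                         # -u
        steps.append(f"TRUNCATE TABLE {table};")
    # Placeholder for the data copying phase; not real PgDBF output.
    steps.append(f"-- rows for {table} are loaded here")
    if transaction:
        steps.append("COMMIT;")
    return steps

if __name__ == "__main__":
    for line in script_outline("mytable"):
        print(line)
```

With the defaults, other clients see either the old table or the fully
loaded new one, because nothing becomes visible until the final COMMIT.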
OPTION NOTES
The -c and -d arguments are incompatible with -u as it’s pointless to
truncate a newly-created table. Specifying -c or -d will disable the
TRUNCATE TABLE statement as though -U were given. Similarly, using the
-u argument will disable the CREATE TABLE and DROP TABLE statements as
if -C and -D were given.
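The interaction between these options can be sketched as a small
resolution function. This Python version is not PgDBF’s actual option
parser; resolve_flags is a hypothetical name, and it assumes flags are
processed left to right so that the last conflicting flag wins, which
this page does not state.

```python
def resolve_flags(flags: str) -> dict:
    """Work out which statements are emitted for a string of flag letters.

    Starts from the documented defaults (-c, -d, -U). Letters other
    than c, C, d, D, u and U are ignored in this sketch.
    """
    opts = {"create": True, "drop": True, "truncate": False}
    for flag in flags:
        if flag == "c":
            opts["create"] = True
            opts["truncate"] = False      # -c acts as though -U were given
        elif flag == "C":
            opts["create"] = False
        elif flag == "d":
            opts["drop"] = True
            opts["truncate"] = False      # -d acts as though -U were given
        elif flag == "D":
            opts["drop"] = False
        elif flag == "u":
            # -u acts as though -C and -D were given
            opts.update(create=False, drop=False, truncate=True)
        elif flag == "U":
            opts["truncate"] = False
    return opts

if __name__ == "__main__":
    print(resolve_flags(""))    # defaults: create and drop, no truncate
    print(resolve_flags("u"))   # truncate only
```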
BUGS
When multiple incompatible interpretations of a type are available,
such as the B type which can mean binary object in dBASE V or double-
precision float in FoxPro, PgDBF currently uses the FoxPro
interpretation.
Most XBase datatypes are supported, but some are not (yet). As of this
writing, PgDBF can handle boolean, currency, date, double-precision
float, float, general (although it only outputs empty strings; it’s
unclear how to resolve OLE objects at this time), integer, memo,
numeric, timestamp, and varchar fields. If you need other datatypes,
send a small sample database for testing.
AUTHOR
Kirk Strauser <kstrauser@users.sourceforge.net>