Judy arrays - C library functions for creating and accessing dynamic

NAME

       Judy  arrays  -  C library functions for creating and accessing dynamic
       arrays

SYNOPSIS

       Judy1  - maps an Index (word) to a bit
       JudyL  - maps an Index (word) to a Value (word/pointer)
       JudySL - maps an Index (null terminated string) to a Value
       JudyHS - maps an Index (array-of-bytes) of Length to a Value

DESCRIPTION

       The Judy family of functions  supports  fully  dynamic  arrays.   These
       arrays  may  be indexed by a 32- or 64-bit word (depending on processor
       word size), a null terminated string or an array-of-bytes plus  length.
       A  dynamic  array  (sparsely  populated)  can  also  be thought of as a
       mapping function or associative memory.

       A Word_t is a typedef unsigned long int  in Judy.h and must be the same
       size as sizeof(void *) I.E. a pointer.

       Judy1  functions: Index is a Word_t and Value is just a bit or simply a
       flag that Index is present or missing from  the  array.   This  can  be
       thought of as a huge bitmap.

       JudyL  functions:  Index is a Word_t and Value is a Word_t.  This makes
       JudyL a pure word-to-word/pointer mapper.  JudySL and JudyHL are  based
       on this property of JudyL.

       JudySL  functions:  Index  is  a  null-terminated string and Value is a
       Word_t.

       JudyHS functions:  Index  is  an  array-of-bytes  of  length:   Length.
       Value  is  a  Word_t.  This new addition (May 2004) to Judy is a hybird
       using the best features  of  hashing  and  Judy  methods.   The  author
       believes  JudyHS  is  a  good  replacement  for  a  hashing method when
       resizing the hash table is done during population growth.  A  correctly
       tuned  hash  method  with  a  static  hash table size and population is
       unbeatable for speed.  However,  JudyHS  will  perform  better  than  a
       hashing  method  with  smaller  and larger populations than the optimum
       hash table size.  JudyHS does not have a  degenerate  performance  case
       where  knowledge of the hash algorithm can be exploited.  (I.E.  JudyHS
       does not use a linked list to handle hash collisions, it uses a tree of
       JudyL arrays and a virtual hash table size of 4 billion).

       Judy  arrays  are  both  speed- and memory-efficient, with no tuning or
       configuration  required,  across  a  wide  range  of  index  set  types
       (sequential,  periodic,  clustered,  random).   Judy’s speed and memory
       usage are typically better than  other  data  storage  models  such  as
       skiplists, linked lists, binary, ternary, b-trees, or even hashing, and
       improves with very large data sets.

       A Judy array is created merely by defining  a  null  pointer  and  then
       storing  (inserting)  the  first  element  into  the  array  under that
       pointer.  The memory used by a Judy array is nearly proportional to the
       population (number of elements).

       Judy  has  two  Application  Program  Interfaces  (APIs):   a  C  macro
       interface, and a function call interface.  Because the macro forms  are
       sometimes  faster  and have a simpler error handling interface than the
       equivalent functions, they are the preferred  way  of  using  the  Judy
       functions.

       Since  an  initial (empty) Judy array is represented by a null pointer,
       it is possible to construct an array of Judy arrays.  In other words, a
       Judy  array’s  Values  (except  Judy1)  can  be  pointers to other Judy
       arrays.  This makes it very  simple  to  construct  an  array  with  an
       arbitrary  number of dimensions or Index sizes.  (JudySL and JudyHS are
       implemented using JudyL this way).

A 10 MINUTE TECHNICAL DESCRIPTION

       may be found at http://judy.sourceforge.net/downloads/10minutes.htm

A 3 HOUR TECHNICAL DESCRIPTION (out of date and a bit corny)

       may be found at http://judy.sourceforge.net/application/shop_interm.pdf

DOWNLOADS

       Judy        source        downloads        are       available       at
       http://sourceforge.net/projects/judy
       Binarys may be built and installed in a minute or two after downloading

       For   versions  including  more  platforms  and/or  new  features  see:
       http://judy.sourceforge.net/downloads/

AUTHOR

       Judy was invented by Doug  Baskins  (dougbaskins  .AT,  yahoo.com)  and
       implemented   by  Hewlett-Packard.   (Note:   Judy  is  named  for  the
       inventor’s sister, after discarding many proposed names.)

FILES

       Locations of interest include:
       http://sourceforge.net/projects/judy -- project downloads
       file:/usr/share/doc/Judy/ -- for HTML version of man pages.
       /usr/share/doc/Judy/demo/ -- demonstration program source files.
       The author attempted  to  write  interesting  application  notes  using
       advanced    features    of    Judy.     They    may    be    found   at
       "http://judy.sourceforge.net/application/ (Some may be out of date).

ERRORS

       A lot of thought (and time) went into making  error  handling  in  Judy
       simple,  while  maintaining flexibility and capability.  Error handling
       is a very boring subject even to  write  about.   So  read  this  short
       section  and  use  the  recommended  second  method.   It generates the
       fastest code, uses the least amount of memory and requires you to write
       extra  code  only  for insert/deletes functions.  Also it is compatible
       with the other two methods.  This method is for  production  code  that
       may  want  to  handle malloc() fails differently than the Judy default.
       If the Judy default method of handling malloc() fails are OK, then  use
       the first method.

       There are two (2) categories of Judy error returns, (or for any dynamic
       ADT):

       1) User programming errors (bugs) such as memory corruption or  invalid
       pointers.
       2) Out-of-memory (malloc() failure) with Insert (Set) or Delete (Unset)
       when modifying a Judy array.  Not all calls to insert and  delete  call
       malloc(),  so they may succeed even when a call to malloc() would fail.

       There are roughly three (3) methods of handling errors when  using  the
       macros:

1) Default Error Handling Method

       The default is to print error messages to stderr, for example:

       File ’YourCfile.c’, line 1234: JudyLIns(), JU_ERRNO_* == 2, ID == 321
       This  indicates  that  an  error occurred in the JudyLIns() function at
       line 321.  Line 1234 is the line in ’YourCfile.c’ where the JLI()  call
       failed.   JU_ERRNO_* == 2 is equal to JU_ERRNO_NOMEM (as defined in the
       Judy.h file).  The ID number indicates the source line  number  in  the
       function where the error originated.  Your program then terminates with
       an exit(1);.  By default, both categories of  Judy  error  returns  are
       printed  this  way.   (The  ’ID == 321’ is for die hards that want more
       detail or for debugging Judy itself.)

2) Disable Macro Error Handling

       When your program is "bug free", the only  errors  returned  should  be
       malloc()  failures.   Therefore  all  error returns can be treated as a
       malloc() failure.  By using the below #define, all  error  testing  and
       printing  is turned off.  Additional code needs to be added to the code
       that can have malloc() failures.  Judy was designed to leave  the  same
       data  in  the array before the call if a malloc() fail occurs.  (During
       testing of Judy, we found very few malloc()/OS’s  that  were  bug  free
       after  a malloc() failure.  Sometimes it took weeks to discover because
       most systems go into a paging frenzy before running out of memory).

       #define JUDYERROR_NOTEST 1
       (in your program code), or

       cc -DJUDYERROR_NOTEST sourcefile -lJudy
       (on your command line).

       // This is an example of how to program using method two (2).

       JLI(PValue, PLArray, Index);
       if (PValue == PJERR) goto out_of_memory_handling;

       JLD(RC_int, PLArray, Index);
       if (RC_int == JERR) goto out_of_memory_handling;

       J1S(RC_int, P1Array, Index);
       if (RC_int == JERR) goto out_of_memory_handling;

       J1U(RC_int, P1Array, Index);
       if (RC_int == JERR) goto out_of_memory_handling;

       Note:     Without     ’JUDYERROR_NOTEST’     defined,     the     ’goto
       out_of_memory_handling’  will  never  be executed and will be optimized
       out by the compiler.  The default method will be  used  --  Macro  will
       print error information if an error occurs as explained above.

       With ’JUDYERROR_NOTEST’ defined, the ’goto out_of_memory_handling’ will
       be executed when an error occurs  --  which  should  only  happen  when
       malloc() fails.

3) User-Specified JUDYERROR() Macro Method

       The  JUDYERROR()  macro  (in  Judy.h) provides flexibility for handling
       error returns as needed to suit your program while still using the Judy
       array  macros  instead  of  function  calls.   You  can use a different
       JUDYERROR() macro to suit your  needs.   The  following  example  is  a
       possible  alternative to the default. It is used to distinguish between
       the two types of errors (described above), and explicitly test for  the
       remaining JU_ERRNO_NOMEM errors possible in your program.

       // This is an example of Judy macro API to continue when out of memory
       // and print and exit(1) when any other error occurs.

       #ifndef JUDYERROR_NOTEST
       #include <stdio.h>  // needed for fprintf()

       // This is the macro that the Judy macro APIs use for return codes of -1:

       #define JUDYERROR(CallerFile, CallerLine, JudyFunc, JudyErrno, JudyErrID) \
       {                                                                         \
           if ((JudyErrno) != JU_ERRNO_NOMEM) /* ! a malloc() failure */         \
           {                                                                     \
               (void) fprintf(stderr, "File ’%s’, line %d: %s(), "               \
                   "JU_ERRNO_* == %d, ID == %d\n",                               \
                   CallerFile, CallerLine,                                       \
                   JudyFunc, JudyErrno, JudyErrID);                              \
               exit(1);                                                          \
           }                                                                     \
       }
       #endif // JUDYERROR_NOTEST not defined
       This error handling macro must be included before the #include <Judy.h>
       statement in your program.

NAME

SYNOPSIS

DESCRIPTION

A 10 MINUTE TECHNICAL DESCRIPTION

A 3 HOUR TECHNICAL DESCRIPTION (out of date and a bit corny)

DOWNLOADS

AUTHOR

FILES

ERRORS

1) Default Error Handling Method

2) Disable Macro Error Handling

3) User-Specified JUDYERROR() Macro Method

SEE ALSO