djvused - Multi-purpose DjVu document editor.

NAME

       djvused - Multi-purpose DjVu document editor.

SYNOPSIS

       djvused [options] djvufile

DESCRIPTION

       Program djvused is a powerful command line tool for manipulating multi-
       page documents, creating or  editing  annotation  chunks,  creating  or
       editing  hidden  text layers, pre-computing thumbnail images, and more.
       The program first reads the  DjVu  document  djvufile  and  executes  a
       number of djvused commands.

       Djvused  commands  can  be read from a specific file (when option -f is
       specified), read from the command line (when option -e  is  specified),
       or read from the standard input (the default).

OPTIONS

       -v     Cause  djvused  to  print  a  command line prompt before reading
              commands and a brief message describing  how  each  command  was
              executed.   This  option  is  very  useful for debugging djvused
              scripts and also for interactively entering djvused commands  on
              the standard input.

       -f scriptfile
              Cause djvused to read commands from file scriptfile.

       -e commands
              Cause djvused to execute the commands specified by the option
              argument commands.  It is advisable to surround the djvused
              commands with single quotes in order to prevent unwanted shell
              expansion.

       -s     Cause djvused to save the  file  djvufile  after  executing  the
              specified  commands.   This is similar to executing command save
              immediately before terminating the program.

       -n     Cause djvused to disregard save commands.  This  is  useful  for
              debugging  djvused  scripts  without  overwriting  files on your
              disk.
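
       As a sketch of how options -v and -n might be combined to rehearse a
       script, the following shell fragment writes a hypothetical script
       cleanup.dsed and runs it without overwriting anything.  The names
       cleanup.dsed and myfile.djvu are assumptions made for illustration;
       the guard simply skips the run when djvused or the document is not
       available.

```shell
# Hypothetical script that would normally overwrite its input file.
cat > cleanup.dsed <<'EOF'
remove-thumbnails
save
EOF

# Rehearse the script: -v reports each command, -n disregards "save".
# The run is skipped when djvused or the sample document is missing.
if command -v djvused >/dev/null 2>&1 && [ -f myfile.djvu ]; then
    djvused -v -n myfile.djvu -f cleanup.dsed
fi
```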

DJVUSED EXAMPLES

       There are many ways to use program  djvused.   The  following  examples
       illustrate some common uses of this program.

   Obtaining the size of a page
       Command size outputs the width and height of the selected pages using
       an HTML-friendly syntax.  For instance, the following command prints
       the size of page 3 of document myfile.djvu.

          djvused myfile.djvu -e 'select 3; size'

   Extracting the hidden text
       Command  print-pure-txt  outputs  the  text associated with a page or a
       document.  For instance, the following shell command outputs  the  text
       for  the  entire  document.  Lines and pages are delimited by the usual
       control characters.

          djvused myfile.djvu -e 'print-pure-txt'

       Command print-txt produces  a  more  extensive  output  describing  the
       structure  and the location of the text components.  The syntax of this
       output is  described  later  in  this  man  page.   For  instance,  the
       following shell command outputs extended text information for page 3 of
       document myfile.djvu.

          djvused myfile.djvu -e 'select 3; print-txt'

   Extracting the annotations
       Annotation data can be extracted using command print-ant.   The  syntax
       of  the  annotation  data  is  described  later  in this man page.  For
       instance, the following shell command outputs the annotation  data  for
       the first page of document myfile.djvu.

          djvused myfile.djvu -e 'select 1; print-ant'

       Command  print-ant  only  prints the annotations stored in the selected
       component file.  Command print-merged-ant  also  retrieves  annotations
       from all the component files referenced by the current page (using INCL
       chunks) and prints the merged information.

   Dumping/restoring annotations and text
       Three commands, output-txt, output-ant, and output-all, produce djvused
       scripts.   For instance, the following shell command produces a djvused
       script, myfile.dsed, that recreates all the text and annotation data in
       document myfile.djvu.

          djvused myfile.djvu -e 'output-all' > myfile.dsed

       Script  myfile.dsed  is  a  text  file  that can be easily edited.  The
       following  shell  command  then  recreates  the  text  and   annotation
       information in file myfile.djvu.

          djvused myfile.djvu -f myfile.dsed -s

   Extracting a page
       Both   commands   save-page  and  save-page-with  create  a  DjVu  file
       representing the selected component file of a document.  The  following
       shell  command, for instance, creates a file p05.djvu containing page 5
       of document myfile.djvu.

          djvused myfile.djvu -e 'select 5; save-page p05.djvu'

       Each page of a document might import data from another  component  file
       using  the so-called inclusion ( INCL ) chunks.  Command save-page then
       produces a file with unresolved references to imported  data.   Such  a
       file  should  then be made part of a multi-page document containing the
       required data in other component files.  On  the  other  hand,  command
       save-page-with copies all the imported data into the output file.  This
       file is directly usable. Yet  collecting  several  such  files  into  a
       multi-page document might lead to useless data replication.

   Pre-computing thumbnails
       Command set-thumbnails constructs thumbnails that can later be
       displayed by DjVu viewers.  The following shell command, for instance,
       computes thumbnails of size 64x64 pixels for all pages of file
       myfile.djvu.

          djvused myfile.djvu -e 'set-thumbnails 64' -s

DJVUSED COMMANDS

       Command lines might contain zero, one, or more djvused commands and  an
       optional  comment.   Multiple  djvused  commands must be separated by a
       semicolon character ';'.  Comments are introduced by the '#'  character
       and extend until the end of the command line.
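
       The rules above can be illustrated with a small script file.  The
       following shell fragment is only a sketch: the file name info.dsed
       is hypothetical, and the commands are taken from the examples given
       earlier in this page.

```shell
# Write a hypothetical script; the quoted 'EOF' keeps the shell from
# expanding anything inside the here-document.
cat > info.dsed <<'EOF'
# A line may hold only a comment, or no command at all.

n; ls              # two commands separated by a semicolon
select 3; size     # select page 3, then print its dimensions
EOF

# Show the script that would be passed to djvused with option -f.
cat info.dsed
```

       Such a script could then be executed with, for instance:

          djvused myfile.djvu -f info.dsed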

   Selection commands
       Multi-page  DjVu documents are composed of a number of component files.
       Most component files describe a specific  page  of  a  document.   Some
       component  files  contain  information  shared by several pages such as
       shared image data, shared  annotations  or  thumbnails.   Many  djvused
       commands  operate on selected component files.  All component files are
       initially selected.  The following commands are useful for changing the
       selection.

       n      Print the total number of pages in the document.

       ls     List all component files in the document.  Each line contains an
              optional page number, a letter describing the component file
              type, the size of the component file, and the identifier of the
              component file.  Component file type letters  P,  I,  A,  and  T
              respectively  stand  for  page  data,  shared image data, shared
              annotation data, and thumbnail  data.   Page  numbers  are  only
              listed  for  component  files  containing page data.  When it is
              set, the optional page title (see command set-page-title  below)
              is displayed after the component file identifier.

       select [fileid]
              Select   the  component  file  identified  by  argument  fileid.
              Argument fileid must be either a page number or a component file
              identifier.  The select command selects all component files when
              the argument fileid is omitted.

       select-shared-ant
              Select a component file containing shared annotations.  Only one
              such  component  file is supported by the current DjVu software.
              This component file usually contains annotations  pertaining  to
              the  whole  document  as  opposed  to  specific pages.  An error
              message is displayed if there is no such component file.

       create-shared-ant
              Create  and  select   a   component   file   containing   shared
              annotations.   This  command  only selects the shared annotation
              component  file  if  such  a  component  file  already   exists.
              Otherwise  it creates a new shared annotation component file and
              makes sure that it is imported by all pages in the document.

   Text and annotation commands
       print-pure-txt
              Print the text stored in the hidden text layer of  the  selected
              pages.   A  similar  capability  is  offered by program djvutxt.
              Structural  information  is  sometimes  represented  by  control
              characters.  Text from different pages is delimited by form feed
              characters ("\f").  Lines are delimited by newline characters
              ("\n").  Columns, regions, and paragraphs are sometimes
              delimited by vertical tabs ("\013"), group separators ("\035"),
              and unit separators ("\037") respectively.

       print-txt
              Prints extensive hidden text information for the selected pages.
              This information describes the structure  of  the  text  on  the
              document  page  and  locates the structural elements in the page
              image.  The syntax of this output is described later in this man
              page.

       remove-txt
              Remove  the  hidden text information from the selected component
              files.  For instance, executing commands select  and  remove-txt
              removes all hidden text information from the DjVu document.

       set-txt [djvusedtxtfile]
              Insert  hidden  text  information  into the selected pages.  The
              optional argument djvusedtxtfile names  a  file  containing  the
              hidden text information.  This file must contain data similar to
              what is  produced  by  command  print-txt.   When  the  optional
              argument   is   omitted,  the  program  reads  the  hidden  text
              information from the djvused script until  reaching  an  end-of-
              file or a line containing a single period.

       output-txt
              Prints  a  djvused  script  that  reconstructs  the  hidden text
              information for the selected pages.  This script  can  later  be
              edited  and executed by invoking program djvused with option -f.

       print-ant
              Prints the annotations of  the  selected  component  file.   The
              annotation  data  is represented using a simple syntax described
              later in this document.

       print-merged-ant
              Merge the annotations stored in the selected component files
              with the annotations imported from other component files, such
              as the shared annotation component file, and print the result.
              The annotation data is represented using a simple syntax
              described later in this document.

       remove-ant
              Remove the annotation information from  the  selected  component
              files.   For  instance, executing commands select and remove-ant
              removes all annotation information from the DjVu document.

       set-ant [djvusedantfile]
              Insert  annotations  into  the  selected  component  file.   The
              optional  argument  djvusedantfile  names  a file containing the
              annotation data.  This file must contain data similar to what is
              produced  by  command  print-ant.  When the optional argument is
              omitted, the program reads the annotation data from the  djvused
              script itself until reaching an end-of-file or a line containing
              a single period.

       output-ant
              Print  a  djvused  script  that  reconstructs   the   annotation
              information  for  the  selected pages.  This script can later be
              edited and executed by invoking program djvused with option  -f.

       print-meta
              Print  the  meta-data  part  of the annotations for the selected
              component  file.   This  command  displays  a  subset   of   the
              information  printed  by  command  print-ant  using  a different
              syntax.  Meta-data are organized as key-value pairs.  Each
              printed line contains the key name, such as author or title,
              followed by a tab character ("\t") and a double-quoted string
              representing the UTF-8 encoded meta-data value.

       set-meta [djvusedmetafile]
              Set  the  meta-data  part  of  the  annotations  of the selected
              component file.  The remaining part of the annotations  is  left
              unchanged.   The  optional argument djvusedmetafile names a file
              containing the meta-data.  This file must contain  data  similar
              to what is produced by command print-meta.  When the optional
              argument is omitted, the program reads the meta-data from the
              djvused script itself until reaching an end-of-file or a line
              containing a single period.

       output-all
              Print a djvused script that reconstructs both  the  hidden  text
              and  the  annotation  information  for the selected pages.  This
              script can later be edited  and  executed  by  invoking  program
              djvused with option -f.
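
       Commands set-txt, set-ant, and set-meta accept their data inline,
       terminated by a line containing a single period.  The following shell
       fragment sketches a script that uses this form.  The file name
       update.dsed, the meta-data values, and the page dimensions are
       hypothetical; printf is used only to emit the tab character expected
       between a meta-data key and its value.

```shell
# Build a hypothetical djvused script with inline data blocks.
{
    # Document-wide meta-data go into the shared annotation file.
    echo 'create-shared-ant'
    echo 'set-meta'
    printf 'author\t"%s"\n' 'Jane Doe'
    printf 'title\t"%s"\n' 'Sample document'
    echo '.'                     # a single period ends the inline data

    # Hidden text for page 1, given as one page-level string.
    echo 'select 1'
    echo 'set-txt'
    echo '(page 0 0 2550 3300 "Hello world")'
    echo '.'
} > update.dsed

cat update.dsed
```

       Such a script could then be applied and saved with, for instance:

          djvused myfile.djvu -f update.dsed -s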

   Outline/bookmarks commands
       print-outline
              Print  the  outline  of the document.  Nothing is printed if the
              document contains no outline.

       set-outline [djvusedoutlinefile]
              Insert outline information  into  the  document.   The  optional
              argument  djvusedoutlinefile names a file containing the outline
              information.  This file must contain data  similar  to  what  is
              produced by command print-outline.  When the optional argument
              is omitted, the program reads the outline information from the
              djvused script until reaching an end-of-file or a line
              containing a single period.

   Thumbnail commands
       set-thumbnails sz
              Compute thumbnails of size szxsz pixels and insert them into the
              document.   DjVu viewers can later display these thumbnails very
              efficiently without need to download the  data  for  each  page.
              Typical thumbnail sizes range from 48 to 128 pixels.

       remove-thumbnails
              Remove  the pre-computed thumbnails from the DjVu document.  New
              thumbnails can then be computed using command set-thumbnails.

   Save commands
       The above commands only modify the memory image of the  DjVu  document.
       The following commands provide means to save the modified data into the
       file system.

       save   Save the  modified  DjVu  document  back  into  the  input  file
              djvufile  specified  by  the  arguments  of the program djvused.
              Nothing is done if the DjVu file was not modified.  Passing
              option -s to program djvused is equivalent to executing command
              save before exiting the program.

       save-bundled filename
              Save the current DjVu document  as  a  bundled  multi-page  DjVu
              document  named  filename.   A  similar capability is offered by
              program djvmcvt.

       save-indirect filename
              Save the current DjVu document as an  indirect  multi-page  DjVu
              document.  The index file of the indirect document will be named
              filename.  All other files composing the indirect document  will
              be  saved  into the same directory as the index file.  A similar
              capability is offered by program djvmcvt.

       save-page filename
              Save the selected component file into DjVu file  filename.   The
              selected component file might import data from another component
              file using the  so-called  inclusion  (  INCL  )  chunks.   This
              command  then  produces  a  file  with  unresolved references to
              imported data.  Such a file should then be made part of a multi-
              page  document  containing  the required data in other component
              files.

       save-page-with filename
              Save the selected component file into DjVu file  filename.   All
              data  imported  from  other  component  files is copied into the
              output file as well.  This command always produces a usable DjVu
              file.   On  the other hand, collecting several such files into a
              multi-page document might lead to useless data replication.

   Miscellaneous commands
       help   Display  a  help  message  listing  all  commands  supported  by
              djvused.

       dump   Display  the  EA  IFF  85  structure  of  the document or of the
              selected component file.  A similar  capability  is  offered  by
              program djvudump.

       size   Display  the  width  and  the height of the selected pages.  The
              dimensions of each page are displayed using  a  syntax  suitable
              for direct insertion into the <EMBED...></EMBED> tags.

       set-page-title title
              Sets  a  page title for the selected page.  When page titles are
              available, recent versions  of  the  DjVuLibre  viewers  display
              these  page  titles instead of page numbers and also accept them
              in page selection options.  Command ls can be used to  see  both
              the  page  titles  and page identifiers.  To unset a page title,
              simply make it equal to the page identifier.

DJVUSED FILE FORMATS

       Djvused  uses  a  simple  parenthesized  syntax   to   represent   both
       annotations and hidden text.

       *  This   syntax  is  the  native  syntax  used  by  DjVu  for  storing
          annotations.  Program djvused simply compresses the annotation  data
          using the bzz(1) algorithm.

       *  This  syntax differs from the native syntax used by DjVu for storing
          the hidden text.  Program djvused performs the translations  between
          the  compact  binary  representation  used  by  DjVu  and the easily
          modifiable parenthesized syntax.

   General syntax
       Djvused files are ASCII text files.  The legal  characters  in  djvused
       files are the printable ASCII characters and the space, tab, cr, and nl
       characters.  Using other characters has undefined results.

       Djvused files are composed of a sequence of  expressions  separated  by
       blank characters (space, tab, cr, or nl).  There are four kinds of
       expressions, namely integers, symbols, strings, and lists.

       Integers:
              Integer numbers are represented by one or more digits, with  the
              usual interpretation.

       Symbols:
              Symbols, or identifiers, are sequences of printable ASCII
              characters representing a name or a keyword.  Acceptable
              characters are the alpha-numeric characters, the underscore "_",
              the minus character "-", and  the  hash  character  "#".   Names
              should not begin with a digit or a minus character.

       Strings:
              Strings denote an arbitrary sequence of bytes, usually
              interpreted as a sequence of UTF-8 encoded characters.  Strings
              in djvused files are similar to strings in the C language.  They
              are surrounded by double quote characters.  Certain sequences of
              characters starting with a backslash ("\") have a special
              meaning.  A backslash followed by "a", "b", "t", "n", "v", "f",
              "r", "\", or """ stands for the ASCII character BEL(007),
              BS(010), HT(011), LF(012), VT(013), FF(014), CR(015),
              BACKSLASH(134), or DOUBLEQUOTE(042) respectively, where the
              numbers in parentheses are octal codes.  A backslash followed
              by one to three digits stands for the byte whose octal code is
              expressed by the digits.  All other backslash sequences are
              illegal.  All non-printable ASCII characters must be escaped.

       Lists: Lists are sequences of expressions separated by blanks and
              surrounded by parentheses.  All expression types are acceptable
              within a list, including sub-lists.
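
       The string rules above can be sketched as a small shell helper that
       turns arbitrary text into a djvused string.  The function name
       dsed_quote is hypothetical and is not part of djvused.  For
       simplicity the sketch writes every non-printable byte as a
       three-digit octal escape rather than using the letter escapes.

```shell
# dsed_quote STRING
# Print STRING as a djvused string: double quotes and backslashes get a
# leading backslash, printable ASCII is copied unchanged, and every
# other byte becomes a three-digit octal escape.
dsed_quote() {
    printf '"'
    # od lists the bytes of the argument as three-digit octal numbers.
    for o in $(printf '%s' "$1" | od -An -v -t o1); do
        case $o in
            042) printf '\\"' ;;                  # double quote
            134) printf '\\\\' ;;                 # backslash
            *)
                if [ "$((0$o))" -ge 32 ] && [ "$((0$o))" -le 126 ]; then
                    printf '%b' "\\0$o"           # printable ASCII byte
                else
                    printf '\\%s' "$o"            # octal escape
                fi ;;
        esac
    done
    printf '"\n'
}

dsed_quote 'say "hi"'                    # prints "say \"hi\""
dsed_quote "$(printf 'caf\303\251')"     # prints "caf\303\251"
```

       The second call passes the two UTF-8 bytes of an accented letter,
       which are emitted as octal escapes so that the resulting file stays
       within the legal ASCII character set.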

   Hidden text syntax
       The  building  blocks  of the hidden text syntax are lists representing
       each structural component of the hidden  text.   Structural  components
       have the following form:

          (type xmin ymin xmax ymax ... )

       The  symbol type must be one of page, column, region, para, line, word,
       or char, listed here by decreasing order of importance.   The  integers
       xmin,  ymin,  xmax,  and  ymax represent the coordinates of a rectangle
       indicating the position  of  the  structural  component  in  the  page.
       Coordinates  are measured in pixels and have their origin at the bottom
       left corner of the page.  The remaining expressions in the list are
       either a single string representing the encoded text associated with
       this structural component, or a sequence of structural components of
       a lesser type.

       The  hidden  text  for  each  page  is  simply  represented by a single
       structural element of type page.  Various levels of structural
       information are acceptable.  For instance, the page level component
       might only specify a page level string, or might only provide a list of
       lines,  or  might  provide  a  full  hierarchy  down  to the individual
       characters.
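
       As an illustration, the following shell fragment writes a hidden text
       description with one line made of two words.  This is a sketch only:
       the file name page1.txt, the page dimensions, and all coordinates are
       hypothetical values chosen so that each rectangle lies inside its
       parent.

```shell
# Hypothetical hidden text for one page: page > line > word.
# Each list reads (type xmin ymin xmax ymax ...), with the origin at
# the bottom left corner of the page.
cat > page1.txt <<'EOF'
(page 0 0 2550 3300
 (line 300 3000 1100 3080
  (word 300 3000 620 3080 "Hello")
  (word 680 3000 1100 3080 "world")))
EOF

cat page1.txt
```

       Such a file could then be inserted into page 1 with, for instance:

          djvused myfile.djvu -e 'select 1; set-txt page1.txt' -s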

   Outline/Bookmark syntax
       The outline syntax is a single list of the form

          (bookmarks ...)

       The first element of the list  is  symbol  bookmarks.   The  subsequent
       elements  are  lists  representing  the toplevel outline entries.  Each
       outline entry is represented by a list with the following form:

          (title url ... )

       The string title is the title of the outline  entry.   The  destination
       string  url can be either an arbitrary percent encoded URL, or composed
       of the hash character ("#") followed by  a  page  name  or  number,  or
       composed  of  the  question mark character ("?")  followed by cgi-style
       arguments interpreted by the djvu viewer.  The remaining expressions in
       the list describe subentries of this outline entry.
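
       As an illustration, the following shell fragment writes an outline
       with three toplevel entries, one of which has two subentries.  The
       file name outline.txt, the titles, the page numbers, and the URL are
       hypothetical.

```shell
# Hypothetical outline: each entry is (title url subentries...).
cat > outline.txt <<'EOF'
(bookmarks
 ("Preface" "#1")
 ("Chapter 1" "#5"
  ("Section 1.1" "#6")
  ("Section 1.2" "#9"))
 ("Project home page" "http://www.example.org/"))
EOF

cat outline.txt
```

       Such a file could then be installed with, for instance:

          djvused myfile.djvu -e 'set-outline outline.txt' -s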

   Annotation syntax
       Annotations  are  represented  by a sequence of annotation expressions.
       The following annotation expressions are recognized:

       (background color)
              Specify the color of the viewer area surrounding the DjVu image.
              Colors  are represented with the X11 hexadecimal syntax #RRGGBB.
              For instance, #000000 is black and #FFFFFF is white.

       (zoom zoomvalue)
              Specify  the  initial  zoom  factor  of  the  image.    Argument
              zoomvalue  can  be  one  of  stretch,  one2one,  width, page, or
              composed of the letter d followed by a number in range 1 to  999
              representing  a  zoom  factor  (such  as  in  d300  or  d150 for
              instance.)

       (mode modevalue)
              Specify  the  initial  display  mode  of  the  image.   Argument
              modevalue is one of color, bw, fore, or back.

       (align horzalign vertalign)
              Specify  how  the image should be aligned on the viewer surface.
              By default  the  image  is  located  in  the  center.   Argument
              horzalign  can  be  one  of  left,  center,  or right.  Argument
              vertalign can be one of top, center, or bottom.

       (maparea url comment area ...)
              Define a hyper-link for the specified destination.

              Argument url can have one of the following forms:

                 href
                 (url href target)

              where href is a string representing the destination  and  target
              is a string representing the target frame for the hyper-link, as
              defined by the HTML anchor tag <A>.  The destination string href
              can  be  either an arbitrary percent encoded URL, or composed of
              the hash character ("#") followed by a page name or  number,  or
              composed  of the question mark character ("?")  followed by cgi-
              style arguments interpreted by the djvu  viewer.   Page  numbers
              may  be  prefixed  with  an  optional  sign  to represent a page
              displacement.  For instance the strings "#-1" and "#+1"  can  be
              used to access the previous page and the next page.

              Argument  comment  is  a  string  that might be displayed by the
              viewer when the user moves the mouse over the hyper-link.

              Argument  area  defines  the  shape  and  the  location  of  the
              hyperlink.  The following forms are recognized:

                 (rect xmin ymin width height)
                 (oval xmin ymin width height)
                 (poly x0 y0 x1 y1 ... )
                 (text xmin ymin width height)
                 (line x0 y0 x1 y1)

              All    parameters    are   numbers   representing   coordinates.
              Coordinates are measured in pixels and have their origin at  the
              bottom left corner of the page.

              The  remaining  expressions  in  the  maparea list represent the
              visual effect associated with the hyper-link.

              A first set of options defines how borders are drawn  for  rect,
              oval, polygon, or text hyperlink areas.

                 (none)
                 (xor)
                 (border color)
                 (shadow_in [thickness])
                 (shadow_out [thickness])
                 (shadow_ein [thickness])
                 (shadow_eout [thickness])

              where parameter color has syntax #RRGGBB as described above, and
              parameter thickness is an integer in range 1 to  32.   The  last
              four border options are only supported for rect hyperlink areas.
              The default border is a simple black line.   Border  options  do
              not apply to line areas.

              When  a  border  option is specified, the border becomes visible
              when the user moves the mouse over the hyperlink. The border may
              be made always visible by using the following option:

                 (border_avis)

              The following two options may be used with rect hyperlink areas.
              The complete area will be highlighted using the specified  color
              at the specified opacity (0-100, default 50).

                 (hilite color)
                 (opacity op)

              This  is  often  used with an empty URL for simply emphasizing a
              specific segment of an image.

              The following three options may  be  used  with  line  areas  to
              specify an optional ending arrow, the line width and color.  The
              default is a black line with width 1 and without arrow.

                 (arrow)
                 (width w)
                 (lineclr color)

              Finally the following three options can be used with text areas.
              The  default  background color is transparent.  The default text
              color is black.  The pushpin option indicates that the  text  is
              symbolized  by  a small pushpin icon.  Clicking the icon reveals
              the text.

                 (backclr bkcolor)
                 (textclr txtcolor)
                 (pushpin)

       (metadata ... (key value) ... )
              Define meta-data entries.  Each entry is identified by a  symbol
              key  representing the nature of the meta data entry.  The string
              value represents the value  associated  with  the  corresponding
              key.  Two sets of keys are noteworthy: keys borrowed from the
              BibTeX bibliography system, and keys borrowed from the PDF
              DocInfo metadata.  BibTeX keys are always expressed in
              lowercase, such as year, booktitle, editor, author, etc.
              DocInfo keys start with an uppercase letter, such as Title,
              Author, Subject, Creator, Producer, Trapped, CreationDate, and
              ModDate.  The values associated with the last two keys should be
              dates expressed according to RFC 3339.
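
       As an illustration, the following shell fragment writes an annotation
       file that combines several of the expressions described above.  This
       is a sketch only: the file name page1.ant, the coordinates, the
       colors, the URL, and the meta-data values are all hypothetical.

```shell
# Hypothetical annotations for one page.
cat > page1.ant <<'EOF'
(background #FFFFFF)
(zoom width)
(mode color)
(align center top)
(maparea "#+1" "Go to the next page"
 (rect 2000 100 400 120) (border #0000FF) (border_avis))
(maparea "" "Figure discussed in the text"
 (rect 300 1500 1200 900) (none) (hilite #FFFF00) (opacity 30))
(maparea (url "http://www.example.org/" "_blank") "External site"
 (oval 300 200 500 150) (xor))
(metadata
 (author "Jane Doe")
 (year "2024")
 (Title "Sample document")
 (CreationDate "2024-01-15T09:30:00Z"))
EOF

cat page1.ant
```

       The first maparea links to the next page with a border that is always
       visible, the second only highlights a region and has an empty URL,
       and the third opens an external destination in a named target frame.
       Such a file could then be installed on page 1 with, for instance:

          djvused myfile.djvu -e 'select 1; set-ant page1.ant' -s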

LIMITATIONS

       The current version of program  djvused  only  supports  selecting  one
       component  file or all component files.  There is no way to select only
       a few component files.

CREDITS

       This    program    was    initially    written    by    Leon     Bottou
       <leonb@users.sourceforge.net>   and   was   improved  by  Yann  Le  Cun
       <profshadoko@users.sourceforge.net>,   Florin   Nicsa,   Bill   Riemers
       <docbill@sourceforge.net> and many others.
