NAME

       unpaper - post-processing tool for scanned pages

SYNOPSIS

       unpaper [options] input-file(s) output-file(s)

DESCRIPTION

       unpaper  is  a  post-processing  tool  for  scanned  sheets  of  paper,
       especially for book  pages  that  have  been  scanned  from  previously
       created photocopies.

       The main purpose is to make scanned book pages more readable on
       screen after conversion to PDF. Additionally, unpaper might  be  useful
       to  enhance  the  quality  of  scanned  pages before performing optical
       character recognition (OCR).

OPTIONS

       Filenames may contain a  formatting  placeholder  starting  with  %  to
       insert  a page counter for multi-page processing. E.g.: scan%03d.pbm to
       process files scan001.pbm, scan002.pbm, scan003.pbm etc.
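
       The placeholder follows the familiar printf-style counter syntax. As a
       minimal sketch of how a pattern such as scan%03d.pbm maps to the
       filenames listed above (the helper name expand_pattern is hypothetical
       and not part of unpaper):

```python
# Illustrative only: expand a printf-style page counter in a filename pattern.
def expand_pattern(pattern, first, last):
    """Return the filenames obtained by substituting page numbers first..last."""
    return [pattern % page for page in range(first, last + 1)]


print(expand_pattern("scan%03d.pbm", 1, 3))
# prints ['scan001.pbm', 'scan002.pbm', 'scan003.pbm']
```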

       -l, --layout single|double|none
              Set default layout options for a sheet:

              single: One page per sheet.

              double: Two pages per sheet, landscape orientation (one page  on
              the left half, one page on the right half).

              none: No auto-layout; --mask-scan-point options may be specified
              individually.

              Using single or double automatically sets the corresponding
              --mask-scan-point values. The default is single.

       -start, --start-sheet <sheet>
              Number  of first sheet to process in multi-sheet mode. (default:
              1)

       -end, --end-sheet <sheet>
              Number  of  last  sheet  to  process  in  multi-sheet  mode.  -1
              indicates   processing   until  no  more  input  file  with  the
              corresponding page number is available. (default: -1)

       -#, --sheet <sheet>{,<sheet>[-<sheet>]}
              Optionally specifies  which  sheets  to  process  in  the  range
              between start-sheet and end-sheet.

       -x, --exclude <sheet>{,<sheet>[-<sheet>]}
              Excludes sheets from processing in the range between start-sheet
              and end-sheet.
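
              The sheet-list syntax <sheet>{,<sheet>[-<sheet>]} accepts
              single numbers and ranges separated by commas. The following
              sketch shows one plausible reading of it. The function names
              and the exact way --sheet and --exclude are combined are
              assumptions made for illustration, not unpaper's actual
              implementation, and an explicit end sheet is required here
              (the special value -1 is not handled).

```python
# Hypothetical helpers illustrating the sheet-list syntax, e.g. "1,3-5".
def parse_sheets(spec):
    """Turn a list such as '1,3-5' into [1, 3, 4, 5]."""
    sheets = []
    for part in spec.split(","):
        if "-" in part:
            low, high = part.split("-", 1)
            sheets.extend(range(int(low), int(high) + 1))
        else:
            sheets.append(int(part))
    return sheets


def select_sheets(start, end, include=None, exclude=None):
    """Assumed combination: restrict start..end to 'include', then drop 'exclude'."""
    wanted = set(parse_sheets(include)) if include else None
    skipped = set(parse_sheets(exclude)) if exclude else set()
    return [
        sheet
        for sheet in range(start, end + 1)
        if (wanted is None or sheet in wanted) and sheet not in skipped
    ]


print(select_sheets(1, 6, include="1,3-5", exclude="4"))
# prints [1, 3, 5]
```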

       --pre-rotate -90|90
              Rotates the whole  image  clockwise  (90)  or  counter-clockwise
              (-90) before any other processing.

       --post-rotate -90|90
              Rotates  the  whole  image  clockwise  (90) or counter-clockwise
              (-90) after any other processing.

       -M, --pre-mirror [v[ertical]][,][h[orizontal]]
              Mirror the image, after possible  pre-rotation.  Either  v  (for
              vertical  mirroring),  h  (for horizontal mirroring) or v,h (for
              both) can be specified.

       --post-mirror [v[ertical]][,][h[orizontal]]
              Mirror the image, after any  other  processing  except  possible
              post-rotation.

       --pre-shift <h>,<v>
              Shift   the  image  before  further  processing.  Values  for  h
              (horizontal shift) and v (vertical shift) can either be positive
              or negative.

       --post-shift <h>,<v>
              Shift the image after other processing. Values for h (horizontal
              shift)  and  v  (vertical  shift)  can  either  be  positive  or
              negative.

       --pre-wipe <left>,<top>,<right>,<bottom>
              Manually  wipe  out an area before further processing. Any pixel
              in a wiped area will be set to white. Multiple areas to be wiped
              may be specified by multiple occurrences of this option.

       --post-wipe <left>,<top>,<right>,<bottom>
              Manually wipe out an area after processing. Any pixel in a wiped
              area will be set to white. Multiple areas to  be  wiped  may  be
              specified by multiple occurrences of this option.

       --pre-border <left>,<top>,<right>,<bottom>
              Clear  the  border-area  of the sheet before further processing.
              Any pixel in the border area will be set to white.

       --post-border <left>,<top>,<right>,<bottom>
              Clear the border-area after processing. Any pixel in the  border
              area will be set to white.

       --pre-mask <x1>,<y1>,<x2>,<y2>
              Specify  masks  to  apply before any other processing. Any pixel
              outside a mask  will  be  set  to  white,  unless  another  mask
              includes  this  pixel.   Only  pixels inside a mask will remain.
              Multiple masks may be specified. No deskewing will be applied to
              the masks specified by --pre-mask.

       -s, --size <width>,<height>|<size-name>
              Change the sheet size before other processing is applied.
              Content on the sheet is zoomed to fit the new size while its
              aspect ratio is preserved. If the sheet’s aspect ratio differs
              from that of the content, the zoomed content is centered on the
              sheet. Instead of <width>,<height>, a standard size name such as
              a4 or letter can be given. Possible size names are:

              a5
              a4
              a3
              letter
              legal

              All  size  names  can  also  be  applied  in  rotated  landscape
              orientation, use a4-landscape, letter-landscape etc.
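
              The fit-and-center behaviour can be pictured with a little
              arithmetic. This is a sketch under the assumption that the
              content is scaled by the largest factor that still fits the
              new sheet; the function name and the use of fractional
              offsets are illustrative, and unpaper's own rounding may
              differ.

```python
# Illustrative arithmetic for aspect-preserving zoom with centering.
def fit_to_sheet(content_w, content_h, sheet_w, sheet_h):
    """Return (scale, x_offset, y_offset) for content centered on the sheet."""
    scale = min(sheet_w / content_w, sheet_h / content_h)
    new_w = content_w * scale
    new_h = content_h * scale
    return scale, (sheet_w - new_w) / 2, (sheet_h - new_h) / 2


print(fit_to_sheet(1000, 1000, 2000, 3000))
# prints (2.0, 0.0, 500.0)
```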

       --post-size <width>,<height>|<name>
              Change  the  sheet  size  preserving  the content’s aspect ratio
              after other processing steps are applied.

       --stretch <width>,<height>|<name>
              Change the  sheet  size  before  other  processing  is  applied.
              Content  on  the  sheet  gets  stretched  to the specified size,
              possibly changing the aspect ratio.

       --post-stretch <width>,<height>|<name>
              Change the sheet size after other processing is applied. Content
              on  the  sheet  gets  stretched  to the specified size, possibly
              changing the aspect ratio.

       -z, --zoom <factor>
              Change the sheet size according to the given factor before other
              processing is done.

       --post-zoom <factor>
              Change  the  sheet  size  according  to  the  given factor after
              processing is done.

       -bn, --blackfilter-scan-direction [v[ertical]][,][h[orizontal]]
              Directions in which to search for solidly black areas. Either  v
              (for vertical scanning), h (for horizontal scanning) or v,h (for
              both) can be specified. (default: v,h)

       -bs, --blackfilter-scan-size <size>|<h-size>,<v-size>
              Width of virtual bar used for black area detection. Two values may be
              specified  to  individually  set  horizontal  and vertical size.
              (default: 20,20)

       -bd, --blackfilter-scan-depth <depth>|<h-depth>,<v-depth>
              Size of virtual bar used for  black  area  detection.  (default:
              500,500)

       -bp, --blackfilter-scan-step <step>|<h-step>,<v-step>
              Steps  to  move  virtual bar for black area detection. (default:
              5,5)

       -bt, --blackfilter-scan-threshold <t>
              Ratio of dark pixels above which a  black  area  gets  detected.
              (default: 0.95)

       -bx, --blackfilter-scan-exclude <left>,<top>,<right>,<bottom>
              Area  on  which  the blackfilter should not operate. This can be
              useful to prevent the blackfilter from  working  on  inner  page
              content.  May  be  specified multiple times to set more than one
              area.

       -bi, --blackfilter-intensity <i>
              Intensity with which to delete black areas. Larger  values  will
              leave  less  noise-pixels  around  former  black  areas, but may
              delete page content. (default: 20)

       -ni, --noisefilter-intensity <n>
              Intensity  with  which  to  delete  individual  pixels  or  tiny
              clusters  of  pixels.  Any  cluster  which  only contains n dark
              pixels together will be deleted. (default: 4)

       -ls, --blurfilter-size <size>|<h-size>,<v-size>
              Size of blurfilter area  to  search  for  ’lonely’  clusters  of
              pixels. (default: 100,100)

       -lp, --blurfilter-step <step>|<h-step>,<v-step>
              Size of ’blurring’ steps in each direction. (default: 50,50)

       -li, --blurfilter-intensity <ratio>
              Relative intensity with which to delete tiny clusters of pixels.
              Any blurred area which contains at most the ratio of dark pixels
              will be cleared. (default: 0.01)
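
              As a rough illustration of the ratio test described above,
              the following sketch decides whether one blurfilter area
              would be cleared. The function name is hypothetical, and the
              assumption that the ratio is taken over all pixels of the
              area (by default 100 x 100) is ours.

```python
# Illustrative ratio test for a single blurfilter area.
def should_clear(dark_pixels, area_w=100, area_h=100, intensity=0.01):
    """True if the share of dark pixels in the area is at most 'intensity'."""
    return dark_pixels / (area_w * area_h) <= intensity


print(should_clear(50))   # 0.5% dark pixels: cleared
print(should_clear(200))  # 2% dark pixels: kept
```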

       -gs, --grayfilter-size <size>|<h-size>,<v-size>
              Size  of  grayfilter  mask  to  search  for ’gray-only’ areas of
              pixels. (default: 50,50)

       -gp, --grayfilter-step <step>|<h-step>,<v-step>
              Size of steps moving the  grayfilter  mask  in  each  direction.
              (default: 20,20)

       -gt, --grayfilter-threshold <ratio>
              Relative intensity of grayness which is accepted before clearing
              the grayfilter mask in cases where no black pixel  is  found  in
              the mask. (default: 0.5)

       -p, --mask-scan-point <x>,<y>
              Manually   set   starting  point  for  mask-detection.  Multiple
              --mask-scan-point options may be specified  to  detect  multiple
              masks.

       -m, --mask <x1>,<y1>,<x2>,<y2>
              Manually add a mask, in addition to masks automatically detected
              around the --mask-scan-point coordinates (unless  --no-mask-scan
              is  specified).  Any  pixel outside a mask will be set to white,
              unless another mask covers this pixel.

       -mn, --mask-scan-direction [v[ertical]][,][h[orizontal]]
              Directions in which to search for mask  borders,  starting  from
              --mask-scan-point coordinates. Either v (for vertical scanning),
              h (for horizontal scanning) or v,h (for both) can be  specified.
              (default: h (v may cut text-paragraphs on single-page sheets))

       -ms, --mask-scan-size <size>|<h,v>
              Width of the virtual bar used for mask detection. Two values may
              be specified to individually set horizontal and  vertical  size.
              (default: 50,50)

       -md, --mask-scan-depth <dep>|<h,v>
              Height  of  the  virtual  bar used for mask detection. (default:
              -1,-1, using the total width or height of the sheet)

       -mp, --mask-scan-step <step>|<h,v>
              Steps to move the virtual bar for mask detection. (default: 5,5)

       -mt, --mask-scan-threshold <t>|<h,v>
              Ratio of dark pixels below which an edge gets detected, relative
              to max.  blackness  when  counting  from  the  start  coordinate
              heading towards one edge. (default: 0.1)

       -mm, --mask-scan-minimum <w>,<h>
              Minimum allowed size of an auto-detected mask. Masks detected
              below this size will be discarded and replaced by a mask of the
              size specified by --mask-scan-maximum. (default: 100,100)

       -mM, --mask-scan-maximum <w>,<h>
              Maximum  allowed  size  of an auto-detected mask. Masks detected
              above this size will  be  shrunk  to  the  maximum  value,  each
              direction  individually.  (default:  sheet  size,  or  page size
              derived from --layout option)

       -mc, --mask-color <color>
              Color value with which to wipe out pixels  not  covered  by  any
              mask. May be useful for testing in order to visualize the effect
              of masking. (Note that an RGB-value is expected: R*65536 + G*256
              + B)
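
              The numeric color value can be computed from its components
              with the formula given above. A minimal sketch (the helper
              name mask_color and the 0-255 range check are assumptions
              for illustration):

```python
# Compute the integer expected by --mask-color from 8-bit R, G, B components.
def mask_color(r, g, b):
    for component in (r, g, b):
        if not 0 <= component <= 255:
            raise ValueError("each component must be in the range 0..255")
    return r * 65536 + g * 256 + b


print(mask_color(255, 0, 0))
# prints 16711680 (pure red)
```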

       -dn, --deskew-scan-direction <left>,<top>,<right>,<bottom>
              Edges  from  which to scan for rotation. Each edge of a mask can
              be used to detect the mask’s rotation.  If  multiple  edges  are
              specified,   the   average   value  will  be  used,  unless  the
              statistical deviation exceeds --deskew-scan-deviation. Use  left
              for  scanning  from the left edge, top for scanning from the top
              edge, right  for  scanning  from  the  right  edge,  bottom  for
              scanning  from  the bottom. Multiple directions can be separated
              by commas. (default: left,right)

       -ds, --deskew-scan-size <pixels>
              Size of virtual line for rotation detection. (default: 1500)

       -dd, --deskew-scan-depth <ratio>
              Amount of dark pixels to accumulate until scanning  is  stopped,
              relative to scan-bar size. (default: 0.5)

       -dr, --deskew-scan-range <degrees>
              Range in which to search for rotation, from -degrees to +degrees
              rotation. (default: 5.0)

       -dp, --deskew-scan-step <degrees>
              Steps between single rotation-angle  detections.  Lower  numbers
              lead to better results but slow down processing. (default: 0.1)

       -dv, --deskew-scan-deviation <dev>
              Maximum  statistical  deviation  allowed  among the results from
              detected edges. No rotation if exceeded. (default: 1.0)
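
              How the per-edge results are combined can be sketched as
              follows. The man page does not define "statistical
              deviation" precisely; this example assumes the population
              standard deviation, and the function name is hypothetical.

```python
from statistics import mean, pstdev


# Illustrative combination of per-edge rotation estimates (in degrees).
def combined_rotation(angles, max_deviation=1.0):
    """Average the detected angles, or return 0.0 if they disagree too much."""
    if pstdev(angles) > max_deviation:
        return 0.0  # no rotation is applied when the deviation is exceeded
    return mean(angles)


print(combined_rotation([-3.0, 3.0]))
# prints 0.0 (the two edges disagree by more than the allowed deviation)
```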

       -W, --wipe <left>,<top>,<right>,<bottom>
              Manually wipe out an area. Any pixel in a wiped area will be set
              to  white.  Multiple  --wipe  areas  may  be  specified. This is
              applied after deskewing and before automatic border-scan.

       -mw, --middle-wipe <size>|<left>,<right>
              If --layout is set to double, this may specify  the  size  of  a
              middle area to wipe out between the two pages on the sheet. This
              may be useful if the blackfilter  fails  to  remove  some  black
              areas  (e.g.  resulting from photo-copying in the middle between
              two pages).

       -B, --border <left>,<top>,<right>,<bottom>
              Manually add a border. Any pixel in the border area will be  set
              to  white.  This is applied after deskewing and before automatic
              border-scan.

       -Bn, --border-scan-direction [v[ertical]][,][h[orizontal]]
              Directions in which to search for outer border.  Either  v  (for
              vertical  scanning),  h  (for  horizontal  scanning) or v,h (for
              both) can be specified. (default: v)

       -Bs, --border-scan-size <size>|<h,v>
              Width of virtual bar used for border detection. Two  values  may
              be  specified  to individually set horizontal and vertical size.
              (default: 5,5)

       -Bp, --border-scan-step <step>|<h,v>
              Steps to move virtual bar for border detection. (default: 5,5)

       -Bt, --border-scan-threshold <t>
              Absolute number of dark pixels covered by the  border-scan  mask
              above which a border is detected. (default: 5)

       -Ba, --border-align <left>,<top>,<right>,<bottom>
              Direction   where   to   shift  the  detected  border-area.  Use
              --border-margin to specify horizontal and vertical distances  to
              be kept from the sheet-edge. (default: none)

       -Bm, --border-margin <vertical>,<horizontal>
              Distance  to  keep  from  the  sheet edge when aligning a border
              area. May use measurement suffixes such as cm, in.

       -w, --white-threshold <threshold>
              Brightness ratio  above  which  a  pixel  is  considered  white.
              (default: 0.9)

       -b, --black-threshold <threshold>
              Brightness  ratio  below which a pixel is considered black (non-
              gray). This is used by the gray-filter. This value is also  used
              when  converting  a  grayscale  image  to  black-and-white  mode
              (default: 0.33)
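
              Taken together, the two thresholds split pixels into three
              classes. The sketch below assumes brightness is expressed as
              a ratio from 0.0 (darkest) to 1.0 (brightest); the function
              name is hypothetical.

```python
# Illustrative classification of a pixel by its brightness ratio.
def classify(brightness, white_threshold=0.9, black_threshold=0.33):
    if brightness > white_threshold:
        return "white"
    if brightness < black_threshold:
        return "black"
    return "gray"


print(classify(0.95), classify(0.5), classify(0.2))
# prints white gray black
```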

       -ip, --input-pages 1|2
              If 2 is specified, read two input  images  instead  of  one  and
              internally combine them to a doubled-layout sheet before further
              processing. Before internally combining, --pre-rotate is
              optionally applied individually to both input images as the very
              first processing step.

       -op, --output-pages 1|2
              If 2 is specified, write two output images instead of one, as  a
              result  of  splitting  a  doubled-layout sheet after processing.
              After splitting the sheet, --post-rotate is optionally applied
              individually  to  both output images as the very last processing
              step.

       -S, --sheet-size <width>,<height>|<size-name>
              Force a fixed sheet size. Usually, the sheet size is determined by
              the  input  image size (if input-pages=1), or by the double size
              of the first page in a two-page input set (if input-pages=2). If
              the input image is smaller than the size specified here, it will
              appear centered and surrounded with a white border on the sheet.
              If  the input image is bigger, it will be centered and the edges
              will be cropped. This option may also be helpful to get  regular
              sized  output  images  if the input image sizes differ. Standard
              size-names like a4-landscape, letter,  etc.  may  be  used  (see
              --size). (default: as in input file)
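
              The centering rule covers both the smaller-image and the
              bigger-image case. In this sketch a positive offset is the
              white border added on that side, and a negative offset is
              the amount cropped. The function name and the integer
              rounding are assumptions for illustration.

```python
# Illustrative placement of an input image on a forced sheet size.
def placement(image_w, image_h, sheet_w, sheet_h):
    """Return the (x, y) offset of the image's top-left corner on the sheet."""
    return (sheet_w - image_w) // 2, (sheet_h - image_h) // 2


print(placement(2000, 3000, 2480, 3508))  # smaller image: white border
# prints (240, 254)
print(placement(3000, 4000, 2480, 3508))  # bigger image: edges cropped
# prints (-260, -246)
```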

       --sheet-background black|white
              Sets  a color with which the sheet is filled before any image is
              loaded and placed onto it. This can be  useful  when  the  sheet
              size and the image size differ.

       --no-blackfilter <sheet>{,<sheet>[-<sheet>]}
              Disables  black  area  scan.  Individual  sheet  indices  can be
              specified.

       --no-noisefilter <sheet>{,<sheet>[-<sheet>]}
              Disables the noise  filter.  Individual  sheet  indices  can  be
              specified.

       --no-blurfilter <sheet>{,<sheet>[-<sheet>]}
              Disables  the  blur  filter.  Individual  sheet  indices  can be
              specified.

       --no-grayfilter <sheet>{,<sheet>[-<sheet>]}
              Disables the  gray  filter.  Individual  sheet  indices  can  be
              specified.

       --no-mask-scan <sheet>{,<sheet>[-<sheet>]}
              Disables  mask-detection.  Masks  explicitly  set by --mask will
              still have effect. Individual sheet indices can be specified.

       --no-mask-center <sheet>{,<sheet>[-<sheet>]}
              Disables  auto-centering  of  each   mask.   Auto-centering   is
              performed  by  default  if  the  --layout  option  has been set.
              Individual sheet indices can be specified.

       --no-deskew <sheet>{,<sheet>[-<sheet>]}
              Disables deskewing. Individual sheet indices can be specified.

       --no-wipe <sheet>{,<sheet>[-<sheet>]}
              Disables explicit wipe-areas. This means the effect of parameter
              --wipe can be disabled individually per sheet.

       --no-border <sheet>{,<sheet>[-<sheet>]}
              Disables  explicitly  set  borders.  This  means  the  effect of
              parameter --border can be disabled individually per sheet.

       --no-border-scan <sheet>{,<sheet>[-<sheet>]}
              Disables border-scanning from the edges of the sheet. Individual
              sheet indices can be specified.

       --no-border-align <sheet>{,<sheet>[-<sheet>]}
              Disables  aligning  of the area detected by border-scanning (see
              --border-align). Individual sheet indices can be specified.

       -n, --no-processing <sheet>{,<sheet>[-<sheet>]}
              Do not  perform  any  processing  on  a  sheet  except  pre/post
              rotating  and  mirroring,  and file-depth conversions on saving.
              This option has the same effect as setting all --no-xxx  options
              together. Individual sheet indices can be specified.

       --no-qpixels
              Disable  qpixel-mode  for  deskewing (do not internally use a 4x
              bigger image when rotating).

       --no-multi-pages
              Disable multi-page processing even if the input filename
              contains a counter.

       --dpi <dpi>
              Dots per inch used for conversion of measured size values such
              as 21cm,27.9cm. Note that this parameter should occur before
              specifying  any  size  value  with measurement suffix. (default:
              300)
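
              The conversion from a measured value to pixels is plain
              arithmetic. The sketch below handles only the cm and in
              suffixes mentioned in this page and returns an unrounded
              value; the function name is hypothetical, and how unpaper
              rounds the result is not stated here.

```python
CM_PER_INCH = 2.54


# Illustrative conversion of a size value with measurement suffix to pixels.
def to_pixels(value, dpi=300):
    if value.endswith("cm"):
        return float(value[:-2]) / CM_PER_INCH * dpi
    if value.endswith("in"):
        return float(value[:-2]) * dpi
    return float(value)  # no suffix: already a pixel count


print(round(to_pixels("21cm")), round(to_pixels("27.9cm")))
# prints 2480 3295
```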

       -t, --type pbm|pgm
              Output file type. (default: as input)

       -d, --depth <bits>
              Output pixel depth. (default: as input)

       -T, --test-only
              Do not write any output.  May  be  useful  in  combination  with
              --verbose to get information about the input.

       -in, --input-file-sequence <file-patterns>
              Sequence   of   input  filename  patterns  which  is  repeatedly
              traversed while resolving input filenames. Specifying  a  single
              entry  is  equivalent  to  the first filename argument after the
              options-list.

       -out, --output-file-sequence <file-patterns>
              Sequence  of  output  filename  patterns  which  is   repeatedly
              traversed  while resolving output filenames. Specifying a single
              entry is equivalent to the second filename  argument  after  the
              options-list.

       -si, --start-input <nr>
              Set  the  first  page  number  to  substitute  for  %d  in input
              filenames. Every time the input file sequence is repeated,  this
              number       gets       increased      by      1.      (default:
              (startsheet-1)*inputpages+1)

       -so, --start-output <nr>
              Set the first  page  number  to  substitute  for  %d  in  output
              filenames. Every time the output file sequence is repeated, this
              number      gets      increased      by       1.       (default:
              (startsheet-1)*outputpages+1)
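
              The default start numbers and the repeated traversal of a
              file sequence can be sketched as follows. The function
              names and the example patterns left%d.pbm and right%d.pbm
              are hypothetical; the counter is increased by 1 after each
              full pass through the sequence, as described above.

```python
from itertools import count, islice


# Default first page number, per the formula (startsheet-1)*pages+1.
def default_start(start_sheet, pages_per_sheet):
    return (start_sheet - 1) * pages_per_sheet + 1


# Repeatedly traverse the patterns, bumping the counter after each pass.
def resolve_sequence(patterns, start=1):
    for number in count(start):
        for pattern in patterns:
            yield pattern % number


print(default_start(3, 2))
# prints 5
print(list(islice(resolve_sequence(["left%d.pbm", "right%d.pbm"]), 4)))
# prints ['left1.pbm', 'right1.pbm', 'left2.pbm', 'right2.pbm']
```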

       --insert-blank <nr>{,<nr>[-<nr>]}
              Use  blank  input  instead  of an input file from the input file
              sequence  at  the  specified  index-positions.  The  input  file
              sequence  will be interrupted temporarily and will continue with
              the next input file afterwards. This can  be  useful  to  insert
              blank content into a sequence of input images.

       --replace-blank <nr>{,<nr>[-<nr>]}
              Like --insert-blank, but the input images at the specified index
              positions get replaced with  blank  content  and  thus  will  be
              ignored.
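
              The difference between inserting and replacing can be shown
              on a short list of inputs. This sketch assumes 1-based
              index positions and represents blank content as None; the
              function name is hypothetical.

```python
BLANK = None  # stands in for blank content


# Illustrative resolution of 'count' inputs with blank positions applied.
def resolve_inputs(files, count, insert_blank=(), replace_blank=()):
    remaining = iter(files)
    result = []
    for index in range(1, count + 1):
        if index in insert_blank:
            result.append(BLANK)           # the file sequence pauses here
        elif index in replace_blank:
            next(remaining, None)          # the file is consumed but ignored
            result.append(BLANK)
        else:
            result.append(next(remaining, None))
    return result


print(resolve_inputs(["a", "b", "c"], 4, insert_blank={2}))
# prints ['a', None, 'b', 'c']
print(resolve_inputs(["a", "b", "c"], 3, replace_blank={2}))
# prints ['a', None, 'c']
```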

       --overwrite
              Allow   overwriting   existing   files.  Otherwise  the  program
              terminates with an error if an output-file to be written already
              exists.

       -q, --quiet
              Quiet mode, no output at all.

       -v, --verbose
              Verbose output, more info messages.

       -vv    Even   more  verbose  output,  show  parameter  settings  before
              processing.

       --time Output processing time consumed.

       -V, --version
              Output version and build information.

AUTHOR

       unpaper was written by Jens Gulden <unpaper@jensgulden.de>.

       This manual page was written by Julien BLACHE <jblache@debian.org>, for
       the Debian project (but may be used by others).

                               December 31, 2007