branch: externals/scanner
commit 4fd44f213fa2f515053f4129061ef6ce35769d59
Author: Raffael Stocker <r.stoc...@mnet-mail.de>
Commit: Raffael Stocker <r.stoc...@mnet-mail.de>

    add documentation of unpaper commands and options
---
 scanner.texi | 207 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++-
 1 file changed, 204 insertions(+), 3 deletions(-)

diff --git a/scanner.texi b/scanner.texi
index 4c62296e67..94a8480eae 100644
--- a/scanner.texi
+++ b/scanner.texi
@@ -71,7 +71,10 @@ The document was typeset with
 @c Insert new nodes with `C-c C-c n'.
 @node Overview
 @chapter Overview
-@cindex Overview
+@cindex overview
+
+This chapter provides you with the most important information to
+get started using Scanner.
 
 @menu
 * Introduction::
@@ -81,7 +84,7 @@ The document was typeset with
 
 @node Introduction
 @section Introduction
-@cindex Introduction
+@cindex introduction
 
 If you want to scan a document at high quality with @acronym{OCR,
 optical character recognition} and not use one of the available free
@@ -191,6 +194,7 @@ images.  These are described below.
 @item M-x scanner-scan-document
 @itemx C-u M-x scanner-scan-document
 @itemx C-u N M-x scanner-scan-document
+@findex scanner-scan-document
 Scan a document.  When called without a prefix argument, this command
 will scan only one page.  When called with the default prefix argument
 (as @kbd{C-u M-x scanner-scan-document}), it will ask after each scanned
@@ -233,6 +237,7 @@ for a multi-page scan.
 @item M-x scanner-scan-image
 @itemx C-u M-x scanner-scan-image
 @itemx C-u n M-x scanner-scan-image
+@findex scanner-scan-image
 Scan an image.  When called without a prefix argument, this command
 will scan only one image.  When called with the default prefix argument
 (as @kbd{C-u M-x scanner-scan-image}), it will ask after each scanned
@@ -287,7 +292,7 @@ Scanner menu (@clicksequence{Tools @click{} Scanner}).
 
 @node Configuration Commands
 @section Configuration Commands
-@cindex Configuration Commands
+@cindex configuration commands
 
 The following commands help you configure some of the more-often used
 options.  They only change the options for the running session; if you
@@ -297,6 +302,7 @@ Emacs sessions, use the customization interface.
 @table @kbd
 @item M-x scanner-set-image-resolution
 @item M-x scanner-set-document-resolution
+@findex scanner-set-document-resolution
 These commands interactively asks for a resolution (in @acronym{DPI,
 dots per inch}) to be used in subsequent image and document scans,
 respectively.  The corresponding user options is
@@ -310,6 +316,7 @@ and@*
 document resolution}.
 
 @item M-x scanner-select-papersize
+@findex scanner-select-papersize
 Select a paper size from @code{scanner-paper-sizes} or
 @code{:whatever}.  See also @code{scanner-doc-papersize}.
 
@@ -317,6 +324,7 @@ This command is available in the Scanner menu as@*
 @clicksequence{Tools @click{} Scanner @click{} Select paper size}.
 
 @item M-x scanner-select-image-size
+@findex scanner-select-image-size
 Select an image size.  This command interactively reads x and y
 dimensions in millimeter from the minibuffer and sets
 @code{scanner-image-size} accordingly.
@@ -325,6 +333,7 @@ This command is also available in the Scanner menu as@*
 @clicksequence{Tools @click{} Scanner @click{} Select image size}.
 
 @item M-x scanner-select-outputs
+@findex scanner-select-outputs
 Select the document outputs.  This command reads a list of document
 output formats.  See also @code{scanner-tesseract-outputs}.
 
@@ -332,6 +341,7 @@ This command is also available in the Scanner menu as@*
 @clicksequence{Tools @click{} Scanner @click{} Select document outputs}.
 
 @item M-x scanner-select-languages
+@findex scanner-select-languages
 Select the languages assumed for OCR.  This command reads a list of
 languages used for OCR.  The necessary @command{tesseract} data files
 must be available.  See @code{scanner-tesseract-languages}.
@@ -341,6 +351,7 @@ This command is also available in the Scanner menu as@*
 
 @item M-x scanner-select-device
 @itemx C-u M-x scanner-select-device
+@findex scanner-select-device
 Select a device, possibly triggering auto-detection.  Normally, manual
 device selection is not necessary as @command{scanimage} will
 auto-detect.  However, if you have multiple devices and want to change
@@ -353,6 +364,133 @@ This command is also available in the Scanner menu as@*
 @clicksequence{Tools @click{} Scanner @click{} Select scanning device}
 @end table
 
+The following commands can be found in the ``Scan Enhancement'' submenu
+of the Scanner menu (@clicksequence{Tools @click{} Scanner @click{} Scan
+Enhancement}).  They require @command{unpaper} to be installed.  Scan
+enhancement allows such post-processing operations as rotation,
+de-noising, and deskewing, among others.  It is highly recommended as a
+preparatory step before OCR.  The descriptions of the commands below
+give a few hints on the usage of @command{unpaper}.  For more details,
+see its man page or website.
+
+@table @kbd
+@item M-x scanner-toggle-use-unpaper
+@findex scanner-toggle-use-unpaper
+Toggle the use of @command{unpaper} for scan enhancement.  This command
+changes the option @code{scanner-use-unpaper} during the session.  Only
+when this option is non-@code{nil} will @command{unpaper} be used and
+the other items in the ``Scan Enhancement'' menu be available.
+
+This command is also available in the Scanner menu as@*
+@clicksequence{Tools @click{} Scanner @click{} Scan Enhancement @click{}
+Use unpaper for scan enhancement}
+
+The following commands configure some important processing steps; see
+@ref{Configuring unpaper} for all the options.
+
+@item M-x scanner-select-page-layout
+@findex scanner-select-page-layout
+This command interactively asks for the page layout of the pages to be
+scanned.  Available options are ``single'', ``double'', and ``none''
+(the default).  If you scan a sheet with two pages, for example as with
+a book, you can choose ``double'' here so @command{unpaper} will divide
+the sheet into two output pages.  If you use ``single'', it will try to
+identify the actual (single-)page contents on the sheet and stretch
+these to fit the output page size.  If you don't want any rearrangement,
+choose ``none''.  Note that ``double'' page layout implies a landscape
+orientation.  This command sets the option
+@code{scanner-unpaper-page-layout} accordingly.  If you want to split up
+an input page into two output pages, you must also use the
+@code{scanner-select-output-pages} command.
+
+This command is also available in the Scanner menu as@*
+@clicksequence{Tools @click{} Scanner @click{} Scan Enhancement @click{}
+Select page layout}
+
+@item M-x scanner-select-input-pages
+@findex scanner-select-input-pages
+This command allows you to select the number of input pages.  Available
+options are @code{1} and @code{2}.  It sets the option
+@code{scanner-unpaper-input-pages}.  If you wanted to combine two
+scanned input pages into one page, for example, to have left and right
+sides on one sheet, you would select two input pages and one output
+page, together with a ``single'' (or ``none'') page layout.
+
+This command is also available in the Scanner menu as@*
+@clicksequence{Tools @click{} Scanner @click{} Scan Enhancement @click{}
+Select number of input pages}
+
+@item M-x scanner-select-output-pages
+@findex scanner-select-output-pages
+This command allows you to select the number of output pages.  Available
+options are @code{1} and @code{2}.  It sets the option
+@code{scanner-unpaper-output-pages}.  If you wanted to split one scanned
+input page into two output pages, for example, to have left and right
+sides from a book on separate pages, you would select one input page and
+two output pages, together with a ``double'' page layout.
+
+This command is also available in the Scanner menu as@*
+@clicksequence{Tools @click{} Scanner @click{} Scan Enhancement @click{}
+Select number of output pages}
+
+@item M-x scanner-select-pre-rotation
+@findex scanner-select-pre-rotation
+This command asks for the rotation to be applied before any further
+processing.  Available values are ``clockwise'', ``counter-clockwise'',
+and ``none''.  It sets the @code{scanner-unpaper-pre-rotation} option.
+You should use this option if you have a landscape-oriented document
+scanned as portrait.  Rotating before further processing is especially
+relevant for scanning double-page documents, as it ensures that the
+document is in the correct orientation before @command{unpaper} tries to
+split pages.
+
+This command is also available in the Scanner menu as@*
+@clicksequence{Tools @click{} Scanner @click{} Scan Enhancement @click{}
+Select page rotation before processing}
+
+@item M-x scanner-select-post-rotation
+@findex scanner-select-post-rotation
+This command asks for the rotation to be applied after all the
+processing.  Available values are ``clockwise'', ``counter-clockwise'',
+and ``none''.  It sets the @code{scanner-unpaper-post-rotation} option.
+
+This command is also available in the Scanner menu as@*
+@clicksequence{Tools @click{} Scanner @click{} Scan Enhancement @click{}
+Select page rotation after processing}
+
+@item M-x scanner-select-pre-size
+@findex scanner-select-pre-size
+This command interactively asks for the page size to set before further
+processing.  The scanned sheets will be scaled to this size.  Available
+options are ``a5'', ``a4'', ``a3'', ``a5-landscape'', ``a4-landscape'',
+``a3-landscape'', ``letter'', ``legal'', ``letter-landscape'',
+``legal-landscape'', ``none'', and direct width and height
+specifications as in ``21cm,29.7cm''.  See the documentation for
+@command{unpaper} for the understood units.  If you choose ``none'', no
+size will be specified in the invocation of @command{unpaper} and it
+will select the size based on the input data.
+
+This command is also available in the Scanner menu as@*
+@clicksequence{Tools @click{} Scanner @click{} Scan Enhancement @click{}
+Select page size before processing}
+
+@item M-x scanner-select-post-size
+@findex scanner-select-post-size
+This command interactively asks for the page size to set after all the
+processing.  The processed sheets will be scaled to this size.  Available
+options are ``a5'', ``a4'', ``a3'', ``a5-landscape'', ``a4-landscape'',
+``a3-landscape'', ``letter'', ``legal'', ``letter-landscape'',
+``legal-landscape'', ``none'', and direct width and height
+specifications as in ``21cm,29.7cm''.  See the documentation for
+@command{unpaper} for the understood units.  If you choose ``none'', no
+size will be specified in the invocation of @command{unpaper} and it
+will select the size based on the processed data.
+
+This command is also available in the Scanner menu as@*
+@clicksequence{Tools @click{} Scanner @click{} Scan Enhancement @click{}
+Select page size after processing}
+@end table
+
 
 @node General Options
 @section General Options
@@ -531,42 +669,105 @@ are device-dependent.
 @cindex configuring unpaper
 
 @defopt scanner-unpaper-program
+This variable contains the path of the @command{unpaper} program.
 @end defopt
 
 @defopt scanner-use-unpaper
+If this option is non-@code{nil}, scan enhancement using
+@command{unpaper} is activated.  Although using @command{unpaper} is
+highly recommended, its configuration is a bit elaborate and might be
+confusing at first.  The default is therefore @code{nil}.
 @end defopt
 
 @defopt scanner-unpaper-page-layout
+This option specifies the page layout of the scanned sheets.  Allowed
+values are ``single'', ``double'', and ``none'', setting
+@command{unpaper} up for detection of the page extent.  Note that
+``double'' implies a landscape orientation.  This option corresponds to
+the @option{--layout} option of @command{unpaper}.  See its
+documentation for details on the implications of the values.  The
+default is ``none''.
 @end defopt
 
 @defopt scanner-unpaper-input-pages
+This option selects the number of pages per scanned sheet of input.
+Allowed values are @code{1} and @code{2}.  This variable corresponds to
+the @option{--input-pages} option of @command{unpaper}.  If set to two
+input pages, @command{unpaper} will combine input sheets pairwise.  The
+default is @code{1}.
 @end defopt
 
 @defopt scanner-unpaper-output-pages
+This option selects the number of pages per sheet of processed output.
+Allowed values are @code{1} and @code{2}.  This variable corresponds to
+the @option{--output-pages} option of @command{unpaper}.  If set to two
+output pages, @command{unpaper} will split up every page of processed
+output into two pages.  The default is @code{1}.
 @end defopt
 
 @defopt scanner-unpaper-pre-rotation
+This option specifies the rotation to be applied before further
+processing.  Allowed values are ``clockwise'', ``counter-clockwise'',
+and ``none''.  This variable corresponds to the @option{--pre-rotation}
+option of @command{unpaper}.  If you choose ``none'', no rotation is
+specified in the invocation of @command{unpaper}.  The default is
+``none''.
 @end defopt
 
 @defopt scanner-unpaper-post-rotation
+This option specifies the rotation to be applied after all the
+processing.  Allowed values are ``clockwise'', ``counter-clockwise'',
+and ``none''.  This variable corresponds to the @option{--post-rotation}
+option of @command{unpaper}.  If you choose ``none'', no rotation is
+specified in the invocation of @command{unpaper}.  The default is
+``none''.
 @end defopt
 
 @defopt scanner-unpaper-pre-size
+This option specifies the page size to assume before further processing.
+The scanned input will be scaled to this size.  Allowed values are
+``a5'', ``a4'', ``a3'', ``a5-landscape'', ``a4-landscape'',
+``a3-landscape'', ``letter'', ``legal'', ``letter-landscape'',
+``legal-landscape'', ``none'', and direct width and height
+specifications as in ``21cm,29.7cm''.  This variable corresponds to the
+@option{--size} option of @command{unpaper}.  The default is ``a4''.
 @end defopt
 
 @defopt scanner-unpaper-post-size
+This option specifies the page size to assume after all the processing.
+The processed output will be scaled to this size.  Allowed values are
+``a5'', ``a4'', ``a3'', ``a5-landscape'', ``a4-landscape'',
+``a3-landscape'', ``letter'', ``legal'', ``letter-landscape'',
+``legal-landscape'', ``none'', and direct width and height
+specifications as in ``21cm,29.7cm''.  This variable corresponds to the
+@option{--post-size} option of @command{unpaper}.  The default is ``a4''.
 @end defopt
 
 @defopt scanner-unpaper-border
+This option allows you to force a border of white pixels at the four
+edges of a scanned sheet.  Any list of four integers is allowed, for
+example, @code{(10 10 10 10)} (the default).  This is very useful to
+remove black or gray scan artefacts at the edges of a sheet.  Even if
+this is not specified, @command{unpaper} will try to detect any such
+artefacts and remove them.  However, forcing a border usually leads to
+better results.  This variable corresponds to the @option{--border}
+option of @command{unpaper}.
 @end defopt
 
 @defopt scanner-unpaper-switches
+Any additional parameters to @command{unpaper} can be specified using
+this option.  Any list of valid @command{unpaper} options, given as
+strings, is allowed.
 @end defopt
 
 @node Configuring tesseract
 @section Configuring tesseract
 @cindex configuring tesseract
 
+@defopt scanner-tesseract-program
+This option specifies the path of the @command{tesseract} program.
+@end defopt
+
 @defopt scanner-tessdata-dir
 This option specifies the @file{tessdata} directory.  This directory is
 supposed to contain the language data files for @command{tesseract}.
