On Wed, 13 Feb 2008, Lars Noodén wrote:
> Duncan Patton a Campbell wrote:
> > The following is proposed as a base methodology for paper copy document
> > archival to digital media.
>
> >... subject each scanned page to the following processes:
> >
> > 1. page scanned to .pnm via (sane)
> > 2. OCR extract of text from .pnm (ocrad)
> > 3. conversion of .pnm image to ??? (gm convert)
>
Another consideration with output images is format compatibility. Lossless
is required (as stated previously, to avoid introducing artifacts or
losing detail), and .tif is pretty much the industry standard. It's also
readily rendered by a number of applications.
Problem is, there are many 'flavors' of .tif, many with different
compression schemes.
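
One way to check which 'flavor' a given file is: the compression scheme is
recorded in the TIFF Compression tag (tag 259) of the image directory. The
following is only a minimal sketch in Python, not part of the proposed
methodology. It reads the first image directory of a classic (non-BigTIFF)
file, and the name table lists just a few common values. The function name
tiff_compression is made up for illustration.

```python
import struct

# A few common values of the TIFF Compression tag (tag 259); not exhaustive.
COMPRESSION_NAMES = {
    1: "none",
    2: "CCITT modified Huffman RLE",
    3: "CCITT Group 3 fax",
    4: "CCITT Group 4 fax",
    5: "LZW",
    7: "JPEG",
    8: "Deflate (Adobe)",
    32773: "PackBits",
}


def tiff_compression(data):
    """Return the Compression tag value from the first IFD of TIFF bytes."""
    # The first two bytes give the byte order: "II" little, "MM" big endian.
    if data[:2] == b"II":
        order = "<"
    elif data[:2] == b"MM":
        order = ">"
    else:
        raise ValueError("not a TIFF file")
    magic, ifd_offset = struct.unpack(order + "HI", data[2:8])
    if magic != 42:
        raise ValueError("not a classic TIFF file")
    # The IFD starts with a 2-byte entry count, then 12-byte entries.
    (count,) = struct.unpack(order + "H", data[ifd_offset:ifd_offset + 2])
    for i in range(count):
        start = ifd_offset + 2 + 12 * i
        tag, field_type, n_values = struct.unpack(
            order + "HHI", data[start:start + 8]
        )
        if tag == 259:
            # Compression is a single SHORT, stored at the start of the
            # 4-byte value field.
            (value,) = struct.unpack(order + "H", data[start + 8:start + 10])
            return value
    return 1  # TIFF 6.0 default when the tag is absent: no compression
```

For a file on disk (page.tif is a placeholder name), something like
COMPRESSION_NAMES.get(tiff_compression(open("page.tif", "rb").read()), "other")
would give a readable answer.
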
The most efficient image format is B&W (one bit per pixel), uncompressed
.tif. If your documents are older and have clarity problems in B&W,
going to grayscale is a compromise: it costs more storage space in
exchange for better rendering of faint or uneven detail.
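
To put rough numbers on the storage side of that tradeoff, here is a small
Python sketch of the uncompressed pixel-data size per page. The page size
(8.5 x 11 inches) and the 300 dpi scan resolution are assumptions picked
for illustration, not figures from this thread, and the small TIFF header
overhead is ignored.

```python
def uncompressed_bytes(width_in, height_in, dpi, bits_per_pixel):
    """Raw pixel-data size of one page, each row padded to whole bytes."""
    width_px = round(width_in * dpi)
    height_px = round(height_in * dpi)
    row_bytes = (width_px * bits_per_pixel + 7) // 8
    return row_bytes * height_px


# Assumed example: US Letter page scanned at 300 dpi.
bw = uncompressed_bytes(8.5, 11, 300, 1)    # 1 bit per pixel
gray = uncompressed_bytes(8.5, 11, 300, 8)  # 8 bits per pixel
print(bw, gray)  # 1052700 8415000
```

Under those assumptions, that is about 1 MB per page in B&W versus about
8.4 MB in 8-bit grayscale, roughly an eightfold difference.
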
A standard .tif is also compatible with almost all other imaging systems,
should future conversion/upgrades be performed.
Lee
================================================
Leland V. Lammert [EMAIL PROTECTED]
Chief Scientist Omnitec Corporation
Network/Internet Consultants www.omnitec.net
================================================