On Tue, Jan 23, 2001 at 11:07:21PM -0800, Eric G. Miller wrote:
> Not really easy to do. PostScript has a lot of stuff that simply won't
> translate to html easily or at all.

I'm not really interested in the formatting, just something that will
extract the text of the PS doc and insert some reasonable markup into it
so that I don't have to do it manually. (I'd still expect to need to
clean it up, but even a zeroth approximation would be nice.)

> I've been to at least one site that has documents on-line as a series
> of images -- one per page -- with links between pages, the top and
> bottom. Not beautiful, but actually functional and legible.

I've seen that sort of thing too, and it drives me nuts. I'm a believer
in the theory that textual information should be available online as
_text_, not just as images. (A picture of a word is not worth a thousand
words, but it's almost as big...)

Anyhow, a couple other people have pointed me at tools that may be
appropriate. I'll check them out and report my results.

BTW, anyone know what's up with pstotext? I ran a PS doc through it last
night and there were a lot of extra spa ces in the outpu t, including
many in mid-word. Is this preventable?

-- 
SGI products are used to create the 'Bugs' that entertain us in theatres
and at home. - SGI job posting
Geek Code 3.1: GCS d? s+: a- C++ UL++$ P++>+++ L+++>++++ E- W--(++) N+ o+
!K w---$ O M- V? PS+ PE Y+ PGP t 5++ X+ R++ tv b+ DI++++ D G e* h+ r y+
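
For what it's worth, the "zeroth approximation" asked for above could look something like the sketch below. It assumes the text has already been pulled out of the PS file by some extractor (pstotext or similar) and that paragraphs are separated by blank lines, which may not hold for every document. The function name text_to_html and the title parameter are made up for illustration; this is not part of any existing tool.

```python
import html
import re


def text_to_html(text, title="Untitled"):
    """Wrap blank-line-separated paragraphs of plain text in minimal HTML.

    Hypothetical helper: assumes paragraphs are separated by blank lines.
    """
    # Split on one or more blank lines; each chunk is treated as a paragraph.
    chunks = re.split(r"\n\s*\n", text.strip())
    body = []
    for chunk in chunks:
        # Collapse line breaks and runs of whitespace inside the paragraph.
        para = " ".join(chunk.split())
        if para:
            # Escape &, <, > and quotes so the text is safe inside markup.
            body.append("<p>" + html.escape(para) + "</p>")
    return (
        "<html><head><title>" + html.escape(title) + "</title></head>\n"
        "<body>\n" + "\n".join(body) + "\n</body></html>\n"
    )


if __name__ == "__main__":
    import sys

    # Read extracted text on stdin, write HTML on stdout.
    sys.stdout.write(text_to_html(sys.stdin.read()))
```

Run as a filter, it reads the extracted text on stdin and writes HTML on stdout. It does nothing about headings, lists, or the stray mid-word spaces mentioned above, so manual cleanup would still be needed.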