For some unknown reason things are all working correctly now, and I didn't do anything. The only possibility I can think of is that when the new server was unplugged from the desk where it was set up, swapped with the old outgoing server, and plugged into the Internet, it would not boot: the CPU had died, so a new (and more powerful) one was fitted. I can't think of any other explanation.

Thanks for your help

Bob

--
Robert Isaac
Director and Web Administrator
Volvo Owners Club

----------------------------------------------------------------------

Please provide
more details on what happened. What do you mean, it deleted the PDFs? Could you provide the few lines of stdout from when htdig is parsing a PDF? When you run doc2html.pl or acroconv.pl via the command line on a given PDF, do you see any text returned?

Not all PDFs have text in them; some are actually images of text. These types of PDFs need OCR software to get indexable text out of them.
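For example, something along these lines on the server should show whether a given PDF has any extractable text at all (the file, URL, and config paths here are only placeholders for whatever you have; as far as I recall, htdig invokes an external parser with the file name, the content type, the URL, and the config file):

    # Does xpdf's pdftotext get anything out of it? "-" sends the text to stdout.
    pdftotext /tmp/sample.pdf -

    # Then try the parser by hand with htdig-style arguments (placeholder paths):
    /var/www/cgi-bin/doc2html.pl /tmp/sample.pdf application/pdf \
        http://www.example.com/docs/sample.pdf /etc/htdig/htdig.conf

If pdftotext prints nothing, the PDF is most likely a scanned image, and no parser will get text out of it without OCR.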
You can get more speed by disabling index compression:

    wordlist_compress_zlib: false
    wordlist_compress: false
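To see whether that actually helps, it may be worth timing a full index run before and after the change, e.g. with the rundig wrapper script if your install has it (the path is a placeholder):

    # Time a complete dig/merge cycle:
    time /usr/local/bin/rundig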
Thanks.

On 7/31/05, Robert Isaac < > wrote:
> I am setting up a new ProLiant DL360 G4 server with Red Hat ES Linux 4
> and Apache 2.0.x.
>
> I had copied over htdig 3.1.6 from the old server, but decided to
> install 3.2.0b6 with the view of using it when the server goes live in
> a few days. What a nightmare.
>
> The htdig web site (http://www.htdig.org/dev/htdig-3.2/) is ambiguous
> about 3.2.0b6 and PDF indexing. In FAQ 1.13 it refers to FAQ 4.9. I
> have the xpdf package installed and used it with 3.1.6. When I indexed
> our web site - 3200 pages, half of them PDFs - it took over 13 hours;
> yes, thirteen hours!! And then it deleted every one of the PDFs. That
> was using:
>
>     external_parsers: application/pdf->text/html /var/www/cgi-bin/doc2html.pl
>
> in htdig.conf.
>
> I also tried acroconv.pl but it didn't work at all.
>
> I would appreciate some help with this.
>
> Thanks
>
> Bob
>
> [EMAIL PROTECTED]

