On Tuesday 08 October 2002 08:41, Adrian Bolzan wrote:
> Hello,
>
> I am having trouble parsing Excel and Powerpoint files.
>
> I am using "xlhtml" and "ppthtml".
>
> Using application/msexcel (or application/vnd.ms-excel) from the
> command line converts the documents.  The command I used was:
>
> $ /usr/local/bin/doc2html.pl /var/httpd/aotnet/htdocs/PDFs/test.xls \
>     application/msexcel http://URL/test.xls \
>     /home/httpd/aotnet/htdocs/search/conf/htdig.conf
>
> However, when I use these content types in htdig.conf, the Excel and
> PowerPoint documents are not parsed.
>
> I see that the content type is application/excel when running htdig with
> the -vvv option.  However, the .xls files are not parsed with this content
> type, either at the command prompt or in htdig.conf.
>
> Similar effects are seen for PowerPoint documents.
>
> Word docs (catdoc), PDFs (pdf2html.pl) and RTFs (rtf2html.pl) are fine.

Could it be that the extensions for Excel and Powerpoint (.xls,.ppt) are 
listed in the "exclude_urls" directive of your ht://Dig configuration?
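A quick way to check this is a small script along the following lines. It is only a rough sketch: it assumes the usual "name: value" attribute syntax with backslash line continuations, treats each exclude_urls entry as a plain substring of the URL, and does not follow include files or expand variables. The helper names and the sample configuration are made up for illustration and are not your actual file.

```python
import re


def read_attribute(conf_text, name):
    """Return the whitespace-separated values of one htdig.conf attribute."""
    # Join lines that end with a backslash continuation.
    joined = re.sub(r"\\[ \t]*\r?\n", " ", conf_text)
    for line in joined.splitlines():
        line = line.strip()
        # Skip blank lines, comments, and anything that is not "name: value".
        if not line or line.startswith("#") or ":" not in line:
            continue
        key, value = line.split(":", 1)
        if key.strip() == name:
            return value.split()
    return []


def excluding_patterns(conf_text, url):
    """List the exclude_urls entries that occur as substrings of url."""
    return [p for p in read_attribute(conf_text, "exclude_urls") if p in url]


# Hypothetical excerpt, used here only to exercise the functions.
sample_conf = """\
# hypothetical excerpt, not the poster's actual file
exclude_urls: /cgi-bin/ .cgi \\
              .xls .ppt
"""

print(excluding_patterns(sample_conf, "http://URL/test.xls"))
```

To run it against a real configuration, read the file with open(path).read() and pass the text in place of sample_conf. An empty list means no exclude_urls entry matched the URL under these assumptions.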

hth

  Torsten

-- 
InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14                            Tel: +49-4101-403605
D-25474 Ellerbek                            Fax: +49-4101-403606
E-Mail: [EMAIL PROTECTED]            Internet: http://www.inwise.de



_______________________________________________
htdig-general mailing list <[EMAIL PROTECTED]>
To unsubscribe, send a message to <[EMAIL PROTECTED]> with a 
subject of unsubscribe
FAQ: http://htdig.sourceforge.net/FAQ.html
