On Tuesday 08 October 2002 08:41, Adrian Bolzan wrote:
> Hello,
>
> I am having trouble parsing Excel and Powerpoint files.
>
> I am using "xlhtml" and "ppthtml".
>
> Using application/msexcel (or application/vnd.ms-excel) from the
> command line converts the documents. The command I used was:
>
> $ /usr/local/bin/doc2html.pl /var/httpd/aotnet/htdocs/PDFs/test.xls application/msexcel http://URL/test.xls /home/httpd/aotnet/htdocs/search/conf/htdig.conf
>
> However, when i use these content types in htdig.conf the excel and
> powerpoint documents are not parsed.
>
> I see that the content type is application/excel when using the command
> -vvv option with htdig. However, the .xls files are not parsed when using
> this content type at the command prompt nor when used in htdig.conf.
>
> Similar effects are seen for powerpoint documents.
>
> Word docs (catdoc), PDF's (pdf2html.pl) and RTF's (rtf2html.pl) are fine.

Could it be that the extensions for Excel and Powerpoint (.xls, .ppt) are
listed in the "exclude_urls" directive of your ht://Dig configuration?

hth
Torsten

-- 
InWise - Wirtschaftlich-Wissenschaftlicher Internet Service GmbH
Waldhofstraße 14       Tel: +49-4101-403605
D-25474 Ellerbek       Fax: +49-4101-403606
E-Mail: [EMAIL PROTECTED]   Internet: http://www.inwise.de
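
A quick way to test the exclude_urls theory is to check whether any pattern in that directive occurs in the document's URL. The following is only a rough sketch, not part of ht://Dig: it assumes that htdig.conf holds "name: value" lines with a trailing backslash continuing a line, and that exclude_urls patterns are plain substrings of the URL. The function names and the sample configuration are made up for illustration; the sample is not the poster's real htdig.conf.

```python
def parse_htdig_conf(text):
    """Collect 'name: value' attributes; a trailing backslash continues a line."""
    attrs = {}
    logical = ""
    for raw in text.splitlines():
        line = raw.rstrip()
        if line.endswith("\\"):
            # Continuation: keep accumulating the logical line.
            logical += line[:-1] + " "
            continue
        logical += line
        stripped = logical.strip()
        logical = ""
        if not stripped or stripped.startswith("#"):
            continue  # skip blank lines and comments
        name, sep, value = stripped.partition(":")
        if sep:
            attrs[name.strip().lower()] = value.strip()
    return attrs


def matching_exclusions(attrs, url, attribute="exclude_urls"):
    """Return the patterns of `attribute` that occur as substrings of `url`."""
    patterns = attrs.get(attribute, "").split()
    return [p for p in patterns if p in url]


# Hypothetical excerpt for illustration only.
SAMPLE_CONF = """\
# hypothetical excerpt, not the poster's real htdig.conf
start_url: http://URL/
exclude_urls: /cgi-bin/ .cgi \\
              .xls .ppt
"""

attrs = parse_htdig_conf(SAMPLE_CONF)
print(matching_exclusions(attrs, "http://URL/test.xls"))
```

To use it on a real setup, read the actual htdig.conf into a string and pass one of the affected URLs. An empty list would suggest that exclude_urls is not the cause, at least under the substring assumption above; this sketch does not handle any other pattern syntax or case-insensitive matching.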

