As far as I know, "Nutch" will satisfy your needs, although I haven't
tested it myself yet.
*Hey guys*, I'm looking for a full-text crawler for Solr to index HTML,
OpenOffice and MS Office documents, PDF, and many more formats.
How do you index your data?
Maybe you can help me find a crawler.
King
--
ekaabo GmbH
Christian Weyand
Developer
christian.wey...@ekaabo.de
Grundelbachstr. 84
69469 Weinheim
tel: +49-(0)6201-84520-0 (switchboard)
fax: +49-(0)6201-84520-29
www.ekaabo.de
Amtsgericht Mannheim / HRB 701542
Managing Director: Marco Ripanti