Re: solr application for website crawling and indexing html, pdf, word, ... files

2010-01-25 Thread Markus Jelsma
Hello Frank,

Answers are inline:

Frank van Lingen said:
> I recently started working with Solr and find it easy to set up and
> tinker with.
>
> I now want to scale up my setup and was wondering if there is an
> application/component that can do the following (I was not able to find
> documentatio

Re: solr application for website crawling and indexing html, pdf, word, ... files

2010-01-25 Thread mike anderson
I think you might be looking for Apache Tika.

On Mon, Jan 25, 2010 at 3:55 PM, Frank van Lingen wrote:
> I recently started working with Solr and find it easy to set up and tinker
> with.
>
> I now want to scale up my setup and was wondering if there is an
> application/component that can do the
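A rough sketch of how the Tika suggestion is usually wired into Solr: Solr ships an extracting request handler (Solr Cell) that passes an uploaded file to Tika and indexes the extracted text. The snippet below assumes that handler is enabled at its conventional path `/update/extract`; the base URL, the document id `doc1`, and the helper names `build_extract_url` and `index_file` are made up for illustration and are not from this thread.

```python
from urllib.parse import urlencode
from urllib.request import Request, urlopen


def build_extract_url(solr_base, doc_id, commit=True):
    """Build the URL for Solr's extracting handler (assumed at /update/extract)."""
    # literal.id sets the unique key of the resulting Solr document.
    params = {"literal.id": doc_id}
    if commit:
        params["commit"] = "true"
    return solr_base.rstrip("/") + "/update/extract?" + urlencode(params)


def index_file(solr_base, doc_id, path, content_type="application/octet-stream"):
    """POST the raw file bytes; Solr hands them to Tika for text extraction."""
    with open(path, "rb") as fh:
        body = fh.read()
    req = Request(
        build_extract_url(solr_base, doc_id),
        data=body,
        headers={"Content-Type": content_type},
        method="POST",
    )
    with urlopen(req) as resp:
        return resp.status  # HTTP status code returned by Solr
```

A call might look like `index_file("http://localhost:8983/solr", "doc1", "report.pdf", "application/pdf")`, assuming a Solr instance is listening at that address. Crawling a website is a separate concern that this handler does not cover.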

solr application for website crawling and indexing html, pdf, word, ... files

2010-01-25 Thread Frank van Lingen
I recently started working with Solr and find it easy to set up and tinker with.

I now want to scale up my setup and was wondering if there is an application/component that can do the following (I was not able to find documentation on this on the Solr site):

- Can I send Solr an XML document with a