Erik Hatcher wrote:

The idea of having Solr handle various document types is a good one, for sure. I'm not sure what specifics would need to be implemented, but I at least wanted to reply and say it's a good idea!

Care has to be taken when passing Solr a URL to fetch, though. There are a lot of complexities in fetching resources via HTTP, especially when handing something off to Solr, which should be behind a firewall and may not be able to see the web as you would with your browser.

In that case the client should encode the content and send it as part of the index insert/update request. The aim is merely to avoid the bloat caused by encoding the document (e.g. as base64) when the indexer can access the source document directly.
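
To put a rough number on that bloat, here is a minimal Python sketch of the two options. The request shape and the field names (content_url, content_base64) are made up for illustration and are not part of any existing Solr API; the URL is a placeholder.

```python
import base64
import json


def build_update(doc_id, url=None, raw=None):
    """Build a hypothetical update body; field names are illustrative only."""
    if url is not None:
        # The indexer can reach the source itself: send only a reference.
        return {"id": doc_id, "content_url": url}
    if raw is None:
        raise ValueError("either url or raw must be given")
    # The indexer cannot reach the source: inline the bytes as base64.
    return {"id": doc_id, "content_base64": base64.b64encode(raw).decode("ascii")}


# 3072 bytes standing in for a binary document such as a PDF.
raw = bytes(range(256)) * 12

inline = build_update("doc1", raw=raw)
by_ref = build_update("doc1", url="http://example.com/doc1.pdf")

encoded_len = len(inline["content_base64"])
print(len(raw), encoded_len)  # 3072 4096
print(len(json.dumps(by_ref)), len(json.dumps(inline)))
```

Base64 emits four output characters for every three input bytes, so the inline form is about a third larger than the raw document, while the by-reference form stays a few dozen bytes regardless of document size.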

--
Alan Burlison
--
