On 7/26/2012 7:34 AM, Rafał Kuć wrote:
> If you use Java (and I think you do, because you mention Lucene) you
> should take a look at StreamingUpdateSolrServer. It not only allows
> you to send data in batches, but also index using multiple threads.

A caveat to what Rafał said:

The streaming object has no error detection out of the box. It queues everything up internally and returns immediately. Behind the scenes, it uses multiple threads to send documents to Solr, but any errors encountered are simply sent to the logging mechanism and then ignored. When you use HttpSolrServer, any error encountered throws an exception, but each request blocks until it completes. If you need both concurrency and error detection, you have to manage multiple indexing threads yourself.
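To illustrate what managing the indexing threads yourself could look like, here is a minimal sketch using only java.util.concurrent. BatchSender and ParallelIndexer are hypothetical names, not part of SolrJ; BatchSender stands in for whatever blocking call you use that throws on failure (for example an add on HttpSolrServer), and documents are plain strings to keep the sketch self-contained.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

// Hypothetical stand-in for a blocking client call that throws on error.
interface BatchSender {
    void send(List<String> batch) throws Exception;
}

class ParallelIndexer {
    // Sends every batch on a worker thread, waits for all of them,
    // and returns the exceptions thrown by the batches that failed.
    static List<Throwable> indexAll(List<List<String>> batches,
                                    BatchSender sender,
                                    int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Void>> results = new ArrayList<>();
            for (List<String> batch : batches) {
                Callable<Void> task = () -> {
                    sender.send(batch); // blocks until done, throws on error
                    return null;
                };
                results.add(pool.submit(task));
            }
            List<Throwable> failures = new ArrayList<>();
            for (Future<Void> result : results) {
                try {
                    result.get(); // rethrows the worker's exception, wrapped
                } catch (ExecutionException e) {
                    failures.add(e.getCause());
                }
            }
            return failures;
        } finally {
            pool.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        List<List<String>> batches = Arrays.asList(
                Arrays.asList("doc1", "doc2"),
                Arrays.asList("bad"),
                Arrays.asList("doc3"));
        // Fake sender for the demo: rejects any batch containing "bad".
        List<Throwable> failures = indexAll(batches, batch -> {
            if (batch.contains("bad")) {
                throw new IllegalStateException("rejected: " + batch);
            }
        }, 2);
        System.out.println("failed batches: " + failures.size());
    }
}
```

The calling thread learns about failures because Future.get() rethrows whatever the worker threw, wrapped in an ExecutionException. The trade-off is that you have to wait on the futures at some point, so you lose the fire-and-forget behavior of the streaming class.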

Apparently there is a method in the concurrent class that you can override to handle errors differently, though I have not seen how to write code so that your program would know that an error occurred. I filed an issue with a patch to solve this, but some of the developers have come up with an idea that might be better. None of these ideas has been committed to the project.
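One possible shape for such an override, offered as an untested assumption rather than a confirmed recipe, is to have the overridden method record each error in a thread-safe collection that the program checks once indexing has finished. The sketch below does not use SolrJ at all: QueueingClient is a hypothetical stand-in that mimics the behavior described above (add returns immediately, a background thread reports failures only through an error hook), and ErrorTrackingClient, send, finish and getErrors are invented names.

```java
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a queueing client, NOT the real SolrJ class.
class QueueingClient {
    private final ExecutorService worker = Executors.newSingleThreadExecutor();

    // Returns immediately; the document is sent on a background thread.
    void add(String doc) {
        worker.execute(() -> {
            try {
                send(doc);
            } catch (Exception e) {
                handleError(e);
            }
        });
    }

    // Pretend transport for the sketch: rejects empty documents.
    void send(String doc) throws Exception {
        if (doc.isEmpty()) {
            throw new IllegalArgumentException("empty document");
        }
    }

    // Default hook: log the error and carry on.
    void handleError(Throwable ex) {
        System.err.println("error: " + ex);
    }

    // Waits for queued documents to be processed.
    void finish() throws InterruptedException {
        worker.shutdown();
        worker.awaitTermination(1, TimeUnit.MINUTES);
    }
}

// Overrides the hook so the caller can find out that something failed.
class ErrorTrackingClient extends QueueingClient {
    private final List<Throwable> errors = new CopyOnWriteArrayList<>();

    @Override
    void handleError(Throwable ex) {
        errors.add(ex); // called from the background thread
    }

    List<Throwable> getErrors() {
        return errors;
    }

    public static void main(String[] args) throws Exception {
        ErrorTrackingClient client = new ErrorTrackingClient();
        client.add("doc1");
        client.add(""); // fails in the background
        client.finish();
        System.out.println("errors seen: " + client.getErrors().size());
    }
}
```

Even if this pattern carries over to the real class, the program only learns about failures after the fact and cannot easily tell which document caused each one, which is a large part of why the issue linked below exists.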

https://issues.apache.org/jira/browse/SOLR-3284

Just an FYI, the streaming class was renamed to ConcurrentUpdateSolrServer in Solr 4.0 Alpha. Both class names are available in 3.6.x.

Thanks,
Shawn
