On 7/26/2012 7:34 AM, Rafał Kuć wrote:
> If you use Java (and I think you do, because you mention Lucene) you
> should take a look at StreamingUpdateSolrServer. It not only allows
> you to send data in batches, but also index using multiple threads.
A caveat to what Rafał said:
The streaming object has no error detection out of the box. It queues
everything up internally and returns immediately. Behind the scenes, it
uses multiple threads to send documents to Solr, but any errors
encountered are simply sent to the logging mechanism, then ignored.
When you use HttpSolrServer, any error encountered is thrown to your
code as an exception, but each request blocks until it completes. If
you need both concurrent capability and error detection, you would
have to manage multiple indexing threads yourself.
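A rough sketch of what "manage the threads yourself" could look like, using only java.util.concurrent. The names ParallelIndexer, BatchSender and indexAll are made up for illustration; BatchSender is a stand-in for a blocking call such as HttpSolrServer.add(docs), so nothing here is SolrJ code.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.Callable;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;

final class ParallelIndexer {

    /** Hypothetical stand-in for a blocking send, e.g. wrapping HttpSolrServer.add(batch). */
    interface BatchSender<T> {
        void send(List<T> batch) throws Exception;
    }

    /**
     * Submits every batch to a fixed-size thread pool, waits for all of
     * them to finish, and returns the exceptions that were thrown.
     * An empty list means every batch was sent without error.
     */
    static <T> List<Throwable> indexAll(List<List<T>> batches,
                                        final BatchSender<T> sender,
                                        int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        try {
            List<Future<Void>> futures = new ArrayList<Future<Void>>();
            for (final List<T> batch : batches) {
                futures.add(pool.submit(new Callable<Void>() {
                    public Void call() throws Exception {
                        sender.send(batch); // blocks; throws on failure
                        return null;
                    }
                }));
            }
            List<Throwable> errors = new ArrayList<Throwable>();
            for (Future<Void> f : futures) {
                try {
                    f.get(); // rethrows the worker's exception, wrapped
                } catch (ExecutionException e) {
                    errors.add(e.getCause());
                }
            }
            return errors;
        } finally {
            pool.shutdown();
        }
    }
}
```

The caller still waits for everything to finish, but the batches go out in parallel and no failure is lost in a log file. Whether to retry or abort on a non-empty list is up to your program.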
Apparently there is a method in the concurrent class that you can
override to handle errors differently, though I have not seen how to
write code so that your program would know an error occurred. I filed
an issue with a patch to solve this, but some of the developers have
come up with an idea that might be better. None of the ideas have been
committed to the project.
https://issues.apache.org/jira/browse/SOLR-3284
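One way an overridden error method might report back to the main program is to push each error into a thread-safe collection that the program checks later. This is an untested idea, not something from the issue above. ErrorRecorder is a made-up helper, and the wiring shown in the comment assumes the overridable method is handleError(Throwable) and that blockUntilFinished() is called before checking.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.ConcurrentLinkedQueue;

/**
 * Hypothetical helper: collects errors reported from background threads
 * so the main thread can inspect them later.
 *
 * Assumed wiring (not compiled here, requires SolrJ):
 *
 *   final ErrorRecorder recorder = new ErrorRecorder();
 *   SolrServer server = new ConcurrentUpdateSolrServer(url, queueSize, threads) {
 *       @Override
 *       public void handleError(Throwable ex) {
 *           recorder.record(ex);
 *       }
 *   };
 *   // ... add documents, then call blockUntilFinished() before checking
 *   // recorder.hasErrors()
 */
final class ErrorRecorder {
    private final ConcurrentLinkedQueue<Throwable> errors =
            new ConcurrentLinkedQueue<Throwable>();

    /** Safe to call from any thread. */
    void record(Throwable ex) {
        errors.add(ex);
    }

    boolean hasErrors() {
        return !errors.isEmpty();
    }

    /** Removes and returns everything recorded so far. */
    List<Throwable> drain() {
        List<Throwable> out = new ArrayList<Throwable>();
        Throwable t;
        while ((t = errors.poll()) != null) {
            out.add(t);
        }
        return out;
    }
}
```

This tells you that something failed, not which documents failed.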
Just an FYI, the streaming class was renamed to
ConcurrentUpdateSolrServer in Solr 4.0 Alpha. Both class names are
available in 3.6.x.
Thanks,
Shawn