Lan,

I assume a particular server can freeze under such a bulk load, but the
overall message doesn't seem entirely correct to me. Solr has a lot of
mechanisms to survive such cases.
Bulk indexing is absolutely the right approach (if you submit a single
request with a long iterator of SolrInputDocuments). That indexing thread
occupies a single CPU core, keeping the others ready for searches, and uses
up to ramBufferSizeMB of heap. Once that limit is exceeded, a new segment is
flushed to disk, which requires some IO and can impact searchers. (A
misconfigured merge policy can ruin everything, of course.)
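
To make that concrete, here is a minimal SolrJ sketch of the
single-request style. The URL, document count, and field names are
placeholders; the point is that SolrServer.add(Iterator) streams all
documents in one HTTP request:

import java.util.Iterator;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.impl.HttpSolrServer;
import org.apache.solr.common.SolrInputDocument;

public class BulkIndexer {
    public static void main(String[] args) throws Exception {
        SolrServer solr = new HttpSolrServer("http://localhost:8983/solr");

        // Lazy iterator: documents are built on demand, so the client
        // never holds the whole batch in memory at once.
        Iterator<SolrInputDocument> docs = new Iterator<SolrInputDocument>() {
            private int i = 0;
            public boolean hasNext() { return i < 20000; }
            public SolrInputDocument next() {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", "doc-" + i);
                doc.addField("text", "document number " + i);
                i++;
                return doc;
            }
            public void remove() { throw new UnsupportedOperationException(); }
        };

        solr.add(docs);   // single request, single indexing thread on the server
        solr.shutdown();
    }
}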
Commits should be driven by business considerations, not performance ones. A
commit leads to opening a new searcher and warming it; these actions can be
memory- and CPU-expensive (and are almost entirely single-threaded).
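
In SolrJ terms that means committing explicitly only at the
business-meaningful point, e.g. via a small helper like this (the class and
method names are hypothetical; it reuses a SolrServer handle like the one
above):

import java.io.IOException;

import org.apache.solr.client.solrj.SolrServer;
import org.apache.solr.client.solrj.SolrServerException;

public class CommitPolicy {
    // Call this only when new documents must become visible to searchers.
    static void publish(SolrServer solr) throws SolrServerException, IOException {
        // Opens a new searcher and runs its warming queries:
        // memory- and CPU-expensive, and mostly single-threaded.
        solr.commit();
    }
}

The server-side alternative is <autoCommit> in solrconfig.xml, which is what
the autocommit in the experiment below refers to.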
I did some experiments with a 40M-document index on a desktop box.
Constantly adding 1K docs/sec, with autocommit firing more than once per
minute, had no significant impact on search latency.
Generally, yes: a master-slave scheme gives more performance, for sure.

On Sat, Jul 28, 2012 at 4:01 AM, Lan <dung....@gmail.com> wrote:

> I assume you're indexing on the same server that is used to execute search
> queries. Adding 20K documents in bulk could cause the Solr Server to 'stop
> the world' where the server would stop responding to queries.
>
> My suggestion is
> - Setup master/slave to insulate your clients from 'stop the world' events
> during indexing.
> - Update in batches with a commit at the end of the batch.



-- 
Sincerely yours
Mikhail Khludnev
Tech Lead
Grid Dynamics

<http://www.griddynamics.com>
 <mkhlud...@griddynamics.com>
