Re: Understanding solr commit

Rahul Ramesh Mon, 25 Jan 2016 03:29:28 -0800

Thanks for your replies.

A bit more detail about our setup.
The index size is close to 80 GB, spread across 30 collections. The main
memory available is around 32 GB, so we are always short of memory.
Unfortunately, we cannot add more memory because the server motherboard
doesn't support it.


We tried Solr's auto commit feature. However, we sometimes got Java OOM
exceptions, and when I started digging into it, somebody suggested that I
was not committing the collections often enough. So we started committing
the collections explicitly.

Please let me know if our approach is not correct.
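
For reference, here is a minimal sketch of how the automatic commit
intervals could be set instead of committing from an external script. It
only builds the JSON body for Solr's Config API (a POST to
/solr/<collection>/config). The 3-minute soft commit interval comes from
this thread; the hard commit interval and the function name are
assumptions for illustration, not settings we have tested.

```python
import json


def auto_commit_payload(hard_commit_ms, soft_commit_ms):
    """Build a Config API 'set-property' body for the commit intervals.

    hard_commit_ms: autoCommit interval (flushes to disk, no new searcher).
    soft_commit_ms: autoSoftCommit interval (makes documents visible).
    """
    body = {
        "set-property": {
            "updateHandler.autoCommit.maxTime": hard_commit_ms,
            "updateHandler.autoCommit.openSearcher": False,
            "updateHandler.autoSoftCommit.maxTime": soft_commit_ms,
        }
    }
    return json.dumps(body, sort_keys=True)


# Assumed values: hard commit every 60 s (illustrative only),
# soft commit every 3 minutes (the visibility interval from this thread).
payload = auto_commit_payload(60000, 180000)
```

The same three properties can also be set in the updateHandler section of
solrconfig.xml; whether either route helps with the OOM errors is
something we would still need to verify.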

*Emir*,
We commit to each collection only once. We have nodes N1, N2 and N3,
and for a collection Coll1, the commit goes to N1/coll1 every 3 minutes.
We are not doing it for every node. We will remove _shard<>_replica<> and
use only the collection name to commit.
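
A rough sketch of that change, assuming our core names follow the
<collection>_shard<N>_replica<N> pattern. The host name, core name and
function names below are hypothetical; the only Solr-specific part is the
update handler URL with commit=true.

```python
import re

# Assumption: core names end in _shard<N>_replica<N>,
# e.g. "coll1_shard1_replica2".
_CORE_SUFFIX = re.compile(r"_shard\d+_replica\d+$")


def collection_name(core_or_collection):
    """Strip a trailing _shard<N>_replica<N> suffix, if there is one."""
    return _CORE_SUFFIX.sub("", core_or_collection)


def commit_url(base_url, core_or_collection):
    """Build the update URL that commits against the collection name."""
    name = collection_name(core_or_collection)
    return "{}/{}/update?commit=true".format(base_url.rstrip("/"), name)


# Hypothetical host and core name, for illustration only.
url = commit_url("http://n1:8983/solr", "coll1_shard1_replica2")
```

The external script would then send one request to this URL per
collection, rather than one per core.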

*Alessandro*,
We are using SolrCloud with a replication factor of 2 and the number of
shards set to either 2 or 3.

Thanks,
Rahul

On Mon, Jan 25, 2016 at 4:43 PM, Alessandro Benedetti <abenede...@apache.org> wrote:

> Let me answer inline:
>
> On 25 January 2016 at 11:02, Rahul Ramesh <rr.ii...@gmail.com> wrote:
>
> > We are facing an issue that we are finding difficult to debug. We
> > wanted to understand how Solr commit works.
> > Some background on our setup:
> > We have a 3-node Solr cluster running version 5.3.1. It's an index-heavy
> > use case. At peak load, we index 400-500 documents/second.
> > We also want these documents to be visible as quickly as possible, hence
> > we run an external script which commits every 3 minutes.
> >
>
> This is odd. Why not use auto soft commit if you want visibility
> every 3 minutes?
> Is there any particular reason you trigger the commit from the client?
>
> >
> > Consider the three nodes as N1, N2, N3. Commit is a synchronous
> > operation, so we do not get control back until the commit operation is
> > complete.
> >
> > Consider the following scenario. It looks like a basic scenario in a
> > distributed system :-) but we just wanted to eliminate this possibility.
> >
> > Step 1: At time T1, a commit happens on node N1.
> > Step 2: At the same time T1, we search for all the documents inserted in
> > node N2.
> >
> > My questions are:
> >
> > 1. Is commit an atomic operation? I mean, will commit happen on all the
> > nodes at the same time?
> >
> Which Solr architecture are you using? Are you using SolrCloud?
>
> > 2. Can we say that the search result will always contain either the
> > documents from before the commit or those from after it? Or can it
> > happen that we get new documents from N1 and N2 but old documents
> > (i.e., from before the commit) from N3?
> >
> With a manual cluster it could conceivably happen.
> In SolrCloud it should not, but I should double-check the code!
>
> >
> > Thank you,
> > Rahul
> >
