Thanks for the suggestions. After reading that document I feel even more confused, though, because I always thought that hard commits should be less frequent than soft commits.
Is there any way to configure autoCommit, softCommit values on a per request basis? The majority of the time we have a small flow of updates coming in and we would like to see them ASAP. However, we occasionally need to do some bulk indexing (once a week or less), and the need to see those updates right away isn't as critical.

I would say 95% of the time we are in "Index-Light Query-Light/Heavy" mode and the other 5% is "Index-Heavy Query-Light/Heavy" mode.

Thanks

On Wed, Jan 22, 2014 at 5:33 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> When you're doing hard commits, is it with openSearcher = true or
> false? It should probably be false...
>
> Here's a rundown of the soft/hard commit consequences:
>
> http://searchhub.org/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>
> I suspect (but, of course, can't prove) that you're over-committing
> and hitting segment merges without meaning to...
>
> FWIW,
> Erick
>
> On Wed, Jan 22, 2014 at 1:46 PM, Software Dev <static.void....@gmail.com> wrote:
> > A suggestion would be to hard commit much less often, i.e. every 10
> > minutes, and see if there is a change.
> >
> > - Will try this
> >
> > How much system RAM ? JVM Heap ? Enough space in RAM for system disk cache ?
> >
> > - We have 18G of RAM, 12 dedicated to Solr, but as of right now the total
> > index size is only 5GB
> >
> > Ah, and what about network IO ? Could that be a limiting factor ?
> >
> > - What is the size of your documents ? A few KB, MB, ... ?
> >
> > Under 1MB
> >
> > - Again, total index size is only 5GB so I don't know if this would be a
> > problem
> >
> > On Wed, Jan 22, 2014 at 12:26 AM, Andre Bois-Crettez
> > <andre.b...@kelkoo.com> wrote:
> >
> >> 1 node having more load should be the leader (because of the extra work
> >> of receiving and distributing updates, but my experiences show only a
> >> bit more CPU usage, and no difference in disk IO).
> >>
> >> A suggestion would be to hard commit much less often, i.e. every 10
> >> minutes, and see if there is a change.
> >> How much system RAM ? JVM Heap ? Enough space in RAM for system disk cache ?
> >> What is the size of your documents ? A few KB, MB, ... ?
> >> Ah, and what about network IO ? Could that be a limiting factor ?
> >>
> >> André
> >>
> >> On 2014-01-21 23:40, Software Dev wrote:
> >>
> >>> Any other suggestions?
> >>>
> >>> On Mon, Jan 20, 2014 at 2:49 PM, Software Dev <static.void....@gmail.com> wrote:
> >>>
> >>>> 4.6.0
> >>>>
> >>>> On Mon, Jan 20, 2014 at 2:47 PM, Mark Miller <markrmil...@gmail.com> wrote:
> >>>>
> >>>>> What version are you running?
> >>>>>
> >>>>> - Mark
> >>>>>
> >>>>> On Jan 20, 2014, at 5:43 PM, Software Dev <static.void....@gmail.com> wrote:
> >>>>>
> >>>>>> We also noticed that disk IO shoots up to 100% on 1 of the nodes. Do all
> >>>>>> updates get sent to one machine or something?
> >>>>>>
> >>>>>> On Mon, Jan 20, 2014 at 2:42 PM, Software Dev <static.void....@gmail.com> wrote:
> >>>>>>
> >>>>>>> We have a soft commit every 5 seconds and a hard commit every 30. As
> >>>>>>> far as docs/second, I would guess around 200/sec, which doesn't seem that
> >>>>>>> high.
> >>>>>>>
> >>>>>>> On Mon, Jan 20, 2014 at 2:26 PM, Erick Erickson <erickerick...@gmail.com> wrote:
> >>>>>>>
> >>>>>>>> Questions: How often do you commit your updates? What is your
> >>>>>>>> indexing rate in docs/second?
> >>>>>>>>
> >>>>>>>> In a SolrCloud setup, you should be using a CloudSolrServer. If the
> >>>>>>>> server is having trouble keeping up with updates, switching to CUSS
> >>>>>>>> probably wouldn't help.
> >>>>>>>>
> >>>>>>>> So I suspect there's something not optimal about your setup that's
> >>>>>>>> the culprit.
> >>>>>>>>
> >>>>>>>> Best,
> >>>>>>>> Erick
> >>>>>>>>
> >>>>>>>> On Mon, Jan 20, 2014 at 4:00 PM, Software Dev <static.void....@gmail.com> wrote:
> >>>>>>>>
> >>>>>>>>> We are testing our shiny new Solr Cloud architecture but we are
> >>>>>>>>> experiencing some issues when doing bulk indexing.
> >>>>>>>>>
> >>>>>>>>> We have 5 solr cloud machines running and 3 indexing machines (separate
> >>>>>>>>> from the cloud servers). The indexing machines pull ids off a queue,
> >>>>>>>>> then they index and ship over a document via a CloudSolrServer. It appears
> >>>>>>>>> that the indexers are too fast because the load (particularly disk io) on
> >>>>>>>>> the solr cloud machines spikes through the roof, making the entire cluster
> >>>>>>>>> unusable. It's kind of odd because the total index size is not even
> >>>>>>>>> large, i.e. < 10GB. Are there any optimizations/enhancements I could try to
> >>>>>>>>> help alleviate these problems?
> >>>>>>>>>
> >>>>>>>>> I should note that for the above collection we only have 1 shard that's
> >>>>>>>>> replicated across all machines, so all machines have the full index.
> >>>>>>>>>
> >>>>>>>>> Would we benefit from switching to a ConcurrentUpdateSolrServer where all
> >>>>>>>>> updates get sent to 1 machine and 1 machine only? We could then remove this
> >>>>>>>>> machine from the cluster that handles user requests.
> >>>>>>>>>
> >>>>>>>>> Thanks for any input.
> >>> --
> >>> André Bois-Crettez
> >>>
> >>> Software Architect
> >>> Search Developer
> >>> http://www.kelkoo.com/
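
For reference, here is a minimal sketch of how the commit intervals mentioned in this thread (soft commit every 5 seconds, hard commit every 30 seconds, plus the suggested openSearcher=false) might look in solrconfig.xml. It assumes the stock DirectUpdateHandler2 update handler and is not copied from the actual configuration of the cluster discussed here; the other children of the update handler (such as the update log) are left out.

```xml
<!-- Sketch only: values taken from the thread, layout assumed from a stock solrconfig.xml -->
<updateHandler class="solr.DirectUpdateHandler2">
  <!-- Hard commit: flushes the index to disk every 30 s.
       openSearcher=false means this commit does not open a new searcher. -->
  <autoCommit>
    <maxTime>30000</maxTime>
    <openSearcher>false</openSearcher>
  </autoCommit>

  <!-- Soft commit: makes new documents visible to searches every 5 s -->
  <autoSoftCommit>
    <maxTime>5000</maxTime>
  </autoSoftCommit>
</updateHandler>
```

Both maxTime values are in milliseconds. With this layout, visibility of new documents is driven by the soft commit interval, while the hard commit interval only controls how often the index is flushed to disk.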