Can't get any failures to happen on my end so I really haven't a clue.

Best,
Erick

On Thu, Jun 4, 2015 at 3:17 AM, Modassar Ather <modather1...@gmail.com> wrote:
> Hi,
>
> Please provide your inputs on optimize and commit running in the background.
> Your suggestions will be really helpful.
>
> Thanks,
> Modassar
>
> On Tue, Jun 2, 2015 at 6:05 PM, Modassar Ather <modather1...@gmail.com>
> wrote:
>
>> Erick! I could not find any underlying setting of 10 minutes.
>> It is not only optimize; commit is also behaving in the same fashion
>> and is taking less time than it usually did.
>> As per my observation, both are running in the background.
>>
>> On Fri, May 29, 2015 at 7:21 PM, Erick Erickson <erickerick...@gmail.com>
>> wrote:
>>
>>> I'm not talking about you setting a timeout, but the underlying
>>> connection timing out...
>>>
>>> The "10 minutes then the indexer exits" comment points in that direction.
>>>
>>> Best,
>>> Erick
>>>
>>> On Thu, May 28, 2015 at 11:43 PM, Modassar Ather <modather1...@gmail.com>
>>> wrote:
>>> > I have not added any timeout in the indexer except zk client time out
>>> > which is 30 seconds. I am simply calling client.close() at the end of
>>> > indexing. The same code was not running in background for optimize with
>>> > solr-4.10.3 and org.apache.solr.client.solrj.impl.CloudSolrServer.
>>> >
>>> > On Fri, May 29, 2015 at 11:13 AM, Erick Erickson <erickerick...@gmail.com>
>>> > wrote:
>>> >
>>> >> Are you timing out on the client request? The theory here is that it's
>>> >> still a synchronous call, but you're just timing out at the client
>>> >> level. At that point, the optimize is still running; it's just that the
>>> >> connection has been dropped....
>>> >>
>>> >> Shot in the dark.
>>> >> Erick
>>> >>
>>> >> On Thu, May 28, 2015 at 10:31 PM, Modassar Ather <modather1...@gmail.com>
>>> >> wrote:
>>> >> > I could not notice it, but in my past experience commit used to take
>>> >> > around 2 minutes and is now taking around 8 seconds. I think this is
>>> >> > also running in the background.
>>> >> >
>>> >> > On Fri, May 29, 2015 at 10:52 AM, Modassar Ather <modather1...@gmail.com>
>>> >> > wrote:
>>> >> >
>>> >> >> The indexer takes almost 2 hours to optimize. It has a multi-threaded
>>> >> >> add of batches of documents to
>>> >> >> org.apache.solr.client.solrj.impl.CloudSolrClient.
>>> >> >> Once all the documents are indexed it invokes commit and optimize. I
>>> >> >> have seen that the optimize goes into the background after 10 minutes
>>> >> >> and the indexer exits.
>>> >> >> I am not sure why it hangs on the indexer for these 10 minutes. I have
>>> >> >> seen this behavior in multiple iterations of indexing the same data.
>>> >> >>
>>> >> >> There is nothing significant I found in the log which I can share. I
>>> >> >> can see the following in the log:
>>> >> >> org.apache.solr.update.DirectUpdateHandler2; start
>>> >> >> commit{,optimize=true,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
>>> >> >>
>>> >> >> On Wed, May 27, 2015 at 10:59 PM, Erick Erickson <erickerick...@gmail.com>
>>> >> >> wrote:
>>> >> >>
>>> >> >>> All strange of course. What do your Solr logs show when this happens?
>>> >> >>> And how reproducible is this?
>>> >> >>>
>>> >> >>> Best,
>>> >> >>> Erick
>>> >> >>>
>>> >> >>> On Wed, May 27, 2015 at 4:00 AM, Upayavira <u...@odoko.co.uk> wrote:
>>> >> >>> > In this case, optimising makes sense: once the index is generated,
>>> >> >>> > you are not updating it.
>>> >> >>> >
>>> >> >>> > Upayavira
>>> >> >>> >
>>> >> >>> > On Wed, May 27, 2015, at 06:14 AM, Modassar Ather wrote:
>>> >> >>> >> Our index has almost 100M documents running on SolrCloud of 5 shards
>>> >> >>> >> and each shard has an index size of about 170+GB (for the record, we
>>> >> >>> >> are not using stored fields - our documents are pretty large). We
>>> >> >>> >> perform a full indexing every weekend and during the week there are
>>> >> >>> >> no updates made to the index. Most of the queries that we run are
>>> >> >>> >> pretty complex with hundreds of terms using PhraseQuery,
>>> >> >>> >> BooleanQuery, SpanQuery, Wildcards, boosts etc. and take many
>>> >> >>> >> minutes to execute. A difference of 10-20% is also a big advantage
>>> >> >>> >> for us.
>>> >> >>> >>
>>> >> >>> >> We have been optimizing the index after indexing for years and it
>>> >> >>> >> has worked well for us. Every once in a while, we upgrade Solr to
>>> >> >>> >> the latest version and try without optimizing so that we can save
>>> >> >>> >> the many hours it takes to optimize such a huge index, but we find
>>> >> >>> >> that the optimized index works well for us.
>>> >> >>> >>
>>> >> >>> >> Erick, I was indexing the documents today and saw the optimize
>>> >> >>> >> happening in the background.
>>> >> >>> >>
>>> >> >>> >> On Tue, May 26, 2015 at 9:12 PM, Erick Erickson <erickerick...@gmail.com>
>>> >> >>> >> wrote:
>>> >> >>> >>
>>> >> >>> >> > No results yet. I finished the test harness last night (not really
>>> >> >>> >> > a unit test, a stand-alone program that endlessly adds stuff and
>>> >> >>> >> > tests that every commit returns the correct number of docs).
>>> >> >>> >> >
>>> >> >>> >> > 8,000 cycles later there aren't any problems reported.
>>> >> >>> >> >
>>> >> >>> >> > Siiigggggh.
>>> >> >>> >> >
>>> >> >>> >> >
>>> >> >>> >> > On Tue, May 26, 2015 at 1:51 AM, Modassar Ather <modather1...@gmail.com>
>>> >> >>> >> > wrote:
>>> >> >>> >> > > Hi,
>>> >> >>> >> > >
>>> >> >>> >> > > Erick, you mentioned a unit test to test the optimize running in
>>> >> >>> >> > > the background. Kindly share your findings, if any.
>>> >> >>> >> > >
>>> >> >>> >> > > Thanks,
>>> >> >>> >> > > Modassar
>>> >> >>> >> > >
>>> >> >>> >> > > On Mon, May 25, 2015 at 11:47 AM, Modassar Ather <modather1...@gmail.com>
>>> >> >>> >> > > wrote:
>>> >> >>> >> > >
>>> >> >>> >> > >> Thanks everybody for your replies.
>>> >> >>> >> > >>
>>> >> >>> >> > >> I have noticed the optimization running in the background every
>>> >> >>> >> > >> time I indexed. This is a 5 node cluster with solr-5.1.0 and uses
>>> >> >>> >> > >> the CloudSolrClient. Kindly share your findings on this issue.
>>> >> >>> >> > >>
>>> >> >>> >> > >> Our index has almost 100M documents running on SolrCloud. We have
>>> >> >>> >> > >> been optimizing the index after indexing for years and it has
>>> >> >>> >> > >> worked well for us.
>>> >> >>> >> > >>
>>> >> >>> >> > >> Thanks,
>>> >> >>> >> > >> Modassar
>>> >> >>> >> > >>
>>> >> >>> >> > >> On Fri, May 22, 2015 at 11:55 PM, Erick Erickson <erickerick...@gmail.com>
>>> >> >>> >> > >> wrote:
>>> >> >>> >> > >>
>>> >> >>> >> > >>> Actually, I've recently seen very similar behavior in Solr 4.10.3,
>>> >> >>> >> > >>> but involving hard commits with openSearcher=true, see:
>>> >> >>> >> > >>> https://issues.apache.org/jira/browse/SOLR-7572. Of course I can't
>>> >> >>> >> > >>> reproduce this at will, siigggghhhh.
>>> >> >>> >> > >>>
>>> >> >>> >> > >>> A unit test should be very simple to write though, maybe I can get
>>> >> >>> >> > >>> to it today.
>>> >> >>> >> > >>>
>>> >> >>> >> > >>> Erick
>>> >> >>> >> > >>>
>>> >> >>> >> > >>>
>>> >> >>> >> > >>>
>>> >> >>> >> > >>> On Fri, May 22, 2015 at 8:27 AM, Upayavira <u...@odoko.co.uk> wrote:
>>> >> >>> >> > >>> >
>>> >> >>> >> > >>> >
>>> >> >>> >> > >>> > On Fri, May 22, 2015, at 03:55 PM, Shawn Heisey wrote:
>>> >> >>> >> > >>> >> On 5/21/2015 6:21 AM, Modassar Ather wrote:
>>> >> >>> >> > >>> >> > I am using Solr-5.1.0. I have an indexer class which invokes
>>> >> >>> >> > >>> >> > cloudSolrClient.optimize(true, true, 1). My indexer exits after
>>> >> >>> >> > >>> >> > the invocation of optimize and the optimization keeps on running
>>> >> >>> >> > >>> >> > in the background.
>>> >> >>> >> > >>> >> > Kindly let me know if it is per design and how I can make my
>>> >> >>> >> > >>> >> > indexer wait until the optimization is over. Is there a
>>> >> >>> >> > >>> >> > configuration/parameter I need to set for the same?
>>> >> >>> >> > >>> >> >
>>> >> >>> >> > >>> >> > Please note that the same indexer with
>>> >> >>> >> > >>> >> > cloudSolrServer.optimize(true, true, 1) on Solr-4.10 used to wait
>>> >> >>> >> > >>> >> > till the optimize was over before exiting.
>>> >> >>> >> > >>> >>
>>> >> >>> >> > >>> >> This is very odd, because I could not get HttpSolrServer to
>>> >> >>> >> > >>> >> optimize in the background, even when that was what I wanted.
>>> >> >>> >> > >>> >>
>>> >> >>> >> > >>> >> I wondered if maybe the Cloud object behaves differently with
>>> >> >>> >> > >>> >> regard to blocking until an optimize is finished ... except that
>>> >> >>> >> > >>> >> there is no code for optimizing in CloudSolrClient at all ... so I
>>> >> >>> >> > >>> >> don't know where the different behavior would actually be
>>> >> >>> >> > >>> >> happening.
>>> >> >>> >> > >>> >
>>> >> >>> >> > >>> > A more important question is, why are you optimising? Generally it
>>> >> >>> >> > >>> > isn't recommended anymore as it reduces the natural distribution of
>>> >> >>> >> > >>> > documents amongst segments and makes future merges more costly.
>>> >> >>> >> > >>> >
>>> >> >>> >> > >>> > Upayavira
>>> >> >>> >> > >>>
>>> >> >>> >> > >>
>>> >> >>> >> > >>
>>> >> >>> >> >
>>> >> >>>
>>> >> >>
>>> >> >>
>>> >>
>>>
>>
>>
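
For reference, here is a minimal sketch of the theory raised in this thread: the call is still synchronous, but the client gives up waiting while the work carries on at the other end. It does not use SolrJ or talk to Solr at all. The class ClientTimeoutDemo, the simulate method and the millisecond values are hypothetical stand-ins, with a sleeping task playing the role of the optimize and a bounded Future.get playing the role of the timed-out connection. Whether this is what actually happens in Solr 5.1.0 is not established by the thread.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

public class ClientTimeoutDemo {

    /** Outcome of one simulated request (hypothetical type, for illustration only). */
    public static final class Outcome {
        public final boolean clientTimedOut;
        public final boolean serverFinished;

        Outcome(boolean clientTimedOut, boolean serverFinished) {
            this.clientTimedOut = clientTimedOut;
            this.serverFinished = serverFinished;
        }
    }

    /**
     * Runs a long "server side" task and waits for it with a bounded
     * "client side" timeout. The task is NOT cancelled when the client
     * stops waiting, mirroring a dropped connection.
     */
    public static Outcome simulate(long serverWorkMillis, long clientTimeoutMillis)
            throws Exception {
        ExecutorService server = Executors.newSingleThreadExecutor();
        CountDownLatch serverDone = new CountDownLatch(1);
        try {
            // Stand-in for the optimize: it just sleeps, then records completion.
            Future<?> request = server.submit(() -> {
                try {
                    Thread.sleep(serverWorkMillis);
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                    return;
                }
                serverDone.countDown();
            });

            boolean timedOut = false;
            try {
                // Stand-in for the blocking client call with a read timeout.
                request.get(clientTimeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // The client gives up here; the task keeps running.
                timedOut = true;
            }

            // Check whether the work completed even though the client may have left.
            boolean finished =
                    serverDone.await(serverWorkMillis + 2000, TimeUnit.MILLISECONDS);
            return new Outcome(timedOut, finished);
        } finally {
            server.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Outcome o = simulate(500, 100);
        System.out.println("client timed out: " + o.clientTimedOut
                + ", server finished: " + o.serverFinished);
    }
}
```

From the client's point of view, a call that times out this way looks like an operation that "went into the background": the caller returns early, yet the work still completes. Comparing the time at which the indexer returns with the time the optimize finishes in the Solr log would be one way to check whether the same thing is happening here.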
