I'm not talking about you setting a timeout, but the underlying
connection timing out...

The "10 minutes then the indexer exits" comment points in that direction.
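
To make the theory concrete, here is a minimal, self-contained Java sketch.
It is not SolrJ code: the class TimeoutSketch, the method run, and the
millisecond values are made up for illustration, and a single-thread
executor stands in for the server. It only shows the general pattern: a
caller that stops waiting on a synchronous request does not stop the work
it started.

```java
import java.util.concurrent.CountDownLatch;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Future;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

// Hypothetical illustration only; none of these names come from Solr or SolrJ.
class TimeoutSketch {

    /** What the caller observed, and whether the work finished anyway. */
    static final class Outcome {
        final boolean callerTimedOut;
        final boolean workCompleted;

        Outcome(boolean callerTimedOut, boolean workCompleted) {
            this.callerTimedOut = callerTimedOut;
            this.workCompleted = workCompleted;
        }
    }

    static Outcome run(long workMillis, long callerTimeoutMillis) throws Exception {
        // The executor plays the role of the server doing the long operation.
        ExecutorService server = Executors.newSingleThreadExecutor();
        CountDownLatch done = new CountDownLatch(1);
        try {
            Future<?> request = server.submit(() -> {
                try {
                    Thread.sleep(workMillis); // stand-in for a long optimize
                    done.countDown();
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                }
            });

            boolean timedOut = false;
            try {
                // The caller blocks, but only up to its own timeout.
                request.get(callerTimeoutMillis, TimeUnit.MILLISECONDS);
            } catch (TimeoutException e) {
                // The caller gives up. The task is deliberately not cancelled,
                // just as a dropped connection does not cancel server-side work.
                timedOut = true;
            }

            // Check whether the work still runs to completion afterwards.
            boolean completed = done.await(workMillis + 2000, TimeUnit.MILLISECONDS);
            return new Outcome(timedOut, completed);
        } finally {
            server.shutdown();
        }
    }

    public static void main(String[] args) throws Exception {
        Outcome o = run(300, 50);
        System.out.println("caller timed out: " + o.callerTimedOut
                + ", work completed: " + o.workCompleted);
    }
}
```

If the real cause is a connection-level timeout, the indexer would behave
like the caller here: it returns after a fixed wait while the optimize
carries on in the cluster.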

Best,
Erick

On Thu, May 28, 2015 at 11:43 PM, Modassar Ather <modather1...@gmail.com> wrote:
> I have not added any timeout in the indexer except the ZK client timeout, which
> is 30 seconds. I am simply calling client.close() at the end of indexing.
> With solr-4.10.3 and org.apache.solr.client.solrj.impl.CloudSolrServer, the
> same code did not run the optimize in the background.
>
> On Fri, May 29, 2015 at 11:13 AM, Erick Erickson <erickerick...@gmail.com>
> wrote:
>
>> Are you timing out on the client request? The theory here is that it's
>> still a synchronous call, but you're just timing out at the client
>> level. At that point, the optimize is still running; it's just that the
>> connection has been dropped...
>>
>> Shot in the dark.
>> Erick
>>
>> On Thu, May 28, 2015 at 10:31 PM, Modassar Ather <modather1...@gmail.com>
>> wrote:
>> > I did not notice it before, but the commit, which in my past experience
>> > used to take around 2 minutes, is now taking around 8 seconds. I think this
>> > is also running in the background.
>> >
>> > On Fri, May 29, 2015 at 10:52 AM, Modassar Ather <modather1...@gmail.com>
>> > wrote:
>> >
>> >> The indexer takes almost 2 hours to optimize. It adds batches of documents
>> >> to org.apache.solr.client.solrj.impl.CloudSolrClient from multiple threads.
>> >> Once all the documents are indexed, it invokes commit and optimize. I have
>> >> seen that the optimize goes into the background after 10 minutes and the
>> >> indexer exits.
>> >> I am not sure why the indexer hangs for these 10 minutes. I have seen this
>> >> behavior in multiple iterations of indexing the same data.
>> >>
>> >> I found nothing significant in the log that I can share. I can see the
>> >> following in the log:
>> >> org.apache.solr.update.DirectUpdateHandler2; start
>> >> commit{,optimize=true,openSearcher=true,waitSearcher=true,expungeDeletes=false,softCommit=false,prepareCommit=false}
>> >>
>> >> On Wed, May 27, 2015 at 10:59 PM, Erick Erickson <erickerick...@gmail.com>
>> >> wrote:
>> >>
>> >>> All strange of course. What do your Solr logs show when this happens?
>> >>> And how reproducible is this?
>> >>>
>> >>> Best,
>> >>> Erick
>> >>>
>> >>> On Wed, May 27, 2015 at 4:00 AM, Upayavira <u...@odoko.co.uk> wrote:
>> >>> > In this case, optimising makes sense: once the index is generated, you
>> >>> > are not updating it.
>> >>> >
>> >>> > Upayavira
>> >>> >
>> >>> > On Wed, May 27, 2015, at 06:14 AM, Modassar Ather wrote:
>> >>> >> Our index has almost 100M documents running on a SolrCloud of 5 shards,
>> >>> >> and each shard has an index size of about 170+GB (for the record, we are
>> >>> >> not using stored fields - our documents are pretty large). We perform a
>> >>> >> full indexing every weekend, and during the week there are no updates
>> >>> >> made to the index. Most of the queries that we run are pretty complex,
>> >>> >> with hundreds of terms using PhraseQuery, BooleanQuery, SpanQuery,
>> >>> >> Wildcards, boosts etc., and take many minutes to execute. A difference
>> >>> >> of 10-20% is also a big advantage for us.
>> >>> >>
>> >>> >> We have been optimizing the index after indexing for years and it has
>> >>> >> worked well for us. Every once in a while, we upgrade Solr to the latest
>> >>> >> version and try without optimizing so that we can save the many hours it
>> >>> >> takes to optimize such a huge index, but we find that the optimized
>> >>> >> index works well for us.
>> >>> >>
>> >>> >> Erick, I was indexing the documents today and saw the optimize
>> >>> >> happening in the background.
>> >>> >>
>> >>> >> On Tue, May 26, 2015 at 9:12 PM, Erick Erickson <erickerick...@gmail.com>
>> >>> >> wrote:
>> >>> >>
>> >>> >> > No results yet. I finished the test harness last night (not really a
>> >>> >> > unit test, a stand-alone program that endlessly adds stuff and tests
>> >>> >> > that every commit returns the correct number of docs).
>> >>> >> >
>> >>> >> > 8,000 cycles later there aren't any problems reported.
>> >>> >> >
>> >>> >> > Siiigggggh.
>> >>> >> >
>> >>> >> >
>> >>> >> > On Tue, May 26, 2015 at 1:51 AM, Modassar Ather <modather1...@gmail.com>
>> >>> >> > wrote:
>> >>> >> > > Hi,
>> >>> >> > >
>> >>> >> > > Erick, you mentioned a unit test to check whether the optimize runs
>> >>> >> > > in the background. Kindly share your findings, if any.
>> >>> >> > >
>> >>> >> > > Thanks,
>> >>> >> > > Modassar
>> >>> >> > >
>> >>> >> > > On Mon, May 25, 2015 at 11:47 AM, Modassar Ather <modather1...@gmail.com>
>> >>> >> > > wrote:
>> >>> >> > >
>> >>> >> > >> Thanks everybody for your replies.
>> >>> >> > >>
>> >>> >> > >> I have noticed the optimization running in the background every time I
>> >>> >> > >> indexed. This is a 5 node cluster with solr-5.1.0 and uses the
>> >>> >> > >> CloudSolrClient. Kindly share your findings on this issue.
>> >>> >> > >>
>> >>> >> > >> Our index has almost 100M documents running on SolrCloud. We have been
>> >>> >> > >> optimizing the index after indexing for years and it has worked well
>> >>> >> > >> for us.
>> >>> >> > >>
>> >>> >> > >> Thanks,
>> >>> >> > >> Modassar
>> >>> >> > >>
>> >>> >> > >> On Fri, May 22, 2015 at 11:55 PM, Erick Erickson <erickerick...@gmail.com>
>> >>> >> > >> wrote:
>> >>> >> > >>
>> >>> >> > >>> Actually, I've recently seen very similar behavior in Solr 4.10.3, but
>> >>> >> > >>> involving hard commits with openSearcher=true, see:
>> >>> >> > >>> https://issues.apache.org/jira/browse/SOLR-7572. Of course I can't
>> >>> >> > >>> reproduce this at will, siigggghhhh.
>> >>> >> > >>>
>> >>> >> > >>> A unit test should be very simple to write though, maybe I can get to
>> >>> >> > >>> it today.
>> >>> >> > >>>
>> >>> >> > >>> Erick
>> >>> >> > >>>
>> >>> >> > >>>
>> >>> >> > >>>
>> >>> >> > >>> On Fri, May 22, 2015 at 8:27 AM, Upayavira <u...@odoko.co.uk> wrote:
>> >>> >> > >>> >
>> >>> >> > >>> >
>> >>> >> > >>> > On Fri, May 22, 2015, at 03:55 PM, Shawn Heisey wrote:
>> >>> >> > >>> >> On 5/21/2015 6:21 AM, Modassar Ather wrote:
>> >>> >> > >>> >> > I am using Solr-5.1.0. I have an indexer class which invokes
>> >>> >> > >>> >> > cloudSolrClient.optimize(true, true, 1). My indexer exits after
>> >>> >> > >>> >> > the invocation of optimize and the optimization keeps on running
>> >>> >> > >>> >> > in the background.
>> >>> >> > >>> >> > Kindly let me know if this is by design and how I can make my
>> >>> >> > >>> >> > indexer wait until the optimization is over. Is there a
>> >>> >> > >>> >> > configuration/parameter I need to set for this?
>> >>> >> > >>> >> >
>> >>> >> > >>> >> > Please note that the same indexer with
>> >>> >> > >>> >> > cloudSolrServer.optimize(true, true, 1) on Solr-4.10 used to wait
>> >>> >> > >>> >> > till the optimize was over before exiting.
>> >>> >> > >>> >>
>> >>> >> > >>> >> This is very odd, because I could not get HttpSolrServer to optimize
>> >>> >> > >>> >> in the background, even when that was what I wanted.
>> >>> >> > >>> >>
>> >>> >> > >>> >> I wondered if maybe the Cloud object behaves differently with regard
>> >>> >> > >>> >> to blocking until an optimize is finished ... except that there is
>> >>> >> > >>> >> no code for optimizing in CloudSolrClient at all ... so I don't know
>> >>> >> > >>> >> where the different behavior would actually be happening.
>> >>> >> > >>> >
>> >>> >> > >>> > A more important question is, why are you optimising? Generally it
>> >>> >> > >>> > isn't recommended anymore as it reduces the natural distribution of
>> >>> >> > >>> > documents amongst segments and makes future merges more costly.
>> >>> >> > >>> >
>> >>> >> > >>> > Upayavira
>> >>> >> > >>>
>> >>> >> > >>
>> >>> >> > >>
>> >>> >> >
>> >>>
>> >>
>> >>
>>
