Thanks for letting us know, and do let us know if you see the problem again.
Erick

On Tue, Dec 18, 2012 at 7:39 AM, John Nielsen <j...@mcb.dk> wrote:

> I built a Solr version from the solr-4x branch yesterday and so far have
> been unable to replicate the problems I had before.
>
> I am cautiously optimistic that the problem has been resolved. If I run
> into any more problems, I'll let you all know.
>
> --
> Med venlig hilsen / Best regards
>
> John Nielsen
> Programmer
>
> MCB A/S
> Enghaven 15
> DK-7500 Holstebro
>
> Kundeservice: +45 9610 2824
> p...@mcb.dk
> www.mcb.dk
>
> On Fri, Dec 14, 2012 at 7:33 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
>
>> Mark, no issue has been filed. That cluster runs a checkout from around
>> the end of July/beginning of August. I'm in the process of including
>> another cluster in the indexing and removal of documents besides the old
>> production clusters. I'll start writing to that one Tuesday or so. If I
>> notice a discrepancy after some time, I am sure to report it. I doubt
>> I'll find it before 2013, if the problem is still there.
>>
>> -----Original message-----
>>> From: Mark Miller <markrmil...@gmail.com>
>>> Sent: Fri 14-Dec-2012 19:05
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Strange data-loss problem on one of our cores
>>>
>>> Have you filed a JIRA issue for this that I don't remember, Markus?
>>>
>>> We need to make sure this is fixed.
>>>
>>> Any idea around when the trunk version came from? Before or after 4.0?
>>>
>>> - Mark
>>>
>>> On Dec 14, 2012, at 6:36 AM, Markus Jelsma <markus.jel...@openindex.io> wrote:
>>>
>>>> We did not solve it, but reindexing can remedy the problem.
>>>>
>>>> -----Original message-----
>>>>> From: John Nielsen <j...@mcb.dk>
>>>>> Sent: Fri 14-Dec-2012 12:31
>>>>> To: solr-user@lucene.apache.org
>>>>> Subject: Re: Strange data-loss problem on one of our cores
>>>>>
>>>>> How did you solve the problem?
>>>>>
>>>>> On Fri, Dec 14, 2012 at 12:04 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
>>>>>
>>>>>> FYI, we observe the same issue: after some time (days, months) a
>>>>>> cluster running an older trunk version has at least two shards where
>>>>>> the leader and the replica do not contain the same number of records.
>>>>>> No recovery is attempted; it seems it thinks everything is alright.
>>>>>> Also, one core of one of the unsynced shards waits forever loading
>>>>>> /replication?command=details&wt=json, while other cores load it in a
>>>>>> few ms. Both cores of another unsynced shard do not show this problem.
>>>>>>
>>>>>> -----Original message-----
>>>>>>> From: John Nielsen <j...@mcb.dk>
>>>>>>> Sent: Fri 14-Dec-2012 11:50
>>>>>>> To: solr-user@lucene.apache.org
>>>>>>> Subject: Re: Strange data-loss problem on one of our cores
>>>>>>>
>>>>>>> I did a manual commit, and we are still missing docs, so it doesn't
>>>>>>> look like the search race condition you mention.
>>>>>>>
>>>>>>> My boss wasn't happy when I mentioned that I wanted to try out
>>>>>>> unreleased code. I'll get him won over, though, and return with my
>>>>>>> findings. It will probably be some time next week.
>>>>>>>
>>>>>>> Thanks for your help.
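The hard-commit-then-count check being discussed here can be sketched roughly as follows. This is a hedged illustration, not code from the thread: the `/update?commit=true` and `/select` with `distrib=false` endpoints are standard Solr HTTP API, but the helper names (`mismatches`, `fetch_counts`) and the host/core values are taken from this thread purely as examples.

```python
# Sketch (assumed, not from the thread): hard-commit each replica, then compare
# per-core doc counts with distrib=false so each core reports only its own index.
import json
from urllib.request import urlopen

def mismatches(counts):
    """Given {core: {replica: numDocs}}, return cores whose replicas disagree."""
    return sorted(core for core, by_replica in counts.items()
                  if len(set(by_replica.values())) > 1)

def fetch_counts(replica_bases, core):
    """Hard-commit, then read the local-only doc count from each replica."""
    by_replica = {}
    for base in replica_bases:
        urlopen(f"{base}/{core}/update?commit=true&wt=json").read()
        with urlopen(f"{base}/{core}/select?q=*:*&rows=0&distrib=false&wt=json") as resp:
            by_replica[base] = json.load(resp)["response"]["numFound"]
    return by_replica

# Offline demo using the doc counts reported later in this thread:
example = {"default1_Norwegian": {"varnish01": 55071, "varnish02": 35088}}
print(mismatches(example))  # -> ['default1_Norwegian']
```

In a live cluster one would call `fetch_counts(["http://varnish01.lynero.net:8000/solr", "http://varnish02.lynero.net:8000/solr"], "default1_Norwegian")` and feed the result to `mismatches`.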
>>>>>>>
>>>>>>> On Thu, Dec 13, 2012 at 4:10 PM, Mark Miller <markrmil...@gmail.com> wrote:
>>>>>>>
>>>>>>>> A couple of things to start:
>>>>>>>>
>>>>>>>> By default, SolrCloud distributes updates a doc at a time. So if you
>>>>>>>> have 1 shard, whatever node you index to, it will send updates to
>>>>>>>> the other. Replication is only used for recovery, not for
>>>>>>>> distributing data. So for some reason, there is an IOException when
>>>>>>>> it tries to forward.
>>>>>>>>
>>>>>>>> The other issue is not something that I've seen reported. Can/did
>>>>>>>> you try another hard commit, to make sure you had the latest
>>>>>>>> searcher open when checking the # of docs on each node? There was
>>>>>>>> previously a race around commit that could cause some issues around
>>>>>>>> expected visibility.
>>>>>>>>
>>>>>>>> If you are able to, you might try out a nightly build - 4.1 will be
>>>>>>>> ready very soon and has numerous bug fixes for SolrCloud.
>>>>>>>>
>>>>>>>> - Mark
>>>>>>>>
>>>>>>>> On Dec 13, 2012, at 9:53 AM, John Nielsen <j...@mcb.dk> wrote:
>>>>>>>>
>>>>>>>>> Hi all,
>>>>>>>>>
>>>>>>>>> We are seeing a strange problem on our 2-node Solr 4 cluster. This
>>>>>>>>> problem has resulted in data loss.
>>>>>>>>>
>>>>>>>>> We have two servers, varnish01 and varnish02. ZooKeeper is running
>>>>>>>>> on varnish02, but in a separate JVM.
>>>>>>>>>
>>>>>>>>> We index directly to varnish02 and we read from varnish01.
>>>>>>>>> Data is thus replicated from varnish02 to varnish01.
>>>>>>>>>
>>>>>>>>> I found this in the varnish01 log:
>>>>>>>>>
>>>>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish02.lynero.net:8000/solr/default1_Norwegian/&update.distrib=TOLEADER&wt=javabin&version=2} status=0 QTime=42
>>>>>>>>> Dec 13, 2012 12:23:36 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish02.lynero.net:8000/solr/default1_Norwegian/&update.distrib=TOLEADER&wt=javabin&version=2} status=0 QTime=41
>>>>>>>>> Dec 13, 2012 12:23:36 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish02.lynero.net:8000/solr/default1_Norwegian/&update.distrib=TOLEADER&wt=javabin&version=2} status=0 QTime=33
>>>>>>>>> Dec 13, 2012 12:23:36 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish02.lynero.net:8000/solr/default1_Norwegian/&update.distrib=TOLEADER&wt=javabin&version=2} status=0 QTime=33
>>>>>>>>> Dec 13, 2012 12:23:39 PM org.apache.solr.common.SolrException log
>>>>>>>>> SEVERE: shard update error StdNode: http://varnish02.lynero.net:8000/solr/default1_Norwegian/:org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://varnish02.lynero.net:8000/solr/default1_Norwegian
>>>>>>>>>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:413)
>>>>>>>>>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
>>>>>>>>>     at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:335)
>>>>>>>>>     at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:309)
>>>>>>>>>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>>>>>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>>>>>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>>>>>>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>>>>>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>>>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>>>>>>>>     at java.lang.Thread.run(Thread.java:636)
>>>>>>>>> Caused by: org.apache.http.NoHttpResponseException: The target server failed to respond
>>>>>>>>>     at org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:101)
>>>>>>>>>     at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:252)
>>>>>>>>>     at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:282)
>>>>>>>>>     at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:247)
>>>>>>>>>     at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:216)
>>>>>>>>>     at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:298)
>>>>>>>>>     at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>>>>>>>>>     at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:647)
>>>>>>>>>     at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:464)
>>>>>>>>>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
>>>>>>>>>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
>>>>>>>>>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
>>>>>>>>>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:352)
>>>>>>>>>     ... 11 more
>>>>>>>>>
>>>>>>>>> Dec 13, 2012 12:23:39 PM org.apache.solr.update.processor.DistributedUpdateProcessor doFinish
>>>>>>>>> INFO: try and ask http://varnish02.lynero.net:8000/solr to recover
>>>>>>>>>
>>>>>>>>> It looks like it is sending updates from varnish01 to varnish02. I
>>>>>>>>> am not sure why, since we only index on varnish02. Updates should
>>>>>>>>> never be going from varnish01 to varnish02.
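The forwarding direction puzzled over here can be read straight off the quoted request logs: `distrib.from` names the node the update came from, and `update.distrib` the hop type (`TOLEADER` vs `FROMLEADER`). A rough sketch of tallying that from log lines; the regex is an assumption about the exact log format shown in this thread, not an official Solr log grammar:

```python
# Sketch: extract (forwarding host, update.distrib direction) pairs from the
# Solr request-log lines quoted in this thread. Regex assumed from the samples.
import re
from collections import Counter

LINE = re.compile(r"distrib\.from=(?P<src>https?://[^/]+)\S*update\.distrib=(?P<dir>[A-Z]+)")

def forwarding_summary(log_lines):
    """Count (source-host, direction) pairs, e.g. ('http://varnish02...', 'TOLEADER')."""
    c = Counter()
    for line in log_lines:
        m = LINE.search(line)
        if m:
            c[(m.group("src"), m.group("dir"))] += 1
    return c

# Two lines quoted verbatim from this thread:
sample = [
    "INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish02.lynero.net:8000/solr/default1_Norwegian/&update.distrib=TOLEADER&wt=javabin&version=2} status=0 QTime=42",
    "INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=16",
]
print(forwarding_summary(sample))
```

Run over both nodes' logs, this makes it easy to see which host originated each hop and how often.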
>>>>>>>>>
>>>>>>>>> Meanwhile on varnish02:
>>>>>>>>>
>>>>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=16
>>>>>>>>> Dec 13, 2012 12:23:36 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=15
>>>>>>>>> Dec 13, 2012 12:23:36 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=16
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.handler.admin.CoreAdminHandler handleRequestRecoveryAction
>>>>>>>>> INFO: It has been requested that we recover
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Danish] webapp=/solr path=/select params={facet=false&sort=item_group_59700_name_int+asc,+variant_of_item_guid+asc&group.distributed.first=true&facet.limit=1000&q.alt=*:*&q.alt=*:*&distrib=false&facet.method=enum&version=2&df=text&fl=docid&shard.url=varnish02.lynero.net:8000/solr/default1_Danish/|varnish01.lynero.net:8000/solr/default1_Danish/&NOW=1355397822111&group.field=groupby_variant_of_item_guid&fq=site_guid:(11440)&fq=item_type:(PRODUCT)&fq=language_guid:(1)&fq=item_group_59700_combination:(*)&fq=item_group_45879_combination:(*)&fq=is_searchable:(True)&querytype=Technical&mm=100%25&facet.missing=on&group.ngroups=true&facet.mincount=1&qf=%0a++++++++++text^0.5+name^1.2+searchable_text^0.8+typeahead_text^1.0+keywords^1.1+item_no^5.0%0a++++++++++ranking1_text^1.0+ranking2_text^2.0+ranking3_text^3.0%0a+++++++&wt=javabin&group.facet=true&defType=edismax&rows=0&facet.sort=lex&start=0&group=true&group.sort=name+asc&isShard=true} status=0 QTime=1
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Danish] webapp=/solr path=/select/ params={fq=site_guid:(2810678)&q=win} hits=0 status=0 QTime=17
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Danish] webapp=/solr path=/select params={facet=on&sort=item_group_59700_name_int+asc,+variant_of_item_guid+asc&q.alt=*:*&q.alt=*:*&distrib=false&facet.method=enum&group.distributed.second=true&version=2&df=text&fl=docid&shard.url=varnish02.lynero.net:8000/solr/default1_Danish/|varnish01.lynero.net:8000/solr/default1_Danish/&NOW=1355397822111&group.field=groupby_variant_of_item_guid&fq=site_guid:(11440)&fq=item_type:(PRODUCT)&fq=language_guid:(1)&fq=item_group_59700_combination:(*)&fq=item_group_45879_combination:(*)&fq=is_searchable:(True)&querytype=Technical&mm=100%25&facet.missing=on&group.ngroups=true&qf=%0a++++++++++text^0.5+name^1.2+searchable_text^0.8+typeahead_text^1.0+keywords^1.1+item_no^5.0%0a++++++++++ranking1_text^1.0+ranking2_text^2.0+ranking3_text^3.0%0a+++++++&wt=javabin&group.facet=true&defType=edismax&rows=0&facet.sort=lex&start=0&group=true&group.sort=name+asc&isShard=true} status=0 QTime=1
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Danish] webapp=/solr path=/select params={facet=false&sort=item_group_59700_name_int+asc,+variant_of_item_guid+asc&group.distributed.first=true&facet.limit=1000&q.alt=*:*&q.alt=*:*&distrib=false&facet.method=enum&version=2&df=text&fl=docid&shard.url=varnish02.lynero.net:8000/solr/default1_Danish/|varnish01.lynero.net:8000/solr/default1_Danish/&NOW=1355397822138&group.field=groupby_variant_of_item_guid&fq=site_guid:(11440)&fq=item_type:(PRODUCT)&fq=language_guid:(1)&fq=item_group_59700_combination:(*)&fq=item_group_45879_combination:(*)&fq=is_searchable:(True)&querytype=Technical&mm=100%25&facet.missing=on&group.ngroups=true&facet.mincount=1&qf=%0a++++++++++text^0.5+name^1.2+searchable_text^0.8+typeahead_text^1.0+keywords^1.1+item_no^5.0%0a++++++++++ranking1_text^1.0+ranking2_text^2.0+ranking3_text^3.0%0a+++++++&wt=javabin&group.facet=true&defType=edismax&rows=40&facet.sort=lex&start=0&group=true&group.sort=name+asc&isShard=true} status=0 QTime=1
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Danish] webapp=/solr path=/select params={facet=on&sort=item_group_59700_name_int+asc,+variant_of_item_guid+asc&q.alt=*:*&q.alt=*:*&distrib=false&facet.method=enum&group.distributed.second=true&version=2&df=text&fl=docid&shard.url=varnish02.lynero.net:8000/solr/default1_Danish/|varnish01.lynero.net:8000/solr/default1_Danish/&NOW=1355397822138&group.field=groupby_variant_of_item_guid&fq=site_guid:(11440)&fq=item_type:(PRODUCT)&fq=language_guid:(1)&fq=item_group_59700_combination:(*)&fq=item_group_45879_combination:(*)&fq=is_searchable:(True)&querytype=Technical&mm=100%25&facet.missing=on&group.ngroups=true&group.topgroups.groupby_variant_of_item_guid=2963217&group.topgroups.groupby_variant_of_item_guid=2963223&group.topgroups.groupby_variant_of_item_guid=2963219&group.topgroups.groupby_variant_of_item_guid=2963220&group.topgroups.groupby_variant_of_item_guid=2963221&group.topgroups.groupby_variant_of_item_guid=2963222&group.topgroups.groupby_variant_of_item_guid=2963224&group.topgroups.groupby_variant_of_item_guid=2963218&qf=%0a++++++++++text^0.5+name^1.2+searchable_text^0.8+typeahead_text^1.0+keywords^1.1+item_no^5.0%0a++++++++++ranking1_text^1.0+ranking2_text^2.0+ranking3_text^3.0%0a+++++++&wt=javabin&group.facet=true&defType=edismax&rows=40&facet.sort=lex&start=0&group=true&group.sort=name+asc&isShard=true} status=0 QTime=1
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=26
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=22
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.update.DefaultSolrCoreState doRecovery
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.update.DefaultSolrCoreState doRecovery
>>>>>>>>> INFO: Running recovery - first canceling any ongoing recovery
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=25
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=24
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=20
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=25
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=23
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=21
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=23
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=16
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.cloud.RecoveryStrategy run
>>>>>>>>> INFO: Starting recovery process. core=default1_Norwegian recoveringAfterStartup=false
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.common.cloud.ZkStateReader updateClusterState
>>>>>>>>> INFO: Updating cloud state from ZooKeeper...
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.update.processor.LogUpdateProcessor finish
>>>>>>>>>
>>>>>>>>> And less than a second later:
>>>>>>>>>
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.cloud.RecoveryStrategy doRecovery
>>>>>>>>> INFO: Attempting to PeerSync from http://varnish01.lynero.net:8000/solr/default1_Norwegian/ core=default1_Norwegian - recoveringAfterStartup=false
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.update.PeerSync sync
>>>>>>>>> INFO: PeerSync: core=default1_Norwegian url=http://varnish02.lynero.net:8000/solr START replicas=[http://varnish01.lynero.net:8000/solr/default1_Norwegian/] nUpdates=100
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.update.PeerSync sync
>>>>>>>>> WARNING: PeerSync: core=default1_Norwegian url=http://varnish02.lynero.net:8000/solr too many updates received since start - startingUpdates no longer overlaps with our currentUpdates
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.cloud.RecoveryStrategy doRecovery
>>>>>>>>> INFO: PeerSync Recovery was not successful - trying replication. core=default1_Norwegian
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.cloud.RecoveryStrategy doRecovery
>>>>>>>>> INFO: Starting Replication Recovery. core=default1_Norwegian
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.client.solrj.impl.HttpClientUtil createClient
>>>>>>>>> INFO: Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.common.cloud.ZkStateReader$2 process
>>>>>>>>> INFO: A cluster state change has occurred - updating...
>>>>>>>>>
>>>>>>>>> State change on varnish01 at the same time:
>>>>>>>>>
>>>>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.common.cloud.ZkStateReader$2 process
>>>>>>>>> INFO: A cluster state change has occurred - updating...
>>>>>>>>>
>>>>>>>>> And a few seconds later on varnish02, the recovery finishes:
>>>>>>>>>
>>>>>>>>> Dec 13, 2012 12:23:48 PM org.apache.solr.cloud.RecoveryStrategy doRecovery
>>>>>>>>> INFO: Replication Recovery was successful - registering as Active. core=default1_Norwegian
>>>>>>>>> Dec 13, 2012 12:23:48 PM org.apache.solr.cloud.RecoveryStrategy doRecovery
>>>>>>>>> INFO: Finished recovery process. core=default1_Norwegian
>>>>>>>>> Dec 13, 2012 12:23:48 PM org.apache.solr.core.SolrCore execute
>>>>>>>>> INFO: [default1_Danish] webapp=/solr path=/select params={facet=false&sort=item_group_56823_name_int+asc,+variant_of_item_guid+asc&group.distributed.first=true&facet.limit=1000&q.alt=*:*&q.alt=*:*&distrib=false&facet.method=enum&version=2&df=text&fl=docid&shard.url=varnish02.lynero.net:8000/solr/default1_Danish/|varnish01.lynero.net:8000/solr/default1_Danish/&NOW=1355397828395&group.field=groupby_variant_of_item_guid&facet.field=itemgroups_int_mv&fq=site_guid:(11440)&fq=item_type:(PRODUCT)&fq=language_guid:(1)&fq=item_group_56823_combination:(*)&fq=item_group_45879_combination:(*)&fq=is_searchable:(True)&querytype=Technical&mm=100%25&facet.missing=on&group.ngroups=true&facet.mincount=1&qf=%0a++++++++++text^0.5+name^1.2+searchable_text^0.8+typeahead_text^1.0+keywords^1.1+item_no^5.0%0a++++++++++ranking1_text^1.0+ranking2_text^2.0+ranking3_text^3.0%0a+++++++&wt=javabin&group.facet=true&defType=edismax&rows=0&facet.sort=lex&start=0&group=true&group.sort=name+asc&isShard=true} status=0 QTime=8
>>>>>>>>> Dec 13, 2012 12:23:48 PM org.apache.solr.common.cloud.ZkStateReader updateClusterState
>>>>>>>>> INFO: Updating cloud state from ZooKeeper...
>>>>>>>>>
>>>>>>>>> Which is picked up on varnish01:
>>>>>>>>>
>>>>>>>>> Dec 13, 2012 12:23:48 PM org.apache.solr.common.cloud.ZkStateReader$2 process
>>>>>>>>> INFO: A cluster state change has occurred - updating...
>>>>>>>>>
>>>>>>>>> It looks like it replicated successfully, only it didn't. The
>>>>>>>>> default1_Norwegian core on varnish01 now has 55.071 docs and the
>>>>>>>>> same core on varnish02 has 35.088 docs.
>>>>>>>>>
>>>>>>>>> I checked the log files for both JVMs, and no stop-the-world GC was
>>>>>>>>> taking place.
>>>>>>>>>
>>>>>>>>> There is also nothing in the ZooKeeper log of interest that I can
>>>>>>>>> see.
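One diagnostic from this thread worth automating is Markus's observation that a single core can hang forever on `/replication?command=details`. A rough sketch of probing that endpoint per core with a hard timeout; the replication-details command is standard Solr API, but the probe structure, labels, and fetcher factory here are invented for illustration:

```python
# Sketch: classify each core by whether its replication-details call completes.
# The fetcher is injectable so the classification logic runs without a live Solr.
import json
from urllib.request import urlopen

def make_fetch(base, timeout=5.0):
    """Real fetcher: urlopen raises on HTTP errors and on timeout."""
    def fetch(core):
        url = f"{base}/{core}/replication?command=details&wt=json"
        with urlopen(url, timeout=timeout) as resp:
            return json.load(resp)
    return fetch

def probe(cores, fetch):
    """Return {core: 'ok' | 'hung/error'} for the replication details call."""
    out = {}
    for core in cores:
        try:
            fetch(core)
            out[core] = "ok"
        except Exception:
            out[core] = "hung/error"
    return out
```

Against a live node one would call, e.g., `probe(["default1_Danish", "default1_Norwegian"], make_fetch("http://varnish01.lynero.net:8000/solr"))` to flag any core whose details call does not return within the timeout.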