Have you filed a JIRA issue for this that I don't remember, Markus?

We need to make sure this is fixed.

Any idea when that trunk version is from? Before or after 4.0?

- Mark

On Dec 14, 2012, at 6:36 AM, Markus Jelsma <markus.jel...@openindex.io> wrote:

> We did not solve it, but reindexing can remedy the problem.
> 
> -----Original message-----
>> From:John Nielsen <j...@mcb.dk>
>> Sent: Fri 14-Dec-2012 12:31
>> To: solr-user@lucene.apache.org
>> Subject: Re: Strange data-loss problem on one of our cores
>> 
>> How did you solve the problem?
>> 
>> 
>> -- 
>> Med venlig hilsen / Best regards
>> 
>> *John Nielsen*
>> Programmer
>> 
>> 
>> 
>> *MCB A/S*
>> Enghaven 15
>> DK-7500 Holstebro
>> 
>> Kundeservice: +45 9610 2824
>> p...@mcb.dk
>> www.mcb.dk
>> 
>> 
>> 
>> On Fri, Dec 14, 2012 at 12:04 PM, Markus Jelsma <markus.jel...@openindex.io> wrote:
>> 
>>> FYI, we observe the same issue: after some time (days or months), a cluster
>>> running an older trunk version has at least two shards where the leader and
>>> the replica do not contain the same number of records. No recovery is
>>> attempted; it seems to think everything is all right. Also, one core of one
>>> of the unsynced shards waits forever loading
>>> /replication?command=detail&wt=json, while other cores load it in a few ms.
>>> Neither core of another unsynced shard shows this problem.
>>> 
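>>> For reference, the check that hangs is just an HTTP GET against each core's
>>> replication handler. A minimal probe (plain Java; the host, port and core
>>> name are placeholders, and the command is passed exactly as above):
>>> 
>>>     // Hedged sketch: a healthy core answers this in a few ms; the
>>>     // broken core never returns.
>>>     import java.io.BufferedReader;
>>>     import java.io.InputStreamReader;
>>>     import java.net.URL;
>>> 
>>>     public class ReplicationProbe {
>>>       public static void main(String[] args) throws Exception {
>>>         URL u = new URL("http://host1:8983/solr/core1" +
>>>             "/replication?command=detail&wt=json");
>>>         BufferedReader in = new BufferedReader(
>>>             new InputStreamReader(u.openStream()));
>>>         for (String line; (line = in.readLine()) != null; )
>>>           System.out.println(line);
>>>         in.close();
>>>       }
>>>     }
>>> 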
>>> -----Original message-----
>>>> From:John Nielsen <j...@mcb.dk>
>>>> Sent: Fri 14-Dec-2012 11:50
>>>> To: solr-user@lucene.apache.org
>>>> Subject: Re: Strange data-loss problem on one of our cores
>>>> 
>>>> I did a manual commit, and we are still missing docs, so it doesn't look
>>>> like the search race condition you mention.
>>>> 
>>>> My boss wasn't happy when I mentioned that I wanted to try out unreleased
>>>> code. I'll win him over, though, and return with my findings. It will
>>>> probably be some time next week.
>>>> 
>>>> Thanks for your help.
>>>> 
>>>> On Thu, Dec 13, 2012 at 4:10 PM, Mark Miller <markrmil...@gmail.com> wrote:
>>>> 
>>>>> A couple of things to start:
>>>>> 
>>>>> By default, SolrCloud distributes updates a doc at a time. So if you have 1
>>>>> shard, whichever node you index to, it will send updates to the other.
>>>>> Replication is only used for recovery, not for distributing data. So for
>>>>> some reason, there is an IOException when it tries to forward.
>>>>> 
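>>>>> To make that flow concrete, a minimal SolrJ 4.x sketch (the ZooKeeper
>>>>> address and collection name are placeholders): a client can send a doc to
>>>>> any node, and SolrCloud forwards it to the leader, which pushes it to the
>>>>> replicas.
>>>>> 
>>>>>     import org.apache.solr.client.solrj.impl.CloudSolrServer;
>>>>>     import org.apache.solr.common.SolrInputDocument;
>>>>> 
>>>>>     public class IndexSketch {
>>>>>       public static void main(String[] args) throws Exception {
>>>>>         // ZK-aware client; any node may receive the update.
>>>>>         CloudSolrServer server = new CloudSolrServer("zkhost:2181");
>>>>>         server.setDefaultCollection("collection1");
>>>>>         SolrInputDocument doc = new SolrInputDocument();
>>>>>         doc.addField("id", "example-1");
>>>>>         server.add(doc);  // distributed doc-at-a-time, not via replication
>>>>>         server.commit();
>>>>>         server.shutdown();
>>>>>       }
>>>>>     }
>>>>> 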
>>>>> The other issue is not something that I've seen reported. Can/did you try
>>>>> doing another hard commit, to make sure you had the latest searcher open
>>>>> when checking the # of docs on each node? There was previously a race
>>>>> around commit that could cause some issues around expected visibility.
>>>>> 
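>>>>> In other words, something like this (a sketch, SolrJ 4.x; the core URLs
>>>>> are placeholders for your two nodes): hard-commit first so the latest
>>>>> searcher is open, then count each core in isolation with distrib=false.
>>>>> 
>>>>>     import org.apache.solr.client.solrj.SolrQuery;
>>>>>     import org.apache.solr.client.solrj.impl.HttpSolrServer;
>>>>> 
>>>>>     public class CountCheck {
>>>>>       public static void main(String[] args) throws Exception {
>>>>>         String[] cores = { "http://host1:8983/solr/core1",
>>>>>                            "http://host2:8983/solr/core1" };
>>>>>         for (String url : cores) {
>>>>>           HttpSolrServer server = new HttpSolrServer(url);
>>>>>           server.commit();            // hard commit, opens a new searcher
>>>>>           SolrQuery q = new SolrQuery("*:*");
>>>>>           q.set("distrib", "false");  // count this core only
>>>>>           q.setRows(0);               // we only need numFound
>>>>>           System.out.println(url + " numFound="
>>>>>               + server.query(q).getResults().getNumFound());
>>>>>           server.shutdown();
>>>>>         }
>>>>>       }
>>>>>     }
>>>>> 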
>>>>> If you are able to, you might try out a nightly build - 4.1 will be ready
>>>>> very soon and has numerous bug fixes for SolrCloud.
>>>>> 
>>>>> - Mark
>>>>> 
>>>>> On Dec 13, 2012, at 9:53 AM, John Nielsen <j...@mcb.dk> wrote:
>>>>> 
>>>>>> Hi all,
>>>>>> 
>>>>>> We are seeing a strange problem on our 2-node Solr 4 cluster. This
>>>>>> problem has resulted in data loss.
>>>>>> 
>>>>>> We have two servers, varnish01 and varnish02. ZooKeeper is running on
>>>>>> varnish02, but in a separate JVM.
>>>>>> 
>>>>>> We index directly to varnish02 and we read from varnish01. Data is thus
>>>>>> replicated from varnish02 to varnish01.
>>>>>> 
>>>>>> I found this in the varnish01 log:
>>>>>> 
>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish02.lynero.net:8000/solr/default1_Norwegian/&update.distrib=TOLEADER&wt=javabin&version=2} status=0 QTime=42
>>>>>> Dec 13, 2012 12:23:36 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish02.lynero.net:8000/solr/default1_Norwegian/&update.distrib=TOLEADER&wt=javabin&version=2} status=0 QTime=41
>>>>>> Dec 13, 2012 12:23:36 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish02.lynero.net:8000/solr/default1_Norwegian/&update.distrib=TOLEADER&wt=javabin&version=2} status=0 QTime=33
>>>>>> Dec 13, 2012 12:23:36 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish02.lynero.net:8000/solr/default1_Norwegian/&update.distrib=TOLEADER&wt=javabin&version=2} status=0 QTime=33
>>>>>> Dec 13, 2012 12:23:39 PM org.apache.solr.common.SolrException log
>>>>>> SEVERE: shard update error StdNode: http://varnish02.lynero.net:8000/solr/default1_Norwegian/:org.apache.solr.client.solrj.SolrServerException: IOException occured when talking to server at: http://varnish02.lynero.net:8000/solr/default1_Norwegian
>>>>>>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:413)
>>>>>>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
>>>>>>     at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:335)
>>>>>>     at org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:309)
>>>>>>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>>>>>     at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:471)
>>>>>>     at java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:334)
>>>>>>     at java.util.concurrent.FutureTask.run(FutureTask.java:166)
>>>>>>     at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1110)
>>>>>>     at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:603)
>>>>>>     at java.lang.Thread.run(Thread.java:636)
>>>>>> Caused by: org.apache.http.NoHttpResponseException: The target server failed to respond
>>>>>>     at org.apache.http.impl.conn.DefaultResponseParser.parseHead(DefaultResponseParser.java:101)
>>>>>>     at org.apache.http.impl.io.AbstractMessageParser.parse(AbstractMessageParser.java:252)
>>>>>>     at org.apache.http.impl.AbstractHttpClientConnection.receiveResponseHeader(AbstractHttpClientConnection.java:282)
>>>>>>     at org.apache.http.impl.conn.DefaultClientConnection.receiveResponseHeader(DefaultClientConnection.java:247)
>>>>>>     at org.apache.http.impl.conn.AbstractClientConnAdapter.receiveResponseHeader(AbstractClientConnAdapter.java:216)
>>>>>>     at org.apache.http.protocol.HttpRequestExecutor.doReceiveResponse(HttpRequestExecutor.java:298)
>>>>>>     at org.apache.http.protocol.HttpRequestExecutor.execute(HttpRequestExecutor.java:125)
>>>>>>     at org.apache.http.impl.client.DefaultRequestDirector.tryExecute(DefaultRequestDirector.java:647)
>>>>>>     at org.apache.http.impl.client.DefaultRequestDirector.execute(DefaultRequestDirector.java:464)
>>>>>>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:820)
>>>>>>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:754)
>>>>>>     at org.apache.http.impl.client.AbstractHttpClient.execute(AbstractHttpClient.java:732)
>>>>>>     at org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:352)
>>>>>>     ... 11 more
>>>>>> 
>>>>>> Dec 13, 2012 12:23:39 PM org.apache.solr.update.processor.DistributedUpdateProcessor doFinish
>>>>>> INFO: try and ask http://varnish02.lynero.net:8000/solr to recover
>>>>>> 
>>>>>> It looks like it is sending updates from varnish01 to varnish02. I am not
>>>>>> sure why, since we only index on varnish02. Updates should never be going
>>>>>> from varnish01 to varnish02.
>>>>>> 
>>>>>> Meanwhile on varnish02:
>>>>>> 
>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=16
>>>>>> Dec 13, 2012 12:23:36 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=15
>>>>>> Dec 13, 2012 12:23:36 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=16
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.handler.admin.CoreAdminHandler handleRequestRecoveryAction
>>>>>> INFO: It has been requested that we recover
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Danish] webapp=/solr path=/select params={facet=false&sort=item_group_59700_name_int+asc,+variant_of_item_guid+asc&group.distributed.first=true&facet.limit=1000&q.alt=*:*&q.alt=*:*&distrib=false&facet.method=enum&version=2&df=text&fl=docid&shard.url=varnish02.lynero.net:8000/solr/default1_Danish/|varnish01.lynero.net:8000/solr/default1_Danish/&NOW=1355397822111&group.field=groupby_variant_of_item_guid&fq=site_guid:(11440)&fq=item_type:(PRODUCT)&fq=language_guid:(1)&fq=item_group_59700_combination:(*)&fq=item_group_45879_combination:(*)&fq=is_searchable:(True)&querytype=Technical&mm=100%25&facet.missing=on&group.ngroups=true&facet.mincount=1&qf=%0a++++++++++text^0.5+name^1.2+searchable_text^0.8+typeahead_text^1.0+keywords^1.1+item_no^5.0%0a++++++++++ranking1_text^1.0+ranking2_text^2.0+ranking3_text^3.0%0a+++++++&wt=javabin&group.facet=true&defType=edismax&rows=0&facet.sort=lex&start=0&group=true&group.sort=name+asc&isShard=true} status=0 QTime=1
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Danish] webapp=/solr path=/select/ params={fq=site_guid:(2810678)&q=win} hits=0 status=0 QTime=17
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Danish] webapp=/solr path=/select params={facet=on&sort=item_group_59700_name_int+asc,+variant_of_item_guid+asc&q.alt=*:*&q.alt=*:*&distrib=false&facet.method=enum&group.distributed.second=true&version=2&df=text&fl=docid&shard.url=varnish02.lynero.net:8000/solr/default1_Danish/|varnish01.lynero.net:8000/solr/default1_Danish/&NOW=1355397822111&group.field=groupby_variant_of_item_guid&fq=site_guid:(11440)&fq=item_type:(PRODUCT)&fq=language_guid:(1)&fq=item_group_59700_combination:(*)&fq=item_group_45879_combination:(*)&fq=is_searchable:(True)&querytype=Technical&mm=100%25&facet.missing=on&group.ngroups=true&qf=%0a++++++++++text^0.5+name^1.2+searchable_text^0.8+typeahead_text^1.0+keywords^1.1+item_no^5.0%0a++++++++++ranking1_text^1.0+ranking2_text^2.0+ranking3_text^3.0%0a+++++++&wt=javabin&group.facet=true&defType=edismax&rows=0&facet.sort=lex&start=0&group=true&group.sort=name+asc&isShard=true} status=0 QTime=1
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Danish] webapp=/solr path=/select params={facet=false&sort=item_group_59700_name_int+asc,+variant_of_item_guid+asc&group.distributed.first=true&facet.limit=1000&q.alt=*:*&q.alt=*:*&distrib=false&facet.method=enum&version=2&df=text&fl=docid&shard.url=varnish02.lynero.net:8000/solr/default1_Danish/|varnish01.lynero.net:8000/solr/default1_Danish/&NOW=1355397822138&group.field=groupby_variant_of_item_guid&fq=site_guid:(11440)&fq=item_type:(PRODUCT)&fq=language_guid:(1)&fq=item_group_59700_combination:(*)&fq=item_group_45879_combination:(*)&fq=is_searchable:(True)&querytype=Technical&mm=100%25&facet.missing=on&group.ngroups=true&facet.mincount=1&qf=%0a++++++++++text^0.5+name^1.2+searchable_text^0.8+typeahead_text^1.0+keywords^1.1+item_no^5.0%0a++++++++++ranking1_text^1.0+ranking2_text^2.0+ranking3_text^3.0%0a+++++++&wt=javabin&group.facet=true&defType=edismax&rows=40&facet.sort=lex&start=0&group=true&group.sort=name+asc&isShard=true} status=0 QTime=1
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Danish] webapp=/solr path=/select params={facet=on&sort=item_group_59700_name_int+asc,+variant_of_item_guid+asc&q.alt=*:*&q.alt=*:*&distrib=false&facet.method=enum&group.distributed.second=true&version=2&df=text&fl=docid&shard.url=varnish02.lynero.net:8000/solr/default1_Danish/|varnish01.lynero.net:8000/solr/default1_Danish/&NOW=1355397822138&group.field=groupby_variant_of_item_guid&fq=site_guid:(11440)&fq=item_type:(PRODUCT)&fq=language_guid:(1)&fq=item_group_59700_combination:(*)&fq=item_group_45879_combination:(*)&fq=is_searchable:(True)&querytype=Technical&mm=100%25&facet.missing=on&group.ngroups=true&group.topgroups.groupby_variant_of_item_guid=2963217&group.topgroups.groupby_variant_of_item_guid=2963223&group.topgroups.groupby_variant_of_item_guid=2963219&group.topgroups.groupby_variant_of_item_guid=2963220&group.topgroups.groupby_variant_of_item_guid=2963221&group.topgroups.groupby_variant_of_item_guid=2963222&group.topgroups.groupby_variant_of_item_guid=2963224&group.topgroups.groupby_variant_of_item_guid=2963218&qf=%0a++++++++++text^0.5+name^1.2+searchable_text^0.8+typeahead_text^1.0+keywords^1.1+item_no^5.0%0a++++++++++ranking1_text^1.0+ranking2_text^2.0+ranking3_text^3.0%0a+++++++&wt=javabin&group.facet=true&defType=edismax&rows=40&facet.sort=lex&start=0&group=true&group.sort=name+asc&isShard=true} status=0 QTime=1
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=26
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=22
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.update.DefaultSolrCoreState doRecovery
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.update.DefaultSolrCoreState doRecovery
>>>>>> INFO: Running recovery - first canceling any ongoing recovery
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=25
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=24
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=20
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=25
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=23
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=21
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=23
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Norwegian] webapp=/solr path=/update params={distrib.from=http://varnish01.lynero.net:8000/solr/default1_Norwegian/&update.distrib=FROMLEADER&wt=javabin&version=2} status=0 QTime=16
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.cloud.RecoveryStrategy run
>>>>>> INFO: Starting recovery process.  core=default1_Norwegian recoveringAfterStartup=false
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.common.cloud.ZkStateReader updateClusterState
>>>>>> INFO: Updating cloud state from ZooKeeper...
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.update.processor.LogUpdateProcessor finish
>>>>>> 
>>>>>> And less than a second later:
>>>>>> 
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.cloud.RecoveryStrategy doRecovery
>>>>>> INFO: Attempting to PeerSync from http://varnish01.lynero.net:8000/solr/default1_Norwegian/ core=default1_Norwegian - recoveringAfterStartup=false
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.update.PeerSync sync
>>>>>> INFO: PeerSync: core=default1_Norwegian url=http://varnish02.lynero.net:8000/solr START replicas=[http://varnish01.lynero.net:8000/solr/default1_Norwegian/] nUpdates=100
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.update.PeerSync sync
>>>>>> WARNING: PeerSync: core=default1_Norwegian url=http://varnish02.lynero.net:8000/solr too many updates received since start - startingUpdates no longer overlaps with our currentUpdates
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.cloud.RecoveryStrategy doRecovery
>>>>>> INFO: PeerSync Recovery was not successful - trying replication. core=default1_Norwegian
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.cloud.RecoveryStrategy doRecovery
>>>>>> INFO: Starting Replication Recovery. core=default1_Norwegian
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.client.solrj.impl.HttpClientUtil createClient
>>>>>> INFO: Creating new http client, config:maxConnections=128&maxConnectionsPerHost=32&followRedirects=false
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.common.cloud.ZkStateReader$2 process
>>>>>> INFO: A cluster state change has occurred - updating...
>>>>>> 
>>>>>> State change on varnish01 at the same time:
>>>>>> 
>>>>>> Dec 13, 2012 12:23:42 PM org.apache.solr.common.cloud.ZkStateReader$2 process
>>>>>> INFO: A cluster state change has occurred - updating...
>>>>>> 
>>>>>> And a few seconds later on varnish02, the recovery finishes:
>>>>>> 
>>>>>> Dec 13, 2012 12:23:48 PM org.apache.solr.cloud.RecoveryStrategy doRecovery
>>>>>> INFO: Replication Recovery was successful - registering as Active. core=default1_Norwegian
>>>>>> Dec 13, 2012 12:23:48 PM org.apache.solr.cloud.RecoveryStrategy doRecovery
>>>>>> INFO: Finished recovery process. core=default1_Norwegian
>>>>>> Dec 13, 2012 12:23:48 PM org.apache.solr.core.SolrCore execute
>>>>>> INFO: [default1_Danish] webapp=/solr path=/select params={facet=false&sort=item_group_56823_name_int+asc,+variant_of_item_guid+asc&group.distributed.first=true&facet.limit=1000&q.alt=*:*&q.alt=*:*&distrib=false&facet.method=enum&version=2&df=text&fl=docid&shard.url=varnish02.lynero.net:8000/solr/default1_Danish/|varnish01.lynero.net:8000/solr/default1_Danish/&NOW=1355397828395&group.field=groupby_variant_of_item_guid&facet.field=itemgroups_int_mv&fq=site_guid:(11440)&fq=item_type:(PRODUCT)&fq=language_guid:(1)&fq=item_group_56823_combination:(*)&fq=item_group_45879_combination:(*)&fq=is_searchable:(True)&querytype=Technical&mm=100%25&facet.missing=on&group.ngroups=true&facet.mincount=1&qf=%0a++++++++++text^0.5+name^1.2+searchable_text^0.8+typeahead_text^1.0+keywords^1.1+item_no^5.0%0a++++++++++ranking1_text^1.0+ranking2_text^2.0+ranking3_text^3.0%0a+++++++&wt=javabin&group.facet=true&defType=edismax&rows=0&facet.sort=lex&start=0&group=true&group.sort=name+asc&isShard=true} status=0 QTime=8
>>>>>> Dec 13, 2012 12:23:48 PM org.apache.solr.common.cloud.ZkStateReader updateClusterState
>>>>>> INFO: Updating cloud state from ZooKeeper...
>>>>>> 
>>>>>> Which is picked up on varnish01:
>>>>>> 
>>>>>> Dec 13, 2012 12:23:48 PM org.apache.solr.common.cloud.ZkStateReader$2 process
>>>>>> INFO: A cluster state change has occurred - updating...
>>>>>> 
>>>>>> It looks like it replicated successfully, only it didn't. The
>>>>>> default1_Norwegian core on varnish01 now has 55,071 docs and the same
>>>>>> core on varnish02 has 35,088 docs.
>>>>>> 
>>>>>> I checked the log files for both JVMs, and no stop-the-world GCs were
>>>>>> taking place.
>>>>>> 
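>>>>>> (If anyone wants to rule out GC pauses the same way: a sketch of the
>>>>>> standard HotSpot flags that produce a GC log showing stop-the-world
>>>>>> pauses. The log path and launcher are illustrative, not our exact
>>>>>> command line.)
>>>>>> 
>>>>>>     java -verbose:gc -XX:+PrintGCDetails -XX:+PrintGCDateStamps \
>>>>>>          -Xloggc:/var/log/solr/gc.log -jar start.jar
>>>>>> 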
>>>>>> There is also nothing of interest in the ZooKeeper log that I can see.
