I am not using the concurrent low pause garbage collector, but I could look at
switching. I'm assuming you're talking about adding -XX:+UseConcMarkSweepGC,
correct?
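
If so, here's a minimal sketch of how I'd add it to the start command I'm
using (shown further down in this thread). The extra CMS/ParNew tuning flags
are just my assumed starting point, not something recommended here:

java -server \
  -XX:+UseConcMarkSweepGC -XX:+UseParNewGC \
  -XX:+CMSParallelRemarkEnabled \
  -Dshard=shard5 -DcoreName=shard5-core1 \
  ... (same data dir, collection, zkHost and port options as before) \
  -jar start.jar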

I also just had a shard go down, and I am seeing this in the log:

SEVERE: org.apache.solr.common.SolrException: I was asked to wait on state
down for 10.38.33.17:7576_solr but I still do not see the requested state.
I see state: recovering live:false
        at
org.apache.solr.handler.admin.CoreAdminHandler.handleWaitForStateAction(CoreAdminHandler.java:890)
        at
org.apache.solr.handler.admin.CoreAdminHandler.handleRequestBody(CoreAdminHandler.java:186)
        at
org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
        at
org.apache.solr.servlet.SolrDispatchFilter.handleAdminRequest(SolrDispatchFilter.java:591)
        at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:192)
        at
org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:141)
        at
org.eclipse.jetty.servlet.ServletHandler$CachedChain.doFilter(ServletHandler.java:1307)
        at
org.eclipse.jetty.servlet.ServletHandler.doHandle(ServletHandler.java:453)
        at
org.eclipse.jetty.server.handler.ScopedHandler.handle(ScopedHandler.java:137)
        at
org.eclipse.jetty.security.SecurityHandler.handle(SecurityHandler.java:560)
        at
org.eclipse.jetty.server.session.SessionHandler.doHandle(SessionHandler.java:231)

Nothing else in the log jumps out as interesting, though.
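
In the meantime I'll try raising the ZooKeeper session timeout to 30 seconds
as you suggested. A rough sketch of what I plan to do, assuming the stock
solr.xml still reads the zkClientTimeout system property (and that the
ensemble's maxSessionTimeout allows a 30 second session):

java -server -DzkClientTimeout=30000 \
  -Dshard=shard5 -DcoreName=shard5-core1 \
  ... (same collection, zkHost and port options as before) \
  -jar start.jar

If that property isn't wired through, I'd set zkClientTimeout="30000" on the
<cores> element in solr.xml directly.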


On Wed, Apr 3, 2013 at 7:47 PM, Mark Miller <markrmil...@gmail.com> wrote:

> This shouldn't be a problem though, if things are working as they are
> supposed to. Another node should simply take over as the overseer and
> continue processing the work queue. It's just best if you configure so that
> session timeouts don't happen unless a node is really down. On the other
> hand, it's nicer to detect that faster. Your tradeoff to make.
>
> - Mark
>
> On Apr 3, 2013, at 7:46 PM, Mark Miller <markrmil...@gmail.com> wrote:
>
> > Yeah. Are you using the concurrent low pause garbage collector?
> >
> > This means the overseer wasn't able to communicate with zk for 15
> seconds - due to load or gc or whatever. If you can't resolve the root
> cause of that, or the load just won't allow for it, next best thing you can
> do is raise it to 30 seconds.
> >
> > - Mark
> >
> > On Apr 3, 2013, at 7:41 PM, Jamie Johnson <jej2...@gmail.com> wrote:
> >
> >> I am occasionally seeing this in the log, is this just a timeout issue?
> >> Should I be increasing the zk client timeout?
> >>
> >> WARNING: Overseer cannot talk to ZK
> >> Apr 3, 2013 11:14:25 PM
> >> org.apache.solr.cloud.DistributedQueue$LatchChildWatcher process
> >> INFO: Watcher fired on path: null state: Expired type None
> >> Apr 3, 2013 11:14:25 PM
> org.apache.solr.cloud.Overseer$ClusterStateUpdater
> >> run
> >> WARNING: Solr cannot talk to ZK, exiting Overseer main queue loop
> >> org.apache.zookeeper.KeeperException$SessionExpiredException:
> >> KeeperErrorCode = Session expired for /overseer/queue
> >>       at
> >> org.apache.zookeeper.KeeperException.create(KeeperException.java:127)
> >>       at
> >> org.apache.zookeeper.KeeperException.create(KeeperException.java:51)
> >>       at org.apache.zookeeper.ZooKeeper.getChildren(ZooKeeper.java:1468)
> >>       at
> >>
> org.apache.solr.common.cloud.SolrZkClient$6.execute(SolrZkClient.java:236)
> >>       at
> >>
> org.apache.solr.common.cloud.SolrZkClient$6.execute(SolrZkClient.java:233)
> >>       at
> >>
> org.apache.solr.common.cloud.ZkCmdExecutor.retryOperation(ZkCmdExecutor.java:65)
> >>       at
> >>
> org.apache.solr.common.cloud.SolrZkClient.getChildren(SolrZkClient.java:233)
> >>       at
> >>
> org.apache.solr.cloud.DistributedQueue.orderedChildren(DistributedQueue.java:89)
> >>       at
> >>
> org.apache.solr.cloud.DistributedQueue.element(DistributedQueue.java:131)
> >>       at
> >> org.apache.solr.cloud.DistributedQueue.peek(DistributedQueue.java:326)
> >>       at
> >>
> org.apache.solr.cloud.Overseer$ClusterStateUpdater.run(Overseer.java:128)
> >>       at java.lang.Thread.run(Thread.java:662)
> >>
> >>
> >>
> >> On Wed, Apr 3, 2013 at 7:25 PM, Jamie Johnson <jej2...@gmail.com>
> wrote:
> >>
> >>> just an update, I'm at 1M records now with no issues.  This looks
> >>> promising as to the cause of my issues, thanks for the help.  Is the
> >>> routing method with numShards documented anywhere?  I know numShards is
> >>> documented but I didn't know that the routing changed if you don't
> specify
> >>> it.
> >>>
> >>>
> >>> On Wed, Apr 3, 2013 at 4:44 PM, Jamie Johnson <jej2...@gmail.com>
> wrote:
> >>>
> >>>> with these changes things are looking good, I'm up to 600,000
> documents
> >>>> without any issues as of right now.  I'll keep going and add more to
> see if
> >>>> I find anything.
> >>>>
> >>>>
> >>>> On Wed, Apr 3, 2013 at 4:01 PM, Jamie Johnson <jej2...@gmail.com>
> wrote:
> >>>>
> >>>>> ok, so that's not a deal breaker for me.  I just changed it to match
> the
> >>>>> shards that are auto created and it looks like things are happy.
>  I'll go
> >>>>> ahead and try my test to see if I can get things out of sync.
> >>>>>
> >>>>>
> >>>>> On Wed, Apr 3, 2013 at 3:56 PM, Mark Miller <markrmil...@gmail.com
> >wrote:
> >>>>>
> >>>>>> I had thought you could - but looking at the code recently, I don't
> >>>>>> think you can anymore. I think that's a technical limitation more
> than
> >>>>>> anything though. When these changes were made, I think support for
> that was
> >>>>>> simply not added at the time.
> >>>>>>
> >>>>>> I'm not sure exactly how straightforward it would be, but it seems
> >>>>>> doable - as it is, the overseer will preallocate shards when first
> creating
> >>>>>> the collection - that's when they get named shard(n). There would
> have to
> >>>>>> be logic to replace shard(n) with the custom shard name when the
> core
> >>>>>> actually registers.
> >>>>>>
> >>>>>> - Mark
> >>>>>>
> >>>>>> On Apr 3, 2013, at 3:42 PM, Jamie Johnson <jej2...@gmail.com>
> wrote:
> >>>>>>
> >>>>>>> answered my own question, it now says compositeId.  What is
> >>>>>> problematic
> >>>>>>> though is that in addition to my shards (which are say
> jamie-shard1)
> >>>>>> I see
> >>>>>>> the solr created shards (shard1).  I assume that these were created
> >>>>>> because
> >>>>>>> of the numShards param.  Is there no way to specify the names of
> these
> >>>>>>> shards?
> >>>>>>>
> >>>>>>>
> >>>>>>> On Wed, Apr 3, 2013 at 3:25 PM, Jamie Johnson <jej2...@gmail.com>
> >>>>>> wrote:
> >>>>>>>
> >>>>>>>> ah interesting....so I need to specify num shards, blow out zk and
> >>>>>> then
> >>>>>>>> try this again to see if things work properly now.  What is really
> >>>>>> strange
> >>>>>>>> is that for the most part things have worked right and on 4.2.1 I
> >>>>>> have
> >>>>>>>> 600,000 items indexed with no duplicates.  In any event I will
> >>>>>> specify num
> >>>>>>>> shards clear out zk and begin again.  If this works properly what
> >>>>>> should
> >>>>>>>> the router type be?
> >>>>>>>>
> >>>>>>>>
> >>>>>>>> On Wed, Apr 3, 2013 at 3:14 PM, Mark Miller <
> markrmil...@gmail.com>
> >>>>>> wrote:
> >>>>>>>>
> >>>>>>>>> If you don't specify numShards after 4.1, you get an implicit doc
> >>>>>> router
> >>>>>>>>> and it's up to you to distribute updates. In the past,
> partitioning
> >>>>>> was
> >>>>>>>>> done on the fly - but for shard splitting and perhaps other
> >>>>>> features, we
> >>>>>>>>> now divvy up the hash range up front based on numShards and store
> >>>>>> it in
> >>>>>>>>> ZooKeeper. No numShards is now how you take complete control of
> >>>>>> updates
> >>>>>>>>> yourself.
> >>>>>>>>>
> >>>>>>>>> - Mark
> >>>>>>>>>
> >>>>>>>>> On Apr 3, 2013, at 2:57 PM, Jamie Johnson <jej2...@gmail.com>
> >>>>>> wrote:
> >>>>>>>>>
> >>>>>>>>>> The router says "implicit".  I did start from a blank zk state
> but
> >>>>>>>>> perhaps
> >>>>>>>>>> I missed one of the ZkCLI commands?  One of my shards from the
> >>>>>>>>>> clusterstate.json is shown below.  What is the process that
> should
> >>>>>> be
> >>>>>>>>> done
> >>>>>>>>>> to bootstrap a cluster other than the ZkCLI commands I listed
> >>>>>> above?  My
> >>>>>>>>>> process right now is run those ZkCLI commands and then start
> solr
> >>>>>> on
> >>>>>>>>> all of
> >>>>>>>>>> the instances with a command like this
> >>>>>>>>>>
> >>>>>>>>>> java -server -Dshard=shard5 -DcoreName=shard5-core1
> >>>>>>>>>> -Dsolr.data.dir=/solr/data/shard5-core1
> >>>>>>>>> -Dcollection.configName=solr-conf
> >>>>>>>>>> -Dcollection=collection1
> >>>>>> -DzkHost=so-zoo1:2181,so-zoo2:2181,so-zoo3:2181
> >>>>>>>>>> -Djetty.port=7575 -DhostPort=7575 -jar start.jar
> >>>>>>>>>>
> >>>>>>>>>> I feel like maybe I'm missing a step.
> >>>>>>>>>>
> >>>>>>>>>> "shard5":{
> >>>>>>>>>>     "state":"active",
> >>>>>>>>>>     "replicas":{
> >>>>>>>>>>       "10.38.33.16:7575_solr_shard5-core1":{
> >>>>>>>>>>         "shard":"shard5",
> >>>>>>>>>>         "state":"active",
> >>>>>>>>>>         "core":"shard5-core1",
> >>>>>>>>>>         "collection":"collection1",
> >>>>>>>>>>         "node_name":"10.38.33.16:7575_solr",
> >>>>>>>>>>         "base_url":"http://10.38.33.16:7575/solr",
> >>>>>>>>>>         "leader":"true"},
> >>>>>>>>>>       "10.38.33.17:7577_solr_shard5-core2":{
> >>>>>>>>>>         "shard":"shard5",
> >>>>>>>>>>         "state":"recovering",
> >>>>>>>>>>         "core":"shard5-core2",
> >>>>>>>>>>         "collection":"collection1",
> >>>>>>>>>>         "node_name":"10.38.33.17:7577_solr",
> >>>>>>>>>>         "base_url":"http://10.38.33.17:7577/solr"}}}
> >>>>>>>>>>
> >>>>>>>>>>
> >>>>>>>>>> On Wed, Apr 3, 2013 at 2:40 PM, Mark Miller <
> markrmil...@gmail.com
> >>>>>>>
> >>>>>>>>> wrote:
> >>>>>>>>>>
> >>>>>>>>>>> It should be part of your clusterstate.json. Some users have
> >>>>>> reported
> >>>>>>>>>>> trouble upgrading a previous zk install when this change came.
> I
> >>>>>>>>>>> recommended manually updating the clusterstate.json to have the
> >>>>>> right
> >>>>>>>>> info,
> >>>>>>>>>>> and that seemed to work. Otherwise, I guess you have to start
> >>>>>> from a
> >>>>>>>>> clean
> >>>>>>>>>>> zk state.
> >>>>>>>>>>>
> >>>>>>>>>>> If you don't have that range information, I think there will be
> >>>>>>>>> trouble.
> >>>>>>>>>>> Do you have an router type defined in the clusterstate.json?
> >>>>>>>>>>>
> >>>>>>>>>>> - Mark
> >>>>>>>>>>>
> >>>>>>>>>>> On Apr 3, 2013, at 2:24 PM, Jamie Johnson <jej2...@gmail.com>
> >>>>>> wrote:
> >>>>>>>>>>>
> >>>>>>>>>>>> Where is this information stored in ZK?  I don't see it in the
> >>>>>> cluster
> >>>>>>>>>>>> state (or perhaps I don't understand it ;) ).
> >>>>>>>>>>>>
> >>>>>>>>>>>> Perhaps something with my process is broken.  What I do when I
> >>>>>> start
> >>>>>>>>> from
> >>>>>>>>>>>> scratch is the following
> >>>>>>>>>>>>
> >>>>>>>>>>>> ZkCLI -cmd upconfig ...
> >>>>>>>>>>>> ZkCLI -cmd linkconfig ....
> >>>>>>>>>>>>
> >>>>>>>>>>>> but I don't ever explicitly create the collection.  What
> should
> >>>>>> the
> >>>>>>>>> steps
> >>>>>>>>>>>> from scratch be?  I am moving from an unreleased snapshot of
> 4.0
> >>>>>> so I
> >>>>>>>>>>> never
> >>>>>>>>>>>> did that previously either so perhaps I did create the
> >>>>>> collection in
> >>>>>>>>> one
> >>>>>>>>>>> of
> >>>>>>>>>>>> my steps to get this working but have forgotten it along the
> way.
> >>>>>>>>>>>>
> >>>>>>>>>>>>
> >>>>>>>>>>>> On Wed, Apr 3, 2013 at 2:16 PM, Mark Miller <
> >>>>>> markrmil...@gmail.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>
> >>>>>>>>>>>>> Thanks for digging Jamie. In 4.2, hash ranges are assigned up
> >>>>>> front
> >>>>>>>>>>> when a
> >>>>>>>>>>>>> collection is created - each shard gets a range, which is
> >>>>>> stored in
> >>>>>>>>>>>>> zookeeper. You should not be able to end up with the same id
> on
> >>>>>>>>>>> different
> >>>>>>>>>>>>> shards - something very odd going on.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> Hopefully I'll have some time to try and help you reproduce.
> >>>>>> Ideally
> >>>>>>>>> we
> >>>>>>>>>>>>> can capture it in a test case.
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> - Mark
> >>>>>>>>>>>>>
> >>>>>>>>>>>>> On Apr 3, 2013, at 1:13 PM, Jamie Johnson <jej2...@gmail.com
> >
> >>>>>> wrote:
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>> no, my thought was wrong, it appears that even with the
> >>>>>> parameter
> >>>>>>>>> set I
> >>>>>>>>>>>>> am
> >>>>>>>>>>>>>> seeing this behavior.  I've been able to duplicate it on
> 4.2.0
> >>>>>> by
> >>>>>>>>>>>>> indexing
> >>>>>>>>>>>>>> 100,000 documents on 10 threads (10,000 each) when I get to
> >>>>>> 400,000
> >>>>>>>>> or
> >>>>>>>>>>>>> so.
> >>>>>>>>>>>>>> I will try this on 4.2.1. to see if I see the same behavior
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>> On Wed, Apr 3, 2013 at 12:37 PM, Jamie Johnson <
> >>>>>> jej2...@gmail.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> Since I don't have that many items in my index I exported
> all
> >>>>>> of
> >>>>>>>>> the
> >>>>>>>>>>>>> keys
> >>>>>>>>>>>>>>> for each shard and wrote a simple java program that checks
> for
> >>>>>>>>>>>>> duplicates.
> >>>>>>>>>>>>>>> I found some duplicate keys on different shards, a grep of
> the
> >>>>>>>>> files
> >>>>>>>>>>> for
> >>>>>>>>>>>>>>> the keys found does indicate that they made it to the wrong
> >>>>>> places.
> >>>>>>>>>>> If
> >>>>>>>>>>>>> you
> >>>>>>>>>>>>>>> notice documents with the same ID are on shard 3 and shard
> 5.
> >>>>>> Is
> >>>>>>>>> it
> >>>>>>>>>>>>>>> possible that the hash is being calculated taking into
> >>>>>> account only
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>> "live" nodes?  I know that we don't specify the numShards
> >>>>>> param @
> >>>>>>>>>>>>> startup
> >>>>>>>>>>>>>>> so could this be what is happening?
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> grep -c "7cd1a717-3d94-4f5d-bcb1-9d8a95ca78de" *
> >>>>>>>>>>>>>>> shard1-core1:0
> >>>>>>>>>>>>>>> shard1-core2:0
> >>>>>>>>>>>>>>> shard2-core1:0
> >>>>>>>>>>>>>>> shard2-core2:0
> >>>>>>>>>>>>>>> shard3-core1:1
> >>>>>>>>>>>>>>> shard3-core2:1
> >>>>>>>>>>>>>>> shard4-core1:0
> >>>>>>>>>>>>>>> shard4-core2:0
> >>>>>>>>>>>>>>> shard5-core1:1
> >>>>>>>>>>>>>>> shard5-core2:1
> >>>>>>>>>>>>>>> shard6-core1:0
> >>>>>>>>>>>>>>> shard6-core2:0
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>> On Wed, Apr 3, 2013 at 10:42 AM, Jamie Johnson <
> >>>>>> jej2...@gmail.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> Something interesting that I'm noticing as well, I just
> >>>>>> indexed
> >>>>>>>>>>> 300,000
> >>>>>>>>>>>>>>>> items, and some how 300,020 ended up in the index.  I
> thought
> >>>>>>>>>>> perhaps I
> >>>>>>>>>>>>>>>> messed something up so I started the indexing again and
> >>>>>> indexed
> >>>>>>>>>>> another
> >>>>>>>>>>>>>>>> 400,000 and I see 400,064 docs.  Is there a good way to
> find
> >>>>>>>>>>> possibile
> >>>>>>>>>>>>>>>> duplicates?  I had tried to facet on key (our id field)
> but
> >>>>>> that
> >>>>>>>>>>> didn't
> >>>>>>>>>>>>>>>> give me anything with more than a count of 1.
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>> On Wed, Apr 3, 2013 at 9:22 AM, Jamie Johnson <
> >>>>>> jej2...@gmail.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> Ok, so clearing the transaction log allowed things to go
> >>>>>> again.
> >>>>>>>>> I
> >>>>>>>>>>> am
> >>>>>>>>>>>>>>>>> going to clear the index and try to replicate the
> problem on
> >>>>>>>>> 4.2.0
> >>>>>>>>>>>>> and then
> >>>>>>>>>>>>>>>>> I'll try on 4.2.1
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>> On Wed, Apr 3, 2013 at 8:21 AM, Mark Miller <
> >>>>>>>>> markrmil...@gmail.com
> >>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> No, not that I know if, which is why I say we need to
> get
> >>>>>> to the
> >>>>>>>>>>>>> bottom
> >>>>>>>>>>>>>>>>>> of it.
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> - Mark
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>> On Apr 2, 2013, at 10:18 PM, Jamie Johnson <
> >>>>>> jej2...@gmail.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>> Mark
> >>>>>>>>>>>>>>>>>>> It's there a particular jira issue that you think may
> >>>>>> address
> >>>>>>>>>>> this?
> >>>>>>>>>>>>> I
> >>>>>>>>>>>>>>>>>> read
> >>>>>>>>>>>>>>>>>>> through it quickly but didn't see one that jumped out
> >>>>>>>>>>>>>>>>>>> On Apr 2, 2013 10:07 PM, "Jamie Johnson" <
> >>>>>> jej2...@gmail.com>
> >>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>> I brought the bad one down and back up and it did
> >>>>>> nothing.  I
> >>>>>>>>> can
> >>>>>>>>>>>>>>>>>> clear
> >>>>>>>>>>>>>>>>>>>> the index and try4.2.1. I will save off the logs and
> see
> >>>>>> if
> >>>>>>>>> there
> >>>>>>>>>>>>> is
> >>>>>>>>>>>>>>>>>>>> anything else odd
> >>>>>>>>>>>>>>>>>>>> On Apr 2, 2013 9:13 PM, "Mark Miller" <
> >>>>>> markrmil...@gmail.com>
> >>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> It would appear it's a bug given what you have said.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Any other exceptions would be useful. Might be best
> to
> >>>>>> start
> >>>>>>>>>>>>>>>>>> tracking in
> >>>>>>>>>>>>>>>>>>>>> a JIRA issue as well.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> To fix, I'd bring the behind node down and back
> again.
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> Unfortunately, I'm pressed for time, but we really
> need
> >>>>>> to
> >>>>>>>>> get
> >>>>>>>>>>> to
> >>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>> bottom of this and fix it, or determine if it's
> fixed in
> >>>>>>>>> 4.2.1
> >>>>>>>>>>>>>>>>>> (spreading
> >>>>>>>>>>>>>>>>>>>>> to mirrors now).
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> - Mark
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> On Apr 2, 2013, at 7:21 PM, Jamie Johnson <
> >>>>>> jej2...@gmail.com
> >>>>>>>>>>
> >>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Sorry I didn't ask the obvious question.  Is there
> >>>>>> anything
> >>>>>>>>>>> else
> >>>>>>>>>>>>>>>>>> that I
> >>>>>>>>>>>>>>>>>>>>>> should be looking for here and is this a bug?  I'd
> be
> >>>>>> happy
> >>>>>>>>> to
> >>>>>>>>>>>>>>>>>> troll
> >>>>>>>>>>>>>>>>>>>>>> through the logs further if more information is
> >>>>>> needed, just
> >>>>>>>>>>> let
> >>>>>>>>>>>>> me
> >>>>>>>>>>>>>>>>>>>>> know.
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> Also what is the most appropriate mechanism to fix
> >>>>>> this.
> >>>>>>>>> Is it
> >>>>>>>>>>>>>>>>>>>>> required to
> >>>>>>>>>>>>>>>>>>>>>> kill the index that is out of sync and let solr
> resync
> >>>>>>>>> things?
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>> On Tue, Apr 2, 2013 at 5:45 PM, Jamie Johnson <
> >>>>>>>>>>> jej2...@gmail.com
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> sorry for spamming here....
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> shard5-core2 is the instance we're having issues
> >>>>>> with...
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> Apr 2, 2013 7:27:14 PM
> >>>>>> org.apache.solr.common.SolrException
> >>>>>>>>>>> log
> >>>>>>>>>>>>>>>>>>>>>>> SEVERE: shard update error StdNode:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> http://10.38.33.17:7577/solr/dsc-shard5-core2/:org.apache.solr.common.SolrException
> >>>>>>>>>>>>>>>>>>>>> :
> >>>>>>>>>>>>>>>>>>>>>>> Server at
> >>>>>>>>>>> http://10.38.33.17:7577/solr/dsc-shard5-core2returned
> >>>>>>>>>>>>>>>>>> non
> >>>>>>>>>>>>>>>>>>>>> ok
> >>>>>>>>>>>>>>>>>>>>>>> status:503, message:Service Unavailable
> >>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:373)
> >>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> org.apache.solr.client.solrj.impl.HttpSolrServer.request(HttpSolrServer.java:181)
> >>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:332)
> >>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> org.apache.solr.update.SolrCmdDistributor$1.call(SolrCmdDistributor.java:306)
> >>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> >>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>> java.util.concurrent.FutureTask.run(FutureTask.java:138)
> >>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:439)
> >>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>> java.util.concurrent.FutureTask$Sync.innerRun(FutureTask.java:303)
> >>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>> java.util.concurrent.FutureTask.run(FutureTask.java:138)
> >>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> java.util.concurrent.ThreadPoolExecutor$Worker.runTask(ThreadPoolExecutor.java:886)
> >>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:908)
> >>>>>>>>>>>>>>>>>>>>>>> at java.lang.Thread.run(Thread.java:662)
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>> On Tue, Apr 2, 2013 at 5:43 PM, Jamie Johnson <
> >>>>>>>>>>>>> jej2...@gmail.com>
> >>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> here is another one that looks interesting
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> Apr 2, 2013 7:27:14 PM
> >>>>>>>>> org.apache.solr.common.SolrException
> >>>>>>>>>>> log
> >>>>>>>>>>>>>>>>>>>>>>>> SEVERE: org.apache.solr.common.SolrException:
> >>>>>> ClusterState
> >>>>>>>>>>> says
> >>>>>>>>>>>>>>>>>> we are
> >>>>>>>>>>>>>>>>>>>>>>>> the leader, but locally we don't think so
> >>>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> org.apache.solr.update.processor.DistributedUpdateProcessor.doDefensiveChecks(DistributedUpdateProcessor.java:293)
> >>>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> org.apache.solr.update.processor.DistributedUpdateProcessor.setupRequest(DistributedUpdateProcessor.java:228)
> >>>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> org.apache.solr.update.processor.DistributedUpdateProcessor.processAdd(DistributedUpdateProcessor.java:339)
> >>>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> org.apache.solr.update.processor.LogUpdateProcessor.processAdd(LogUpdateProcessorFactory.java:100)
> >>>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> org.apache.solr.handler.loader.XMLLoader.processUpdate(XMLLoader.java:246)
> >>>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>> org.apache.solr.handler.loader.XMLLoader.load(XMLLoader.java:173)
> >>>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> org.apache.solr.handler.UpdateRequestHandler$1.load(UpdateRequestHandler.java:92)
> >>>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> org.apache.solr.handler.ContentStreamHandlerBase.handleRequestBody(ContentStreamHandlerBase.java:74)
> >>>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> org.apache.solr.handler.RequestHandlerBase.handleRequest(RequestHandlerBase.java:135)
> >>>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>
> org.apache.solr.core.SolrCore.execute(SolrCore.java:1797)
> >>>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> org.apache.solr.servlet.SolrDispatchFilter.execute(SolrDispatchFilter.java:637)
> >>>>>>>>>>>>>>>>>>>>>>>> at
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> org.apache.solr.servlet.SolrDispatchFilter.doFilter(SolrDispatchFilter.java:343)
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>> On Tue, Apr 2, 2013 at 5:41 PM, Jamie Johnson <
> >>>>>>>>>>>>> jej2...@gmail.com
> >>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> Looking at the master it looks like at some point
> >>>>>> there
> >>>>>>>>> were
> >>>>>>>>>>>>>>>>>> shards
> >>>>>>>>>>>>>>>>>>>>> that
> >>>>>>>>>>>>>>>>>>>>>>>>> went down.  I am seeing things like what is
> below.
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> NFO: A cluster state change: WatchedEvent
> >>>>>>>>>>> state:SyncConnected
> >>>>>>>>>>>>>>>>>>>>>>>>> type:NodeChildrenChanged path:/live_nodes, has
> >>>>>> occurred -
> >>>>>>>>>>>>>>>>>>>>> updating... (live
> >>>>>>>>>>>>>>>>>>>>>>>>> nodes size: 12)
> >>>>>>>>>>>>>>>>>>>>>>>>> Apr 2, 2013 8:12:52 PM
> >>>>>>>>>>>>>>>>>> org.apache.solr.common.cloud.ZkStateReader$3
> >>>>>>>>>>>>>>>>>>>>>>>>> process
> >>>>>>>>>>>>>>>>>>>>>>>>> INFO: Updating live nodes... (9)
> >>>>>>>>>>>>>>>>>>>>>>>>> Apr 2, 2013 8:12:52 PM
> >>>>>>>>>>>>>>>>>>>>> org.apache.solr.cloud.ShardLeaderElectionContext
> >>>>>>>>>>>>>>>>>>>>>>>>> runLeaderProcess
> >>>>>>>>>>>>>>>>>>>>>>>>> INFO: Running the leader process.
> >>>>>>>>>>>>>>>>>>>>>>>>> Apr 2, 2013 8:12:52 PM
> >>>>>>>>>>>>>>>>>>>>> org.apache.solr.cloud.ShardLeaderElectionContext
> >>>>>>>>>>>>>>>>>>>>>>>>> shouldIBeLeader
> >>>>>>>>>>>>>>>>>>>>>>>>> INFO: Checking if I should try and be the leader.
> >>>>>>>>>>>>>>>>>>>>>>>>> Apr 2, 2013 8:12:52 PM
> >>>>>>>>>>>>>>>>>>>>> org.apache.solr.cloud.ShardLeaderElectionContext
> >>>>>>>>>>>>>>>>>>>>>>>>> shouldIBeLeader
> >>>>>>>>>>>>>>>>>>>>>>>>> INFO: My last published State was Active, it's
> okay
> >>>>>> to be
> >>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>> leader.
> >>>>>>>>>>>>>>>>>>>>>>>>> Apr 2, 2013 8:12:52 PM
> >>>>>>>>>>>>>>>>>>>>> org.apache.solr.cloud.ShardLeaderElectionContext
> >>>>>>>>>>>>>>>>>>>>>>>>> runLeaderProcess
> >>>>>>>>>>>>>>>>>>>>>>>>> INFO: I may be the new leader - try and sync
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>> On Tue, Apr 2, 2013 at 5:09 PM, Mark Miller <
> >>>>>>>>>>>>>>>>>> markrmil...@gmail.com
> >>>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> I don't think the versions you are thinking of
> >>>>>> apply
> >>>>>>>>> here.
> >>>>>>>>>>>>>>>>>> Peersync
> >>>>>>>>>>>>>>>>>>>>>>>>>> does not look at that - it looks at version
> >>>>>> numbers for
> >>>>>>>>>>>>>>>>>> updates in
> >>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>>>>> transaction log - it compares the last 100 of
> them
> >>>>>> on
> >>>>>>>>>>> leader
> >>>>>>>>>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>> replica.
> >>>>>>>>>>>>>>>>>>>>>>>>>> What it's saying is that the replica seems to
> have
> >>>>>>>>> versions
> >>>>>>>>>>>>>>>>>> that
> >>>>>>>>>>>>>>>>>>>>> the leader
> >>>>>>>>>>>>>>>>>>>>>>>>>> does not. Have you scanned the logs for any
> >>>>>> interesting
> >>>>>>>>>>>>>>>>>> exceptions?
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> Did the leader change during the heavy indexing?
> >>>>>> Did
> >>>>>>>>> any zk
> >>>>>>>>>>>>>>>>>> session
> >>>>>>>>>>>>>>>>>>>>>>>>>> timeouts occur?
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> - Mark
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>> On Apr 2, 2013, at 4:52 PM, Jamie Johnson <
> >>>>>>>>>>> jej2...@gmail.com
> >>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>> wrote:
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> I am currently looking at moving our Solr
> cluster
> >>>>>> to
> >>>>>>>>> 4.2
> >>>>>>>>>>> and
> >>>>>>>>>>>>>>>>>>>>> noticed a
> >>>>>>>>>>>>>>>>>>>>>>>>>>> strange issue while testing today.
>  Specifically
> >>>>>> the
> >>>>>>>>>>> replica
> >>>>>>>>>>>>>>>>>> has a
> >>>>>>>>>>>>>>>>>>>>>>>>>> higher
> >>>>>>>>>>>>>>>>>>>>>>>>>>> version than the master which is causing the
> >>>>>> index to
> >>>>>>>>> not
> >>>>>>>>>>>>>>>>>>>>> replicate.
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Because of this the replica has fewer documents
> >>>>>> than
> >>>>>>>>> the
> >>>>>>>>>>>>>>>>>> master.
> >>>>>>>>>>>>>>>>>>>>> What
> >>>>>>>>>>>>>>>>>>>>>>>>>>> could cause this and how can I resolve it
> short of
> >>>>>>>>> taking
> >>>>>>>>>>>>>>>>>> down the
> >>>>>>>>>>>>>>>>>>>>>>>>>> index
> >>>>>>>>>>>>>>>>>>>>>>>>>>> and scping the right version in?
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> MASTER:
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Last Modified:about an hour ago
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Num Docs:164880
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Max Doc:164880
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Deleted Docs:0
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Version:2387
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Segment Count:23
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> REPLICA:
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Last Modified: about an hour ago
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Num Docs:164773
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Max Doc:164773
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Deleted Docs:0
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Version:3001
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Segment Count:30
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> in the replicas log it says this:
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> INFO: Creating new http client,
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>
> config:maxConnectionsPerHost=20&maxConnections=10000&connTimeout=30000&socketTimeout=30000&retry=false
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Apr 2, 2013 8:15:06 PM
> >>>>>> org.apache.solr.update.PeerSync
> >>>>>>>>>>> sync
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> INFO: PeerSync: core=dsc-shard5-core2
> >>>>>>>>>>>>>>>>>>>>>>>>>>> url=http://10.38.33.17:7577/solrSTARTreplicas=[
> >>>>>>>>>>>>>>>>>>>>>>>>>>> http://10.38.33.16:7575/solr/dsc-shard5-core1/
> ]
> >>>>>>>>>>>>> nUpdates=100
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Apr 2, 2013 8:15:06 PM
> >>>>>> org.apache.solr.update.PeerSync
> >>>>>>>>>>>>>>>>>>>>> handleVersions
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> INFO: PeerSync: core=dsc-shard5-core2 url=
> >>>>>>>>>>>>>>>>>>>>>>>>>> http://10.38.33.17:7577/solr
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Received 100 versions from
> >>>>>>>>>>>>>>>>>> 10.38.33.16:7575/solr/dsc-shard5-core1/
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Apr 2, 2013 8:15:06 PM
> >>>>>> org.apache.solr.update.PeerSync
> >>>>>>>>>>>>>>>>>>>>> handleVersions
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> INFO: PeerSync: core=dsc-shard5-core2 url=
> >>>>>>>>>>>>>>>>>>>>>>>>>> http://10.38.33.17:7577/solr  Our
> >>>>>>>>>>>>>>>>>>>>>>>>>>> versions are newer.
> >>>>>> ourLowThreshold=1431233788792274944
> >>>>>>>>>>>>>>>>>>>>>>>>>>> otherHigh=1431233789440294912
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> Apr 2, 2013 8:15:06 PM
> >>>>>> org.apache.solr.update.PeerSync
> >>>>>>>>>>> sync
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> INFO: PeerSync: core=dsc-shard5-core2
> >>>>>>>>>>>>>>>>>>>>>>>>>>> url=http://10.38.33.17:7577/solrDONE. sync
> >>>>>> succeeded
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>> which again seems to point that it thinks it
> has a
> >>>>>>>>> newer
> >>>>>>>>>>>>>>>>>> version of
> >>>>>>>>>>>>>>>>>>>>>>>>>> the
> >>>>>>>>>>>>>>>>>>>>>>>>>>> index so it aborts.  This happened while
> having 10
> >>>>>>>>> threads
> >>>>>>>>>>>>>>>>>> indexing
> >>>>>>>>>>>>>>>>>>>>>>>>>> 10,000
> >>>>>>>>>>>>>>>>>>>>>>>>>>> items writing to a 6 shard (1 replica each)
> >>>>>> cluster.
> >>>>>>>>> Any
> >>>>>>>>>>>>>>>>>> thoughts
> >>>>>>>>>>>>>>>>>>>>> on
> >>>>>>>>>>>>>>>>>>>>>>>>>> this
> >>>>>>>>>>>>>>>>>>>>>>>>>>> or what I should look for would be appreciated.
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>>
> >>>>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>>>
> >>>>>>>>>
> >>>>>>>>>
> >>>>>>>>
> >>>>>>
> >>>>>>
> >>>>>
> >>>>
> >>>
> >
>
>
