Re: Cluster with no overseer?

2019-05-22 Thread Erick Erickson
110 isn’t all that many, well within the normal range _assuming_ that they are being processed…. When you restart Solr, every state change operation writes an operation to the work queue which can mount up. Perhaps you’re hitting: https://issues.apache.org/jira/browse/SOLR-13416? In which case

Re: Cluster with no overseer?

2019-05-22 Thread Walter Underwood
The ZK ensemble appears to be OK. It is the Solr-related stuff that is borked. There are 110 items in /overseer/collection-queue-work/, which doesn’t seem healthy. If it is really hosed, I’ll shut down all the nodes, clean out the files in Zookeeper and start over. wunder Walter Underwood wun.

Re: Cluster with no overseer?

2019-05-22 Thread Erick Erickson
Good luck, this kind of assumes that your ZK ensemble is healthy of course... > On May 22, 2019, at 8:23 AM, Walter Underwood wrote: > > Thanks, we’ll try that. Bouncing one Solr node doesn’t fix it, because we did > a rolling restart yesterday. > > wunder > Walter Underwood > wun...@wunderwoo

Re: Cluster with no overseer?

2019-05-22 Thread Walter Underwood
Thanks, we’ll try that. Bouncing one Solr node doesn’t fix it, because we did a rolling restart yesterday. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On May 22, 2019, at 8:21 AM, Erick Erickson wrote: > > Walter: > > I have no idea what the root

Re: Cluster with no overseer?

2019-05-22 Thread Erick Erickson
Walter: I have no idea what the root cause is here, this really shouldn’t happen. But the Overseer role (and I’m assuming you’re talking Solr’s Overseer) is assigned similarly to a shard leader, the same election process happens. All the election nodes are ephemeral ZK nodes. Solr’s Overseer i
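The election described above can be pictured with a toy model. This is a minimal sketch of the general ephemeral-sequential election pattern, assuming the lowest surviving sequence number wins. It is not Solr's Overseer code and it does not talk to ZooKeeper; the class name `ElectionSim` and the node names are hypothetical.

```python
# Toy, in-memory model of an election built on ephemeral sequential nodes.
# Hypothetical names throughout; no real ZooKeeper client is involved.
class ElectionSim:
    def __init__(self):
        self._next_seq = 0
        self._candidates = {}  # sequence number -> node name

    def join(self, node_name):
        """Register a candidate, as if creating an ephemeral sequential node."""
        seq = self._next_seq
        self._next_seq += 1
        self._candidates[seq] = node_name
        return seq

    def session_expired(self, seq):
        """Drop a candidate, as happens to an ephemeral node when its session ends."""
        self._candidates.pop(seq, None)

    def leader(self):
        """Assumed rule: the candidate with the lowest sequence number leads."""
        if not self._candidates:
            return None
        return self._candidates[min(self._candidates)]


sim = ElectionSim()
a = sim.join("solr-a")
b = sim.join("solr-b")
c = sim.join("solr-c")
first_leader = sim.leader()
sim.session_expired(a)  # the current leader's session goes away
second_leader = sim.leader()
print(first_leader, second_leader)
```

In this model a cluster with no leader would correspond to an empty candidate set, which is why a healthy ZK ensemble and live sessions matter for the real election.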

Re: Cluster with no overseer?

2019-05-21 Thread Will Martin
Worked with Fusion and Zookeeper at GSA for 18 months: admin role. Before blowing it away, you could try: - id a candidate node, with a snapshot you just might think is old enough to be robust. - clean data for zk nodes otherwise. - bring up the chosen node and wait for it to settle[wish i could

Re: Cluster with no overseer?

2019-05-21 Thread Walter Underwood
Yes, please. I have the logs from each of the Zookeepers. We are running 3.4.12. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On May 21, 2019, at 6:49 PM, Will Martin wrote: > > Walter. Can I cross-post to zk-dev? > > > > Will Martin > DEVOPS E

Re: Cluster with no overseer?

2019-05-21 Thread Will Martin
Walter. Can I cross-post to zk-dev? Will Martin DEVOPS ENGINEER 540.454.9565 8609 WESTWOOD CENTER DR, SUITE 475 VIENNA, VA 22182 geturgently.com On May 21, 2019, at 9:26 PM, Will Martin wrote: +1 Will Martin DEVOPS ENG

Re: Cluster with no overseer?

2019-05-21 Thread Will Martin
+1 Will Martin DEVOPS ENGINEER 540.454.9565 8609 WESTWOOD CENTER DR, SUITE 475 VIENNA, VA 22182 geturgently.com On Tue, May 21, 2019 at 7:39 PM Walter Underwood wrote: > ADDROLE times out after 180 seconds. This seems to be an unrecoverable > state for the cluster, so that is a pretty serious

Re: Cluster with no overseer?

2019-05-21 Thread Walter Underwood
ADDROLE times out after 180 seconds. This seems to be an unrecoverable state for the cluster, so that is a pretty serious bug. wunder Walter Underwood wun...@wunderwood.org http://observer.wunderwood.org/ (my blog) > On May 21, 2019, at 4:10 PM, Walter Underwood wrote: > > We have a 6.6.2 clu

Re: Cluster without sharding

2017-07-12 Thread Erick Erickson
1> I would not do this. First, there are the lock issues you mentioned. But let's say replica1 is your indexer and replicas 2 and 3 point to the same index. When replica1 commits, how do replicas 2 and 3 know to open a new searcher? <2> and <3> just seem like variants of coupling Solr instances to col

Re: Cluster

2016-04-07 Thread Tamás Barta
No, I have two (now three) external ZooKeeper servers. We have two Solr nodes and two JBoss nodes, each of them on a different machine. How many ZooKeeper servers should I start, and on which machines, for better performance and better HA? On Apr 7, 2016, at 6:00 PM, "Erick Erickson" wrote: > Hmmm, perh

Re: Cluster

2016-04-07 Thread Erick Erickson
Hmmm, perhaps you're running embedded Zookeeper? You should _not_ have to run a third Zookeeper, but if you're only running two Zookeepers _both_ must be up for SolrCloud to function. I often run with exactly one _external_ zookeeper when developing locally. So what it sounds like is that you're run
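The point about two Zookeepers both having to be up follows from majority quorum arithmetic. A minimal sketch, assuming the usual strict-majority rule; the function names are illustrative and not part of any ZooKeeper API:

```python
def quorum_size(ensemble_size):
    """Smallest number of servers that forms a strict majority."""
    return ensemble_size // 2 + 1


def tolerated_failures(ensemble_size):
    """How many servers can be down while a majority is still reachable."""
    return ensemble_size - quorum_size(ensemble_size)


# Print ensemble size, quorum size and tolerated failures for small ensembles.
for n in (1, 2, 3, 4, 5):
    print(n, quorum_size(n), tolerated_failures(n))
```

Under this rule an ensemble of two tolerates no failures, and an ensemble of four tolerates no more than an ensemble of three, which is the reasoning behind the advice to run an odd number of servers.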

Re: Cluster

2016-04-07 Thread Shawn Heisey
On 4/7/2016 5:04 AM, Tamás Barta wrote: > Thanks, as soon as I set up a third ZK server, it works like a charm. The advice to run three servers minimum really only applies to zookeeper. You can run two Solr servers in your cloud and have no problems with redundancy, although if you do not have cl

Re: Cluster

2016-04-07 Thread Tamás Barta
you have just 2, so quorum is not reached for leader > selection. That's why it is advised to run an odd number of nodes, and the minimum > is 3. Hope that helps > > -Original Message- > From: "Tamás Barta" > Sent: 07-04-2016 04:17 PM > To: "solr-user@lu

RE: Cluster

2016-04-07 Thread prabhat singh
Hope that helps -Original Message- From: "Tamás Barta" Sent: 07-04-2016 04:17 PM To: "solr-user@lucene.apache.org" Subject: Re: Cluster Maybe I should instead ask how to set up a two-node Solr cluster where both nodes contain the same data (collecti

Re: Cluster

2016-04-07 Thread Tamás Barta
Maybe I should instead ask how to set up a two-node Solr cluster where both nodes contain the same data (collections are replicated to the other), so that if I shut down one of the nodes the other keeps working and I can send updates and queries to it. Is it possible? On Thu, Apr 7, 2016 a

Re: Cluster down for long time after zookeeper disconnection

2015-08-11 Thread danny teichthal
1. Erick, thanks. I agree that it is really serious, but I think that the 3 minutes in this case were not mandatory. In my case it was a deadlock, which smells like some kind of bug. One replica is waiting for the other to come up before it takes leadership, while the other is waiting for the election

Re: Cluster down for long time after zookeeper disconnection

2015-08-10 Thread Erick Erickson
Not that I know of. With ZK as the "one source of truth", dropping below quorum is Really Serious, so having to wait 3 minutes or so for action to be taken is the fallback. Best, Erick On Mon, Aug 10, 2015 at 1:34 PM, danny teichthal wrote: > Erick, I assume you are referring to zkClientTimeout,

Re: Cluster down for long time after zookeeper disconnection

2015-08-10 Thread danny teichthal
Erick, I assume you are referring to zkClientTimeout, it is set to 30 seconds. I also see these messages on Solr side: "Client session timed out, have not heard from server in 48865ms for sessionid 0x44efbb91b5f0001, closing socket connection and attempting reconnect". So, I'm not sure what was th

Re: Cluster down for long time after zookeeper disconnection

2015-08-10 Thread Erick Erickson
I didn't see the zk timeout you set (just skimmed). But if your Zookeeper was down _very_ temporarily, it may suffice to up the ZK timeout. The default in the 10.4 time-frame (if I remember correctly) was 15 seconds which has proven to be too short in many circumstances. Of course if your ZK was

Re: Cluster down for long time after zookeeper disconnection

2015-08-10 Thread danny teichthal
Hi Alexandre, Thanks for your reply. I looked at the release notes. There is one bug fix - SOLR-7503 – register cores asynchronously. It may reduce the registration time since it is done in parallel, but still, 3 minutes (leaderVoteWait) is a long

Re: Cluster down for long time after zookeeper disconnection

2015-08-10 Thread Alexandre Rafalovitch
Did you look at release notes for Solr versions after your own? I am pretty sure some similar things were identified and/or resolved for 5.x. It may not help if you cannot migrate, but would at least give a confirmation and maybe workaround on what you are facing. Regards, Alex. Solr Anal

Re: Cluster state ranges are all null after reboot

2014-03-02 Thread Greg Pendlebury
Thanks again for the info. Hopefully we find some more clues if it continues to occur. The ops team are looking at alternative deployment methods as well, so we might end up avoiding the issue altogether. Ta, Greg On 28 February 2014 02:42, Shalin Shekhar Mangar wrote: > I think it is just a si

Re: Cluster state ranges are all null after reboot

2014-02-27 Thread Shalin Shekhar Mangar
I think it is just a side-effect of the current implementation that the ranges are assigned linearly. You can also verify this by choosing a document from each shard and running its uniqueKey against the CompositeIdRouter's sliceHash method and verifying that it is included in the range. I couldn

Re: Cluster state ranges are all null after reboot

2014-02-26 Thread Greg Pendlebury
Thanks Shalin, that code might be helpful... do you know if there is a reliable way to line up the ranges with the shard numbers? When the problem occurred we had 80 million documents already in the index, and could not issue even a basic 'deleteById' call. I'm tempted to assume they are just assig

Re: Cluster state ranges are all null after reboot

2014-02-26 Thread Shalin Shekhar Mangar
If you have 15 shards and assuming that you've never used shard splitting, you can calculate the shard ranges by using new CompositeIdRouter().partitionRange(15, new CompositeIdRouter().fullRange()) This gives me: [8000-9110, 9111-a221, a222-b332, b333-c443, c444000
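The idea of dividing the hash space into 15 contiguous ranges can be sketched as follows. This is a simplified illustration that splits the signed 32-bit integer range evenly; it is not a port of CompositeIdRouter.partitionRange, which may place boundaries differently (for example by rounding), so the printed values are not guaranteed to match a live cluster. The names `even_partition` and `to_hex` are hypothetical.

```python
INT_MIN = -2**31      # lowest signed 32-bit hash value
INT_MAX = 2**31 - 1   # highest signed 32-bit hash value


def even_partition(partitions, lo=INT_MIN, hi=INT_MAX):
    """Split [lo, hi] into contiguous, roughly equal inclusive ranges (simplified)."""
    if partitions <= 0:
        return []
    step = max(1, (hi - lo) // partitions)
    ranges = []
    start = lo
    for i in range(partitions):
        # The last range always ends exactly at hi so nothing is left uncovered.
        end = hi if i == partitions - 1 else start + step
        ranges.append((start, end))
        start = end + 1
    return ranges


def to_hex(value):
    """Render a signed 32-bit value as 8 hex digits, two's complement style."""
    return format(value & 0xFFFFFFFF, "08x")


for start, end in even_partition(15):
    print(f"{to_hex(start)}-{to_hex(end)}")
```

A sketch like this is only useful for reasoning about the shape of the ranges; to repair a real cluster state, the ranges should come from the Solr method quoted above.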

Re: Cluster Resizing question

2012-01-25 Thread Jamie Johnson
I think I need to provide a few more details here. I need the ability to add a shard to the cluster. In doing this, I'd like to split an existing index and spin up this new shard with 1/2 (or thereabouts) of it, and allow the original to continue serving the pieces it has now. In our application

Re: Cluster Resizing question

2012-01-25 Thread Jamie Johnson
Thanks Otis. I have been following the SolrCloud development, but I was wondering specifically about elastically expanding the cloud by adding shards. I'm following the distributed indexing JIRA, but I'm having difficulty finding a JIRA which specifically references the issues with elasticity. A

Re: Cluster Resizing question

2012-01-25 Thread Otis Gospodnetic
Jamie, depending on how quickly you need this, it may be better to follow SolrCloud development because cluster resizing will work differently there. Otis  Performance Monitoring SaaS for Solr - http://sematext.com/spm/solr-performance-monitoring/index.html >_