ZooKeeper hanging? If it were truly unresponsive, I would think your entire SolrCloud would be down. You could test this by, say, creating a new collection and seeing whether it goes live; if ZooKeeper is truly unresponsive, that will fail.
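A lighter-weight check than creating a collection is ZooKeeper's four-letter `ruok` command, which a healthy server answers with `imok`. Here is a minimal sketch in Python, assuming the server accepts four-letter commands (newer ZooKeeper releases require them to be whitelisted). The function name `zk_is_ok` is made up for illustration. Note that `imok` only says the server process is up and not in an error state; it says nothing about how quickly it serves requests.

```python
import socket


def zk_is_ok(host, port, timeout=5.0):
    """Send ZooKeeper's 'ruok' command and report whether the reply is 'imok'."""
    try:
        with socket.create_connection((host, port), timeout=timeout) as sock:
            sock.sendall(b"ruok")
            chunks = []
            # The server closes the connection after replying, so read until EOF.
            while True:
                data = sock.recv(64)
                if not data:
                    break
                chunks.append(data)
    except OSError:
        # Connection refused, timed out, or reset: treat as "not OK".
        return False
    return b"".join(chunks) == b"imok"


# Example call, using the connect string from the log in this thread:
#   zk_is_ok("192.168.5.227", 9983)
```

Running this in a loop while the commit appears stuck would show whether ZooKeeper itself stops answering during that window.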
Are you sure it's not just the merging that's going on as part of MRIT?

Best,
Erick

On Thu, Jun 2, 2016 at 11:37 AM, Jordan Drake <jordan.dr...@exterro.com> wrote:
> Hi all,
>
> We are in the process of streamlining our indexing process and trying to
> increase some performance. We came across an issue where zookeeper seems to
> hang for 10+ minutes (we've seen it as high as 40 min) after committing.
> See the portion of the logs below.
>
> Our indexing is being done using the MapReduceIndexerTool with the go-live
> option to merge into our live Solr.
> The creation of the segments in mapreduce is fairly quick, and the merge is
> usually fast. It's just that we occasionally see this issue in one of our
> environments.
>
> I'm not sure whether this is a Zookeeper or Solr issue or if this is just
> expected behavior. Any ideas on where to look for debugging?
>
>
>
> 16/06/02 09:03:06 INFO hadoop.MapReduceIndexerTool: Indexing 1 files using 1 real mappers into 1 reducers
> 16/06/02 09:04:08 INFO hadoop.MapReduceIndexerTool: Done. Indexing 1 files using 1 real mappers into 1 reducers took 2.06103613E10 secs
> 16/06/02 09:04:08 INFO hadoop.GoLive: Live merging of output shards into Solr cluster...
> 16/06/02 09:04:08 INFO hadoop.GoLive: Live merge hdfs://192.168.5.228:8020/indexed/tmp/e2e/2223/results/part-00000 into http://192.168.5.227:8983/solr
> 16/06/02 09:04:22 INFO hadoop.GoLive: Committing live merge...
> 16/06/02 09:04:22 INFO zookeeper.ZooKeeper: Initiating client connection, connectString=192.168.5.227:9983 sessionTimeout=10000 watcher=org.apache.solr.common.cloud.ConnectionManager@1deca477
> 16/06/02 09:04:22 INFO cloud.ConnectionManager: Waiting for client to connect to ZooKeeper
> 16/06/02 09:04:22 INFO zookeeper.ClientCnxn: Opening socket connection to server 192.168.5.227/192.168.5.227:9983. Will not attempt to authenticate using SASL (unknown error)
> 16/06/02 09:04:22 INFO zookeeper.ClientCnxn: Socket connection established to 192.168.5.227/192.168.5.227:9983, initiating session
> 16/06/02 09:04:22 INFO zookeeper.ClientCnxn: Session establishment complete on server 192.168.5.227/192.168.5.227:9983, sessionid = 0x154e9ea749c028f, negotiated timeout = 10000
> 16/06/02 09:04:22 INFO cloud.ConnectionManager: Watcher org.apache.solr.common.cloud.ConnectionManager@1deca477 name:ZooKeeperConnection Watcher:192.168.5.227:9983 got event WatchedEvent state:SyncConnected type:None path:null path:null type:None
> 16/06/02 09:04:22 INFO cloud.ConnectionManager: Client is connected to ZooKeeper
> *16/06/02 09:04:22 INFO cloud.ZkStateReader: Updating cluster state from ZooKeeper...
> 16/06/02 09:18:17 INFO zookeeper.ZooKeeper: Session: 0x154e9ea749c028f closed*
> 16/06/02 09:18:17 INFO zookeeper.ClientCnxn: EventThread shut down
> 16/06/02 09:18:17 INFO hadoop.GoLive: Done committing live merge
> 16/06/02 09:18:17 INFO hadoop.GoLive: Live merging of index shards into Solr cluster took 2.83196359E11 secs
> 16/06/02 09:18:17 INFO hadoop.GoLive: Live merging completed successfully
> 16/06/02 09:18:17 INFO hadoop.MapReduceIndexerTool: Succeeded with job: jobName: org.apache.solr.hadoop.MapReduceIndexerTool/MorphlineMapper, jobId: job_1464681461364_0604
> 16/06/02 09:18:17 INFO hadoop.MapReduceIndexerTool: Success. Done. Program took 3.04902275E11 secs. Goodbye.
>
>
>
> Thanks,
> Jordan Drake
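
To pin down where the time goes in a log like the one quoted above, it can help to compute the gap between consecutive timestamps rather than eyeball it. The following is a rough sketch in Python, assuming the `yy/MM/dd HH:mm:ss LEVEL logger:` prefix seen in that log. The names `parse_events` and `largest_gap` are made up for illustration, and `sample` is a three-line excerpt copied from the log.

```python
import re
from datetime import datetime

# Matches the "16/06/02 09:04:22 INFO cloud.ZkStateReader:" prefix of each entry.
ENTRY = re.compile(r"(\d{2}/\d{2}/\d{2} \d{2}:\d{2}:\d{2}) (\w+) (\S+):")


def parse_events(text):
    """Return (timestamp, logger) pairs in order of appearance."""
    return [
        (datetime.strptime(m.group(1), "%y/%m/%d %H:%M:%S"), m.group(3))
        for m in ENTRY.finditer(text)
    ]


def largest_gap(text):
    """Return (seconds, logger_before, logger_after) for the longest pause."""
    events = parse_events(text)
    if len(events) < 2:
        return None
    before, after = max(
        zip(events, events[1:]),
        key=lambda pair: pair[1][0] - pair[0][0],
    )
    return (after[0] - before[0]).total_seconds(), before[1], after[1]


sample = """\
16/06/02 09:04:22 INFO cloud.ConnectionManager: Client is connected to ZooKeeper
16/06/02 09:04:22 INFO cloud.ZkStateReader: Updating cluster state from ZooKeeper...
16/06/02 09:18:17 INFO zookeeper.ZooKeeper: Session: 0x154e9ea749c028f closed
"""

print(largest_gap(sample))
# prints (835.0, 'cloud.ZkStateReader', 'zookeeper.ZooKeeper')
```

On this excerpt the pause is 835 seconds (just under 14 minutes), and it falls between the cluster-state update and the session close. The tool logs nothing in that window, so the log alone does not say whether the time is spent in ZooKeeper or in the commit on the Solr side.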