So Many Zookeeper Warnings--There Must Be a Problem

2019-01-03 Thread Joe Lerner
Hi,

We have a simple architecture: 2 SOLR Cloud servers (on servers #1 and #2),
and 3 zookeeper instances (on servers #1, #2, and #3). Things work fine
(although we had a couple of brief unexplained outages), but:

One worrisome thing is that when I check zookeeper status on #1 and #2, I get
Mode=Leader on both--#3 shows follower. This seems to be a pretty permanent
condition, at least right now as I look at it. And there isn't any big
maintenance or anything going on.
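For what it's worth, this is roughly how I'm checking each node, as a sketch (the `srvr` four-letter command is standard ZooKeeper; hostnames and ports are placeholders for whatever your ensemble uses):

```python
# Ask a ZooKeeper node for its role via the standard "srvr" four-letter
# command and pull out the "Mode:" line.
import socket

def zk_mode(host, port=2181, timeout=3.0):
    """Return 'leader', 'follower', or 'standalone' as reported by srvr."""
    with socket.create_connection((host, port), timeout=timeout) as s:
        s.sendall(b"srvr")
        s.shutdown(socket.SHUT_WR)      # tell the server we're done sending
        reply = b""
        while chunk := s.recv(4096):    # read the full reply until EOF
            reply += chunk
    for line in reply.decode().splitlines():
        if line.startswith("Mode:"):
            return line.split(":", 1)[1].strip()
    return "unknown"
```

In a healthy 3-node ensemble, exactly one node should report leader and the other two follower.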

Also, we are getting *TONS* of continuous log warnings from our client
applications. From one server it shows this:



And from another server we get this:


These are making our logs impossible to read, but worse, I assume they
indicate that something is wrong.

Thanks for any help!

Joe Lerner



--
Sent from: http://lucene.472066.n3.nabble.com/Solr-User-f472068.html


Re: So Many Zookeeper Warnings--There Must Be a Problem

2019-01-03 Thread Joe Lerner
Hi Scott,

First, we are definitely misconfigured for the myid thing. Basically two of
them were identifying as ID #2, and they are the two ZKs claiming to be the
leader. Definitely something to straighten out!

Our 3 lines in zoo.cfg look correct, except they look like this:

clientPort=2181

server.1=host1:2190:2195 
server.2=host2:2191:2196 
server.3=host3:2192:2197

Notice the port range, and the overlap...

Is that...copacetic?
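And for the myid mess above, a quick sanity check as a sketch (hostnames are placeholders; the point is that each host's myid must be distinct and must match a server.N line in zoo.cfg):

```python
from collections import Counter

def check_myids(server_ids, myid_by_host):
    """Compare zoo.cfg's server.N ids against the myid each host reports.

    Returns (hosts whose id collides with another host's, ids with no owner).
    """
    counts = Counter(myid_by_host.values())
    dupes = {h: i for h, i in myid_by_host.items() if counts[i] > 1}
    unclaimed = set(server_ids) - set(myid_by_host.values())
    return dupes, unclaimed

# Our broken state: two nodes both claim id 2, so id 3 has no owner.
dupes, unclaimed = check_myids({1, 2, 3}, {"host1": 1, "host2": 2, "host3": 2})
```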

Thanks!

Joe 





Re: So Many Zookeeper Warnings--There Must Be a Problem

2019-01-04 Thread Joe Lerner
wrt, "You'll probably have to delete the contents of the zk data directory
and rebuild your collections."

Rebuild my *SOLR* collections? That's easy enough for us. 

If this is how we're incorrectly configured now:

server #1 = myid#1
server #2 = myid#2
server #3 = myid#2

My plan would be to do the following, while users are still online (it's a
big [bad] deal if we need to take search offline):

1. Take zk #3 down.
2. Fix zk #3 by deleting the contents of the zk data directory and assigning
it myid #3
3. Bring zk#3 back up
4. Do a full re-build of all collections
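Step 2 would be roughly this on server #3 (a sketch only; the paths and the zkServer.sh wrapper are assumptions from a typical tarball install, and the data directory is whatever dataDir in zoo.cfg says):

```
zkServer.sh stop
rm -rf /var/lib/zookeeper/version-2    # stale snapshots and transaction logs
echo 3 > /var/lib/zookeeper/myid       # must match the server.3 line in zoo.cfg
zkServer.sh start
zkServer.sh status                     # expect Mode: follower
```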

Thanks!

Joe





Continuous Zookeeper Client Warnings

2019-01-04 Thread Joe Lerner
[MYAPP-WEB] 2019-01-03 14:19:49,912 WARN [org.apache.zookeeper.ClientCnxn] -
java.lang.NoClassDefFoundError: org/apache/zookeeper/Login
	at org.apache.zookeeper.client.ZooKeeperSaslClient.createSaslClient(ZooKeeperSaslClient.java:216)
	at org.apache.zookeeper.client.ZooKeeperSaslClient.<init>(ZooKeeperSaslClient.java:119)
	at org.apache.zookeeper.ClientCnxn$SendThread.startConnect(ClientCnxn.java:1011)
	at org.apache.zookeeper.ClientCnxn$SendThread.run(ClientCnxn.java:1063)
[MYAPP-WEB] 2019-01-03 14:19:50,977 WARN [org.apache.zookeeper.ClientCnxn] -
java.lang.NoClassDefFoundError: org/apache/zookeeper/Login
	[same stack trace as above]
[MYAPP-WEB] 2019-01-03 14:19:51,233 WARN [org.apache.zookeeper.ClientCnxn] -
java.lang.NoClassDefFoundError: org/apache/zookeeper/Login
	[same stack trace as above]





These are making our application logs impossible to read, and I assume they
indicate that something is wrong.
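In the meantime, the flood can at least be capped with a logger override. This is a sketch assuming log4j 1.x properties configuration (adjust to whatever logging framework the application uses):

```
# Raise the threshold for just the noisy ZooKeeper client class;
# anything at ERROR and above still gets through.
log4j.logger.org.apache.zookeeper.ClientCnxn=ERROR
```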

Thanks for any help! 

Joe Lerner 





Warnings in Zookeeper Server Logs

2019-01-04 Thread Joe Lerner
Hi (yes again):

We have a simple architecture: 2 SOLR Cloud servers (on servers #1 and #2),
and 3 zookeeper instances (on servers #1, #2, and #3). Things appear to work
fine, and I have confirmed that our basic configuration is correct. But we
are seeing TONS of the following warnings in all of our zookeeper server
logs:

2019-01-04 14:48:04,266 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxnFactory@192] - Accepted socket connection from /XXX.YY.ZZZ.46:51516
2019-01-04 14:48:04,266 [myid:1] - WARN [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@368] - caught end of stream exception
EndOfStreamException: Unable to read additional data from client sessionid 0x0, likely client has closed socket
	at org.apache.zookeeper.server.NIOServerCnxn.doIO(NIOServerCnxn.java:239)
	at org.apache.zookeeper.server.NIOServerCnxnFactory.run(NIOServerCnxnFactory.java:203)
	at java.lang.Thread.run(Thread.java:748)
2019-01-04 14:48:04,266 [myid:1] - INFO [NIOServerCxn.Factory:0.0.0.0/0.0.0.0:2181:NIOServerCnxn@1044] - Closed socket connection for client /XXX.YY.ZZZ.46:51516 (no session established for client)


These messages seem to correspond to similar messages we are seeing in the
application client-side logs. (I don’t see any messages that would indicate
Too many connections.)

Reading the log content, it seems to be saying that a connection is
accepted, but then there is an "end of stream" exception. But our users are
not experiencing any problems--they are searching SOLR like crazy.
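To convince myself of the "likely client has closed socket" theory, this little sketch reproduces the pattern: a plain TCP connect that closes without sending anything, which is what many port-level health checks and load balancers do (host and port are placeholders):

```python
# Connect to a TCP port and close immediately without sending any bytes.
# Against a ZooKeeper server this produces exactly the "caught end of
# stream exception ... client has closed socket" WARN: the server accepts
# the connection, then reads EOF before any handshake bytes arrive.
import socket

def bare_probe(host, port, timeout=2.0):
    """Open and immediately close a TCP connection, sending nothing."""
    with socket.create_connection((host, port), timeout=timeout):
        pass  # the with-block exit closes the socket
```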

Any suggestions?

Thanks!

Joe







Re: Warnings in Zookeeper Server Logs

2019-01-21 Thread Joe Lerner
After a long slog, I am now able to answer my own question, just in case
anybody is listening. 

We determined that these errors start when we deploy our application to
Tomcat using the Tomcat deploy service, which happens when we deploy with
Jenkins and Ansible. Conversely, if we re-start Tomcat from scratch, the
errors go away. Nothing else we tried (and we tried a lot) worked. Our guess
is that the Zookeeper libraries we build into our application do something
that does not go away, even when the application is re-deployed.

This isn't a great answer from us, as we use Ansible to deploy our
application to production, and we use Jenkins to continuously deploy in
development. But, it is what it is, and at least our logs are readable now.

Joe 





Re: Cannot Figure out Reason for Persistent Zookeeper Warning

2019-02-06 Thread Joe Lerner
Our application runs on Tomcat. We found that when we deploy to Tomcat using
Jenkins or Ansible--a "hot" deployment--the ZK log problem starts. The only
solution we've been able to find was to bounce Tomcat.





Schema Change for Solr 7.4

2018-08-03 Thread Joe Lerner
We recently set up Solr 7.4 in Production. There are 2 Solr nodes, with 3
zookeepers. We need to make a schema change. What I want to do is simply
push the updated schema to Solr, and then re-index all the content to pick
up the change. But I am being told that I need to:

1.  Delete the collection that depends on this config-set.
2.  Reload the config-set
3.  Recreate the dependent collection

It seems to me that between steps #1 and #3, users will not be able to
search, which is not cool.

Can I avoid the outage to my search capability?
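For reference, the "just push the updated schema" path I have in mind would be something like this (configset and collection names, hosts, and paths are placeholders, and I have not confirmed this avoids the rebuild):

```
# Push the updated configset (including the schema) to ZooKeeper...
bin/solr zk upconfig -z zk1:2181,zk2:2181,zk3:2181 -n myconfig -d /path/to/conf
# ...reload the collection so every core picks up the new config...
curl "http://solr1:8983/solr/admin/collections?action=RELOAD&name=mycollection"
# ...then re-index all content so documents reflect the schema change.
```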

Thanks!

Joe





Re: Schema Change for Solr 7.4

2018-08-03 Thread Joe Lerner
OK--yes, I can see how that would work. But it would require some quick
infrastructure flexibility that, at least to this point, we don't really
have.

Joe





Solr Migration to The AWS Cloud

2019-06-05 Thread Joe Lerner
Hi,

Our application is migrating from on-premise to AWS. We are currently on
Solr Cloud 7.3.0.

We are interested in exploring ways to do this with minimal down-time, as
in, maybe one hour.

One strategy would be to set up a new empty Solr Cloud instance in AWS and
reindex the world. But reindexing takes us around 14 hours, so that is not a
viable approach.

I think one very attractive option would be to set up a new live
node/replica in AWS, and, once it replicates, we're essentially
done--literally zero down time (for search anyway). But I don't think we're
going to be able to do that from a networking/security perspective.

From what I've seen, the other option is to copy the Solr index files to
AWS, and somehow use them to set up a new pre-indexed instance. Do I need to
shut down my application and Solr on prem before I copy the files, or can I
copy while things are active?

If I can do the copy while the application is running, I can probably:

1. Copy files to AWS Friday at noon
2. Keep a record of what got re-indexed after Friday at noon (or, heck,
11:45am)
3. Start up the new Solr in AWS against the copied files
4. Reindex the stuff that got re-indexed after Friday at noon

Is there a cleaner/simpler/more official way of moving an index from one
place to another? Export/import, or something like that?
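The closest thing I have found to an official export/import is the Collections API backup/restore pair (this assumes Solr 7.x, where both actions exist; hosts, names, and the backup location are placeholders, and the location must be accessible to the relevant nodes):

```
# On the on-prem cluster: snapshot the collection to a shared location.
curl "http://solr-onprem:8983/solr/admin/collections?action=BACKUP&name=snap1&collection=mycoll&location=/mnt/backups"
# On the AWS cluster: restore that snapshot into a new collection.
curl "http://solr-aws:8983/solr/admin/collections?action=RESTORE&name=snap1&collection=mycoll&location=/mnt/backups"
```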

Thanks for any help!

Joe






Re: Solr Migration to The AWS Cloud

2019-06-06 Thread Joe Lerner
Ooohh...interesting. Then, presumably there is some way to have what was the
cross-data-center replica become the new "primary"? 

It's getting too easy!

Joe





Re: Error Adding a Replica to SOLR Cloud 8.2.0

2021-01-26 Thread Joe Lerner
We finally got this fixed by temporarily disabling any updates to the SOLR
index. 





Re: Dynamic starting or stoping of zookeepers in a cluster

2021-02-19 Thread Joe Lerner
This is solid information. *How about the application, which uses
SOLR/Zookeeper?*

Do we have to follow this guidance, to make the application ZK config aware:

https://zookeeper.apache.org/doc/r3.5.5/zookeeperReconfig.html#ch_reconfig_rebalancing

  

Or could we leave it as is, as long as the ZK ensemble keeps the same IPs?

Thanks!

Joe



