On 6/20/2018 6:39 AM, Satya Marivada wrote:
Yes, there are some other errors saying that javabin character 2 was
expected but 60, which is "<", was returned.
This happens when the response is an error. Error responses are sent in
HTML format (so they render properly when viewed in a browser),
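
To illustrate the "expected 2 but got 60" symptom described above, here is a minimal sketch that classifies a response body by its first byte. The values 2 and 60 come from the error quoted in this thread; the function name and the returned labels are hypothetical and are not part of Solr or SolrJ.

```python
# Byte values taken from the error discussed in this thread.
JAVABIN_MARKER = 2        # first byte the client expected
MARKUP_MARKER = ord("<")  # 60, the byte that actually arrived


def classify_response_start(body: bytes) -> str:
    """Guess the payload type from the first byte (hypothetical helper)."""
    if not body:
        return "empty"
    first = body[0]
    if first == JAVABIN_MARKER:
        return "javabin"
    if first == MARKUP_MARKER:
        # Most likely an HTML error page rather than a normal response.
        return "markup"
    return "unknown"
```

When the result is "markup", reading the body as text usually reveals the real error behind the javabin complaint.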
Having time drift longer than the TTL would definitely cause these types of
problems.
In our case, the clusters are time-synchronized and the error is still
encountered periodically.
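
The drift-versus-TTL condition can be sketched as below, assuming you have already collected one wall-clock reading per node at roughly the same moment. The node names, the readings, and the 5-second TTL are illustrative assumptions, not values taken from Solr or from this cluster.

```python
from typing import Mapping


def max_clock_skew(node_times: Mapping[str, float]) -> float:
    """Largest difference, in seconds, between any two node clocks."""
    readings = list(node_times.values())
    return max(readings) - min(readings)


def skew_exceeds_ttl(node_times: Mapping[str, float], ttl_seconds: float) -> bool:
    """True when the worst-case skew is longer than the TTL."""
    return max_clock_skew(node_times) > ttl_seconds


# Hypothetical epoch-second readings from four VMs, about 10 seconds apart.
sample = {"vm1": 1000.0, "vm2": 1008.0, "vm3": 1009.5, "vm4": 1010.0}
ASSUMED_TTL_SECONDS = 5.0  # illustrative value only

print(max_clock_skew(sample))
print(skew_exceeds_ttl(sample, ASSUMED_TTL_SECONDS))
```

A skew well under the TTL, as reported for the synchronized clusters here, suggests the periodic errors have a different cause.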
On Wed, Jun 20, 2018 at 10:07 AM Erick Erickson
wrote:
We've seen this exact issue when the times reported by various
machines have different wall-clock times, so getting these times
coordinated is definitely the very first thing I'd do.
It's particularly annoying because if the clocks are drifting apart
gradually, your setup can be running fine for d
Chris,
You are spot on with the timestamps. The date command returns different
times on these VMs, and they are not in sync with NTP. ntpstat reports a
difference of about 8-10 seconds across the 4 VMs, which would have caused
these synchronization issues and marked the replicas as down. This just happened
Satya,
There should be some other log messages that are probably relevant to the
issue you are having. Something along the lines of "leader cannot
communicate with follower...publishing replica as down." It's likely there
also is a message of "expecting json/xml but got html" in another
instance's
Hi, we are using Solr 6.3.0, and a collection has 3 of its 4 replicas down;
1 is up and serving.
I see a single-line error repeating in the logs, as below, with no other
specific exception apart from it. I am wondering what the message below is
saying and whether it is the cause of the nodes being down, but saw that this ha