Thanks, Erick, that's exactly the clarification/confirmation I was looking for!
Greg
Right, it's a little arcane. But the lockup is because the
various leaders send documents to each other and wait
for returns. If there are a _lot_ of incoming packets to
various leaders, it can generate a distributed deadlock.
So the shuffling you refer to is the root of the issue.
If the leader
Erick,
I've read over SOLR-4816 after finding your comment about the server-side
stack traces showing threads locked up over semaphores, and I'm curious how
that issue cures the problem on the server side, as the patch only includes
client-side changes. Do the servers get so tied up shuffling docume
Hi,
My cluster hung again while running an update process; the HTTP POST request was
aborted because of a timeout error. After the hang, I couldn't do any more updates
without restarting the cluster.
I could see this error in the node's log after killing it. It's as if Solr waits for
the update response forever …
Did you take a stack trace of your _server_ and see if the
fragment I posted is the place a bunch of threads are
stuck? If so, then it's what I mentioned, and the patch
I pointed to should fix it up (when it's ready)...
The fact that it hangs more frequently with replication > 1
is consistent with
Shawn:
replicationFactor higher than one, yes.
--
Yago Riveiro
On Sunday, June 2, 2013 at 4:07 PM, Shawn Heisey wrote:
> On 6/2/2013 8:28 AM, Yago Riveiro wrote:
> > Erick:
> >
> > In my case, when the server hangs, no exception is thrown,
On 6/2/2013 8:28 AM, Yago Riveiro wrote:
> Erick:
>
> In my case, when the server hangs, no exception is thrown; the logs on both
> servers stop registering the update INFO messages. If I shut down one node,
> the log of the live node immediately registers some update INFO messages that
> appears wa
Erick:
In my case, when the server hangs, no exception is thrown; the logs on both servers
stop registering the update INFO messages. If I shut down one node, the log of the
live node immediately registers some update INFO messages that appear to have been
stuck somewhere in the update operation.
Other
Yago:
Batches of 100k docs at a time are pretty big; you're way past the
point of diminishing returns. I rarely go over 1,000. That said, reducing
the batch size might be a work-around, perhaps all the way down to one.
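For what it's worth, here is a rough sketch of what smaller batches could look like on the client side. It is only an illustration: `send_batch` is a hypothetical callable standing in for whatever actually posts documents to Solr (it is not a SolrJ or Solr API), and the default of 1,000 simply mirrors the figure above.

```python
from itertools import islice
from typing import Callable, Iterable, Iterator, List, TypeVar

T = TypeVar("T")


def batched(docs: Iterable[T], batch_size: int = 1000) -> Iterator[List[T]]:
    """Yield successive lists of at most batch_size items from docs."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    it = iter(docs)
    while True:
        batch = list(islice(it, batch_size))
        if not batch:
            return
        yield batch


def index_all(docs: Iterable[T],
              send_batch: Callable[[List[T]], None],
              batch_size: int = 1000) -> int:
    """Send docs in small batches and return how many were sent.

    send_batch is a hypothetical placeholder for the real update call.
    """
    sent = 0
    for batch in batched(docs, batch_size):
        send_batch(batch)
        sent += len(batch)
    return sent
```

Because `batched` consumes its input lazily, a 15M-document file never has to be held in memory at once, and setting `batch_size=1` gives the one-document-at-a-time fallback.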
All:
Look on your Solr servers (not client) for a stack trace fragment similar to:
at org.apache.
As far as I know, a partial update in Solr 4.x doesn't partially update the Lucene
index; instead, it removes the document from the index and indexes an
updated one. The underlying Lucene always requires deleting the old
document and indexing the new one.
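To make the delete-then-reindex behaviour concrete, here is a toy model. It is an assumption-laden sketch, not Solr or Lucene code: `ToyIndex` and its methods are hypothetical names, and the "index" is just a dictionary of whole documents keyed by `id`.

```python
from typing import Any, Dict


class ToyIndex:
    """Toy stand-in for an index that only stores whole documents."""

    def __init__(self) -> None:
        self.docs: Dict[str, Dict[str, Any]] = {}
        self.adds = 0
        self.deletes = 0

    def add(self, doc: Dict[str, Any]) -> None:
        # Index a complete document under its id.
        self.docs[doc["id"]] = dict(doc)
        self.adds += 1

    def delete(self, doc_id: str) -> None:
        # Remove a document if present.
        if self.docs.pop(doc_id, None) is not None:
            self.deletes += 1

    def partial_update(self, doc_id: str, changes: Dict[str, Any]) -> Dict[str, Any]:
        # Read the old document, merge in the changed fields,
        # then delete the old version and index the merged one.
        old = self.docs[doc_id]  # raises KeyError if the document is missing
        merged = {**old, **changes}
        self.delete(doc_id)
        self.add(merged)
        return merged
```

In this model a one-field change still costs a full delete plus a full add, which is the point being made above.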
We usually don't use partial update when updating
Hi,
I'm experiencing the same issue. I'm indexing a big file with 15M docs in batches
of 100K.
Sometimes the indexing operation hangs and my HTTP client returns a timeout
error.
I see that it is more frequent when the collection has more replicas.
Another thing that I can see is a lot of POST up