Cool, useful info.
As soon as I can duplicate the issue I'll work out what we need to do
differently for this case.
- Mark
On Mar 7, 2013, at 10:19 AM, Brett Hoerner wrote:
As an update to this, I did my SolrCloud dance and made it 2xJVMs per
machine (2 machines still, the same ones) and spread the load around. Each
Solr instance now has 16 total shards (master for 8, replica for 8).
*drum roll* ... I can repeatedly run my delete script and nothing breaks. :)
No, not a poor idea at all, definitely a valid setup.
- Mark
On Mar 7, 2013, at 9:30 AM, Brett Hoerner wrote:
As a side note, do you think that was a poor idea? I figured it's better to
spread the master "load" around?
On Thu, Mar 7, 2013 at 11:29 AM, Mark Miller wrote:
On Mar 7, 2013, at 9:03 AM, Brett Hoerner wrote:
> To be clear, neither is really "the replica", I have 32 shards and each
> physical server is the leader for 16, and the replica for 16.
Ah, interesting. That actually could be part of the issue - some brain cells
are firing. I'm away from home
Here is the other server when it's locked:
https://gist.github.com/3529b7b6415756ead413
To be clear, neither is really "the replica", I have 32 shards and each
physical server is the leader for 16, and the replica for 16.
Also, related to the max threads hunch: my working cluster has many, many
f
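The layout described above can be sketched as follows (an illustrative round-robin assignment, not from the thread; the server names are invented, and SolrCloud's real placement happens at collection-creation time):

```python
# Sketch: 32 shards over 2 physical servers, leadership alternating so each
# server is the leader for 16 shards and the replica for the other 16.
NUM_SHARDS = 32
SERVERS = ["serverA", "serverB"]  # hypothetical names

layout = {}
for shard in range(NUM_SHARDS):
    leader = SERVERS[shard % len(SERVERS)]
    replica = SERVERS[(shard + 1) % len(SERVERS)]
    layout[f"shard{shard + 1}"] = {"leader": leader, "replica": replica}

# Each server ends up leading exactly half the shards.
leader_counts = {s: sum(1 for v in layout.values() if v["leader"] == s)
                 for s in SERVERS}
```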
Any chance you can grab the stack trace of a replica as well? (also when it's
locked up of course).
- Mark
On Mar 6, 2013, at 3:34 PM, Brett Hoerner wrote:
If there's anything I can try, let me know. Interestingly, I think I have
noticed that if I stop my indexer, do my delete, and restart the indexer
then I'm fine. Which goes along with the update thread contention theory.
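The workaround described here can be sketched as a simple sequencing helper (the three callables are hypothetical stand-ins for whatever stops the indexer, issues the deleteByQuery, and restarts indexing; only the ordering matters):

```python
def run_safe_delete(stop_indexer, send_delete, start_indexer):
    """Quiesce updates before a large delete, per the workaround above.

    All three arguments are caller-supplied callables (hypothetical).
    Stopping the indexer first means the delete never contends with
    concurrent update threads; indexing resumes even if the delete fails.
    """
    stop_indexer()          # drain/stop in-flight adds first
    try:
        send_delete()       # e.g. a deleteByQuery against the collection
    finally:
        start_indexer()     # resume indexing no matter what
```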
On Wed, Mar 6, 2013 at 5:03 PM, Mark Miller wrote:
This is what I see:
We currently limit the number of outstanding update requests at one time to
avoid a crazy number of threads being used.
It looks like a bunch of update requests are stuck in socket reads and are
taking up the available threads. It looks like the deletes are hanging out
wait
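The mechanism Mark describes can be sketched with a bounded semaphore (the cap of 4 is arbitrary; Solr's real limit lives in its update executor). If every permit is held by a request stuck in a socket read, a later delete cannot get a slot and appears hung:

```python
import threading

# Arbitrary illustrative cap on concurrent outstanding update requests.
MAX_OUTSTANDING_UPDATES = 4
update_slots = threading.Semaphore(MAX_OUTSTANDING_UPDATES)

def try_start_request() -> bool:
    """Return True if a request slot was free, False if the pool is exhausted."""
    return update_slots.acquire(blocking=False)

# Simulate updates blocked in socket reads, each holding a slot:
stuck = [try_start_request() for _ in range(MAX_OUTSTANDING_UPDATES)]

# A delete arriving now finds no free slot and would block (here it just fails):
blocked_delete = try_start_request()
```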
It does not look like a deadlock, though it could be a distributed one. Or
it could be a livelock, though that's less likely.
Here is what we used to recommend in similar situations for large Java
systems (BEA Weblogic):
1) Do thread dump of both systems before anything. As simultaneous as you
can
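Once you have the dumps, a quick first pass is counting threads per state. A minimal sketch of such a parser (simplified; real `jstack` output varies by JVM version, and the sample below is invented):

```python
import re
from collections import Counter

def thread_states(jstack_dump: str) -> Counter:
    """Count threads per java.lang.Thread.State in a jstack text dump."""
    return Counter(re.findall(r'java\.lang\.Thread\.State: (\w+)', jstack_dump))

# Invented sample resembling jstack output:
sample = '''\
"qtp1-12" #12 prio=5
   java.lang.Thread.State: RUNNABLE
"qtp1-13" #13 prio=5
   java.lang.Thread.State: BLOCKED
"qtp1-14" #14 prio=5
   java.lang.Thread.State: RUNNABLE
'''
```

Comparing these counts across the two servers' simultaneous dumps makes a pile-up (e.g. many BLOCKED or WAITING update threads on one side) easy to spot.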
Thanks, Brett, good stuff (though not a good problem).
We def need to look into this.
- Mark
On Mar 6, 2013, at 1:53 PM, Brett Hoerner wrote:
Here is a dump after the delete, indexing has been stopped:
https://gist.github.com/bretthoerner/c7ea3bf3dc9e676a3f0e
An interesting hint that I forgot to mention: it doesn't always happen on
the first delete. I manually ran the delete cron, and the server continued
to work. I waited about 5 minut
4.1, I'll induce it again and run jstack.
On Wed, Mar 6, 2013 at 1:50 PM, Mark Miller wrote:
Which version of Solr?
Can you use jconsole, visualvm, or jstack to get some stack traces and see
where things are halting?
- Mark
On Mar 6, 2013, at 11:45 AM, Brett Hoerner wrote:
> I have a SolrCloud cluster (2 machines, 2 Solr instances, 32 shards,
> replication factor of 2) that I've been