Yeah, that would overload it. To get good indexing speed, I configure two 
clients per CPU on the indexing machine. With one shard on a 16-processor 
machine, that would be 32 threads. With four shards on four 16-processor 
machines, 128 clients. Basically, one thread is waiting while the CPU processes 
a batch and the other is sending the next batch.

That should get the cluster to about 80% CPU. If the cluster is handling 
queries at the same time, I cut that way back, like one client thread for every 
two CPUs.
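
A minimal sketch of that sizing rule (two client threads per CPU for pure indexing, one per two CPUs when the cluster also serves queries); the `send_batch` function is a stand-in stub, not a real Solr update call:

```python
from concurrent.futures import ThreadPoolExecutor

def client_threads(cpus_per_machine, machines, serving_queries=False):
    # Two client threads per CPU when the cluster only indexes;
    # one thread per two CPUs when it is also handling queries.
    total_cpus = cpus_per_machine * machines
    return total_cpus // 2 if serving_queries else total_cpus * 2

# The examples above: one shard on a 16-CPU machine -> 32 threads,
# four shards on four 16-CPU machines -> 128 clients.
assert client_threads(16, 1) == 32
assert client_threads(16, 4) == 128

def send_batch(batch):
    # Stub: a real indexer would POST this batch to Solr's /update handler.
    return len(batch)

batches = [list(range(100))] * 8  # 8 batches of 100 documents
with ThreadPoolExecutor(max_workers=client_threads(16, 1)) as pool:
    indexed = sum(pool.map(send_batch, batches))
print(indexed)  # 800
```

While one thread blocks waiting for Solr to process its batch, the pool keeps other threads busy sending the next batches, which is what keeps the CPUs fed.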

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Apr 2, 2019, at 8:13 PM, Aroop Ganguly <aroopgang...@icloud.com> wrote:
> 
> Multiple threads to the same index? And how many concurrent threads?
> 
> Our case is not merely multiple threads but actually large-scale Spark 
> indexer jobs that index 1B records at a time with a concurrency of 400.
> In this case multiple such jobs were indexing into the same index. 
> 
> 
>> On Apr 2, 2019, at 7:25 AM, Walter Underwood <wun...@wunderwood.org> wrote:
>> 
>> We run multiple threads indexing to Solr all the time and have been doing so 
>> for years.
>> 
>> How big are your documents and how big are your batches?
>> 
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>> 
>>> On Apr 1, 2019, at 10:51 PM, Aroop Ganguly <aroopgang...@icloud.com> wrote:
>>> 
>>> Turns out the cause was multiple indexing jobs indexing into the same index 
>>> simultaneously, which, as one can imagine, can put heavy JVM load on certain 
>>> replicas.
>>> Once this was found and only one job ran at a time, things were back to 
>>> normal.
>>> 
>>> Your comment that there is no correlation to the stack trace seems right! 
>>> 
>>>> On Apr 1, 2019, at 5:32 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>>>> 
>>>> On 4/1/2019 5:40 PM, Aroop Ganguly wrote:
>>>>> Thanks Shawn, for the initial response.
>>>>> Digging into it a bit, I was wondering if we should care to read the 
>>>>> innermost stack.
>>>>> The innermost stack seems to be telling us something about what 
>>>>> triggered it?
>>>>> Of course, the system could have been overloaded as well, but is the 
>>>>> exception telling us something, or is it of no use to consider this stack?
>>>> 
>>>> The stacktrace on OOME is rarely useful.  The memory allocation where the 
>>>> error is thrown probably has absolutely no connection to the part of the 
>>>> program where major amounts of memory are being used.  It could be ANY 
>>>> memory allocation that actually causes the error.
>>>> 
>>>> Thanks,
>>>> Shawn
>>> 
>> 
> 
