Yeah, that would overload it. To get good indexing speed, I configure two client threads per CPU on the indexing machine. With one shard on a 16-processor machine, that would be 32 threads. With four shards on four 16-processor machines, 128 clients. Basically, one thread is waiting while the CPU processes a batch and the other is sending the next batch.
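A minimal sketch of that two-clients-per-CPU pattern, assuming a hypothetical send_batch function standing in for the HTTP POST to Solr's update handler (names and batch size are illustrative, not from the thread):

```python
import os
from concurrent.futures import ThreadPoolExecutor

CPUS = os.cpu_count() or 16
CLIENTS = 2 * CPUS  # two client threads per CPU on the indexing box

def send_batch(batch):
    # Placeholder for posting one batch of docs to Solr's update handler.
    # While one thread blocks here, another thread is sending the next batch.
    return len(batch)

def batches(docs, size=1000):
    # Split the doc stream into fixed-size batches.
    for i in range(0, len(docs), size):
        yield docs[i:i + size]

docs = [{"id": str(n)} for n in range(10_000)]
with ThreadPoolExecutor(max_workers=CLIENTS) as pool:
    indexed = sum(pool.map(send_batch, batches(docs)))
print(indexed)  # 10000
```

The point of the extra threads is only to keep the pipeline full; the real throughput limit is the CPU on the Solr side, not the clients.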
That should get the cluster to about 80% CPU. If the cluster is handling queries at the same time, I cut that way back, like one client thread for every two CPUs.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Apr 2, 2019, at 8:13 PM, Aroop Ganguly <aroopgang...@icloud.com> wrote:
>
> Multiple threads to the same index? And how many concurrent threads?
>
> Our case is not merely multiple threads but actually large-scale Spark indexer jobs that index 1B records at a time with a concurrency of 400. In this case multiple such jobs were indexing into the same index.
>
>> On Apr 2, 2019, at 7:25 AM, Walter Underwood <wun...@wunderwood.org> wrote:
>>
>> We run multiple threads indexing to Solr all the time and have been doing so for years.
>>
>> How big are your documents and how big are your batches?
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/  (my blog)
>>
>>> On Apr 1, 2019, at 10:51 PM, Aroop Ganguly <aroopgang...@icloud.com> wrote:
>>>
>>> Turns out the cause was multiple indexing jobs indexing into the index simultaneously, which one can imagine can cause JVM loads on certain replicas for sure. Once this was found and only one job ran at a time, things were back to normal.
>>>
>>> Your comments seem right on no correlation to the stack trace!
>>>
>>>> On Apr 1, 2019, at 5:32 PM, Shawn Heisey <apa...@elyograg.org> wrote:
>>>>
>>>> On 4/1/2019 5:40 PM, Aroop Ganguly wrote:
>>>>> Thanks Shawn, for the initial response.
>>>>> Digging into it a bit, I was wondering if we'd care to read the innermost stack. From the innermost stack, it seems to be telling us something about what triggered it?
>>>>> Of course, the system could have been overloaded as well, but is the exception telling us something, or is it of no use to consider this stack?
>>>>
>>>> The stack trace on OOME is rarely useful. The memory allocation where the error is thrown probably has absolutely no connection to the part of the program where major amounts of memory are being used. It could be ANY memory allocation that actually causes the error.
>>>>
>>>> Thanks,
>>>> Shawn