A sharded index will go faster, because the indexing workload is split among 
the machines.
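
For example, something like this creates a four-shard collection in SolrCloud
(the collection name and counts are placeholders, size them for your cluster):

  bin/solr create -c mycollection -shards 4 -replicationFactor 2

Each shard then takes roughly a quarter of the indexing work.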

A 5 Mbyte batch for indexing seems a little large, but it may be OK. Increase 
the client threads until CPU usage is around 80%.
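
Something like this untested SolrJ sketch is the general shape of a 
multi-threaded indexing client. The URL, collection, field names, and thread 
count are all placeholders; tune the thread count while you watch CPU:

import java.io.IOException;

import org.apache.solr.client.solrj.SolrServerException;
import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
import org.apache.solr.common.SolrInputDocument;

public class BulkIndexer {
    public static void main(String[] args) throws SolrServerException, IOException {
        // 64 sender threads is roughly 2 per CPU on a 32-core box;
        // raise or lower it while watching CPU usage.
        try (ConcurrentUpdateSolrClient client =
                new ConcurrentUpdateSolrClient.Builder("http://localhost:8983/solr/mycollection")
                        .withQueueSize(1000)  // docs buffered before add() blocks
                        .withThreadCount(64)  // concurrent HTTP senders
                        .build()) {
            for (int i = 0; i < 100_000; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", Integer.toString(i));
                doc.addField("title_t", "document " + i);
                client.add(doc);  // queued and sent in batches by the pool
            }
            client.blockUntilFinished();  // wait for the queue to drain
            client.commit();              // one commit at the end
        }
    }
}

ConcurrentUpdateSolrClient handles the batching and the concurrent HTTP 
connections for you, so the loop can just feed it documents.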

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/  (my blog)

> On Mar 19, 2019, at 8:53 AM, Aaron Yingcai Sun <y...@vizrt.com> wrote:
> 
> Hello, Walter,
> 
> Thanks for the hint. It looks like size matters. Our document sizes are not 
> fixed; there are many small documents. For example, at 59 KB per 10 
> documents the response time is around 10 ms, which is pretty good, so I 
> could let it send bigger batches and still get a reasonable response time.
> 
> 
> I will try a SolrCloud cluster; maybe I will get better speed there.
> 
> 
> //Aaron
> 
> ________________________________
> From: Walter Underwood <wun...@wunderwood.org>
> Sent: Tuesday, March 19, 2019 3:29:17 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr index slow response
> 
> Indexing is CPU bound. If you have enough RAM, SSD disks, and enough client 
> threads, you should be able to drive CPU to over 90%.
> 
> Start with two client threads per CPU. That allows one thread to be sending 
> data over the network while another is waiting for Solr to process the batch.
> 
> A couple of years ago, I was indexing a million docs per minute into a Solr 
> Cloud cluster. I think that was four shards on instances with 16 CPUs, so it 
> was 64 CPUs available for indexing. That was with Java 8, G1GC, and 8 GB of 
> heap.
> 
> Your documents are averaging about 50 kbytes, which is pretty big. Our 
> documents average about 3.5 kbytes. A lot of the indexing work is handling 
> the text, so those larger documents would be at least 10X slower than ours.
> 
> Are you doing atomic updates? That would slow things down a lot.
> 
> If you want to use G1GC, use the configuration I sent earlier.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/  (my blog)
> 
>> On Mar 19, 2019, at 7:05 AM, Bernd Fehling <bernd.fehl...@uni-bielefeld.de> 
>> wrote:
>> 
>> Isn't there something about largePageTables which must be enabled
>> in Java and also supported by the OS for such huge heaps?
>> 
>> Just a guess.
>> 
>> Am 19.03.19 um 15:01 schrieb Jörn Franke:
>>> It could be an issue with JDK 8, which may not be suitable for such large 
>>> heaps. Use more nodes with smaller heaps (e.g. 31 GB).
>>>> Am 18.03.2019 um 11:47 schrieb Aaron Yingcai Sun <y...@vizrt.com>:
>>>> 
>>>> Hello, Solr!
>>>> 
>>>> 
>>>> We are having a performance issue when trying to send documents to Solr 
>>>> for indexing. The response time is very slow and unpredictable at times.
>>>> 
>>>> 
>>>> The Solr server is running on quite a powerful machine: 32 CPUs, 400 GB 
>>>> RAM, with 300 GB reserved for Solr. While this is happening, CPU usage 
>>>> is around 30% and memory usage is 34%. I/O also looks OK according to 
>>>> iotop. SSD disks.
>>>> 
>>>> 
>>>> Our application sends 100 documents to Solr per request, JSON encoded; 
>>>> the size is around 5 MB each time. Sometimes the response time is under 
>>>> 1 second, sometimes it can be 300 seconds. The slow responses happen 
>>>> very often.
>>>> 
>>>> 
>>>> "Soft AutoCommit: disabled", "Hard AutoCommit: if uncommited for 
>>>> 3600000ms; if 1000000 uncommited docs"
>>>> 
>>>> 
>>>> There are around 100 clients sending those documents at the same time, 
>>>> but each client makes a blocking call that waits for the HTTP response 
>>>> before sending the next request.
>>>> 
>>>> 
>>>> I tried sending fewer documents per request, such as 20, but I still 
>>>> see slow responses from time to time, like 80 seconds.
>>>> 
>>>> 
>>>> Could you give some hints on how to improve the response time? Solr 
>>>> does not seem very loaded; there must be a way to make the responses 
>>>> faster.
>>>> 
>>>> 
>>>> BRs
>>>> 
>>>> //Aaron
>>>> 
>>>> 
>>>> 
> 
