A sharded index will go faster, because the indexing workload is split among the machines.
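For illustration, here is a minimal SolrJ sketch of the two-client-threads-per-CPU approach described later in this thread. The URL, collection name, field name, and document counts are made-up placeholders, not values from this thread:

    import java.util.ArrayList;
    import java.util.List;
    import java.util.concurrent.ExecutorService;
    import java.util.concurrent.Executors;
    import java.util.concurrent.TimeUnit;

    import org.apache.solr.client.solrj.SolrClient;
    import org.apache.solr.client.solrj.impl.HttpSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class ParallelIndexer {
        public static void main(String[] args) throws Exception {
            // Two client threads per CPU: one can be on the wire while
            // another waits for Solr to process its batch.
            int threads = Runtime.getRuntime().availableProcessors() * 2;
            ExecutorService pool = Executors.newFixedThreadPool(threads);

            // Hypothetical endpoint and collection name.
            SolrClient solr = new HttpSolrClient.Builder(
                    "http://localhost:8983/solr/mycollection").build();

            for (int batch = 0; batch < 1000; batch++) {
                final int batchId = batch;
                pool.submit(() -> {
                    try {
                        List<SolrInputDocument> docs = new ArrayList<>();
                        for (int i = 0; i < 100; i++) {  // 100 docs per request
                            SolrInputDocument doc = new SolrInputDocument();
                            doc.addField("id", batchId + "-" + i);
                            doc.addField("text_t", "example body text");  // hypothetical field
                            docs.add(doc);
                        }
                        solr.add(docs);  // blocking call, one batch per request
                    } catch (Exception e) {
                        e.printStackTrace();
                    }
                });
            }

            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.HOURS);
            solr.close();
        }
    }

Each worker blocks on solr.add() for its own batch, but with two threads per CPU there is always another batch in flight, which is what keeps Solr's CPUs busy.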
A 5 Mbyte batch for indexing seems a little large, but it may be OK. Increase the client threads until you get CPU usage around 80%.

wunder
Walter Underwood
wun...@wunderwood.org
http://observer.wunderwood.org/ (my blog)

> On Mar 19, 2019, at 8:53 AM, Aaron Yingcai Sun <y...@vizrt.com> wrote:
> 
> Hello, Walter,
> 
> Thanks for the hint. It looks like size matters. Our document sizes are not fixed; there are many small documents. For example, at 59 KB per 10 documents the response time is around 10 ms, which is pretty good, so I could send bigger batches and still get a reasonable response time.
> 
> I will try with a Solr Cloud cluster; maybe I will get better speed there.
> 
> //Aaron
> 
> ________________________________
> From: Walter Underwood <wun...@wunderwood.org>
> Sent: Tuesday, March 19, 2019 3:29:17 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr index slow response
> 
> Indexing is CPU bound. If you have enough RAM, SSD disks, and enough client threads, you should be able to drive CPU to over 90%.
> 
> Start with two client threads per CPU. That allows one thread to be sending data over the network while another is waiting for Solr to process the batch.
> 
> A couple of years ago, I was indexing a million docs per minute into a Solr Cloud cluster. I think that was four shards on instances with 16 CPUs, so it was 64 CPUs available for indexing. That was with Java 8, G1GC, and 8 GB of heap.
> 
> Your documents are averaging about 50 kbytes, which is pretty big. Our documents average about 3.5 kbytes. A lot of the indexing work is handling the text, so those larger documents would be at least 10X slower than ours.
> 
> Are you doing atomic updates? That would slow things down a lot.
> 
> If you want to use G1GC, use the configuration I sent earlier.
> 
> wunder
> Walter Underwood
> wun...@wunderwood.org
> http://observer.wunderwood.org/ (my blog)
> 
>> On Mar 19, 2019, at 7:05 AM, Bernd Fehling <bernd.fehl...@uni-bielefeld.de> wrote:
>> 
>> Isn't there something about large page tables which must be enabled in Java and also supported by the OS for such huge heaps?
>> 
>> Just a guess.
>> 
>> Am 19.03.19 um 15:01 schrieb Jörn Franke:
>>> It could be an issue with JDK 8, which may not be suitable for such large heaps. Use more nodes with smaller heaps (e.g. 31 GB).
>>>> Am 18.03.2019 um 11:47 schrieb Aaron Yingcai Sun <y...@vizrt.com>:
>>>> 
>>>> Hello, Solr!
>>>> 
>>>> We are having some performance issues when trying to send documents to Solr for indexing. The response time is very slow and unpredictable at times.
>>>> 
>>>> The Solr server is running on quite a powerful machine: 32 CPUs, 400 GB RAM, with 300 GB reserved for Solr. While this is happening, CPU usage is around 30% and memory usage is 34%. I/O also looks OK according to iotop. SSD disk.
>>>> 
>>>> Our application sends 100 documents to Solr per request, JSON encoded; the size is around 5 MB each time. Sometimes the response time is under 1 second, sometimes it can be 300 seconds, and the slow responses happen very often.
>>>> 
>>>> "Soft AutoCommit: disabled", "Hard AutoCommit: if uncommitted for 3600000ms; if 1000000 uncommitted docs"
>>>> 
>>>> There are around 100 clients sending those documents at the same time, but each client makes a blocking call that waits for the HTTP response before sending the next one.
>>>> 
>>>> I tried to make the number of documents in one request smaller, such as 20, but I still see slow response times now and then, like 80 seconds.
>>>> 
>>>> Would you help to give some hints on how to improve the response time? Solr does not seem very loaded; there must be a way to make the responses faster.
>>>> 
>>>> BRs
>>>> 
>>>> //Aaron
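As a footnote to this thread: since every client here blocks on each HTTP response, one client-side variation worth a look is SolrJ's ConcurrentUpdateSolrClient, which queues documents and sends them with background threads. A minimal sketch, with an assumed URL and illustrative (untuned) queue and thread settings:

    import org.apache.solr.client.solrj.impl.ConcurrentUpdateSolrClient;
    import org.apache.solr.common.SolrInputDocument;

    public class PipelinedIndexer {
        public static void main(String[] args) throws Exception {
            // Queue size and thread count here are illustrative, not tuned values.
            ConcurrentUpdateSolrClient solr = new ConcurrentUpdateSolrClient.Builder(
                    "http://localhost:8983/solr/mycollection")  // hypothetical URL
                    .withQueueSize(200)   // documents buffered client-side
                    .withThreadCount(8)   // background threads draining the queue
                    .build();

            for (int i = 0; i < 10_000; i++) {
                SolrInputDocument doc = new SolrInputDocument();
                doc.addField("id", Integer.toString(i));
                doc.addField("text_t", "example body text");  // hypothetical field
                solr.add(doc);  // returns quickly; sending happens in the background
            }

            solr.blockUntilFinished();  // wait for the queue to drain
            solr.close();
        }
    }

Note that ConcurrentUpdateSolrClient only logs errors from its background threads by default, so check the client logs or add your own error handling if per-batch failures matter.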