If you start indexing with just a single thread/client, do you still see slow bulks?
Emir
--
Monitoring - Log Management - Alerting - Anomaly Detection
Solr & Elasticsearch Consulting Support Training - http://sematext.com/

> On 19 Mar 2019, at 12:54, Aaron Yingcai Sun <y...@vizrt.com> wrote:
>
> The "QTime" value is from the Solr REST API response, extracted from the http/json payload. The "Request time" is what I measured on the client side; it is almost the same value as QTime, just a few milliseconds apart. I could provide a tcpdump to prove that it really is Solr that responds slowly.
>
> Those long response times are not really spikes, they happen constantly; almost half of the requests have such a long delay. The more documents added in one request, the longer the delay.
>
> ________________________________
> From: Emir Arnautović <emir.arnauto...@sematext.com>
> Sent: Tuesday, March 19, 2019 12:30:33 PM
> To: solr-user@lucene.apache.org
> Subject: Re: Solr index slow response
>
> Just to add a different perspective here: how do you send documents to Solr? Are those log lines from your client? Maybe it is not Solr that is slow; could it be the network or the client itself? If you have some dry-run mode on the client, maybe try running it without Solr to eliminate the client from the suspects.
>
> Do you observe similar spikes when you run indexing with fewer concurrent clients?
>
> It is really hard to pinpoint the issue without looking at some monitoring tool.
>
> Emir
> --
> Monitoring - Log Management - Alerting - Anomaly Detection
> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>
>
>> On 19 Mar 2019, at 09:17, Aaron Yingcai Sun <y...@vizrt.com> wrote:
>>
>> We have around 80 million documents to index, with a total index size of around 3 TB; I guess I'm not the first one to work with this amount of data. With such slow response times, the indexing process would take around 2 weeks. Since the system resources are not heavily loaded, there must be a way to speed it up.
>>
>> To Walter: I don't see why G1GC would improve this; we only index, there are no queries in the background. There is no memory constraint. It feels more like some internal threads are blocking each other.
>>
>> I used to run with more documents per request, and that gave much worse response times: 300 documents in one request could end up with a 20-minute response time. Now I have changed to a maximum of 10 documents per request, and many response times are still around 30 seconds, while some are very fast (~100 ms). Why is there such a big difference? The document sizes do not differ that much.
>>
>> I just want to speed it up, since nothing seems to be overloaded. Is there any other, faster way to index this amount of data?
>>
>> BRs
>>
>> //Aaron
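As a rough sanity check on the "around 2 weeks" estimate: using the figures mentioned elsewhere in the thread (100 concurrent blocking clients, 10 documents per request) and assuming an average response time of roughly 15 s per request, which is in line with the mix of ~100 ms and ~30 s responses described above:

    throughput ≈ 100 clients × 10 docs / 15 s ≈ 67 docs/s
    80,000,000 docs / 67 docs/s ≈ 1,200,000 s ≈ 14 days

So the observed latencies are indeed consistent with a total indexing time of about two weeks.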
>> ________________________________
>> From: Walter Underwood <wun...@wunderwood.org>
>> Sent: Monday, March 18, 2019 4:59:20 PM
>> To: solr-user@lucene.apache.org
>> Subject: Re: Solr index slow response
>>
>> Solr is not designed to have consistent response times for updates. You are expecting Solr to do something that it does not do.
>>
>> About Xms and Xmx: the JVM will continue to allocate memory until it hits the max. After it hits the max, it will start to collect garbage. A smaller Xms just wastes time doing allocations after the JVM is running. Avoid that by making Xms and Xmx the same.
>>
>> We run all of our JVMs with 8 GB of heap and the G1 collector. You probably do not need more than 8 GB unless you are doing high-cardinality facets or some other memory-hungry querying.
>>
>> The first step would be to use a good configuration. We start our Java 8 JVMs with these parameters:
>>
>> SOLR_HEAP=8g
>> # Use G1 GC -- wunder 2017-01-23
>> # Settings from https://wiki.apache.org/solr/ShawnHeisey
>> GC_TUNE=" \
>> -XX:+UseG1GC \
>> -XX:+ParallelRefProcEnabled \
>> -XX:G1HeapRegionSize=8m \
>> -XX:MaxGCPauseMillis=200 \
>> -XX:+UseLargePages \
>> -XX:+AggressiveOpts \
>> "
>>
>> Use SSD for disks, with total space about 3X as big as the expected index size.
>>
>> Have RAM not used by Solr or the OS equal to the expected index size.
>>
>> After that, let's figure out what the real requirement is. If you must have consistent response times for update requests, you'll need to do that outside of Solr. But if you need high data import rates, we can probably help.
>>
>> wunder
>> Walter Underwood
>> wun...@wunderwood.org
>> http://observer.wunderwood.org/ (my blog)
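For reference, a minimal sketch of how the "same Xms and Xmx" advice maps onto the standard Solr start configuration; this assumes solr.in.sh is used, and the 8 GB value is Walter's suggestion rather than a measured requirement:

    # solr.in.sh (sketch): fixed heap, minimum and maximum equal
    SOLR_HEAP="8g"
    # explicit equivalent:
    # SOLR_JAVA_MEM="-Xms8g -Xmx8g"

With SOLR_HEAP set, bin/solr normally passes the same value as both -Xms and -Xmx.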
>>> On Mar 18, 2019, at 8:31 AM, Aaron Yingcai Sun <y...@vizrt.com> wrote:
>>>
>>> Hello, Chris
>>>
>>> Thanks for the tips. I tried to set it as you suggested and did not see much improvement. Since I don't need the documents to be visible immediately, softCommit is disabled entirely.
>>>
>>> The slow responses happen every few seconds; if they happened hourly I would suspect the hourly auto-commit, but they happen much more frequently. I don't see any CPU/RAM/network IO/disk IO bottleneck at the OS level. It just looks like the Solr server is blocking internally.
>>>
>>> < <maxTime>${solr.autoCommit.maxTime:3600000}</maxTime>
>>> ---
>>> > <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
>>> 16c16
>>> < <openSearcher>true</openSearcher>
>>> ---
>>> > <openSearcher>false</openSearcher>
>>>
>>> 190318-162811.610-189982 DBG1:doc_count: 10 , doc_size: 539 KB, Res code: 200, QTime: 1405 ms, Request time: 1407 ms.
>>> 190318-162811.636-189968 DBG1:doc_count: 10 , doc_size: 465 KB, Res code: 200, QTime: 1357 ms, Request time: 1360 ms.
>>> 190318-162811.732-189968 DBG1:doc_count: 10 , doc_size: 473 KB, Res code: 200, QTime: 90 ms, Request time: 92 ms.
>>> 190318-162811.995-189981 DBG1:doc_count: 10 , doc_size: 610 KB, Res code: 200, QTime: 5306 ms, Request time: 5308 ms.
>>> 190318-162814.873-190003 DBG1:doc_count: 10 , doc_size: 508 KB, Res code: 200, QTime: 4775 ms, Request time: 4777 ms.
>>> 190318-162814.889-189972 DBG1:doc_count: 10 , doc_size: 563 KB, Res code: 200, QTime: 20222 ms, Request time: 20224 ms.
>>> 190318-162814.975-191817 DBG1:doc_count: 10 , doc_size: 539 KB, Res code: 200, QTime: 27732 ms, Request time: 27735 ms.
>>> 190318-162814.975-189958 DBG1:doc_count: 10 , doc_size: 616 KB, Res code: 200, QTime: 28106 ms, Request time: 28109 ms.
>>> 190318-162814.975-190004 DBG1:doc_count: 10 , doc_size: 473 KB, Res code: 200, QTime: 16703 ms, Request time: 16706 ms.
>>> 190318-162814.982-189963 DBG1:doc_count: 10 , doc_size: 540 KB, Res code: 200, QTime: 28216 ms, Request time: 28218 ms.
>>> 190318-162814.988-190007 DBG1:doc_count: 10 , doc_size: 673 KB, Res code: 200, QTime: 28133 ms, Request time: 28136 ms.
>>> 190318-162814.993-189962 DBG1:doc_count: 10 , doc_size: 631 KB, Res code: 200, QTime: 27909 ms, Request time: 27912 ms.
>>> 190318-162814.996-191818 DBG1:doc_count: 10 , doc_size: 529 KB, Res code: 200, QTime: 28172 ms, Request time: 28174 ms.
>>> 190318-162815.056-189986 DBG1:doc_count: 10 , doc_size: 602 KB, Res code: 200, QTime: 11375 ms, Request time: 11378 ms.
>>> 190318-162815.100-189988 DBG1:doc_count: 10 , doc_size: 530 KB, Res code: 200, QTime: 8663 ms, Request time: 8666 ms.
>>> 190318-162815.275-189980 DBG1:doc_count: 10 , doc_size: 553 KB, Res code: 200, QTime: 27526 ms, Request time: 27528 ms.
>>> 190318-162815.283-189997 DBG1:doc_count: 10 , doc_size: 600 KB, Res code: 200, QTime: 27529 ms, Request time: 27535 ms.
>>> 190318-162815.318-189961 DBG1:doc_count: 10 , doc_size: 661 KB, Res code: 200, QTime: 16617 ms, Request time: 16621 ms.
>>> 190318-162815.484-189952 DBG1:doc_count: 10 , doc_size: 549 KB, Res code: 200, QTime: 11707 ms, Request time: 11711 ms.
>>> 190318-162815.485-189993 DBG1:doc_count: 10 , doc_size: 618 KB, Res code: 200, QTime: 28315 ms, Request time: 28321 ms.
>>> 190318-162816.216-189972 DBG1:doc_count: 10 , doc_size: 493 KB, Res code: 200, QTime: 1320 ms, Request time: 1322 ms.
>>> 190318-162816.354-189972 DBG1:doc_count: 10 , doc_size: 631 KB, Res code: 200, QTime: 130 ms, Request time: 132 ms.
>>> 190318-162816.471-189972 DBG1:doc_count: 10 , doc_size: 563 KB, Res code: 200, QTime: 111 ms, Request time: 113 ms.
>>> 190318-162816.586-189972 DBG1:doc_count: 10 , doc_size: 554 KB, Res code: 200, QTime: 110 ms, Request time: 111 ms.
>>> 190318-162816.716-189972 DBG1:doc_count: 10 , doc_size: 590 KB, Res code: 200, QTime: 124 ms, Request time: 125 ms.
>>> 190318-162820.494-189972 DBG1:doc_count: 10 , doc_size: 583 KB, Res code: 200, QTime: 3772 ms, Request time: 3774 ms.
>>> 190318-162820.574-189953 DBG1:doc_count: 10 , doc_size: 550 KB, Res code: 200, QTime: 32802 ms, Request time: 32804 ms.
>>> 190318-162820.609-189991 DBG1:doc_count: 10 , doc_size: 586 KB, Res code: 200, QTime: 33787 ms, Request time: 33790 ms.
>>> 190318-162820.621-189976 DBG1:doc_count: 10 , doc_size: 572 KB, Res code: 200, QTime: 33397 ms, Request time: 33400 ms.
>>> 190318-162820.622-189985 DBG1:doc_count: 10 , doc_size: 498 KB, Res code: 200, QTime: 32917 ms, Request time: 32919 ms.
>>> 190318-162820.987-190005 DBG1:doc_count: 10 , doc_size: 629 KB, Res code: 200, QTime: 22207 ms, Request time: 22209 ms.
>>> 190318-162821.028-189979 DBG1:doc_count: 10 , doc_size: 584 KB, Res code: 200, QTime: 22800 ms, Request time: 22802 ms.
>>> 190318-162821.056-189948 DBG1:doc_count: 10 , doc_size: 670 KB, Res code: 200, QTime: 34193 ms, Request time: 34195 ms.
>>> 190318-162821.062-189983 DBG1:doc_count: 10 , doc_size: 675 KB, Res code: 200, QTime: 22250 ms, Request time: 22252 ms.
>>>
>>> BRs
>>>
>>> //Aaron
>>>
>>> ________________________________
>>> From: Chris Ulicny <culicny@iq.media>
>>> Sent: Monday, March 18, 2019 2:54:25 PM
>>> To: solr-user@lucene.apache.org
>>> Subject: Re: Solr index slow response
>>>
>>> One other thing to look at besides the heap is your commit settings. We've experienced something similar, and changing the commit settings alleviated the issue.
>>>
>>> Are you opening a searcher on every hard commit? If so, you might want to reconsider and use a soft commit for the hourly creation of a new searcher.
>>>
>>> The hard commit interval should be much lower, probably something on the order of seconds (15000) instead of hours (currently 3600000). When the hard commit fires, numerous merges might be firing off in the background due to the volume of documents you are indexing, which might explain the periodic bad response times shown in your logs.
>>>
>>> It would depend on your specific scenario, but here's our setup. During long periods of constant indexing of documents to a staging collection (~2 billion documents), we have the following commit settings:
>>>
>>> softcommit: 3600000ms (for periodic validation of data, since it's not in production)
>>> hardcommit: openSearcher -> false, 15000ms (no document limit)
>>>
>>> This makes the documents available for searching every hour, but doesn't result in the large bursts of IO that infrequent hard commits would cause.
>>>
>>> For more info, Erick Erickson has a great write-up:
>>> https://lucidworks.com/2013/08/23/understanding-transaction-logs-softcommit-and-commit-in-sorlcloud/
>>>
>>> Best,
>>> Chris
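For illustration, the commit policy Chris describes (and the hard-commit change Aaron's diff applies) would look roughly like this in solrconfig.xml. This is a sketch using the 15 s hard commit and hourly soft commit values taken from the messages above, not a copy of either party's actual configuration:

    <!-- sketch: frequent hard commits that do not open a searcher,
         plus an hourly soft commit for visibility -->
    <autoCommit>
      <maxTime>${solr.autoCommit.maxTime:15000}</maxTime>
      <openSearcher>false</openSearcher>
    </autoCommit>

    <autoSoftCommit>
      <maxTime>${solr.autoSoftCommit.maxTime:3600000}</maxTime>
    </autoSoftCommit>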
>>> On Mon, Mar 18, 2019 at 9:36 AM Aaron Yingcai Sun <y...@vizrt.com> wrote:
>>>
>>>> Hi, Emir,
>>>>
>>>> My system used to run with a max of 32 GB and the response times were bad as well. Swap is set to 4 GB with about 3.2 GB free; I doubt swap would affect it since there is such a huge amount of free memory.
>>>>
>>>> I could try setting Xms and Xmx to the same value, but I doubt how much that would change the response times.
>>>>
>>>> BRs
>>>>
>>>> //Aaron
>>>>
>>>> ________________________________
>>>> From: Emir Arnautović <emir.arnauto...@sematext.com>
>>>> Sent: Monday, March 18, 2019 2:19:19 PM
>>>> To: solr-user@lucene.apache.org
>>>> Subject: Re: Solr index slow response
>>>>
>>>> Hi Aaron,
>>>> Without looking too much into the numbers, my bet would be that it is the large heap that is causing the issues. I would decrease it significantly (<30GB) and see if that is enough for your max load. Also, disable swap or reduce swappiness to the minimum.
>>>>
>>>> In any case, you should install some monitoring tool that would help you do better analysis when you run into problems. One such tool is our monitoring solution: https://sematext.com/spm
>>>>
>>>> HTH,
>>>> Emir
>>>> --
>>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
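Emir's "disable swap or reduce swappiness" suggestion corresponds to something like the following on Linux. This is a sketch: whether to lower swappiness or disable swap entirely depends on the host, and the commands need root:

    # reduce swappiness for the running system
    sysctl -w vm.swappiness=1
    # make it persistent across reboots
    echo 'vm.swappiness=1' >> /etc/sysctl.conf
    # or disable swap completely
    swapoff -a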
>>>>> On 18 Mar 2019, at 13:14, Aaron Yingcai Sun <y...@vizrt.com> wrote:
>>>>>
>>>>> Hello, Emir,
>>>>>
>>>>> Thanks for the reply. This is the Solr version and heap info; it is a standalone single Solr server. I don't have a monitoring tool connected, I only look at 'top'. I have not seen a CPU spike so far; when the slow responses happen, CPU usage is not high at all, around 30%.
>>>>>
>>>>> # curl 'http://.../solr/admin/info/system?wt=json&indent=true'
>>>>> {
>>>>>   "responseHeader":{
>>>>>     "status":0,
>>>>>     "QTime":27},
>>>>>   "mode":"std",
>>>>>   "solr_home":"/ardome/solr",
>>>>>   "lucene":{
>>>>>     "solr-spec-version":"6.5.1",
>>>>>     "solr-impl-version":"6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75 - jimczi - 2017-04-21 12:23:42",
>>>>>     "lucene-spec-version":"6.5.1",
>>>>>     "lucene-impl-version":"6.5.1 cd1f23c63abe03ae650c75ec8ccb37762806cc75 - jimczi - 2017-04-21 12:17:15"},
>>>>>   "jvm":{
>>>>>     "version":"1.8.0_144 25.144-b01",
>>>>>     "name":"Oracle Corporation Java HotSpot(TM) 64-Bit Server VM",
>>>>>     "spec":{
>>>>>       "vendor":"Oracle Corporation",
>>>>>       "name":"Java Platform API Specification",
>>>>>       "version":"1.8"},
>>>>>     "jre":{
>>>>>       "vendor":"Oracle Corporation",
>>>>>       "version":"1.8.0_144"},
>>>>>     "vm":{
>>>>>       "vendor":"Oracle Corporation",
>>>>>       "name":"Java HotSpot(TM) 64-Bit Server VM",
>>>>>       "version":"25.144-b01"},
>>>>>     "processors":32,
>>>>>     "memory":{
>>>>>       "free":"69.1 GB",
>>>>>       "total":"180.2 GB",
>>>>>       "max":"266.7 GB",
>>>>>       "used":"111 GB (%41.6)",
>>>>>       "raw":{
>>>>>         "free":74238728336,
>>>>>         "total":193470136320,
>>>>>         "max":286331502592,
>>>>>         "used":119231407984,
>>>>>         "used%":41.64103736566334}},
>>>>>     "jmx":{
>>>>>       "bootclasspath":"/usr/java/jdk1.8.0_144/jre/lib/resources.jar:/usr/java/jdk1.8.0_144/jre/lib/rt.jar:/usr/java/jdk1.8.0_144/jre/lib/sunrsasign.jar:/usr/java/jdk1.8.0_144/jre/lib/jsse.jar:/usr/java/jdk1.8.0_144/jre/lib/jce.jar:/usr/java/jdk1.8.0_144/jre/lib/charsets.jar:/usr/java/jdk1.8.0_144/jre/lib/jfr.jar:/usr/java/jdk1.8.0_144/jre/classes",
>>>>>       "classpath":"...",
>>>>>       "commandLineArgs":["-Xms100G",
>>>>>         "-Xmx300G",
>>>>>         "-DSTOP.PORT=8079",
>>>>>         "-DSTOP.KEY=..",
>>>>>         "-Dsolr.solr.home=..",
>>>>>         "-Djetty.port=8983"],
>>>>>       "startTime":"2019-03-18T09:35:27.892Z",
>>>>>       "upTimeMS":9258422}},
>>>>>   "system":{
>>>>>     "name":"Linux",
>>>>>     "arch":"amd64",
>>>>>     "availableProcessors":32,
>>>>>     "systemLoadAverage":14.72,
>>>>>     "version":"3.0.101-311.g08a8a9d-default",
>>>>>     "committedVirtualMemorySize":2547960700928,
>>>>>     "freePhysicalMemorySize":4530696192,
>>>>>     "freeSwapSpaceSize":3486846976,
>>>>>     "processCpuLoad":0.3257436126790475,
>>>>>     "processCpuTime":93869450000000,
>>>>>     "systemCpuLoad":0.3279781055816521,
>>>>>     "totalPhysicalMemorySize":406480175104,
>>>>>     "totalSwapSpaceSize":4302303232,
>>>>>     "maxFileDescriptorCount":32768,
>>>>>     "openFileDescriptorCount":385,
>>>>>     "uname":"Linux ... 3.0.101-311.g08a8a9d-default #1 SMP Wed Dec 14 10:15:37 UTC 2016 (08a8a9d) x86_64 x86_64 x86_64 GNU/Linux\n",
>>>>>     "uptime":" 13:09pm up 5 days 21:23, 7 users, load average: 14.72, 12.28, 11.48\n"}}
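Given the -Xms100G/-Xmx300G arguments visible above, one quick way to check whether the slow bulks line up with GC pauses, as a sketch and assuming the JDK tools are available on the server (<solr-pid> is a placeholder for the Solr process id):

    # show heap occupancy and GC counts/times once per second
    jstat -gcutil <solr-pid> 1000
    # alternatively, start Solr with GC logging and inspect pause times, e.g.
    # -Xloggc:/path/to/solr_gc.log -XX:+PrintGCDetails -XX:+PrintGCDateStamps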
>>>>>
>>>>> ________________________________
>>>>> From: Emir Arnautović <emir.arnauto...@sematext.com>
>>>>> Sent: Monday, March 18, 2019 12:10:30 PM
>>>>> To: solr-user@lucene.apache.org
>>>>> Subject: Re: Solr index slow response
>>>>>
>>>>> Hi Aaron,
>>>>> Which version of Solr? How did you configure your heap? Is it standalone Solr or SolrCloud? A single server? Do you use some monitoring tool? Do you see spikes or pauses, or is CPU usage constant?
>>>>>
>>>>> Thanks,
>>>>> Emir
>>>>> --
>>>>> Monitoring - Log Management - Alerting - Anomaly Detection
>>>>> Solr & Elasticsearch Consulting Support Training - http://sematext.com/
>>>>>
>>>>>> On 18 Mar 2019, at 11:47, Aaron Yingcai Sun <y...@vizrt.com> wrote:
>>>>>>
>>>>>> Hello, Solr!
>>>>>>
>>>>>> We are having some performance issues when trying to send documents to Solr for indexing. The response time is very slow and unpredictable at times.
>>>>>>
>>>>>> The Solr server is running on a quite powerful machine: 32 CPUs, 400 GB RAM, of which 300 GB is reserved for Solr. While this is happening, CPU usage is around 30% and memory usage is 34%; IO also looks OK according to iotop. SSD disks.
>>>>>>
>>>>>> Our application sends 100 documents to Solr per request, JSON encoded, around 5 MB each time. Sometimes the response time is under 1 second, sometimes it can be 300 seconds; the slow responses happen very often.
>>>>>>
>>>>>> "Soft AutoCommit: disabled", "Hard AutoCommit: if uncommited for 3600000ms; if 1000000 uncommited docs"
>>>>>>
>>>>>> There are around 100 clients sending those documents at the same time, but each client makes a blocking call that waits for the HTTP response before sending the next request.
>>>>>>
>>>>>> I tried to make the number of documents in one request smaller, such as 20, but I still see slow response times from time to time, like 80 seconds.
>>>>>>
>>>>>> Could you give some hints on how to improve the response time? Solr does not seem very loaded; there must be a way to make the responses faster.
>>>>>>
>>>>>> BRs
>>>>>>
>>>>>> //Aaron
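For reference, the kind of batched JSON update described in the original message corresponds to a request along these lines. This is a sketch: the host, collection name "mycollection", and document fields are placeholders rather than values from the thread, and commits are left to the server-side autoCommit settings discussed above:

    # post a JSON batch of documents to the update handler (Solr 6.x JSON update API);
    # no explicit commit parameter, relying on the configured autoCommit
    curl -s -X POST 'http://localhost:8983/solr/mycollection/update' \
         -H 'Content-Type: application/json' \
         -d '[
               {"id": "doc-1", "title_t": "example document 1"},
               {"id": "doc-2", "title_t": "example document 2"}
             ]'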