Thanks for the tool recommendation! This is the dstat output during commit bombardment / concurrent search requests:
----total-cpu-usage---- -dsk/total- ---paging-- ---system-- ----swap--- --io/total- ---file-locks--
usr sys idl wai hiq siq| read  writ|  in   out | int   csw | used  free| read  writ|pos lck rea wri
 11   1  87   1   0   0|1221k  833k| 538B  828B| 784   920 | 197M   11G|16.8  15.5 |4.0 9.0   0  13
 60   0  40   0   0   0|   0     0 |   0     0 | 811   164 | 197M   11G|   0     0 |4.0 9.0   0  13
 25   0  75   0   0   0|   0     0 |   0     0 | 576    85 | 197M   11G|   0     0 |4.0 9.0   0  13
 25   0  75   0   0   0|   0     0 |   0     0 | 572    90 | 197M   11G|   0     0 |4.0 9.0   0  13
 25   0  74   0   0   0|   0     0 |   0     0 | 730   204 | 197M   11G|   0     0 |4.0 9.0   0  13
 26   1  71   2   0   0|4096B 1424k|   0     0 | 719   415 | 197M   11G|1.00  46.0 |4.0 9.0   0  13
 31   1  68   0   0   0|   0   136k|   0     0 | 877   741 | 197M   11G|   0  6.00 |5.0 9.0   0  14
 70   6  24   0   0   0|   0   516k|   0     0 |1705  1027 | 197M   11G|   0  46.0 |5.0  11 1.0  15
 72   3  25   0   0   0|4096B  384k|   0     0 |1392   910 | 197M   11G|1.00  25.0 |5.0 9.0   0  14
 60   2  25  12   0   0| 688k  108k|   0     0 |1162   509 | 197M   11G|79.0  9.00 |4.0 9.0   0  13
 94   1   5   0   0   0| 116k    0 |   0     0 |1271   654 | 197M   11G|4.00     0 |4.0 9.0   0  13
 57   0  43   0   0   0|   0     0 |   0     0 |1076   238 | 197M   11G|   0     0 |4.0 9.0   0  13
 26   0  73   0   0   0|   0    16k|   0     0 | 830   188 | 197M   11G|   0  2.00 |4.0 9.0   0  13
 29   1  70   0   0   0|   0     0 |   0     0 |1088   360 | 197M   11G|   0     0 |4.0 9.0   0  13
 29   1  70   0   0   1|   0   228k|   0     0 | 890   590 | 197M   11G|   0  21.0 |4.0 9.0   0  13
 81   6  13   0   0   0|4096B 1596k|   0     0 |1227   441 | 197M   11G|1.00  52.0 |5.0 9.0   0  14
 48   2  48   1   0   0| 172k    0 |   0     0 | 953   292 | 197M   11G|21.0     0 |5.0 9.0   0  14
 25   0  74   0   0   0|   0     0 |   0     0 | 808   222 | 197M   11G|   0     0 |5.0 9.0   0  14
 25   0  74   0   0   0|   0     0 |   0     0 | 607    90 | 197M   11G|   0     0 |5.0 9.0   0  14
 25   0  75   0   0   0|   0     0 |   0     0 | 603   106 | 197M   11G|   0     0 |5.0 9.0   0  14
 25   0  75   0   0   0|   0   144k|   0     0 | 625   104 | 197M   11G|   0  7.00 |5.0 9.0   0  14
 85   3   9   2   0   0| 248k   92k|   0     0 |1441   887 | 197M   11G|33.0  7.00 |5.0 9.0   0  14
 32   1  65   2   0   0| 404k  636k|   0     0 | 999   337 | 197M   11G|38.0  96.0 |5.0 9.0   0  14
 25   0  75   0   0   0|   0     0 |   0     0 | 609   117 | 197M   11G|   0     0 |5.0 9.0   0  14
 25   0  75   0   0   0|   0     0 |   0     0 | 604    77 | 197M   11G|   0     0 |5.0 9.0   0  14
 26   0  74   0   0   0|   0     0 |   0     0 | 781   183 | 197M   11G|   0     0 |5.0 9.0   0  14
 25   0  75   0   0   0|   0     0 |   0     0 | 620   110 | 197M   11G|   0     0 |5.0 9.0   0  14
 46   4  50   0   0   0|   0   116k|   0     0 | 901   398 | 197M   11G|   0  12.0 |4.0 9.0   0  13
 50   2  47   0   0   0|   0     0 |   0     0 |1031   737 | 197M   11G|   0     0 |5.0 9.0   0  14
 28   1  71   0   0   0|4096B  168k|   0     0 | 800   254 | 197M   11G|1.00  9.00 |5.0 9.0   0  14
 25   0  75   0   0   0|   0     0 |   0     0 | 571    84 | 197M   11G|   0     0 |5.0 9.0   0  14
 26   0  73   1   0   0|   0  1172k|   0     0 | 632   209 | 197M   11G|   0  40.0 |5.0 9.0   0  14

For the short term, we should be fine if we put those single-document jobs into a
queue that gets flushed every 60 seconds (see the P.S. at the very bottom of this
mail for a rough sketch of the same idea using Solr's autoCommit instead).

Also, I should have mentioned that our index size is currently 27 GB, containing
23,223,885 "documents" (only the PK is actually stored). For some reason I was
assuming the commit time complexity to be constant, but that is probably not the
case (?)

Sooner or later someone is going to profile the container that runs Solr and our
document streamer. I'll post the results if we find anything of interest.

=================================
As a side note, I've only just discovered that Solr 3.1 has been released (yaaaay!)
We're currently using 1.4.1.

> If you are on linux, I would recommend two tools you can use to track what is
> going on on the machine, atop ( http://freshmeat.net/projects/atop/ ) and
> dstat ( http://freshmeat.net/projects/dstat/ ).
>
> atop in particular has been very useful to me in tracking down performance
> issues in real time (when I am running a process) or at random intervals
> (when the machine slows down for no apparent reason).
>
> From the little you have told us, my hunch is that you are saturating a disk
> somewhere, either the index disk or swap (as pointed out by Mike).
>
> Cheers
>
> François
>
> On May 1, 2011, at 9:54 AM, Michael McCandless wrote:
>
>> Committing too frequently is very costly, since this calls fsync on
>> numerous files under-the-hood, which strains the IO system and can cut
>> into queries. If you really want to commit frequently, turning on compound
>> file format could help things, since that's 1 file to fsync instead of N, per
>> segment.
>>
>> Also, if you have a large merge running (turning on IW's infoStream
>> will tell you), this can cause the OS to swap pages out, unless you
>> set swappiness (if you're on Linux) to 0.
>>
>> Finally, beware of having too large a JVM max heap; you may accumulate
>> long-lived, uncollected garbage, which the OS may happily swap out
>> (since the pages are never touched), which then kills performance when
>> GC finally runs. I describe this here:
>> http://blog.mikemccandless.com/2011/04/just-say-no-to-swapping.html
>> It's good to leave some RAM for the OS to use as IO cache.
>>
>> Ideally, merging should not evict pages from the OS's buffer cache,
>> but unfortunately the low-level IO flags to control this (eg
>> fadvise/madvise) are not available in Java (I wrote about that here:
>> http://blog.mikemccandless.com/2010/06/lucene-and-fadvisemadvise.html).
>>
>> However, we have a GSoC student this summer working on the problem
>> (see https://issues.apache.org/jira/browse/LUCENE-2795), so after this
>> is done we'll have a NativeUnixDirectory impl that hopefully prevents
>> buffer cache eviction due to merging without you having to tweak
>> swappiness settings.
>>
>> Mike
>>
>> http://blog.mikemccandless.com
>>
>> On Sat, Apr 30, 2011 at 9:23 PM, Craig Stires <craig.sti...@gmail.com> wrote:
>>> Daniel,
>>>
>>> I've been able to post documents to Solr without degrading the performance
>>> of search. But, I did have to make some changes to the solrconfig.xml
>>> (ramBufferSize, mergeFactor, autoCommit, etc).
>>>
>>> What I found to be helpful was having a look at what was causing the OS
>>> to grind. If your system is swapping too much to disk, you can check if
>>> bumping up the RAM (-Xms512m -Xmx1024m) alleviates it. Even if this isn't
>>> the fix, you can at least isolate whether it's a memory issue or a disk
>>> I/O issue (e.g. running optimization on every commit).
>>>
>>> Also, it's worth having a look in your logs to see if the server is having
>>> complaints about memory or issues with your schema, or some other unexpected
>>> issue.
>>>
>>> A resource that has been helpful for me:
>>> http://wiki.apache.org/solr/SolrPerformanceFactors
>>>
>>> -----Original Message-----
>>> From: Daniel Huss [mailto:hussdl1985-solrus...@yahoo.de]
>>> Sent: Sunday, 1 May 2011 5:35 AM
>>> To: solr-user@lucene.apache.org
>>> Subject: Searching performance suffers tremendously during indexing
>>>
>>> Hi everyone,
>>>
>>> our Solr-based search is unresponsive while documents are being indexed.
>>> The documents to index (results of a DB query) are sent to Solr by a
>>> daemon in batches of varying size. The number of documents per batch may
>>> vary between one and several hundred thousand.
>>>
>>> Before investigating any further, I would like to ask if this can be
>>> considered an issue at all.
>>> I was expecting Solr to handle concurrent
>>> indexing/searching quite well; in fact this was one of the main reasons
>>> for choosing Solr over the searching capabilities of our RDBMS.
>>>
>>> Is searching performance *supposed* to drop while documents are being
>>> indexed?
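
P.S. Here is a rough sketch of the "flush at most every 60 seconds" idea mentioned
above, expressed with Solr's autoCommit setting in solrconfig.xml (which Craig also
pointed at) instead of an external queue. The maxDocs/maxTime values are only
placeholders we would still have to tune, not something we have tested:

    <!-- solrconfig.xml, inside <updateHandler>: let Solr batch commits itself.
         The values below are illustrative placeholders, not tuned settings. -->
    <autoCommit>
      <!-- commit after this many pending documents ... -->
      <maxDocs>10000</maxDocs>
      <!-- ... or at most once every 60 seconds (value is in ms),
           whichever comes first -->
      <maxTime>60000</maxTime>
    </autoCommit>

With something like that in place, the document streamer could keep sending
single-document updates and simply stop issuing explicit commits, leaving it to
Solr to decide when to flush.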