OK, scratch autowarming. In fact, your autowarm counts are quite high; I suspect they're far past "diminishing returns". I usually see autowarm counts < 64, but YMMV.
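Just as a ballpark (the numbers below are purely illustrative, not tuned for your index, so measure before and after any change), I'd expect settings closer to something like:

    <filterCache class="solr.FastLRUCache"
                 size="512"
                 initialSize="512"
                 autowarmCount="32"/>

    <queryResultCache class="solr.LRUCache"
                      size="512"
                      initialSize="512"
                      autowarmCount="32"/>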
Are you seeing actual hit ratios that are decent on those caches
(admin UI>>plugins/stats>>cache>>...)? And your cache sizes are also quite
high in my experience; it's probably worth measuring the utilization there
as well. And, BTW, your filterCache can occupy up to 2G of your heap (each
filterCache entry can be a bitset of roughly maxDoc/8 bytes, so 5000 entries
adds up quickly). That's probably not your central problem, but it's
something to consider.

So I don't know why your queries are taking that long; my assumption is
that they may simply be very complex queries, or you have grouping on,
or..... I guess the next thing I'd do is start trying to characterize which
queries are slow. Grouping? Pivot faceting? 'Cause from everything you've
said so far it's surprising that you're seeing queries take this long;
something doesn't feel right, but what it is I don't have a clue.

Best,
Erick

On Fri, Jul 22, 2016 at 9:15 AM, Rallavagu <rallav...@gmail.com> wrote:
>
>
> On 7/22/16 8:34 AM, Erick Erickson wrote:
>>
>> Mostly this sounds like a problem that could be cured with
>> autowarming. But two things are conflicting here:
>> 1> you say "We have a requirement to have updates available immediately
>> (NRT)"
>> 2> your docs aren't available for 120 seconds given your autoSoftCommit
>> settings unless you're specifying
>> -Dsolr.autoSoftCommit.maxTime=some_other_interval
>> as a startup parameter.
>>
> Yes. We have 120 seconds available.
>
>> So assuming you really do have a 120 second autocommit time, you should
>> be able to smooth out the spikes by appropriate autowarming. You also
>> haven't indicated what your filterCache and queryResultCache settings
>> are. They come with a default of 0 for autowarm. But what is their size?
>> And do you see a correlation between the longer queries and the 2 minute
>> commit intervals? And do you have some test harness in place (jmeter
>> works well) to demonstrate that differences in your configuration help
>> or hurt? I can't over-emphasize the importance of this; otherwise, if
>> you rely on somebody simply saying "it's slow" you have no way to know
>> what effect changes have.
>
>
> Here is the cache configuration.
>
> <filterCache class="solr.FastLRUCache"
>              size="5000"
>              initialSize="5000"
>              autowarmCount="500"/>
>
> <!-- Query Result Cache
>
>      Caches results of searches - ordered lists of document ids
>      (DocList) based on a query, a sort, and the range of documents
>      requested.
> -->
> <queryResultCache class="solr.LRUCache"
>                   size="20000"
>                   initialSize="20000"
>                   autowarmCount="500"/>
>
> <!-- Document Cache
>
>      Caches Lucene Document objects (the stored fields for each
>      document). Since Lucene internal document ids are transient,
>      this cache will not be autowarmed.
> -->
> <documentCache class="solr.LRUCache"
>                size="100000"
>                initialSize="100000"
>                autowarmCount="0"/>
>
> We have run load tests using JMeter pointing directly at Solr, and also
> tests that go through the application that queries Solr. In both cases,
> we have noticed slow responses.
>
> Thanks
>
>>
>> Best,
>> Erick
>>
>>
>> On Thu, Jul 21, 2016 at 11:22 PM, Shawn Heisey <apa...@elyograg.org>
>> wrote:
>>>
>>> On 7/21/2016 11:25 PM, Rallavagu wrote:
>>>>
>>>> There is no other software running on the system and it is completely
>>>> dedicated to Solr. It is running on Linux. Here is the full version.
>>>>
>>>> Linux version 3.8.13-55.1.6.el7uek.x86_64
>>>> (mockbu...@ca-build56.us.oracle.com) (gcc version 4.8.3 20140911 (Red
>>>> Hat 4.8.3-9) (GCC) ) #2 SMP Wed Feb 11 14:18:22 PST 2015
>>>
>>> Run the top program, press shift-M to sort by memory usage, and then
>>> grab a screenshot of the terminal window. Share it with a site like
>>> dropbox, imgur, or something similar, and send the URL. You'll end up
>>> with something like this:
>>>
>>> https://www.dropbox.com/s/zlvpvd0rrr14yit/linux-solr-top.png?dl=0
>>>
>>> If you know what to look for, you can figure out all the relevant memory
>>> details from that.
>>>
>>> Thanks,
>>> Shawn
>>>
>
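P.S. If capturing an interactive top session is awkward, a rough batch-mode
equivalent of what Shawn describes (assuming a procps-ng top new enough to
support the -o sort flag, which RHEL/OEL 7 ships) is:

    top -b -n 1 -o %MEM | head -n 30

That prints the memory summary lines plus the largest processes by memory,
which you can paste straight into a reply.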