Hi Otis,

Yes, the query results cache is just about worthless for us; I guess our set of user queries is too diverse. The business unit has also decided to let bots crawl our search pages, so that doesn't help either. I turned the cache way down but decided to keep it, because my understanding was that it would still help a user going from page 1 to page 2 of a search. Is that true?
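For reference, our queryResultCache is declared in solrconfig.xml along these lines (the sizes here are illustrative, not our exact production values):

    <!-- illustrative sizes, not our exact production values -->
    <queryResultCache class="solr.LRUCache"
                      size="512"
                      initialSize="512"
                      autowarmCount="6"/>

    <!-- number of document IDs cached per query result set -->
    <queryResultWindowSize>20</queryResultWindowSize>

My understanding is that queryResultWindowSize makes Solr cache a window of document IDs per query, so a page-2 request (e.g. start=10&rows=10) can be answered from the window cached by the page-1 request, as long as the window covers it. I've also pasted the cache and optimize snippets we use at the bottom of this mail, below the quoted thread.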
Thanks
Robi

-----Original Message-----
From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
Sent: Monday, June 17, 2013 6:39 PM
To: solr-user@lucene.apache.org
Subject: Re: yet another optimize question

Hi Robi,

This goes against the original problem of getting OOMEs, but it looks like each of your Solr caches could be a little bigger if you want to eliminate evictions, with the query results one possibly not being worth keeping if you can't get the hit % up enough.

Otis
--
Solr & ElasticSearch Support -- http://sematext.com/


On Mon, Jun 17, 2013 at 2:21 PM, Petersen, Robert <robert.peter...@mail.rakuten.com> wrote:
> Hi Otis,
>
> Right, I didn't restart the JVMs, except on the one slave where I was experimenting with G1GC on the 1.7.0_21 JRE. Also, some time ago I made all our caches small enough to keep us from getting OOMs while still keeping a good hit rate. Our index has about 50 fields, which are mostly int IDs, and there are some dynamic fields as well. These dynamic fields can be used for custom faceting. We have some standard facets we always facet on, and other dynamic facets which are only used if the query is filtering on a particular category. There are hundreds of these fields, but since they apply to only a small subset of the overall index they are very sparsely populated. With CMS GC we get a sawtooth on the old generation (I guess every replication and commit causes its usage to drop down to 10GB or so), and it seems to be the old generation which is the main space consumer. With G1GC, the memory map looked totally different! I was a little lost looking at memory consumption with that GC. Maybe I'll try it again now that the index is a bit smaller than it was last time I tried it. After four days without running an optimize, it is now 21GB. BTW, our indexing speed is mostly bound by the DB, so reducing the segments might be OK...
>
> Here is a quick snapshot of one slave's memory map as reported by PSI-Probe, but unfortunately I guess I can't send the history graphics to the solr-user list to show their changes over time:
>
> Name                Used       Committed  Max        Initial    Group
> Par Survivor Space  20.02 MB   108.13 MB  108.13 MB  108.13 MB  HEAP
> CMS Perm Gen        42.29 MB   70.66 MB   82.00 MB   20.75 MB   NON_HEAP
> Code Cache          9.73 MB    9.88 MB    48.00 MB   2.44 MB    NON_HEAP
> CMS Old Gen         20.22 GB   30.94 GB   30.94 GB   30.94 GB   HEAP
> Par Eden Space      42.20 MB   865.31 MB  865.31 MB  865.31 MB  HEAP
> Total               20.33 GB   31.97 GB   32.02 GB   31.92 GB   TOTAL
>
> And here are our current cache stats from a random slave:
>
> name: queryResultCache
> class: org.apache.solr.search.LRUCache
> version: 1.0
> description: LRU Cache(maxSize=488, initialSize=6, autowarmCount=6, regenerator=org.apache.solr.search.SolrIndexSearcher$3@461ff4c3)
> stats: lookups : 619
> hits : 36
> hitratio : 0.05
> inserts : 592
> evictions : 101
> size : 488
> warmupTime : 2949
> cumulative_lookups : 681225
> cumulative_hits : 73126
> cumulative_hitratio : 0.10
> cumulative_inserts : 602396
> cumulative_evictions : 428868
>
> name: fieldCache
> class: org.apache.solr.search.SolrFieldCacheMBean
> version: 1.0
> description: Provides introspection of the Lucene FieldCache, this is **NOT** a cache that is managed by Solr.
> stats: entries_count : 359
>
> name: documentCache
> class: org.apache.solr.search.LRUCache
> version: 1.0
> description: LRU Cache(maxSize=2048, initialSize=512, autowarmCount=10, regenerator=null)
> stats: lookups : 12710
> hits : 7160
> hitratio : 0.56
> inserts : 5636
> evictions : 3588
> size : 2048
> warmupTime : 0
> cumulative_lookups : 10590054
> cumulative_hits : 6166913
> cumulative_hitratio : 0.58
> cumulative_inserts : 4423141
> cumulative_evictions : 3714653
>
> name: fieldValueCache
> class: org.apache.solr.search.FastLRUCache
> version: 1.0
> description: Concurrent LRU Cache(maxSize=280, initialSize=280, minSize=252, acceptableSize=266, cleanupThread=false, autowarmCount=6, regenerator=org.apache.solr.search.SolrIndexSearcher$1@143eb77a)
> stats: lookups : 1725
> hits : 1481
> hitratio : 0.85
> inserts : 122
> evictions : 0
> size : 128
> warmupTime : 4426
> cumulative_lookups : 3449712
> cumulative_hits : 3281805
> cumulative_hitratio : 0.95
> cumulative_inserts : 83261
> cumulative_evictions : 3479
>
> name: filterCache
> class: org.apache.solr.search.FastLRUCache
> version: 1.0
> description: Concurrent LRU Cache(maxSize=248, initialSize=12, minSize=223, acceptableSize=235, cleanupThread=false, autowarmCount=10, regenerator=org.apache.solr.search.SolrIndexSearcher$2@36e831d6)
> stats: lookups : 3990
> hits : 3831
> hitratio : 0.96
> inserts : 239
> evictions : 26
> size : 244
> warmupTime : 1
> cumulative_lookups : 5745011
> cumulative_hits : 5496150
> cumulative_hitratio : 0.95
> cumulative_inserts : 351485
> cumulative_evictions : 276308
>
> -----Original Message-----
> From: Otis Gospodnetic [mailto:otis.gospodne...@gmail.com]
> Sent: Saturday, June 15, 2013 5:52 AM
> To: solr-user@lucene.apache.org
> Subject: Re: yet another optimize question
>
> Hi Robi,
>
> I'm going to guess you are seeing a smaller heap also simply because you restarted the JVM recently (hm, you don't say you restarted, so maybe I'm making this up). If you are indeed indexing continuously, then you shouldn't optimize. Lucene will merge segments itself. A lower mergeFactor will force it to do that more often (meaning slower indexing, a bigger IO hit when segments are merged, more per-segment data that Lucene/Solr need to read from each segment for faceting and such, etc.), so maybe you shouldn't mess with that. Do you know what your caches are like in terms of size, hit %, and evictions? We've recently seen people set those to a few hundred K entries or even higher, which can eat a lot of heap. We have had luck with G1 recently, too. Maybe you can run jstat, see which of the memory pools get filled up, and change/increase the appropriate JVM param based on that? How many fields do you index, facet, or group on?
>
> Otis
> --
> Performance Monitoring - http://sematext.com/spm/index.html
> Solr & ElasticSearch Support -- http://sematext.com/
>
>
> On Fri, Jun 14, 2013 at 8:04 PM, Petersen, Robert <robert.peter...@mail.rakuten.com> wrote:
>> Hi guys,
>>
>> We're on Solr 3.6.1 and I've read the discussions about whether to optimize or not to optimize. I decided to try not optimizing our index, as was recommended. We have a little over 15 million docs in our biggest index and a 32GB heap for our JVM. So without the optimizes, the index folder seemed to grow in size and in number of files.
>> There seemed to be an upper limit, but eventually it hit 300 files consuming 26GB of space, and that seemed to push our slave farm over the edge: we started getting the dreaded OOMs. We have continuous indexing activity, so I stopped the indexer and manually ran an optimize, which brought the index down to 9 files consuming 15GB of space, and our slave farm went back to acceptable memory usage. Our merge factor is 10, and we're on Java 7. Before optimizing, I tried going to the latest JVM on one slave machine and switching from CMS GC to G1GC, but it hit OOM conditions even faster. So it seems like I have to continue to schedule a regular optimize. It has now been a couple of days since the optimize and the index is slowly growing again, now up to a bit over 19GB. What do you guys think? Did I miss something that would let us run without optimizing?
>>
>> Robert (Robi) Petersen
>> Senior Software Engineer
>> Search Department
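P.S. For anyone finding this thread in the archives: the cache stats above come from declarations in solrconfig.xml along these lines. The sizes shown here are illustrative rather than our exact production values; per Otis's point, raising size is what eliminates evictions, at the cost of more heap.

    <!-- illustrative sizes, not our exact production values -->
    <filterCache class="solr.FastLRUCache"
                 size="512"
                 initialSize="512"
                 autowarmCount="10"/>

    <!-- the documentCache can't really be autowarmed (note
         regenerator=null in the stats above), so autowarmCount
         is left at 0 here -->
    <documentCache class="solr.LRUCache"
                   size="4096"
                   initialSize="1024"
                   autowarmCount="0"/>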
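And the manual optimize I mentioned is just the standard XML update message posted to the master's /update handler, something like this (the attribute values are examples, not a recommendation):

    <!-- maxSegments=1 merges down to a single segment, i.e. a full optimize -->
    <optimize waitFlush="true" waitSearcher="true" maxSegments="1"/>

The mergeFactor Otis refers to lives in the <indexDefaults> section of solrconfig.xml:

    <mergeFactor>10</mergeFactor>

A lower value means fewer segments for searches to touch but more frequent merge IO during indexing, which is the trade-off he describes above.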