Hi,

We've just found a very similar issue at a client installation. They have around 27 million documents and are faceting on high-cardinality fields. They are unhappy both with query performance and with the server hardware needed to make that performance acceptable.

Last night we noticed that the filter cache had a pretty low hit rate and seemed to be filling up with many unexpected items, even though we were testing with only a *single* actual filter query. Diagnosing this with showItems enabled, so that cache entries are listed in the Solr admin statistics, we could see entries relating to facets. This was despite being sure that we were using the default facet.method=fc setting, which should prevent filters being constructed. We're thus seeing similar cache pollution to Ken and Anca.
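To illustrate why this kind of pollution hurts the hit rate, here is a rough toy model in Python. It is only a sketch under stated assumptions: a plain LRU cache stands in for the filter cache, the key names are made up, and none of this is Solr's actual cache code. Each simulated query looks up one reused filter entry plus a number of facet entries that are never requested again.

```python
from collections import OrderedDict


class LRUCache:
    """Toy LRU cache that counts lookups and hits (not Solr's implementation)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()
        self.lookups = 0
        self.hits = 0

    def get_or_put(self, key):
        """Look up key; on a miss, insert it and evict the least recently used entry."""
        self.lookups += 1
        if key in self.entries:
            self.hits += 1
            self.entries.move_to_end(key)  # mark as most recently used
            return True
        self.entries[key] = True
        if len(self.entries) > self.capacity:
            self.entries.popitem(last=False)  # drop the oldest entry
        return False

    def hit_ratio(self):
        return self.hits / self.lookups if self.lookups else 0.0


def simulate(num_queries, facet_entries_per_query, capacity):
    """Each query reuses one filter and adds some single-use facet entries."""
    cache = LRUCache(capacity)
    for q in range(num_queries):
        cache.get_or_put("fq:category:books")  # hypothetical reused filter query
        for i in range(facet_entries_per_query):
            # hypothetical one-off entry, unique per query, never looked up again
            cache.get_or_put(f"item_facetfield:value_{q}_{i}:")
    return cache.hit_ratio()


if __name__ == "__main__":
    for per_query in (0, 4, 20):
        print(per_query, simulate(100, per_query, capacity=10))
```

In this model the single-use entries lower the hit ratio simply by missing, and once a query creates more of them than the cache can hold, the reused filter entry is evicted before the next query and never hits at all.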
We're trying a different type of cache (LFUCache) now and may also try tweaking cache sizes to help, as the filter creation seems to be something we can't easily get round.

cheers

Charlie
Flax
www.flax.co.uk

On 18 October 2013 14:32, Anca Kopetz <anca.kop...@kelkoo.com> wrote:

> Hi Ken,
>
> Have you managed to find out why these entries were stored into
> filterCache and if they have an impact on the hit ratio ?
> We noticed the same problem, there are entries of this type :
> item_+(+(title:western^10.0 | ... in our filterCache.
>
> Thanks,
> Anca
>
>
> On 07/02/2013 09:01 PM, Ken Krugler wrote:
>
> Hi all,
>
> After upgrading from Solr 3.5 to 4.2.1, I noticed our filterCache hit
> ratio had dropped significantly.
>
> Previously it was at 95+%, but now it's < 50%.
>
> I enabled recording 100 entries for debugging, and in looking at them it
> seems that edismax (and faceting) is creating entries for me.
>
> This is in a sharded setup, so it's a distributed search.
>
> If I do a search for the string "bogus text" using edismax on two fields,
> I get an entry in each of the shard's filter caches that looks like:
>
> item_+(((field1:bogus | field2:bogu) (field1:text | field2:text))~2):
>
> Is this expected?
>
> I have a similar situation happening during faceted search, even though my
> fields are single-value/untokenized strings, and I'm not using the enum
> facet method.
>
> But I'll get many, many entries in the filterCache for facet values, and
> they all look like "item_<facet field>:<facet value>:"
>
> The net result of the above is that even with a very big filterCache size
> of 2K, the hit ratio is still only 60%.
>
> Thanks for any insights,
>
> -- Ken
>
> --------------------------
> Ken Krugler
> +1 530-210-6378
> http://www.scaleunlimited.com
> custom big data solutions & training
> Hadoop, Cascading, Cassandra & Solr