Erick -

Thanks for the insight. Does the filter cache just cache the internal document ids of the result set (as opposed to the documents themselves)? If so, am I correct in the following math:
10,000,000 document index
Internal document id is a 32-bit unsigned int
Max memory used by a single cache slot in the filter cache = 32 bits x 10,000,000 docs = 320,000,000 bits = 40,000,000 bytes, or about 38 MB

Of course, I realize there is some additional overhead if we're dealing with Integer objects as opposed to primitives -- and I'm way off if the internal document id is implemented as a long.

Also, does SOLR fail gracefully when an OOM occurs (e.g. the cache fails but the query still succeeds)?

Thanks!

Josh

On Thu, Aug 25, 2011 at 2:55 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> The pitfall of filter queries is also their strength. The results will
> be cached and re-used if possible. This will take some memory, of
> course. Depending upon how big your index is, this could be quite a lot.
>
> Yet another time/space tradeoff.... But yeah, use filter queries
> until you have OOMs, then get more memory <G>...
>
> Best
> Erick
>
> On Wed, Aug 24, 2011 at 8:07 PM, Joshua Harness <jkharnes...@gmail.com>
> wrote:
> > Shawn -
> >
> > Thanks for your reply. Given that my application is mainly used for
> > faceted search, would the following type of query make sense, or are
> > there other pitfalls to consider?
> >
> > q=*:*&fq=someField:someValue&fq=anotherField:anotherValue
> >
> > Thanks!
> >
> > Josh
> >
> > On Wed, Aug 24, 2011 at 4:48 PM, Shawn Heisey <s...@elyograg.org> wrote:
> >
> >> On 8/24/2011 2:02 PM, Joshua Harness wrote:
> >>
> >>> I've done some basic query performance testing on my SOLR instance,
> >>> which allows users to search via a faceted search interface. As such,
> >>> document relevancy is less important to me since I am performing
> >>> exact match searching. Using filter queries instead of a plain query
> >>> has yielded remarkable performance. However, I'm suspicious of
> >>> statements like 'always use filter queries since they are so much
> >>> faster'. In my experience, things are never so straightforward. Can
> >>> anybody provide any further guidance? What are the pitfalls of
> >>> relying heavily on filter queries? When would one want to use plain
> >>> vanilla SOLR queries as opposed to filter queries?
> >>>
> >>
> >> Completely separate from any performance consideration, the key to
> >> their usage lies in their name: They are filters. They are
> >> particularly useful in a faceted situation, because you can have more
> >> than one of them, and the overall result is the intersection (AND) of
> >> them all.
> >>
> >> When someone tells the interface to restrict their search by a facet,
> >> you can simply add a filter query with the field:value relating to
> >> that facet and reissue the query. If they decide to remove that
> >> restriction, you just have to remove the filter query. You don't have
> >> to try to combine the various pieces in the query, which means you'll
> >> have much less hassle with parentheses.
> >>
> >> If you need a union (OR) operation with your filters, you'll have to
> >> use a more complex construction within a single filter query, or not
> >> use them at all.
> >>
> >> Thanks,
> >> Shawn
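
P.S. As a sanity check on the estimate at the top of this message, here is the same arithmetic as a small script. It is only a sketch of my own worst-case model: it assumes one fixed-width id is stored for every document in the index, and the function names are made up for illustration. It does not describe how SOLR actually lays out a filter cache entry, which is exactly what I'm asking about.

```python
def filter_cache_entry_bytes(num_docs: int, bits_per_id: int = 32) -> int:
    """Worst-case bytes for one cache entry under the ASSUMED model:
    one fixed-width id stored for every document in the index."""
    total_bits = num_docs * bits_per_id
    return total_bits // 8  # 8 bits per byte


def to_mib(num_bytes: int) -> float:
    """Convert bytes to mebibytes (1 MiB = 1024 * 1024 bytes)."""
    return num_bytes / (1024 * 1024)


if __name__ == "__main__":
    docs = 10_000_000
    # 32-bit ids are my guess; 64-bit covers the "implemented as a long" case.
    for bits in (32, 64):
        size = filter_cache_entry_bytes(docs, bits)
        print(f"{bits}-bit ids: {size:,} bytes = {to_mib(size):.1f} MiB")
```

With 32-bit ids this gives 40,000,000 bytes (about 38.1 MiB), matching the figure above; with 64-bit ids the estimate doubles to 80,000,000 bytes (about 76.3 MiB). Per-object overhead for boxed Integer values is not included.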