Erick -

Thanks for the insight. The filter cache just caches the internal
document ids of the result set (as opposed to the documents themselves),
correct? If so, is the following math right?

10,000,000 document index
Internal Document id is 32 bit unsigned int
Max Memory Used by a single cache slot in the filter cache = 32 bits x
10,000,000 docs = 320,000,000 bits = 40,000,000 bytes, or about 38 MB

Of course, I realize there is some additional overhead if we're dealing with
Integer objects as opposed to primitives -- and I'm way off if the internal
document id is implemented as a long.
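
As a quick sanity check of the arithmetic above, here is a minimal sketch in Python. The function name `cache_entry_bytes` is made up for illustration. The worst case of one stored id per document is the assumption from the math above, not a statement about how Solr actually represents a cache entry. Per-object overhead for boxed Integer values is not modeled.

```python
def cache_entry_bytes(num_docs: int, bits_per_id: int) -> int:
    """Worst-case bytes for one cache entry that stores an id for every document.

    Assumes one fixed-width primitive id per document and no object overhead.
    """
    return num_docs * bits_per_id // 8


NUM_DOCS = 10_000_000
MIB = 1024 * 1024  # bytes per mebibyte

int_bytes = cache_entry_bytes(NUM_DOCS, 32)   # ids stored as 32-bit ints
long_bytes = cache_entry_bytes(NUM_DOCS, 64)  # ids stored as 64-bit longs

print(f"32-bit ids: {int_bytes:,} bytes = {int_bytes / MIB:.1f} MiB")
print(f"64-bit ids: {long_bytes:,} bytes = {long_bytes / MIB:.1f} MiB")
```

Under those assumptions the 32-bit case comes to 40,000,000 bytes, and the 64-bit case is exactly double that.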

Also, does SOLR fail gracefully when an OOM occurs (e.g. the cache fails but
the query still succeeds)?

Thanks!

Josh

On Thu, Aug 25, 2011 at 2:55 PM, Erick Erickson <erickerick...@gmail.com> wrote:

> The pitfall of filter queries is also their strength. The results will be
> cached and re-used if possible. This will take some memory,
> of course. Depending upon how big your index is, this could
> be quite a lot.
>
> Yet another time/space tradeoff.... But yeah, use filter queries
> until you have OOMs, then get more memory <G>...
>
> Best
> Erick
>
> On Wed, Aug 24, 2011 at 8:07 PM, Joshua Harness <jkharnes...@gmail.com>
> wrote:
> > Shawn -
> >
> >     Thanks for your reply. Given that my application is mainly used as
> > faceted search, would the following types of queries make sense or are
> > there other pitfalls to consider?
> >
> > q=*:*&fq=someField:someValue&fq=anotherField:anotherValue
> >
> > Thanks!
> >
> > Josh
> >
> > On Wed, Aug 24, 2011 at 4:48 PM, Shawn Heisey <s...@elyograg.org> wrote:
> >
> >> On 8/24/2011 2:02 PM, Joshua Harness wrote:
> >>
> >>>      I've done some basic query performance testing on my SOLR
> >>> instance, which allows users to search via a faceted search interface.
> >>> As such, document relevancy is less important to me since I am
> >>> performing exact match searching. Comparing using filter queries with
> >>> a plain query has yielded remarkable performance.  However, I'm
> >>> suspicious of statements like 'always use filter queries since they
> >>> are so much faster'. In my experience, things are never so
> >>> straightforward. Can anybody provide any further guidance? What are
> >>> the pitfalls of relying heavily on filter queries? When would one want
> >>> to use plain vanilla SOLR queries as opposed to filter queries?
> >>>
> >>
> >> Completely separate from any performance consideration, the key to
> >> their usage lies in their name:  They are filters.  They are
> >> particularly useful in a faceted situation, because you can have more
> >> than one of them, and the overall result is the intersection (AND) of
> >> them all.
> >>
> >> When someone tells the interface to restrict their search by a facet,
> >> you can simply add a filter query with the field:value relating to
> >> that facet and reissue the query.  If they decide to remove that
> >> restriction, you just have to remove the filter query.  You don't have
> >> to try and combine the various pieces in the query, which means you'll
> >> have much less hassle with parentheses.
> >>
> >> If you need a union (OR) operation with your filters, you'll have to
> >> use more complex construction within a single filter query, or not use
> >> them at all.
> >>
> >> Thanks,
> >> Shawn
> >>
> >>
> >
>
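
The add-a-filter / remove-a-filter pattern described in the quoted thread can be sketched as follows. This is only an illustration: the helper name `build_query` is hypothetical, the field names and values are the placeholders from the example query in the thread, and it only builds the query string (no Solr host or request handler is assumed).

```python
from urllib.parse import parse_qs, urlencode


def build_query(filters: dict[str, str]) -> str:
    """Build a query string with a match-all q and one fq per active facet.

    Each fq is a separate parameter; per the thread, the overall result
    is the intersection (AND) of all of them.
    """
    params = [("q", "*:*")]
    params += [("fq", f"{field}:{value}") for field, value in filters.items()]
    return urlencode(params)  # percent-encodes characters such as ':' and '*'


# The user has restricted the search by two facets.
filters = {"someField": "someValue", "anotherField": "anotherValue"}
with_both = build_query(filters)

# The user removes one restriction: drop that filter and reissue the query.
filters.pop("anotherField")
with_one = build_query(filters)

print(parse_qs(with_both)["fq"])
print(parse_qs(with_one)["fq"])
```

Keeping each facet as its own fq means adding or removing a restriction is a dictionary update rather than string surgery on a single combined query.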
