Multi-value faceting is fast at query time. However, because its cache is
built per-multi-segment (top level), each soft commit flushes the cache, and
it is rebuilt on the first query afterward.  As the index grows, the cache
becomes more expensive to build and consumes more RAM.
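
To illustrate the per-segment alternative discussed in this thread, here is a
minimal sketch in Python. It is not Solr or Lucene code: the Postings model,
uninvert(), and PerSegmentFacetCache are hypothetical names invented for
illustration, and a segment is assumed to be a plain term -> doc ids map. The
idea is that only segments not seen before are un-inverted, and the
per-segment counts are summed at query time.

```python
from collections import Counter
from typing import Dict, List, Set

# Hypothetical model: a segment is an inverted index for one multi-valued
# field, mapping each term to the segment-local ids of docs containing it.
Postings = Dict[str, List[int]]


def uninvert(postings: Postings) -> Dict[int, List[str]]:
    """Build the doc -> terms mapping for one segment (the expensive step)."""
    doc_terms: Dict[int, List[str]] = {}
    for term, doc_ids in postings.items():
        for doc_id in doc_ids:
            doc_terms.setdefault(doc_id, []).append(term)
    return doc_terms


class PerSegmentFacetCache:
    """Caches the doc -> terms mapping per segment instead of per index."""

    def __init__(self) -> None:
        self._doc_terms: Dict[str, Dict[int, List[str]]] = {}
        self.builds = 0  # how many segments have been un-inverted so far

    def facet_counts(self, segments: Dict[str, Postings],
                     hits: Dict[str, Set[int]]) -> Counter:
        """Count terms over the matching docs, summing per-segment counts."""
        # Evict entries for segments that no longer exist (e.g. merged away).
        for name in list(self._doc_terms):
            if name not in segments:
                del self._doc_terms[name]
        total: Counter = Counter()
        for name, postings in segments.items():
            if name not in self._doc_terms:
                # Only segments not seen before pay the build cost.
                self._doc_terms[name] = uninvert(postings)
                self.builds += 1
            doc_terms = self._doc_terms[name]
            for doc_id in hits.get(name, set()):
                total.update(doc_terms.get(doc_id, []))
        return total


# Usage: one existing segment, then a soft commit that adds a second one.
segments = {"seg0": {"red": [0, 1], "blue": [1]}}
cache = PerSegmentFacetCache()
cache.facet_counts(segments, {"seg0": {0, 1}})  # un-inverts seg0
segments["seg1"] = {"red": [0]}                 # soft commit: new segment
counts = cache.facet_counts(segments, {"seg0": {1}, "seg1": {0}})
```

In this sketch the second query un-inverts only seg1, whereas a top-level
cache would rebuild the mapping for the whole index after the commit. The
trade-off Yonik mentions still applies: every query has to visit and merge
each segment's counts.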

I am not aware of any open Jira issues with recent activity for adding this
feature to Solr.

On Sat, Jul 7, 2012 at 8:32 PM, Andy <angelf...@yahoo.com> wrote:

> Jason,
>
> If I just use stock Solr 4.0 without modifying the source code, does that
> mean multi-value faceting will be very slow when I'm constantly
> inserting/updating documents?
>
> Which open source library are you referring to? Will Solr adopt this
> per-segment approach any time soon?
>
> Thanks
>
>
> ________________________________
>  From: Jason Rutherglen <jason.rutherg...@gmail.com>
> To: solr-user@lucene.apache.org
> Sent: Saturday, July 7, 2012 2:05 PM
> Subject: Re: Nrt and caching
>
> Andy,
>
> You'd need to hack on the Solr code, specifically the SimpleFacets class.
> Solr uses UnInvertedField to build an in memory doc -> terms mapping, which
> would need to be cached per-segment.  Then you'd need to aggregate the
> resultant per-segment counts.
>
> There is another open source library that has taken the same basic faceting
> approach (it is per-segment) and could arguably be faster; however, it
> is built for Lucene 3.x at the moment.
>
> On Sat, Jul 7, 2012 at 12:21 PM, Andy <angelf...@yahoo.com> wrote:
>
> > So if I want to use multi-value faceting with NRT, I'd need to convert the
> > cache to per-segment? How do I do that?
> >
> > Thanks.
> >
> >
> > ________________________________
> >  From: Jason Rutherglen <jason.rutherg...@gmail.com>
> > To: solr-user@lucene.apache.org
> > Sent: Saturday, July 7, 2012 11:32 AM
> > Subject: Re: Nrt and caching
> >
> > The field caches, which are used for sorting and basic [slower] facets,
> > are per-segment.  The result set, document, filter, and multi-value facet
> > caches are [in Solr] per-multi-segment.
> >
> > Of these, the document, filter, and multi-value facet caches could be
> > converted to be [performant] per-segment, as with some other Apache
> > licensed Lucene based search engines.
> >
> > On Sat, Jul 7, 2012 at 10:42 AM, Yonik Seeley <yo...@lucidimagination.com> wrote:
> >
> > > On Sat, Jul 7, 2012 at 9:59 AM, Jason Rutherglen
> > > <jason.rutherg...@gmail.com> wrote:
> > > > Currently the caches are stored per-multiple-segments, meaning after each
> > > > 'soft' commit, the cache(s) will be purged.
> > >
> > > Depends which caches.  Some caches are per-segment, and some caches
> > > are top level.
> > > It's also a trade-off... for some things, per-segment data structures
> > > would indeed turn around quicker on a reopen, but every query would be
> > > slower for it.
> > >
> > > -Yonik
> > > http://lucidimagination.com
> > >
> >
>
