Re: Faceted Search Slows Down as index gets larger

2010-12-16 Thread Yonik Seeley
Another thing you can try is trunk. This specific case has been improved by an order of magnitude recently. The case that has been sped up is initial population of the filterCache, or when the filterCache can't hold all of the unique values, or when faceting is configured to not use the filterCache

Re: Faceted Search Slows Down as index gets larger

2010-12-16 Thread Furkan Kuru
I am sorry for reviving this thread after 6 months, but we still have problems with faceted search on full-text fields. We try to get the most frequent words in a text field across documents created within 1 hour. The faceted search takes too much time even though the number of matching documents (created_at within 1

Re: Faceted Search Slows Down as index gets larger

2010-06-09 Thread Lance Norskog
case? > If not, is there any other way to mitigate the cache re-building problem of > facet search? > > --- On Sun, 6/6/10, Yonik Seeley wrote: > >> From: Yonik Seeley >> Subject: Re: Faceted Search Slows Down as index gets larger >> To: solr-user@lucene.apache.org

Re: Faceted Search Slows Down as index gets larger

2010-06-06 Thread Andy
core strategy still work in this case? If not, is there any other way to mitigate the cache re-building problem of facet search?

--- On Sun, 6/6/10, Yonik Seeley wrote:
> From: Yonik Seeley
> Subject: Re: Faceted Search Slows Down as index gets larger
> To: solr-user@lucene.apa

Re: Faceted Search Slows Down as index gets larger

2010-06-06 Thread Furkan Kuru
OK, I will have a look at distributed search, multi-core Solr solution.

Thank you, Yonik.

On Sun, Jun 6, 2010 at 8:54 PM, Yonik Seeley wrote:
> On Sun, Jun 6, 2010 at 1:12 PM, Furkan Kuru wrote:
> > We try to provide real-time search, so the index changes almost every minute.
> >

Re: Faceted Search Slows Down as index gets larger

2010-06-06 Thread John Wang
Using the Zoie/Bobo combination gives you realtime faceting. (Lucene based) http://sna-projects.com/zoie/ http://sna-projects.com/bobo/ wiki write-up: http://snaprojects.jira.com/wiki/display/BOBO/Realtime+Faceting+with+Zoie We can take this over to the zoie/bobo mailing list if you have questio

Re: Faceted Search Slows Down as index gets larger

2010-06-06 Thread Yonik Seeley
On Sun, Jun 6, 2010 at 1:12 PM, Furkan Kuru wrote:
> We try to provide real-time search, so the index changes almost every minute.
>
> We commit after every 100 documents received.
>
> The facet search is executed every 5 minutes.

OK, that's the problem - pretty much every facet search is reb

Re: Faceted Search Slows Down as index gets larger

2010-06-06 Thread Furkan Kuru
We try to provide real-time search, so the index changes almost every minute. We commit after every 100 documents received. The facet search is executed every 5 minutes. Here are the stats after a facet search with the normal facet.method=fc (it took 95 seconds): *name: * fieldValueCache *cl

Re: Faceted Search Slows Down as index gets larger

2010-06-06 Thread Yonik Seeley
On Sun, Jun 6, 2010 at 7:38 AM, Furkan Kuru wrote:
> facet.limit = default value 100
> facet.minCount is 1
>
> The document count that matches the query is 8-10K on average. I did not
> calculate the terms (maybe using facet.limit=-1 and facet.minCount=1)
>
> My index entirely fits into memo

Re: Faceted Search Slows Down as index gets larger

2010-06-06 Thread Furkan Kuru
eturning the facet > counts of all 1M of facet terms, or did you limit the number of facet terms > returned to a small number? > > Also did your entire index fit within RAM? > > > --- On Sat, 6/5/10, Furkan Kuru wrote: > > > From: Furkan Kuru > > Subject: Re: F

Re: Faceted Search Slows Down as index gets larger

2010-06-05 Thread Andy
all 1M of facet terms, or did you limit the number of facet terms returned to a small number? Also did your entire index fit within RAM?

--- On Sat, 6/5/10, Furkan Kuru wrote:
> From: Furkan Kuru
> Subject: Re: Faceted Search Slows Down as index gets larger
> To: solr-user@lucene.a

Re: Faceted Search Slows Down as index gets larger

2010-06-05 Thread Furkan Kuru
The documents' full-text fields are 140 characters long (tweets). Actually, I had looked at those parameters and thought no change was necessary because the terms per document would be few and the unique term count was nearly 1 M. I don't know exactly, but the average term count per document text can be 10

Re: Faceted Search Slows Down as index gets larger

2010-06-04 Thread Yonik Seeley
On Fri, Jun 4, 2010 at 7:33 PM, Andy wrote:
> Yonik,
>
> Just curious: why does using enum improve the facet performance?
>
> Furkan was faceting on a text field with each word being a facet value. I'd
> imagine that'd mean there's a large number of facet values. According to the
> documentation

Re: Faceted Search Slows Down as index gets larger

2010-06-04 Thread Andy
FacetParameters#facet.method) facet.method=fc is faster when a field has many unique terms. So how come enum, not fc, is faster in this case? Also, why use the filterCache less?

Thanks
Andy

--- On Fri, 6/4/10, Furkan Kuru wrote:
> From: Furkan Kuru
> Subject: Re: Faceted Search Slows Down as i

Re: Faceted Search Slows Down as index gets larger

2010-06-04 Thread Furkan Kuru
I am using version 1.4. I have tried your suggestion; it takes around 25-30 seconds now.

Thank you,

On Fri, Jun 4, 2010 at 5:54 PM, Yonik Seeley wrote:
> Faceting on a full-text field is hard.
> What version of Solr are you using?
>
> If it's 1.4 or later, try setting
> facet.method=enum
>
>

Re: Faceted Search Slows Down as index gets larger

2010-06-04 Thread Yonik Seeley
Faceting on a full-text field is hard. What version of Solr are you using?

If it's 1.4 or later, try setting
facet.method=enum

And to use the filterCache less, try
facet.enum.cache.minDf=100

-Yonik
http://www.lucidimagination.com

On Fri, Jun 4, 2010 at 10:31 AM, Furkan Kuru wrote:
> Hello,
>
>
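
For readers who want to try the two parameters suggested above, here is a minimal sketch of how such a request URL might be assembled. It is only an illustration: the base URL, the field name "text", the match-all query and rows=0 are assumptions that do not come from the thread, and build_facet_url is a hypothetical helper. The facet.limit and facet.mincount values mirror the settings reported elsewhere in the thread.

```python
from urllib.parse import urlencode


def build_facet_url(base_url, query, field, min_df=100, limit=100):
    """Build a Solr select URL that facets on `field` using facet.method=enum.

    Hypothetical helper for illustration; not part of Solr or any client library.
    """
    params = [
        ("q", query),
        ("rows", 0),                         # assumption: only facet counts are wanted
        ("facet", "true"),
        ("facet.field", field),
        ("facet.method", "enum"),            # suggested for Solr 1.4 or later
        ("facet.enum.cache.minDf", min_df),  # use the filterCache less
        ("facet.limit", limit),
        ("facet.mincount", 1),
    ]
    return base_url + "?" + urlencode(params)


# Assumed endpoint, query and field name; substitute your own.
url = build_facet_url("http://localhost:8983/solr/select", "*:*", "text")
print(url)
```

With facet.enum.cache.minDf=100, only terms that occur in at least 100 documents go through the filterCache, which is why this setting is offered as a way to use that cache less.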

Faceted Search Slows Down as index gets larger

2010-06-04 Thread Furkan Kuru
Hello, I have been dealing with real-time data. As the total number of indexed documents gets larger (now 5 M), a faceted search on a text field limited by the creation time, which we use to find the most used word in all these text fields, slows down. Query string: created_time:[NOW-1HOUR