Re: Aggregating/Grouping Document Search Results on a Field

2009-07-13 Thread John Wang
Hi Brad: We (Bobo) have since added some perf tests which allow you to do some benchmarking very quickly: http://code.google.com/p/bobo-browse/wiki/BoboPerformance Let me know if you need help setting up.
-John
On Mon, Jul 13, 2009 at 10:41 AM, Jason Rutherglen < jason.rutherg...@gmail

Re: Aggregating/Grouping Document Search Results on a Field

2009-07-13 Thread Jason Rutherglen
SOLR 1.4 has a new feature https://issues.apache.org/jira/browse/SOLR-475 that speeds up faceting on fields with many terms by adding an UnInvertedField. Bobo uses a custom field cache as well. It may be useful to benchmark the 3 different approaches (bitsets, SOLR-475, Bobo). This could be a good w
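To make the difference between the bitset and un-inverted approaches concrete, here is a toy sketch in Python. It is not how Solr, SOLR-475, or Bobo implement faceting; the real implementations are considerably more involved. The data and every function name below are made up for illustration, and the field is assumed to be single-valued.

```python
from collections import Counter

# Toy index: doc id -> value of a single-valued "author" field (hypothetical data).
DOC_AUTHOR = {0: "ann", 1: "bob", 2: "ann", 3: "cy", 4: "bob", 5: "ann"}


def build_term_bitsets(doc_field):
    """Inverted view: one bitset (a Python int) per term; bit d is set if doc d has the term."""
    bitsets = {}
    for doc_id, term in doc_field.items():
        bitsets[term] = bitsets.get(term, 0) | (1 << doc_id)
    return bitsets


def facet_by_bitsets(term_bitsets, hits):
    """Intersect the hit set with every term's bitset; work grows with the number of terms."""
    hit_bits = 0
    for doc_id in hits:
        hit_bits |= 1 << doc_id
    counts = {}
    for term, bits in term_bitsets.items():
        n = bin(bits & hit_bits).count("1")
        if n:
            counts[term] = n
    return counts


def facet_by_uninverted(doc_field, hits):
    """Un-inverted view: look up each hit's term directly; work grows with the number of hits."""
    return dict(Counter(doc_field[doc_id] for doc_id in hits))
```

The bitset version does one intersection per distinct term, so it gets expensive when a field has millions of unique values. The un-inverted version does one lookup per hit, which tends to suit large term counts with comparatively small result sets. Both return the same counts for the same hits.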

Re: Aggregating/Grouping Document Search Results on a Field

2009-07-13 Thread Bradford Stephens
Thanks for this -- we're also trying out bobo-browse for Lucene, and early results look pretty enticing. They greatly sped up how fast you read in documents from disk, among other things: http://bobo-browse.wiki.sourceforge.net/
On Sat, Jul 11, 2009 at 12:10 AM, Shalin Shekhar Mangar wrote:
> On S

Re: Aggregating/Grouping Document Search Results on a Field

2009-07-11 Thread Shalin Shekhar Mangar
On Sat, Jul 11, 2009 at 12:01 AM, Bradford Stephens <bradfordsteph...@gmail.com> wrote:
> Does the facet aggregation take place on the Solr search server, or
> the Solr client?
>
> It's pretty slow for me -- on a machine with 8 cores/ 8 GB RAM, 50
> million document index (about 36M unique values

Re: Aggregating/Grouping Document Search Results on a Field

2009-07-10 Thread Avlesh Singh
> Does the facet aggregation take place on the Solr search server, or the
> Solr client?
Solr server. Faceting is an expensive operation by nature, especially when the hits are large in number. Solr caches these values once computed. You might want to tweak cache related parameters in your sol

Re: Aggregating/Grouping Document Search Results on a Field

2009-07-10 Thread Bradford Stephens
Does the facet aggregation take place on the Solr search server, or the Solr client? It's pretty slow for me -- on a machine with 8 cores/ 8 GB RAM, 50 million document index (about 36M unique values in the "author" field), a query that returns 131,000 hits takes about 20 seconds to calculate the
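For reference, a facet request of the kind described here can be assembled with the standard Solr faceting parameters (`facet`, `facet.field`, `facet.limit`, `facet.mincount`). The sketch below only builds the URL. The host, port, path, and query string are assumptions for illustration, and `facet_query_url` is a hypothetical helper, not part of any Solr client.

```python
from urllib.parse import urlencode


def facet_query_url(base_url, query, facet_field, limit=10):
    """Build a Solr select URL that asks for facet counts on one field."""
    params = [
        ("q", query),
        ("rows", 0),              # return no documents, only the facet counts
        ("facet", "true"),
        ("facet.field", facet_field),
        ("facet.limit", limit),   # return at most this many facet values
        ("facet.mincount", 1),    # skip values that match no hits
    ]
    return base_url + "?" + urlencode(params)


# Hypothetical local Solr instance and query.
url = facet_query_url("http://localhost:8983/solr/select", "text:lucene", "author")
```

Setting `rows=0` and a small `facet.limit` keeps the response small, but the server still has to count across all hits, so it does not by itself address the 20-second computation.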

Re: Aggregating/Grouping Document Search Results on a Field

2009-07-09 Thread Bradford Stephens
Oh, wow... I think that faceted search is the right path, especially since seeing this amazing site: http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Faceted-Search-Solr I hope it's performant over hundreds of thousands of search results :)
On Thu, Jul 9, 2009 at 10:13 PM,

Re: Aggregating/Grouping Document Search Results on a Field

2009-07-09 Thread Bradford Stephens
It looks like field collapsing may be the key: http://issues.apache.org/jira/browse/SOLR-236 But it also doesn't seem to be 'finalized' yet. I wonder how performant it is with indexes of 50 million+ documents?
On Thu, Jul 9, 2009 at 9:42 PM, shb wrote:
> you can refer to the facet search of solr,
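As a rough illustration of the idea behind field collapsing (not the SOLR-236 patch itself), the sketch below keeps only the highest-ranked hit for each distinct value of a field and records how many hits were folded into it. The hit dictionaries and the `collapse_by_field` name are hypothetical.

```python
def collapse_by_field(ranked_hits, field):
    """Keep the first (highest-ranked) hit per distinct field value and count the group size."""
    groups = {}
    for hit in ranked_hits:
        key = hit[field]
        if key not in groups:
            groups[key] = {"top": hit, "count": 0}
        groups[key]["count"] += 1
    # dicts preserve insertion order, so groups come back in rank order of their top hit
    return list(groups.values())


# Hypothetical hits, already sorted by descending score.
hits = [
    {"id": 1, "author": "ann", "score": 9.0},
    {"id": 2, "author": "bob", "score": 8.5},
    {"id": 3, "author": "ann", "score": 7.0},
]
collapsed = collapse_by_field(hits, "author")
```

This is a single pass over the hits, so on its own it scales with the size of the result set. The cost in a real index comes mostly from fetching the field value for every hit.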

Re: Aggregating/Grouping Document Search Results on a Field

2009-07-09 Thread shb
You can refer to the facet search of Solr; that might help you.
2009/7/10 Bradford Stephens
> Greetings,
>
> We've been experimenting with grouping fields returned from document
> search results in Lucene, and we haven't gotten anything very
> encouraging. Basically, the more results we return,