Oh, wow... I think faceted search is the right path, especially after seeing this amazing article: http://www.lucidimagination.com/Community/Hear-from-the-Experts/Articles/Faceted-Search-Solr
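
For anyone following along, here is a rough sketch of how the "top 10 authors for a keyword" question might map onto a Solr facet request. The facet.* parameters are standard Solr request parameters; the host and port, the field names "body" and "author", and the helper names are my own assumptions for illustration, not anything from our actual schema.

```python
import json
from urllib.parse import urlencode

# Assumed location of the Solr select handler; adjust to your install.
SOLR_SELECT_URL = "http://localhost:8983/solr/select"


def build_facet_query(keyword, facet_field="author", top_n=10, text_field="body"):
    """Build a Solr URL asking for the top_n values of facet_field
    among documents whose text_field matches keyword.

    The field names are hypothetical; use the ones in your schema.
    """
    params = {
        "q": f"{text_field}:{keyword}",
        "rows": 0,                 # only the counts are needed, not the documents
        "facet": "true",
        "facet.field": facet_field,
        "facet.limit": top_n,      # keep only the top_n facet values
        "facet.mincount": 1,       # skip authors with no matching documents
        "wt": "json",
    }
    return f"{SOLR_SELECT_URL}?{urlencode(params)}"


def top_facets(response_text, facet_field="author"):
    """Parse a Solr JSON response into (value, count) pairs.

    Assumes the default flat JSON layout for facet fields:
    [value1, count1, value2, count2, ...].
    """
    data = json.loads(response_text)
    flat = data["facet_counts"]["facet_fields"][facet_field]
    return list(zip(flat[0::2], flat[1::2]))
```

Calling build_facet_query("laptop") gives a URL that can be fetched with any HTTP client, and top_facets() turns the response body into a list of (author, count) pairs. Setting rows to 0 means Solr returns only the counts rather than the matching documents, which should matter when the result set is large.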
I hope it's performant over hundreds of thousands of search results :)

On Thu, Jul 9, 2009 at 10:13 PM, Bradford Stephens<bradfordsteph...@gmail.com> wrote:
> It looks like field collapsing may be the key:
> http://issues.apache.org/jira/browse/SOLR-236
>
> But it also doesn't seem to be 'finalized' yet. I wonder how
> performant it is with indexes of 50 million documents+?
>
> On Thu, Jul 9, 2009 at 9:42 PM, shb<suh...@gmail.com> wrote:
>> you can refer to the facet search of solr, that might help you.
>>
>> 2009/7/10 Bradford Stephens <bradfordsteph...@gmail.com>
>>
>>> Greetings,
>>>
>>> We've been experimenting with grouping fields returned from document
>>> search results in Lucene, and we haven't gotten anything very
>>> encouraging. Basically, the more results we return, the longer it
>>> takes -- tens of seconds. Probably because we're doing expensive disk
>>> seeks. I'm hoping the SOLR crew out there may provide some insight :)
>>>
>>> What we're trying to do is similar to SQL's "GROUP BY". Let's say we
>>> have documents indexed by keyword for a content body, and also indexed
>>> by an Author name. If I search our document store (very large) for the
>>> word "laptop", I would like to be able to calculate the 10 authors
>>> that appeared the most.
>>>
>>> I've done some searching through the mailing list, but couldn't glean
>>> much insight. What do you think?
>>>
>>> --
>>> http://www.roadtofailure.com -- The Fringes of Scalability, Social
>>> Media, and Computer Science

--
http://www.roadtofailure.com -- The Fringes of Scalability, Social
Media, and Computer Science