Hello Solrites (or Solrorians),

Is it possible to get the average ranking score for the set of docs that would be returned for a given facet value?
If not in Solr, what about Lucene? How hard would it be to implement? I have years of Java experience, but no Lucene coding experience. I would be happy to implement it if someone could guide me.

thanks
Gene

On Tue, Apr 28, 2009 at 11:39 AM, Gene Campbell <g...@picante.co.nz> wrote:
> Thanks for the reply.
>
> Your thoughts are what I was initially thinking. But, given some more
> consideration, I imagined a system that would take all the docs that
> would be returned for a given facet, and get an average score based on
> their scores from the original search that produced the facets. This
> would be the facet value's rank. So, a higher ranked facet value would
> be more likely to return higher ranked results.
>
> The idea is that you want a broad, loose search over a large
> dataset, and you order the results by rank so that you get the most
> relevant results at the top, e.g. the first page on a search engine
> website. You might have pages and pages of results, but it's the
> first few pages of highly ranked results that most users
> generally see. As the relevance tapers off, they generally do another
> search.
>
> However, if you compute facet values on these results, you have no way
> of knowing if one facet value for a field is more or less likely to
> return higher scored, relevant records for the user. You end up
> getting facet values that match records that are often totally
> irrelevant.
>
> We can sort by Index order, or Count of docs returned. What I would
> like is a sort based on Score, such that it would be
> sum(scores)/Count.
>
> I would assume that most users would be interested in the higher
> ranked ones more often. So, a more efficient UI could be built to
> show just the high ranked facets on this score, and provide a control
> to show all the facets (not just the high ranked ones.)
>
> Does this clear up my post at all?
>
> Perhaps this wouldn't be too hard for me to implement.
> I have lots of Java experience, but no experience with Lucene or Solr code.
> Thoughts?
>
> thanks
> gene
>
> On Tue, Apr 28, 2009 at 10:56 AM, Shalin Shekhar Mangar
> <shalinman...@gmail.com> wrote:
>> On Fri, Apr 24, 2009 at 12:25 PM, ristretto.rb <ristretto...@gmail.com> wrote:
>>
>>> Hello,
>>>
>>> Is it possible to order the facet results on some ranking score?
>>> I've had a look at the facet.sort param,
>>> (
>>> http://wiki.apache.org/solr/SimpleFacetParameters#head-569f93fb24ec41b061e37c702203c99d8853d5f1
>>> )
>>> but that seems to order the facets either by count or by index value
>>> (in my case alphabetical.)
>>>
>>
>> Facets are not ranked because there are no criteria for determining relevancy
>> for them. They are just the count of documents for each term in a given
>> field computed for the current result set.
>>
>>>
>>> We are facing a large number of facet results for multi-term
>>> queries that are OR'ed together. We want to keep the OR nature of our
>>> queries, but we want to know which facet values are likely to give you
>>> higher ranked results. We could AND together the terms, to get the facet
>>> list to be more manageable, but we would be filtering out too many
>>> results. We prefer to OR terms and let the ranking bring the good stuff
>>> to the top.
>>>
>>> For example, suppose we have an index of all known animals and
>>> each doc has a field AO for animal-origin.
>>>
>>> Suppose we search for: wolf grey forest Europe
>>> And generate facets AO. We might get the following
>>> facet results:
>>>
>>> For the AO field, lots of countries of the world probably have grey or
>>> forest or wolf or Europe in their indexing data, so I'm asserting we'd
>>> get a big list here.
>>> But, only some of the countries will have all 4 terms, and those are
>>> the facets that will be the most interesting to drill down on. Is
>>> there a way to figure out which facet is the most highly ranked like this?
>>>
>>
>> Suppose 10 documents match the query you described. If you facet on AO, then
>> it would just go through all the terms in AO and give you the number of
>> documents which have that term. There's no question of relevance at all
>> here. The returned documents themselves are of course ranked according to
>> the relevancy score.
>>
>> Perhaps I've misunderstood the query?
>>
>> --
>> Regards,
>> Shalin Shekhar Mangar.
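
To make the sum(scores)/Count idea from the quoted message concrete, here is a minimal sketch in plain Java. It does not use any Solr or Lucene API. The names FacetScoreRanker, Hit, RankedFacet and rank are hypothetical, and it assumes the caller already has the relevancy score and a single facet value (for example the AO field) for every doc matched by the original query. How to collect those inside Solr or Lucene is the open question of this thread.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Comparator;
import java.util.LinkedHashMap;
import java.util.List;
import java.util.Map;

// Hypothetical helper for illustration only; not part of Solr or Lucene.
class FacetScoreRanker {

    /** One matched doc: its facet value (assumed single-valued) and its relevancy score. */
    static final class Hit {
        final String facetValue;
        final double score;

        Hit(String facetValue, double score) {
            this.facetValue = facetValue;
            this.score = score;
        }
    }

    /** A facet value with its doc count and its average score, sum(scores)/count. */
    static final class RankedFacet {
        final String value;
        final int count;
        final double averageScore;

        RankedFacet(String value, int count, double averageScore) {
            this.value = value;
            this.count = count;
            this.averageScore = averageScore;
        }
    }

    /** Groups hits by facet value and orders the values by average score, highest first. */
    static List<RankedFacet> rank(List<Hit> hits) {
        // Per facet value: element 0 is the running sum of scores, element 1 the doc count.
        Map<String, double[]> totals = new LinkedHashMap<>();
        for (Hit hit : hits) {
            double[] t = totals.computeIfAbsent(hit.facetValue, k -> new double[2]);
            t[0] += hit.score;
            t[1] += 1;
        }
        List<RankedFacet> ranked = new ArrayList<>();
        for (Map.Entry<String, double[]> e : totals.entrySet()) {
            double[] t = e.getValue();
            ranked.add(new RankedFacet(e.getKey(), (int) t[1], t[0] / t[1]));
        }
        // List.sort is stable, so ties keep the order in which values were first seen.
        ranked.sort(Comparator.comparingDouble((RankedFacet f) -> f.averageScore).reversed());
        return ranked;
    }

    public static void main(String[] args) {
        // Made-up scores, purely to show the ordering.
        List<Hit> hits = Arrays.asList(
                new Hit("Canada", 1.0), new Hit("Germany", 4.0), new Hit("Canada", 1.0),
                new Hit("Poland", 2.5), new Hit("Germany", 2.0), new Hit("Canada", 1.0));
        for (RankedFacet f : rank(hits)) {
            System.out.println(f.value + " count=" + f.count + " avg=" + f.averageScore);
        }
    }
}
```

With the made-up scores in main, Canada has the most docs but the lowest average, so a count sort would list it first while the average-score sort lists Germany, then Poland, then Canada. A multi-valued facet field would need one Hit per (doc, value) pair, which this sketch does not handle.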