On Jun 23, 2009, at 3:58 AM, Asif Rahman wrote:
Hi again,
I guess nobody has used facets in the way I described below before. Do any
of the experts have any ideas as to how to do this efficiently and
correctly? Any thoughts would be greatly appreciated.
Thanks,
Asif
On Wed, Jun 17, 2009 at 12:42 PM, Asif Rahman <a...@newscred.com> wrote:
Hi all,
We have an index of news articles that are tagged with news topics.
Currently, we use Solr facets to see which topics are popular for a given
query or time period. I'd like to apply the concept of IDF to the facet
counts so as to penalize the topics that occur broadly throughout our index.
I've begun to write a custom facet component that applies the IDF to the
facet counts, but I also wanted to check if anyone has experience using
facets in this way.
I'm not sure I'm following. Would you be faceting on one field, but
using the DF from some other field? Faceting is already a count of
all the documents that contain the term on a given field for that
search. If I'm understanding, you would still do the typical
faceting, but then rerank by the global DF values, right?
Backing up, what is the problem you are seeing that you are trying to
solve?
I think you could do this, but you'd have to hook it in yourself. By
"penalize," do you mean remove those topics, or just rank them lower in
the sort?
Generally speaking, looking up the DF value can be expensive,
especially if you do a lot of skipping around. I don't know how
pluggable the sort capabilities are for faceting, but that might be
the place to start if you are just looking at the sorting options.
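
To make the "facet as usual, then rerank by global DF" idea concrete, here is
a minimal sketch in Python. It is not Solr code and does not use any Solr
API. The function name, the input dictionaries, and the smoothed IDF formula
are all assumptions made for illustration. It assumes you already have the
per-query facet counts, the global DF for each topic, and the total number
of documents in the index.

```python
import math

def rerank_facets_by_idf(facet_counts, global_df, num_docs):
    """Rerank facet values by count * idf (hypothetical helper).

    facet_counts: topic -> count within the current result set
    global_df:    topic -> number of documents in the whole index with that topic
    num_docs:     total number of documents in the index
    """
    scored = []
    for topic, count in facet_counts.items():
        df = global_df.get(topic, 0)
        # Smoothed IDF (an assumption, not necessarily what Solr/Lucene uses):
        # it stays finite when df is 0 and never drops below 1.
        idf = math.log((num_docs + 1) / (df + 1)) + 1.0
        scored.append((topic, count * idf))
    # Highest score first; ties broken alphabetically for a stable order.
    scored.sort(key=lambda pair: (-pair[1], pair[0]))
    return scored

# Made-up numbers: "politics" has the larger raw facet count, but it is so
# common across the index that "cricket" ranks ahead of it after reranking.
facets = {"politics": 50, "cricket": 20}
df = {"politics": 50000, "cricket": 200}
ranked = rerank_facets_by_idf(facets, df, 100000)
```

Because the DF values here are passed in as a plain dictionary, the sketch
says nothing about the lookup cost mentioned above. In a real component you
would probably want to cache them rather than look them up per request.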
--------------------------
Grant Ingersoll
http://www.lucidimagination.com/