Hi Yonik,
For my particular needs, IDF considerations are fine and helpful: if a
user requests a rare term or phrase, increasing the score on that basis
makes sense, since the match carries higher confidence. I simply need to
compensate for title- and category-type fields that may contain redundant
information, and to disregard length considerations. These fields are
multi-valued and may be populated from a varying number of sources, and
I don't want the number of sources or the level of repetition to
affect the score. Basically, I want a boolean "does it match" score
adjusted solely by IDF. Of course, I'm sure there are others who
wouldn't need or care about IDF either, but would still want phrase
matching.
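To make the request concrete, here is a rough sketch of the scoring
behaviour described above. It is not Solr or Lucene code. It assumes the
classic Lucene-style factors (tf = sqrt(freq), idf = 1 + ln(numDocs /
(docFreq + 1)), lengthNorm = 1 / sqrt(numTerms)) for a single-term query
and leaves out query normalization, coord, and boosts. All function names
are made up for illustration.

```python
import math


def classic_tf(freq):
    # Classic-style term frequency factor: grows with repetition.
    return math.sqrt(freq)


def binary_tf(freq):
    # "Does it match" factor: 1 if the term occurs at all, else 0.
    return 1.0 if freq > 0 else 0.0


def idf(doc_freq, num_docs):
    # Rare terms (low doc_freq) get a larger weight.
    return 1.0 + math.log(num_docs / (doc_freq + 1))


def length_norm(num_terms):
    # Longer fields are penalized.
    return 1.0 / math.sqrt(num_terms)


def classic_score(freq, num_terms, doc_freq, num_docs):
    # Simplified single-term score: tf * idf^2 * lengthNorm.
    return classic_tf(freq) * idf(doc_freq, num_docs) ** 2 * length_norm(num_terms)


def idf_only_score(freq, num_terms, doc_freq, num_docs):
    # Hypothetical variant: TF is boolean and num_terms is deliberately
    # ignored, so only IDF separates one matching document from another.
    return binary_tf(freq) * idf(doc_freq, num_docs) ** 2


if __name__ == "__main__":
    # Doc A: title from one source (term occurs once among 2 terms).
    # Doc B: titles from three sources (term occurs 3 times among 8 terms).
    for name, freq, num_terms in [("A", 1, 2), ("B", 3, 8)]:
        print(name,
              classic_score(freq, num_terms, doc_freq=10, num_docs=1000),
              idf_only_score(freq, num_terms, doc_freq=10, num_docs=1000))
```

With the classic factors, documents A and B score differently for the
same term purely because of how many sources fed the field. With the
IDF-only variant they score the same, and a rarer term still scores
higher than a common one.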
Cheers,
Aaron
Yonik Seeley wrote:
> On Fri, Sep 18, 2009 at 11:05 AM, Aaron McKee <ucbmc...@gmail.com> wrote:
>> I wonder, though, if it could also make sense to support a
>> query-time only boolean to optionally disable TF independently, on a
>> per-field basis?
> I guess it could make sense. But do you still want idf too? length
> norm? or do you really want a constant score (match/no-match)?
> -Yonik
> http://www.lucidimagination.com