Re: overlap function query

Chris Hostetter Wed, 30 Jan 2013 11:17:37 -0800

: I think coord works at the document level, I was thinking of having
: something that worked at a field level, against a 'principle/primary'
: field.


I'm not sure what you mean by "works at the document level" ... coord is 
used by the BooleanQuery scoring mechanism to define how scores should be 
affected when a document doesn't match all terms in the query.
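
To make the coord arithmetic concrete, here is a rough sketch in Python (not 
Lucene code). It assumes the behaviour of the default similarity, where the 
coord factor is the number of matching optional clauses divided by the total 
number of optional clauses; the function names are made up for illustration.

```python
def default_coord(overlap, max_overlap):
    """Fraction of query clauses the document matched.

    Assumption: mirrors the default similarity's coord, overlap / maxOverlap.
    """
    if max_overlap <= 0:
        raise ValueError("max_overlap must be positive")
    return overlap / max_overlap


def boolean_score(clause_scores, max_overlap):
    """Sum the scores of the matching clauses, then scale by coord.

    clause_scores holds one score per clause that actually matched,
    so its length is the overlap.
    """
    overlap = len(clause_scores)
    return sum(clause_scores) * default_coord(overlap, max_overlap)


# A document matching 2 of 3 query terms keeps two thirds of its summed score.
partial = boolean_score([1.2, 0.8], max_overlap=3)
# A document matching all 3 terms is not penalized.
full = boolean_score([1.2, 0.8, 0.5], max_overlap=3)
```

A custom similarity would swap in a different default_coord, which is what 
changes how strongly partial matches are penalized.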

Mikhail's suggestion was that with an appropriately defined coord method 
in a custom similarity, you could probably get close to what you are 
asking about using the "query()" function on a BooleanQuery containing the 
terms you are interested in.

It may be easier than that, though ... you might want to take a look at the 
termfreq() and norm() functions. Combined with map() (to ensure you get 
a "1" for docs that match a term, no matter what the termfreq() is), you 
could probably get values proportionate to what you are looking for -- but 
the denominators won't be the exact number of terms in the field unless 
you customize the norm function in your similarity.
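
As a rough sketch of that idea, the Python below just assembles a Solr 
function query string from termfreq(), map(), sum(), product() and norm(). 
The helper names, the field "title", and the upper bound of 1000000 are 
assumptions for illustration. It also assumes simple terms with no quotes or 
other characters that need escaping, and with the default similarity norm() 
is only roughly 1/sqrt(number of terms), so the result is proportionate 
rather than an exact fraction.

```python
def solr_map(x, lo, hi, target):
    """Mimic Solr's map(x,min,max,target): values in [min,max] become target."""
    return target if lo <= x <= hi else x


def build_overlap_function(field, terms, max_tf=1000000):
    """Build a function query string counting matched terms, scaled by norm().

    Hypothetical helper: each termfreq() is clamped to 1 via map(), the
    clamped values are summed, and the sum is multiplied by norm(field).
    """
    if not terms:
        raise ValueError("need at least one term")
    clamped = [
        "map(termfreq({0},'{1}'),1,{2},1)".format(field, term, max_tf)
        for term in terms
    ]
    matched = clamped[0] if len(clamped) == 1 else "sum(" + ",".join(clamped) + ")"
    return "product({0},norm({1}))".format(matched, field)


# The resulting string could be used wherever a function query is accepted,
# for example as a boost or a sort.
fq = build_overlap_function("title", ["solr", "lucene"])
```

To get an exact "matched terms / terms in field" ratio, the norm would have 
to be customized in the similarity, as noted above.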

In general, though, this smells like an XY problem, because the kind of 
boosting you seem to be trying to achieve sounds like exactly what the 
normal TF/IDF scoring algorithm will give you.  So perhaps you should tell 
us more about some real-world specifics of the types of data/query you are 
using, what types of results you are seeing, and the types of results you 
want...

https://people.apache.org/~hossman/#xyproblem
XY Problem

Your question appears to be an "XY Problem" ... that is: you are dealing
with "X", you are assuming "Y" will help you, and you are asking about "Y"
without giving more details about the "X" so that we can understand the
full issue.  Perhaps the best solution doesn't involve "Y" at all?
See Also: http://www.perlmonks.org/index.pl?node_id=542341



-Hoss
